
A judge ordered Google to share its search data. What does that mean for user privacy?

A cursor moves over Google's search engine page, in Portland, Ore., on Aug. 28, 2018. The most valuable information Google is being forced to share under Judge Amit Mehta's ruling is users' click-and-query data. That refers to the information Google collects from users about what they search for (the queries) and what links they select (the clicks).
Don Ryan/AP

Earlier this month, when U.S. District Judge Amit Mehta issued penalties against Google for monopolizing the search engine market, he stopped short of the harshest ones — like forcing the breakup of the company.

Instead, Mehta ordered Google to share portions of its incredibly valuable search index and user click-and-query data with some of its competitors. This move, which will make it easier for rivals to build their own search engines, is meant to even the playing field in the search space and chip away at Google's monopoly power.

But it raises a new concern: How to keep Google user data private once it is handed over to third parties. Tech and data privacy analysts are warning that sharing this data could put private information at risk in ways users never agreed to.

Google already shares some aggregate user information, like search trends or how often people use Google, with third parties, including advertisers, business partners and sponsors, the company says. But it's not personally identifiable information. (More granular and identifiable information can be shared if Google is ordered to comply with a search warrant, subpoena, statute, or court order.)

"Google already shares your data. That's part of the contract that you make when you sign up for a Google product. So that should come as no surprise to us," said cybersecurity expert Betsy Cooper, the director of the Aspen Institute's Policy Academy. "The surprise now is that Google is going to share that data with other companies that are then going to be able to use that data for purposes we never imagined."

Under Mehta's ruling, this data will only be shared with "qualified competitors." But which companies will be considered qualified? Mehta assigned those hard decisions to a future five-person technical oversight committee. This panel will establish standards for which companies can get access to this data and which security measures should be adopted to safeguard user privacy.

Google representatives declined an interview for this story. But as they defended the company against the antitrust case brought by the Department of Justice, Google's attorneys repeatedly expressed concerns — and called witnesses — about how the data-sharing requirements could harm user privacy.

And in a blog post last year, Lee-Anne Mulholland, Google's Vice President of Regulatory Affairs, wrote: "The search queries you share with Google are often sensitive and personal and are protected by Google's strict security standards; in the hands of a different company without strong security practices, bad actors could access them to identify you and your search history."

Security experts say the ruling lays bare how vulnerable user data is in the hands of tech companies that rely on this data to build their products — and just how little control users actually have over who gets access.

"Data is the currency upon which this entire [search] ecosystem was created, and it's the currency upon which Google built its wealth," Cooper said. "Now you're seeing a redistribution of that currency to other competitors."

What data does Google have to share? 

Mehta's ruling ordered the company to share two kinds of data. Competitors will be able to get a one-time "snapshot" of Google's search index for a "marginal cost." They will also get a look at users' "click-and-query" data at least twice.

Google's search index is best described as the index of a large book, and it represents a constantly updated database of the webpages the company's bots have scraped, said Mitch Stoltz, the IP litigation and competition director at the Electronic Frontier Foundation, a digital privacy non-profit. When users type a search request into Google, it scans this database to return links to webpages.

To Google, this is incredibly valuable intellectual property, collected over years as the search industry's leader.

That industry dominance was the crux of the Department of Justice's antitrust case against Google. The department argued that Google's exclusive agreements with companies like Apple and Samsung, which made Google the default search engine on phones and other devices, gave the company an unfair edge over competitors. That prime placement meant that Google handled more search queries than its rivals, allowing it to collect vast amounts of user data and then use that data to further refine its search engine.

It would be extremely time-consuming and expensive for other companies to build out rival search indexes. So giving them a one-time peek at Google's search index is meant to give them a competitive boost.

In the near term, said Stoltz, this could "help some competitors build more robust search engines that are better able to compete with Google."

"But that value will diminish pretty fast," he continued, because the internet changes all the time, and that information will quickly become outdated.

Instead, tech experts said, the most valuable information Google is being forced to share is users' click-and-query data. That refers to the information Google collects from users about what they search for (the queries) and what links they select (the clicks).

For example, when a user types a query into Google like, "the best Italian restaurants near me," Google pulls up a list of restaurants, some of the restaurants' information (including addresses and phone numbers), a Google Map of where the restaurants are and links to online reviews. Google then tracks which links users click on.

The most powerful way for Google to know if it has provided the best answers is by looking at what people actually click on from the provided results, says Jonathan Stray, a senior scientist at the Center for Human Compatible AI at UC Berkeley. Google also analyzes whether users linger on some of the results — ostensibly meaning they got what they needed — or click the "back" button within a short period of time, which likely means they did not.

"That's extremely important information, because it tells Google when it was successful at figuring out what you wanted," Stray said. "So it's a very powerful feedback signal."

The Google sign is shown over an entrance to the company's building in New York on Sept. 6, 2023.
Peter Morgan/AP

What are the data privacy implications?

The data privacy concerns lie entirely with that valuable click-and-query data. Some experts are worried that third parties could use that data to figure out the identity of users and what they are searching for.

After all, people type all kinds of sensitive information into Google — everything from looking up symptoms of diseases to trying to find long-lost loves. "U.S. consumers have very little control over the data that they give over to Google and other online platforms, even very personal data," Stoltz said. "I mean, we tell search engines things that we wouldn't tell a romantic partner or doctor, and it's out there, and we don't have a lot of legal recourse for what happens to it."

Mehta acknowledged that risk in his ruling, writing, "Think of a search query from a user in a small town regarding a rare health condition. Even if the user's name is not included in the data, context could reveal their identity."

And anyone's location could be revealed by their computer's IP address, Stray said. "You can normally figure out someone's location down to sort of the part of the city they're in … and that can be enough to uniquely identify someone, if you also know what they were looking for," Stray said.

While Justice Department officials declined to comment on the data privacy concerns, the agency offered proposals during the trial to reduce such risks. Those included creating the technical committee, requiring the competitors who receive the shared data to establish risk mitigation programs, and requiring those programs to undergo independent audits. The Federal Trade Commission filed a brief with the court supporting those proposals.

Mehta has tasked the technical oversight committee with determining ways to hide identifying information. And there are a host of ways to do that. For example, Stoltz said, a filter could be added that prevents third parties from getting access to queries that fewer than 10 people have ever typed.

But there are drawbacks: The more this data is anonymized and filtered, the less useful it is. "And it's not clear to me that there's really a sweet spot where the data is both protective of users' privacy and still is helpful and useful to competitors. Hopefully there is, but it's not at all clear," Stoltz said.
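The kind of filter Stoltz describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a user ID, a query and a clicked link), the function name and the sample data are hypothetical, and the threshold of 10 comes from his example, not from the ruling or from any standard the technical committee has set.

```python
from collections import defaultdict

# Threshold taken from Stoltz's example; the real value, if any, is undecided.
MIN_USERS = 10


def filter_rare_queries(log, min_users=MIN_USERS):
    """Keep only rows whose query was typed by at least `min_users` distinct users.

    `log` is an iterable of (user_id, query, clicked_url) tuples. This layout
    is a hypothetical stand-in, not Google's actual data format.
    """
    rows = list(log)

    # Count distinct users per query, ignoring case and surrounding spaces.
    users_per_query = defaultdict(set)
    for user_id, query, _ in rows:
        users_per_query[query.strip().lower()].add(user_id)

    # Drop the user ID from the output so shared rows carry no direct identifier.
    return [
        (query, clicked_url)
        for _, query, clicked_url in rows
        if len(users_per_query[query.strip().lower()]) >= min_users
    ]


# Twelve users type a common query; one user types a rare, revealing one.
sample_log = [
    (f"user{i}", "best italian restaurants near me", "example.com/reviews")
    for i in range(12)
]
sample_log.append(("user99", "rare condition specialist smalltown", "example.org/clinic"))

shared = filter_rare_queries(sample_log)
```

In this sketch the common restaurant query survives and the one-off health query is withheld. It also shows the tradeoff Stoltz raises: raising the threshold removes more of the unusual queries a competitor might learn the most from.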

The technical committee

The technical committee will be powerful: It will be tasked with deciding which companies get the shared data, setting standards for data security, and monitoring Google's compliance with the ruling. It is also likely to decide what format the search index snapshot will take, such as a gigantic static spreadsheet or some kind of interactive database.

Under Mehta's order, the committee will last for six years. The five-person panel will include one person chosen by the DOJ, one chosen by Google, one chosen by the plaintiff states that filed the case alongside the DOJ, and two others agreed upon by all parties. Mehta ordered that these individuals must have expertise in software engineering, information retrieval, artificial intelligence, economics, behavioral science, data privacy or data security. But not much else is known about the panel, including when members will be selected, when they will start their work, and when the data sharing is meant to begin.

It's expected that Google will appeal both the penalties and Mehta's underlying ruling, which will likely delay the proceedings for years.

Stoltz is worried the committee will "have conflicting mandates." Its primary role is to increase competition in online search — but it also must prioritize preserving users' privacy. "And when those goals come into conflict, it's not really clear what course that technical committee is supposed to take," he said.

Still, the data privacy experts who spoke to NPR supported having an oversight panel.

"I just hope that that committee does not become a new place in which all of these issues are litigated again and lead to outcomes where the makeup of that committee becomes the new debate, rather than about the real, serious data questions that are at stake here," Cooper said.

Note: Google is a financial supporter of NPR.

Copyright 2025 NPR

Jaclyn Diaz is a reporter on Newshub.