Google forced to share all information related to search queries with competitors? An antitrust project from the European Commission that looks like a bad idea dressed up as a good one — both in cybersecurity terms and with regard to users' privacy. While the EU says it is ready to listen to criticism, time is running out to get a dangerous program back on track.

The date passed almost unnoticed, lost in the flood of Brussels regulation. And yet it matters. On May 1, 2026, at 11:59 PM, the public consultation on Google’s search data sharing project under the DMA (Digital Markets Act) will officially close. The final decision will be imposed on Google on July 27, 2026. Behind this technocratic calendar, a major upheaval is taking shape, with far-reaching consequences: the obligation for the dominant search engine to share part of its most precious fuel — the search data of Europeans.

One might think the EU did not particularly want people to look too closely at this matter: the consultation opened on April 16, heralded by nothing more than a press release issued the day before. The Commission explains that it wants to improve the services of Google's competitors by giving them access to the data Google uses to refine its results. As is often the case, the road to hell is paved with good intentions: the goal is to break the informational asymmetry that allows Google to maintain a crushing lead over its rivals. A new antitrust mechanism, in short. In practice, the chosen tool is explosive.

The mechanism is based on Article 6(11) of the DMA. It requires the American giant to share with competing search engines — including chatbots with a search function — data on queries, clicks, views, and result rankings. The Commission summarizes it itself: the aim is to share rankings, queries, clicks and views used to optimize services. In plain terms, Google must hand over to its competitors the very heart of its business model.

Data Linked to Users’ Most Intimate Secrets

The technical documents go further. They indicate that Alphabet will share search data daily and at the record level, via API, including information such as queries, timestamps, language, device type, access point (Chrome, Android, Assistant, etc.), interactions (clicks, scrolling, absence of click) and the order of displayed results. In other words, the company must provide its competitors not with a statistical dashboard, but with a detailed behavioral map of European users’ browsing habits. Of course, the Commission states that this data will be anonymized, governed by contracts, and subject to security audits. Yet this is precisely where the veneer begins to crack.
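To fix ideas, a single record-level entry could look something like the sketch below; the field names and values are hypothetical, inferred from the categories listed in the documents, not the actual API schema.

```python
# Hypothetical shape of ONE record-level entry, inferred from the
# categories the technical documents list; field names and values are
# assumptions, not the actual API schema.
shared_record = {
    "query": "oncologist lyon 3e",
    "timestamp": "2026-05-02T09:14:03Z",
    "language": "fr",
    "device_type": "mobile",
    "access_point": "Chrome",     # Chrome, Android, Assistant, etc.
    "results": [                  # order of displayed results
        {"rank": 1, "url": "https://clinic.example/oncology",
         "interaction": "click"},
        {"rank": 2, "url": "https://directory.example/doctors",
         "interaction": "view"},  # seen and scrolled past, no click
    ],
}
```

Multiplied by billions of queries and delivered daily, entries of this kind amount to the behavioral map described above.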

First, as the Commission's own documents make clear and as cybersecurity expert Lukasz Olejnik points out, the proposal does not merely open access to abstract statistics or aggregated market data. It exposes streams describing what people search for, see, and click on. Olejnik reminds us that search queries constitute deeply private data, often linked to users' most intimate secrets.

This observation is central. Contrary to what the EU implies in its communication, this is not about redistributing a neutral resource. A query often involves a personal disclosure, reflects an anxiety, leaves a trace of vulnerability. The Commission’s own working documents acknowledge that users entrust Google with sensitive information.

Dangerously Ineffective Privacy Protections

Clare Kelly, Google's legal director, makes this clear in a statement quoted by Reuters: "Hundreds of millions of Europeans trust Google with their most sensitive searches, including private questions about their health, their family, and their finances, and the Commission's proposal would require us to hand this data over to third parties, with dangerously ineffective privacy protections."

This argument cannot be dismissed simply because Google has a vested interest in the matter. The anonymization process rests on a fragile mechanism: the system filters query elements based on their frequency, but, as Olejnik points out, "there is no requirement that the complete query have been submitted by multiple users." In other words, a unique query can pass through the filter if each of its individual components is common. For instance, a common surname like Dupond and a widespread medical term like cancer would each clear the thresholds individually, yet their combination might be unique, and could still be transmitted. "The system confuses the frequency of a component with the privacy safety of the complete query," the researcher concludes. This is a conceptual error, not merely an implementation flaw.
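To make the flaw concrete, here is a minimal Python sketch of a component-frequency filter of the kind Olejnik describes; the threshold value and the sample counts are illustrative assumptions, not the Commission's actual specification.

```python
# Minimal sketch of the component-frequency filter Olejnik criticizes.
# The threshold and the sample counts are illustrative assumptions.
from collections import Counter

K_THRESHOLD = 50  # hypothetical "submitted by at least K users" cutoff

# Hypothetical frequencies of individual query terms across all users.
term_counts = Counter({"dupond": 4_200, "cancer": 90_000})

def component_filter(query: str) -> bool:
    """A query passes if EACH of its terms is common enough, taken alone."""
    return all(term_counts[term] >= K_THRESHOLD
               for term in query.lower().split())

# Both terms are individually common, so the filter lets the query
# through, even though the combination may be unique to one person.
print(component_filter("dupond cancer"))  # True -> the unique query leaks
```

The filter never asks how many users typed the complete string; that is the conceptual error in a dozen lines.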

One risk mechanically leads to another, which again undermines the Commission’s assurances about the anonymity of transmitted data: re-identification through cross-referencing.

Multiplying Copies Means Multiplying Vulnerabilities

Here again, the scenario does not even depend on the explicit content of the query or the username. It is enough to cross-reference the shared data — clicked URL, timestamp, geographic area, device type — with other sources, such as the logs available to any website, to link a click to a real user. Once more, this is not a bug; it is a feature.
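As a rough illustration, the sketch below joins hypothetical "anonymized" shared records with a website's ordinary access logs; every record, field name, and value here is invented for the example.

```python
# Rough sketch of re-identification by cross-referencing. All records,
# field names and values are invented for the example.

# What a competitor might receive: no name, no account, "anonymized".
shared_records = [
    {"clicked_url": "https://clinic.example/oncology",
     "timestamp": "2026-05-02T09:14:03Z",
     "area": "Lyon-3e", "device": "Android"},
]

# What the clicked website already keeps in its own server logs.
site_logs = [
    {"url": "https://clinic.example/oncology",
     "timestamp": "2026-05-02T09:14:03Z",
     "user_agent": "Mozilla/5.0 (Linux; Android 14)",
     "account": "j.dupond"},
]

# Joining on URL + timestamp + device links the "anonymous" click to a
# logged-in visitor. Neither the query text nor a username is needed.
for rec in shared_records:
    for log in site_logs:
        if (rec["clicked_url"] == log["url"]
                and rec["timestamp"] == log["timestamp"]
                and rec["device"] in log["user_agent"]):
            print(f"Re-identified: {log['account']} clicked {log['url']}")
```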

And as if the risks to user privacy were not enough, the project is a cybersecurity time bomb, offering potential attackers an attack surface the size of a continent. It does not merely extract sensitive data — it distributes it continuously to numerous actors via APIs. Granted, it is not a free-for-all: the Commission requires encryption, access controls, multi-factor authentication and ISAE 3000 audits. But daily, record-level sharing clearly crosses a red line in cybersecurity terms. Multiplying copies means multiplying vulnerabilities. One poorly secured actor, one compromised subcontractor, one misused access point — and the entire ecosystem becomes porous. The model relies on blind trust in the recipients and in the data flows. Yet this is precisely what cybersecurity seeks to avoid. Was Zero Trust not the central theme of the INCYBER Forum 2025?

And these are only the most direct dangers of the project. As for its beneficiaries, the loopholes are gaping: the text explicitly provides that AI chatbots with a search function can access the data.

The Largest Distributed Surveillance Database on the Continent

This category includes emerging players, sometimes poorly regulated, sometimes reliant on opaque funding, and often based outside the EU. As MEP Virginie Joron (whom we had previously interviewed about the EU's freedom-eroding tendencies) denounces: "Europe, technologically vassalized, has become a mere playground for US Big Tech."

And this is without even mentioning the interest such a data stream would hold for both corporations and states. "A hostile service could create or fund a shell company that is formally compliant, such as an AI search wrapper or a regional search product, which would then have legitimate access to these sensitive data flows," Olejnik warns. "The bottleneck is just paperwork," he sums up. The scenario is not far-fetched. In the world of economic or state intelligence, it is standard practice. The real question is not whether it will happen, but when. Is the system capable of preventing such a drift? Nothing suggests it is, in the European technocratic universe where compliance with an administrative standard typically amounts to a blank check.

Is it paranoid to raise the specter of mass surveillance, the danger brandished most loudly by some observers on social media? When the X account Kruptos speaks of "the largest distributed surveillance database on the continent," it nonetheless touches a sensitive point. Admittedly, the project does not create a centralized database freely accessible to states, police services, intelligence agencies, or EU institutions.

Everything Needs to Be Rethought in Cybersecurity Terms

The threat is more subtle: the project creates a network of continuously fed data flows describing users’ precise search behaviors. As we have seen, such a stream makes it possible, with a little work, to track individuals, locations, or events at low cost. While this is obviously not the Commission’s stated intention, it is a real capability — and capabilities always end up finding a use.

So what should be done? Abandon all ambition to foster competition? While the objective remains legitimate in itself, respecting cybersecurity rules and genuine anonymity would require a radical overhaul of how the system works.

The first set of fixes concerns the content being analyzed, and would above all require prohibiting the sharing of complete queries unless they are genuinely frequent — a threshold that must apply not just to their individual components, but to their combination. Without this, anonymization remains illusory. The same applies to searches containing elements such as the user’s voice or images, as well as metadata.
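Sketched below, under the same illustrative assumptions as the earlier example, is what the corrected rule would look like: the frequency threshold applies to the complete query string, not to its parts.

```python
# Sketch of the fix: the threshold applies to the COMPLETE query.
# The threshold and sample counts remain illustrative assumptions.
from collections import Counter

K_THRESHOLD = 50

# Hypothetical frequencies of complete queries across all users.
query_counts = Counter({"dupond cancer": 1, "weather paris": 12_000})

def full_query_filter(query: str) -> bool:
    """A query may be shared only if the whole string is common enough."""
    return query_counts[query.lower()] >= K_THRESHOLD

print(full_query_filter("dupond cancer"))  # False -> blocked
print(full_query_filter("weather paris"))  # True  -> shareable
```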

To guarantee genuine anonymization, it would also be necessary to adopt a far coarser geographic granularity than currently planned. The proposal provides for a cell covering at least 1,000 connected users and a minimum surface area of 3 km². This is far too fine-grained for certain contexts: rural areas, administrative districts, and the surroundings of sensitive sites such as military installations, hospitals, courts, and strategic businesses.
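A minimal sketch of what coarser floors would change, assuming deliberately hypothetical figures; only the 1,000-user and 3 km² values come from the proposal, and the sample cell is invented for the illustration.

```python
# Sketch of a coarsening rule for geographic cells. The 1,000-user and
# 3 km² floors come from the proposal; the stricter floors and the
# sample cell are invented for the illustration.
PROPOSED_MIN_USERS, PROPOSED_MIN_AREA_KM2 = 1_000, 3.0
STRICTER_MIN_USERS, STRICTER_MIN_AREA_KM2 = 50_000, 500.0  # hypothetical

def cell_publishable(users: int, area_km2: float,
                     min_users: int, min_area: float) -> bool:
    """A cell may appear in the shared data only if it meets both floors."""
    return users >= min_users and area_km2 >= min_area

# A cell around a rural hospital: fine under the proposed floors,
# blocked under coarser ones, forcing aggregation into a wider region.
users, area = 1_200, 4.0
print(cell_publishable(users, area,
                       PROPOSED_MIN_USERS, PROPOSED_MIN_AREA_KM2))  # True
print(cell_publishable(users, area,
                       STRICTER_MIN_USERS, STRICTER_MIN_AREA_KM2))  # False
```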

The text also provides for a limited form of session grouping: records from the same user can be grouped chronologically under a shared random identifier. Even in limited form, this logic is dangerous: a sequence of searches reveals far more than an isolated query and makes it easier to identify the user.

Radically Changing the Distribution Model

Finally, sensitive queries should be excluded by default, even if they are common. The mechanism could include a list inspired by the GDPR, with automatic blocking of searches relating to topics such as health, sexuality, justice, finances, minors, or political matters.
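One possible shape for such a mechanism is sketched below with a deliberately crude keyword list; the categories echo the GDPR's special categories, but the keywords themselves are illustrative assumptions, not an official taxonomy.

```python
# Sketch of a default-deny blocklist for sensitive topics, loosely
# inspired by the GDPR special categories. The keyword lists are
# illustrative assumptions, not an official taxonomy.
SENSITIVE_KEYWORDS = {
    "health":    {"cancer", "hiv", "depression", "pregnancy"},
    "sexuality": {"sexuality", "lgbt"},
    "justice":   {"lawsuit", "conviction", "parole"},
    "finances":  {"debt", "salary", "loan"},
    "minors":    {"child", "school", "minor"},
    "politics":  {"party", "union", "strike"},
}

def is_sensitive(query: str) -> bool:
    """Exclude the query if ANY term hits a sensitive category,
    even when the complete query is frequent."""
    terms = set(query.lower().split())
    return any(terms & keywords for keywords in SENSITIVE_KEYWORDS.values())

print(is_sensitive("cancer symptoms"))  # True  -> excluded from sharing
print(is_sensitive("weather paris"))    # False -> eligible for sharing
```

A real deployment would need classifiers rather than keyword lists, but the principle of blocking by topic, regardless of frequency, is what matters.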

The second set of fixes is more specifically cyber-related. The text provides for an initial assurance report, followed by an annual ISAE 3000 report or equivalent. This is better than nothing, but given daily data flows, daily oversight by Google and the relevant data protection authorities is needed. Any unusual extraction or non-compliant use should trigger an automatic suspension of the offending actor.
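What such a tripwire could look like in practice is sketched below; the baseline, the tolerance factor, and the way consumers are tracked are all assumptions about how daily oversight might be wired into the sharing API.

```python
# Sketch of automatic suspension on anomalous extraction volume.
# Baseline, tolerance factor and consumer tracking are assumptions
# about how daily oversight could be wired into the sharing API.
from collections import defaultdict

DAILY_BASELINE = 1_000_000  # hypothetical expected records/day per actor
TOLERANCE = 3               # hypothetical multiple allowed before suspension

extracted_today: dict[str, int] = defaultdict(int)
suspended: set[str] = set()

def record_extraction(actor: str, n_records: int) -> None:
    """Count what each actor pulls; suspend on unusual volume."""
    if actor in suspended:
        raise PermissionError(f"{actor} is suspended pending review")
    extracted_today[actor] += n_records
    if extracted_today[actor] > DAILY_BASELINE * TOLERANCE:
        suspended.add(actor)
        print(f"ALERT: {actor} suspended after pulling "
              f"{extracted_today[actor]:,} records today")

record_extraction("search-rival.example", 2_500_000)
record_extraction("search-rival.example", 1_000_000)  # triggers suspension
```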

Those actors should themselves be carefully vetted: AI wrappers, prototypes, experimental conversational agents, and entities with no operational track record should be excluded at least in the initial phase, and validated companies should be monitored and excluded at the first sign of misconduct.

Finally, assuming it is genuinely appropriate to expose the Google search data of European citizens to the scrutiny of non-European actors, would it not be simpler to radically change the distribution model? No daily API stream, no multiple copies — but instead an environment in which actors could work with the data without extracting it. In short: not sharing hypersensitive information, but rather sharing the capacity to exploit it. While this solution would carry its own cybersecurity risk — the honeypot effect attracting hackers like flies — it would at least follow one common-sense rule: you do not multiply copies of a secret; you limit access to it.
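A minimal sketch of that alternative, assuming a hypothetical enclave API and a hypothetical aggregation floor: recipients submit computations and receive only aggregates, while record-level rows never leave.

```python
# Sketch of the alternative model: a single copy of the record-level
# data stays inside an enclave; recipients get aggregates only.
# The enclave API and the aggregation floor are assumptions.
MIN_GROUP = 1_000  # hypothetical floor below which answers are refused

# Record-level data, never exported.
_records = [
    {"query": "weather paris", "clicked": 1},
    # ... in reality, millions of rows that never leave the enclave ...
]

def run_aggregate(metric: str, predicate) -> float | None:
    """Answer an aggregate question; refuse if the group is too small."""
    matching = [r for r in _records if predicate(r)]
    if len(matching) < MIN_GROUP:
        return None  # too few rows: the answer itself could re-identify
    return sum(r[metric] for r in matching) / len(matching)

# A competitor learns a segment-level click rate, but can never pull
# the underlying rows out of the enclave.
rate = run_aggregate("clicked", lambda r: "weather" in r["query"])
print(rate)  # None here: the toy dataset is below the floor
```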

Given the timeline the EU has imposed for correcting course, it is unfortunately to be feared that the die has already been cast.
