google search

Oddest ChatGPT leaks yet: Cringey chat logs found in Google analytics tool

AI, chatbot, chatbot leaks, chatgpt, Google, google search, google search console. privacy, openai, Policy, SEO / Rejus Almole / November 7, 2025

ChatGPT leaks seem to confirm OpenAI scrapes Google, expert says.

Credit: Aurich Lawson | Getty Images

For months, extremely personal and sensitive ChatGPT conversations have been leaking into an unexpected destination: Google Search Console (GSC), a tool that developers typically use to monitor search traffic, not lurk private chats.

Normally, when site managers access GSC performance reports, they see queries based on keywords or short phrases that Internet users type into Google to find relevant content. But starting this September, odd queries, sometimes more than 300 characters long, could also be found in GSC. Showing only user inputs, the chats appeared to be from unwitting people prompting a chatbot to help solve relationship or business problems, who likely expected those conversations would remain private.

Jason Packer, owner of an analytics consulting firm called Quantable, was among the first to flag the issue in a detailed blog last month.

Determined to figure out what exactly was causing the leaks, he teamed up with “Internet sleuth” and web optimization consultant Slobodan Manić. Together, they conducted testing that they believe may have surfaced “the first definitive proof that OpenAI directly scrapes Google Search with actual user prompts.” Their investigation seemed to confirm the AI giant was compromising user privacy, in some cases in order to maintain engagement by seizing search data that Google otherwise wouldn’t share.

OpenAI declined Ars’ request to confirm if Packer and Manić’s theory posed in their blog was correct or answer any of their remaining questions that could help users determine the scope of the problem.

However, an OpenAI spokesperson confirmed that the company was “aware” of the issue and has since “resolved” a glitch “that temporarily affected how a small number of search queries were routed.”

Packer told Ars that he’s “very pleased that OpenAI was able to resolve the issue quickly.” But he suggested that OpenAI’s response failed to confirm whether or not OpenAI was scraping Google, and that leaves room for doubt that the issue was completely resolved.

Google declined to comment.

“Weirder” than prior ChatGPT leaks

The first odd ChatGPT query to appear in GSC that Packer reviewed was a wacky stream-of-consciousness from a likely female user asking ChatGPT to assess certain behaviors to help her figure out if a boy who teases her had feelings for her. Another odd query seemed to come from an office manager sharing business information while plotting a return-to-office announcement.

These were just two of 200 odd queries—including “some pretty crazy ones,” Packer told Ars—that he reviewed on one site alone. In his blog, Packer concluded that the queries should serve as “a reminder that prompts aren’t as private as you think they are!”

Packer suspected that these queries were connected to reporting from The Information in August that cited sources claiming OpenAI was scraping Google search results to power ChatGPT responses. Sources claimed that OpenAI was leaning on Google to answer prompts to ChatGPT seeking information about current events, like news or sports.

OpenAI has not confirmed that it’s scraping Google search engine results pages (SERPs). However, Packer thinks his testing of ChatGPT leaks may be evidence that OpenAI not only scrapes “SERPs in general to acquire data,” but also sends user prompts to Google Search.

Manić helped Packer solve a big part of the riddle. He found that the odd queries were turning up in one site’s GSC because it ranked highly in Google Search for “https://openai.com/index/chatgpt/”—a ChatGPT URL that was appended at the start of every strange query turning up in GSC.

It seemed that Google had tokenized the URL, breaking it up into a search for keywords “openai + index + chatgpt.” Sites using GSC that ranked highly for those keywords were therefore likely to encounter ChatGPT leaks, Parker and Manić proposed, including sites that covered prior ChatGPT leaks where chats were being indexed in Google search results. Using their recommendations to seek out queries in GSC, Ars was able to verify similar strings.

“Don’t get confused though, this is a new and completely different ChatGPT screw-up than having Google index stuff we don’t want them to,” Packer wrote. “Weirder, if not as serious.”

It’s unclear what exactly OpenAI fixed, but Packer and Manić have a theory about one possible path for leaking chats. Visiting the URL that starts every strange query found in GSC, ChatGPT users encounter a prompt box that seemed buggy, causing “the URL of that page to be added to the prompt.” The issue, they explained, seemed to be that:

Normally ChatGPT 5 will choose to do a web search whenever it thinks it needs to, and is more likely to do that with an esoteric or recency-requiring search. But this bugged prompt box also contains the query parameter ‘hints=search’ to cause it to basically always do a search: https://chatgpt.com/?hints=search&openaicom_referred=true&model=gpt-5

Clearly some of those searches relied on Google, Packer’s blog said, mistakenly sending to GSC “whatever” the user says in the prompt box, with “https://openai.com/index/chatgpt/” text added to the front of it.” As Packer explained, “we know it must have scraped those rather than using an API or some kind of private connection—because those other options don’t show inside GSC.”

This means “that OpenAI is sharing any prompt that requires a Google Search with both Google and whoever is doing their scraping,” Packer alleged. “And then also with whoever’s site shows up in the search results! Yikes.”

To Packer, it appeared that “ALL ChatGPT prompts” that used Google Search risked being leaked during the past two months.

OpenAI claimed only a small number of queries were leaked but declined to provide a more precise estimate. So, it remains unclear how many of the 700 million people who use ChatGPT each week had prompts routed to GSC.

OpenAI’s response leaves users with “lingering questions”

After ChatGPT prompts were found surfacing in Google’s search index in August, OpenAI clarified that users had clicked a box making those prompts public, which OpenAI defended as “sufficiently clear.” The AI firm later scrambled to remove the chats from Google’s SERPs after it became obvious that users felt misled into sharing private chats publicly.

Packer told Ars that a major difference between those leaks and the GSC leaks is that users harmed by the prior scandal, at least on some level, “had to actively share” their leaked chats. In the more recent case, “nobody clicked share” or had a reasonable way to prevent their chats from being exposed.

“Did OpenAI go so fast that they didn’t consider the privacy implications of this, or did they just not care?” Packer posited in his blog.

Perhaps most troubling to some users—whose identities are not linked in chats unless their prompts perhaps share identifying information—there does not seem to be any way to remove the leaked chats from GSC, unlike the prior scandal.

Packer and Manić are left with “lingering questions” about how far OpenAI’s fix will go to stop the issue.

Manić was hoping OpenAI might confirm if prompts entered on https://chatgpt.com that trigger Google Search were also affected. But OpenAI did not follow up on that question, or a broader question about how big the leak was. To Manić, a major concern was that OpenAI’s scraping may be “contributing to ‘crocodile mouth’ in Google Search Console,” a troubling trend SEO researchers have flagged that causes impressions to spike but clicks to dip.

OpenAI also declined to clarify Packer’s biggest question. He’s left wondering if the company’s “fix” simply ended OpenAI’s “routing of search queries, such that raw prompts are no longer being sent to Google Search, or are they no longer scraping Google Search at all for data?

“We still don’t know if it’s that one particular page that has this bug or whether this is really widespread,” Packer told Ars. “In either case, it’s serious and just sort of shows how little regard OpenAI has for moving carefully when it comes to privacy.”

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Oddest ChatGPT leaks yet: Cringey chat logs found in Google analytics tool Read More »

Reddit sues to block Perplexity from scraping Google search results

AI, ai scraping, Google, google search, Perplexity, Perplexity.ai, Policy, reddit, Web Scraping / Mike M. / October 23, 2025

“Unable to scrape Reddit directly, they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search,” Lee said. “Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself.”

On Reddit, Perplexity pushed back on Reddit’s claims that Perplexity ignored requests to license Reddit content.

“Untrue. Whenever anyone asks us about content licensing, we explain that Perplexity, as an application-layer company, does not train AI models on content,” Perplexity said. “Never has. So, it is impossible for us to sign a license agreement to do so.”

Reddit supposedly “insisted we pay anyway, despite lawfully accessing Reddit data,” Perplexity said. “Bowing to strong arm tactics just isn’t how we do business.”

Perplexity’s spokesperson, Jesse Dwyer, told Ars the company chose to post its statement on Reddit “to illustrate a simple point.”

“It is a public Reddit link accessible to anyone, yet by the logic of Reddit’s lawsuit, if you mention it or cite it in any way (which is your job as a reporter), they might just sue you,” Dwyer said.

But Reddit claimed that its business and reputation have been “damaged” by “misappropriation of Reddit data and circumvention of technological control measures.” Without a licensing deal ensuring that Perplexity and others are respecting Reddit policies, Reddit cannot control who has access to data, how they’re using data, and if data use conflicts with Reddit’s privacy policy and user agreement, the complaint said.

Further, Reddit’s worried that Perplexity’s workaround could catch on, potentially messing up Reddit’s other licensing deals. All the while, Reddit noted, it has to invest “significant resources” in anti-scraping technology, with Reddit ultimately suffering damages, including “lost profits and business opportunities, reputational harm, and loss of user trust.”

Reddit’s hoping the court will grant an injunction barring companies from scraping Reddit content from Google SERPs. It also wants companies blocked from both selling Reddit data and “developing or distributing any technology or product that is used for the unauthorized circumvention of technological control measures and scraping of Reddit data.”

If Reddit wins, companies could be required to pay substantial damages or to disgorge profits from the sale of Reddit content.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder in Reddit.

Reddit sues to block Perplexity from scraping Google search results Read More »

Enough is enough—I dumped Google’s worsening search for Kagi

AI, AI assistant, ai search, degoogle, Features, Google, google search, kagi, kagi search, screwyoogle, search, search engines, SEO, Tech / Mike M. / August 5, 2025

I like how the search engine is the product instead of me.

“Won’t be needing this anymore!” Credit: Aurich “The King” Lawson

Mandatory AI summaries have come to Google, and they gleefully showcase hallucinations while confidently insisting on their truth. I feel about them the same way I felt about mandatory G+ logins when all I wanted to do was access my damn YouTube account: I hate them. Intensely.

But unlike those mandatory G+ logins—on which Google eventually relented before shutting down the G+ service—our reading of the tea leaves suggests that, this time, the search giant is extremely pleased with how things are going.

Fabricated AI dreck polluting your search? It’s the new normal. Miss your little results page with its 10 little blue links? Too bad. They’re gone now, and you can’t get them back, no matter what ephemeral workarounds or temporarily functional flags or undocumented, could-fail-at-any-time URL tricks you use.

And the galling thing is that Google expects you to be a good consumer and just take it. The subtext of the company’s (probably AI-generated) robo-MBA-speak non-responses to criticism and complaining is clear: “LOL, what are you going to do, use a different search engine? Now, shut up and have some more AI!”

But like the old sailor used to say: “That’s all I can stands, and I can’t stands no more.” So I did start using a different search engine—one that doesn’t constantly shower me with half-baked, anti-consumer AI offerings.

Out with Google, in with Kagi.

What the hell is a Kagi?

Kagi was founded in 2018, but its search product has only been publicly available since June 2022. It purports to be an independent search engine that pulls results from around the web (including from its own index) and is aimed at returning search to a user-friendly, user-focused experience. The company’s stated purpose is to deliver useful search results, full stop. The goal is not to blast you with AI garbage or bury you in “Knowledge Graph” summaries hacked together from posts in a 12-year-old Reddit thread between two guys named /u/WeedBoner420 and /u/14HitlerWasRight88.

Kagi’s offerings (it has a web browser, too, though I’ve not used it) are based on a simple idea. There’s an (oversimplified) axiom that if a good or service (like Google search, for example, or good ol’ Facebook) is free for you to use, it’s because you’re the product, not the customer. With Google, you pay with your attention, your behavioral metrics, and the intimate personal details of your wants and hopes and dreams (and the contents of your emails and other electronic communications—Google’s got most of that, too).

With Kagi, you pay for the product using money. That’s it! You give them some money, and you get some service—great service, really, which I’m overall quite happy with and which I’ll get to shortly. You don’t have to look at any ads. You don’t have to look at AI droppings. You don’t have to give perpetual ownership of your mind-palace to a pile of optioned-out tech bros in sleeveless Patagonia vests while you are endlessly subjected to amateur AI Rorschach tests every time you search for “pierogis near me.”

How much money are we talking?

I dunno, about a hundred bucks a year? That’s what I’m spending as an individual for unlimited searches. I’m using Kagi’s “Professional” plan, but there are others, including a free offering so that you can poke around and see if the service is worth your time.

image of kagi billing panel — This is my account’s billing page, showing what I’ve paid for Kagi in the past year. (By the time this article runs, I’ll have renewed my subscription!) Credit: Lee Hutchinson

I’d previously bounced off two trial runs with Kagi in 2023 and 2024 because the idea of paying for search just felt so alien. But that was before Google’s AI enshittification rolled out in full force. Now, sitting in the middle of 2025 with the world burning down around me, a hundred bucks to kick Google to the curb and get better search results feels totally worth it. Your mileage may vary, of course.

The other thing that made me nervous about paying for search was the idea that my money was going to enrich some scumbag VC fund, but fortunately, there’s good news on that front. According to the company’s “About” page, Kagi has not taken any money from venture capitalist firms. Instead, it has been funded by a combination of self-investment by the founder, selling equity to some Kagi users in two rounds, and subscription revenue:

Kagi was bootstrapped from 2018 to 2023 with ~$3M initial funding from the founder. In 2023, Kagi raised $670K from Kagi users in its first external fundraise, followed by $1.88M raised in 2024, again from our users, bringing the number of users-investors to 93… In early 2024, Kagi became a Public Benefit Corporation (PBC).

What about DuckDuckGo? Or Bing? Or Brave?

Sure, those can be perfectly cromulent alternatives to Google, but honestly, I don’t think they go far enough. DuckDuckGo is fine, but it largely utilizes Bing’s index; and while DuckDuckGo exercises considerable control over its search results, the company is tied to the vicissitudes of Microsoft by that index. It’s a bit like sitting in a boat tied to a submarine. Sure, everything’s fine now, but at some point, that sub will do what subs do—and your boat is gonna follow it down.

And as for Bing itself, perhaps I’m nitpicky [Ed. note: He is!], but using Bing feels like interacting with 2000-era MSN’s slightly perkier grandkid. It’s younger and fresher, yes, but it still radiates that same old stanky feeling of taste-free, designed-by-committee artlessness. I’d rather just use Google—which is saying something. At least Google’s search home page remains uncluttered.

Brave Search is another fascinating option I haven’t spent a tremendous amount of time with, largely because Brave’s cryptocurrency ties still feel incredibly low-rent and skeevy. I’m slowly warming up to the Brave Browser as a replacement for Chrome (see the screenshots in this article!), but I’m just not comfortable with Brave yet—and likely won’t be unless the company divorces itself from cryptocurrencies entirely.

More anonymity, if you want it

The feature that convinced me to start paying for Kagi was its Privacy Pass option. Based on a clean-sheet Rust implementation of the Privacy Pass standard (IETF RFCs 9576, 9577, and 9578) by Raphael Robert, this is a technology that uses cryptographic token-based auth to send an “I’m a paying user, please give me results” signal to Kagi, without Kagi knowing which user made the request. (There’s a much longer Kagi blog post with actual technical details for the curious.)

To search using the tool, you install the Privacy Pass extension (linked in the docs above) in your browser, log in to Kagi, and enable the extension. This causes the plugin to request a bundle of tokens from the search service. After that, you can log out and/or use private windows, and those tokens are utilized whenever you do a Kagi search.

image of a kagi search with privacy pass enabled — Privacy pass is enabled, allowing me to explore the delicious mystery of pierogis with some semblance of privacy. Credit: Lee Hutchinson

The obvious flaw here is that Kagi still records source IP addresses along with Privacy Pass searches, potentially de-anonymizing them, but there’s a path around that: Privacy Pass functions with Tor, and Kagi maintains a Tor onion address for searches.

So why do I keep using Privacy Pass without Tor, in spite of the opsec flaw? Maybe it’s the placebo effect in action, but I feel better about putting at least a tiny bit of friction in the way of someone with root attempting to casually browse my search history. Like, I want there to be at least a SQL JOIN or two between my IP address and my searches for “best Mass Effect alien sex choices” or “cleaning tips for Garrus body pillow.” I mean, you know, assuming I were ever to search for such things.

What’s it like to use?

Moving on with embarrassed rapidity, let’s look at Kagi a bit and see how using it feels.

My anecdotal observation is that Kagi doesn’t favor Reddit-based results nearly as much as Google does, but sometimes it still has them near or at the top. And here is where Kagi curb-stomps Google with quality-of-life features: Kagi lets you prioritize or de-prioritize a website’s prominence in your search results. You can even pin that site to the top of the screen or block it completely.

This is a feature I’ve wanted Google to get for about 25 damn years but that the company has consistently refused to properly implement (likely because allowing users to exclude sites from search results notionally reduces engagement and therefore reduces the potential revenue that Google can extract from search). Well, screw you, Google, because Kagi lets me prioritize or exclude sites from my results, and it works great—I’m extraordinarily pleased to never again have to worry about Quora or Pinterest links showing up in my search results.

Further, Kagi lets me adjust these settings both for the current set of search results (if you don’t want Reddit results for this search but you don’t want to drop Reddit altogether) and also globally (for all future searches):

image of kagi search personalization options — Goodbye forever, useless crap sites. Credit: Lee Hutchinson

Another tremendous quality-of-life improvement comes via Kagi’s image search, which does a bunch of stuff that Google should and/or used to do—like giving you direct right-click access to save images without having to fight the search engine with workarounds, plugins, or Tampermonkey-esque userscripts.

The Kagi experience is also vastly more customizable than Google’s (or at least, how Google’s has become). The widgets that appear in your results can be turned off, and the “lenses” through which Kagi sees the web can be adjusted to influence what kinds of things do and do not appear in your results.

If that doesn’t do it for you, how about the ability to inject custom CSS into your search and landing pages? Or to automatically rewrite search result URLs to taste, doing things like redirecting reddit.com to old.reddit.com? Or breaking free of AMP pages and always viewing originals instead?

Image of kagi custom css field — Imagine all the things Ars readers will put here. Credit: Lee Hutchinson

Is that all there is?

Those are really all the features I care about, but there are loads of other Kagi bits to discover—like a Kagi Maps tool (it’s pretty good, though I’m not ready to take it up full time yet) and a Kagi video search tool. There are also tons of classic old-Google-style inline search customizations, including verbatim mode, where instead of trying to infer context about your search terms, Kagi searches for exactly what you put in the box. You can also add custom search operators that do whatever you program them to do, and you get API-based access for doing programmatic things with search.

A quick run-through of a few additional options pages. This is the general customization page. Lee Hutchinson

I haven’t spent any time with Kagi’s Orion browser, but it’s there as an option for folks who want a WebKit-based browser with baked-in support for Privacy Pass and other Kagi functionality. For now, Firefox continues to serve me well, with Brave as a fallback for working with Google Docs and other tools I can’t avoid and that treat non-Chromium browsers like second-class citizens. However, Orion is probably on the horizon for me if things in Mozilla-land continue to sour.

Cool, but is it any good?

Rather than fill space with a ton of comparative screenshots between Kagi and Google or Kagi and Bing, I want to talk about my subjective experience using the product. (You can do all the comparison searches you want—just go and start searching—and your comparisons will be a lot more relevant to your personal use cases than any examples I can dream up!)

My time with Kagi so far has included about seven months of casual opportunistic use, where I’d occasionally throw a query at it to see how it did, and about five months of committed daily use. In the five months of daily usage, I can count on one hand the times I’ve done a supplementary Google search because Kagi didn’t have what I was looking for on the first page of results. I’ve done searches for all the kinds of things I usually look for in a given day—article fact-checking queries, searches for details about the parts of speech, hunts for duck facts (we have some feral Muscovy ducks nesting in our front yard), obscure technical details about Project Apollo, who the hell played Dupont in Equilibrium (Angus Macfadyen, who also played Robert the Bruce in Braveheart), and many, many other queries.

Image of Firefox history window showing kagi searches for july 22 — A typical afternoon of Kagi searches, from my Firefox history window. Credit: Lee Hutchinson

For all of these things, Kagi has responded quickly and correctly. The time to service a query feels more or less like Google’s service times; according to the timer at the top of the page, my Kagi searches complete in between 0.2 and 0.8 seconds. Kagi handles misspellings in search terms with the grace expected of a modern search engine and has had no problem figuring out my typos.

Holistically, taking search customizations into account on top of the actual search performance, my subjective assessment is that Kagi gets me accurate, high-quality results on more or less any given query, and it does so without festooning the results pages with features I find detractive and irrelevant.

I know that’s not a data-driven assessment, and it doesn’t fall back on charts or graphs or figures, but it’s how I feel after using the product every single day for most of 2025 so far. For me, Kagi’s search performance is firmly in the “good enough” category, and that’s what I need.

Kagi and AI

Unfortunately, the thing that’s stopping me from being completely effusive in my praise is that Kagi is exhibiting a disappointing amount of “keeping-up-with-the-Joneses” by rolling out a big ‘ol pile of (optional, so far) AI-enabled search features.

A blog post from founder Vladimir Prelovac talks about the company’s use of AI, and it says all the right things, but at this point, I trust written statements from tech company founders about as far as I can throw their corporate office buildings. (And, dear reader, that ain’t very far).

image of kagi ai features — No thanks. But I would like to exclude AI images from my search results, please. Credit: Lee Hutchinson

The short version is that, like Google, Kagi has some AI features: There’s an AI search results summarizer, an AI page summarizer, and an “ask questions about your results” chatbot-style function where you can interactively interrogate an LLM about your search topic and results. So far, all of these things can be disabled or ignored. I don’t know how good any of the features are because I have disabled or ignored them.

If the existence of AI in a product is a bright red line you won’t cross, you’ll have to turn back now and find another search engine alternative that doesn’t use AI and also doesn’t suck. When/if you do, let me know, because the pickings are slim.

Is Kagi for you?

Kagi might be for you—especially if you’ve recently typed a simple question into Google and gotten back a pile of fabricated gibberish in place of those 10 blue links that used to serve so well. Are you annoyed that Google’s search sucks vastly more now than it did 10 years ago? Are you unhappy with how difficult it is to get Google search to do what you want? Are you fed up? Are you pissed off?

If your answer to those questions is the same full-throated “Hell yes, I am!” that mine was, then perhaps it’s time to try an alternative. And Kagi’s a pretty decent one—if you’re not averse to paying for it.

It’s a fantastic feeling to type in a search query and once again get useful, relevant, non-AI results (that I can customize!). It’s a bit of sanity returning to my Internet experience, and I’m grateful. Until Kagi is bought by a value-destroying vampire VC fund or implodes into its own AI-driven enshittification cycle, I’ll probably keep paying for it.

After that, who knows? Maybe I’ll throw away my computers and live in a cave. At least until the cave’s robot exclusion protocol fails and the Googlebot comes for me.

Lee is the Senior Technology Editor, and oversees story development for the gadget, culture, IT, and video sections of Ars Technica. A long-time member of the Ars OpenForum with an extensive background in enterprise storage and security, he lives in Houston.

Enough is enough—I dumped Google’s worsening search for Kagi Read More »

ChatGPT users shocked to learn their chats were in Google search results

AI, chatbot, chatgpt, Google, google search, Online Privacy, openai, Policy, search engines, search results / Paul Patrick / August 2, 2025

Faced with mounting backlash, OpenAI removed a controversial ChatGPT feature that caused some users to unintentionally allow their private—and highly personal—chats to appear in search results.

Fast Company exposed the privacy issue on Wednesday, reporting that thousands of ChatGPT conversations were found in Google search results and likely only represented a sample of chats “visible to millions.” While the indexing did not include identifying information about the ChatGPT users, some of their chats did share personal details—like highly specific descriptions of interpersonal relationships with friends and family members—perhaps making it possible to identify them, Fast Company found.

OpenAI’s chief information security officer, Dane Stuckey, explained on X that all users whose chats were exposed opted in to indexing their chats by clicking a box after choosing to share a chat.

Fast Company noted that users often share chats on WhatsApp or select the option to save a link to visit the chat later. But as Fast Company explained, users may have been misled into sharing chats due to how the text was formatted:

“When users clicked ‘Share,’ they were presented with an option to tick a box labeled ‘Make this chat discoverable.’ Beneath that, in smaller, lighter text, was a caveat explaining that the chat could then appear in search engine results.”

At first, OpenAI defended the labeling as “sufficiently clear,” Fast Company reported Thursday. But Stuckey confirmed that “ultimately,” the AI company decided that the feature “introduced too many opportunities for folks to accidentally share things they didn’t intend to.” According to Fast Company, that included chats about their drug use, sex lives, mental health, and traumatic experiences.

Carissa Veliz, an AI ethicist at the University of Oxford, told Fast Company she was “shocked” that Google was logging “these extremely sensitive conversations.”

OpenAI promises to remove Google search results

Stuckey called the feature a “short-lived experiment” that OpenAI launched “to help people discover useful conversations.” He confirmed that the decision to remove the feature also included an effort to “remove indexed content from the relevant search engine” through Friday morning.

ChatGPT users shocked to learn their chats were in Google search results Read More »

Cloudflare wants Google to change its AI search crawling. Google likely won’t.

AI, ai crawler, ai scraping, AI training, Artificial Intelligence, cloudflare, Google, google ai overviews, google search, googlebot, Policy / Mike M. / July 10, 2025

Ars could not immediately find any legislation that seemed to match Prince’s description, and Cloudflare did not respond to Ars’ request to comment. Passing tech laws is notoriously hard, though, partly because technology keeps advancing as policy debates drag on, and challenges with regulating artificial intelligence are an obvious example of that pattern today.

Google declined Ars’ request to confirm whether talks were underway or if the company was open to separating its crawlers.

Although Cloudflare singled out Google, other search engines that view AI search features as part of their search products also use the same bots for training as they do for search indexing. It seems likely that Cloudflare’s proposed legislation would face resistance from tech companies in a similar position to Google, as The Wall Street Journal reported that the tech companies “have few incentives to work with intermediaries.”

Additionally, Cloudflare’s initiative faces criticism from those who “worry that academic research, security scans, and other types of benign web crawling will get elbowed out of websites as barriers are built around more sites” through Cloudflare’s blocks and paywalls, the WSJ reported. Cloudflare’s system could also threaten web projects like The Internet Archive, which notably played a crucial role in helping track data deleted from government websites after Donald Trump took office.

Among commenters discussing Cloudflare’s claims about Google on Search Engine Round Table, one user suggested Cloudflare may risk a lawsuit or other penalties from Google for poking the bear.

Ars will continue monitoring for updates on Cloudflare’s attempts to get Google on board with its plan.

Cloudflare wants Google to change its AI search crawling. Google likely won’t. Read More »

Google’s AI Mode search can now answer questions about images

AI, ai search, Artificial Intelligence, Google, Google Gemini, google search, Tech / Rejus Almole / April 8, 2025

Google started cramming AI features into search in 2024, but last month marked an escalation. With the release of AI Mode, Google previewed a future in which searching the web does not return a list of 10 blue links. Google says it’s getting positive feedback on AI Mode from users, so it’s forging ahead by adding multimodal functionality to its robotic results.

AI Mode relies on a custom version of the Gemini large language model (LLM) to produce results. Google confirms that this model now supports multimodal input, which means you can now show images to AI Mode when conducting a search.

As this change rolls out, the search bar in AI Mode will gain a new button that lets you snap a photo or upload an image. The updated Gemini model can interpret the content of images, but it gets a little help from Google Lens. Google notes that Lens can identify specific objects in the images you upload, passing that context along so AI Mode can make multiple sub-queries, known as a “fan-out technique.”

Google illustrates how this could work in the example below. The user shows AI Mode a few books, asking questions about similar titles. Lens identifies each individual title, allowing AI Mode to incorporate the specifics of the books into its response. This is key to the model’s ability to suggest similar books and make suggestions based on the user’s follow-up question.

Google’s AI Mode search can now answer questions about images Read More »

It’s easier than ever to scrub your personal info from Google Search

Google, google search, personal data / Rejus Almole / February 27, 2025

As Google’s 2024 antitrust loss proved, the company has worked very, very hard to ensure its search engine is the primary roadmap for the Internet. Google scours the Internet for data about everything—even you. And if you don’t want your personal info to wind up in Google search results, you can use the just-redesigned “Results About You” tool. The tool, which began its rollout in 2022, is easier to use now, and some of the most useful features are now better integrated with search results.

The first step in using Results About You—which has not changed—is a bit alarming when you’ve set out to obscure your personal information. Just head to the new hub for Results About You and enter your personal information. Google probably already knows your phone number, email, and even physical address, but this tells the tool what specific information to pluck out of search results. If that data is out there, Google has it whether or not you remove it from search results.

Before this update, most of the Results About You features were limited to this console, but the most important features are now integrated with the search results. They’re not exactly prominently displayed, though. When scrolling through a Google search (after the AI overview, ads, knowledge graph, and more ads), you can use the three-dot menu next to a result to get data about it. This menu now includes options to remove the result right at the top.

If you request a removal due to the presence of personal information, Google will ask for more details, but that only takes a few seconds. The same interface includes non-personal removal requests—for example, if you’ve spotted illegal content. If you’re requesting a personal data removal, it has to be your data. These requests are logged in the Results About You tool for later review. Importantly, Google can’t remove content from webpages—you’re on your own there.

It’s easier than ever to scrub your personal info from Google Search Read More »

Google’s plan to keep AI out of search trial remedies isn’t going very well

AI, ai search, Antitrust law, copilot, generative ai, Google, google search, google search monopoly, microsoft, Policy, search engines / Kris Guyer / November 27, 2024

DOJ: AI is not its own market

Judge: AI will likely play “larger role” in Google search remedies as market shifts.

Google got some disappointing news at a status conference Tuesday, where US District Judge Amit Mehta suggested that Google’s AI products may be restricted as an appropriate remedy following the government’s win in the search monopoly trial.

According to Law360, Mehta said that “the recent emergence of AI products that are intended to mimic the functionality of search engines” is rapidly shifting the search market. Because the judge is now weighing preventive measures to combat Google’s anticompetitive behavior, the judge wants to hear much more about how each side views AI’s role in Google’s search empire during the remedies stage of litigation than he did during the search trial.

“AI and the integration of AI is only going to play a much larger role, it seems to me, in the remedy phase than it did in the liability phase,” Mehta said. “Is that because of the remedies being requested? Perhaps. But is it also potentially because the market that we have all been discussing has shifted?”

To fight the DOJ’s proposed remedies, Google is seemingly dragging its major AI rivals into the trial. Trying to prove that remedies would harm Google’s ability to compete, the tech company is currently trying to pry into Microsoft’s AI deals, including its $13 billion investment in OpenAI, Law360 reported. At least preliminarily, Mehta has agreed that information Google is seeking from rivals has “core relevance” to the remedies litigation, Law360 reported.

The DOJ has asked for a wide range of remedies to stop Google from potentially using AI to entrench its market dominance in search and search text advertising. They include a ban on exclusive agreements with publishers to train on content, which the DOJ fears might allow Google to block AI rivals from licensing data, potentially posing a barrier to entry in both markets. Under the proposed remedies, Google would also face restrictions on investments in or acquisitions of AI products, as well as mergers with AI companies.

Additionally, the DOJ wants Mehta to stop Google from any potential self-preferencing, such as making an AI product mandatory on Android devices Google controls or preventing a rival from distribution on Android devices.

The government seems very concerned that Google may use its ownership of Android to play games in the emerging AI sector. They’ve further recommended an order preventing Google from discouraging partners from working with rivals, degrading the quality of rivals’ AI products on Android devices, or otherwise “coercing” manufacturers or other Android partners into giving Google’s AI products “better treatment.”

Importantly, if the court orders AI remedies linked to Google’s control of Android, Google could risk a forced sale of Android if Mehta grants the DOJ’s request for “contingent structural relief” requiring divestiture of Android if behavioral remedies don’t destroy the current monopolies.

Finally, the government wants Google to be required to allow publishers to opt out of AI training without impacting their search rankings. (Currently, opting out of AI scraping automatically opts sites out of Google search indexing.)

All of this, the DOJ alleged, is necessary to clear the way for a thriving search market as AI stands to shake up the competitive landscape.

“The promise of new technologies, including advances in artificial intelligence (AI), may present an opportunity for fresh competition,” the DOJ said in a court filing. “But only a comprehensive set of remedies can thaw the ecosystem and finally reverse years of anticompetitive effects.”

At the status conference Tuesday, DOJ attorney David Dahlquist reiterated to Mehta that these remedies are needed so that Google’s illegal conduct in search doesn’t extend to this “new frontier” of search, Law360 reported. Dahlquist also clarified that the DOJ views these kinds of AI products “as new access points for search, rather than a whole new market.”

“We’re very concerned about Google’s conduct being a barrier to entry,” Dahlquist said.

Google could not immediately be reached for comment. But the search giant has maintained that AI is beyond the scope of the search trial.

During the status conference, Google attorney John E. Schmidtlein disputed that AI remedies are relevant. While he agreed that “AI is key to the future of search,” he warned that “extraordinary” proposed remedies would “hobble” Google’s AI innovation, Law360 reported.

Microsoft shields confidential AI deals

Microsoft is predictably protective of its AI deals, arguing in a court filing that its “highly confidential agreements with OpenAI, Perplexity AI, Inflection, and G42 are not relevant to the issues being litigated” in the Google trial.

According to Microsoft, Google is arguing that it needs this information to “shed light” on things like “the extent to which the OpenAI partnership has driven new traffic to Bing and otherwise affected Microsoft’s competitive standing” or what’s required by “terms upon which Bing powers functionality incorporated into Perplexity’s search service.”

These insights, Google seemingly hopes, will convince Mehta that Google’s AI deals and investments are the norm in the AI search sector. But Microsoft is currently blocking access, arguing that “Google has done nothing to explain why” it “needs access to the terms of Microsoft’s highly confidential agreements with other third parties” when Microsoft has already offered to share documents “regarding the distribution and competitive position” of its AI products.

Microsoft also opposes Google’s attempts to review how search click-and-query data is used to train OpenAI’s models. Those requests would be better directed at OpenAI, Microsoft said.

If Microsoft gets its way, Google’s discovery requests will be limited to just Microsoft’s content licensing agreements for Copilot. Microsoft alleged those are the only deals “related to the general search or the general search text advertising markets” at issue in the trial.

On Tuesday, Microsoft attorney Julia Chapman told Mehta that Microsoft had “agreed to provide documents about the data used to train its own AI model and also raised concerns about the competitive sensitivity of Microsoft’s agreements with AI companies,” Law360 reported.

It remains unclear at this time if OpenAI will be forced to give Google the click-and-query data Google seeks. At the status hearing, Mehta ordered OpenAI to share “financial statements, information about the training data for ChatGPT, and assessments of the company’s competitive position,” Law360 reported.

But the DOJ may also be interested in seeing that data. In their proposed final judgment, the government forecasted that “query-based AI solutions” will “provide the most likely long-term path for a new generation of search competitors.”

Because of that prediction, any remedy “must prevent Google from frustrating or circumventing” court-ordered changes “by manipulating the development and deployment of new technologies like query-based AI solutions.” Emerging rivals “will depend on the absence of anticompetitive constraints to evolve into full-fledged competitors and competitive threats,” the DOJ alleged.

Mehta seemingly wants to see the evidence supporting the DOJ’s predictions, which could end up exposing carefully guarded secrets of both Google’s and its biggest rivals’ AI deals.

On Tuesday, the judge noted that integration of AI into search engines had already evolved what search results pages look like. And from his “very layperson’s perspective,” it seems like AI’s integration into search engines will continue moving “very quickly,” as both parties seem to agree.

Whether he buys into the DOJ’s theory that Google could use its existing advantage as the world’s greatest gatherer of search query data to block rivals from keeping pace is still up in the air, but the judge seems moved by the DOJ’s claim that “AI has the ability to affect market dynamics in these industries today as well as tomorrow.”

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Google’s plan to keep AI out of search trial remedies isn’t going very well Read More »

Welcome to Google’s nightmare: US reveals plan to destroy search monopoly

android, Antitrust law, duckduckgo, Google, google ai overviews, google chrome, google search, google search monopoly, Policy, us department of justice / Tim Belzer / November 21, 2024

Hepner expects that the DOJ plan may be measured enough that the court may only “be interested in a nip-tuck, not a wholesale revision of what plaintiffs have put forward.”

Kamyl Bazbaz, SVP of public affairs for Google’s more privacy-focused rival DuckDuckGo, released a statement agreeing with Hepner.

“The government has put forward a proposal that would free the search market from Google’s illegal grip and unleash a new era of innovation, investment, and competition,” Bazbaz said. “There’s nothing radical about this proposal: It’s firmly based on the court’s extensive finding of fact and proposes solutions in line with previous antitrust actions.”

Bazbaz accused Google of “cynically” invoking privacy among chief concerns with a forced Chrome sale. That “is rich coming from the Internet’s biggest tracker,” Bazbaz said.

Will Apple finally compete with Google in search?

The remedies the DOJ has proposed could potentially be game-changing, Bazbaz told Ars, not just for existing rivals but also new rivals and startups the court found were previously unable to enter the market while it was under Google’s control.

If the DOJ gets its way, Google could be stuck complying with these proposed remedies for 10 years. But if the company can prove after five years that competition has substantially increased and it controls less than 50 percent of the market, the remedies could be terminated early, the DOJ’s proposed final judgment order said.

That’s likely cold comfort for Google as it prepares to fight the DOJ’s plan to break up its search empire and potentially face major new competitors. The biggest risk to Google’s dominance in AI search could even be its former partner, whom the court found was being paid handsomely to help prop up Google’s search monopoly: Apple.

On X (formerly Twitter), Hepner said that cutting off Google’s $20 billion payments to Apple for default placements in Safari alone could “have a huge effect and may finally kick Apple to enter the market itself.”

Welcome to Google’s nightmare: US reveals plan to destroy search monopoly Read More »

Google loses DOJ’s big monopoly trial over search business

Antitrust law, department of justice, Google, google monopoly trial, google search, Policy / Mike M. / August 5, 2024

Huge loss for Google —

Google’s exclusive deals maintained monopolies in two markets, judge ruled.

Ashley Belanger – Aug 5, 2024 7: 30 pm UTC

Google just lost a massive antitrust trial over its sprawling search business, as US district judge Amit Mehta released his ruling, showing that he sided with the US Department of Justice in the case that could disrupt how billions of people search the web.

“Google is a monopolist, and it has acted as one to maintain its monopoly,” Mehta wrote in his opinion. “It has violated Section 2 of the Sherman Act.”

The verdict will likely come as a shock to Google, which had long argued that punishing Google for being the best in search would be “unprecedented” and frequently pointed to the DOJ’s lack of direct evidence. However, Mehta found the limited direct evidence compelling, especially “Google’s admission that it does not ‘consider whether users will go to other specific search providers (general or otherwise) if it introduces a change to its Search product.'”

“Google’s indifference is unsurprising,” Mehta wrote. “In 2020, Google conducted a quality degradation study, which showed that it would not lose search revenue if were to significantly reduce the quality of its search product. Just as the power to raise price ‘when it is desired to do so’ is proof of monopoly power, so too is the ability to degrade product quality without concern of losing consumers.”

He also wrote that the DOJ’s indirect evidence “easily establishes Google’s monopoly power in search” and concluded that “the fact that Google makes product changes without concern that its users might go elsewhere is something only a firm with monopoly power could do.”

Google didn’t lose every battle in this big fight with the DOJ. Mehta ruled that Google did not have monopoly power in search advertising, agreed that there was no market for general search advertising, and declined to sanction Google for allegedly destroying evidence by “failing to preserve its employees’ chat messages.”

Google’s president of global affairs, Kent Walker, provided a statement to Ars, confirming that Google plans to appeal.

“This decision recognizes that Google offers the best search engine, but concludes that we shouldn’t be allowed to make it easily available,” Walker said. “We appreciate the Court’s finding that Google is ‘the industry’s highest quality search engine, which has earned Google the trust of hundreds of millions of daily users,’ that Google ‘has long been the best search engine, particularly on mobile devices,’ ‘has continued to innovate in search,’ and that ‘Apple and Mozilla occasionally assess Google’s search quality relative to its rivals and find Google’s to be superior.’ Given this, and that people are increasingly looking for information in more and more ways, we plan to appeal. As this process continues, we will remain focused on making products that people find helpful and easy to use.”

Google monopolizes two markets, judge ruled

Mehta ruled that Google spending billions on exclusive distribution agreements with companies like Apple helped the tech giant maintain monopolies in two markets, general search services and general text advertising.

The US government had argued that Google used these exclusive deals to block out competitors like Bing or DuckDuckGo, “by ensuring that all of Android and Apple and mobile users are offered Google, either as the default general search engine or the only general search engine, Google’s deals with Android and Apple clearly have a significant effect in preserving its monopoly.” The DOJ successfully argued that blocks rivals from reaching the “critical level necessary” to “pose a real threat to Google’s monopoly.”

Mehta noted that Google’s dominance had “gone unchallenged for well over a decade,” partly due to a “largely unseen advantage over its rivals: default distribution.” He found that Google’s exclusive distribution deals foreclosed a “substantial share” of the markets and allowed Google to earn more revenues. Google then shared spiking revenues with device and browser developers—spending up to $26 billion in 2021 alone for exclusive deals, the trial revealed.

Google did all this, Mehta said, to ensure that “most devices in the United States come preloaded exclusively with Google” and to force “Google’s rivals to find other ways to reach users.” The DOJ successfully argued that this posed “significant barriers that protect Google’s market dominance in general search,” with rivals having to overcome “high capital costs—”to the tune of billions of dollars,” Mehta wrote—”Google’s control of key distribution channels, brand recognition, and scale.”

Barriers to entry in general text advertising are similarly “high,” Mehta said, with new entrants facing “the same major obstacles as would the developer of a new” search engine.

One of the most scrutinized exclusive deals was between Google and Apple, which was estimated at $20 billion in 2022. “This is nearly double the payment made in 2020,” Mehta noted, suggesting that Google increasingly valued the deal locking its search engine as the default in Safari as a way to shore up its search dominance.

“Google has long recognized that, if Apple were to develop and deploy its own search engine as the default” search tool “in Safari, it would come at great cost to Google,” Mehta wrote. Without the deal, Google “would lose around 65 percent of its revenue, even assuming that it could retain some users without the Safari default” placement. But “Apple has decided not to enter general search,” Mehta said, likely because it “would forego significant revenues” and potentially face user backlash if it stopped partnering with Google. Similarly high revenue loss would occur if “Google were to lose the Android defaults,” Mehta said.

None of the pro-competitive benefits that Google claimed justified the exclusive deals persuaded Mehta, who ruled that “importantly,” Google “exercised its monopoly power by charging supracompetitive prices for general search text ads”—and thus earned “monopoly profits.”

“That Google makes changes to its text ads auctions without considering its rivals’ prices is something that only a firm with monopoly power is able to do,” Mehta wrote. And “Google in fact has profitably raised prices substantially above the competitive level. That makes ‘the existence of monopoly power” “clear.”

Ultimately, Mehta ruled that “Google has no true competitor” in general search and without any “genuine” competition, “over the last decade, Google’s grip on the market has only grown stronger.” Further, he found that “Google understands there is no genuine competition for the defaults because it knows that its partners cannot afford to go elsewhere,” disagreeing with Google’s arguments that the default deals were not exclusive.

“The key question then is this: Do Google’s exclusive distribution contracts reasonably appear capable of significantly contributing to maintaining Google’s monopoly power in the general search services market?” Mehta wrote. “The answer is ‘yes.'”

This is a developing story and is being updated.

Google loses DOJ’s big monopoly trial over search business Read More »

Google won’t downrank top deepfake porn sites unless victims mass report

ai deepfakes, ai sex images, deepfake porn, explicit deepfakes, Google, google search, non-consensual intimate imagery, Policy / Rejus Almole / July 31, 2024

Today, Google announced new measures to combat the rapidly increasing spread of AI-generated non-consensual explicit deepfakes in its search results.

Because of “a concerning increase in generated images and videos that portray people in sexually explicit contexts, distributed on the web without their consent,” Google said that it consulted with “experts and victim-survivors” to make some “significant updates” to its widely used search engine to “further protect people.”

Specifically, Google made it easier for targets of fake explicit images—which experts have said are overwhelmingly women—to report and remove deepfakes that surface in search results. Additionally, Google took steps to downrank explicit deepfakes “to keep this type of content from appearing high up in Search results,” the world’s leading search engine said.

Victims of deepfake pornography have previously criticized Google for not being more proactive in its fight against deepfakes in search results. Surfacing images and reporting each one is a “time- and energy-draining process” and “constant battle,” Kaitlyn Siragusa, a Twitch gamer with an explicit OnlyFans frequently targeted by deepfakes, told Bloomberg last year.

In response, Google has worked to “make the process easier,” partly by “helping people address this issue at scale.” Now, when a victim submits a removal request, “Google’s systems will also aim to filter all explicit results on similar searches about them,” Google’s blog said. And once a deepfake is “successfully removed,” Google “will scan for—and remove—any duplicates of that image that we find,” the blog said.

Google’s efforts to downrank harmful fake content have also expanded, the tech giant said. To help individuals targeted by deepfakes, Google will now “lower explicit fake content for” searches that include people’s names. According to Google, this step alone has “reduced exposure to explicit image results on these types of queries by over 70 percent.”

However, Google still seems resistant to downranking general searches that might lead people to harmful content. A quick Google search confirms that general searches with keywords like “celebrity nude deepfake” point searchers to popular destinations where they can search for non-consensual intimate images of celebrities or request images of less famous people.

For victims, the bottom line is that problematic links will still appear in Google’s search results for anyone willing to keep scrolling or anyone intentionally searching for “deepfakes.” The only step Google has taken recently to downrank top deepfake sites like Fan-Topia or MrDeepFakes is a promise to demote “sites that have received a high volume of removals for fake explicit imagery.”

It’s currently unclear what Google considers a “high volume,” and Google declined Ars’ request to comment on whether these sites would be downranked eventually. Instead, a Google spokesperson told Ars that “if we receive a high volume of successful removal sites from a specific website under this policy, we will use that as a ranking signal and demote the site in question for queries where the site might surface.”

Currently, Google’s spokesperson said, Google is focused on downranking “queries that include the names of individuals,” which “have the highest potential for individual harm.” But more queries will be downranked in the coming months, Google’s spokesperson said, and Google continues to tackle the “technical challenge for search engines” of differentiating between “explicit content that’s real and consensual (like an actor’s nude scenes)” and “explicit fake content (like deepfakes featuring said actor),” Google’s blog said.

“This is an ongoing effort, and we have additional improvements coming over the next few months to address a broader range of queries,” Google’s spokesperson told Ars.

Deepfake trauma “never ends”

In its blog, Google said that “these efforts are designed to give people added peace of mind, especially if they’re concerned about similar content about them popping up in the future.”

But many deepfake victims have claimed that dedicating hours or even months to removing harmful content doesn’t provide any hope that the images won’t resurface. Most recently, one deepfake victim, Sabrina Javellana, told The New York Times that even after her home state Florida passed a law against deepfakes, that didn’t stop the fake images from spreading online.

She’s given up on trying to get the images removed anywhere, telling The Times, “It just never ends. I just have to accept it.”

According to US Representative Joseph Morelle (D-NY), it will take a federal law against deepfakes to deter more bad actors from harassing and terrorizing women with deepfake porn. He’s introduced one such law, the Preventing Deepfakes of Intimate Images Act, which would criminalize creating deepfakes. It currently has 59 sponsors in the House and bipartisan support in the Senate, Morelle said on a panel this week discussing harms of deepfakes, which Ars attended.

Morelle said he’d spoken to victims of deepfakes, including teenagers, and decided that “a national ban and a national set of both criminal and civil remedies makes the most sense” to combat the problem with “urgency.”

“A patchwork of different state and local jurisdictions with different rules” would be “really hard to follow” for both victims and perpetrators trying to understand what’s legal, Morelle said, whereas federal laws that impose a liability and criminal penalty would likely have “the greatest impact.”

Victims, Morelle said, every day suffer from mental, physical, emotional, and financial harms, and as a co-panelist, Andrea Powell, pointed out, there is no healing because there is currently no justice for survivors during a period of “prolific and catastrophic increase in this abuse,” Powell warned.

Google won’t downrank top deepfake porn sites unless victims mass report Read More »

Google’s AI Overview is flawed by design, and a new company blog post hints at why

AI, ai hallucinations, ai overview, Biz & IT, chatgpt, chatgtp, confabulation, Google, Google I/O, google search, hallucination, large language models, Liz Reid, machine learning, making things up, openai, reddit / Rejus Almole / June 1, 2024

guided by voices —

Google: “There are bound to be some oddities and errors” in system that told people to eat rocks.

Benj Edwards – May 31, 2024 7: 47 pm UTC

A selection of Google mascot characters created by the company. — Enlarge / The Google “G” logo surrounded by whimsical characters, all of which look stunned and surprised.

On Thursday, Google capped off a rough week of providing inaccurate and sometimes dangerous answers through its experimental AI Overview feature by authoring a follow-up blog post titled, “AI Overviews: About last week.” In the post, attributed to Google VP Liz Reid, head of Google Search, the firm formally acknowledged issues with the feature and outlined steps taken to improve a system that appears flawed by design, even if it doesn’t realize it is admitting it.

To recap, the AI Overview feature—which the company showed off at Google I/O a few weeks ago—aims to provide search users with summarized answers to questions by using an AI model integrated with Google’s web ranking systems. Right now, it’s an experimental feature that is not active for everyone, but when a participating user searches for a topic, they might see an AI-generated answer at the top of the results, pulled from highly ranked web content and summarized by an AI model.

While Google claims this approach is “highly effective” and on par with its Featured Snippets in terms of accuracy, the past week has seen numerous examples of the AI system generating bizarre, incorrect, or even potentially harmful responses, as we detailed in a recent feature where Ars reporter Kyle Orland replicated many of the unusual outputs.

Drawing inaccurate conclusions from the web

On Wednesday morning, Google's AI Overview was erroneously telling us the Sony PlayStation and Sega Saturn were available in 1993. — Enlarge / On Wednesday morning, Google’s AI Overview was erroneously telling us the Sony PlayStation and Sega Saturn were available in 1993.

Kyle Orland / Google

Given the circulating AI Overview examples, Google almost apologizes in the post and says, “We hold ourselves to a high standard, as do our users, so we expect and appreciate the feedback, and take it seriously.” But Reid, in an attempt to justify the errors, then goes into some very revealing detail about why AI Overviews provides erroneous information:

AI Overviews work very differently than chatbots and other LLM products that people may have tried out. They’re not simply generating an output based on training data. While AI Overviews are powered by a customized language model, the model is integrated with our core web ranking systems and designed to carry out traditional “search” tasks, like identifying relevant, high-quality results from our index. That’s why AI Overviews don’t just provide text output, but include relevant links so people can explore further. Because accuracy is paramount in Search, AI Overviews are built to only show information that is backed up by top web results.

This means that AI Overviews generally don’t “hallucinate” or make things up in the ways that other LLM products might.

Here we see the fundamental flaw of the system: “AI Overviews are built to only show information that is backed up by top web results.” The design is based on the false assumption that Google’s page-ranking algorithm favors accurate results and not SEO-gamed garbage. Google Search has been broken for some time, and now the company is relying on those gamed and spam-filled results to feed its new AI model.

Even if the AI model draws from a more accurate source, as with the 1993 game console search seen above, Google’s AI language model can still make inaccurate conclusions about the “accurate” data, confabulating erroneous information in a flawed summary of the information available.

Generally ignoring the folly of basing its AI results on a broken page-ranking algorithm, Google’s blog post instead attributes the commonly circulated errors to several other factors, including users making nonsensical searches “aimed at producing erroneous results.” Google does admit faults with the AI model, like misinterpreting queries, misinterpreting “a nuance of language on the web,” and lacking sufficient high-quality information on certain topics. It also suggests that some of the more egregious examples circulating on social media are fake screenshots.

“Some of these faked results have been obvious and silly,” Reid writes. “Others have implied that we returned dangerous results for topics like leaving dogs in cars, smoking while pregnant, and depression. Those AI Overviews never appeared. So we’d encourage anyone encountering these screenshots to do a search themselves to check.”

(No doubt some of the social media examples are fake, but it’s worth noting that any attempts to replicate those early examples now will likely fail because Google will have manually blocked the results. And it is potentially a testament to how broken Google Search is if people believed extreme fake examples in the first place.)

While addressing the “nonsensical searches” angle in the post, Reid uses the example search, “How many rocks should I eat each day,” which went viral in a tweet on May 23. Reid says, “Prior to these screenshots going viral, practically no one asked Google that question.” And since there isn’t much data on the web that answers it, she says there is a “data void” or “information gap” that was filled by satirical content found on the web, and the AI model found it and pushed it as an answer, much like Featured Snippets might. So basically, it was working exactly as designed.

Enlarge / A screenshot of an AI Overview query, “How many rocks should I eat each day” that went viral on X last week.

Google’s AI Overview is flawed by design, and a new company blog post hints at why Read More »