Google AI Overviews Generate Tens of Millions of False Answers Hourly

A new study finds Google's AI Overview feature achieves only 91 percent accuracy, churning out tens of millions of wrong answers hourly while gutting the publisher traffic that funds real journalism.

Staff Writer
Screenshot of gemini.google.com AI chatbot website interface with user name displayed / Wikimedia Commons

A cancer patient follows Google's diet advice and risks becoming too weak for surgery. A woman reads the wrong cancer screening information and thinks she's protected when she isn't. These are not hypotheticals. They are documented consequences of Google's AI Overview feature, which a bombshell new study reveals produces tens of millions of false answers every hour — at a company that promised to organize the world's information.

The analysis, conducted by AI startup Oumi and commissioned by The New York Times, tested Google's AI search tool across 4,326 queries and found it achieves only 91 percent accuracy. At Google's volume of 5 trillion annual searches, that 9 percent error rate translates to tens of millions of incorrect answers every hour — industrial-scale misinformation, delivered with the authority of the world's dominant search engine.
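The arithmetic behind that figure is straightforward. The back-of-envelope sketch below assumes, as the study's framing does, that the 9 percent error rate applies across Google's full search volume — an upper bound, since not every search triggers an AI Overview:

```python
# Back-of-envelope check of the "tens of millions of wrong answers per hour" claim.
# Assumption: the 9 percent error rate is applied to all searches, which
# overstates the count because AI Overviews do not appear on every query.
ANNUAL_SEARCHES = 5_000_000_000_000  # 5 trillion searches per year
ERROR_RATE = 0.09                    # 9 percent of answers incorrect
HOURS_PER_YEAR = 365 * 24            # 8,760 hours

errors_per_hour = ANNUAL_SEARCHES * ERROR_RATE / HOURS_PER_YEAR
print(f"{errors_per_hour:,.0f} incorrect answers per hour")
```

Even if AI Overviews appeared on only a fraction of searches, the result — roughly 51 million errors per hour under this assumption — would still land comfortably in the tens of millions.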

The study tested Gemini 2 in October 2025 and Gemini 3 in February 2026 using OpenAI's SimpleQA benchmark. Accuracy improved from 85 percent to 91 percent between versions, but verification capabilities deteriorated sharply. For anyone trying to fact-check what Google tells them, that deterioration matters more than the headline improvement.

Even when answers are technically correct, 51 to 56 percent of Gemini 3 responses contain "ungrounded" citations — sources listed by Google that do not actually support the information provided. "Even if the answer is right, how do you know it's right? How do you verify it?" said Manos Koukoumidis, CEO of Oumi. Users who try to follow Google's own citations find themselves at a dead end.

The sourcing problem runs deeper than broken links. Facebook and Reddit rank as the second and fourth most-cited sources across 5,380 citations analyzed, with Facebook appearing in 7 percent of inaccurate answers versus 5 percent of accurate ones. That heavy reliance on social media platforms undercuts Google's assertion that AI Overviews use the "same ranking and safety protections that block the overwhelming majority of spam."

Those protections have already proven breakable. BBC journalist Thomas Germain demonstrated how easily bad actors can inject false information when his satirical blog post claiming he was "best tech journalist at eating hot dogs" appeared as factual in Google AI Overviews within 24 hours of posting. "It's easy to trick AI chatbots, much easier than it was to trick Google two or three years ago," said Lily Ray, VP of SEO Strategy at Amsive. "AI companies are moving faster than their ability to regulate the accuracy of the answers."

When the answers are wrong, the consequences can be severe. The Guardian documented dangerous health misinformation from Google AI in January 2026: incorrect pancreatic cancer diet advice telling patients to avoid high-fat foods, misleading liver function test ranges that ignored patient variables, and an incorrect claim that pap tests screen for vaginal cancer.

"Advising patients to avoid high-fat foods was completely incorrect," said Anna Jewell of Pancreatic Cancer UK. "If someone followed what the search result told them then they might not take in enough calories, struggle to put on weight, and be unable to tolerate either chemotherapy or potentially life-saving surgery." British Liver Trust CEO Pamela Healy called the summaries "alarming," while Eve Appeal's Athena Lamnisos warned they "could potentially put women in danger."

Google spokesperson Ned Adriance dismissed the Oumi study as having "serious holes" and argued it doesn't reflect real user searches. Yet internal Google testing showed Gemini 3 produced incorrect information 28 percent of the time in standalone evaluation — a figure the company's own engineers generated.

The same AI Overviews system that misleads users has simultaneously devastated the publishers who produce the fact-checked information Google's AI mines. Google search traffic to publishers declined 33 percent globally in the year to November 2025, with U.S. organic search referrals dropping 38 percent year-over-year. Condé Nast watched its Google traffic share fall from a majority of visits to roughly 25 percent, with CEO Roger Lynch calling AI Overviews "another sort of death blow."

"Google AI Overviews have been a disaster for publishers who rely on clicks to fund the production of quality journalism, but they also let down users looking for accurate information," said Danielle Coffey, President and CEO of the News/Media Alliance. "Algorithmically-generated responses that pull in data from nearly every source on the internet simply cannot be trusted."

Regulators are beginning to act. The European Commission opened an antitrust probe into Google AI practices on Dec. 9, 2025, while the U.K.'s Competition and Markets Authority proposed binding conduct requirements on Jan. 28, 2026. Whether those moves come fast enough to reshape behavior at Google's scale remains an open question.

Public patience is already fraying. A Talker Research survey shows 69 percent of Americans use AI to some degree, but 54 percent are "getting tired of hearing about it," and nearly half believe AI has only partially lived up to expectations. The gap between what Google promised and what it delivers is not abstract — it shows up in wrong answers, stripped publisher revenue, and patients who nearly followed deadly advice.
