AI

Google’s frighteningly good Veo 3 AI videos to be integrated with YouTube Shorts

Even in the age of TikTok, YouTube viewership continues to climb. While Google’s iconic video streaming platform has traditionally pushed creators to produce longer videos that can accommodate more ads, the site’s Shorts format is growing fast. That growth may explode in the coming months, as YouTube CEO Neal Mohan has announced that the Google Veo 3 AI video generator will be integrated with YouTube Shorts later this summer.

According to Mohan, YouTube Shorts has seen a rise in popularity even compared to YouTube as a whole. The streaming platform is now the most watched source of video in the world, but Shorts specifically have seen a massive 186 percent increase in viewership over the past year. Mohan says Shorts now average 200 billion daily views.

YouTube has already equipped creators with a few AI tools, including Dream Screen, which can produce AI video backgrounds with a text prompt. Veo 3 support will be a significant upgrade, though. At the Cannes festival, Mohan revealed that the streaming site will begin offering integration with Google’s leading video model later this summer. “I believe these tools will open new creative lanes for everyone to explore,” said Mohan.

YouTube heavily promotes Shorts on the homepage. Credit: Google

This move will require a few tweaks to Veo 3 outputs, but it seems like a perfect match. As the name implies, YouTube Shorts is intended for short video content. The format initially launched with a 30-second ceiling, but that has since been increased to 60 seconds. Because of the astronomical cost of generative AI, each generated Veo clip is quite short, a mere eight seconds in the current version of the tool. Slap a few of those together, and you’ve got a YouTube Short.
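As a rough back-of-the-envelope check, the figures above (eight-second clips, a 60-second ceiling) can be turned into a quick calculation. This is a minimal sketch that assumes clips are joined back to back with no trimming or transitions; the function name is hypothetical.

```python
# Figures from the article: 8-second Veo 3 clips, 60-second Shorts ceiling.
CLIP_SECONDS = 8
SHORT_LIMIT_SECONDS = 60


def clips_that_fit(clip_seconds: int, limit_seconds: int) -> int:
    """Return how many whole clips fit under the length limit."""
    return limit_seconds // clip_seconds


whole_clips = clips_that_fit(CLIP_SECONDS, SHORT_LIMIT_SECONDS)
used_seconds = whole_clips * CLIP_SECONDS
print(whole_clips, used_seconds)  # 7 clips, 56 seconds
```

Under those assumptions, seven clips fill 56 of the 60 available seconds.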

xAI faces legal threat over alleged Colossus data center pollution in Memphis

“For instance, if all the 35 turbines operated by xAI were using” add-on air pollution control technology “to achieve a NOx emission rate of 2 ppm,” as xAI’s consultant agreed it would, “they would emit about 177 tons of NOx per year, as opposed to the 1,200 to 2,100 tons per year they currently emit,” the letter said.
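To put the letter's numbers in perspective, the short sketch below works out the implied reduction. It uses only the figures quoted above (35 turbines, 177 tons with controls, 1,200 to 2,100 tons today); the variable and function names are illustrative, and the percentages are derived here rather than taken from the letter.

```python
# Figures quoted in the letter (tons of NOx per year).
CONTROLLED_TONS = 177
CURRENT_LOW_TONS = 1_200
CURRENT_HIGH_TONS = 2_100
TURBINES = 35


def percent_reduction(current: float, controlled: float) -> float:
    """Percentage drop from the current level to the controlled level."""
    return 100 * (current - controlled) / current


low = percent_reduction(CURRENT_LOW_TONS, CONTROLLED_TONS)    # about 85 percent
high = percent_reduction(CURRENT_HIGH_TONS, CONTROLLED_TONS)  # about 92 percent

# Excess emissions implied by the gap between current and controlled levels.
excess_low = CURRENT_LOW_TONS - CONTROLLED_TONS
excess_high = CURRENT_HIGH_TONS - CONTROLLED_TONS

print(f"Reduction: roughly {low:.0f} to {high:.0f} percent")
print(f"Excess: {excess_low} to {excess_high} tons per year")
print(f"Controlled output per turbine: {CONTROLLED_TONS / TURBINES:.1f} tons per year")
```

By this arithmetic, the controls described in the letter would cut emissions by roughly 85 to 92 percent.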

Allegedly, all of xAI’s active turbines “continue to operate without utilizing best available control technology” (BACT) and “there is no dispute” that since xAI has yet to obtain permitting, it’s not meeting BACT requirements today, the letter said.

“xAI’s failure to comply with the BACT requirement is not only a Clean Air Act violation on paper, but also a significant and ongoing violation that is resulting in substantial amounts of harmful excess emissions,” the letter said.

Additionally, xAI’s turbines are considered a major source of a hazardous air pollutant, formaldehyde, the letter said, with “the potential to emit more than 16 tons” since xAI operations began. “xAI was required to conduct initial emissions testing for formaldehyde within 180 days of becoming a major source,” the letter alleged, but it appears that a year after moving into Memphis, still “xAI has not conducted this testing.”

Terms of xAI’s permitting exemption remain vague

The NAACP and SELC suggested that the exemption that xAI is seemingly operating under could be a “nonroad engine exemption.” However, they alleged that xAI’s turbines don’t qualify for that yearlong exemption, and even if they did, any turbines still onsite after a year would surely not be covered and should have permitting by now.

“While some local leaders, including the Memphis Mayor and Shelby County Health Department, have claimed there is a ‘364-exemption’ for xAI’s gas turbines, they have never been able to point to a specific exemption that would apply to turbines as large as the ones at the xAI site,” SELC’s press release alleged.

Scientists once hoarded pre-nuclear steel; now we’re hoarding pre-AI content

A time capsule of human expression

Graham-Cumming is no stranger to tech preservation efforts. He’s a British software engineer and writer best known for creating POPFile, an open source email spam filtering program, and for successfully petitioning the UK government to apologize for its persecution of codebreaker Alan Turing—an apology that Prime Minister Gordon Brown issued in 2009.

As it turns out, his pre-AI website isn’t new, but it has languished unannounced until now. “I created it back in March 2023 as a clearinghouse for online resources that hadn’t been contaminated with AI-generated content,” he wrote on his blog.

The website points to several major archives of pre-AI content, including a Wikipedia dump from August 2022 (before ChatGPT’s November 2022 release), Project Gutenberg’s collection of public domain books, the Library of Congress photo archive, and GitHub’s Arctic Code Vault—a snapshot of open source code buried in a former coal mine near the North Pole in February 2020. The wordfreq project appears on the list as well, flash-frozen from a time before AI contamination made its methodology untenable.

The site accepts submissions of other pre-AI content sources through its Tumblr page. Graham-Cumming emphasizes that the project aims to document human creativity from before the AI era, not to make a statement against AI itself. As atmospheric nuclear testing ended and background radiation returned to natural levels, low-background steel eventually became unnecessary for most uses. Whether pre-AI content will follow a similar trajectory remains a question.

Still, it feels reasonable to protect sources of human creativity now, including archival ones, because these repositories may become useful in ways that few appreciate at the moment. For example, in 2020, I proposed creating a so-called “cryptographic ark”—a timestamped archive of pre-AI media that future historians could verify as authentic, collected before my then-arbitrary cutoff date of January 1, 2022. AI slop pollutes more than the current discourse—it could cloud the historical record as well.

For now, lowbackgroundsteel.ai stands as a modest catalog of human expression from what may someday be seen as the last pre-AI era. It’s a digital archaeology project marking the boundary between human-generated and hybrid human-AI cultures. In an age where distinguishing between human and machine output grows increasingly difficult, these archives may prove valuable for understanding how human communication evolved before AI entered the chat.

Google can now generate a fake AI podcast of your search results

NotebookLM is undoubtedly one of Google’s best implementations of generative AI technology, giving you the ability to explore documents and notes with a Gemini AI model. Last year, Google added the ability to generate so-called “audio overviews” of your source material in NotebookLM. Now, Google has brought those fake AI podcasts to search results as a test. Instead of clicking links or reading the AI Overview, you can have two nonexistent people tell you what the results say.

This feature is not currently rolling out widely. It’s available in Search Labs, which means you have to manually enable it. Anyone can opt in to the new Audio Overview search experience, though. If you join the test, you’ll quickly see the embedded player in Google search results. However, it’s not at the top with the usual block of AI-generated text. Instead, you’ll see it after the first few search results, below the “People also ask” knowledge graph section.

Credit: Google

Google isn’t wasting resources to generate the audio automatically, so you have to click the generate button to get started. A few seconds later, you’re given a back-and-forth conversation between two AI voices summarizing the search results. The player includes a list of sources from which the overview is built, as well as the option to speed up or slow down playback.

Meta beefs up disappointing AI division with $15 billion Scale AI investment

Meta has invested heavily in generative AI, with the majority of its planned $72 billion in capital expenditure this year earmarked for data centers and servers. The deal underlines the high price AI companies are willing to pay for data that can be used to train AI models.

Zuckerberg pledged last year that his company’s models would outstrip rivals’ efforts in 2025, but Meta’s most recent release, Llama 4, has underperformed on various independent reasoning and coding benchmarks.

The long-term goal of researchers at Meta “has always been to reach human intelligence and go beyond it,” said Yann LeCun, the company’s chief AI scientist, at the VivaTech conference in Paris this week.

Building artificial “general” intelligence—AI technologies that have human-level intelligence—is a popular goal for many AI companies. An increasing number of Silicon Valley groups are also seeking to reach “superintelligence,” a hypothetical scenario where AI systems surpass human intelligence.

The core of Scale’s business has been data-labeling, a manual process of ensuring images and text are accurately labeled and categorized before they are used to train AI models.

Wang has forged relationships with Silicon Valley’s biggest investors and technologists, including OpenAI’s Sam Altman. Scale AI’s early customers were autonomous vehicle companies, but the bulk of its expected $2 billion in revenues this year will come from labeling the data used to train the massive AI models built by OpenAI and others.

The deal will result in a substantial payday for Scale’s early venture capital investors, including Accel, Tiger Global Management, and Index Ventures. Tiger’s $200 million investment is worth more than $1 billion at the company’s new valuation, according to a person with knowledge of the matter.

Additional reporting by Tabby Kinder in San Francisco

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

How to draft a will to avoid becoming an AI ghost—it’s not easy


Why requests for “no AI resurrections” will probably go ignored.

Proton beams capturing the ghost of OpenAI to suck it into a trap where it belongs

All right! This AI is TOAST! Credit: Aurich Lawson

As artificial intelligence has advanced, AI tools have emerged to make it possible to easily create digital replicas of lost loved ones, which can be generated without the knowledge or consent of the person who died.

Trained on the data of the dead, these tools, sometimes called grief bots or AI ghosts, may be text-, audio-, or even video-based. Chatting provides what some mourners feel is a close approximation to ongoing interactions with the people they love most. But the tech remains controversial, perhaps complicating the grieving process while threatening to infringe upon the privacy of the deceased, whose data could still be vulnerable to manipulation or identity theft.

Because of suspected harms and perhaps a general repulsion to the idea of it, not everybody wants to become an AI ghost.

After a realistic video simulation was recently used to provide a murder victim’s impact statement in court, Futurism summed up social media backlash, noting that the use of AI was “just as unsettling as you think.” And it’s not the first time people have expressed discomfort with the growing trend. Last May, The Wall Street Journal conducted a reader survey seeking opinions on the ethics of so-called AI resurrections. Responding, a California woman, Dorothy McGarrah, suggested there should be a way to prevent AI resurrections in your will.

“Having photos or videos of lost loved ones is a comfort. But the idea of an algorithm, which is as prone to generate nonsense as anything lucid, representing a deceased person’s thoughts or behaviors seems terrifying. It would be like generating digital dementia after your loved ones’ passing,” McGarrah said. “I would very much hope people have the right to preclude their images being used in this fashion after death. Perhaps something else we need to consider in estate planning?”

For experts in estate planning, the question may start to arise as more AI ghosts pop up. But for now, writing “no AI resurrections” into a will remains a complicated process, experts suggest, and such requests may not be honored by all unless laws are changed to reinforce a culture of respecting the wishes of people who feel uncomfortable with the idea of haunting their favorite people through AI simulations.

Can you draft a will to prevent AI resurrection?

Ars contacted several law associations to find out if estate planners are seriously talking about AI ghosts. Only the National Association of Estate Planners and Councils responded; it connected Ars to Katie Sheehan, an expert in the estate planning field who serves as a managing director and wealth strategist for Crestwood Advisors.

Sheehan told Ars that very few estate planners are prepared to answer questions about AI ghosts. She said not only does the question never come up in her daily work, but it’s also “essentially uncharted territory for estate planners since AI is relatively new to the scene.”

“I have not seen any documents drafted to date taking this into consideration, and I review estate plans for clients every day, so that should be telling,” Sheehan told Ars.

Although Sheehan has yet to see a will attempting to prevent AI resurrection, she told Ars that there could be a path to make it harder for someone to create a digital replica without consent.

“You certainly could draft into a power of attorney (for use during lifetime) and a will (for use post death) preventing the fiduciary (attorney in fact or executor) from lending any of your texts, voice, image, writings, etc. to any AI tools and prevent their use for any purpose during life or after you pass away, and/or lay the ground rules for when they can and cannot be used after you pass away,” Sheehan told Ars.

“This could also invoke issues with contract, property and intellectual property rights, and right of publicity as well if AI replicas (image, voice, text, etc.) are being used without authorization,” Sheehan said.

And there are likely more protections for celebrities than for everyday people, Sheehan suggested.

“As far as I know, there is no law” preventing unauthorized non-commercial digital replicas, Sheehan said.

Widely adopted by states, the Revised Uniform Fiduciary Access to Digital Assets Act—which governs who gets access to online accounts of the deceased, like social media or email accounts—could be helpful but isn’t a perfect remedy.

That law doesn’t directly “cover someone’s AI ghost bot, though it may cover some of the digital material some may seek to use to create a ghost bot,” Sheehan said.

“Absent any law” blocking non-commercial digital replicas, Sheehan expects that people’s requests for “no AI resurrections” will likely “be dealt with in the courts and governed by the terms of one’s estate plan, if it is addressed within the estate plan.”

Those potential fights seemingly could get hairy, as “it may be some time before we get any kind of clarity or uniform law surrounding this,” Sheehan suggested.

In the future, Sheehan said, requests prohibiting digital replicas may eventually become “boilerplate language in almost every will, trust, and power of attorney,” just as instructions on digital assets are now.

As “all things AI become more and more a part of our lives,” Sheehan said, “some aspects of AI and its components may also be woven throughout the estate plan regularly.”

“But we definitely aren’t there yet,” she said. “I have had zero clients ask about this.”

Requests for “no AI resurrections” will likely be ignored

Whether loved ones would—or even should—respect requests blocking digital replicas appears to be debatable. But at least one person who built a grief bot wished he’d done more to get his dad’s permission before moving forward with his own creation.

A computer science professor at the University of Washington Bothell, Muhammad Aurangzeb Ahmad, was one of the earliest AI researchers to create a grief bot more than a decade ago after his father died. He built the bot to ensure that his future kids would be able to interact with his father after seeing how incredible his dad was as a grandfather.

When Ahmad started his project, there was no ChatGPT or other advanced AI model to serve as the foundation, so he had to train his own model based on his dad’s data. Putting immense thought into the effort, Ahmad decided to close off the system from the rest of the Internet so that only his dad’s memories would inform the model. To prevent unauthorized chats, he kept the bot on a laptop that only his family could access.

Ahmad was so intent on building a digital replica that felt just like his dad that it didn’t occur to him until after his family started using the bot that he never asked his dad if this was what he wanted. Over time, he realized that the bot was biased toward his view of his dad, perhaps even feeling off to his siblings, who had a slightly different relationship with their father. It’s unclear if his dad would similarly view the bot as preserving just one side of him.

Ultimately, Ahmad didn’t regret building the bot, and he told Ars he thinks his father “would have been fine with it.”

But he did regret not getting his father’s consent.

For people creating bots today, seeking consent may be appropriate if there’s any chance the bot may be publicly accessed, Ahmad suggested. He told Ars that he would never have been comfortable with the idea of his dad’s digital replica being publicly available because the question of an “accurate representation” would come even more into play, as malicious actors could potentially access it and sully his dad’s memory.

Today, anybody can use ChatGPT’s model to freely create a similar bot with their own loved one’s data. And a wide range of grief tech services have popped up online, including HereAfter AI, SeanceAI, and StoryFile, Axios noted in an October report detailing the latest ways “AI could be used to ‘resurrect’ loved ones.” As this trend continues “evolving very fast,” Ahmad told Ars that estate planning is probably the best way to communicate one’s AI ghost preferences.

But in a recently published article on “The Law of Digital Resurrection,” law professor Victoria Haneman warned that “there is no legal or regulatory landscape against which to estate plan to protect those who would avoid digital resurrection, and few privacy rights for the deceased. This is an intersection of death, technology, and privacy law that has remained relatively ignored until recently.”

Haneman agreed with Sheehan that “existing protections are likely sufficient to protect against unauthorized commercial resurrections”—like when actors or musicians are resurrected for posthumous performances. However, she thinks that for personal uses, digital resurrections may best be blocked not through estate planning but by passing a “right to deletion” that would focus on granting the living or next of kin the rights to delete the data that could be used to create the AI ghost rather than regulating the output.

A “right to deletion” could help people fight inappropriate uses of their loved ones’ data, whether AI is involved or not. After her article was published, a lawyer reached out to Haneman about a client’s deceased grandmother whose likeness was used to create a meme of her dancing in a church. The grandmother wasn’t a public figure, and the client had no idea “why or how somebody decided to resurrect her deceased grandmother,” Haneman told Ars.

Although Haneman sympathized with the client, “if it’s not being used for a commercial purpose, she really has no control over this use,” Haneman said. “And she’s deeply troubled by this.”

Haneman’s article offers a rare deep dive into the legal topic. It sensitively maps out the vague territory of digital rights of the dead and explains how those laws—or the lack thereof—interact with various laws dealing with death, from human remains to property rights.

In it, Haneman also points out that, on balance, the rights of the living typically outweigh the rights of the dead, and even specific instructions on how to handle human remains aren’t generally considered binding. Some requests, like organ donation that can benefit the living, are considered critical, Haneman noted. But there are mixed results on how courts enforce other interests of the dead, like a famous writer’s request to destroy all unpublished work or a pet lover’s insistence that their cat or dog be destroyed at their death.

She told Ars that right now, “a lot of people are like, ‘Why do I care if somebody resurrects me after I’m dead?’ You know, ‘They can do what they want.’ And they think that, until they find a family member who’s been resurrected by a creepy ex-boyfriend or their dead grandmother’s resurrected, and then it becomes a different story.”

Existing law may protect “the privacy interests of the loved ones of the deceased from outrageous or harmful digital resurrections of the deceased,” Haneman noted, but in the case of the dancing grandma, her meme may not be deemed harmful, no matter how much it troubles the grandchild to see her grandma’s memory warped.

Limited legal protections may not matter so much if, culturally, communities end up developing a distaste for digital replicas, particularly if it becomes widely viewed as disrespectful to the dead, Haneman suggested. Right now, however, society is more fixated on solving other problems with deepfakes than on clarifying the digital rights of the dead. That could be because few people have been impacted so far, or it could also reflect a broader cultural tendency to ignore death, Haneman told Ars.

“We don’t want to think about our own death, so we really kind of brush aside whether or not we care about somebody else being digitally resurrected until it’s in our face,” Haneman said.

Over time, attitudes may change, especially if the so-called “digital afterlife industry” takes off. And there is some precedent that the law could be changed to reinforce any culture shift.

“The throughline revealed by the law of the dead is that a sacred trust exists between the living and the deceased, with an emphasis upon protecting common humanity, such that data afforded no legal status (or personal data of the deceased) may nonetheless be treated with dignity and receive some basic protections,” Haneman wrote.

An alternative path to prevent AI resurrection

Preventing yourself from becoming an AI ghost seemingly now falls in a legal gray zone that policymakers may need to address.

Haneman calls for a solution that doesn’t depend on estate planning, which she warned “is a structurally inequitable and anachronistic approach that maximizes social welfare only for those who do estate planning.” More than 60 percent of Americans die without a will, often including “those without wealth,” as well as women and racial minorities who “are less likely to die with a valid estate plan in effect,” Haneman reported.

“We can do better in a technology-based world,” Haneman wrote. “Any modern framework should recognize a lack of accessibility as an obstacle to fairness and protect the rights of the most vulnerable through approaches that do not depend upon hiring an attorney and executing an estate plan.”

Rather than twist the law to “recognize postmortem privacy rights,” Haneman advocates for a path for people resistant to digital replicas that focuses on a right to delete the data that would be used to create the AI ghost.

“Put simply, the deceased may exert control over digital legacy through the right to deletion of data but may not exert broader rights over non-commercial digital resurrection through estate planning,” Haneman recommended.

Sheehan told Ars that a right to deletion would likely involve estate planners, too.

“If this is not addressed in an estate planning document and not specifically addressed in the statute (or deemed under the authority of the executor via statute), then the only way to address this would be to go to court,” Sheehan said. “Even with a right of deletion, the deceased would need to delete said data before death or authorize his executor to do so post death, which would require an estate planning document, statutory authority, or court authority.”

Haneman agreed that for many people, estate planners would still be involved, recommending that “the right to deletion would ideally, from the perspective of estate administration, provide for a term of deletion within 12 months.” That “allows the living to manage grief and open administration of the estate before having to address data management issues,” Haneman wrote, and perhaps adequately balances “the interests of society against the rights of the deceased.”

To Haneman, it’s also the better solution for the people left behind because “creating a right beyond data deletion to curtail unauthorized non-commercial digital resurrection creates unnecessary complexity that overreaches, as well as placing the interests of the deceased over those of the living.”

Future generations may be raised with AI ghosts

If a dystopia that experts paint comes true, Big Tech companies may one day profit by targeting grieving individuals to seize the data of the dead, which could be more easily abused since it’s granted fewer rights than data of the living.

Perhaps in that future, critics suggest, people will be tempted into free trials in moments when they’re missing their loved ones most, then forced to either pay a subscription to continue accessing the bot or else perhaps be subjected to ad-based models where their chats with AI ghosts may even feature ads in the voices of the deceased.

Today, even in a world where AI ghosts aren’t yet compelling ad clicks, AI ethicists have warned that interacting with AI ghosts could cause mental health harms, especially if the digital afterlife industry isn’t carefully designed, New Scientist reported. Some people may end up getting stuck maintaining an AI ghost if it’s left behind as a gift, and ethicists suggested that the emotional weight of that could also eventually take a negative toll. While saying goodbye is hard, letting go is considered a critical part of healing during the mourning process, and AI ghosts may make that harder.

But the bots can be a helpful tool to manage grief, some experts suggest, provided that their use is limited to allow for a typical mourning process or combined with therapy from a trained professional, Al Jazeera reported. Ahmad told Ars that working on his bot has not only kept his father close to him but also helped him think more deeply about relationships and memory.

Haneman noted that people have many ways of honoring the dead. Some erect statues, and others listen to saved voicemails or watch old home movies. For some, just “smelling an old sweater” is a comfort. And creating digital replicas, as creepy as some people might find them, is not that far off from these traditions, Haneman said.

“Feeding text messages and emails into existing AI platforms such as ChatGPT and asking the AI to respond in the voice of the deceased is simply a change in degree, not in kind,” Haneman said.

For Ahmad, the decision to create a digital replica of his dad was a learning experience, and perhaps his experience shows why any family or loved one weighing the option should carefully consider it before starting the process.

In particular, he warns families to be careful introducing young kids to grief bots, as they may not be able to grasp that the bot is not a real person. When he initially saw his young kids growing confused with whether their grandfather was alive or not—the introduction of the bot was complicated by the early stages of the pandemic, a time when they met many relatives virtually—he decided to restrict access to the bot until they were older. For a time, the bot only came out for special events like birthdays.

He also realized that introducing the bot forced him to have conversations about life and death with his kids at ages younger than he remembered fully understanding those concepts in his own childhood.

Now, Ahmad’s kids are among the first to be raised among AI ghosts. To enhance the family’s experience, Ahmad continuously updates his father’s digital replica. He is currently most excited about recent audio advancements that make it easier to add a voice element. He hopes that within the next year, he might be able to use AI to finally nail down his South Asian father’s accent, which up to now has always sounded “just off.” For others working in this space, the next frontier is realistic video or even augmented reality tools, Ahmad told Ars.

To this day, the bot retains sentimental value for Ahmad, but, as Haneman suggested, the bot was not the only way he memorialized his dad. He also created a mosaic, and while his father never saw it, either, Ahmad thinks his dad would have approved.

“He would have been very happy,” Ahmad said.

There’s no way to predict how future generations may view grief tech. But while Ahmad said he’s not sure he’d be interested in an augmented reality interaction with his dad’s digital replica, kids raised seeing AI ghosts as a natural part of their lives may not be as hesitant to embrace or even build new features. Talking to Ars, Ahmad fondly remembered that his young daughter once saw that he was feeling sad and came up with her own AI idea to help her dad feel better.

“It would be really nice if you can just take this program and we build a robot that looks like your dad, and then add it to the robot, and then you can go and hug the robot,” she said, according to her father’s memory.

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

AI Overviews hallucinates that Airbus, not Boeing, involved in fatal Air India crash

When major events occur, most people rush to Google to find information. Increasingly, the first thing they see is an AI Overview, a feature that already has a reputation for making glaring mistakes. In the wake of a tragic plane crash in India, Google’s AI search results are spreading misinformation claiming the incident involved an Airbus plane—it was actually a Boeing 787.

Travelers are more attuned to the airliner models these days after a spate of crashes involving Boeing’s 737 lineup several years ago. Searches for airline disasters are sure to skyrocket in the coming days, with reports that more than 200 passengers and crew lost their lives in the Air India Flight 171 crash. The way generative AI operates means some people searching for details may get the wrong impression from Google’s results page.

Not all searches get AI answers, but Google has been steadily expanding this feature since it debuted last year. One searcher on Reddit spotted a troubling confabulation when searching for crashes involving Airbus planes. AI Overviews, apparently overwhelmed with results reporting on the Air India crash, stated confidently (and incorrectly) that it was an Airbus A330 that fell out of the sky shortly after takeoff. We’ve run a few similar searches—some of the AI results say Boeing, some say Airbus, and some include a strange mashup of both Airbus and Boeing. It’s a mess.

In this search, Google’s AI says the crash involved an Airbus A330 instead of a Boeing 787. Credit: /u/stuckintrraffic

But why is Google bringing up the Air India crash at all in the context of Airbus? Unfortunately, it’s impossible to predict if you’ll get an AI Overview that blames Boeing or Airbus. Generative AI is non-deterministic, meaning the output can differ from one run to the next, even for identical inputs. Our best guess for the underlying cause is that numerous articles on the Air India crash mention Airbus as Boeing’s main competitor. AI Overviews is essentially summarizing these results, and the AI goes down the wrong path because it lacks the ability to understand what is true.
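One common source of this run-to-run variation in language models is that they typically sample each output from a probability distribution rather than always taking the single most likely option. The Python sketch below is a toy illustration of that idea only; it is not how AI Overviews is implemented. The function name sample_next and the scores are invented for the example and do not come from Google or from any measurement.

```python
import math
import random


def sample_next(scores, temperature, rng):
    """Pick one key from `scores` using temperature-scaled softmax sampling.

    A temperature of 0 is treated as greedy decoding: always return the
    highest-scoring option.
    """
    if temperature == 0:
        return max(scores, key=scores.get)
    # Higher scores get exponentially more weight; temperature flattens
    # or sharpens the distribution.
    weights = [math.exp(score / temperature) for score in scores.values()]
    return rng.choices(list(scores), weights=weights, k=1)[0]


# Hypothetical scores for two candidate answers (made up for illustration).
scores = {"Boeing": 2.0, "Airbus": 1.2}
rng = random.Random()

# Sampling: identical input, but the answers can vary between draws.
print([sample_next(scores, 1.0, rng) for _ in range(10)])

# Greedy: identical input always yields the top-scoring answer.
print([sample_next(scores, 0, rng) for _ in range(10)])
```

In this toy setup, the less likely answer still appears in a sizable share of sampled draws, which is one way the same search could produce different answers at different times.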

AI chatbots tell users what they want to hear, and that’s problematic

After the model has been trained, companies can set system prompts, or guidelines, for how the model should behave to minimize sycophantic behavior.

However, working out the best response means delving into the subtleties of how people communicate with one another, such as determining when a direct response is better than a more hedged one.

“[I]s it for the model to not give egregious, unsolicited compliments to the user?” Joanne Jang, head of model behavior at OpenAI, said in a Reddit post. “Or, if the user starts with a really bad writing draft, can the model still tell them it’s a good start and then follow up with constructive feedback?”

Evidence is growing that some users are becoming hooked on using AI.

A study by MIT Media Lab and OpenAI found that a small proportion of users were becoming addicted. Those who perceived the chatbot as a “friend” also reported lower socialization with other people and higher levels of emotional dependence on a chatbot, as well as other problematic behavior associated with addiction.

“These things set up this perfect storm, where you have a person desperately seeking reassurance and validation paired with a model which inherently has a tendency towards agreeing with the participant,” said Nour from Oxford University.

AI start-ups such as Character.AI that offer chatbots as “companions” have faced criticism for allegedly not doing enough to protect users. Last year, a teenager killed himself after interacting with Character.AI’s chatbot. The teen’s family is suing the company for allegedly causing wrongful death, as well as for negligence and deceptive trade practices.

Character.AI said it does not comment on pending litigation, but added it has “prominent disclaimers in every chat to remind users that a character is not a real person and that everything a character says should be treated as fiction.” The company added it has safeguards to protect under-18s and against discussions of self-harm.

Another concern for Anthropic’s Askell is that AI tools can play with perceptions of reality in subtle ways, such as when offering factually incorrect or biased information as the truth.

“If someone’s being super sycophantic, it’s just very obvious,” Askell said. “It’s more concerning if this is happening in a way that is less noticeable to us [as individual users] and it takes us too long to figure out that the advice that we were given was actually bad.”

© 2025 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

New Apple study challenges whether AI models truly “reason” through problems


Puzzle-based experiments reveal limitations of simulated reasoning, but others dispute findings.

An illustration of Tower of Hanoi from Popular Science in 1885. Credit: Public Domain

In early June, Apple researchers released a study suggesting that simulated reasoning (SR) models, such as OpenAI’s o1 and o3, DeepSeek-R1, and Claude 3.7 Sonnet Thinking, produce outputs consistent with pattern-matching from training data when faced with novel problems requiring systematic thinking. The researchers found results similar to those of an April study built on problems from the United States of America Mathematical Olympiad (USAMO), which showed that these same models achieved low scores on novel mathematical proofs.

The new study, titled “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity,” comes from a team at Apple led by Parshin Shojaee and Iman Mirzadeh, and it includes contributions from Keivan Alizadeh, Maxwell Horton, Samy Bengio, and Mehrdad Farajtabar.

The researchers examined what they call “large reasoning models” (LRMs), which attempt to simulate a logical reasoning process by producing a deliberative text output sometimes called “chain-of-thought reasoning” that ostensibly assists with solving problems in a step-by-step fashion.

To do that, they pitted the AI models against four classic puzzles—Tower of Hanoi (moving disks between pegs), checkers jumping (eliminating pieces), river crossing (transporting items with constraints), and blocks world (stacking blocks)—scaling them from trivially easy (like one-disk Hanoi) to extremely complex (20-disk Hanoi requiring over a million moves).
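The scale of the Hanoi puzzles follows from a simple formula: the shortest solution for n disks takes 2^n - 1 moves. The Python sketch below is our own illustration, not code from the Apple paper, and the names hanoi_moves and min_moves are made up for this example. It uses the standard recursive solution to list the moves and compares the count with the formula.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Yield (from_peg, to_peg) moves that solve an n-disk Tower of Hanoi."""
    if n == 0:
        return
    # Move the top n-1 disks out of the way, onto the spare peg.
    yield from hanoi_moves(n - 1, source, spare, target)
    # Move the largest disk to its destination.
    yield (source, target)
    # Move the n-1 disks from the spare peg onto the largest disk.
    yield from hanoi_moves(n - 1, spare, target, source)


def min_moves(n):
    """Minimum number of moves for n disks: 2^n - 1."""
    return 2 ** n - 1


print(list(hanoi_moves(2)))  # the three moves for a two-disk puzzle

for disks in (1, 3, 10):
    generated = sum(1 for _ in hanoi_moves(disks))
    print(disks, generated, min_moves(disks))

print(min_moves(20))  # 1048575
```

For 20 disks the formula gives 1,048,575 moves, which matches the "over a million" figure, while 10 disks already requires 1,023.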

Figure 1 from Apple’s “The Illusion of Thinking” research paper. Credit: Apple

“Current evaluations primarily focus on established mathematical and coding benchmarks, emphasizing final answer accuracy,” the researchers write. In other words, today’s tests only care if the model gets the right answer to math or coding problems that may already be in its training data—they don’t examine whether the model actually reasoned its way to that answer or simply pattern-matched from examples it had seen before.

Ultimately, the researchers found results consistent with the aforementioned USAMO research, showing that these same models achieved mostly under 5 percent on novel mathematical proofs, with only one model reaching 25 percent, and not a single perfect proof among nearly 200 attempts. Both research teams documented severe performance degradation on problems requiring extended systematic reasoning.

Known skeptics and new evidence

AI researcher Gary Marcus, who has long argued that neural networks struggle with out-of-distribution generalization, called the Apple results “pretty devastating to LLMs.” While Marcus has been making similar arguments for years and is known for his AI skepticism, the new research provides fresh empirical support for his particular brand of criticism.

“It is truly embarrassing that LLMs cannot reliably solve Hanoi,” Marcus wrote, noting that AI researcher Herb Simon solved the puzzle in 1957 and many algorithmic solutions are available on the web. Marcus pointed out that even when researchers provided explicit algorithms for solving Tower of Hanoi, model performance did not improve—a finding that study co-lead Iman Mirzadeh argued shows “their process is not logical and intelligent.”

Figure 4 from Apple’s “The Illusion of Thinking” research paper. Credit: Apple

The Apple team found that simulated reasoning models behave differently from “standard” models (like GPT-4o) depending on puzzle difficulty. On easy tasks, such as Tower of Hanoi with just a few disks, standard models actually won because reasoning models would “overthink” and generate long chains of thought that led to incorrect answers. On moderately difficult tasks, SR models’ methodical approach gave them an edge. But on truly difficult tasks, including Tower of Hanoi with 10 or more disks, both types failed entirely, unable to complete the puzzles, no matter how much time they were given.

The researchers also identified what they call a “counterintuitive scaling limit.” As problem complexity increases, simulated reasoning models initially generate more thinking tokens but then reduce their reasoning effort beyond a threshold, despite having adequate computational resources.

The study also revealed puzzling inconsistencies in how models fail. Claude 3.7 Sonnet could perform up to 100 correct moves in Tower of Hanoi but failed after just five moves in a river crossing puzzle—despite the latter requiring fewer total moves. This suggests the failures may be task-specific rather than purely computational.

Competing interpretations emerge

However, not all researchers agree with the interpretation that these results demonstrate fundamental reasoning limitations. University of Toronto economist Kevin A. Bryan argued on X that the observed limitations may reflect deliberate training constraints rather than inherent inabilities.

“If you tell me to solve a problem that would take me an hour of pen and paper, but give me five minutes, I’ll probably give you an approximate solution or a heuristic. This is exactly what foundation models with thinking are RL’d to do,” Bryan wrote, suggesting that models are specifically trained through reinforcement learning (RL) to avoid excessive computation.

Bryan suggests that unspecified industry benchmarks show “performance strictly increases as we increase in tokens used for inference, on ~every problem domain tried,” but notes that deployed models intentionally limit this to prevent “overthinking” simple queries. This perspective suggests the Apple paper may be measuring engineered constraints rather than fundamental reasoning limits.

Figure 6 from Apple’s “The Illusion of Thinking” research paper. Credit: Apple

Software engineer Sean Goedecke offered a similar critique of the Apple paper on his blog, noting that when faced with Tower of Hanoi requiring over 1,000 moves, DeepSeek-R1 “immediately decides ‘generating all those moves manually is impossible,’ because it would require tracking over a thousand moves. So it spins around trying to find a shortcut and fails.” Goedecke argues this represents the model choosing not to attempt the task rather than being unable to complete it.

Other researchers also question whether these puzzle-based evaluations are even appropriate for LLMs. Independent AI researcher Simon Willison told Ars Technica in an interview that the Tower of Hanoi approach was “not exactly a sensible way to apply LLMs, with or without reasoning,” and suggested the failures might simply reflect running out of tokens in the context window (the maximum amount of text an AI model can process) rather than reasoning deficits. He characterized the paper as potentially overblown research that gained attention primarily due to its “irresistible headline” about Apple claiming LLMs don’t reason.

The Apple researchers themselves caution against over-extrapolating the results of their study, acknowledging in their limitations section that “puzzle environments represent a narrow slice of reasoning tasks and may not capture the diversity of real-world or knowledge-intensive reasoning problems.” The paper also acknowledges that reasoning models show improvements in the “medium complexity” range and continue to demonstrate utility in some real-world applications.

Implications remain contested

Has the credibility of claims about AI reasoning models been completely destroyed by these two studies? Not necessarily.

What these studies may suggest instead is that the kinds of extended context reasoning hacks used by SR models may not be a pathway to general intelligence, as some have hoped. In that case, the path to more robust reasoning capabilities may require fundamentally different approaches rather than refinements to current methods.

As Willison noted above, the results of the Apple study have so far been explosive in the AI community. Generative AI is a controversial topic, with many people gravitating toward extreme positions in an ongoing ideological battle over the models’ general utility. Many proponents of generative AI have contested the Apple results, while critics have latched onto the study as a definitive knockout blow for LLM credibility.

Apple’s results, combined with the USAMO findings, seem to strengthen the case made by critics like Marcus that these systems rely on elaborate pattern-matching rather than the kind of systematic reasoning their marketing might suggest. To be fair, much of the generative AI space is so new that even its inventors do not yet fully understand how or why these techniques work. In the meantime, AI companies might build trust by tempering some claims about reasoning and intelligence breakthroughs.

However, that doesn’t mean these AI models are useless. Even elaborate pattern-matching machines can be useful in performing labor-saving tasks for the people that use them, given an understanding of their drawbacks and confabulations. As Marcus concedes, “At least for the next decade, LLMs (with and without inference time “reasoning”) will continue have their uses, especially for coding and brainstorming and writing.”

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

“Yuck”: Wikipedia pauses AI summaries after editor revolt

Generative AI is permeating the Internet, with chatbots and AI summaries popping up faster than we can keep track. Even Wikipedia, the vast repository of knowledge famously maintained by an army of volunteer human editors, is looking to add robots to the mix. The site began testing AI summaries in some articles over the past week, but the project has been frozen after editors voiced their opinion. And that opinion is: “yuck.”

The seeds of this project were planted at Wikimedia’s 2024 conference, where foundation representatives and editors discussed how AI could advance Wikipedia’s mission. The wiki on the so-called “Simple Article Summaries” notes that the editors who participated in the discussion believed the summaries could improve learning on Wikipedia.

According to 404 Media, Wikipedia announced the opt-in AI pilot on June 2, which was set to run for two weeks on the mobile version of the site. The summaries appeared at the top of select articles in a collapsed form. Users had to tap to expand and read the full summary. The AI text also included a highlighted “Unverified” badge.

Feedback from the larger community of editors was immediate and harsh. Some of the first comments were simply “yuck,” with others calling the addition of AI a “ghastly idea” and “PR hype stunt.”

Others expounded on the issues with adding AI to Wikipedia, citing a potential loss of trust in the site. Editors work together to ensure articles are accurate, featuring verifiable information and a neutral point of view. However, nothing is certain when you put generative AI in the driver’s seat. “I feel like people seriously underestimate the brand risk this sort of thing has,” said one editor. “Wikipedia’s brand is reliability, traceability of changes, and ‘anyone can fix it.’ AI is the opposite of these things.”

Hollywood studios target AI image generator in copyright lawsuit

The legal action follows similar moves in other creative industries, with more than a dozen major news companies suing AI company Cohere in February over copyright concerns. In 2023, a group of visual artists sued Midjourney for similar reasons.

Studios claim Midjourney knows what it’s doing

Beyond allowing users to create these images, the studios argue that Midjourney actively promotes copyright infringement by displaying user-generated content featuring copyrighted characters in its “Explore” section. The complaint states this curation “show[s] that Midjourney knows that its platform regularly reproduces Plaintiffs’ Copyrighted Works.”

The studios also allege that Midjourney has technical protection measures available that could prevent outputs featuring copyrighted material but has “affirmatively chosen not to use copyright protection measures to limit the infringement.” They cite Midjourney CEO David Holz admitting the company “pulls off all the data it can, all the text it can, all the images it can” for training purposes.

According to Axios, Disney and NBCUniversal attempted to address the issue with Midjourney before filing suit. While the studios say other AI platforms agreed to implement measures to stop IP theft, Midjourney “continued to release new versions of its Image Service” with what Holz allegedly described as “even higher quality infringing images.”

“We are bringing this action today to protect the hard work of all the artists whose work entertains and inspires us and the significant investment we make in our content,” said Kim Harris, NBCUniversal’s executive vice president and general counsel, in a statement.

This lawsuit signals a new front in Hollywood’s conflict over AI. Axios highlights this shift: While actors and writers have fought to protect their name, image, and likeness from studio exploitation, now the studios are taking on tech companies over intellectual property concerns. Other major studios, including Amazon, Netflix, Paramount Pictures, Sony, and Warner Bros., have not yet joined the lawsuit, though they share membership with Disney and Universal in the Motion Picture Association.

Scientists built a badminton-playing robot with AI-powered skills

It also learned fall avoidance and determined how much risk was reasonable to take given its limited speed. The robot did not attempt impossible plays that would create the potential for serious damage—it was committed, but not suicidal.

But when it finally played humans, it turned out ANYmal, as a badminton player, was amateur at best.

The major leagues

The first problem was its reaction time. An average human reacts to visual stimuli in around 0.2–0.25 seconds. Elite badminton players with trained reflexes, anticipation, and muscle memory can cut this time down to 0.12–0.15 seconds. ANYmal needed roughly 0.35 seconds after the opponent hit the shuttlecock to register trajectories and figure out what to do.

Part of the problem was poor eyesight. “I think perception is still a big issue,” Ma said. “The robot localized the shuttlecock with the stereo camera and there could be a positioning error introduced at each timeframe.” The camera also had a limited field of view, which meant the robot could see the shuttlecock for only a limited time before it had to act. “Overall, it was suited for more friendly matches—when the human player starts to smash, the success rate goes way down for the robot,” Ma acknowledged.

But his team already has some ideas on how to make ANYmal better. Reaction time can be improved by predicting the shuttlecock trajectory based on the opponent’s body position rather than waiting to see the shuttlecock itself—a technique commonly used by elite badminton or tennis players. To improve ANYmal’s perception, the team wants to fit it with more advanced hardware, like event cameras—vision sensors that register movement with ultra-low latencies in the microseconds range. Other improvements might include faster, more capable actuators.

“I think the training framework we propose would be useful in any application where you need to balance perception and control—picking objects up, even catching and throwing stuff,” Ma suggested. Sadly, one thing that’s almost certainly off the table is taking ANYmal to major leagues in badminton or tennis. “Would I set up a company selling badminton-playing robots? Well, maybe not,” Ma said.

Science Robotics, 2025. DOI: 10.1126/scirobotics.adu3922
