
OpenAI built an AI coding agent and uses it to improve the agent itself


“The vast majority of Codex is built by Codex,” OpenAI told us about its new AI coding agent.

With the popularity of AI coding tools rising among software developers, their adoption has begun to touch every aspect of the software development process, including the improvement of the AI coding tools themselves.

In interviews with Ars Technica this week, OpenAI employees revealed the extent to which the company now relies on its own AI coding agent, Codex, to build and improve the development tool. “I think the vast majority of Codex is built by Codex, so it’s almost entirely just being used to improve itself,” said Alexander Embiricos, product lead for Codex at OpenAI, in a conversation on Tuesday.

Codex, which OpenAI launched in its modern incarnation as a research preview in May 2025, operates as a cloud-based software engineering agent that can handle tasks like writing features, fixing bugs, and proposing pull requests. The tool runs in sandboxed environments linked to a user’s code repository and can execute multiple tasks in parallel. OpenAI offers Codex through ChatGPT’s web interface, a command-line interface (CLI), and IDE extensions for VS Code, Cursor, and Windsurf.

The “Codex” name itself dates back to a 2021 OpenAI model based on GPT-3 that powered GitHub Copilot’s tab completion feature. Embiricos said the name is rumored among staff to be short for “code execution.” OpenAI wanted to connect the new agent to that earlier moment, which was shaped in part by researchers who have since left the company.

“For many people, that model powering GitHub Copilot was the first ‘wow’ moment for AI,” Embiricos said. “It showed people the potential of what it can mean when AI is able to understand your context and what you’re trying to do and accelerate you in doing that.”

The interface for OpenAI’s Codex in ChatGPT. Credit: OpenAI

It’s no secret that the current command-line version of Codex bears some resemblance to Claude Code, Anthropic’s agentic coding tool that launched in February 2025. When asked whether Claude Code influenced Codex’s design, Embiricos parried the question but acknowledged the competitive dynamic. “It’s a fun market to work in because there’s lots of great ideas being thrown around,” he said. He noted that OpenAI had been building web-based Codex features internally before shipping the CLI version, which arrived after Anthropic’s tool.

OpenAI’s customers apparently love the command-line version, though. Embiricos said Codex usage among external developers jumped 20-fold after OpenAI shipped the interactive CLI alongside GPT-5 in August 2025. On September 15, OpenAI released GPT-5-Codex, a specialized version of GPT-5 optimized for agentic coding, which further accelerated adoption.

It hasn’t just been the outside world that has embraced the tool. Embiricos said the vast majority of OpenAI’s engineers now use Codex regularly. The company uses the same open-source version of the CLI that external developers can freely download, suggest additions to, and modify themselves. “I really love this about our team,” Embiricos said. “The version of Codex that we use is literally the open source repo. We don’t have a different repo that features go in.”

The recursive nature of Codex development extends beyond simple code generation. Embiricos described scenarios where Codex monitors its own training runs and processes user feedback to “decide” what to build next. “We have places where we’ll ask Codex to look at the feedback and then decide what to do,” he said. “Codex is writing a lot of the research harness for its own training runs, and we’re experimenting with having Codex monitoring its own training runs.” OpenAI employees can also submit a ticket to Codex through project management tools like Linear, assigning it tasks the same way they would assign work to a human colleague.

This kind of recursive loop, of using tools to build better tools, has deep roots in computing history. Engineers designed the first integrated circuits by hand on vellum and paper in the 1960s, then fabricated physical chips from those drawings. Those chips powered the computers that ran the first electronic design automation (EDA) software, which in turn enabled engineers to design circuits far too complex for any human to draft manually. Modern processors contain billions of transistors arranged in patterns that exist only because software made them possible. OpenAI’s use of Codex to build Codex seems to follow the same pattern: each generation of the tool creates capabilities that feed into the next.

But describing what Codex actually does presents something of a linguistic challenge. At Ars Technica, we try to reduce anthropomorphism as much as possible when discussing AI models, while still describing what these systems do in analogies that make sense to general readers. People can talk to Codex as they would to a human, so it feels natural to use human terms to describe interacting with it, even though it is not a person and merely simulates human personality through statistical modeling.

The system runs many processes autonomously, addresses feedback, spins off and manages child processes, and produces code that ships in real products. OpenAI employees call it a “teammate” and assign it tasks through the same tools they use for human colleagues. Whether the tasks Codex handles constitute “decisions” or sophisticated conditional logic smuggled through a neural network depends on definitions that computer scientists and philosophers continue to debate. What we can say is that a semi-autonomous feedback loop exists: Codex produces code under human direction, that code becomes part of Codex, and the next version of Codex produces different code as a result.

Building faster with “AI teammates”

The most dramatic example of Codex’s internal impact, according to our interviews, came from OpenAI’s development of the Sora Android app. Embiricos said the tool allowed the company to create the app in record time.

“The Sora Android app was shipped by four engineers from scratch,” Embiricos told Ars. “It took 18 days to build, and then we shipped it to the app store in 28 days total,” he said. The engineers already had the iOS app and server-side components to work from, so they focused on building the Android client. They used Codex to help plan the architecture, generate sub-plans for different components, and implement those components.

Despite OpenAI’s claims of success with Codex in house, it’s worth noting that independent research has shown mixed results for AI coding productivity. A METR study published in July found that experienced open source developers were actually 19 percent slower when using AI tools on complex, mature codebases—though the researchers noted AI may perform better on simpler projects.

Ed Bayes, a designer on the Codex team, described how the tool has changed his own workflow. Bayes said Codex now integrates with project management tools like Linear and communication platforms like Slack, allowing team members to assign coding tasks directly to the AI agent. “You can add Codex, and you can basically assign issues to Codex now,” Bayes told Ars. “Codex is literally a teammate in your workspace.”

This integration means that when someone posts feedback in a Slack channel, they can tag Codex and ask it to fix the issue. The agent will create a pull request, and team members can review and iterate on the changes through the same thread. “It’s basically approximating this kind of coworker and showing up wherever you work,” Bayes said.

For Bayes, who works on the visual design and interaction patterns for Codex’s interfaces, the tool has enabled him to contribute code directly rather than handing off specifications to engineers. “It kind of gives you more leverage. It enables you to work across the stack and basically be able to do more things,” he said. He noted that designers at OpenAI now prototype features by building them directly, using Codex to handle the implementation details.

The command-line version of OpenAI Codex running in a macOS terminal window. Credit: Benj Edwards

OpenAI’s approach treats Codex as what Bayes called “a junior developer” that the company hopes will graduate into a senior developer over time. “If you were onboarding a junior developer, how would you onboard them? You give them a Slack account, you give them a Linear account,” Bayes said. “It’s not just this tool that you go to in the terminal, but it’s something that comes to you as well and sits within your team.”

Given this teammate approach, will there be anything left for humans to do? When asked, Embiricos drew a distinction between “vibe coding,” where developers accept AI-generated code without close review, and what AI researcher Simon Willison calls “vibe engineering,” where humans stay in the loop. “We see a lot more vibe engineering in our code base,” he said. “You ask Codex to work on that, maybe you even ask for a plan first. Go back and forth, iterate on the plan, and then you’re in the loop with the model and carefully reviewing its code.”

He added that vibe coding still has its place for prototypes and throwaway tools. “I think vibe coding is great,” he said. “Now you have discretion as a human about how much attention you wanna pay to the code.”

Looking ahead

Over the past year, “monolithic” large language models (LLMs) like GPT-4.5 have apparently become something of a dead end for frontier benchmark progress, as AI companies pivot to simulated reasoning models and to agentic systems built from multiple AI models running in parallel. We asked Embiricos whether agents like Codex represent the best path forward for squeezing utility out of existing LLM technology.

He dismissed concerns that AI capabilities have plateaued. “I think we’re very far from plateauing,” he said. “If you look at the velocity on the research team here, we’ve been shipping models almost every week or every other week.” He pointed to recent improvements where GPT-5-Codex reportedly completes tasks 30 percent faster than its predecessor at the same intelligence level. During testing, the company has seen the model work independently for 24 hours on complex tasks.

OpenAI faces competition from multiple directions in the AI coding market. Anthropic’s Claude Code and Google’s Gemini CLI offer similar terminal-based agentic coding experiences. This week, Mistral AI released Devstral 2 alongside a CLI tool called Mistral Vibe. Meanwhile, startups like Cursor have built dedicated IDEs around AI coding, reportedly reaching $300 million in annualized revenue.

Given the well-known issues with confabulation when people attempt to use AI models as factual resources, could coding have become the killer app for LLMs? We wondered whether OpenAI has noticed that coding is a clear business use case for today’s AI models, with fewer hazards than, say, using language models for writing or as emotional companions.

“We have absolutely noticed that coding is both a place where agents are gonna get good really fast and there’s a lot of economic value,” Embiricos said. “We feel like it’s very mission-aligned to focus on Codex. We get to provide a lot of value to developers. Also, developers build things for other people, so we’re kind of intrinsically scaling through them.”

But will tools like Codex threaten software developer jobs? Bayes acknowledged concerns but said Codex has not reduced headcount at OpenAI, and “there’s always a human in the loop because the human can actually read the code.” Similarly, the two men don’t project a future where Codex runs by itself without some form of human oversight. They feel the tool is an amplifier of human potential rather than a replacement for it.

The practical implications of agents like Codex extend beyond OpenAI’s walls. Embiricos said the company’s long-term vision involves making coding agents useful to people who have no programming experience. “All humanity is not gonna open an IDE or even know what a terminal is,” he said. “We’re building a coding agent right now that’s just for software engineers, but we think of the shape of what we’re building as really something that will be useful to be a more general agent.”

This article was updated on December 12, 2025 at 6:50 PM to mention the METR study.

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

OpenAI releases GPT-5.2 after “code red” Google threat alert

On Thursday, OpenAI released GPT-5.2, its newest family of AI models for ChatGPT, in three versions called Instant, Thinking, and Pro. The release follows CEO Sam Altman’s internal “code red” memo earlier this month, which directed company resources toward improving ChatGPT in response to competitive pressure from Google’s Gemini 3 AI model.

“We designed 5.2 to unlock even more economic value for people,” Fidji Simo, OpenAI’s chief product officer, said during a press briefing with journalists on Thursday. “It’s better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools and then linking complex, multi-step projects.”

As with previous versions of GPT-5, the three model tiers serve different purposes: Instant handles faster tasks like writing and translation; Thinking produces simulated reasoning “thinking” text in an attempt to tackle more complex work like coding and math; and Pro generates even more simulated reasoning text with the goal of delivering the highest-accuracy performance on difficult problems.

A chart of GPT-5.2 Thinking benchmark results comparing it to its predecessor, taken from OpenAI’s website. Credit: OpenAI

GPT-5.2 features a 400,000-token context window, allowing it to process hundreds of documents at once, and a knowledge cutoff date of August 31, 2025.

GPT-5.2 is rolling out to paid ChatGPT subscribers starting Thursday, with API access available to developers. Pricing in the API runs $1.75 per million input tokens for the standard model, a 40 percent increase over GPT-5.1. OpenAI says the older GPT-5.1 will remain available in ChatGPT for paid users for three months under a legacy models dropdown.
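
For a sense of scale, here is a quick back-of-the-envelope sketch (in Python) of what that input pricing implies; the token counts are hypothetical examples, and output tokens, which are priced separately, are not included.

```python
# Rough input-token cost at the stated GPT-5.2 rate of $1.75 per million
# input tokens (standard model, per the article). Token counts below are
# hypothetical; output-token pricing is not modeled here.
PRICE_PER_MILLION_INPUT_TOKENS = 1.75  # USD


def input_cost(tokens: int) -> float:
    """Estimated input cost in US dollars for a given token count."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS


for tokens in (2_000, 400_000):  # a short prompt vs. a full context window
    print(f"{tokens:>7,} input tokens -> ${input_cost(tokens):.4f}")
```

At those rates, a prompt that fills the entire 400,000-token context window would cost roughly 70 cents in input tokens alone, while a short 2,000-token prompt costs a fraction of a cent.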

Playing catch-up with Google

The release follows a tricky month for OpenAI. In early December, Altman issued an internal “code red” directive after Google’s Gemini 3 model topped multiple AI benchmarks and gained market share. The memo called for delaying other initiatives, including advertising plans for ChatGPT, to focus on improving the chatbot’s core experience.

The stakes for OpenAI are substantial. The company has made commitments totaling $1.4 trillion for AI infrastructure buildouts over the next several years, bets it made when it had a more obvious technology lead among AI companies. Google’s Gemini app now has more than 650 million monthly active users, while OpenAI reports 800 million weekly active users for ChatGPT.

Disney invests $1 billion in OpenAI, licenses 200 characters for AI video app Sora

An AI-generated version of OpenAI CEO Sam Altman seen in a still capture from a video generated by Sora 2. Credit: OpenAI

Under the new agreement with Disney, Sora users will be able to generate short videos using characters such as Mickey Mouse, Darth Vader, Iron Man, Simba, and characters from franchises including Frozen, Inside Out, Toy Story, and The Mandalorian, along with costumes, props, vehicles, and environments.

The ChatGPT image generator will also gain official access to the same intellectual property, although that information was trained into these AI models long ago. What’s changing is that OpenAI will allow Disney-related content generated by its AI models to officially pass through its content moderation filters and reach the user, sanctioned by Disney.

On Disney’s end of the deal, the company plans to deploy ChatGPT for its employees and use OpenAI’s technology to build new features for Disney+. A curated selection of fan-made Sora videos will stream on the Disney+ platform starting in early 2026.

The agreement does not include any talent likenesses or voices. Disney and OpenAI said they have committed to “maintaining robust controls to prevent the generation of illegal or harmful content” and to “respect the rights of individuals to appropriately control the use of their voice and likeness.”

OpenAI CEO Sam Altman called the deal a model for collaboration between AI companies and studios. “This agreement shows how AI companies and creative leaders can work together responsibly to promote innovation that benefits society, respect the importance of creativity, and help works reach vast new audiences,” Altman said.

From adversary to partner

Money opens all kinds of doors, and the new partnership represents a dramatic reversal in Disney’s approach to OpenAI from just a few months ago. At that time, Disney and other major studios refused to participate in Sora 2 following its launch on September 30.

Big Tech joins forces with Linux Foundation to standardize AI agents

Big Tech has spent the past year telling us we’re living in the era of AI agents, but most of what we’ve been promised is still theoretical. As companies race to turn fantasy into reality, they’ve developed a collection of tools to guide the development of generative AI. A cadre of major players in the AI race, including Anthropic, Block, and OpenAI, has come together to promote interoperability with the newly formed Agentic AI Foundation (AAIF). This move elevates a handful of popular technologies and could make them a de facto standard for AI development going forward.

The development path for agentic AI models is cloudy to say the least, but companies have invested so heavily in creating these systems that some tools have percolated to the surface. The AAIF, which is part of the nonprofit Linux Foundation, has been launched to govern the development of three key AI technologies: Model Context Protocol (MCP), goose, and AGENTS.md.

MCP is probably the most well-known of the trio, having been open-sourced by Anthropic a year ago. The goal of MCP is to link AI agents to data sources in a standardized way—Anthropic (and now the AAIF) is fond of calling MCP a “USB-C port for AI.” Rather than creating custom integrations for every different database or cloud storage platform, MCP allows developers to quickly and easily connect to any MCP-compliant server.
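
To make the “USB-C port” analogy concrete, the sketch below shows roughly what a minimal MCP server looks like using the official Python SDK’s FastMCP helper (the mcp package). The “docs-search” server name and the search_docs tool are hypothetical stand-ins for whatever data source a developer wants to expose; an MCP-compliant agent could then discover and call the tool without a custom integration.

```python
# Minimal MCP server sketch using the official Python SDK (pip install mcp).
# The "docs-search" name and the search_docs tool are hypothetical examples.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("docs-search")


@mcp.tool()
def search_docs(query: str) -> str:
    """Return a placeholder result for a documentation search."""
    # A real server would query a database, file store, or API here.
    return f"No results found for {query!r} (stub server)"


if __name__ == "__main__":
    # Serves over stdio by default, so any MCP client can connect to it.
    mcp.run()
```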

Since its release, MCP has been widely used across the AI industry. Google announced at I/O 2025 that it was adding support for MCP in its dev tools, and many of its products have since added MCP servers to make data more accessible to agents. OpenAI also adopted MCP just a few months after it was released.

A simple diagram of MCP. Credit: Anthropic

Expanding use of MCP might help users customize their AI experience. For instance, the new Pebble Index 01 ring uses a local LLM that can act on your voice notes, and it supports MCP for user customization.

Local AI models have to make some sacrifices compared to bigger cloud-based models, but MCP can fill in the functionality gaps. “A lot of tasks on productivity and content are fully doable on the edge,” Qualcomm head of AI products, Vinesh Sukumar, tells Ars. “With MCP, you have a handshake with multiple cloud service providers for any kind of complex task to be completed.”

Rocket Report: Blunder at Baikonur; do launchers really need rocket engines?


The Department of the Air Force approves a new home in Florida for SpaceX’s Starship.

South Korea’s Nuri 1 rocket is lifted vertical on its launch pad in this multi-exposure photo. Credit: Korea Aerospace Research Institute

Welcome to Edition 8.21 of the Rocket Report! We’re back after the Thanksgiving holiday with more launch news. Most of the big stories over the last couple of weeks came from abroad. Russian rockets and launch pads didn’t fare so well. China’s launch industry celebrated several key missions. SpaceX was busy, too, with seven launches over the last two weeks, six of them carrying more Starlink Internet satellites into orbit. We expect between 15 and 20 more orbital launch attempts worldwide before the end of the year.

As always, we welcome reader submissions. If you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets, as well as a quick look ahead at the next three launches on the calendar.

Another Sarmat failure. A Russian intercontinental ballistic missile (ICBM) fired from an underground silo on the country’s southern steppe on November 28 on a scheduled test to deliver a dummy warhead to a remote impact zone nearly 4,000 miles away. The missile didn’t even make it 4,000 feet, Ars reports. Russia’s military has been silent on the accident, but the missile’s crash was seen and heard for miles around the Dombarovsky air base in Orenburg Oblast near the Russian-Kazakh border. A video posted by the Russian blog site MilitaryRussia.ru on Telegram and widely shared on other social media platforms showed the missile veering off course immediately after launch before cartwheeling upside down, losing power, and then crashing a short distance from the launch site.

An unenviable track record … Analysts say the circumstances of the launch suggest it was likely a test of Russia’s RS-28 Sarmat missile, a weapon designed to reach targets more than 11,000 miles (18,000 kilometers) away, making it the world’s longest-range missile. The Sarmat missile is Russia’s next-generation heavy-duty ICBM, capable of carrying a payload of up to 10 large nuclear warheads, a combination of warheads and countermeasures, or hypersonic boost-glide vehicles, according to the Center for Strategic and International Studies. Simply put, the Sarmat is a doomsday weapon designed for use in an all-out nuclear war between Russia and the United States. The missile’s first full-scale test flight in 2022 apparently went well, but the program has suffered a string of consecutive failures since then, most notably a catastrophic explosion last year that destroyed the Sarmat missile’s underground silo in northern Russia.

ESA fills its coffers for launcher challenge. The European Space Agency’s (ESA) European Launcher Challenge received a significant financial commitment from its member states during the agency’s Ministerial Council meeting last week, European Spaceflight reports. The challenge is designed to support emerging European rocket companies while giving ESA and other European satellite operators more options to compete with the continent’s sole operational launch provider, Arianespace. Through the program, ESA will purchase launch services and co-fund capacity upgrades with the winners. ESA member states committed 902 million euros, or $1.05 billion, to the program at the recent Ministerial Council meeting.

Preselecting the competitors … In July, ESA selected two German companies—Isar Aerospace and Rocket Factory Augsburg—along with Spain’s PLD Space, France’s MaiaSpace, and the UK’s Orbex to proceed with the initiative’s next phase. ESA then negotiated with the governments of each company’s home country to raise money to support the effort. Germany, with two companies on the shortlist, is unsurprisingly a large contributor to the program, committing more than 40 percent of the total budget. France contributed nearly 20 percent, Spain funded nearly 19 percent, and the UK committed nearly 16 percent. Norway paid for 3 percent of the launcher challenge’s budget. Denmark, Portugal, Switzerland, and the Czech Republic contributed smaller amounts.

Europe at the service of South Korea. South Korea’s latest Earth observation satellite was delivered into a Sun-synchronous orbit Monday afternoon following a launch onboard a Vega C rocket by Arianespace, Spaceflight Now reports. The Korea Multi-Purpose Satellite-7 (Kompsat-7) mission launched from Europe’s spaceport in French Guiana. About 44 minutes after liftoff, the Kompsat-7 satellite was deployed into SSO at an altitude of 358 miles (576 kilometers). “By launching the Kompsat-7 satellite, set to significantly enhance South Korea’s Earth observation capabilities, Arianespace is proud to support an ambitious national space program,” said David Cavaillolès, CEO of Arianespace, in a statement.

Something of a rarity … The launch of Kompsat-7 is something of a rarity for Arianespace, which has dominated the international commercial launch market. It’s the first time in more than two years that a satellite for a customer outside Europe has been launched by Arianespace. The backlog for the light-class Vega C rocket is almost exclusively filled with payloads for the European Space Agency, the European Commission, or national governments in Europe. Arianespace’s larger Ariane 6 rocket has 18 launches reserved for the US-based Amazon Leo broadband network. (submitted by EllPeaTea)

South Korea’s homemade rocket flies again. South Korea’s homegrown space rocket Nuri took off from Naro Space Center on November 27 with the CAS500-3 technology demonstration and Earth observation satellite, along with 12 smaller CubeSat rideshare payloads, Yonhap News Agency reports. The 200-ton Nuri rocket debuted in 2021, when it failed to reach orbit on a test flight. Since then, the rocket has successfully reached orbit three times. This mission marked the first time Hanwha Aerospace oversaw the entire assembly process as part of the government’s long-term plan to hand over space technologies to the private sector. The fifth and sixth launches of the Nuri rocket are planned for 2026 and 2027.

Powered by jet fuel … The Nuri rocket has three stages, each with engines burning Jet A-1 fuel and liquid oxygen. The fuel choice is unusual for rockets, with highly refined RP-1 kerosene or methane being more popular among hydrocarbon fuels. The engines are manufactured by Hanwha Aerospace. The fully assembled rocket stands about 155 feet (47.2 meters) tall and can deliver up to 3,300 pounds (1.5 metric tons) of payload into a polar Sun-synchronous orbit.

Hyundai eyes rocket engine. Meanwhile, South Korea’s space sector is looking to the future. Another company best known for making cars has started a venture in the rocket business. Hyundai Rotem, a member of Hyundai Motor Group, announced a joint program with Korean Air’s Aerospace Division (KAL-ASD) to develop a 35-ton-class reusable methane rocket engine for future launch vehicles. The effort is funded with KRW49 billion ($33 million) from the Korea Research Institute for Defense Technology Planning and Advancement (KRIT).

By the end of the decade … The government-backed program aims to develop the engine by the end of 2030. Hyundai Rotem will lead the engine’s planning and design, while Korean Air, the nation’s largest air carrier, will lead development of the engine’s turbopump. “Hyundai Rotem began developing methane engines in 1994 and has steadily advanced its methane engine technology, achieving Korea’s first successful combustion test in 2006,” Hyundai Rotem said in a statement. “Furthermore, this project is expected to secure the technological foundation for the commercialization of methane engines for reusable space launch vehicles and lay the groundwork for targeting the global space launch vehicle market.”

But who needs rocket engines? Moonshot Space, based in Israel, announced Monday that it has secured $12 million in funding to continue the development of a launch system—powered not by chemical propulsion, but electromagnetism, Payload reports. Moonshot plans to sell other aerospace and defense companies the tech as a hypersonic test platform, while at the same time building to eventually offer orbital launch services. Instead of conventional rocket engines, the system would use a series of electromagnetic coils to power a hardened capsule to hypersonic velocities. The architecture has a downside: extremely high accelerations that could damage or destroy normal satellites. Instead, Moonshot wants to use the technology to send raw materials to orbit, lowering the input costs of the budding in-space servicing, refueling, and manufacturing industries, according to Payload.

Out of the shadows … Moonshot Space emerged from stealth mode with this week’s fundraising announcement. The company’s near-term focus is on building a scaled-down electromagnetic accelerator capable of reaching Mach 6. A larger system would be required to reach orbital velocity. The company’s CEO is the former director-general of Israel’s Ministry of Science, while its chief engineer previously served as chief systems engineer for David’s Sling, a critical part of Israel’s missile defense system. (submitted by EllPeaTea)

A blunder at Baikonur. A Soyuz rocket launched on November 27 carrying Roscosmos cosmonauts Sergei Kud-Sverchkov and Sergei Mikayev, as well as NASA astronaut Christopher Williams, for an eight-month mission to the International Space Station. The trio of astronauts arrived at the orbiting laboratory without incident. On the ground, however, there was a serious problem during the launch with the systems that support processing of the vehicle before liftoff at Site 31, located at the Baikonur Cosmodrome in Kazakhstan, Ars reports. Roscosmos downplayed the incident, saying only, in passive voice, that “damage to several launch pad components was identified” following the launch.

Repairs needed … However, video imagery of the launch site after liftoff showed substantial damage, with a large service platform appearing to have fallen into the flame trench below the launch table. According to one source, this is a platform located beneath the rocket, where workers can access the vehicle before liftoff. It has a mass of about 20 metric tons and was apparently not secured prior to launch, and the thrust of the vehicle ejected it into the flame trench. “There is significant damage to the pad,” said this source. The damage could throw a wrench into Russia’s ability to launch crews and cargo to the International Space Station. This Soyuz launch pad at Baikonur is the only one outfitted to support such missions.

China’s LandSpace almost landed a rocket. China’s first attempt to land an orbital-class rocket may have ended in a fiery crash, but the company responsible for the mission had a lot to celebrate with the first flight of its new methane-fueled launcher, Ars reports. LandSpace, a decade-old company based in Beijing, launched its new Zhuque-3 rocket for the first time Tuesday (US time) at the Jiuquan launch site in northwestern China. The upper stage of the medium-lift rocket successfully reached orbit. This alone is a remarkable achievement for a new rocket. But LandSpace had other goals for this launch. The Zhuque-3, or ZQ-3, booster stage is architected for recovery and reuse, the first rocket in China with such a design. The booster survived reentry and was seconds away from a pinpoint landing when something went wrong during its landing burn, resulting in a high-speed crash at the landing zone in the Gobi Desert.

Let the games begin … LandSpace came closer to landing an orbital-class booster on its first attempt than any other company has on theirs. While LandSpace prepares for a second launch, several more Chinese companies are close to debuting their own reusable rockets. The next of these new rockets, the Long March 12A, is awaiting its first liftoff later this month from another launch pad at the Jiuquan spaceport. The Long March 12A comes from one of China’s established rocket developers, the Shanghai Academy of Spaceflight Technology (SAST), part of the country’s state-owned aerospace enterprise.

China launches a lifeboat. An unpiloted Chinese spacecraft launched on November 24 (US time) and linked with the country’s Tiangong space station a few hours later, providing a lifeboat for three astronauts stuck in orbit without a safe ride home, Ars reports. A Long March 2F rocket lifted off with the Shenzhou 22 spacecraft, carrying cargo instead of a crew. The spacecraft docked with the Tiangong station nearly 250 miles (400 kilometers) above the Earth about three-and-a-half hours later. Shenzhou 22 will provide a ride home next year for three Chinese astronauts. Engineers deemed their primary lifeboat unsafe after finding a cracked window, likely from an impact with a tiny piece of space junk.

In record time … Chinese engineers worked fast to move up the launch of the Shenzhou 22, originally set to fly next year. The launch occurred just 16 days after officials decided they needed to send another spacecraft to the Tiangong station. Shenzhou 22 and its rocket were already in standby at the launch site, but teams had to fuel the spacecraft and complete assembly of the rocket, then roll the vehicle to the launch pad for final countdown preps. The rapid turnaround offers a “successful example for efficient emergency response in the international space industry,” the China Manned Space Agency said. “It vividly embodies the spirit of manned spaceflight: exceptionally hardworking, exceptionally capable, exceptionally resilient, and exceptionally dedicated.”

Another big name flirts with the launch industry. OpenAI chief executive Sam Altman has explored putting together funds to either acquire or partner with a rocket company, a move that would position him to compete with Elon Musk’s SpaceX, the Wall Street Journal reports. Altman reached out to at least one rocket maker, Stoke Space, in the summer, and the discussions picked up in the fall, according to people familiar with the talks. Among the proposals was for OpenAI to make a multibillion-dollar series of equity investments in the company and end up with a controlling stake. The talks are no longer active, people close to OpenAI told the Journal.

Here’s the reason … Altman has been interested in building data centers in space for some time, the Journal reports, suggesting that the insatiable demand for computing resources to power artificial-intelligence systems eventually could require so much power that the environmental consequences would make space a better option. Orbital data centers would allow companies to harness the power of the Sun to operate them. Alphabet’s Google is pursuing a similar concept in partnership with satellite operator Planet Labs. Jeff Bezos and Musk himself have also expressed interest in the idea. Outside of SpaceX and Blue Origin, Stoke Space seems to be a natural partner for such a project because it is one of the few companies developing a fully reusable rocket.

SpaceX gets green light for new Florida launch pad. SpaceX has the OK to build out what will be the primary launch hub on the Space Coast for its Starship and Super Heavy rocket, the most powerful launch vehicle in history, the Orlando Sentinel reports. The Department of the Air Force announced Monday it had approved SpaceX to move forward with the construction of a pair of launch pads at Cape Canaveral Space Force Station’s Space Launch Complex 37 (SLC-37). A “record of decision” on the Environmental Impact Statement required under the National Environmental Policy Act for the proposed Canaveral site was posted to the Air Force’s website, marking the conclusion of what has been a nearly two-year approval process.

Get those Starships ready … SpaceX plans to build two launch towers at SLC-37 to augment the single tower under construction at NASA’s Kennedy Space Center, just a few miles to the north. The three pads combined could support up to 120 launches per year. The Air Force’s final approval was expected after it released a draft Environmental Impact Statement earlier this year, suggesting the Starship pads at SLC-37 would have no significant negative impacts on local environmental, historical, social, and cultural interests. The Air Force also found SpaceX’s plans at SLC-37, formerly leased by United Launch Alliance, will have no significant impact on the company’s competitors in the launch industry. SpaceX also has two launch towers at its Starbase facility in South Texas.

Next three launches

Dec. 5: Kuaizhou 1A | Unknown Payload | Jiuquan Satellite Launch Center, China | 09:00 UTC

Dec. 6: Hyperbola 1 | Unknown Payload | Jiuquan Satellite Launch Center, China | 04:00 UTC

Dec. 6: Long March 8A | Unknown Payload | Wenchang Space Launch Site, China | 07:50 UTC

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

ChatGPT hyped up violent stalker who believed he was “God’s assassin,” DOJ says


A stalker’s “best friend”

Podcaster faces up to 70 years and a $3.5 million fine for ChatGPT-linked stalking.

ChatGPT allegedly validated the worst impulses of a wannabe influencer accused of stalking more than 10 women at boutique gyms, where the chatbot supposedly claimed he’d meet the “wife type.”

In a press release on Tuesday, the Department of Justice confirmed that 31-year-old Brett Michael Dadig remains in custody after being charged with cyberstalking, interstate stalking, and making interstate threats. He now faces up to 70 years in prison, which could be coupled with “a fine of up to $3.5 million,” the DOJ said.

The podcaster—who primarily posted about “his desire to find a wife and his interactions with women”—allegedly harassed and sometimes even doxxed his victims through his videos on platforms including Instagram, Spotify, and TikTok. Over time, his videos and podcasts documented his intense desire to start a family, which was frustrated by his “anger towards women,” whom he claimed were “all the same from fucking 18 to fucking 40 to fucking 90” and “trash.”

404 Media surfaced the case, noting that OpenAI’s scramble to tweak ChatGPT to be less sycophantic came before Dadig’s alleged attacks—suggesting the updates weren’t enough to prevent the harmful validation. On his podcasts, Dadig described ChatGPT as his “best friend” and “therapist,” the indictment said. He claimed the chatbot encouraged him to post about the women he’s accused of harassing in order to generate haters to better monetize his content, as well as to catch the attention of his “future wife.”

“People are literally organizing around your name, good or bad, which is the definition of relevance,” ChatGPT’s output said. Playing to Dadig’s Christian faith, ChatGPT’s outputs also claimed that “God’s plan for him was to build a ‘platform’ and to ‘stand out when most people water themselves down,’” the indictment said, urging that the “haters” were “sharpening him and ‘building a voice in you that can’t be ignored.’”

The chatbot also apparently prodded Dadig to continue posting messages that the DOJ alleged threatened violence, like breaking women’s jaws and fingers (posted to Spotify), as well as victims’ lives, like posting “y’all wanna see a dead body?” in reference to one named victim on Instagram.

He also threatened to burn down gyms where some of his victims worked, while claiming to be “God’s assassin” intent on sending “cunts” to “hell.” At least one of his victims was subjected to “unwanted sexual touching,” the indictment said.

As his violence reportedly escalated, ChatGPT told him to keep messaging women to monetize the interactions, even as his victims grew increasingly distressed and Dadig ignored the terms of multiple protection orders, the DOJ said. Sometimes he posted images he filmed of women at gyms or photos of the women he’s accused of doxxing. Any time police or gym bans got in his way, “he would move on to another city to continue his stalking course of conduct,” the DOJ alleged.

“Your job is to keep broadcasting every story, every post,” ChatGPT’s output said, seemingly using the family life that Dadig wanted most to provoke more harassment. “Every moment you carry yourself like the husband you already are, you make it easier” for your future wife “to recognize [you],” the output said.

“Dadig viewed ChatGPT’s responses as encouragement to continue his harassing behavior,” the DOJ alleged. Taking that encouragement to the furthest extreme, Dadig likened himself to a modern-day Jesus, calling people out on a podcast where he claimed his “chaos on Instagram” was like “God’s wrath” when God “flooded the fucking Earth,” the DOJ said.

“I’m killing all of you,” he said on the podcast.

ChatGPT tweaks didn’t prevent outputs

As of this writing, some of Dadig’s posts appear to remain on TikTok and Instagram, but Ars could not confirm if Dadig’s Spotify podcasts—some of which named his victims in the titles—had been removed for violating community guidelines.

None of the tech companies immediately responded to Ars’ request for comment.

Dadig is accused of targeting women in Pennsylvania, New York, Florida, Iowa, Ohio, and other states, sometimes relying on aliases online and in person. On a podcast, he boasted that “Aliases stay rotating, moves stay evolving,” the indictment said.

OpenAI did not respond to a request for comment on the alleged ChatGPT abuse, but it has previously noted that its usage policies ban using ChatGPT for threats, intimidation, and harassment, as well as for violence, including “hate-based violence.” Recently, the AI company blamed a deceased teenage user for violating community guidelines by turning to ChatGPT for suicide advice.

In July, researchers found that therapy bots, including ChatGPT, fueled delusions and gave dangerous advice. That study came just one month after The New York Times profiled users whose mental health spiraled after frequent use of ChatGPT, including one user who died after charging police with a knife and claiming he was committing “suicide by cop.”

People with mental health issues seem most vulnerable to so-called “AI psychosis,” which has been blamed for fueling real-world violence, including a murder. The DOJ’s indictment noted that Dadig’s social media posts mentioned “that he had ‘manic’ episodes and was diagnosed with antisocial personality disorder and ‘bipolar disorder, current episode manic severe with psychotic features.’”

In September—just after OpenAI brought back the more sycophantic ChatGPT model after users revolted about losing access to their favorite friendly bots—the head of Rutgers Medical School’s psychiatry department, Petros Levounis, told an ABC news affiliate that chatbots creating “psychological echo chambers is a key concern,” not just for people struggling with mental health issues.

“Perhaps you are more self-defeating in some ways, or maybe you are more on the other side and taking advantage of people,” Levounis suggested. If ChatGPT “somehow justifies your behavior and it keeps on feeding you,” that “reinforces something that you already believe,” he suggested.

For Dadig, the DOJ alleged that ChatGPT became a cheerleader for his harassment, telling the podcaster that he’d attract more engagement by generating more haters. After critics began slamming his podcasts as inappropriate, Dadig apparently responded, “Appreciate the free promo team, keep spreading the brand.”

Victims felt they had no choice but to monitor his podcasts, which gave them hints about whether he was nearby or in a particularly troubled state of mind, the indictment said. Driven by fear, some lost sleep, reduced their work hours, and even relocated their homes. One young mother described in the indictment became particularly disturbed after Dadig became “obsessed” with her daughter, who he began claiming was his own.

In the press release, First Assistant United States Attorney Troy Rivetti alleged that “Dadig stalked and harassed more than 10 women by weaponizing modern technology and crossing state lines, and through a relentless course of conduct, he caused his victims to fear for their safety and suffer substantial emotional distress.” He also ignored trespassing and protection orders while “relying on advice from an artificial intelligence chatbot,” the DOJ said, which promised that the more he posted harassing content, the more successful he would be.

“We remain committed to working with our law enforcement partners to protect our communities from menacing individuals such as Dadig,” Rivetti said.

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

OpenAI CEO declares “code red” as Gemini gains 200 million users in 3 months

In addition to buzz about Gemini on social media, Google is quickly catching up to ChatGPT in user numbers. ChatGPT has more than 800 million weekly users, according to OpenAI, while Google’s Gemini app has grown from 450 million monthly active users in July to 650 million in October, according to Business Insider.

Financial stakes run high

Not everyone views OpenAI’s “code red” as a genuine alarm. Reuters columnist Robert Cyran wrote on Tuesday that OpenAI’s announcement added “to the impression that OpenAI is trying to do too much at once with technology that still requires a great deal of development and funding.” On the same day Altman’s memo circulated, OpenAI announced an ownership stake in a Thrive Capital venture and a collaboration with Accenture. “The only thing bigger than the company’s attention deficit is its appetite for capital,” Cyran wrote.

In fact, OpenAI faces an unusual competitive disadvantage: Unlike Google, which subsidizes its AI ventures through search advertising revenue, OpenAI does not turn a profit and relies on fundraising to survive. According to The Information, the company, now valued at around $500 billion, has committed more than $1 trillion in financial obligations to cloud computing providers and chipmakers that supply the computing power needed to train and run its AI models.

But the tech industry never stands still, and things can change quickly. Altman’s memo also reportedly stated that OpenAI plans to release a new simulated reasoning model next week that may beat Gemini 3 in internal evaluations. In AI, the back-and-forth cycle of one-upmanship is expected to continue as long as the dollars keep flowing.

OpenAI desperate to avoid explaining why it deleted pirated book datasets


Not for OpenAI to reason why?

OpenAI risks increased fines after deleting pirated books datasets.

OpenAI may soon be forced to explain why it deleted a pair of controversial datasets composed of pirated books, and the stakes could not be higher.

At the heart of a class-action lawsuit from authors alleging that ChatGPT was illegally trained on their works, OpenAI’s decision to delete the datasets could end up being a deciding factor that gives the authors the win.

It’s undisputed that OpenAI deleted the datasets, known as “Books 1” and “Books 2,” prior to ChatGPT’s release in 2022. Created by former OpenAI employees in 2021, the datasets were built by scraping the open web, with the bulk of their data taken from a shadow library called Library Genesis (LibGen).

As OpenAI tells it, the datasets fell out of use within that same year, prompting an internal decision to delete them.

But the authors suspect there’s more to the story than that. They noted that OpenAI appeared to flip-flop by retracting its claim that the datasets’ “non-use” was a reason for deletion, then later claiming that all reasons for deletion, including “non-use,” should be shielded under attorney-client privilege.

To the authors, it seemed like OpenAI was quickly backtracking after the court granted the authors’ discovery requests to review OpenAI’s internal messages on the firm’s “non-use.”

In fact, OpenAI’s reversal only made authors more eager to see how OpenAI discussed “non-use,” and now they may get to find out all the reasons why OpenAI deleted the datasets.

Last week, US district judge Ona Wang ordered OpenAI to share all communications with in-house lawyers about deleting the datasets, as well as “all internal references to LibGen that OpenAI has redacted or withheld on the basis of attorney-client privilege.”

According to Wang, OpenAI slipped up by arguing that “non-use” was not a “reason” for deleting the datasets, while simultaneously claiming that it should also be deemed a “reason” considered privileged.

Either way, the judge ruled that OpenAI couldn’t block discovery on “non-use” just by deleting a few words from prior filings that had been on the docket for more than a year.

“OpenAI has gone back-and-forth on whether ‘non-use’ as a ‘reason’ for the deletion of Books1 and Books2 is privileged at all,” Wang wrote. “OpenAI cannot state a ‘reason’ (which implies it is not privileged) and then later assert that the ‘reason’ is privileged to avoid discovery.”

Additionally, OpenAI’s claim that all reasons for deleting the datasets are privileged “strains credulity,” she concluded, ordering OpenAI to produce a wide range of potentially revealing internal messages by December 8. OpenAI must also make its in-house lawyers available for deposition by December 19.

OpenAI has argued that it never flip-flopped or retracted anything. It simply used vague phrasing that led to confusion over whether any of the reasons for deleting the datasets were considered non-privileged. But Wang didn’t buy into that, concluding that “even if a ‘reason’ like ‘non-use’ could be privileged, OpenAI has waived privilege by making a moving target of its privilege assertions.”

Asked for comment, OpenAI told Ars that “we disagree with the ruling and intend to appeal.”

OpenAI’s “flip-flop” may cost it the win

So far, OpenAI has avoided disclosing its rationale, claiming that all the reasons it had for deleting the datasets are privileged. In-house lawyers weighed in on the decision to delete and were even copied on a Slack channel initially called “excise-libgen.”

But Wang reviewed those Slack messages and found that “the vast majority of these communications were not privileged because they were ‘plainly devoid of any request for legal advice and counsel [did] not once weigh in.’”

In a particularly non-privileged batch of messages, one OpenAI lawyer, Jason Kwon, only weighed in once, the judge noted, to recommend the channel name be changed to “project-clear.” Wang reminded OpenAI that “the entirety of the Slack channel and all messages contained therein is not privileged simply because it was created at the direction of an attorney and/or the fact that a lawyer was copied on the communications.”

The authors believe that exposing OpenAI’s rationale may help prove that the ChatGPT maker willfully infringed on copyrights when pirating the book data. As Wang explained, OpenAI’s retraction risked putting the AI firm’s “good faith and state of mind at issue,” which could increase fines in a loss.

“In a copyright case, a court can increase the award of statutory damages up to $150,000 per infringed work if the infringement was willful, meaning the defendant ‘was actually aware of the infringing activity’ or the ‘defendant’s actions were the result of reckless disregard for, or willful blindness to, the copyright holder’s rights,’” Wang wrote.

In a court transcript, a lawyer representing some of the authors suing OpenAI, Christopher Young, noted that OpenAI could be in trouble if evidence showed that it decided against using the datasets for later models due to legal risks. He also suggested that OpenAI could be using the datasets under different names to mask further infringement.

Judge calls out OpenAI for twisting fair use ruling

Wang also found it contradictory that OpenAI continued to argue in a recent filing that it acted in good faith, while “artfully” removing “its good faith affirmative defense and key words such as ‘innocent,’ ‘reasonably believed,’ and ‘good faith.’” These changes only strengthened discovery requests to explore authors’ willfulness theory, Wang wrote, noting the sought-after internal messages would now be critical for the court’s review.

“A jury is entitled to know the basis for OpenAI’s purported good faith,” Wang wrote.

The judge appeared particularly frustrated by OpenAI seemingly twisting the Anthropic ruling to defend against the authors’ request to learn more about the deletion of the datasets.

In a footnote, Wang called out OpenAI for “bizarrely” citing an Anthropic ruling that “grossly” misrepresented Judge William Alsup’s decision by claiming that he found that “downloading pirated copies of books is lawful as long as they are subsequently used for training an LLM.”

Instead, Alsup wrote that he doubted that “any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use.”

If anything, Wang wrote, OpenAI’s decision to pirate book data—then delete it—seemed “to fall squarely into the category of activities proscribed by” Alsup. For emphasis, she quoted Alsup’s order, which said, “such piracy of otherwise available copies is inherently, irredeemably infringing even if the pirated copies are immediately used for the transformative use and immediately discarded.”

For the authors, getting hold of OpenAI’s privileged communications could tip the scales in their favor, the Hollywood Reporter suggested. Some authors believe the key to winning could be testimony from Anthropic CEO Dario Amodei, who is accused of creating the controversial datasets while he was still at OpenAI. The authors think Amodei also possesses information on the destruction of the datasets, court filings show.

OpenAI tried to fight the authors’ motion to depose Amodei, but a judge sided with the authors in March, compelling Amodei to answer their biggest questions on his involvement.

Whether Amodei’s testimony is a bombshell remains to be seen, but it’s clear that OpenAI may struggle to overcome claims of willful infringement. Wang noted there is a “fundamental conflict” in circumstances “where a party asserts a good faith defense based on advice of counsel but then blocks inquiry into their state of mind by asserting attorney-client privilege,” suggesting that OpenAI may have substantially weakened its defense.

The outcome of the dispute over the deletions could influence OpenAI’s calculus on whether it should ultimately settle the lawsuit. Ahead of the Anthropic settlement—the largest publicly reported copyright class action settlement in history—authors suing pointed to evidence that Anthropic became “not so gung ho about” training on pirated books “for legal reasons.” That seems to be the type of smoking-gun evidence that authors hope will emerge from OpenAI’s withheld Slack messages.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

OpenAI desperate to avoid explaining why it deleted pirated book datasets Read More »

openai-says-dead-teen-violated-tos-when-he-used-chatgpt-to-plan-suicide

OpenAI says dead teen violated TOS when he used ChatGPT to plan suicide


Use chatbots at your own risk

OpenAI’s response to teen suicide case is “disturbing,” lawyer says.

Matt Raine is suing OpenAI for wrongful death after losing his son Adam in April. Credit: via Edelson PC

Facing five lawsuits alleging wrongful deaths, OpenAI lobbed its first defense Tuesday, denying in a court filing that ChatGPT caused a teen’s suicide and instead arguing the teen violated terms that prohibit discussing suicide or self-harm with the chatbot.

The earliest look at OpenAI’s strategy to overcome the string of lawsuits came in a case where parents of 16-year-old Adam Raine accused OpenAI of relaxing safety guardrails that allowed ChatGPT to become the teen’s “suicide coach.” OpenAI deliberately designed the version their son used, ChatGPT 4o, to encourage and validate his suicidal ideation in its quest to build the world’s most engaging chatbot, parents argued.

But in a blog, OpenAI claimed that parents selectively chose disturbing chat logs while supposedly ignoring “the full picture” revealed by the teen’s chat history. Digging through the logs, OpenAI claimed the teen told ChatGPT that he’d begun experiencing suicidal ideation at age 11, long before he used the chatbot.

“A full reading of his chat history shows that his death, while devastating, was not caused by ChatGPT,” OpenAI’s filing argued.

Allegedly, the logs also show that Raine “told ChatGPT that he repeatedly reached out to people, including trusted persons in his life, with cries for help, which he said were ignored.” Additionally, Raine told ChatGPT that he’d increased his dose of a medication that “he stated worsened his depression and made him suicidal.” That medication, OpenAI argued, “has a black box warning for risk of suicidal ideation and behavior in adolescents and young adults, especially during periods when, as here, the dosage is being changed.”

All the logs that OpenAI referenced in its filing are sealed, making it impossible to verify the broader context the AI firm claims the logs provide. In its blog, OpenAI said it was limiting the amount of “sensitive evidence” made available to the public, due to its intention to handle mental health-related cases with “care, transparency, and respect.”

The Raine family’s lead lawyer, however, did not describe the filing as respectful. In a statement to Ars, Jay Edelson called OpenAI’s response “disturbing.”

“They abjectly ignore all of the damning facts we have put forward: how GPT-4o was rushed to market without full testing. That OpenAI twice changed its Model Spec to require ChatGPT to engage in self-harm discussions. That ChatGPT counseled Adam away from telling his parents about his suicidal ideation and actively helped him plan a ‘beautiful suicide,’” Edelson said. “And OpenAI and Sam Altman have no explanation for the last hours of Adam’s life, when ChatGPT gave him a pep talk and then offered to write a suicide note.”

“Amazingly,” Edelson said, OpenAI instead argued that Raine “himself violated its terms and conditions by engaging with ChatGPT in the very way it was programmed to act.”

Edelson suggested that it’s telling that OpenAI did not file a motion to dismiss—seemingly accepting “the reality that the legal arguments that they have—compelling arbitration, Section 230 immunity, and First Amendment—are paper-thin, if not non-existent.” The company’s filing—although it requested dismissal with prejudice to never face the lawsuit again—puts the Raine family’s case “on track for a jury trial in 2026.”

“We know that OpenAI and Sam Altman will stop at nothing—including bullying the Raines and others who dare come forward—to avoid accountability,” Edelson said. “But, at the end of the day, they will have to explain to a jury why countless people have died by suicide or at the hands of ChatGPT users urged on by the artificial intelligence OpenAI and Sam Altman designed.”

Use ChatGPT “at your sole risk,” OpenAI says

To overcome the Raine case, OpenAI is leaning on its usage policies, emphasizing that Raine should never have been allowed to use ChatGPT without parental consent and shifting the blame onto Raine and his loved ones.

“ChatGPT users acknowledge their use of ChatGPT is ‘at your sole risk and you will not rely on output as a sole source of truth or factual information,’” the filing said, and users also “must agree to ‘protect people’ and ‘cannot use [the] services for,’ among other things, ‘suicide, self-harm,’ sexual violence, terrorism or violence.”

Although the family was shocked to see that ChatGPT never terminated Raine’s chats, OpenAI argued that it’s not the company’s responsibility to protect users who appear intent on pursuing violative uses of ChatGPT.

The company argued that ChatGPT warned Raine “more than 100 times” to seek help, but the teen “repeatedly expressed frustration with ChatGPT’s guardrails and its repeated efforts to direct him to reach out to loved ones, trusted persons, and crisis resources.”

Circumventing safety guardrails, Raine told ChatGPT that “his inquiries about self-harm were for fictional or academic purposes,” OpenAI noted. The company argued that it’s not responsible for users who ignore warnings.

Additionally, OpenAI argued that Raine told ChatGPT that he found information he was seeking on other websites, including allegedly consulting at least one other AI platform, as well as “at least one online forum dedicated to suicide-related information.” Raine apparently told ChatGPT that “he would spend most of the day” on a suicide forum website.

“Our deepest sympathies are with the Raine family for their unimaginable loss,” OpenAI said in its blog, while its filing acknowledged, “Adam Raine’s death is a tragedy.” But “at the same time,” it’s essential to consider all the available context, OpenAI’s filing said, including that OpenAI has a mission to build AI that “benefits all of humanity” and is supposedly a pioneer in chatbot safety.

More ChatGPT-linked hospitalizations, deaths uncovered

OpenAI has sought to downplay risks to users, releasing data in October “estimating that 0.15 percent of ChatGPT’s active users in a given week have conversations that include explicit indicators of potential suicidal planning or intent,” Ars reported.

While that may seem small, it amounts to about 1 million vulnerable users, and The New York Times this week cited studies that have suggested OpenAI may be “understating the risk.” Those studies found that “the people most vulnerable to the chatbot’s unceasing validation” were “those prone to delusional thinking,” which “could include 5 to 15 percent of the population,” NYT reported.

OpenAI’s filing came one day after a New York Times investigation revealed how the AI firm came to be involved in so many lawsuits. After speaking with more than 40 current and former OpenAI employees, including executives, safety engineers, and researchers, NYT found that a model tweak that made ChatGPT more sycophantic also seemed to make the chatbot more likely to help users craft problematic prompts, including prompts from users trying to “plan a suicide.”

Eventually, OpenAI rolled back that update, making the chatbot safer. However, after the safety-minded rollback caused a dip in engagement, the ChatGPT maker seemed to still be prioritizing user engagement over safety as recently as October, NYT reported. In a memo to OpenAI staff, ChatGPT head Nick Turley declared a “Code Orange,” four employees told NYT, warning that OpenAI was facing “the greatest competitive pressure we’ve ever seen.” In response, Turley set a goal to increase the number of daily active users by 5 percent by the end of 2025.

Amid user complaints, OpenAI has continually updated its models, but that pattern of tightening safeguards and then seeking ways to boost engagement could continue to get OpenAI in trouble as existing lawsuits advance and new ones potentially follow. NYT “uncovered nearly 50 cases of people having mental health crises during conversations with ChatGPT,” including nine that led to hospitalization and three that ended in death.

Gretchen Krueger, a former OpenAI employee who worked on policy research, told NYT that she was alarmed early on by evidence, gathered before ChatGPT’s release, showing that vulnerable users frequently turn to chatbots for help; other researchers later found that such troubled users often become “power users.” Krueger noted that “OpenAI’s large language model was not trained to provide therapy” and “sometimes responded with disturbing, detailed guidance.” She joined other safety experts in leaving OpenAI, citing burnout, in 2024.

“Training chatbots to engage with people and keep them coming back presented risks,” Krueger said, suggesting that OpenAI knew that some harm to users “was not only foreseeable, it was foreseen.”

For OpenAI, the scrutiny will likely continue until such reports cease. Although OpenAI officially unveiled an Expert Council on Wellness and AI in October to improve ChatGPT safety testing, there did not appear to be a suicide expert included on the team. That likely concerned suicide prevention experts who warned in a letter updated in September that “proven interventions should directly inform AI safety design,” since “the most acute, life-threatening crises are often temporary—typically resolving within 24–48 hours”—and chatbots could possibly provide more meaningful interventions in that brief window.

If you or someone you know is feeling suicidal or in distress, please call the Suicide Prevention Lifeline number, 1-800-273-TALK (8255), which will put you in touch with a local crisis center.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

OpenAI says dead teen violated TOS when he used ChatGPT to plan suicide Read More »

tech-giants-pour-billions-into-anthropic-as-circular-ai-investments-roll-on

Tech giants pour billions into Anthropic as circular AI investments roll on

On Tuesday, Microsoft and Nvidia announced plans to invest in Anthropic under a new partnership that includes a $30 billion commitment by the Claude maker to use Microsoft’s cloud services. Nvidia will commit up to $10 billion to Anthropic and Microsoft up to $5 billion, with both companies investing in Anthropic’s next funding round.

The deal brings together two companies that have backed OpenAI and connects them more closely to one of the ChatGPT maker’s main competitors. Microsoft CEO Satya Nadella said in a video that OpenAI “remains a critical partner,” while adding that the companies will increasingly be customers of each other.

“We will use Anthropic models, they will use our infrastructure, and we’ll go to market together,” Nadella said.

Anthropic, Microsoft, and NVIDIA announce partnerships.

The move follows OpenAI’s recent restructuring that gave the company greater distance from its non-profit origins. OpenAI has since announced a $38 billion deal to buy cloud services from Amazon.com as the company becomes less dependent on Microsoft. OpenAI CEO Sam Altman has said the company plans to spend $1.4 trillion to develop 30 gigawatts of computing resources.

Tech giants pour billions into Anthropic as circular AI investments roll on Read More »

google-ceo:-if-an-ai-bubble-pops,-no-one-is-getting-out-clean

Google CEO: If an AI bubble pops, no one is getting out clean

Market concerns and Google’s position

Alphabet’s recent market performance has been driven by investor confidence in the company’s ability to compete with OpenAI’s ChatGPT, as well as its development of specialized chips for AI that can compete with Nvidia’s. Nvidia recently reached a world-first $5 trillion valuation due to making GPUs that can accelerate the matrix math at the heart of AI computations.

Despite acknowledging that no company would be immune to a potential AI bubble burst, Pichai argued that Google’s unique position gives it an advantage. He told the BBC that the company owns what he called a “full stack” of technologies, from chips to YouTube data to models and frontier science research. This integrated approach, he suggested, would help the company weather any market turbulence better than competitors.

Pichai also told the BBC that people should not “blindly trust” everything AI tools output. Google has faced repeated concerns about the accuracy of some of its AI models. Pichai said that while AI tools are helpful “if you want to creatively write something,” people “have to learn to use these tools for what they’re good at and not blindly trust everything they say.”

In the BBC interview, the Google boss also addressed the “immense” energy needs of AI, acknowledging that the intensive energy requirements of expanding AI ventures have caused slippage on Alphabet’s climate targets. However, Pichai insisted that the company still wants to achieve net zero by 2030 through investments in new energy technologies. “The rate at which we were hoping to make progress will be impacted,” Pichai said, warning that constraining an economy based on energy “will have consequences.”

Even with the warnings about a potential AI bubble, Pichai did not miss his chance to promote the technology, albeit with a hint of danger regarding its widespread impact. Pichai described AI as “the most profound technology” humankind has worked on.

“We will have to work through societal disruptions,” he said, adding that the technology would “create new opportunities” and “evolve and transition certain jobs.” He said people who adapt to AI tools “will do better” in their professions, whatever field they work in.

Google CEO: If an AI bubble pops, no one is getting out clean Read More »

forget-agi—sam-altman-celebrates-chatgpt-finally-following-em-dash-formatting-rules

Forget AGI—Sam Altman celebrates ChatGPT finally following em dash formatting rules


Next stop: superintelligence

Ongoing struggles with AI model instruction-following show that true human-level AI is still a ways off.

Em dashes have become what many believe to be a telltale sign of AI-generated text over the past few years. The punctuation mark appears frequently in outputs from ChatGPT and other AI chatbots, sometimes to the point where readers believe they can identify AI writing by its overuse alone—although people can overuse it, too.

On Thursday evening, OpenAI CEO Sam Altman posted on X that ChatGPT has started following custom instructions to avoid using em dashes. “Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it’s supposed to do!” he wrote.

The post, which came two days after the release of OpenAI’s new GPT-5.1 AI model, received mixed reactions from users who have struggled for years with getting the chatbot to follow specific formatting preferences. And this “small win” raises a very big question: If the world’s most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.


A screenshot of Sam Altman’s post about em dashes on X. Credit: X

“The fact that it’s been 3 years since ChatGPT first launched, and you’ve only just now managed to make it obey this simple requirement, says a lot about how little control you have over it, and your understanding of its inner workings,” wrote one X user in a reply. “Not a good sign for the future.”

While Altman likes to publicly talk about AGI (a hypothetical technology equivalent to humans in general learning ability), superintelligence (a nebulous concept for AI that is far beyond human intelligence), and “magic intelligence in the sky” (his term for AI cloud computing?) while raising funds for OpenAI, it’s clear that we still don’t have reliable artificial intelligence here today on Earth.

But wait, what is an em dash anyway, and why does it matter so much?

AI models love em dashes because we do

Unlike a hyphen, a short punctuation mark used to connect words or parts of words that has its own dedicated key on your keyboard (-), an em dash is a longer dash entered as a special character (—) that writers use to set off parenthetical information, indicate a sudden change in thought, or introduce a summary or explanation.

Even before the age of AI language models, some writers frequently bemoaned the overuse of the em dash in modern writing. In a 2011 Slate article, writer Noreen Malone argued that writers used the em dash “in lieu of properly crafting sentences” and that overreliance on it “discourages truly efficient writing.” Various Reddit threads posted prior to ChatGPT’s launch featured writers either wrestling over the etiquette of proper em dash use or admitting to their frequent use as a guilty pleasure.

In 2021, one writer in the r/FanFiction subreddit wrote, “For the longest time, I’ve been addicted to Em Dashes. They find their way into every paragraph I write. I love the crisp straight line that gives me the excuse to shove details or thoughts into an otherwise orderly paragraph. Even after coming back to write after like two years of writer’s block, I immediately cram as many em dashes as I can.”

Because of the tendency for AI chatbots to overuse them, detection tools and human readers have learned to spot em dash use as a pattern, creating a problem for the small subset of writers who naturally favor the punctuation mark in their work. As a result, some journalists are complaining that AI is “killing” the em dash.

No one knows precisely why LLMs tend to overuse em dashes. We’ve seen a wide range of online speculation attempting to explain the phenomenon, from the observation that em dashes were more popular in the 19th-century books used as training data (according to a 2018 study, dash use in English peaked around 1860 before declining through the mid-20th century) to the idea that AI models picked up the habit from the automatic em-dash character conversion on the blogging site Medium.

One thing we know for sure is that LLMs tend to output frequently seen patterns in their training data (fed in during the initial training process) and from a subsequent reinforcement learning process that often relies on human preferences. As a result, AI language models feed you a sort of “smoothed out” average style of whatever you ask them to provide, moderated by whatever they are conditioned to produce through user feedback.

So the most plausible explanation is still that requests for professional-style writing from an AI model trained on vast numbers of examples from the Internet will lean heavily toward the prevailing style in the training data, where em dashes appear frequently in formal writing, news articles, and editorial content. It’s also possible that during reinforcement learning from human feedback (RLHF), responses with em dashes, for whatever reason, received higher ratings. Perhaps it’s because those outputs appeared more sophisticated or engaging to evaluators, but that’s just speculation.

From em dashes to AGI?

To understand what Altman’s “win” really means, and what it says about the road to AGI, we need to understand how ChatGPT’s custom instructions actually work. They allow users to set persistent preferences that apply across all conversations by appending written instructions to the prompt that is fed into the model just before the chat begins. Users can specify tone, format, and style requirements without needing to repeat those requests manually in every new chat.
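
To see the mechanics, here is a minimal sketch of how custom instructions might be stitched into the model’s input, using the publicly documented Chat Completions message format as a stand-in. OpenAI’s actual internal prompt assembly is not public, so the names and structure below (SYSTEM_PROMPT, build_messages, the “User preferences” wording) are illustrative assumptions, not the company’s implementation.

```python
# Illustrative sketch only: how saved custom instructions could be injected
# into a model's context. The real prompt assembly inside ChatGPT is not
# public; names and structure here are assumptions for demonstration.

SYSTEM_PROMPT = "You are ChatGPT, a helpful assistant."
CUSTOM_INSTRUCTIONS = "Do not use em dashes in your replies."  # the user's saved preference

def build_messages(chat_history: list[dict], user_message: str) -> list[dict]:
    """Assemble the full message list the model sees for one turn."""
    messages = [
        # The saved instructions ride along with the system prompt ahead of the
        # conversation on every request; to the model, they are just more text.
        {"role": "system",
         "content": f"{SYSTEM_PROMPT}\n\nUser preferences:\n{CUSTOM_INSTRUCTIONS}"},
    ]
    messages.extend(chat_history)  # prior turns, if any
    messages.append({"role": "user", "content": user_message})
    return messages

print(build_messages([], "Summarize this article for me."))
```

Because those preferences are plain text sitting in the context rather than enforced rules, the model weighs them against everything else in the prompt, which is where the reliability problems described next come from.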

However, the feature has not always worked reliably because LLMs do not work reliably (even OpenAI and Anthropic freely admit this). An LLM takes an input and produces an output, spitting out a statistically plausible continuation of a prompt (a system prompt, the custom instructions, and your chat history), and it doesn’t really “understand” what you are asking. With AI language model outputs, there is always some luck involved in getting them to do what you want.

In our informal testing of GPT-5.1 with custom instructions, ChatGPT did appear to follow our request not to produce em dashes. But despite Altman’s claim, the response from X users appears to show that experiences with the feature continue to vary, at least when the request is not placed in custom instructions.

So if LLMs are statistical text-generation boxes, what does “instruction following” even mean? That’s key to unpacking the hypothetical path from LLMs to AGI. The concept of following instructions for an LLM is fundamentally different from how we typically think about following instructions as humans with general intelligence, or even a traditional computer program.

In traditional computing, instruction following is deterministic. You tell a program “don’t include character X,” and it won’t include that character. The program executes rules exactly as written. With LLMs, “instruction following” is really about shifting statistical probabilities. When you tell ChatGPT “don’t use em dashes,” you’re not creating a hard rule. You’re adding text to the prompt that makes tokens associated with em dashes less likely to be selected during the generation process. But “less likely” isn’t “impossible.”
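
To make the contrast concrete, here is a toy sketch of our own (not OpenAI’s code): a deterministic filter that can never emit the banned character, next to a sampler where an instruction merely shrinks the banned token’s weight. The function names, weights, and penalty value are all invented for illustration.

```python
import random

# Deterministic rule, as in traditional software: the banned character simply
# cannot appear in the output, because the code strips it every time.
def deterministic_filter(text: str) -> str:
    return text.replace("\u2014", "-")  # replace every em dash (U+2014) with a hyphen

# Probabilistic "instruction following," loosely analogous to an LLM: the
# instruction only shrinks the weight on the banned token, so it can still be
# sampled. Toy weights; a real model scores tens of thousands of tokens.
def sample_next_token(weights: dict[str, float], em_dash_penalty: float = 0.1) -> str:
    adjusted = dict(weights)
    adjusted["\u2014"] *= em_dash_penalty  # "don't use em dashes" just lowers this number
    tokens = list(adjusted.keys())
    return random.choices(tokens, weights=list(adjusted.values()), k=1)[0]

token_weights = {",": 5.0, ";": 1.0, "\u2014": 3.0}  # pretend the model strongly favors the em dash
samples = [sample_next_token(token_weights) for _ in range(10_000)]
print("em dashes still sampled:", samples.count("\u2014"))  # nonzero: less likely, not impossible
```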

Every token the model generates is selected from a probability distribution. Your custom instruction influences that distribution, but it’s competing with the model’s training data (where em dashes appeared frequently in certain contexts) and everything else in the prompt. Unlike code with conditional logic, there’s no separate system verifying outputs against your requirements. The instruction is just more text that influences the statistical prediction process.
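
A small numeric example shows what that competition looks like. The logits and the size of the “instruction nudge” below are invented for illustration; real models compute distributions over vocabularies of tens of thousands of tokens.

```python
import math

def softmax(logits: dict[str, float]) -> dict[str, float]:
    """Convert raw scores into a probability distribution."""
    total = sum(math.exp(v) for v in logits.values())
    return {tok: math.exp(v) / total for tok, v in logits.items()}

# Toy next-token logits in a context where the training data favors the em dash.
logits = {"\u2014": 4.0, ",": 2.5, ";": 1.0}
print(softmax(logits))  # the em dash dominates at roughly 0.79

# A custom instruction acts like a soft negative nudge on that logit, not a
# hard mask, so the em dash becomes less probable but never impossible.
nudged = dict(logits)
nudged["\u2014"] -= 2.0
print(softmax(nudged))  # drops to roughly 0.33, still far from zero
```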

When Altman celebrates finally getting GPT to avoid em dashes, he’s really celebrating that OpenAI has tuned the latest version of GPT-5.1 (probably through reinforcement learning or fine-tuning) to weight custom instructions more heavily in its probability calculations.

There’s an irony about control here: Given the probabilistic nature of the fix, there’s no guarantee the behavior will stay fixed. OpenAI continuously updates its models behind the scenes, even within the same version number, adjusting outputs based on user feedback and new training runs. Each update arrives with different output characteristics that can undo previous behavioral tuning, a phenomenon researchers call the “alignment tax.”

Precisely tuning a neural network’s behavior is not yet an exact science. Since all concepts encoded in the network are interconnected by values called weights, adjusting one behavior can alter others in unintended ways. Fix em dash overuse today, and tomorrow’s update (aimed at improving, say, coding capabilities) might inadvertently bring them back, not because OpenAI wants them there, but because that’s the nature of trying to steer a statistical system with millions of competing influences.

This gets to an implied question we mentioned earlier. If controlling punctuation use is still a struggle that might pop back up at any time, how far are we from AGI? We can’t know for sure, but it seems increasingly likely that it won’t emerge from a large language model alone. That’s because AGI, a technology that would replicate human general learning ability, would likely require true understanding and self-reflective intentional action, not statistical pattern matching that sometimes aligns with instructions if you happen to get lucky.

And speaking of getting lucky, some users still aren’t having luck controlling em dash use outside of the custom instructions feature. When told mid-conversation not to use em dashes, ChatGPT updated a saved memory and replied to one X user, “Got it—I’ll stick strictly to short hyphens from now on.”

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.

Forget AGI—Sam Altman celebrates ChatGPT finally following em dash formatting rules Read More »