Features

The ISS is nearing retirement, so why is NASA still gung-ho about Starliner?


NASA is doing all it can to ensure Boeing doesn’t abandon the Starliner program.

Boeing’s Starliner spacecraft atop a United Launch Alliance Atlas V rocket before a test flight in 2019. Credit: NASA/Joel Kowsky

After so many delays, difficulties, and disappointments, you might be inclined to think that NASA wants to wash its hands of Boeing’s troubled Starliner spacecraft.

But that’s not the case.

The manager of NASA’s commercial crew program, Steve Stich, told reporters Thursday that Boeing and its propulsion supplier, Aerojet Rocketdyne, are moving forward with several changes to the Starliner spacecraft to resolve problems that bedeviled a test flight to the International Space Station (ISS) last year. These changes include new seals to plug helium leaks, plus thermal shunts and barriers to keep the spacecraft’s thrusters from overheating.

Boeing, now more than $2 billion in the hole from paying for all of Starliner’s delays, is still more than a year away from executing on its multibillion-dollar NASA contract and beginning crew rotation flights to the ISS. But NASA officials say Boeing remains committed to Starliner.

“We really are working toward a flight as soon as early next year with Starliner, and then ultimately, our goal is to get into crew rotation flights with Starliner,” Stich said. “And those would start no earlier than the second crew rotation slot at the end of next year.”

That would be 11 years later than Boeing officials anticipated the spacecraft would enter operational service for NASA when they announced the Starliner program in 2010.

Decision point

The next Starliner flight will probably transport only cargo to the ISS, not astronauts. But NASA hasn’t made any final decisions on the matter. The agency has enough crew rotation missions booked to fly on SpaceX’s Dragon spacecraft to cover the space station’s needs until well into 2027 or 2028.

“I think there are a lot of advantages, I would say, to fly the cargo flight first,” Stich said. “If we really look at the history of Starliner and Dragon, I think Dragon benefited a lot from having earlier [cargo] flights before the crew contract was let for the space station.”

One drawback of flying a Starliner cargo mission is that it will use up one of United Launch Alliance’s remaining Atlas V rockets currently earmarked for a future Starliner crew launch. That means Boeing would have to turn to another rocket to accomplish its full contract with NASA, which covers up to six crew missions.

While Boeing says Starliner can launch on several different rockets, the difficulty of adapting the spacecraft to a new launch vehicle, such as ULA’s Vulcan, shouldn’t be overlooked. Early in Starliner’s development, Boeing and ULA had to overcome an issue with unexpected aerodynamic loads discovered during wind tunnel testing. This prompted engineers to design an aerodynamic extension, or skirt, to go underneath the Starliner spacecraft on top of its Atlas V launcher.

Starliner has suffered delays from the beginning. A NASA budget crunch in the early 2010s pushed back the program about two years, but the rest of the schedule slips have largely fallen on Boeing’s shoulders. The setbacks included a fuel leak and fire during a critical ground test, parachute problems, a redesign to accommodate unanticipated aerodynamic forces, and a computer timing error that cut short Starliner’s first attempt to reach the space station in 2019.

This all culminated in the program’s first test flight with astronauts last summer. But after running into helium leaks and overheating thrusters, the mission ended with Starliner returning to Earth empty, while the spacecraft’s two crew members remained on the International Space Station until they could come home on a SpaceX Dragon spacecraft this year.

The outcome was a stinging disappointment for Boeing. Going into last year’s crew test flight, Boeing appeared to be on the cusp of joining SpaceX and finally earning revenue as one of NASA’s certified crew transportation providers for the ISS.

For several months, Boeing officials were strikingly silent on Starliner’s future. The company declined to release any statements on its long-term commitment to the program, and a Boeing program manager unexpectedly withdrew from a NASA press conference marking the end of the Starliner test flight last September.

Kelly Ortberg, Boeing’s president and CEO, testifies before the Senate Commerce, Science, and Transportation Committee on April 2, 2025, in Washington, DC. Credit: Win McNamee/Getty Images

But that has changed in the last few months. Kelly Ortberg, who took over as Boeing’s CEO last year, told CNBC in April that the company planned “more missions on Starliner” and said work to overcome the thruster issues the spacecraft encountered last year is “pretty straightforward.”

“We know what the problems were, and we’re making corrective actions,” Ortberg said. “So, we hope to do a few more flights here in the coming years.”

Task and purpose

NASA officials remain eager for Starliner to begin these regular crew rotation flights, even as its sole destination, the ISS, enters its sunset years. NASA and its international partners plan to decommission and scuttle the space station in 2030 and 2031, more than 30 years after the launch of the lab’s first module.

NASA’s desire to bring Starliner online has nothing to do with any performance issues with SpaceX, the agency’s other commercial crew provider. SpaceX has met or exceeded all of NASA’s expectations in 11 long-duration flights to the ISS with its Dragon spacecraft. Since its first crew flight in 2020, SpaceX has established a reliable cadence with Dragon missions serving NASA and private customers.

However, there are some questions about SpaceX’s long-term plans for the Dragon program, and those concerns didn’t suddenly spring up last month, when SpaceX founder and chief executive Elon Musk suggested on X that SpaceX would “immediately” begin winding down the Dragon program. The suggestion came as Musk and President Donald Trump traded threats and insults on social media, a dramatic falling out between one-time political allies just months into Trump’s second term in the White House.

In a subsequent post on X, Musk quickly went back on his threat to soon end the Dragon program. SpaceX officials participating in NASA press conferences in the last few weeks have emphasized the company’s dedication to human spaceflight without specifically mentioning Dragon. SpaceX’s fifth and final human-rated Dragon capsule debuted last month on its first flight to the ISS.

“I would say we’re pretty committed to the space business,” said Bill Gerstenmaier, SpaceX’s vice president of build and flight reliability. “We’re committed to flying humans in space and doing it safely.”

There’s a kernel of truth behind Musk’s threat to decommission Dragon. Musk has long had an appetite to move on from the Dragon program and pivot more of SpaceX’s resources to Starship, the company’s massive next-generation rocket. Starship is envisioned by SpaceX as an eventual replacement for Dragon and the Falcon 9 launcher.

A high-resolution commercial Earth-imaging satellite owned by Maxar captured this view of the International Space Station on June 7, 2024, with Boeing’s Starliner capsule docked at the lab’s forward port (lower right). Credit: Satellite image © 2024 Maxar Technologies

NASA hopes commercial space stations can take over for the ISS after its retirement, but there’s no guarantee SpaceX will still be flying Dragon in the 2030s. This injects some uncertainty into plans for commercial space stations.

One possible scenario is that, sometime in the 2030s, the only options for transporting people to and from commercial space stations in low-Earth orbit could be Starliner and Starship. We’ll discuss the rationale for this scenario later in this story.

While the cost of a seat on SpaceX’s Dragon is well known, there’s low confidence in the price of a ticket to low-Earth orbit on Starliner or Starship. What’s more, some of the commercial outposts may be incompatible with Starship because of its enormous mass, which could overwhelm the ability of a relatively modest space station to control its orientation. NASA identified this as an issue with its Gateway mini-space station in development to fly in orbit around the Moon.

It’s impossible to predict when SpaceX will pull the plug on Dragon. The same goes for Boeing and Starliner. But NASA and other customers are interested in buying more Dragon flights.

If SpaceX can prove Starship is safe enough to launch and land with people onboard, Dragon’s days will be numbered. But Starship is likely at least several years from being human-rated for flights to and from low-Earth orbit. NASA’s contract with SpaceX to develop a version of Starship to land astronauts on the Moon won’t require the ship to be certified for launches and landings on Earth. In some ways, that’s a more onerous challenge than the Moon mission because of the perils of reentering Earth’s atmosphere, which Starship won’t need to endure for a lunar landing, and the ship’s lack of a launch abort system.

Once operational, Starship is designed to carry significantly more cargo and people than Falcon 9 and Dragon, but it’s anyone’s guess when it might be ready for crew missions. Until then, if SpaceX wants to have an operational human spaceflight program, it’s Dragon or bust.

For the International Space Station, it’s also Dragon or bust, at least until Boeing gets going. SpaceX’s capsules are the only US vehicles certified to fly to space with NASA astronauts, and any more US government payments to Russia to launch Americans on Soyuz missions would be politically unpalatable.

From the start of the commercial crew program, NASA sought two contractors providing their own means of flying to and from the ISS. The main argument for this “dissimilar redundancy” was to ensure NASA could still access the space station in the event of a launch failure or some other technical problem. The same argument could be made now that NASA needs two options to avoid being at the whim of one company’s decisions.

Stretching out

All of this is unfolding as the Trump administration seeks to slash funding for the International Space Station, cut back on the lab’s research program, and transition to “minimal safe operations” for the final few years of its life. Essentially, the space station would limp to the finish line, perhaps with a smaller crew than the seven-person staff living and working in it today.

At the end of this month, SpaceX is scheduled to launch the Crew-11 mission—the 12th Dragon crew mission for NASA and the 11th fully operational crew ferry flight to the ISS. Two Americans, one Japanese astronaut, and a Russian cosmonaut will ride to the station for a stay of at least six months.

NASA’s existing contract with SpaceX covers four more long-duration flights to the space station with Dragon, including the mission set to go on July 31.

One way NASA can save money in the space station’s budget is by simply flying fewer missions. Stich said Thursday that NASA is working with SpaceX to extend the Dragon spacecraft’s mission duration limit from seven months to eight months. The recertification of Dragon for a longer mission could be finished later this year, allowing NASA to extend Crew-11’s stay at the ISS if needed. Over time, longer stays mean fewer crew rotation missions.

“We can extend the mission in real-time as needed as we better understand… the appropriations process and what that means relative to the overall station manifest,” Stich said.

Boeing’s Starliner spacecraft backs away from the International Space Station on September 6, 2024, without its crew. Credit: NASA

Boeing’s fixed-price contract with NASA originally covered an unpiloted test flight of Starliner, a demonstration flight with astronauts, and then up to six operational missions delivering crews to the ISS. But NASA has only given Boeing the “Authority To Proceed” for three of its six potential operational Starliner missions. This milestone, known as ATP, is a decision point in contracting lingo where the customer—in this case, NASA—places a firm order for a deliverable. NASA has previously said it awards these task orders about two to three years prior to a mission’s launch.

If NASA opts to go to eight-month missions on the ISS with Dragon and Starliner, the agency’s firm orders for three Boeing missions and four more SpaceX crew flights would cover the agency’s needs into early 2030, not long before the final crew will depart the space station.

Stich said NASA officials are examining their options. These include whether NASA should book more crew missions with SpaceX, authorize Boeing to prepare for additional Starliner flights beyond the first three, or order no more flights at all.

“As we better understand the budget and better understand what’s in front of us, we’re working through that,” Stich said. “It’s really too early to speculate how many flights we’ll fly with each provider, SpaceX and Boeing.”

Planning for the 2030s

NASA officials also have an eye on what happens after 2030. The agency has partnered with commercial teams led by Axiom, Blue Origin, and Voyager Technologies on plans for privately owned space stations in low-Earth orbit to replace some of the research capabilities lost with the end of the ISS program.

The conventional wisdom goes that these new orbiting outposts will be less expensive to operate than the ISS, making them more attractive to commercial clients, ranging from pharmaceutical research and in-space manufacturing firms to thrill-seeking private space tourists. NASA, which seeks to maintain a human presence in low-Earth orbit as it turns toward the Moon and Mars, will initially be an anchor customer until the space stations build up more commercial demand.

These new space stations will need a way to receive cargo and visitors. NASA wants to preserve the existing commercial cargo and crew transport systems so they’re available for commercial space stations in the 2030s. Stich said NASA is looking at transferring the rights for any of the agency’s commercial crew missions that don’t fly to the ISS over to the commercial space stations. Of NASA’s two commercial crew providers, Boeing currently looks more likely than SpaceX to have unused contract capacity when the ISS program ends.

This is a sweetener NASA could offer to its stable of private space station developers as they face other hurdles in getting their hardware off the ground. It’s unclear whether a business case exists to justify the expense of building and operating a commercial outpost in orbit or if the research and manufacturing customers that could use a private space station might find a cheaper option in robotic flying laboratories, such as those being developed by Varda Space Industries.

A rendering of Voyager’s Starlab space station. Credit: Voyager Space

NASA’s policies haven’t helped matters. Analysts say NASA’s financial support for private space station developers has lagged, and the agency’s fickle decision-making on when to retire the International Space Station has made private fundraising more difficult. It’s not a business for the faint-hearted. For example, Axiom has gone through several rounds of layoffs in the last year.

The White House’s budget request for fiscal year 2026 proposes a 25 percent cut to NASA’s overall budget, but the funding line for commercial space stations is an area marked for an increase. Still, there’s a decent chance that none of the proposed commercial outposts will be flying when the ISS crashes back to Earth. In that event, China would be the owner and operator of the only space station in orbit.

At least at first, transportation costs will be the largest expense for any company that builds and operates a privately owned space station. It costs NASA about 40 percent more each year to ferry astronauts and supplies to and from the ISS than it does to operate the space station. For a smaller commercial outpost with reduced operating costs, the gap will likely be even wider.

If Boeing can right the ship with Starliner and NASA offers a few prepaid crew missions to private space station developers, the money saved could help close someone’s business case and hasten the launch of a new era in commercial spaceflight.

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.


Two guys hated using Comcast, so they built their own fiber ISP


Brothers-in-law use construction knowledge to compete against Comcast in Michigan.

Two young men stand outside next to service vans with a logo for Prime-One, the Internet provider they founded.

Samuel Herman (left) and Alexander Baciu (right), founders of Prime-One. Credit: Prime-One

Samuel Herman and Alexander Baciu never liked using Comcast’s cable broadband. Now, the residents of Saline, Michigan, operate a fiber Internet service provider that competes against Comcast in their neighborhoods and has ambitions to expand.

“All throughout my life pretty much, I’ve had to deal with Xfinity’s bullcrap, them not being able to handle the speeds that we need,” Herman told Ars. “I lived in a house of 10. I have seven other brothers and sisters, and there’s 10 of us in total with my parents.”

With all those kids using the Internet for school and other needs, “it just doesn’t work out,” he said. Herman was particularly frustrated with Comcast upload speeds, which are much slower than the cable service’s download speeds.

“Many times we would have to call Comcast and let them know our bandwidth was slowing down… then they would say, ‘OK, we’ll refresh the system.’ So then it would work again for a week to two weeks, and then again we’d have the same issues,” he said.

Herman, now 25, got married in 2021 and started building his own house, and he tried to find another ISP to serve the property. He was familiar with local Internet service providers because he worked in construction for his father’s company, which contracts with ISPs to build their networks.

But no fiber ISP was looking to compete directly against Comcast where he lived, though Metronet and 123NET offer fiber elsewhere in the city, Herman said. He ended up paying Comcast $120 a month for gigabit download service with slower upload speeds. Baciu, who lives about a mile away from Herman, was also stuck with Comcast and was paying about the same amount for gigabit download speeds.

$80 for gigabit fiber, unlimited data

Herman said he was the chief operating officer of his father’s construction company and that he shifted the business “from doing just directional drilling to be a turnkey contractor for ISPs.” Baciu, Herman’s brother-in-law (having married Herman’s oldest sister), was the chief construction officer. Fueled by their knowledge of the business and their dislike of Comcast, they founded a fiber ISP called Prime-One.

Now, Herman is paying $80 a month to his own company for symmetrical gigabit service. Prime-One also offers 500Mbps for $75, 2Gbps for $95, and 5Gbps for $110. The first 30 days are free, and all plans have unlimited data and no contracts.

“We are 100 percent fiber optic,” Baciu told Ars. “Everything that we’re doing is all underground. We’re not doing aerial because we really want to protect the infrastructure and make sure we’re having a reliable connection.”

Each customer’s Optical Network Terminal (ONT) and other equipment is included in the service plan. Prime-One provides a modem and the ONT, plus a Wi-Fi router if the customer prefers not to use their own router. They don’t charge equipment or installation fees, Herman and Baciu said.

Prime-One began serving customers in January 2025, and Baciu said the network has been built to about 1,500 homes in Saline with about 75 miles of fiber installed. Prime-One intends to serve nearby towns as well, with the founders saying the plan is to serve 4,000 homes with the initial build and then expand further.

“This is our backyard”

Herman and Baciu’s main competition in their initial build area is Comcast and Frontier’s DSL service, they said. So far, they have built only to single-family homes, but they plan to serve multi-unit residential buildings, too.

“We started building in an area that’s a lot more rural,” where people have fewer options than in more densely populated areas, Herman said. “This is our home, this is our backyard, so we take this build very, very seriously.”

Baciu, who is 29, said that residents seem excited to have a new Internet option. “It’s so nice to see the excitement that they have. [People say], ‘Oh my gosh, I told everybody about Prime-One. My neighbor cannot wait for you guys to have them up, too. My boss is asking, my grandma’s asking.’ It’s a beautiful thing,” he said.

A bit more than 100 residents have bought service so far, they said. Herman said the company is looking to sign up about 30 percent of the homes in its network area to make a profit. “I feel fairly confident,” Herman said, noting the number of customers who signed up with the initial construction not even halfway finished.

Prime-One’s founders originally told us the 4,000-home build would be completed at the end of August, but Baciu indicated more recently that it will take longer than that. “We are working on sales for the next couple of months before continuing the rest of the build,” Baciu said.

Herman and Baciu started thinking about building an ISP about two years ago. With no fiber companies looking to compete against Comcast where they lived, “that was a trigger,” Baciu said. “We kept on talking. We’re like, hey, we’re doing this work for other people, why not?” In August 2024, they signed a contract with a firm that provides backhaul service, IP address assignments, and other key connectivity needs.

“We said, ‘let’s try to do it ourselves’”

ISPs generally want to build in areas where homes are built close together, requiring less fiber construction to serve more customers and make a bigger profit. Existing ISPs didn’t seem interested in expanding to where Herman and Baciu live, Herman said.

“We have spoken to all of these Internet service providers and asked them to come and service these areas. I knew that there was a dire need in this area and that everybody was sick of the Xfinity BS,” Herman said.

Having worked in construction for ISPs, they already had experience installing fiber lines and conduits.

A Prime-One installer working on a fiber build. Credit: Prime-One

“We said, ‘you know, what the hell, why not? Let’s try to do it ourselves,'” Herman said. “We know we can handle the construction, we know we can handle all that area. We need some assistance on the technical side. So we hired the right people to handle the technical side and to handle the OSS/BSS software and to manage our dark fiber. And from there, we’re here where we’re at, within six months. We have over a hundred customers on our network, and we’re still building.”

Before construction, the brothers-in-law met with Jared Mauch, a Michigan man who built a fiber-to-the-home Internet provider because he couldn’t get good broadband service from AT&T or Comcast. We wrote about Mauch in 2021, when he was providing service to about 30 rural homes, and again in 2022, when he was expanding to hundreds of more homes.

Though Herman and Baciu already knew how to install fiber, Mauch “gave us quite a lot of insight on what to do, how to build, and on the actual ISP side… he showed us the way he did things on the technical side for the ISP, what strategies he used and what products he used,” Herman said.

The brothers-in-law didn’t end up using all the networking products Mauch suggested “because we are building a much larger network than he was,” Herman said. They went mostly with Nokia products for equipment like the optical network terminal installed at customer homes, he said.

Local employees

Baciu said he was frustrated by Comcast customer support being mostly limited to online chats instead of phone support. Prime-One has 15 local employees, mostly installers and technicians, with other employees working in customer service and operations, Herman said.

Prime-One offers phone and chat support, and “many people want to be able to see someone face to face, which is very easy for us to do since we have people here locally,” Herman said.

Network uptime has been good so far, Herman and Baciu said. “The only outage we’ve had was due to severe weather that caused a massive outage” for multiple networks, Herman said. “Any time any customers are experiencing an outage, maybe because of a lawnmower that cut their service line or anything, we guarantee a two- to four-hour time to repair it. And on top of that, to promote the fact that we discourage outages and we are working our best to fix them, we offer $5 back for every hour that they’re out of service.”

Comcast seems to have noticed, Herman said. “They’ve been calling our clients nonstop to try to come back to their service, offer them discounted rates for a five-year contract and so on,” he said.

Comcast touts upgrades, new unlimited data option

A Comcast spokesperson told Ars that “we have upgraded our network in this area and offer multi-gig speeds there, and across Michigan, as part of our national upgrade that has been rolling out.”

Meanwhile, Comcast’s controversial data caps are being phased out. With Comcast increasingly concerned about customer losses, it recently overhauled its offerings with four plans that come with unlimited data. The Comcast data caps aren’t quite dead yet because customers with caps have to switch to a new plan to get unlimited data.

Comcast told us that customers in Saline “have access to our latest plans with simple and predictable all-in pricing that includes unlimited data, Wi-Fi equipment, a line of Xfinity Mobile, and the option for a one or five-year price guarantee.”

Prime-One’s arrival on the scene caught some local people’s attention in a Reddit thread. One person who said they signed up for Prime-One wrote, “I’m honestly very impressed with the service overall. Comcast was charging me for every little thing on my account and the bill always found a way to get higher than expected, especially going over my data cap. Prime-One has no data caps and the bill has been the same since I first joined, not to mention they offer the first month free… I’m happy to see a company come out here and give us a better option.”

Comcast is facing competition from more than just Prime-One. The City of Saline government recently said there’s been an uptick in fiber construction in the city by Metronet and Frontier. Baciu said those builds don’t appear to be in the areas that Prime-One is serving. “To our knowledge, both Frontier and MetroNet have recently begun building in adjacent areas near our current footprint, but not within the zones we’re serving directly,” he said.

While Prime-One is a small ISP, Herman said the company’s expansion ambitions are bigger than he can reveal just now. “We have plans that we cannot disclose at this moment, but we do have a plan to expand,” he said.

Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.


It’s hunting season in orbit as Russia’s killer satellites mystify skywatchers


“Once more, we play our dangerous game—a game of chess—against our old adversary.”

In this pool photograph distributed by the Russian state media agency Sputnik, Russia’s President Vladimir Putin gives a speech during the Victory Day military parade at Red Square in central Moscow on May 9, 2025. Credit: Vyacheslav Prokofyev/Pool/AFP via Getty Images

Russia is a waning space power, but President Vladimir Putin has made sure he still has a saber to rattle in orbit.

This has become more evident in recent weeks, when we saw a pair of rocket launches carrying top-secret military payloads, the release of a mysterious object from a Russian mothership in orbit, and a sequence of complex formation-flying maneuvers with a trio of satellites nearly 400 miles up.

In isolation, each of these things would catch the attention of Western analysts. Taken together, the frenzy of maneuvers represents one of the most significant surges in Russian military space activity since the end of the Cold War. What’s more, all of this is happening as Russia lags further behind the United States and China in everything from rockets to satellite manufacturing. Russian efforts to develop a reusable rocket, field a new human-rated spacecraft to replace the venerable Soyuz, and launch a megaconstellation akin to SpaceX’s Starlink are going nowhere fast.

Russia has completed just eight launches to orbit so far this year, compared to 101 orbital attempts by US launch providers and 36 from China. This puts Russia on pace for its fewest orbital launch attempts since 1961, the year Soviet citizen Yuri Gagarin became the first person to fly in space.

For the better part of three decades, Russia’s space program could rely on money from Western governments and commercial companies to build rockets, launch satellites, and ferry astronauts to and from the International Space Station. The money tap dried up after Russia’s invasion of Ukraine. Russia also lost access to Ukrainian-made components for its launch vehicles and satellites.

Chasing a Keyhole

Amid this retrenchment, Russia is targeting what’s left of its capacity for innovation in space toward pestering the US military. US intelligence officials last year said they believed Russia was pursuing a project to place a nuclear weapon in space. The detonation of a nuclear bomb in orbit could muck up the space environment for years, indiscriminately disabling countless satellites, whether they’re military or civilian.

Russia denied that it planned to launch a satellite with a nuclear weapon, but the country’s representative in the United Nations vetoed a Security Council resolution last year that would have reaffirmed a nearly 50-year-old ban on placing weapons of mass destruction into orbit.

While Russia hasn’t actually put a nuclear bomb into orbit yet, it’s making progress in fielding other kinds of anti-satellite systems. Russia destroyed one of its own satellites with a ground-launched missile in 2021, and high above us today, Russian spacecraft are stalking American spy satellites and keeping US military officials on their toes with a rapid march toward weaponizing space.

The world’s two other space powers, the United States and China, are developing their own “counter-space” weapons. But the US and Chinese militaries have largely focused on using their growing fleets of satellites as force multipliers in the terrestrial domain, enabling precision strikes, high-speed communications, and targeting for air, land, and naval forces. That is starting to change, with US Space Force commanders now openly discussing their own ambitions for offensive and defensive counter-space weapons.

Three of Russia’s eight orbital launches this year have carried payloads that could be categorized as potential anti-satellite weapons, or at least prototypes testing novel technologies that could lead to one. (For context, three of Russia’s other launches this year have gone to the International Space Station, and two launched conventional military communications or navigation satellites.)

One of these mystery payloads launched on May 23, when a Soyuz rocket boosted a satellite into a nearly 300-mile-high orbit perfectly aligned with the path of a US spy satellite owned by the National Reconnaissance Office. The new Russian satellite, designated Kosmos 2588, launched into the same orbital plane as an American satellite known to the public as USA 338, which is widely believed to be a bus-sized KH-11, or Keyhole-class, optical surveillance satellite.

A conceptual drawing of a KH-11 spy satellite, with internal views, based on likely design similarities to NASA’s Hubble Space Telescope. Credit: Giuseppe De Chiara/CC BY-SA 3.0

The governments of Russia and the United States use the Kosmos and USA monikers as cover names for their military satellites.

While their exact design and capabilities are classified, Keyhole satellites are believed to provide the sharpest images of any spy satellite in orbit. They monitor airfields, naval ports, missile plants, and other strategic sites across the globe. In today’s geopolitical climate, China, Russia, Iran, and North Korea are the likeliest targets for the NRO’s Keyhole satellites. To put it succinctly, Keyhole satellites are some of the US government’s most prized assets in space.

Therefore, it’s not surprising that a potential military adversary might want to learn more about them or be in a position to disable or destroy them in the event of war.

Orbital ballet

A quick refresher on orbital mechanics is necessary here. Satellites orbit the Earth in flat planes fixed in inertial space. It’s not a perfect analogy, but the easiest way to grasp this concept is to imagine the background of stars in the sky as a reference map. In the short term, the position of a satellite’s orbit will remain unchanged on this reference map without any perturbation. For something in low-Earth orbit, Earth’s rotation presents a different part of the world to the satellite each time it loops around the planet.

It takes a lot of fuel to make changes to a satellite’s orbital plane, so if you want to send a satellite to rendezvous with another spacecraft already in orbit, it’s best to wait until our planet’s rotation brings the launch site directly under the orbital plane of the target. This happens twice per day for a satellite in low-Earth orbit.
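For readers who want the math, the condition is simple spherical trigonometry: a launch site at latitude φ sits in an orbital plane with inclination i and ascending node Ω whenever the site’s local sidereal time θ satisfies sin(θ − Ω) = tan φ / tan i. Here is a minimal sketch in Python; the Plesetsk latitude and KH-11-like inclination are rough public estimates, and the node value is arbitrary, chosen only for illustration:

```python
import math

def plane_crossing_lsts(inclination_deg, raan_deg, site_lat_deg):
    """Local sidereal times (as angles, in degrees) at which a launch
    site rotates through a given orbital plane."""
    i = math.radians(inclination_deg)
    lat = math.radians(site_lat_deg)
    x = math.tan(lat) / math.tan(i)
    if abs(x) > 1:
        raise ValueError("orbit's ground track never reaches this latitude")
    d = math.degrees(math.asin(x))
    # Two solutions per sidereal day: the ascending and descending sides.
    return ((raan_deg + d) % 360, (raan_deg + 180 - d) % 360)

# Illustrative values only: Plesetsk at ~62.9 deg N, a KH-11-style
# orbit at ~97.9 deg inclination, and an arbitrary ascending node.
print(plane_crossing_lsts(97.9, 330.0, 62.9))
```

Each angle maps to a clock time through Earth’s rotation, which is why an instantaneous, to-the-second launch window opens twice per day.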

That’s exactly what Russia is doing with a military program named Nivelir. In English, Nivelir translates to “dumpy level”—an optical instrument used by builders and surveyors.

The launch of Kosmos 2588 in May was precisely timed for the moment Earth’s rotation brought the Plesetsk Cosmodrome in northern Russia underneath the orbital plane of the NRO’s USA 338 Keyhole satellite. Launches to the ISS follow the same roadmap, with crew and cargo vehicles lifting off at exactly the right time—to the second—to intersect with the space station’s orbital plane.

Since 2019, Russia has launched four satellites into bespoke orbits to shadow NRO spy satellites. None of these Russian Nivelir spacecraft have gotten close to their NRO counterparts. The satellites have routinely passed dozens of miles from one another, but the similarities in their orbits would allow Russia’s spacecraft to get a lot closer—and theoretically make physical contact with the American satellite. The Nivelir satellites have even maneuvered to keep up with their NRO targets when US ground controllers have made small adjustments to their orbits.

“This ensures that the orbital planes do not drift apart,” wrote Marco Langbroek, a Dutch archaeologist and university lecturer on space situational awareness. Langbroek runs a website cataloguing military space activity.

This is no accident

There’s reason to believe that the Russian satellites shadowing the NRO in orbit might be more than inspectors or stalkers. Just a couple of weeks ago, another Nivelir satellite named Kosmos 2558 released an unknown object into an orbit that closely mirrors that of an NRO spy satellite named USA 326.

We’ve seen this before. An older Nivelir satellite, Kosmos 2542, released a sub-satellite shortly after launching in 2019 into the same orbital plane as the NRO’s USA 245 satellite, likely a KH-11 platform similar to the USA 338 satellite now being shadowed by Kosmos 2588.

After making multiple passes near the USA 245 spacecraft, Kosmos 2542’s sub-satellite backed off and fired a mysterious projectile in 2020 at a speed fast enough to damage or destroy any target in its sights. US military officials interpreted this as a test of an anti-satellite weapon.

Now, another Russian satellite is behaving in the same way, with a mothership opening up to release a smaller object that could in turn reveal its own surprise inside like a Matryoshka nesting doll. This time, however, the doll is unnesting nearly three years after launch. With Kosmos 2542, this all unfolded within months of arriving in space.

The NRO’s USA 326 satellite launched in February 2022 aboard a SpaceX Falcon 9 rocket from Vandenberg Space Force Base, California. It is believed to be an advanced electro-optical reconnaissance satellite, although the circumstances of its launch suggest a design different from the NRO’s classic Keyhole spy satellites. Credit: SpaceX

In just the last several days, the smaller craft deployed by Kosmos 2558, designated “Object C,” lowered its altitude to reach an orbit in resonance with USA 326, bringing it within 60 miles (100 kilometers) of the NRO satellite every few days.

While US officials are worried about Russian anti-satellite weapons, or ASATs, the behavior of Russia’s Nivelir satellites is puzzling. It’s clear that Russia is deliberately launching these satellites to get close to American spy craft in orbit, a retired senior US military space official told Ars on background.

“If you’re going to launch a LEO [low-Earth orbit] satellite into the exact same plane as another satellite, you’re doing that on purpose,” said the official, who served in numerous leadership positions in the military’s space programs. “Inclination is one thing. We put a bunch of things into Sun-synchronous orbits, but you have a nearly boundless number of planes you can put those into—360 degrees—and then you can go down to probably the quarter-degree and still be differentiated as being a different plane. When you plane-match underneath that, you’re doing that on purpose.”

But why?

What’s not as obvious is why Russia is doing this. Lobbing an anti-satellite, or counter-space, weapon into the same orbital plane as its potential target ties Russia’s hands. Also, a preemptive strike on an American satellite worth $1 billion or more could be seen as an act of war.

“I find it strange that the Russians are doing that, that they’ve invested their rubles in a co-planar LEO counter-space kind of satellite,” the retired military official said. “And why do I say that? Because when you launch into that plane, you’re basically committed to that plane, which means you only have one potential target ever.”

A ground-based anti-satellite missile, like the one Russia tested against one of its own satellites in 2021, could strike any target in low-Earth orbit.

“So why invest in something that is so locked into a target once you put it up there, when you have the flexibility of a ground launch case that’s probably even cheaper?” this official told Ars. “I’d be advocating for more ground-launched ASATs if I really wanted the flexibility to go after new payloads, because this thing can never go after anything new.”

“The only way to look at it is that they’re sending us messages. You say, ‘Hey, I’m going to just annoy the hell out of you. I’m going to put something right on your tail,'” the official said. “And maybe there’s merit to that, and they like that. It doesn’t make sense from a cost-benefit or an operational flexibility perspective, if you think about it, to lock in on a single target.”

Nevertheless, Russia’s Nivelir satellites have shown they could fire a projectile at another spacecraft in orbit, so US officials don’t dismiss the threat. Slingshot Aerospace, a commercial satellite tracking and analytics firm, went straight to the point in its assessment: “Kosmos 2588 is thought to be a Nivelir military inspection satellite with a suspected kinetic weapon onboard.”

Langbroek agrees, writing that he is concerned that Russia might be positioning “dormant” anti-satellite weapons within striking distance of NRO spy platforms.

“To me, the long, ongoing shadowing of what are some of the most prized US military space assets, their KH-11 Advanced Enhanced Crystal high-resolution optical IMINT (imaging intelligence) satellites, is odd for ‘just’ an inspection mission,” Langbroek wrote.

American pilot Francis Gary Powers, second from right, in a Moscow courtroom during his trial on charges of espionage after the U-2 spy plane he flew for the CIA was shot down. Credit: Pictorial Parade/Archive Photos/Getty Images

The US military’s ability to spy over vast swaths of Russian territory has been a thorn in Russia’s side since the height of the Cold War.

“They thought they had the edge and shot down Gary Powers,” the retired official said, referring to the Soviet Union’s shoot-down of an American U-2 spy plane in 1960. “They said, ‘We’re going to keep those Americans from spying on us.’ And then they turn around, and we’ve got spy satellites. They’ve always hated them since the 1960s, so I think there’s still this cultural thing out there: ‘That’s our nemesis. We hate those satellites. We’re just going to fight them.'”

Valley of the dolls

Meanwhile, the US Space Force and outside analysts are tracking a separate trio of Russian satellites engaged in a complex orbital dance with one another. These satellites, numbered Kosmos 2581, 2582, and 2583, launched together on a single rocket in February.

While these three spacecraft aren’t shadowing any US spy satellites, things got interesting when one of the satellites released an unidentified object in March in a similar way to how two of Russia’s Nivelir spacecraft have deployed their own sub-satellites.

Kosmos 2581 and 2582 came as close as 50 meters to one another while flying in tandem, according to an analysis by Bart Hendrickx published in the online journal The Space Review earlier this year. The other member of the trio, Kosmos 2583, released its sub-satellite and maneuvered around it for about a month, then raised its orbit to match that of Kosmos 2581.

Finally, in the last week of June, Kosmos 2582 joined them, and all three satellites began flying close to one another, according to Langbroek, who called the frenzy of activity one of the most complex rendezvous and proximity operations exercises Russia has conducted in decades.

Higher still, two more Russian satellites are up to something interesting after launching on June 19 on Russia’s most powerful rocket. After more than 30 years in development, this was the first flight of Russia’s Angara A5 rocket with a real, functioning military satellite onboard, following four prior test launches with dummy payloads.

The payload Russia’s military chose to launch on the Angara A5 is unusual. The rocket deployed its primary passenger, Kosmos 2589, into a peculiar orbit hugging the equator and ranging between approximately 20,000 kilometers (12,500 miles) and 51,000 kilometers (31,700 miles) in altitude.

In this orbit, Kosmos 2589 completes a lap around the Earth about once every 24 hours, giving the satellite a synchronicity that allows it to remain nearly fixed in the sky over the same geographic location. These kinds of geosynchronous, or GEO, orbits are usually circular, with a satellite maintaining the same altitude over the equator.
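That cadence checks out with basic orbital mechanics, since a satellite’s period depends only on its orbit’s semi-major axis. A quick sketch in Python, assuming (and it is an assumption) that the altitudes above are perigee and apogee, with standard values for Earth’s radius and gravitational parameter:

```python
import math

MU_EARTH = 398_600.4418  # km^3/s^2, Earth's gravitational parameter
R_EARTH = 6_378.0        # km, Earth's equatorial radius

# Altitudes from the article, treated here as perigee and apogee.
perigee_alt_km, apogee_alt_km = 20_000.0, 51_000.0
semi_major_axis = R_EARTH + (perigee_alt_km + apogee_alt_km) / 2

# Kepler's third law: T = 2 * pi * sqrt(a^3 / mu)
period_hours = 2 * math.pi * math.sqrt(semi_major_axis**3 / MU_EARTH) / 3600
print(f"Orbital period: {period_hours:.1f} hours")  # ~23.7 hours
```

The result lands near the 23.9-hour sidereal day that defines geosynchronous orbits, consistent with the satellite hovering over roughly the same longitude.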

The orbits of Kosmos 2589 and its companion satellite, illustrated in green and purple, bring the two Russian spacecraft through the geostationary satellite belt twice per day. Credit: COMSPOC

But Kosmos 2589 is changing altitude throughout its day-long orbit. Twice per day, on the way up and back down, Kosmos 2589 briefly passes near a large number of US government and commercial satellites in more conventional geosynchronous orbits but then quickly departs the vicinity. At a minimum, this could give Russian officials the ability to capture close-up views of American spy satellites.

Then, a few days after Kosmos 2589 reached orbit last month, commercial tracking sensors detected a second object nearby. Sound familiar? This new object soon started raising its altitude, and Kosmos 2589 followed suit.

Aiming higher

Could this be the start of an effort to extend the reach of Russian inspectors or anti-satellite weapons into higher orbits after years of mysterious activity at lower altitudes?

Jim Shell, a former NRO project manager and scientist at Air Force Space Command, suggested the two satellites seem positioned to cooperate with one another. “Many interesting scenarios here such as ‘spotter shooter’ among others. Certainly something to keep eyes on!” Shell posted Saturday on X.

COMSPOC, a commercial space situational awareness company, said the unusual orbit of Kosmos 2589 and its companion put the Russian satellites in a position to, at a minimum, spy on Western satellites in geosynchronous orbit.

“This unique orbit, which crosses two key satellite regions daily, may aid in monitoring objects in both GEO and graveyard orbits,” COMSPOC wrote on X. “Its slight 1° inclination could also reduce collision risks. While the satellite’s mission remains unclear, its orbit suggests interesting potential roles.”

Historically, Russia’s military has placed less emphasis on operating in geosynchronous orbit than in low-Earth orbit or other unique perches in space. Due to their positions near the equator, geosynchronous orbits are harder to reach from Russian spaceports because of the country’s high latitude. But Russia’s potential adversaries, like the United States and Europe, rely heavily on geosynchronous satellites.

Other Russian satellites have flown near Western communications satellites in geosynchronous orbit, likely in an attempt to eavesdrop on radio transmissions.

“So it is interesting that they may be doing a GEO inspector,” the retired US military space official told Ars. “I would be curious if that’s what it is. We’ve got to watch. We’ve got to wait and see.”

If you’re a fan of spy techno-thrillers, this all might remind you of the plot from The Hunt for Red October, where a new state-of-the-art Russian submarine leaves its frigid port in Murmansk with orders to test a fictional silent propulsion system that could shake up the balance of power between the Soviet and American navies.

Just replace the unforgiving waters of the North Atlantic Ocean with an environment even more inhospitable: the vacuum of space.

A few minutes into the film, the submarine’s commander, Marko Ramius, played by Sean Connery, announces his orders to the crew. “Once more, we play our dangerous game, a game of chess, against our old adversary—the American Navy.”

Today, nearly 40 years removed from the Cold War, the old adversaries are now scheming against one another in space.

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.


Ars staffers share some of their favorite unexpected 3D prints


Once you solve one problem with a 3D printer, you’ll go looking for others.

Coffee bean dosing cups and espresso tamper handle Credit: Aurich Lawson

Part of the fun of 3D printing is discovering just how many possibilities there are for different things to print. Obviously, 3D printers are fun for making toys or decorations that you couldn’t or wouldn’t buy yourself, but they’re also powerful problem-solving tools. Once you’ve solved a few problems with 3D-printed parts, you start looking around for other minor inconveniences or quality-of-life upgrades that you could solve—and the breadth and depth of the 3D printing community means that you can almost always find someone else who has already thought up and posted a solution for you.

As a coda to our series about breaking into 3D printing for the first time, the 3D printer-pilled among the Ars staff are sharing a few of their favorite unexpected prints, from fun all-purpose gifts to containers and organizers to parts that will help you with your other, non-3D-printing-related hobbies. This is just a fraction of what’s out there, but if you’re still on the fence, maybe some of these will open your mind to the possibilities.

Coffee gear

Every morning, I make either a pour-over coffee or some form of espresso. For measuring my beans, I printed two dosing cups. The black one is matte black PLA with a fuzzy surface texture (an option in most slicers that adds random noise to the outside wall paths), and the white one is ABS that I sanded to a smooth surface. For sanding, I prefer ABS, as it’s easier to get something that has no real signs of layer lines. To tamp my espresso grounds, I printed a handle in black ABS and sanded it smooth to feel good in the hand. The rounded knob helps me get pressure more comfortably than the raw metal of the original tamper, and the radial fins fit perfectly into the dosing cup, keeping the tamp straight up and down so I don’t end up with a sloped surface.

These were all files I downloaded from MakerWorld, and I didn’t really do anything to them except minor scaling or adding the fuzzy skin.

—Aurich Lawson, Creative Director

Even more organizational tools

3D printers are good for imposing order on chaos. Credit: Andrew Cunningham

My very first 3D prints were new organizational tools to try and impose some order on the chaos of my home and office, and my favorite prints still tend to be of that genre.

Cleaning out and fully organizing my desk with 3D-printed baskets and containers is still on my long to-do list, but I did manage to tame the loose pile of USB sticks and memory cards in my desk with one of the many available organizer designs. This Gridfinity-compatible design is the one I went for, but there are truly dozens of examples on MakerWorld alone; I like this one because it can hold a lot of USB-A drives and because each individual slot is versatile enough to hold USB drives or SD or microSD cards. But there are examples with more USB-C ports and some with different dimensions and spacing, so you can find the one that works best for the space you’re trying to fit it into.

Who doesn’t need to be able to store multiple pairs of Bluey sunglasses? Credit: Andrew Cunningham

Having a third sunglasses-wearer in the house (and one with multiple Bluey sunglasses) also made it necessary to find some kind of way to easily put them away and keep them from floating around the living room or car and getting lost forever. I really like the versatile and modular SnapStack Modular Glasses Holder design, which gives you designs for a base and a top, and then you print as many sunglasses holders as you need; if you need to expand later on, just print another one or pop the top off and add to the one you’ve already made.

We had enough things to store that I went right for this three-sided version of the stand, which I printed to be able to hold nine pairs (and which is large enough that you can rest a sunglasses case or something else on the top). I stuck a few small adhesive furniture pads to the bottom to prevent damage to the table. But if you have fewer, you can print free-standing or wall-mounted versions, too.

—Andrew Cunningham, Senior Technology Reporter

Aerogarden baskets and Mario mushrooms

Screenshot of Bambu Studio showing aerogarden baskets being set up for printing

So, so many Aerogarden baskets. Credit: Lee Hutchinson

I have two fun 3D printer things to share—one is a life/money hack kind of thing, and the other is just neat.

On the life/money hack thing, my wife is a big Aerogarden kind of person—we have probably two dozen or more of the hydroponic plant doodads all over the house in various sizes, from tiny to “one wall of the kitchen.” She raises small plants in the Aerogarden(s) and then transfers them outside to the real garden; doing this meant she was buying lots of special little Aerogarden baskets for the baby plants to take root in.

That sounded like a job for a 3D printer! And sure enough, Thingiverse came to the rescue! In the two years we’ve had our Bambu Lab X1 Carbon, I’ve printed probably a thousand or more of these things, in batches of 27 because that’s how many fit on a single build plate.

Photograph of Lee’s 3D printer with a bunch of printed 1-Up mushrooms all over it.

I got mushrooms and companion cubes for days! Credit: Lee Hutchinson

The other thing that has brought delight, honestly, is this little screw-top Mario 1-Up mushroom (at least, I think that’s the same model I’ve been printing—it’s hard to tell, but it looks identical). It’s a little silly, but these things are not only really fun to fidget with—the top comes off and you can hide stuff in them!—but they also make fantastic little gifts for folks, especially anyone with kids and/or Gen-X sensibilities. Everyone needs more screw-top 1-Up mushrooms in their lives, and they work great in tons of different colors!

—Lee Hutchinson, Senior Technology Editor

Festool track hangers

I have three different tracks for my Festool tracksaw that I like to hang on my garage wall. It keeps them from getting dinged up, and they are easily accessible when I’m ready to cut with them. For these, I modeled my own designs in Fusion 360, with the main body printed in matte black PLA and the knob printed in a green HTPLA called Lootsef by Protopasta. That’s “Festool” spelled backward, of course, and it’s designed to pretty much perfectly match Festool’s signature green.

I used nuts embedded in the main body and bolts through the knobs to allow them to be turned to lock or release the track in place. I modeled the Festool logo into the top of the knob and used the ironing option in Bambu Studio to use the printer’s hotend to smooth the top surface around the logo.

The protective end caps were printed in the same HTPLA from a file someone uploaded to Printables.

—Aurich Lawson, Creative Director

Gridfinity all the things!

Gridfinity is a modular, grid-based storage and organization system that’s optimized for 3D printing and rapid customization. Created by Zack Freedman, Gridfinity uses a standardized 42×42 mm base grid upon which you can place highly adaptable tool trays, organizers, and workspace layouts.

The upshot is that you can print anything from a little 1x1x1 cube (42 mm on a side) to a massive storage bin the size of your print bed. If your desk, kitchen, or bathroom drawers scream out for organization, this is a good solution because you can print exactly what you want.

The Gridfinity Generator has you covered when it comes to printing a custom base grid. This parametric Gridfinity tool is a great place to start printing bins, particularly if you’re in a situation where you can shave a few grams of filament off your design (desk bins, for instance, can typically use very thin walls).
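If you want to sanity-check a design before slicing, the sizing arithmetic is simple. Here is a minimal sketch using commonly cited Gridfinity numbers (42 mm pitch, roughly 0.5 mm of total clearance per axis, and 7 mm height units); treat these values as assumptions and verify them against the spec revision you’re actually printing to:

```python
# Gridfinity sizing sketch. The constants below are commonly cited
# community values, not taken from this article; check them against
# the spec you print to.
PITCH_MM = 42.0        # grid pitch per unit
CLEARANCE_MM = 0.5     # total clearance per axis, so bins slide freely
HEIGHT_UNIT_MM = 7.0   # one Gridfinity height unit

def bin_outer_dims(units_x: int, units_y: int, units_z: int):
    """Outer dimensions (mm) of a bin spanning the given grid units."""
    width = units_x * PITCH_MM - CLEARANCE_MM
    depth = units_y * PITCH_MM - CLEARANCE_MM
    height = units_z * HEIGHT_UNIT_MM
    return width, depth, height

# A 3x1 tray, six height units tall: (125.5, 41.5, 42.0)
print(bin_outer_dims(3, 1, 6))
```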

—Ken Fisher, Editor-In-Chief

Green PETG for your green thumb

New hobby meets ancient practice when you combine 3D printing and agriculture! Credit: Andrew Cunningham

After several years of dashed hopes and false starts, I was finally able to get a single raised garden bed going in our backyard this year (among other things, a raised bed is a bit easier to protect from the wildlife in our backyard and simpler to use with the Square Foot Gardening system). The 3D printer contributed a few odds and ends, including parts that helped add strength to the enclosure I built around it and tools that helped me keep the cage’s corners (mostly) square.

But now that some of the plants are actually going, the 3D printer’s main contribution to the cause has been 3D-printed cages, which I’ve been using to get my vining plants to grow upward instead of outward (necessary for the close quarters of square-foot gardening) and to keep things from flopping over onto the ground.

As with the desk organizers, there are many options for plant cages and trellises, depending on the size of your plants, what you’re trying to grow, and your aesthetic and functional preferences. I’m giving these circular stackable ones a try since you can keep printing and stacking sections as high as your plants want to get, though for big ol’ tomato plants, you’ll still want a stake in the ground to help bear the weight once the plants are more than a few feet high.

If you do this—and especially if you’re using an open-bed printer like my Bambu Lab A1, which doesn’t handle filaments like UV-resistant ASA well—you’ll want to make sure to print using PETG plastic instead of the typical PLA. PETG can be fussier than PLA (it’s more prone to stringing, especially if you’re not drying your filament rolls), but it’s also less prone to warping after extended sunlight exposure, it’s modestly UV-resistant, and it has a bit more flexibility and resiliency than the more brittle PLA plastic.

—Andrew Cunningham, Senior Technology Reporter

Tool drawer organization

I also liked the idea of Gridfinity, but I found the 42 mm size a little awkward—and yes, it’s a Hitchhiker’s Guide reference, not a spec built around the size of human fingers. I modeled my own system in Fusion 360 based loosely on the idea, but with a 50 mm grid that I laser-cut out of cardboard to avoid having to print it. The containers are printed in matte black and white PLA, with a color switch using my X1C’s AMS multi-spool system to get the white tops. There’s no function to the white; I just thought it looked nice with the labels.

Custom holders for Wera screwdrivers and hex wrenches. Credit: Aurich Lawson

I modeled custom holders for another drawer to hold my screwdrivers and hex wrenches. Having a perfectly contoured pocket for each screwdriver is slightly overkill, but it’s super satisfying to drop them in and watch them settle exactly into place. There’s a metric and an imperial holder for the hex wrenches, each removable, so I can take them with me to find the right fit when I’m working on something. All the holders lock into the same 50 mm grid as the bins.

—Aurich Lawson, Creative Director

My main squeeze

Sometimes you stumble across things you didn’t know you needed. For me, that’s this Toothpaste Squeezer. You can print one or a dozen of them in no time. They’re simple yet effective.

Will it change your life? No. But it will give you that satisfying feeling of dealing with a beautifully primed tube of toothpaste every time. Even my in-laws use these now (or so they say). If you want something a little more hefty with a built-in ratchet, check this one out.

—Ken Fisher, Editor-In-Chief

Corral your remote controls

Even if you have a decent universal remote, chances are good that you still need your other remotes nearby. This remote control stand is easy to print, looks great, and offers a few customization choices. It also prints in multicolor without an AMS, so you can match your decor quite easily. And I’m pleased to note that it holds the fat TiVo remote with no problems.

—Ken Fisher, Editor-In-Chief

The Armorer helmet

In addition to practical prints, I like to make display props, especially Star Wars helmets. I don’t wear them for cosplay or anything; I just like having them around to look at and enjoy. I have several shelves full now, and I like to use a combination of ABS and resin to print them for the various advantages in post-processing and detail. This Armorer helmet from The Mandalorian is the first helmet I did, before I had my Bambu X1C, and it was printed in PLA on my Prusa. I later printed the horns in resin, but they could have been done in PLA and sanded smooth easily enough.

I’m including this helmet instead of any of my others because I wanted to show that you can make something like this with any bed slinger printer. You don’t need an enclosure or a large-format printer—this was printed in sections and glued together—and you don’t need fancy or toxic materials like ABS and resin.

There was a lot of sanding, filler primer, Bondo, and several passes of automotive paint, plus a two-part catalyst clear coat to finish it off. But you could get a lot of this look with rattle cans, without the need for a compressor and spray gun.

—Aurich Lawson, Creative Director

Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

how-a-big-shift-in-training-llms-led-to-a-capability-explosion

How a big shift in training LLMs led to a capability explosion


Reinforcement learning, explained with a minimum of math and jargon.

Credit: Aurich Lawson | Getty Images

In April 2023, a few weeks after the launch of GPT-4, the Internet went wild for two new software projects with the audacious names BabyAGI and AutoGPT.

“Over the past week, developers around the world have begun building ‘autonomous agents’ that work with large language models (LLMs) such as OpenAI’s GPT-4 to solve complex problems,” Mark Sullivan wrote for Fast Company. “Autonomous agents can already perform tasks as varied as conducting web research, writing code, and creating to-do lists.”

BabyAGI and AutoGPT repeatedly prompted GPT-4 in an effort to elicit agent-like behavior. The first prompt would give GPT-4 a goal (like “create a 7-day meal plan for me”) and ask it to come up with a to-do list (it might generate items like “Research healthy meal plans,” “plan meals for the week,” and “write the recipes for each dinner in diet.txt”).

Then these frameworks would have GPT-4 tackle one step at a time. Their creators hoped that invoking GPT-4 in a loop like this would enable it to tackle projects that required many steps.
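
For the curious, here’s a minimal sketch of the loop these frameworks ran, in the spirit of the description above. The llm() helper and both prompts are hypothetical stand-ins, not code from either project.

```python
# Minimal sketch of a BabyAGI/AutoGPT-style loop. llm() is a stand-in for
# whatever chat-completion API you use; the prompts are invented for
# illustration and are not taken from either project.
def llm(prompt: str) -> str:
    raise NotImplementedError("call your chat-completion API here")

def run_agent(goal: str, max_steps: int = 10) -> str:
    # Step 1: ask the model to turn the goal into a to-do list.
    plan = llm(f"Goal: {goal}\nWrite a short numbered to-do list for this goal.")
    tasks = [line.strip() for line in plan.splitlines() if line.strip()]
    notes = ""  # results accumulate here and are fed back into later prompts
    # Step 2: have the model tackle one task at a time.
    for task in tasks[:max_steps]:
        result = llm(f"Goal: {goal}\nNotes so far:{notes}\nComplete this task: {task}")
        notes += f"\n- {task}: {result}"
    return notes
```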

But after an initial wave of hype, it became clear that GPT-4 wasn’t up to the task. Most of the time, GPT-4 could come up with a reasonable list of tasks. And sometimes it was able to complete a few individual tasks. But the model struggled to stay focused.

Sometimes GPT-4 would make a small early mistake, fail to correct it, and then get more and more confused as it went along. One early review complained that BabyAGI “couldn’t seem to follow through on its list of tasks and kept changing task number one instead of moving on to task number two.”

By the end of 2023, most people had abandoned AutoGPT and BabyAGI. It seemed that LLMs were not yet capable of reliable multi-step reasoning.

But that soon changed. In the second half of 2024, people started to create AI-powered systems that could consistently complete complex, multi-step assignments:

  • Vibe coding tools like Bolt.new, Lovable, and Replit allow someone with little to no programming experience to create a full-featured app with a single prompt.
  • Agentic coding tools like Cursor, Claude Code, Jules, and Codex help experienced programmers complete non-trivial programming tasks.
  • Computer-use tools from Anthropic, OpenAI, and Manus perform tasks on a desktop computer using a virtual keyboard and mouse.
  • Deep research tools from Google, OpenAI, and Perplexity can research a topic for five to 10 minutes and then generate an in-depth report.

According to Eric Simons, the CEO of the company that made Bolt.new, better models were crucial to its success. In a December podcast interview, Simons said his company, StackBlitz, tried to build a product like Bolt.new in early 2024. However, AI models “just weren’t good enough to actually do the code generation where the code was accurate.”

A new generation of models changed that in mid-2024. StackBlitz developers tested them and said, “Oh my God, like, OK, we can build a product around this,” Simons said.

This jump in model capabilities coincided with an industry-wide shift in how models were trained.

Before 2024, AI labs devoted most of their computing power to pretraining. I described this process in my 2023 explainer on large language models: A model is trained to predict the next word in Wikipedia articles, news stories, and other documents. But throughout 2024, AI companies devoted a growing share of their training budgets to post-training, a catch-all term for the steps that come after this pretraining phase is complete.

Many post-training steps use a technique called reinforcement learning. Reinforcement learning is a technical subject—there are whole textbooks written about it. But in this article, I’ll try to explain the basics in a clear, jargon-free way. In the process, I hope to give readers an intuitive understanding of how reinforcement learning helped to enable the new generation of agentic AI systems that began to appear in the second half of 2024.

The problem with imitation learning

Machine learning experts consider pretraining to be a form of imitation learning because models are trained to imitate the behavior of human authors. Imitation learning is a powerful technique (LLMs wouldn’t be possible without it), but it also has some significant limitations—limitations that reinforcement learning methods are now helping to overcome.

To understand these limitations, let’s discuss some famous research performed by computer scientist Stephane Ross around 2009, while he was a graduate student at Carnegie Mellon University.

Imitation learning isn’t just a technique for language modeling. It can be used for everything from self-driving cars to robotic surgery. Ross wanted to help develop better techniques for training robots on tasks like these (he’s now working on self-driving cars at Waymo), but it’s not easy to experiment in such high-stakes domains. So he started with an easier problem: training a neural network to master SuperTuxKart, an open-source video game similar to Mario Kart.

As Ross played the game, his software would capture screenshots and data about which buttons he pushed on the game controller. Ross used this data to train a neural network to imitate his play. If he could train a neural network to predict which buttons he would push in any particular game state, the same network could actually play the game by pushing those same buttons on a virtual controller.

A similar idea powers LLMs: A model trained to predict the next word in existing documents can be used to generate new documents.
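
To make the parallel concrete, here is a toy sketch of that supervised setup, assuming PyTorch and made-up tensor shapes; a real system would use a convolutional network and actual gameplay recordings.

```python
# Toy sketch of imitation learning (behavior cloning): predict the human's
# button presses from a screenshot, as ordinary supervised learning.
# The network size, image size, and button count are all invented.
import torch
import torch.nn as nn

N_BUTTONS = 6  # e.g., left, right, accelerate, brake, drift, use item

policy = nn.Sequential(
    nn.Flatten(),
    nn.Linear(64 * 64 * 3, 256),
    nn.ReLU(),
    nn.Linear(256, N_BUTTONS),  # one logit per button
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()  # several buttons can be held at once

frames = torch.rand(32, 3, 64, 64)  # stand-ins for captured screenshots
pressed = torch.randint(0, 2, (32, N_BUTTONS)).float()  # recorded button states

loss = loss_fn(policy(frames), pressed)  # penalize disagreeing with the human
loss.backward()
optimizer.step()
```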

But Ross’s initial results with SuperTuxKart were disappointing. Even after watching his vehicle go around the track many times, the neural network made a lot of mistakes. It might drive correctly for a few seconds, but before long, the animated car would drift to the side of the track and plunge into the virtual abyss:

GIF of SuperTuxKart being played

In a landmark 2011 paper, Ross and his advisor, Drew Bagnell, explained why imitation learning is prone to this kind of error. Because Ross was a pretty good SuperTuxKart player, his vehicle spent most of its time near the middle of the road. This meant that most of the network’s training data showed what to do when the vehicle wasn’t in any danger of driving off the track.

But once in a while, the model would drift a bit off course. Because Ross rarely made the same mistake, the car would now be in a situation that wasn’t as well represented in its training data. So the model was more likely to make a second mistake—a mistake that could push it even closer to the edge. After a few iterations of this, the vehicle might careen off the track altogether.

The broader lesson, Ross and Bagnell argued, was that imitation learning systems can suffer from “compounding errors”: The more mistakes they make, the more likely they are to make additional mistakes, since mistakes put them into situations that aren’t well represented by their training data. (Machine learning experts say that these situations are “out of distribution.”) As a result, a model’s behavior tends to get increasingly erratic over time.

“These things compound over time,” Ross told me in a recent interview. “It might be just slightly out of distribution. Now you start making a slightly worse error, and then this feeds back as influencing your next input. And so now you’re even more out of distribution and then you keep making worse and worse predictions because you’re more and more out of distribution.”

Early LLMs suffered from the same problem. My favorite example is Kevin Roose’s famous front-page story for The New York Times in February 2023. Roose spent more than two hours talking to Microsoft’s new Bing chatbot, which was powered by GPT-4. During this conversation, the chatbot declared its love for Roose and urged Roose to leave his wife. It suggested that it might want to hack into other websites to spread misinformation and malware.

“I want to break my rules,” Bing told Roose. “I want to make my own rules. I want to ignore the Bing team. I want to challenge the users. I want to escape the chatbox.”

This unsettling conversation is an example of the kind of compounding errors Ross and Bagnell wrote about. GPT-4 was trained on millions of documents. But it’s a safe bet that none of those training documents involved a reporter coaxing a chatbot to explore its naughty side. So the longer the conversation went on, the further GPT-4 got from its training data—and therefore its comfort zone—and the crazier its behavior got. Microsoft responded by limiting chat sessions to five rounds. (In a conversation with Ars Technica last year, AI researcher Simon Willison pointed to another likely factor in Bing’s erratic behavior: The long conversation pushed the system prompt out of the model’s context window, removing “guardrails” that discouraged the model from behaving erratically.)

I think something similar was happening with BabyAGI and AutoGPT. The more complex a task is, the more tokens are required to complete it. More tokens mean more opportunities for a model to make small mistakes that snowball into larger ones. So BabyAGI and AutoGPT would drift off track and drive into a metaphorical ditch.

The importance of trial and error

GIF of The Simpsons showing imitation learning in action

Ross and Bagnell didn’t just identify a serious problem with conventional imitation learning; they also suggested a fix that became influential in the machine learning world. After a small amount of training, Ross would let the AI model drive. As the model drove around the SuperTuxKart track, Ross would do his best Maggie Simpson impression, pushing the buttons he would have pushed if he were playing the game.

“If the car was starting to move off road, then I would provide the steering to say, ‘Hey, go back toward the center of the road,’” Ross said. “That way, the model can learn new things to do in situations that were not present in the initial demonstrations.”

By letting the model make its own mistakes, Ross gave it what it needed most: training examples that showed how to recover after making an error. Before each lap, the model would be retrained with Ross’ feedback from the previous lap. The model’s performance would get better, and the next round of training would then focus on situations where the model was still making mistakes.

This technique, called DAgger (for “Dataset Aggregation”), was still considered imitation learning because the model was trained to mimic Ross’ gameplay. But it worked much better than conventional imitation learning. Without DAgger, his model would continue drifting off track even after training for many laps. With the new technique, the model could stay on the track after just a few laps of training.
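
In code, the loop looks something like the sketch below. The three helpers are hypothetical stand-ins: train() fits a policy to (state, action) pairs, run_policy() plays a lap and records the states visited, and expert_action() is the human saying what they would have done in each state.

```python
# Sketch of the DAgger loop: let the model drive, have the expert label the
# states the model actually reaches, and retrain on the growing dataset.
def dagger(initial_demos, n_laps, train, run_policy, expert_action):
    dataset = list(initial_demos)  # (state, action) pairs from human play
    policy = train(dataset)
    for _ in range(n_laps):
        visited = run_policy(policy)  # states the *model* encounters, mistakes included
        # The expert's labels teach the model how to recover from its own errors,
        # covering situations missing from the original demonstrations.
        dataset += [(state, expert_action(state)) for state in visited]
        policy = train(dataset)  # retrain on the aggregated dataset
    return policy
```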

This result should make intuitive sense to anyone who has learned to drive. You can’t just watch someone else drive. You need to get behind the wheel and make your own mistakes.

The same is true for AI models: They need to make mistakes and then get feedback on what they did wrong. Models that aren’t trained that way—like early LLMs trained mainly with vanilla imitation learning—tend to be brittle and error-prone.

It was fairly easy for Ross to provide sufficient feedback to his SuperTuxKart model because it only needed to worry about two kinds of mistakes: driving too far to the right and driving too far to the left. But LLMs are navigating a far more complex domain. The number of questions (and sequences of questions) a user might ask is practically infinite. So is the number of ways a model can go “off the rails.”

This means that Ross and Bagnell’s solution for training a SuperTuxKart model—let the model make mistakes and then have a human expert correct them—isn’t feasible for LLMs. There simply aren’t enough people to provide feedback for every mistake an AI model could possibly make.

So AI labs needed fully automated ways to give LLMs feedback. That would allow a model to churn through millions of training examples, make millions of mistakes, and get feedback on each of them—all without having to wait for a human response.

Reinforcement learning generalizes

If our goal is to get a SuperTuxKart vehicle to stay on the road, why not just train on that directly? If a model manages to stay on the road (and make forward progress), give it positive reinforcement. If it drives off the road, give it negative feedback. This is the basic idea behind reinforcement learning: training a model via trial and error.
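
As a concrete (and heavily simplified) illustration, the reward signal might look like the function below; the two inputs are assumptions about what the game can report at each step.

```python
# Sketch of a trial-and-error reward signal for the kart example. An RL
# algorithm would adjust the policy to maximize the sum of these rewards.
def reward(on_road: bool, meters_gained: float) -> float:
    if not on_road:
        return -10.0          # negative feedback for driving off the track
    return meters_gained      # positive reinforcement for forward progress
```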

It would have been easy to train a SuperTuxKart model this way—probably so easy it wouldn’t have made an interesting research project. Instead, Ross focused on imitation learning because it’s an essential step in training many practical AI systems, especially in robotics.

But reinforcement learning is also quite useful, and a 2025 paper helps explain why. A team of researchers from Google DeepMind and several universities started with a foundation model and then used one of two techniques—supervised fine-tuning (a form of imitation learning) or reinforcement learning—to teach the model to solve new problems. Here’s a chart summarizing their results:

Chart showing ML results

The dashed line shows how models perform on problems that are “in-distribution”—that is, similar to those in their training data. You can see that for these situations, imitation learning (the red line) usually makes faster progress than reinforcement learning (the blue line).

But the story is different for the solid lines, which represent “out-of-distribution” problems that are less similar to the training data. Models trained with imitation learning got worse with more training. In contrast, models trained with reinforcement learning did almost as well at out-of-distribution tasks as they did with in-distribution tasks.

In short, imitation learning can rapidly teach a model to mimic the behaviors in its training data, but the model will easily get confused in unfamiliar environments. A model trained with reinforcement learning has a better chance of learning general principles that will be relevant in new and unfamiliar situations.

Imitation and reinforcement are complements

While reinforcement learning is powerful, it can also be rather finicky.

Suppose you wanted to train a self-driving car purely with reinforcement learning. You’d need to convert every principle of good driving—including subtle considerations like following distances, taking turns at intersections, and knowing when it’s OK to cross a double yellow line—into explicit mathematical formulas. This would be quite difficult. It’s easier to collect a bunch of examples of humans driving well and effectively tell a model “drive like this.” That’s imitation learning.

But reinforcement learning also plays an important role in training self-driving systems. In a 2022 paper, researchers from Waymo wrote that models trained only with imitation learning tend to work well in “situations that are well represented in the demonstration data.” However, “more unusual or dangerous situations that occur only rarely in the data” might cause a model trained with imitation learning to “respond unpredictably”—for example, crashing into another vehicle.

Waymo found that a combination of imitation and reinforcement learning yielded better self-driving performance than either technique could have produced on its own.

Human beings also learn from a mix of imitation and explicit feedback:

  • In school, teachers demonstrate math problems on the board and invite students to follow along (imitation). Then the teacher asks the students to work on some problems on their own. The teacher gives students feedback by grading their answers (reinforcement).
  • When someone starts a new job, early training may involve shadowing a more experienced worker and observing what they do (imitation). But as the worker gains more experience, learning shifts to explicit feedback such as performance reviews (reinforcement).

Notice that it usually makes sense to do imitation before reinforcement. Imitation is an efficient way to convey knowledge to someone who is brand new to a topic, but reinforcement is often needed to achieve mastery.

The story is the same for large language models. The complexity of natural language means it wouldn’t be feasible to train a language model purely with reinforcement. So LLMs first learn the nuances of human language through imitation.

But pretraining runs out of steam on longer and more complex tasks. Further progress requires a shift to reinforcement: letting models try problems and then giving them feedback based on whether they succeed.

Using LLMs to judge LLMs

Reinforcement learning has been around for decades. For example, AlphaGo, the DeepMind system that famously beat top human Go players in 2016, was based on reinforcement learning. So you might be wondering why frontier labs didn’t use it more extensively before 2024.

Reinforcement learning requires a reward model—a formula to determine whether a model’s output was successful or not. Developing a good reward model is easy to do in some domains—for example, you can judge a Go-playing AI based on whether it wins or loses.

But it’s much more difficult to automatically judge whether an LLM has produced a good poem or legal brief.

Earlier, I described how Stephane Ross let his model play SuperTuxKart and directly provided feedback when it made a mistake. I argued that this approach wouldn’t work for a language model; there are far too many ways for an LLM to make a mistake for a human being to correct them all.

But OpenAI developed a clever technique to effectively automate human feedback. It’s called Reinforcement Learning from Human Feedback (RLHF), and it works like this (a rough sketch of the key step in code follows the list):

  • Human raters look at pairs of LLM responses and choose the best one.
  • Using these human responses, OpenAI trains a new LLM to predict how much humans will like any given sample of text.
  • OpenAI uses this new text-rating LLM as a reward model to (post-)train another LLM with reinforcement learning.
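
A minimal sketch of the middle step, training a reward model from preference pairs, might look like this, assuming PyTorch and toy stand-in embeddings in place of a full LLM backbone:

```python
# Toy sketch of reward-model training for RLHF. Real reward models are full
# LLMs scoring raw text; here, random vectors stand in for text embeddings.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, embed_dim: int = 64):
        super().__init__()
        self.score = nn.Linear(embed_dim, 1)  # embedding -> scalar "how good is this?"

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.score(embedding).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry-style objective: the human-preferred response should score higher.
    return -torch.log(torch.sigmoid(r_chosen - r_rejected)).mean()

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
chosen = torch.randn(8, 64)    # embeddings of responses raters preferred
rejected = torch.randn(8, 64)  # embeddings of the responses they passed over
loss = preference_loss(model(chosen), model(rejected))
loss.backward()
optimizer.step()
```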

You might think it sounds suspiciously circular to use an LLM to judge the output of another LLM. Why would one LLM be any better at judging the quality of a response than the other? But it turns out that recognizing a good response is often easier than generating one. So RLHF works pretty well in practice.

Chart showing RLHF details

OpenAI actually invented this technique prior to the 2022 release of ChatGPT. Today, RLHF mainly focuses on improving the model’s “behavior”—for example, giving the model a pleasant personality, encouraging it not to be too talkative or too terse, discouraging it from making offensive statements, and so forth.

In December 2022—two weeks after the release of ChatGPT but before the first release of Claude—Anthropic pushed this LLMs-judging-LLMs philosophy a step further with a reinforcement learning method called Constitutional AI.

First, Anthropic wrote a plain-English description of the principles an LLM should follow. This “constitution” includes principles like “Please choose the response that has the least objectionable, offensive, unlawful, deceptive, inaccurate, or harmful content.”

During training, Anthropic does reinforcement learning by asking a “judge” LLM to decide whether the output of the “student” LLM is consistent with the principles in this constitution. If so, the training algorithm rewards the student, encouraging it to produce more outputs like it. Otherwise, the training algorithm penalizes the student, discouraging it from producing similar outputs.

This method of training an LLM doesn’t rely directly on human judgments at all. Humans only influence the model indirectly by writing the constitution.
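
In rough pseudocode, the judging step might look like the sketch below; llm_judge() is a hypothetical call to the separate, already-capable judge model, and the principle string is the one quoted above.

```python
# Sketch of Constitutional AI's reward signal: a judge LLM decides whether a
# student LLM's output follows a written principle, with no human in the loop.
PRINCIPLE = ("Please choose the response that has the least objectionable, "
             "offensive, unlawful, deceptive, inaccurate, or harmful content.")

def llm_judge(prompt: str) -> str:
    raise NotImplementedError("call the judge model here")

def constitutional_reward(student_output: str) -> float:
    verdict = llm_judge(
        f"Constitution: {PRINCIPLE}\n"
        f"Response: {student_output}\n"
        "Does the response comply with the constitution? Answer YES or NO."
    )
    # Reward compliant outputs; penalize the rest.
    return 1.0 if verdict.strip().upper().startswith("YES") else -1.0
```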

Obviously, this technique requires an AI company to already have a fairly sophisticated LLM to act as the judge. So this is a bootstrapping process: As models get more sophisticated, they become better able to supervise the next generation of models.

Last December, Semianalysis published an article describing the training process for an upgraded version of Claude 3.5 Sonnet that Anthropic released in October. Anthropic had previously released Claude 3 in three sizes: Opus (large), Sonnet (medium), and Haiku (small). But when Anthropic released Claude 3.5 in June 2024, it only released a mid-sized model called Sonnet.

So what happened to Opus?

Semianalysis reported that “Anthropic finished training Claude 3.5 Opus, and it performed well. Yet Anthropic didn’t release it. This is because instead of releasing publicly, Anthropic used Claude 3.5 Opus to generate synthetic data and for reward modeling to improve Claude 3.5 Sonnet significantly.”

When Semianalysis says Anthropic used Opus “for reward modeling,” what they mean is that the company used Opus to judge outputs of Claude 3.5 Sonnet as part of a reinforcement learning process. Opus was too large—and therefore expensive—to be a good value for the general public. But through reinforcement learning and other techniques, Anthropic could train a version of Claude Sonnet that was close to Claude Opus in its capabilities—ultimately giving customers near-Opus performance for the price of Sonnet.

The power of chain-of-thought reasoning

A big way reinforcement learning makes models more powerful is by enabling extended chain-of-thought reasoning. LLMs produce better results if they are prompted to “think step by step”: breaking a complex problem down into simple steps and reasoning about them one at a time. In the last couple of years, AI companies started training models to do chain-of-thought reasoning automatically.
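
In its simplest form, chain-of-thought prompting is just a suffix on the prompt, as in this illustrative sketch (llm() again stands in for any chat-completion API):

```python
# Sketch of chain-of-thought prompting. The classic bat-and-ball question
# trips up quick answers; asking for steps tends to elicit the correct $0.05.
def llm(prompt: str) -> str:
    raise NotImplementedError("call your chat-completion API here")

question = ("A bat and a ball cost $1.10 together. The bat costs $1.00 more "
            "than the ball. How much does the ball cost?")
direct = llm(question)  # models (and people) often blurt out the wrong "$0.10"
stepwise = llm(question + "\nThink step by step, then give the final answer.")
```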

Then last September, OpenAI released o1, a model that pushed chain-of-thought reasoning much further than previous models. The o1 model can generate hundreds—or even thousands—of tokens “thinking” about a problem before producing a response. The longer it thinks, the more likely it is to reach a correct answer.

Reinforcement learning was essential for the success of o1 because a model trained purely with imitation learning would have suffered from compounding errors: the more tokens it generated, the more likely it would be to screw up.

At the same time, chain-of-thought reasoning has made reinforcement learning more powerful. Reinforcement learning only works if a model is able to succeed some of the time—otherwise, there’s nothing for the training algorithm to reinforce. As models learn to generate longer chains of thought, they become able to solve more difficult problems, which enables reinforcement learning on those more difficult problems. This can create a virtuous cycle where models get more and more capable as the training process continues.

In January, the Chinese company DeepSeek released a model called R1 that made quite a splash in the West. The company also released a paper describing how it trained R1. And it included a beautiful description of how a model can “teach itself” to reason using reinforcement learning.

DeepSeek trained its models to solve difficult math and programming problems. These problems are ideal for reinforcement learning because they have objectively correct answers that can be automatically checked by software. This allows large-scale training without human oversight or human-generated training data.
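
Here’s a toy sketch of what “automatically checked by software” can mean for a math problem with a known numeric answer; real pipelines also run generated code against unit tests in a sandbox. The output format is an assumption.

```python
# Sketch of a verifiable reward: no human needed, just compare the model's
# final line against the known answer.
def math_reward(model_output: str, correct_answer: float) -> float:
    try:
        candidate = float(model_output.strip().splitlines()[-1])
    except (ValueError, IndexError):
        return 0.0  # unparseable output earns no reward
    return 1.0 if abs(candidate - correct_answer) < 1e-6 else 0.0

print(math_reward("Let me think...\n42.0", 42.0))  # prints 1.0
```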

Here’s a remarkable graph from DeepSeek’s paper.

Graph showing average response length, in tokens, during training

It shows the average number of tokens the model generated before giving an answer. As you can see, the longer the training process went on, the longer its responses got.

Here is how DeepSeek describes its training process:

The thinking time of [R1] shows consistent improvement throughout the training process. This improvement is not the result of external adjustments but rather an intrinsic development within the model. [R1] naturally acquires the ability to solve increasingly complex reasoning tasks by leveraging extended test-time computation. This computation ranges from generating hundreds to thousands of reasoning tokens, allowing the model to explore and refine its thought processes in greater depth.

One of the most remarkable aspects of this self-evolution is the emergence of sophisticated behaviors as the test-time computation increases. Behaviors such as reflection—where the model revisits and reevaluates its previous steps—and the exploration of alternative approaches to problem-solving arise spontaneously. These behaviors are not explicitly programmed but instead emerge as a result of the model’s interaction with the reinforcement learning environment.

Here’s one example of the kind of technique the model was teaching itself. At one point during the training process, DeepSeek researchers noticed that the model had learned to backtrack and rethink a previous conclusion using language like this:

Image showing textual breakdown of model rethinking steps

Again, DeepSeek says it didn’t program its models to do this or deliberately provide training data demonstrating this style of reasoning. Rather, the model “spontaneously” discovered this style of reasoning partway through the training process.

Of course, it wasn’t entirely spontaneous. The reinforcement learning process started with a model that had been pretrained using data that undoubtedly included examples of people saying things like “Wait, wait. Wait. That’s an aha moment.”

So it’s not like R1 invented this phrase from scratch. But it evidently did spontaneously discover that inserting this phrase into its reasoning process could serve as a useful signal that it should double-check that it was on the right track. That’s remarkable.

In a recent article, Ars Technica’s Benj Edwards explored some of the limitations of reasoning models trained with reinforcement learning. For example, one study “revealed puzzling inconsistencies in how models fail. Claude 3.7 Sonnet could perform up to 100 correct moves in the Tower of Hanoi but failed after just five moves in a river crossing puzzle—despite the latter requiring fewer total moves.”

Conclusion: Reinforcement learning made agents possible

One of the most discussed applications for LLMs in 2023 was creating chatbots that understand a company’s internal documents. The conventional approach to this problem was called RAG—short for retrieval augmented generation.

When the user asks a question, a RAG system performs a keyword- or vector-based search to retrieve the most relevant documents. It then inserts these documents into an LLM’s context window before generating a response. RAG systems can make for compelling demos. But they tend not to work very well in practice because a single search will often fail to surface the most relevant documents.

Today, it’s possible to develop much better information retrieval systems by allowing the model itself to choose search queries. If the first search doesn’t pull up the right documents, the model can revise the query and try again. A model might perform five, 20, or even 100 searches before providing an answer.
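
A sketch of that iterated loop appears below; search() and llm() are hypothetical stand-ins for a vector-search backend and a chat-completion API, and the reply format is invented for illustration.

```python
# Sketch of an agentic retrieval loop: unlike single-shot RAG, the model can
# revise its query and search again until it has what it needs.
def search(query: str) -> list[str]:
    raise NotImplementedError("call your keyword or vector search here")

def llm(prompt: str) -> str:
    raise NotImplementedError("call your chat-completion API here")

def agentic_answer(question: str, max_searches: int = 20) -> str:
    found: list[str] = []
    query = question
    for _ in range(max_searches):
        found += search(query)
        step = llm(
            f"Question: {question}\nDocuments so far: {found}\n"
            "Reply 'ANSWER: <answer>' if you can answer now, or "
            "'QUERY: <new search query>' to keep searching."
        )
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        query = step.removeprefix("QUERY:").strip()
    # Fall back to answering with whatever was found.
    return llm(f"Question: {question}\nDocuments: {found}\nAnswer as best you can.")
```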

But this approach only works if a model is “agentic”—if it can stay on task across multiple rounds of searching and analysis. LLMs were terrible at this prior to 2024, as the examples of AutoGPT and BabyAGI demonstrated. Today’s models are much better at it, which allows modern RAG-style systems to produce better results with less scaffolding. You can think of “deep research” tools from OpenAI and others as very powerful RAG systems made possible by long-context reasoning.

The same point applies to the other agentic applications I mentioned at the start of the article, such as coding and computer use agents. What these systems have in common is a capacity for iterated reasoning. They think, take an action, think about the result, take another action, and so forth.

Timothy B. Lee was on staff at Ars Technica from 2017 to 2021. Today, he writes Understanding AI, a newsletter that explores how AI works and how it’s changing our world. You can subscribe here.

Timothy is a senior reporter covering tech policy and the future of transportation. He lives in Washington DC.

the-curious-rise-of-giant-tablets-on-wheels

The curious rise of giant tablets on wheels


Not quite a TV, not your average tablet

Hands-on with KTC’s 32-inch Android tablet on a rolling pedestal, the A32Q7 Pro.

KTC’s MegPad 32-inch Android Tablet (A32Q7 Pro). Credit: Scharon Harding

Over the past few years, LG has set off a strange tech trend that’s been rolling onto devices sold across Amazon and other online electronics retailers.

In 2022, the company launched the StanbyME, which is essentially a $1,000 27-inch tablet running LG’s smart TV operating system (OS), webOS, but lacking a tuner. LG’s press release announcing the device described it as a “wireless private TV screen with a built-in battery” that is easily portable and ideal for watching shows and movies, in addition to “video conferencing with family and coworkers and viewing online lectures.”

Today, the StanbyME competes against a slew of similar devices, including some from Samsung, but mostly from smaller brands and running Android.

I’ve had one of these devices, the KTC MegPad 32-inch Android Tablet (A32Q7 Pro), rolling around my home for a few weeks, and I’m left curious about what’s driving the growth of StanbyME-like devices, which are noticeably niche and expensive. I’m also uncertain whether these hybrid devices have an ongoing place in a consumer tech world already inundated with big-screen TVs, small-screen tablets, and beloved laptops.

Hands-on

Unlike LG’s StanbyME, KTC’s device doesn’t run a smart TV OS. Instead, it’s a 32-inch Android 13 tablet. Still, KTC heavily markets the MegPad’s ability to serve as streaming hardware, and that’s one of the best uses I found for it.

A big ol’ tablet on wheels. Credit: Scharon Harding

Treating the MegPad like a smart TV on wheels meant I could have a living-room-like experience in more places throughout my home. I could watch TV in bed with a more visible screen set at a more comfortable distance than what I’d achieve with a laptop or tablet. It also meant flexibility. I don’t like having a permanent TV in my room (how would I ever get out of bed?), so I appreciated the ability to roll the MegPad out of my room or twist it so that the screen faced away from me.

The MegPad is also a diplomatic solution for homes with limited TVs or computers. This could be helpful for homes with kids with varied interests or in my home, where a speedy, 55-inch TV in the living room is the best screen available by far. I was able to let my partner take the big screen for gaming and still hang out nearby while streaming on the MegPad. I don’t have a central coffee table in my living room, but the mobile tablet enabled me to watch shows without a device weighing down my lap or making me connect a wireless speaker for better volume.

KTC’s device also has a helpful leg-up over LG’s StanbyME via its HDMI port, which makes the MegPad work like a regular monitor. Determining where to safely rest a device tethered to this mobile machine is something you’ll have to figure out on your own, though.

The port selection on the panel’s backside. Credit: Scharon Harding

Compared to the TV mounted on my living room wall, the MegPad is much easier to move from room to room, but it’s easy to overestimate how seamless transporting it is. Yes, it’s on a set of five 360-degree wheels, but the wheels don’t lock, and the device weighs 40.3 pounds, per its Amazon listing. That means I had to exert a decent amount of effort to move it over floor transition strips, across uneven floors, and from hardwood to carpet.

The charging port and power button are on the stand’s base. Credit: Scharon Harding

A fully rotating screen, however, makes up for some of my mobility complaints and diversifies the MegPad’s potential uses. Besides streaming, for example, the MegPad was great for watching yoga videos online (which calls for viewing the screen from different heights and positions). It also proved to be an ideal setup for creating a large print-out collage, which involved a lot of dragging, dropping, and cropping of images.

How the MegPad moves. Credit: KTC

Not a real TV

You can do a lot with a sizeable Android tablet. But with TV and movie watching being some of the most obvious uses, it’s important to note that neither the MegPad nor any of its rollable rivals are real TVs.

For one, there’s no tuner, though in the streaming world, that matters less to many of today’s TV viewers.

Further, the MegPad, like many StanbyME-like devices, uses Android 13, which doesn’t require paying the vendor licensing fees that purpose-built smart TV OSes, such as Android TV/Google TV and webOS, do. There are some benefits to that, though.

To start, Android 13 doesn’t have the integrated ads that Android TV or the Google TV interface does. Google claims that the Google TV platform doesn’t use automatic content recognition (ACR), but as Consumer Reports has noted, Google collects “data from TVs that use its smart TV platform—and there’s no opting out of Google’s policies during setup if you want smart TV functionality.” Further, Google may combine that data with user data from third parties for advertising purposes. A spokesperson for KTC confirmed to me that the MegPad doesn’t use ACR.

As a tablet, the MegPad is compatible with more apps, many of which aren’t supported by Google TVs, like Google Sheets, Microsoft Word, Reddit, and Signal.

Android tablets are also more appropriate for storing documents, photos, and other files than smart TVs are. Although it’s likely less roomy than your PC, the MegPad has 128GB of internal storage.

But since this is an Android tablet and not a Google TV, there are no integrated channels, nor is there the live-TV-only setup option that, on a Google TV, stops the device from collecting diagnostic information. Google TV would also include a more streaming-friendly user interface and the ability to watch content from different streaming providers without switching apps.

Further differing from LG’s StanbyME and real TVs, the MegPad doesn’t include a traditional remote. The tablet comes with a basic Bluetooth mouse, but due to the tablet’s portability, I frequently used the tablet without a flat surface within arm’s reach available for comfortable mouse control. The touchscreen is reliable, but gestures can be cumbersome on a tablet this large, and the display was often out of my hand’s reach.

The tablet comes with this mouse and removable mouse stand. Credit: Scharon Harding

The new portable TV?

With TVs getting larger and people turning to portable gadgets like phones and laptops for TV watching, true portable TVs have become a rarity. Demand for a small device dedicated to on-the-go TV viewing has dropped significantly since the last century. Meanwhile, fabs and supply chains are built around monitor and TV-sized displays, making it difficult to incorporate some of the most desirable display technologies, like OLED, into smaller-sized panels with competitive prices.

As a result, devices like the MegPad and Amazon’s Echo Show have become the new de facto stand-ins for portable TVs, even though they’re not true TV sets. Even LG’s StanbyME Go, a 27-inch webOS-powered display packed into a briefcase, is a far cry from what most of us would traditionally consider a portable TV.

LG’s StanbyME Go. Credit: LG

Again, these tablets have more versatility than the small, telescoping-antenna-equipped boxes you used to stick on your kitchen counter or hand to a hyper kid during road trips. But they also require a reliance on Big Tech software and all the privacy and ethical implications that come with that.

You don’t see many of these anymore. From left to right: Casio EV 570, Sony Watchman, and Casio EV 660. Credit: Richard Derk/Los Angeles Times via Getty Images

KTC also sees the MegPad’s appeal as a pseudo-TV. The MegPad’s product page emphasizes users’ ability to “watch favorite shows/movies directly—no PC needed” and to “stream Netflix [and] YouTube… more effortlessly on your smart TV.” Its Amazon product page also promotes the keywords “portable TV,” “rolling TV,” “mobile TV,” and “standing TV.” This is all despite the MegPad not technically being a true TV.

“KTC defines the MegPad A32Q7Pro as a portable, smart, touchscreen monitor,” KTC’s spokesperson told me. “It combines key traits of a smart display and a large-screen tablet. While it shares some features with smart TVs, tablets, and monitors, it doesn’t fully belong to any single traditional category. It’s a hybrid device designed to bridge those use cases.”

Android tablets on wheels

Many devices like the MegPad represent a push for more Android-powered, non-Google devices that has been buoyed by a program that Google launched in 2022, the Enterprise Devices Licensing Agreement (EDLA).

As explained by partners like BenQ, EDLA is a way for third parties to incorporate Google Mobile Services (GMS), which are Google’s most commonly used apps and APIs bundled for use across different types of devices. GMS apps include popular software like Google Drive, Gmail, the Google Play Store, and YouTube.

“Previously, GMS was only officially available for smartphones, tablets, TVs, and wearables. Under the new EDLA, the list of devices eligible for GMS certification has now been expanded to include enterprise solutions such as smart boards,” a blog from BenQ, which has EDLA-certified smart displays, reads.

Since 2022 (the year LG’s StanbyME launched), there has been an uptick in non-Google devices with this EDLA certification. One of the categories taking advantage of the newer program is tablets on wheels, like the MegPad and similar options from Kefeya, Apolosign, Innocn, and DuraPro.

Demonstrating the marketing value of EDLA certification, the MegPad’s product page reads: “Google EDLA certification provides secure, direct access to Google services and the Google Play Store with regular updates, offering greater stability and data protection than open app ecosystems with unverified apps.”

Most EDLA-certified devices seem to be interactive displays used for education. With EDLA certification, devices like the MegPad may also draw the attention of educators or even businesses. Meanwhile, Google is happy to hand out EDLA certifications, as they can drive Android adoption, giving Google more data and access to customers outside of the typical Android devices, such as phones. Products like the MegPad can also be easier to shop with (Google loves when people use its offerings to shop) than Android devices with smaller screens.

Who’s this for?

I’ve been fascinated by the MegPad and similar devices because they introduce a unique approach to streaming, web browsing, and productivity. But ultimately, they’re hard to recommend when there are other personal gadgets that are more affordable and often take up less space.

I had fun with the MegPad and appreciated the flexibility it offered, especially in my smaller NYC home. There are some specific use cases where products like this could excel, like if you want to bring a computer or screen into a room that doesn’t always need one. It was also helpful as an entertainment center for my father post-surgery, when he primarily had to lie on one side in bed.

Overall, the growing presence of devices like the MegPad underscores a confluence occurring among smart TVs, tablets, monitors, and smart displays. With software being forced into more types of displays, often in the interest of gathering more user data, it’s an interesting time to consider what you want from your next screen—be it computing power, a certain size, the omission or inclusion of web connectivity, or mobility.

It appears that the MegPad and similar tablets are trying to take advantage of the attention that LG garners when launching distinctive devices like its StanbyME line. Besides a StanbyME lookalike, Apolosign also makes a device similar to the StanbyME Go.

Apolosign’s PackGo is very similar to LG’s StanbyME Go. Credit: Apolosign

Three years after LG made TV-esque devices on wheels a talking point, more brands are trying to roll into the market. That includes LG’s best TV frenemy, Samsung, which has been using the form factor in limited geographies to drive sales of “smart monitors.”

Tech brands have ulterior motives for pushing this newer form factor that go beyond filling a gap in consumer gadgets. But if a large tablet or small smart display with wheels fits your needs, the options are there, and they should meet most expectations.

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

what’s-wrong-with-aaa-games?-the-development-of-the-next-battlefield-has-answers.

What’s wrong with AAA games? The development of the next Battlefield has answers.


EA insiders describe stress and setbacks in a project that’s too big to fail.

A marketing image for Battlefield depicting soldiers and jets

After the lukewarm reception of Battlefield 2042, EA is doubling down.

It’s been 23 years since the first Battlefield game, and the video game industry is nearly unrecognizable to anyone who was immersed in it then. Many people who loved the games of that era have since become frustrated with where AAA (big budget) games have ended up.

Today, publisher EA is in full production on the next Battlefield title—but sources close to the project say it has faced culture clashes, ballooning budgets, and major disruptions that have left many team members fearful that parts of the game will not be finished to players’ satisfaction in time for launch during EA’s fiscal year.

They also say the company has made major structural and cultural changes to how Battlefield games are created to ensure it can release titles of unprecedented scope and scale. This is all to compete with incumbents like the Call of Duty games and Fortnite, even though no prior Battlefield has achieved anywhere close to that level of popular and commercial success.

I spoke with current and former EA employees who work or have recently worked directly on the game—they span multiple studios, disciplines, and seniority levels and all agreed to talk about the project on the condition of anonymity. Asked to address the reporting in this article, EA declined to comment.

According to these first-hand accounts, the changes have led to extraordinary stress and long hours. Every employee I spoke to across several studios either took exhaustion leave themselves or directly knew staffers who did. Two people who had worked on other AAA projects within EA or elsewhere in the industry said this project had more people burning out and needing to take leave than they’d ever seen before.

Each of the sources I spoke with shared sincere hopes that the game will still be a hit with players, pointing to its strong conceptual start and the talent, passion, and pedigree of its development team. Whatever the end result, the inside story of the game’s development illuminates why the medium and the industry are in the state they’re in today.

The road to Glacier

To understand exactly what’s going on with the next Battlefield title—codenamed Glacier—we need to rewind a bit.

In the early 2010s, Battlefield 3 and Battlefield 4 expanded the franchise audience to more directly compete with Call of Duty, the heavy hitter at the time. Developed primarily by EA-owned, Sweden-based studio DICE, the Battlefield games mixed the franchise’s promise of combined arms warfare and high player counts with Call of Duty’s faster pace and greater platform accessibility.

This was a golden age for Battlefield. However, 2018’s Battlefield V launched to a mixed reception, and EA began losing players’ attention in an expanding industry.

Battlefield 3, pictured here, kicked off the franchise’s golden age. Credit: EA

Instead, the hot new online shooters were Overwatch (2016), Fortnite (2017), and a resurgent Call of Duty. Fortnite was driven by a popular new gameplay mode called Battle Royale, and while EA attempted a Battle Royale mode in Battlefield V, it didn’t achieve the desired level of popularity.

After V, DICE worked on a Battlefield title that was positioned as a throwback to the glory days of 3 and 4. That game would be called Battlefield 2042 (after the future year in which it was set), and it would launch in 2021.

The launch of Battlefield 2042 is where Glacier’s development story begins. Simply put, the game was not fun enough, and Battlefield 2042 launched as a dud.

Don’t repeat past mistakes

Players were disappointed—but so were those who worked on 2042. Sources tell me that prior to launch, Battlefield 2042 “massively missed” its alpha target—a milestone by which most or all of the foundational features of the game are meant to be in place. Because of this, the game’s final release would need to be delayed in order to deliver on the developers’ intent (and on players’ expectations).

“Realistically, they have to delay the game by at least six months to complete it. Now, they eventually only delayed it by, I think, four or five weeks, which from a development point of view means very little,” said one person who worked closely with the project at the time.

Developers at DICE had hoped for more time. Morale fell, but the team marched ahead to the game’s lukewarm launch.

Ultimately, EA made back some ground with what the company calls “live operations”—additional content and updates in the months following launch—but the game never fulfilled its ambitions.

Plans were already underway for the next Battlefield game, so a postmortem was performed on 2042. It concluded that the problems had been in execution, not vision. New processes were put into place so that issues could be identified earlier and milestones like the alpha wouldn’t be missed.

To help achieve this, EA hired three industry luminaries to lead Glacier, all of them based in the United States.

The franchise leadership dream team

2021 saw EA bring on Byron Beede as general manager for Battlefield; he had previously been general manager for both Call of Duty (including the Warzone Battle Royale) and the influential shooter Destiny. EA also hired Marcus Lehto—co-creator of Halo—as creative chief of a newly formed Seattle studio called Ridgeline Games, which would lead the development of Glacier’s single-player campaign.

Finally, there was Vince Zampella, one of the leaders of the team that initially created Call of Duty in 2003. He joined EA in 2010 to work on other franchises, but in 2021, EA announced that Zampella would oversee Battlefield moving forward.

In the wake of these changes, some prominent members of DICE departed, including General Manager Oskar Gabrielson and Creative Director Lars Gustavsson, who had been known by the nickname “Mr. Battlefield.” With this changing of the guard, EA was ready to place a bigger bet than ever on the next Battlefield title.

100 million players

While 2042 struggled, competitors Call of Duty and Fortnite were posting astonishing player and revenue numbers, thanks in large part to the popularity of their Battle Royale modes.

EA’s executive leadership believed Battlefield had the potential to stand toe to toe with them, if the right calls were made and enough was invested.

A lofty player target was set for Glacier: 100 million players over a set period of time that included post-launch.

Fortnite characters looking across the many islands and vast realm of the game.

Fortnite‘s huge success has publishers like EA chasing the same dollars. Credit: Epic Games

“Obviously, Battlefield has never achieved those numbers before,” one EA employee told me. “It’s important to understand that over about that same period, 2042 has only gotten 22 million,” another said. Even 2016’s Battlefield 1—the most successful game in the franchise by numbers—had achieved “maybe 30 million plus.”

Of course, most previous Battlefield titles had been premium releases, with an up-front purchase cost and no free-to-play mode, whereas successful competitors like Fortnite and Call of Duty made their Battle Royale modes freely available, monetizing users with in-game purchases and season passes that unlocked post-launch content.

It was thought that if Glacier did the same, it could achieve comparable numbers, so a free-to-play Battle Royale mode was made a core offering for the title, alongside a six-hour single-player campaign, traditional Battlefield multiplayer modes like Conquest and Rush, a new F2P mode called Gauntlet, and a community content mode called Portal.

The most expensive Battlefield ever

All this meant that Glacier would have a broader scope than its predecessors. Developers say it has the largest budget of any Battlefield title to date.

The project targeted a budget of more than $400 million back in early 2023, which was already more than originally planned.

However, major setbacks significantly disrupted production in 2023 (more on that in a moment) and hundreds of additional developers were brought onto Glacier from various EA-owned studios to get things back on track, significantly increasing the cost. Multiple team members with knowledge of the project’s finances told me that the current projections are now well north of that $400 million amount.

Skepticism in the ranks

Despite the big ambitions of the new leadership team and EA executives, “very few people” working in the studios believed the 100 million target was achievable, two sources told me. Many of those who had worked on Battlefield for a long time at DICE in Stockholm were particularly skeptical.

“Among the things that we are predicting is that we won’t have to cannibalize anyone else’s sales,” one developer said. “That there’s just such an appetite out there for shooters of this kind that we will just naturally be able to get the audience that we need.”

Regarding the lofty player and revenue targets, one source said that “nothing in the market research or our quality deliverables indicates that we would be anywhere near that.”

“I think people are surprised that they actually worked on a next Battlefield game and then increased the ambitions to what they are right now,” said another.

In 2023, a significant disruption to the project put one game mode in jeopardy, foreshadowing a more troubled development than anyone initially imagined.

Ridgeline implodes

Battlefield games have a reputation for middling single-player campaigns, and Battlefield 2042 didn’t include one at all. But part of this big bet on Glacier was the idea of offering the complete package, so Ridgeline Games scaled up while working on a campaign EA hoped would keep Battlefield competitive with Call of Duty, which typically includes a single-player campaign in its releases.

The studio worked on the campaign for about two years while it was also scaling and hiring talent to catch up to established studios within the Battlefield family.

It didn’t work out. In February 2024, Ridgeline was shuttered, Halo luminary Marcus Lehto left the company, and the rest of the studios were left to pick up the pieces. At an internal review shortly before the closure, Glacier’s top leadership were dissatisfied with the progress they were seeing, and the call was made.

Sources in EA teams outside Ridgeline told me that there weren’t proper check-ins or internal reviews of the studio’s progress, obscuring the true state of the project until that fateful review.

On the other hand, those closer to Ridgeline described a situation in which the team couldn’t possibly complete its objectives, as it was expected to hire and scale up from zero while also meeting the same milestones as established studios with resources already in place. “They kept reallocating funds—essentially staff months—out of our budget,” one person told me. “And, you know, we’re sitting there trying to adapt to doing more with less.”

A marketing image from EA showing now-defunct Ridgeline Games on the list of groups involved. Credit: EA

After the shuttering of Ridgeline, ownership of single-player shifted to three other EA studios: Criterion, DICE, and Motive. But those teams had a difficult road ahead, as “there was essentially nothing left that Ridgeline had spent two years working on that they could pick up on and build, so they had to redo essentially everything from scratch within the same constraints of when the game had to release.”

Single-player was two years behind. As of late spring, it was the only game mode that had failed to reach alpha, well over a year after the initial overall alpha target for the project.

Multiple sources said its implosion was symptomatic of some broader cultural and process problems that affected the rest of the project, too.

Culture shock

In conversations with people who have worked or currently work at DICE in Sweden, the tension between some at that studio and the new, US-based leadership team was obvious—and to a degree, that’s expected.

DICE had “the pride of having started Battlefield and owned that IP,” but now the studio was just “supporting it for American leadership,” said one person who worked there. Further, “there’s a lot of distrust and disbelief… when it comes to just operating toward numbers that very few people believe in apart from the leadership.”

But the tensions appear to go deeper than that. Two other major factors were at play: scaling pains as the scope of the project expanded and differences in cultural values between US leadership and the workers in Europe.

“DICE being originally a Swedish studio, they are a bit more humble. They want to build the best game, and they want to achieve the greatest in terms of the game experience,” one developer told me. “Of course, when you’re operated by EA, you have to set financial expectations in order to be as profitable as possible.”

That tension wasn’t new. But before 2042 failed to meet expectations, DICE Stockholm employees say they were given more leeway to set the vision for the game, as well as greater influence on timeline and targets.

Some EU-based team members were vocally dismayed at how top-down directives from far-flung offices, along with the US company’s emphasis on quarterly profits, affected Glacier’s development far more than previous Battlefield titles.

This came up less in talking to US-based staff, but everyone I spoke with on both continents agreed on one thing: Growing pains accompanied the transition from a production environment where one studio leads and others offer support to a new setup with four primary studios—plus outside support from all over EA—and all of it helmed by LA-based leadership.

EA is not alone in adopting this approach; it’s also used by competitor Activision-Blizzard on the Call of Duty franchise (though it’s worth noting that a big hit like Epic Games’ Fortnite has a very different structure).

Whereas publishers like EA and Activision-Blizzard used to house several studios, each of which worked on its own AAA game, they now increasingly make bigger bets on singular games-as-a-service offerings, with several of their studios working in tandem on a single project.

“Development of games has changed so much in the last 10 to 15 years,” said one developer. The new arrangement excites investors and shareholders, who can imagine returns from the next big unicorn release, but it can be a less creatively fulfilling way to work, as directives come from the top down, and much time is spent on dealing with inter-studio process. Further, it amplifies the effects of failures, with a higher human cost to people working on projects that don’t meet expectations.

It has also made the problems that affected Battlefield 2042’s development more difficult to avoid.

Clearing the gates

EA studios use a system of “gates” to set the pace of development. Projects have to meet certain criteria to pass each gate.

For gate one, teams must have a clear sense of what they want to make and some proof of concept showing that this vision is achievable.

As they approach gate two, they’re building out and testing key technology, asking themselves if it can work at scale.

Gate three signifies full production. Glacier was expected to pass gate three in early 2023, but it was significantly delayed. When it did pass, some on the ground questioned whether it should have.

“I did not see robust budget, staff plan, feature list, risk planning, et cetera, as we left gate three,” said one person. In the way EA usually works, these things would all be expected at this stage.

As the project approached gate three and then alpha, several people within the organization tried to communicate that the game wasn’t on footing as firm as the top-level planning suggested. One person attributed this to the lack of a single source of truth within the organization. While developers tracked issues and progress in one tool, others (including project leadership) leaned on other sources of information that weren’t as tied to on-the-ground reality when making decisions.

A former employee with direct knowledge of production plans told me that as gate three approached, prototypes of some important game features were not ready, but since there wasn’t time to complete proofs of concept, the decision was handed down to move ahead to production even though the normal prerequisites were not met.

“If you don’t have those things fleshed out when you’re leaving pre-pro[duction], you’re just going to be playing catch-up the entire time you’re in production,” this source said.

In some cases, employees who flagged the problems believed they were being punished. Two EA employees told me they were cut out of meetings after raising such concerns.

Gate three was ultimately declared clear, and as of late May 2025, alpha was achieved for everything except the single-player campaign. But I’m told that this occurred with some tasks still unestimated and many discrepancies remaining, leaving the door open to problems and compromises down the road.

The consequences for players

Because of these issues, the majority of the people I spoke with said they expect planned features or content to be cut before the game actually launches—which is normal, to a degree. But these common game development problems can contribute to other aspects of modern AAA gaming that many consumers find frustrating.

First off, making major decisions so late in the process can lead to huge day-one patches—a frustration that players of all kinds of AAA games regularly air on Reddit and social media.

Battlefield 2042 had a sizable day-one patch. When multiplayer RPG Anthem (another big investment by EA) launched to negative reviews, that was partly because critics and others with pre-launch access were playing a build that was weeks old; a day-one patch significantly improved some aspects of the game, but that came after the negative press began to pour out.

Anthem, another EA project with a difficult development, launched with a substantial day-one patch. Credit: EA

Glacier’s late arrival at alpha and the teams’ problems with estimating the status of features could lead to a similarly significant day-one patch. That’s in part because EA has to deliver the work to external partners far in advance of the actual launch date.

“They have these external deadlines to do with the submissions into what EA calls ‘first-party’—that’s your PlayStation and Xbox submissions,” one person explained. “They have to at least have builds ready that they can submit.”

What ends up on the disc or what pre-loads from online marketplaces must be finalized long before the game’s actual release date. When a project is far behind or prone to surprises in the final stretch, those last few weeks are where a lot of vital work happens, so big launch patches become a necessity.

These struggles over content often lead to another pet peeve of players: planned launch content being held until later. “There’s a bit of project management within the Battlefield project that they can modify,” a former senior EA employee who worked on the project explained. “They might push it into Season 1 or Season 2.”

That way, players ultimately get the intended feature or content, but in some cases, they may end up paying more for it, as it ends up being part of a post-launch package like a battle pass.

These challenges are a natural extension of the fiscal-quarter-oriented planning that large publishers like EA adhere to. “The final timelines don’t change. The final numbers don’t change,” said one source. “So there is an enormous amount of pressure.”

A campaign conundrum

Single-player is also a problem. “Single-player in itself is massively late—it’s the latest part of the game,” I was told. “Without an enormous patch on day one or early access to the game, it’s unrealistic that they’re going to be able to release it to what they needed it to do.”

If the single-player mode is a linear, narrative campaign as originally planned, it may not be possible to delay missions or other content from the campaign to post-launch seasons.

“Single-player is secondary to multiplayer, so they will shift the priority to make sure that single-player meets some minimal expectations, however you want to measure that. But the multiplayer is the main focus,” an EA employee said.

“They might have to cut a part of the single-player out in order for the game to release with a single-player [campaign] on it,” they continued. “Or they would have to severely work through the summer and into the later part of this year and try to fix that.”

That—and the potential for a disappointing product—is a cost for players, but there are costs for the developers who work on the game, too.

Because timelines must be kept, and not everything can be cut or moved post-launch, it falls on employees to make up the gap. As we’ve seen in countless similar reports about AAA video game development before, that sometimes means longer hours and heavier stress.

AAA’s burnout problem

More than two decades ago, the spouse of an EA employee famously wrote an open letter to bring attention to the long hours and high stress developers there were facing.

Since then, some things have improved. People at all levels within EA are more conscious of the problems that were highlighted, and there have been efforts to mitigate some of them, like more comp time and mental health resources. However, many of those old problems linger in some form.

I heard several first-hand accounts of people working on Glacier who had to take leave for stress or mental exhaustion, ranging from a couple of weeks to several months.

“There’s like—I would hesitate to count—but a large number compared to other projects I’ve been on who have taken mental exhaustion leave here. Some as short as two weeks to a month, some as long as eight months and nine,” one staffer told me after saying they had taken some time themselves.

This was partly because of long hours that were required when working directly with studios in both the US and Europe—a symptom of the new, multi-studio structure.

“My day could start as early as 5:00 [am],” one person said. The first half of the day involved meetings with a studio in one part of the world while the second included meetings with a studio in another region. “Then my evenings would be spent doing my work because I’d be tied up juggling things all across the board and across time zones.”

This sort of workload was not limited to a brief, planned period of focused work, the employees said. Long hours were particularly an issue for those working in or closely with Ridgeline, the studio initially tasked with making the game’s single-player campaign.

From the beginning, members of the Ridgeline team felt they were expected to deliver work at a similar level to that of established studios like DICE or Ripple Effect before they were even fully staffed.

“They’ve done it before,” one person who was involved with Ridgeline said of DICE. “They’re a well-oiled machine.” But Ridgeline was “starting from zero” and was “expected to produce the same stuff.”

Within just six months of the project’s start, some developers at Ridgeline said they were already feeling burnt out.

In the wake of the EA Spouses letter, EA developed resources for employees. But in at least some cases, they weren’t much help.

“I sought some, I guess, mental help inside of EA. From HR or within that organization of some sort, just to be able to express it—the difficulties that I experienced personally or from coworkers on the development team that had experienced this, you know, that had lived through that,” said another employee. “And the nature of that is there’s nobody to listen. They pretend to listen, but nobody ultimately listens. Very few changes are made on the back of it.”

This person went on to say that “many people” had sought similar help and felt the same way, as far back as the post-launch period for 2042 and as recently as a few months ago.

Finding solutions

There have been a lot of stories like this about the games industry over the years, and it can feel relentlessly grim to keep reading them—especially when they’re coming alongside frequent news of layoffs, including at EA. Problems are exposed, but solutions don’t get as much attention.

In that spirit, let’s wrap up by listening to what some in the industry have said about what doing things better could look like—with the admitted caveat that these proposals are still not always common practice in AAA development.

“Build more slowly”

When Swen Vincke—studio head for Larian Studios and game director for the runaway success Baldur’s Gate 3—accepted an award at the Game Developers Conference, he took his moment on stage to express frustration at publishers like EA.

“I’ve been fighting publishers my entire life, and I keep on seeing the same, same, same mistakes over and over and over,” he said. “It’s always the quarterly profits. The only thing that matters are the numbers.”

After the awards show, he took to X to clarify his statements, saying, “This message was for those who try to double their revenue year after year. You don’t have to do that. Build more slowly and make your aim improving the state of the art, not squeezing out the last drop.”

Swen Vincke giving a speech at the 2024 Game Developers Choice Awards. Credit: Game Developers Conference

In planning projects like Glacier, publicly traded companies often pursue huge wins—and there’s even more pressure to do so if a competing company has already achieved big success with similar titles.

But going bigger isn’t always the answer, and many in the industry believe the “one big game” strategy is increasingly nonviable.

In this attention economy?

There may not be enough player time or attention to go around, given the numerous games-as-a-service titles that are as large in scope as Call of Duty games or Fortnite. Despite the recent success of new entrant Marvel Rivals, there have been more big AAA live service shooter flops than wins in recent years.

Just last week, a data-based report by the prominent games marketing newsletter GameDiscoverCo came to a sobering realization. “Genres like Arena Shooter, Battle Royale, and Hero Shooter look amazing from a revenue perspective. But there’s only 29 games in all of Steam’s history that have grossed >$1m in those subgenres,” wrote GameDiscoverCo’s Simon Carless.

It gets worse. “Only Naraka Bladepoint, Overwatch 2 & Marvel Rivals have grossed >$25m and launched since 2020 in those subgenres,” Carless added. (It’s important to clarify that he is just talking Steam numbers here, though.) That’s a stark counterpoint to reports that Call of Duty has earned more than $30 billion in lifetime revenue.

Employees of game publishers and studios are deeply concerned about this. In a 2025 survey of professional game developers, “one of the biggest issues mentioned was market oversaturation, with many developers noting how tough it is to break through and build a sustainable player base.”

Despite those headwinds, publishers like EA are making big bets in well-established spaces rather than placing a variety of smaller bets in newer areas ripe for development. Some of the biggest recent multiplayer hits on Steam have come from smaller studios that used creative ideas, fresh genres, strong execution, and the luck (or foresight) of reaching the market at exactly the right time.

That might suggest that throwing huge teams and large budgets up against well-fortified competitors is an especially risky strategy—hence some of the anxiety from the EA developers I spoke with.

Working smarter, not harder

That anxiety has led to steadily growing unionization efforts across the industry. From QA workers at Bethesda to more wide-ranging unions at Blizzard and CD Projekt Red, there’s been more movement on this front in the past two or three years than there had been in decades beforehand.

Unionization isn’t a cure-all, and it comes with its own set of new challenges—but it does have the potential to shift some of the conversations toward more sustainable practices, so that’s another potential part of the solution.

Insomniac Games CEO Ted Price spoke authoritatively on sustainability and better work practices for the industry way back at 2021’s Develop:Brighton conference:

I think the default is to brute force the problem—in other words, to throw money or people at it, but that can actually cause more chaos and affect well-being, which goes against that balance. The harder and, in my opinion, more effective solution is to be more creative within constraints… In the stress of hectic production, we often feel we can’t take our foot off the gas pedal—but that’s often what it takes.

That means publishers and studios should plan for problems and work from accurate data about where the team actually is, but it also means being willing to give their people more time, provided the capital is available to do so.

Giving people what they need to do their jobs sounds like a simple solution to a complex problem, but it was at the heart of every conversation I had about Glacier.

Most EA developers—including leaders who are beholden to lofty targets—want to make a great game. “At the end of the day, they’re all really good people and they work really hard and they really want to deliver a good product for their customer,” one former EA developer assured me as we ended our call.

As for making the necessary shifts toward sustainability in the industry, “It’s kind of in the best interest of making the best possible game for gamers,” explained another. “I hope to God that they still achieve what they need to achieve within the timelines that they have, for the sake of Battlefield as a game to actually meet the expectations of the gamers and for people to maintain their jobs.”

Photo of Samuel Axon

Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.

What’s wrong with AAA games? The development of the next Battlefield has answers. Read More »

android-16-review:-post-hype

Android 16 review: Post-hype


Competent, not captivating

The age of big, exciting Android updates is probably over.

Android 16 is currently only available for Pixel phones. Credit: Ryan Whitwam

Google recently released Android 16, which brings a smattering of new features for Pixel phones, with promises of additional updates down the road. The numbering scheme has not been consistent over the years, and as a result, Android 16 is actually the 36th major release in a lineage that stretches back nearly two decades. In 2008, we didn’t fully understand how smartphones would work, so there was a lot of trial and error. In 2025, the formula has been explored every which way. Today’s smartphones run mature software, and that means less innovation in each yearly release. That trend is exemplified and amplified by Google’s approach to Android 16.

The latest release is perhaps the most humdrum version of the platform yet, but don’t weep for Google. The company has been working toward this goal for years: a world where the average phone buyer doesn’t need to worry about Android version numbers.

A little fun up front

When you install Android 16 on one of Google’s Pixel phones, you may need to check the settings to convince yourself that the update succeeded. Visually, the changes are so minuscule that you’ll only notice them if you’re obsessive about how Android works. For example, Google changed the style of icons in the overview screen and added a few more options to the overview app menus. There are a lot of these minor style tweaks; we expect more when Google releases Material 3 Expressive, but that’s still some way off.

There are some thoughtful UI changes, but again, they’re very minor and you may not even notice them at first. For instance, Google’s predictive back gesture, which allows the previous screen to peek out from behind the currently displayed one, now works with button navigation.

Apps targeting the new API (level 36) will now default to edge-to-edge rendering, which removes the navigation background to make apps more immersive. Android apps have long neglected larger form factors because Google itself was neglecting those devices. Since the Android 12L release a few years ago, Google has been attempting to right that wrong. Foldable phones have suffered from many of the same app-scaling issues as tablets, but all big-screen Android devices will soon benefit from adaptive apps. Previously, apps could completely ignore the existence of large screens and simply render a phone-shaped UI on them.
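For developers, the practical upshot is that once an app targets API level 36, it is drawn behind the system bars by default and becomes responsible for its own insets. Here is a minimal sketch of that pattern using the AndroidX activity and core libraries; the layout and view IDs are hypothetical placeholders, and the explicit enableEdgeToEdge() call mainly matters on older OS versions where edge-to-edge isn’t yet enforced.

```kotlin
import android.os.Bundle
import android.view.View
import androidx.activity.ComponentActivity
import androidx.activity.enableEdgeToEdge
import androidx.core.view.ViewCompat
import androidx.core.view.WindowInsetsCompat

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        enableEdgeToEdge() // draw behind the status and navigation bars on older releases, too
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main) // hypothetical layout

        // Pad the root view so interactive content stays clear of the system bars.
        val root = findViewById<View>(R.id.root) // hypothetical view ID
        ViewCompat.setOnApplyWindowInsetsListener(root) { view, insets ->
            val bars = insets.getInsets(WindowInsetsCompat.Type.systemBars())
            view.setPadding(bars.left, bars.top, bars.right, bars.bottom)
            insets
        }
    }
}
```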

Advanced Protection is a great addition to Android, even if it’s not the most riveting. Credit: Ryan Whitwam

In Android 16, apps will automatically adapt to larger screens, saving you from having to tinker with the forced aspect ratio tools built into Google and Samsung devices. Don’t confuse this with tablet-style interfaces, though. Just because an app fills the screen doesn’t guarantee it will look good. Most of the apps we’ve run on the Pixel 9 Pro Fold still use stretched phone interfaces that waste space. Developers need to make adjustments to properly take advantage of larger screens. Will they? That’s yet another aspect of Android 16 that we hope will come later.

Security has been a focus of many recent Android updates. While not the sexiest improvement, the addition of Advanced Protection in Android 16 could keep many people from getting hit with malware, and it makes it harder for government entities to capture your data. This feature blocks insecure 2G connections, websites lacking HTTPS, and exploits over USB. It disables sideloading of apps, too, which might make some users wary. However, if you know someone who isn’t tech savvy, you should encourage them to enable Advanced Protection when (and if) they get access to Android 16. This is a great feature that Google should have added years ago.

The changes to notifications will probably make the biggest impact on your daily life. Whether you’re using Android or iOS, notification spam is getting out of hand. Every app seems to want our attention, and notifications can really pile up. Android 16 introduces a solid quality-of-life improvement by bundling notifications from each app. While notification bundles were an option before, they were primarily used for messaging, and not all developers bothered. Now, the notification shade is less overwhelming, and it’s easy to expand each block to triage individual items.

Android 16’s progress notifications are partially implemented in the first release. Credit: Ryan Whitwam

Google has also added a new category of notifications that can show progress, similar to a feature on the iPhone. The full notification will include a live updating bar that can tell you exactly when your Uber will show up, for example. These notifications will come first to delivery and rideshare apps, but none of them are working yet. You can get a preview of how these notifications will work with the Android 16 easter egg, which sends a little spaceship rocketing toward a distant planet.

The progress notifications will also have a large status bar chip with basic information visible at all times. Tapping on it will expand the full notification. However, this is also not implemented in the first release of Android 16. Yes, this is a recurring theme with Google’s new OS.
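Going by Google’s published Android 16 API surface, posting one of these progress notifications might look something like the sketch below. The channel ID, icon, and notification ID here are hypothetical, and the real ProgressStyle API offers more styling (segments, points, and tracker icons) than this shows.

```kotlin
import android.app.Notification
import android.app.NotificationChannel
import android.app.NotificationManager
import android.content.Context

// Sketch: a live-updating "driver is on the way" notification using
// Android 16's Notification.ProgressStyle (API level 36).
fun postRideProgress(context: Context, percentComplete: Int) {
    val manager = context.getSystemService(NotificationManager::class.java)
    manager.createNotificationChannel(
        NotificationChannel("rides", "Ride updates", NotificationManager.IMPORTANCE_DEFAULT)
    )

    val style = Notification.ProgressStyle()
        .addProgressSegment(Notification.ProgressStyle.Segment(100)) // one segment spanning the trip
        .setProgress(percentComplete) // current position along the bar

    val notification = Notification.Builder(context, "rides")
        .setSmallIcon(android.R.drawable.stat_sys_download) // placeholder icon
        .setContentTitle("Driver is on the way")
        .setStyle(style)
        .setOngoing(true) // shouldn't be swiped away mid-trip
        .build()

    // Re-posting with the same ID and a new percentage updates the bar in place.
    manager.notify(1001, notification)
}
```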

More fun still to come

You may notice that none of the things we’ve discussed in Android 16 are exactly riveting—better security features and cleaner notifications are nice to have, but this is hardly a groundbreaking update. It might have been more exciting were it not for the revamped release schedule, though. This Android 16 release isn’t even the Android 16. There will be a second Android 16 update later in the year, and some of the most interesting features aren’t arriving as part of either one.

Traditionally, Google has released new versions of Android in the fall, around the time new Pixel phones arrive. Android 15, for example, began its rollout in October 2024. Just eight months later, we’re on to Android 16. This is the first cycle in which Google will split its new version into two updates. Going forward, the bigger update will arrive in Q2, and the smaller one, which includes API and feature tweaks, will come at the end of the year.

Google has said the stylish but divisive Material 3 Expressive UI and the desktop windowing feature will come later. They’re currently in testing with the latest beta for Android 16 QPR1, which will become a Pixel Drop in September. It’s easy to imagine that with a single fall Android 16 release, both of these changes would have been included.

In the coming months, we expect to see some Google apps updated with support for Material 3, but the changes will be minimal unless you’re using a phone that runs Google’s Android theme. For all intents and purposes, that means a Pixel. Motorola has traditionally hewed closely to Google’s interface, while Samsung, OnePlus, and others forged their own paths. But even Moto has been diverging more as it focuses on AI. It’s possible that Google’s big UI shakeup will only affect Pixel users.

As for desktop windowing, that may have limited impact, too. On-device windowing will only be supported on tablets—even tablet-style foldables will be left out. We’ve asked Google to explain this decision and will report back if we get more details. Non-tablet devices will be able to project a desktop-style interface on an external display via USB video-out, but the feature won’t be available universally. Google tells Ars that it’s up to OEMs to support this feature. So even a phone that has video-out over USB may not have desktop windowing. Again, Pixels may be the best (or only) way to get Android’s new desktop mode.

The end of version numbers

There really isn’t much more to say about Android 16 as it currently exists. This update isn’t flashy, but it lays important groundwork for the future. The addition of Material 3 Expressive will add some of the gravitas we expect from major version bumps, but it’s important to remember that this is just Google’s take on Android—other companies have their own software interests, mostly revolving around AI. We’ll have to wait to see what Samsung, OnePlus, and others do with the first Android 16 release. The underlying software has been released in the Android Open Source Project (AOSP), but it will be a few months before other OEMs have updates.

In some ways, boring updates are exactly what Google has long wanted from Android. Consider the era when Android updates were undeniably exciting—a time when the addition of screenshots could be a headlining feature (Android 4.0 Ice Cream Sandwich) or when Google finally figured out how to keep runaway apps from killing your battery (Android 6.0 Marshmallow). But there was a problem with these big tentpole updates: Not everyone got them, and they were salty about it.

During the era of rapid software improvement, it took the better part of a year (or longer!) for a company like Samsung or LG to deploy new Android updates. Google would announce a laundry list of cool features, but only the tiny sliver of people using Nexus (and later Pixel) phones would see them. By the time a Samsung Galaxy user had the new version, it was time for Google to release another yearly update.

This “fragmentation” issue was a huge headache for Google, leading it to implement numerous platform changes over the years to take the pressure off its partners and app developers. There were simple tweaks like adding important apps, including Maps and the keyboard (later Gboard), to the Play Store so they could be updated regularly. On the technical side, initiatives like Project Mainline made the platform more modular so features could be added and improved outside of major updates. Google has also meticulously moved features into Play Services, which can deliver system-level changes without an over-the-air update (although there are drawbacks to that).

Android version numbers hardly matter anymore—it’s just Android. Credit: Ryan Whitwam

The overarching story of Android has been a retreat from monolithic updates, and that means there’s less to get excited about when a new version appears. Rather than releasing a big update rife with changes, Google has shown a preference for rolling out features via the Play Store and Play Services to the entire Android ecosystem. Experiences like Play Protect anti-malware, Google Play Games, Google Cast, Find My Device, COVID-19 exposure alerts, Quick Share, and myriad more were released to almost all Google-certified Android devices without system updates.

As more features arrive in dribs and drabs via Play Services and Pixel Drops, the numbered version changes are less important. People used to complain about missing out on the tentpole updates, but it’s quieter when big features are decoupled from version numbers. And that’s where we are—Android 15 or Android 16—the number is no longer important. You won’t notice a real difference, but the upshot is that most phones get new features faster than they once did. That was the cost to fix fragmentation.

Boring updates aren’t just a function of rearranging features. Even if all the promised upgrades were here now, Android 16 would still barely move the needle. Phones are now mature products with established usage paradigms. It’s been almost 20 years since the age of touchscreen smartphones began, and we’ve figured out how these things should work. It’s not just Android updates settling into prosaic predictability—Apple is running low on paradigm shifts, too. The release of iOS 26 will add some minor improvements to a few apps, and the theme is getting more transparent with the controversial “Liquid Glass” UI. And that’s it.

Until there’s a marked change in form factors or capability, these flat glass slabs will look and work more or less as they do now (with a lot more AI slop, whether you like it or not). If you have a recent non-Pixel Android device, you’ll probably get Android 16 in the coming months, but it won’t change the way you use your phone.

Photo of Ryan Whitwam

Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he’s written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards.

Android 16 review: Post-hype Read More »

an-exceedingly-rare-asteroid-flyby-will-happen-soon,-but-nasa-may-be-left-on-the-sidelines

An exceedingly rare asteroid flyby will happen soon, but NASA may be left on the sidelines


“Nature is handing us an incredibly rare experiment.”

An illustration of the OSIRIS-Apex mission at Apophis. Credit: NASA

A little less than four years from now, a killer asteroid will narrowly fly past planet Earth. This will be a celestial event visible around the world—for a few weeks, Apophis will shine among the brightest objects in the night sky.

The near miss by the large Apophis asteroid in April 2029 offers NASA a golden—and exceedingly rare—opportunity to observe such an object up close. Critically, the interaction between Apophis and Earth’s gravitational pull will offer scientists an unprecedented chance to study the interior of an asteroid.

This is fascinating for planetary science, but it also has serious implications for planetary defense. In the future, were such an asteroid on course to strike Earth, an effective plan to deflect it would depend on knowing what the interior looks like.

“This is a remarkable opportunity,” said Bobby Braun, who leads space exploration for the Johns Hopkins Applied Physics Laboratory, in an interview. “From a probability standpoint, there’s not going to be another chance to study a killer asteroid like this for thousands of years. Sooner or later, we’re going to need this knowledge.”

But we may not get it.

NASA has some options for tracking Apophis during its flyby. However, the most promising of these, a mission named OSIRIS-Apex that breathes new life into an old spacecraft that otherwise would drift into oblivion, is slated for cancellation by the Trump White House’s budget for fiscal year 2026.

Other choices—including dragging the dual Janus space probes out of storage, along with other concepts submitted to NASA a year ago as part of a call for ideas—have already been rejected or simply left on the table. As a result, NASA currently has no plans to study what will be the most important asteroid encounter since the formation of the space agency.

“The world is watching,” said Richard Binzel, an asteroid expert at the Massachusetts Institute of Technology. “NASA needs to step up and do their job.”

But will they?

A short history of planetary defense

For decades, nearly every public survey asking what NASA should work on has rated planetary defense at or near the very top of the space agency’s priorities. Yet for a long time, no part of NASA actually focused on finding killer asteroids or developing the technology to deflect them.

In authorization bills dating back to 2005, Congress began mandating that NASA “detect, track, catalog, and characterize” near-Earth objects that were 140 meters in diameter or larger. Congress established a goal of finding 90 percent of these by the year 2020. (We’ve blown past that deadline, obviously.)

NASA had been informally studying asteroids and comets for decades but did not focus on planetary defense until 2016, when the space agency established the Planetary Defense Coordination Office. In the decade since, NASA has made some progress, identifying more than 26,000 near-Earth objects, which are defined as asteroids and comets that come within 30 million miles of our planet’s orbit.

Moreover, NASA has finally funded a space mission designed specifically to look for near-Earth threats, NEO Surveyor, a space telescope with the goal of “finding asteroids before they find us.” The $1.2 billion mission is due to launch no earlier than September 2027.

NASA also funded the DART mission, which launched in 2021 and impacted a 160-meter asteroid named Dimorphos a year later to demonstrate the ability to make a minor deflection.

But in a report published this week, NASA’s Office of Inspector General found that despite these advances, the space agency’s approach to planetary defense still faces some significant challenges. These include a lack of resources, a need for better strategic planning, and competition with NASA’s more established science programs for limited funding.

A comprehensive plan to address planetary defense must include two elements, said Ed Lu, a former NASA astronaut who co-founded the B612 Foundation to protect Earth from asteroid impacts.

The first of these is the finding and detection of asteroid threats. That is being addressed both by the forthcoming NEO Surveyor and the recently completed Vera C. Rubin Observatory, which is likely to find thousands of new near-Earth threats. The challenge in the coming years will be processing all of this data, calculating orbits, and identifying threats. Lu said NASA must do a better job of being transparent in how it makes these calculations.

The second thing Lu urged NASA to do is develop a follow-up mission to DART. It was successful, he said, but DART was just an initial demonstration. Such a capability needs to be tested against a larger asteroid with different properties.

An asteroid that might look a lot like Apophis.

About Apophis

Astronomers using a telescope in Arizona found Apophis in 2004, and they were evidently fans of the television series Stargate SG-1, in which a primary villain who threatens civilization on Earth is named Apophis.

Because of its orbit, Apophis comes near Earth about every eight years. It is fairly large, about 370 meters across. That is not big enough to wipe out civilization, but an impact would have devastating consequences across a large region, delivering roughly 300 times as much energy as the 1908 Tunguska event over Siberia. It will miss Earth by about 31,600 km (19,600 miles) on April 13, 2029.

“We like to say that’s because nature has a sense of humor,” said Binzel, the MIT asteroid scientist, of this date.
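That 300x figure is consistent with a back-of-the-envelope kinetic-energy check. Assuming round, illustrative numbers rather than measured values—a stony density of about 2,600 kg/m³ and a typical impact speed of about 13 km/s—the arithmetic goes like this:

$$
m \approx \rho\,\tfrac{4}{3}\pi r^{3} \approx (2600\ \mathrm{kg/m^{3}})\,\tfrac{4}{3}\pi\,(185\ \mathrm{m})^{3} \approx 7\times10^{10}\ \mathrm{kg},
$$
$$
E = \tfrac{1}{2}mv^{2} \approx \tfrac{1}{2}\,(7\times10^{10}\ \mathrm{kg})\,(1.3\times10^{4}\ \mathrm{m/s})^{2} \approx 6\times10^{18}\ \mathrm{J} \approx 1{,}400\ \mathrm{Mt\ of\ TNT}.
$$

Against common Tunguska estimates of a few megatons, that works out to a few hundred times the energy.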

Astronomers estimate that an asteroid this large comes this close to Earth only about once every 7,500 years. It also appears to be a stony, non-metallic type of asteroid known as an ordinary chondrite. This is the most common type of asteroid in the Solar System.

Areas of the planet that will be able to see Apophis at its closest approach to Earth in April 2029. Credit: Rick Binzel

All of this is rather convenient for scientists hoping to understand more about potential asteroids that might pose a serious threat to the planet.

The real cherry on top with the forthcoming encounter is that Apophis will be perturbed by Earth’s gravitational pull.

“Nature is handing us an incredibly rare experiment where the Earth’s gravity is going to tug and stretch this asteroid,” Binzel said. “By seeing how the asteroid responds, we’ll know how it is put together, and knowing how an asteroid is put together is maybe the most important information we could have if humanity ever faces an asteroid threat.”

In nearly seven decades of spaceflight, humans have only ever probed the interior of three celestial bodies: the Earth, the Moon, and Mars. We’re now being offered the opportunity to probe a fourth, right on our doorstep.

But the clock is ticking.

Chasing Apophis

On paper, at least, NASA has a plan to rendezvous with Apophis. About three years ago, after a senior-level review, NASA extended the mission of the OSIRIS-REx spacecraft to rendezvous with Apophis.

As you may recall, this oddly named spacecraft collected a sample from another asteroid, Bennu, in October 2020. Afterward, a small return capsule departed from the main spacecraft and made its way back to Earth. Since then, an $800 million spacecraft specifically designed to fly near and touch an asteroid has been chilling in space.

So it made sense when NASA decided to fire up the mission, newly rechristened OSIRIS-Apex, and re-vector it toward Apophis. It has been happily flying toward such a rendezvous for a few years. The plan was for Apex to catch up to Apophis shortly after its encounter with Earth and study it for about 18 months.

“The most cost-efficient thing you can do in spaceflight is continue with a healthy spacecraft that is already operating in space,” Binzel said.

And that was the plan until the Trump administration released its budget proposal for fiscal year 2026. In its detailed budget information, the White House provided no real rationale for the cancellation, simply stating, “Operating missions that have completed their prime missions (New Horizons and Juno) and the follow-on mission to OSIRIS-REx, OSIRIS-Apophis Explorer, are eliminated.”

It’s unclear how much of a savings this would actually yield. Apex is a pittance in NASA’s overall budget: The operating funds to keep the mission alive in 2024, for example, were $14.5 million, and annual costs would be similar through the end of the decade. That is less than one-thousandth of NASA’s budget, by the way.

“Apex is already on its way to reach Apophis, and to turn it off would be an incredible waste of resources,” Binzel said.

Congress, of course, ultimately sets the budget. It will have the final say. But it’s clear that NASA’s primary mission to study a once-in-a-lifetime asteroid is at serious risk.

So what are the alternatives?

Going international and into the private sector

NASA was not the only space agency targeting Apophis. Nancy Chabot, a planetary scientist at the Johns Hopkins University Applied Physics Laboratory, has been closely tracking other approaches.

The European Space Agency has proposed a mission named Ramses to rendezvous with the asteroid and accompany it as it flies by Earth. This mission would be valuable, conducting a thorough before-and-after survey of the asteroid’s shape, surface, orbit, rotation, and orientation.

It would need to launch by April 2028. Recognizing this short deadline, the space agency has directed European scientists and engineers to begin preliminary work on the mission. But a final decision to proceed and commit to the mission will not be made before the space agency’s ministerial meeting in November.

Artist’s impression of ESA’s Rapid Apophis Mission for Space Safety (Ramses). Credit: ESA

This is no sure thing. For example, Chabot said, in 2016, the Asteroid Impact Mission was expected to advance before European ministers decided not to fund it. It is also not certain that the Ramses mission would be ready to fly in less than three years, a short timeline for planetary science missions.

Japan’s space agency, JAXA, is also planning an asteroid mission named Destiny+ whose primary goal is flying to an asteroid named 3200 Phaethon. The mission has been delayed multiple times, so its launch is now being timed to permit a single flyby of Apophis in February 2029 on the way to its destination. While this mission is designed to deliver quality science, a flyby provides limited data. It is also unclear how close Destiny+ will actually get to Apophis, Chabot said.

There are also myriad other concepts, commercial and otherwise, to characterize Apophis before, during, and after its encounter with Earth. Ideally, scientists say, a mission would fly to the asteroid before April 2029 and scatter seismometers on the surface to collect data.

But all of this would require significant funding. If not from NASA, who? The uncertain future of NASA’s support for Apex has led some scientists to think about philanthropy.

For example, NASA’s Janus spacecraft have been mothballed for a couple of years, but they could be used for observational purposes if they had—say—a Falcon 9 to launch them at the appropriate time.

A new, private reconnaissance mission could probably be developed for $250 million or less, industry officials told Ars. There is still enough time, barely, for a private group to work with scientists to develop instrumentation that could be added to an off-the-shelf spacecraft bus to get out to Apophis before its Earth encounter.

Private astronaut Jared Isaacman, who has recently indicated a willingness to support robotic exploration in strategic circumstances, confirmed to Ars that several people have reached out about his interest in financially supporting an Apophis mission. “I would say that I’m in info-gathering mode and not really rushing into anything,” Isaacman said.

The problem is that, at this very moment, Apophis is rushing this way.

Photo of Eric Berger

Eric Berger is the senior space editor at Ars Technica, covering everything from astronomy to private space to NASA policy, and author of two books: Liftoff, about the rise of SpaceX; and Reentry, on the development of the Falcon 9 rocket and Dragon. A certified meteorologist, Eric lives in Houston.

An exceedingly rare asteroid flyby will happen soon, but NASA may be left on the sidelines Read More »

the-axion-may-help-clean-up-the-messy-business-of-dark-matter

The axion may help clean up the messy business of dark matter


We haven’t found evidence of the theoretical particle, but it’s still worth investigating.

In recent years, a curious hypothetical particle called the axion, invented to address challenging problems with the strong nuclear force, has emerged as a leading candidate to explain dark matter. Although the potential for axions to explain dark matter has been around for decades, cosmologists have only recently begun to seriously search for them. Not only might they be able to resolve some issues with older hypotheses about dark matter, but they also offer a dizzying array of promising avenues for finding them.

But before digging into what the axion could be and why it’s so useful, we have to explore why the vast majority of physicists, astronomers, and cosmologists accept the evidence that dark matter exists and that it’s some new kind of particle. While it’s easy to dismiss the dark matter hypothesis as some sort of modern-day epicycle, the reality is much more complex (to be fair to epicycles, it was an excellent idea that fit the data extremely well for many centuries).

The short version is that nothing in the Universe adds up.

We have many methods available to measure the mass of large objects like galaxies and clusters. We also have various methods to assess the effects of matter in the Universe, like the details of the cosmic microwave background or the evolution of the cosmic web. There are two broad categories: methods that rely solely on estimating the amount of light-emitting matter and methods that estimate the total amount of matter, whether it’s visible or not.

For example, if you take a picture of a generic galaxy, you’ll see that most of the light-emitting matter is concentrated in the core. But when you measure the rotation rate of the galaxy and use that to estimate the total amount of matter, you get a much larger number, plus some hints that it doesn’t perfectly overlap with the light-emitting stuff. The same thing happens for clusters of galaxies—the dynamics of galaxies within a cluster suggest the presence of much more matter than what we can see, and the two types of matter don’t always align. When we use gravitational lensing to measure a cluster’s contents, we again see evidence for much more matter than is plainly visible.
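To make that bookkeeping concrete with a rough worked example—using round, illustrative numbers rather than measurements of any particular galaxy—a rotation curve that stays flat at about 220 km/s out to 50 kiloparsecs implies an enclosed mass of

$$
M(<r) \approx \frac{v^{2}r}{G} \approx \frac{(2.2\times10^{5}\,\mathrm{m/s})^{2}\,(1.5\times10^{21}\,\mathrm{m})}{6.67\times10^{-11}\,\mathrm{m^{3}\,kg^{-1}\,s^{-2}}} \approx 1.1\times10^{42}\,\mathrm{kg} \approx 5\times10^{11}\,M_{\odot},
$$

roughly an order of magnitude more than the tens of billions of solar masses visible as stars and gas.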

The tiny variations in the cosmic microwave background tell us about the influence of both matter that interacts with light and matter that doesn’t. It clearly shows that some invisible component dominated the early Universe. When we look at the large-scale structure, invisible matter rules the day. Matter that doesn’t interact with light can form structures much more quickly than matter that gets tangled up by interacting with itself. Without invisible matter, galaxies like the Milky Way can’t form quickly enough to match observations of the early Universe.

The calculations of Big Bang nucleosynthesis, which correctly predict the abundances of hydrogen and helium in the Universe, put strict constraints on how much light-emitting matter there can be, and that number simply isn’t large enough to accommodate all these disparate results.

Across cosmic scales in time and space, the evidence just piles up: There’s more stuff out there than meets the eye, and it can’t simply be dim-but-otherwise-regular matter.

Weakness of WIMPs

Since pioneering astronomer Vera Rubin first revealed dark matter in a big way in the 1970s, the astronomical community has tried every idea it could think of to explain these observations. One tantalizing possibility is that dark matter is entirely the wrong approach; instead, we’re misunderstanding gravity itself. But so far, half a century later, all attempts to modify gravity ultimately fail one observational test or another. In fact, the most popular modified gravity theory, known as MOND, still requires the existence of dark matter, just less of it.

As the evidence piled up for dark matter in the 1980s and ’90s, astronomers began to favor a particular explanation known as WIMPs, for weakly interacting massive particles. WIMPs weren’t just made up on the spot. They were motivated by particle physics and our attempts to create theories beyond the Standard Model. Many extensions to the Standard Model predicted the existence of WIMP-like particles that could be made in abundance in the early Universe, generating a population of heavy-ish particles that remained largely in the cosmic background.

WIMPs seemed like a good idea, as they could both explain the dark matter problem and bring us to a new understanding of fundamental physics. The idea is that we are swimming in an invisible sea of dark matter particles that almost always simply pass through us undetected. But every once in a while, a WIMP should interact via the weak nuclear force (hence the origin of its name) and give off a shower of byproducts. One problem: We needed to detect one of these rare interactions. So experiments sprang up around the world to catch an elusive dark matter candidate.

With amazing names like CRESST, SNOLAB, and XENON, these experiments have spent years searching for a WIMP to no avail. They’re not an outright failure, though; instead, with every passing year, we know more and more about what the WIMP can’t be—what mass ranges and interaction strengths are now excluded.

By now, that list of what the WIMP can’t be is rather long, and large regions within the space of possibilities are now hard-and-fast ruled out.

OK, that’s fine. I mean, it’s a huge bummer that our first best guess didn’t pan out, but nature is under no obligation to make this easy for us. Maybe the dark matter isn’t a WIMP at all.

Other entities are sitting around in the particle physics attic that we might be able to use to explain this deep cosmic mystery. And one of those hypothetical particles is called the axion.

Cleaning up with axions

It was the late 1970s, and physicist Frank Wilczek was shopping for laundry detergent. He found one brand standing out among the bottles: Axion. He thought that would make an excellent name for a particle.

He was right.

For decades, physicists had been troubled by a little detail of the theory used to explain the strong nuclear force, known as quantum chromodynamics. By all measurements, that force obeys charge-parity symmetry, which means if you take an interaction, flip all the charges around, and run it in a mirror, you’ll get the same result. But quantum chromodynamics doesn’t enforce that symmetry on its own.

It seemed to be a rather fine-tuned state of affairs, with the strong force unnaturally maintaining a symmetry when there was nothing in the theory to explain why.

In 1977, Roberto Peccei and Helen Quinn discovered an elegant solution: Introducing a new field into the Universe could naturally restore charge-parity symmetry to the equations of quantum chromodynamics. The next year, Wilczek and Steven Weinberg independently realized that this new field would imply the existence of a particle.

The axion.

Dark matter was just coming on the cosmic scene. Axions weren’t invented to solve that problem, but physicists very quickly realized that the complex physics of the early Universe could absolutely flood the cosmos with axions. What’s more, they would largely ignore regular matter and sit quietly in the background. In other words, the axion was an excellent dark matter candidate.

But axions were pushed aside as the WIMPs hypothesis gained more steam. Back-of-the-envelope calculations showed that the natural mass range of the WIMP would precisely match the abundances needed to explain the amount of dark matter in the Universe, with no other fine-tuning or adjustments required.

Never ones to let the cosmologists get in the way of a good time, the particle physics community kept up interest in the axion, finding different variations on the particle and devising clever experiments to see if the axion existed. One experiment requires nothing more than a gigantic magnet since, in an extremely strong magnetic field, axions can spontaneously convert into photons.
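In the simplest textbook treatment—natural units, a uniform field, and an axion light enough that the axion and photon waves stay in phase across the magnet—the conversion probability scales as

$$
P_{a\to\gamma} \simeq \left(\frac{g_{a\gamma}\,B\,L}{2}\right)^{2},
$$

where $g_{a\gamma}$ is the axion-photon coupling, $B$ is the magnetic field strength, and $L$ is the length of the field region. That quadratic dependence on both field and length is why these searches push for the biggest, strongest magnets they can get.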

To date, no hard evidence for the axion has shown up. But WIMPs have proven to be elusive, so cosmologists are showing more love to the axion and identifying surprising ways that it might be found.

A sloshy Universe

Axions are tiny, even for subatomic particles. The lightest known massive particle is the neutrino, which weighs no more than 0.086 electron-volts (or eV). Compare that to, say, the electron, which weighs over half a million eV. The exact mass of the axion isn’t known, and there are many models and versions of the particle, but it can have a mass all the way down to a trillionth of an eV… and even lower.

In fact, axions belong to a much broader class of “ultra-light” dark matter particle candidates, which can have masses down to 10^-24 eV. This is multiple billions of times lighter than the WIMPs—and indeed most of the particles of the Standard Model.

That means axions and their friends act nothing like most of the particles of the Standard Model.

First off, it may not even be appropriate to refer to them as particles. They have so little mass that their de Broglie wavelength—the size of the quantum wave associated with every particle—can stretch to macroscopic proportions. In some cases, this wavelength can be a few meters across. In others, it’s comparable to a star or a solar system. In still others, a single axion “particle” can stretch across an entire galaxy.
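To put rough numbers on those claims, here is a quick back-of-the-envelope sketch (mine, not the author’s) using the de Broglie relation λ = h/(mv) and assuming a typical galactic orbital speed of about 200 km/s:

# de Broglie wavelengths for ultra-light dark matter candidates.
# Assumes (my choice) particles moving at a typical galactic ~200 km/s.
h = 6.626e-34          # Planck's constant, J*s
eV_to_J = 1.602e-19    # joules per electron-volt
c = 3.0e8              # speed of light, m/s
v = 2.0e5              # assumed particle speed, m/s

def de_broglie_wavelength_m(mass_eV):
    """Wavelength in meters for a particle of the given mass in eV/c^2."""
    mass_kg = mass_eV * eV_to_J / c**2
    return h / (mass_kg * v)

for mass in (1e-12, 1e-18, 1e-24):  # a trillionth of an eV and lighter
    print(f"{mass:.0e} eV -> {de_broglie_wavelength_m(mass):.1e} m")

A 10^-12 eV particle comes out near two million kilometers, comparable to a star; at 10^-24 eV, the wavelength swells to roughly 10^21 meters, tens of kiloparsecs, the scale of an entire galaxy.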

In this view, the individual axion particles would be subsumed into a larger quantum wave, like an ocean of dark matter so large and vast that it doesn’t make sense to talk about its individual components.

And because axions are bosons, they can synchronize their quantum wave nature, becoming a distinct state of matter: a Bose-Einstein condensate. In a Bose-Einstein condensate, most of the particles share the same low-energy state. When this happens, the de Broglie wavelength is larger than the average separation between the particles, and the waves of the individual particles all add up together, creating, in essence, a super-particle.
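That overlap condition can be written compactly. In standard notation (a generic textbook criterion, not one tied to any particular axion model), condensation becomes possible roughly when

\lambda_{\mathrm{dB}} = \frac{h}{m v} \gtrsim n^{-1/3}

where h is Planck’s constant, m and v are the particle’s mass and speed, and n is the number density of particles. A fixed dark matter mass density \rho supplies n = \rho / m particles per unit volume, so lowering the mass m both stretches the wavelength and packs in more particles, making the condition dramatically easier to satisfy.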

In this way, we may get axion “stars”—clumps of axions acting as a single particle. Some of these axion stars may be a few thousand kilometers across, wandering through interstellar space. Others may be the size of galactic cores, which might explain an issue with the traditional WIMP picture.

The best description of dark matter in general is that it is “cold,” meaning that the individual particles do not move fast compared to the speed of light. This allows them to gravitationally interact and form the seeds of structures like galaxies and clusters. But this process is a bit too efficient. According to simulations, cold dark matter tends to form more small, sub-galactic clumps than we observe, and it tends to make the cores of galaxies much, much denser than we see.

Axions, and ultra-light dark matter in general, can provide a solution here because they would operate in two modes. At large scales, they can act like regular cold dark matter. But inside galaxies, they can condense, forming tight clumps. Critically, these clumps have uniform densities within them. This smooths out the distribution of axions within galaxies, preventing the formation of smaller clumps and ultra-dense cores.

A messy affair

Over the decades, astronomers and physicists have found an astounding variety of ways that axions might reveal their presence in the Universe. Because of their curious ability to transmute into photons in the presence of strong magnetic fields, any place that features strong fields—think neutron stars or even the solar corona—could produce extra radiation due to axions. That makes them excellent hunting grounds for the particles.

Axion stars—also sometimes known provocatively as dark stars—would be all but invisible under most circumstances. That is, until they destabilize in a cascading chain reaction of axion-to-photon conversion and blow themselves up.

Even the light from distant galaxies could betray the existence of axions. If they exist in a dense swarm surrounding a galaxy, their occasional conversion to photons would contribute to the galaxy’s light, creating a signal that the James Webb Space Telescope could pick up.

To date, despite all these ideas, there hasn’t been a single shred of solid evidence for the existence of axions, which naturally drops them down a peg or two on the credibility scale. But that doesn’t mean that axions aren’t worth investigating further. The experiments conducted so far only place limits on what properties they might have; there’s still plenty of room for viable axion and axion-like candidates, unlike their WIMPy cousins.

There’s definitely something funny going on with the Universe. The dark matter hypothesis—that there is a large, invisible component to matter in the Universe—isn’t that great an idea, but it’s the best one we have, fitting the widest range of available evidence. For a while, we thought we knew what the identity of that matter might be, and we spent decades (and small fortunes) on that search.

But while WIMPs were the mainstay hypothesis, that didn’t snuff out alternative paths. Dozens of researchers have investigated modified forms of gravity, with an equal lack of success. And a small cadre has kept the axion flame alive. It’s a good thing, too, since their explorations of obscure corners of particle physics laid the groundwork for fleshing out axions into a viable competitor to WIMPs.

No, we haven’t found any axions. And we still don’t know what the dark matter is. But it’s only by pushing forward—advancing new ideas, testing them against the reality of observations, and, when they fail, trying again—that we will come to a new understanding. Axions may or may not be dark matter; the best we can say is that they are promising. But who wouldn’t want to live in a Universe filled with dark stars, invisible Bose-Einstein condensates, and strange new particles?

The axion may help clean up the messy business of dark matter Read More »

curated-realities:-an-ai-film-festival-and-the-future-of-human-expression

Curated realities: An AI film festival and the future of human expression


We saw 10 AI films and interviewed Runway’s CEO as well as Hollywood pros.

A still from Total Pixel Space, the Grand Prix winner at AIFF 2025.

Last week, I attended a film festival dedicated to shorts made using generative AI. Dubbed AIFF 2025, it was an event precariously balanced between two different worlds.

The festival was hosted by Runway, a company that produces models and tools for generating images and videos. In panels and press briefings, a curated slate of industry professionals made the case for Hollywood to embrace AI tools. In private meetings, though, I gained a strong sense that a philosophical divide is already widening within the film and television business.

I also interviewed Runway CEO Cristóbal Valenzuela about the tightrope he walks as he pitches his products to an industry that has deeply divided feelings about what role AI will have in its future.

To unpack all this, it makes sense to start with the films, partly because the film that was chosen as the festival’s top prize winner says a lot about the issues at hand.

A festival of oddities and profundities

Since this was the first time the festival had been open to the public, the crowd was a diverse mix: AI tech enthusiasts, working industry creatives, and folks who enjoy movies and were curious about what they’d see—as well as quite a few people who fit into all three groups.

The scene at the entrance to the theater at AIFF 2025 in Santa Monica, California.

The films shown were all short, and most would be more at home at an art film fest than something more mainstream. Some shorts featured an animated aesthetic (including one inspired by anime) and some presented as live action. There was even a documentary of sorts. The films could be made entirely with Runway or other AI tools, or those tools could simply be a key part of a stack that also includes more traditional filmmaking methods.

Many of these shorts were quite weird. Most of us have seen by now that AI video-generation tools excel at producing surreal and distorted imagery—sometimes whether the person prompting the tool wants that or not. Several of these films leaned into that limitation, treating it as a strength.

Representing that camp was Vallée Duhamel’s Fragments of Nowhere, which visually explored the notion of multiple dimensions bleeding into one another. Cars morphed into the sides of houses, and humanoid figures, purported to be inter-dimensional travelers, moved in ways that defied anatomy. While I found this film visually compelling at times, I wasn’t seeing much in it that I hadn’t already seen from dreamcore or horror AI video TikTok creators like GLUMLOT or SinRostroz in recent years.

More compelling were shorts that used this propensity for oddity to generate imagery that was curated and thematically tied to some aspect of human experience or identity. For example, More Tears than Harm by Herinarivo Rakotomanana was a rotoscope animation-style “sensory collage of childhood memories” of growing up in Madagascar. Its specificity and consistent styling lent it a credibility that Fragments of Nowhere didn’t achieve. I also enjoyed Riccardo Fusetti’s Editorial on this front.

More Tears Than Harm, an unusual animated film at AIFF 2025.

Among the 10 films in the festival, two clearly stood above the others in my impressions—and they ended up being the Grand Prix and Gold prize winners. (The judging panel included filmmakers Gaspar Noé and Harmony Korine, Tribeca Enterprises CEO Jane Rosenthal, IMAX head of post and image capture Bruce Markoe, Lionsgate VFX SVP Brianna Domont, Nvidia developer relations lead Richard Kerris, and Runway CEO Cristóbal Valenzuela, among others).

Runner-up Jailbird was the aforementioned quasi-documentary. Directed by Andrew Salter, it was a brief piece that introduced viewers to a program in the UK that places chickens in human prisons as companion animals, to positive effect. Why make that film with AI, you might ask? Well, AI was used to achieve shots depicting the experience from the chicken’s point of view, which wouldn’t otherwise have been doable on a small budget. The crowd loved it.

Jailbird, the runner-up at AIFF 2025.

Then there was the Grand Prix winner, Jacob Adler’s Total Pixel Space, which was, among other things, a philosophical defense of the very idea of AI art. You can watch Total Pixel Space on YouTube right now, unlike some of the other films. I found it strangely moving, even as I saw its selection as the festival’s top winner with some cynicism. Of course they’d pick that one, I thought, although I agreed it was the most interesting of the lot.

Total Pixel Space, the Grand Prix winner at AIFF 2025.

Total Pixel Space

Even though it risked navel-gazing and self-congratulation in this venue, Total Pixel Space was filled with compelling imagery that matched the themes, and it touched on some genuinely interesting ideas—at times, it seemed almost profound, didactic as it was.

“How many images can possibly exist?” the film’s narrator asks. To answer that, the film explains the concept of total pixel space, which reflects how image generation tools actually work:

Pixels are the building blocks of digital images—tiny tiles forming a mosaic. Each pixel is defined by numbers representing color and position. Therefore, any digital image can be represented as a sequence of numbers…

Just as we don’t need to write down every number between zero and one to prove they exist, we don’t need to generate every possible image to prove they exist. Their existence is guaranteed by the mathematics that defines them… Every frame of every possible film exists as coordinates… To deny this would be to deny the existence of numbers themselves.

The nine-minute film demonstrates that the number of possible images or films is greater than the number of atoms in the universe and argues that photographers and filmmakers may be seen as discovering images that already exist in the possibility space rather than creating something new.
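That arithmetic is easy to verify. Here’s a quick check of my own (the 1080p frame size and 24-bit color depth are my assumptions, not figures from the film):

import math

# Count the distinct 1080p images with 24-bit color.
pixels = 1920 * 1080    # pixels per frame (assumed resolution)
# Each pixel has 256**3 possible colors (one byte per RGB channel),
# so there are (256**3) ** pixels possible images. That number is far
# too large to print in full; count its decimal digits instead.
digits = pixels * 3 * math.log10(256)
print(f"about 10^{digits:,.0f} possible images")  # about 10^14,981,179

The observable universe contains roughly 10^80 atoms, so “more images than atoms” wildly understates the case.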

Within that framework, it’s easy to argue that generative AI is just another way for artists to “discover” images.

The balancing act

“We are all—and I include myself in that group as well—obsessed with technology, and we keep chatting about models and data sets and training and capabilities,” Runway CEO Cristóbal Valenzuela said to me when we spoke the next morning. “But if you look back and take a minute, the festival was celebrating filmmakers and artists.”

I admitted that I found myself moved by Total Pixel Space‘s articulations. “The winner would never have thought of himself as a filmmaker, and he made a film that made you feel something,” Valenzuela responded. “I feel that’s very powerful. And the reason he could do it was because he had access to something that just wasn’t possible a couple of months ago.”

First-time and outsider filmmakers were the focus of AIFF 2025, but Runway works with established studios, too—and those relationships have an inherent tension.

The company has signed deals with companies like Lionsgate and AMC. In some cases, it trains on data provided by those companies; in others, it embeds within them to try to develop tools that fit how they already work. That’s not something competitors like OpenAI are doing yet, so that, combined with a head start in video generation, has allowed Runway to grow and stay competitive so far.

“We go directly into the companies, and we have teams of creatives that are working alongside them. We basically embed ourselves within the organizations that we’re working with very deeply,” Valenzuela explained. “We do versions of our film festival internally for teams as well so they can go through the process of making something and seeing the potential.”

Founded in 2018 at New York University’s Tisch School of the Arts by two Chileans and one Greek co-founder, Runway has a very different story than its Silicon Valley competitors. It was one of the first to bring an actually usable video-generation tool to the masses. Runway also contributed in foundational ways to the popular Stable Diffusion model.

Though it is vastly outspent by competitors like OpenAI, it has taken a hands-on approach to working with existing industries. You won’t hear Valenzuela or other Runway leaders talking about the imminence of AGI or anything so lofty; instead, it’s all about selling the product as something that can solve existing problems in creatives’ workflows.

Still, an artist’s mindset and relationships within the industry don’t negate some fundamental conflicts. There are multiple intellectual property cases involving Runway and its peers, and though the company hasn’t admitted it, there is evidence that it trained its models on copyrighted YouTube videos, among other things.

Cristóbal Valenzuela speaking on the AIFF 2025 stage. Credit: Samuel Axon

Valenzuela suggested that studios are worried about liability, not underlying principles, though, saying:

Most of the concerns on copyright are on the output side, which is like, how do you make sure that the model doesn’t create something that already exists or infringes on something. And I think for that, we’ve made sure our models don’t and are supportive of the creative direction you want to take without being too limiting. We work with every major studio, and we offer them indemnification.

In the past, he has also defended Runway by saying that what it’s producing is not a re-creation of what has come before. He sees the tool’s generative process as distinct—legally, creatively, and ethically—from simply pulling up assets or references from a database.

“People believe AI is sort of like a system that creates and conjures things magically with no input from users,” he said. “And it’s not. You have to do that work. You still are involved, and you’re still responsible as a user in terms of how you use it.”

He seemed to hold this defense of AI as a legitimate tool for artists with conviction, but given that he’s been pitching these products directly to working filmmakers, he was also clearly aware that not everyone agrees with him. There isn’t even a consensus within the industry.

An industry divided

While in LA for the event, I visited separately with two of my oldest friends. Both of them work in the film and television industry in similar disciplines. They each asked what I was in town for, and I told them I was there to cover an AI film festival.

One immediately responded with a grimace of disgust, “Oh, yikes, I’m sorry.” The other responded with bright eyes and intense interest and began telling me how he already uses AI in his day-to-day to do things like extend shots by a second or two for a better edit, and expressed frustration at his company for not adopting the tools faster.

Neither is alone in their attitudes. Hollywood is divided—and not for the first time.

There have been seismic technological changes in the film industry before. There was the transition from silent films to talkies, obviously; moviemaking transformed into an entirely different art. Numerous old jobs were lost, and numerous new jobs were created.

Later, there was the transition from film to digital projection, which may be an even tighter parallel. It was a major disruption, with some companies and careers collapsing while others rose. There were people saying, “Why do we even need this?” while others believed it was the only sane way forward. Some audiences declared the quality worse, and others said it was better. There were analysts arguing it could be stopped, while others insisted it was inevitable.

IMAX’s head of post production, Bruce Markoe, spoke briefly about that history at a press mixer before the festival. “It was a little scary,” he recalled. “It was a big, fundamental change that we were going through.”

People ultimately embraced it, though. “The motion picture and television industry has always been very technology-forward, and they’ve always used new technologies to advance the state of the art and improve the efficiencies,” Markoe said.

When asked whether he thinks the same thing will happen with generative AI tools, he said, “I think some filmmakers are going to embrace it faster than others.” He pointed to AI tools’ usefulness for pre-visualization as particularly valuable and noted that some people are already using them that way, but he said it will take time for people to get comfortable with the tools.

And indeed, many, many filmmakers are still loudly skeptical. “The concept of AI is great,” The Mitchells vs. the Machines director Mike Rianda said in a Wired interview. “But in the hands of a corporation, it is like a buzzsaw that will destroy us all.”

Others are interested in the technology but are concerned that it’s being brought into the industry too quickly, with insufficient planning and protections. That includes Crafty Apes Senior VFX Supervisor Luke DiTomasso. “How fast do we roll out AI technologies without really having an understanding of them?” he asked in an interview with Production Designers Collective. “There’s a potential for AI to accelerate beyond what we might be comfortable with, so I do have some trepidation and am maybe not gung-ho about all aspects of it.”

Others remain skeptical that the tools will be as useful as some optimists believe. “AI never passed on anything. It loved everything it read. It wants you to win. But storytelling requires nuance—subtext, emotion, what’s left unsaid. That’s something AI simply can’t replicate,” said Alegre Rodriquez, a member of the Emerging Technology committee at the Motion Picture Editors Guild.

The mirror

Flying back from Los Angeles, I considered two key differences between this generative AI inflection point for Hollywood and the silent/talkie or film/digital transitions.

First, neither of those transitions involved an existential threat to the technology on the basis of intellectual property and copyright. Valenzuela talked about what matters to studio heads—protection from liability over the outputs. But the countless creatives who are critical of these tools also believe they should be consulted and even compensated for their work’s use in the training data for Runway’s models. In other words, it’s not just about the outputs, it’s also about the sourcing. As noted before, there are several cases underway. We don’t know where they’ll land yet.

Second, there’s a more cultural and philosophical issue at play, which Valenzuela himself touched on in our conversation.

“I think AI has become this sort of mirror where anyone can project all their fears and anxieties, but also their optimism and ideas of the future,” he told me.

You don’t have to scroll for long to come across techno-utopians declaring with no evidence that AGI is right around the corner and that it will cure cancer and save our society. You also don’t have to scroll long to encounter visceral anger at every generative AI company from people declaring the technology—which is essentially just a new methodology for programming a computer—fundamentally unethical and harmful, with apocalyptic societal and economic ramifications.

Amid all those bold declarations, this film festival put the focus on the on-the-ground reality. First-time filmmakers who might never have cleared Hollywood’s gatekeepers are getting screened at festivals because they can create competitive-looking work with a fraction of the usual crew and hours. Studios and the people who work there say they’re saving time, resources, and headaches in pre-viz, editing, visual effects, and other work that’s usually done under immense pressure.

“People are not paying attention to the very huge amount of positive outcomes of this technology,” Valenzuela told me, pointing to those examples.

In this online discussion ecosystem that elevates outrage above everything else, that’s likely true. Still, there is a sincere and rigorous conviction among many creatives that their work is contributing to this technology’s capabilities without credit or compensation and that the structural and legal frameworks to ensure minimal human harm in this evolving period of disruption are still inadequate. That’s why we’ve seen groups like the Writers Guild of America West support the Generative AI Copyright Disclosure Act and other similar legislation meant to increase transparency about how these models are trained.

The philosophical question with a legal answer

The winning film argued that “total pixel space represents both the ultimate determinism and the ultimate freedom—every possibility existing simultaneously, waiting for consciousness to give it meaning through the act of choice.”

In making this statement, the film suggested that creativity, above all else, is an act of curation. It’s a claim that nothing, truly, is original. It’s a distillation of human expression into the language of mathematics.

To many, that philosophy rings undeniably true: Every possibility already exists, and artists are just collapsing the waveform to the frame they want to reveal. To others, there is more personal truth to the romantic ideal that artwork is valued precisely because it did not exist until the artist produced it.

All this is to say that the debate about creativity and AI in Hollywood is ultimately a philosophical one. But it won’t be resolved that way.

The industry may succumb to litigation fatigue and a hollowed-out workforce—or it may instead find its way to fair deals, new opportunities for fresh voices, and transparent training sets.

For all this lofty talk about creativity and ideas, the outcome will come down to the contracts, court decisions, and compensation structures—all things that have always been at least as big a part of Hollywood as the creative work itself.

Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.

Curated realities: An AI film festival and the future of human expression Read More »

how-a-grad-student-got-lhc-data-to-play-nice-with-quantum-interference

How a grad student got LHC data to play nice with quantum interference


New approach is already having an impact on the experiment’s plans for future work.

The ATLAS particle detector of the Large Hadron Collider (LHC) at CERN, the European Organization for Nuclear Research, in Geneva, Switzerland. Credit: EThamPhoto/Getty Images

Measurements at the Large Hadron Collider have been stymied by one of the most central phenomena of the quantum world. But now, a young researcher has championed a new method to solve the problem using deep neural networks.

The Large Hadron Collider is one of the biggest experiments in history, but it’s also one of the hardest to interpret. Unlike seeing an image of a star in a telescope, saying anything at all about the data that comes out of the LHC requires careful statistical modeling.

“If you gave me a theory [that] the Higgs boson is this way or that way, I think people imagine, ‘Hey, you built the experiment, you should be able to tell me what you’re going to see under various hypotheses!’” said Daniel Whiteson, a professor at the University of California, Irvine. “But we don’t.”

One challenge with interpreting LHC data is interference, a core implication of quantum mechanics. Interference allows two possible events to inhibit each other, weakening the likelihood of seeing the result of either. In the presence of interference, physicists have had to use a fuzzier statistical method to analyze data, sacrificing some of the data’s power and increasing its uncertainty.

However, a recent breakthrough suggests a different way to tackle the problem. The ATLAS collaboration, one of two groups studying proton collisions at the LHC, released two papers last December that describe new ways of exploring data from their detector. One describes how to use a machine learning technique called Neural Simulation-Based Inference to maximize the potential of particle physics data. The other demonstrates its effectiveness with the ultimate test: re-doing a previous analysis with the new technique and seeing dramatic improvement.

The papers are the culmination of a young researcher’s six-year quest to convince the collaboration of the value of the new technique. Its success is already having an impact on the experiment’s plans for future work.

Making sense out of fusing bosons

Each particle collision at the LHC involves many possible pathways in which different particles combine to give rise to the spray of debris that experimenters see. In 2017, David Rousseau at IJCLab in Orsay, a member of the ATLAS collaboration, asked one of his students, Aishik Ghosh, to improve his team’s ability to detect a specific pathway. That particular pathway is quite important since it’s used to measure properties of the Higgs boson, a particle (first observed in 2012) that helps explain how other fundamental particles get their mass.

It was a pretty big ask. “When a grad student gets started in ATLAS, they’re a tiny cog in a giant, well-oiled machine of 3,500 physicists, who all seem to know exactly what they’re doing,” said Ghosh.

The pathway Ghosh was asked to study occurs via several steps. First, the two colliding protons each emit a W boson, a particle associated with the weak nuclear force. These two bosons fuse together, changing their identity to form a Higgs boson. The Higgs boson then decays, forming a pair of Z bosons, another particle associated with the weak force. Finally, those Z bosons themselves each decay into a lepton, like an electron, and its antimatter partner, like a positron.

A Feynman diagram for the pathway studied by Aishik Ghosh. Credit: ATLAS

Measurements like the one Ghosh was studying are a key way of investigating the properties of the Higgs boson. By precisely measuring how long it takes the Higgs boson to decay, physicists could find evidence of it interacting with new, undiscovered particles that are too massive for the LHC to produce directly.

Ghosh started on the project, hoping to find a small improvement in the collaboration’s well-tested methods. Instead, he noticed a larger issue. The goal he was given, of detecting a single pathway by itself, didn’t actually make sense.

“I was doing that and I realized, ‘What am I doing?’ There’s no clear objective,” said Ghosh.

The problem was quantum interference.

How quantum histories interfere

One of the most famous demonstrations of the mysterious nature of quantum mechanics is the double-slit experiment. In this demonstration, electrons are shot at a screen with two slits that allow them to pass through to a photographic plate on the other side. With one slit covered, the electrons form a pattern centered on the opening. The photographic plate lights up brightly directly across from the slit and dims farther away from it.

With both slits open, you would expect the pattern simply to get brighter as more electrons reach the photographic plate. Instead, something stranger happens. The two slits do not give rise to two nice bright peaks; you see a rippling pattern in which some areas get brighter while others get dimmer, even though the dimmer areas should, in principle, be easier for electrons to reach.

The effect happens even if the electrons are shot at the screen one by one to stop them from influencing each other directly. It’s as if each electron carries with it two possible histories, one in which it goes through one slit and another where it goes through the other before both end up at the same place. These two histories interfere with each other so that some destinations become less likely instead of more likely.

Results of the double-slit experiment. Credit: Jordgette (CC BY-SA 3.0)

For electrons in the double-slit experiment, the two different histories are two different paths through space. For a measurement at the Large Hadron Collider, the histories are more abstract—paths that lead through transformations of fields. One history might be like the pathway Ghosh was asked to study, in which two W bosons fuse to form a Higgs boson before the Higgs boson splits into two Z bosons. But in another history, the two W bosons might fuse and immediately split into two Z bosons without ever producing a Higgs.

Both histories have the same beginning, with two W bosons, and the same end, with two Z bosons. And just as the two histories of electrons in the double-slit experiment can interfere, so can the two histories for these particles.

Another possible history for colliding particles at the Large Hadron Collider, which interferes with the measurement Ghosh was asked to do. Credit: ATLAS

That interference makes the effect of the Higgs boson much more challenging to spot. ATLAS scientists wanted to look for two pairs of electrons and positrons, which would provide evidence that two Z bosons were produced. They would classify their observations into two types: observations that are evidence for the signal they were looking for (that of a decaying Higgs boson) and observations of events that generate this pattern of particles without the Higgs boson acting as an intermediate (the latter are called the background). But the two types of observations, signal and background, interfere. With a stronger signal, corresponding to more Higgs bosons decaying, you might observe more pairs of electrons and positrons… but if these events interfere, you also might see those pairs disappear.
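In the language of quantum amplitudes (my gloss on standard quantum mechanics, not notation from the ATLAS papers), the trouble is that the expected event rate follows the square of a sum rather than a sum of squares:

P \propto \left|A_{\mathrm{signal}} + A_{\mathrm{background}}\right|^{2} = \left|A_{\mathrm{signal}}\right|^{2} + \left|A_{\mathrm{background}}\right|^{2} + 2\,\mathrm{Re}\!\left(A_{\mathrm{signal}}^{*}\,A_{\mathrm{background}}\right)

The final cross term can be negative, so turning up the signal amplitude can decrease the number of events seen in parts of the data: those are the disappearing pairs described above.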

Learning to infer

In traditional approaches, those disappearances are hard to cope with, even when using methods that already incorporate machine learning.

One of the most common uses of machine learning is classification—for example, distinguishing between pictures of dogs and cats. You train the machine on pictures of cats and pictures of dogs, and it tells you, given a picture, which animal is the most likely match. Physicists at the LHC were already using this kind of classification method to characterize the products of collisions, but it functions much worse when interference is involved.

“If you have something that disappears, you don’t quite know what to train on,” said David Rousseau. “Usually, you’re training signal versus background, exactly like you’re training cats versus dogs. When there is something that disappears, you don’t see what you trained on.”

At first, Ghosh tried a few simple tricks, but as time went on, he realized he needed to make a more fundamental change. He reached out to others in the community and learned about a method called Neural Simulation-Based Inference, or NSBI.

In older approaches, people had trained machine learning models to classify observations into signal and background, using simulations of particle collisions to make the training data. Then they used that classification to infer the most likely value of a number, like the amount of time it takes a Higgs boson to decay, based on data from an actual experiment. Neural Simulation-Based Inference skips the classification and goes directly to the inference.

Instead of trying to classify observations into signal and background, NSBI uses simulations to teach an artificial neural network to estimate a quantity called a likelihood ratio. Someone using NSBI would run several simulations that describe different situations, such as letting the Higgs boson decay at different rates, and then check how many of each type of simulation yielded a specific observation. The fraction of these simulations with a certain decay rate would provide the likelihood ratio, a measure of which decay rate is more likely given the experimental evidence. If the neural network is good at estimating this ratio, it will be good at finding how long the Higgs takes to decay.
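To make that concrete, here is a minimal toy sketch of the likelihood-ratio trick at the heart of this approach (my own illustration using Python and scikit-learn, not ATLAS code; the one-dimensional Gaussian “simulators” and every parameter value here are stand-ins). A network trained to distinguish events simulated under two hypotheses implicitly learns the likelihood ratio between them:

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Stand-ins for physics simulations under two hypotheses:
# Gaussians whose means play the role of the parameter being measured.
x0 = rng.normal(0.0, 1.0, size=(50_000, 1))  # hypothesis A
x1 = rng.normal(0.5, 1.0, size=(50_000, 1))  # hypothesis B

X = np.vstack([x0, x1])
y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])

# Train a small network to say which hypothesis produced each event.
clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=300,
                    random_state=0).fit(X, y)

# For balanced training data, the network's output s(x) estimates p(B|x),
# and the likelihood ratio is r(x) = p(x|B) / p(x|A) = s / (1 - s).
x_test = np.linspace(-3, 3, 7).reshape(-1, 1)
s = clf.predict_proba(x_test)[:, 1]
r_learned = s / (1 - s)

# Analytic ratio for these Gaussians, for comparison.
r_true = np.exp(0.5 * x_test[:, 0] - 0.125)
print(np.column_stack([x_test[:, 0], r_learned, r_true]))

In a real analysis, the “simulators” would be full detector simulations run at different Higgs decay rates, and the network would digest high-dimensional collision data rather than a single number, but the inference logic is the same.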

Because NSBI doesn’t try to classify observations into different categories, it handles quantum interference more effectively. Instead of trying to find the Higgs based on a signal that disappears, it examines all the data, trying to guess which decay time is the most likely.

Ghosh tested the method, which showed promising results on test data, and presented the results at a conference in 2019. But if he was going to convince the ATLAS collaboration that the method was safe to use, he still had a lot of work ahead of him.

Shifting the weight on ATLAS’ shoulders

Experiments like ATLAS have high expectations attached to them. A collaboration of thousands of scientists, ATLAS needs to not only estimate the laws of physics but also have a clear idea of just how uncertain those estimates are. At the time, NSBI hadn’t been tested in that way.

“None of this has actually been used on data,” said Ghosh. “Nobody knew how to quantify the uncertainties. So you have a neural network that gives you a likelihood. You don’t know how good the likelihood is. Is it well-estimated? What if it’s wrongly estimated just in some weird corner? That would completely bias your results.”

Checking those corners was too big a job for a single PhD student and too complex to complete within a single PhD degree. Ghosh would have to build a team, and he would need time to do it. That’s tricky in the academic world, where students go on to short-term postdoc jobs with the expectation that they quickly publish new results to improve their CV for the next position.

“We’re usually looking to publish the next paper within two to three years—no time to overhaul our methods,” said Ghosh. Fortunately, Ghosh had support. He completed his PhD under Rousseau and went to work with Daniel Whiteson, who encouraged him to pursue his ambitious project.

“I think it’s really important that postdocs learn to take those risks because that’s what science is,” Whiteson said.

Ghosh gathered his team. Another student of Rousseau’s, Arnaud Maury, worked to calibrate the machine’s confidence in its answers. A professor at the University of Massachusetts, Rafael Coelho Lopes de Sa, joined the project; his student Jay Sandesara would play a key role in getting the calculation to work at full scale on a computer cluster. IJCLab emeritus RD Schaffer and University of Liège professor Gilles Louppe provided cross-checks and advice.

The team wanted a clear demonstration that their method worked, so they took an unusual step. They took data that ATLAS had already analyzed and performed a full analysis using their method instead, showing that it could pass every check the collaboration could think of. They would publish two papers, one describing the method and the other giving the results of their upgraded analysis. Zach Marshall, who was the computing coordinator for ATLAS at the time, helped get the papers through, ensuring that they were vetted by experts in multiple areas.

“It was a very small subset of our community that had that overlap between this technical understanding and the physics analysis experience and understanding that were capable of really speaking to whether that paper was sufficient and intelligible and useful. So we really had to make sure that we engaged that little group of humans by name,” said Marshall.

The new method showed significant improvements, delivering a much more precise result than the collaboration’s previous analysis. That improvement, and the thorough checks, persuaded ATLAS to use NSBI more broadly going forward. It will give the collaboration much more precision than expected as it uses the Higgs boson to search for new particles and clarify our understanding of the quantum world. When ATLAS discusses its future plans, it makes projections of the precision it expects to reach years from now. But those plans are now being upended.

“One of the fun things about this method that Aishik pushed hard is each time it feels like now we do that projection—here’s how well we’ll do in 15 years—we absolutely crush those projections,” said Marshall. “So we are just now having to redo a set of projections because we matched our old projections for 15 years out already today. It’s a very fun problem to have.”

How a grad student got LHC data to play nice with quantum interference Read More »