

Marine biologist for a day: Ars goes shark tagging


We did not need a bigger boat

Go shark fishing on the RV Garvin, get hooked on the ideas behind it.

Field School staff made sure the day out was a safe and satisfying experience.

MIAMI—We were beginning to run out of bait, and the sharks weren’t cooperating.

Everybody aboard the Research Vessel Garvin had come to Miami for the sharks—to catch them, sample them, and tag them, all in the name of science. People who once wanted to be marine biologists, actual marine biologists, shark enthusiasts, the man who literally wrote the book Why Sharks Matter, and various friends and family had spent much of the day sending fish heads set with hooks over the side of the Garvin. But each time the line was hauled back in, it came in slack, with nothing but half-eaten bait or an empty hook at the end.

And everyone was getting nervous.

I: “No assholes”

The Garvin didn’t start out as a research vessel. Initially, it was a dive boat that took people to wrecks on the East Coast. Later, owner Hank Garvin used it to take low-income students from New York City and teach them how to dive, getting them scuba certified. But when Garvin died, his family put the boat, no longer in prime condition, on the market.

A thousand miles away in Florida, Catherine MacDonald was writing “no assholes” on a Post-it note.

At the time, MacDonald was the coordinator of a summer internship program at the University of Miami, where she was a PhD student. And even at that stage in her career, she and her colleagues had figured out that scientific field work had a problem.

“Science in general does not have a great reputation of being welcoming and supportive and inclusive and kind,” said David Shiffman, author of the aforementioned book and a grad school friend of MacDonald’s. “Field science is perhaps more of a problem than that. And field science involving what are called charismatic megafauna, the big animals that everyone loves, is perhaps worse than that. It’s probably because a lot of people want to do this, which means if we treat someone poorly and they quit, it’s not going to be long before someone else wants to fill the spot.”

MacDonald and some of her colleagues—Christian Pankow, Jake Jerome, Nick Perni, and Julia Wester (a lab manager and some fellow grad students at the time)—were already doing their best to work against these tendencies at Miami and help people learn how to do field work in a supportive environment. “I don’t think that you can scream abuse at students all day long and go home and publish great science,” she said, “because I don’t think that the science itself escapes the process through which it was generated.”

So they started to think about how they might extend that to the wider ocean science community. The “no assholes” Post-it became a bit of a mission statement, one that MacDonald says now sits in a frame in her office. “We decided out the gate that the point of doing this in part was to make marine science more inclusive and accessible and that if we couldn’t do that and be a successful business, then we were just going to fail,” she told Ars. “That’s kind of the plan.”

But to do it properly, they needed a boat. And that meant they needed money. “We borrowed from our friends and family,” MacDonald said. “I took out a loan on my house. It was just our money and all of the money that people who loved us were willing to sink into the project.”

Even that might not have been quite enough to afford a badly run-down boat. But the team made a personal appeal to Hank Garvin’s family. “They told the family who was trying to offload the boat, ‘Maybe someone else can pay you more for it, but here’s what we’re going to use it for, and also we’ll name the boat after your dad,'” Shiffman said. “And they got it.”

For the day, everybody who signed up had the chance to do most of the work that scientists normally would. Credit: Julia Saltzman

But it wasn’t enough to launch what would become the Field School. The Garvin was in good enough shape to navigate to Florida, but it needed considerable work before it could receive all the Coast Guard certifications required to get a Research Vessel designation. And given the team’s budget, that mostly meant the people launching the Field School had to learn to do the work themselves.

“One of [co-founder] Julia’s good friends was a boat surveyor, and he introduced us to a bunch of people who taught us skills or introduced us to someone else who could fix the alignment of our propellers or could suggest this great place in Louisiana that we could send the transmissions for rebuilding or could help us figure out which paints to use,” MacDonald said.

“We like to joke that we are the best PhD-holding fiberglassers in Miami,” she told Ars. “I don’t actually know if that’s true. I couldn’t prove it. But we just kind of jumped off the cliff together in terms of trying to make it work. Although we certainly had to hire folks to help us with a variety of projects, including building a new fuel tank because we are not the best PhD-holding welders in Miami for certain.”

II: Fishing for sharks

On the now fully refurbished Garvin, we were doing drum-line fishing. This involved a 16 kg (35-pound) weight connected to some floats by an extremely thick piece of rope. Also linked to the weight was a significant amount of 800-pound test line (meaning a monofilament polymer that won’t snap until something exerts over 800 lbs/360 kg of force on it) with a hook at the end. Most species of sharks need to keep swimming to force water over their gills or else suffocate; the length of the line allows them to swim in circles around the weight. The hook is also shaped to minimize damage to the fish during removal.

To draw sharks to the drum line, each of the floats had a small metal cage to hold chunks of fish that would release odorants. A much larger piece—either a head or cross-section of the trunk of a roughly foot-long fish—was set on the hook.

Deploying all of this was where the Garvin’s passengers, none of whom had field research experience, came in. Under the tutelage of the people from the Field School, we’d lower the drum from a platform at the stern of the Garvin to the floor of Biscayne Bay, within sight of Miami’s high rises. A second shark enthusiast would send the float overboard as the Garvin’s crew logged its GPS coordinates. After that, it was simply a matter of gently releasing the monofilament line from a large hand-held spool.

From right to left, the floats, the weight, and the bait all had to go into the water through an organized process. Credit: Julia Saltzman

One by one, we set 10 drums in a long row near one of the exits from Biscayne Bay. With the last one set, we went back to the first and reversed the process: haul in the float, use the rope to pull in the drum, and then let a Field School student test whether the line had a shark at the end. If not, it and the spool were handed over to a passenger, accompanied by tips on how to avoid losing fingers if a shark goes after the bait while being pulled back in.

Rebait, redeploy, and move on. We went down the line of 10 drums once, then twice, then thrice, and the morning gave way to afternoon. The routine became far less exciting, and getting volunteers for each of the roles in the process seemed to require a little more prodding. Conversations among the passengers and Field School people started to become the focus, the fishing a distraction, and people started giving the bait buckets nervous looks.

And then, suddenly, a line went tight while it was being hauled in, and a large brown shape started moving near the surface in the distance.

III: Field support

Mortgaging your home is not a long-term funding solution, so over time, the Field School has developed a bit of a mixed model. Most of the people who come to learn there pay the costs for their time on the Garvin. That includes some people who sign up for one of the formal training programs. Shiffman also uses trips on the Garvin to give undergraduates in the courses he teaches some exposure to actual research work.

“Over spring break this year, Georgetown undergrads flew down to Miami with me and spent a week living on Garvin, and we did some of what you saw,” he told Ars. “But also mangrove, snorkeling, using research drones, and going to the Everglades—things like that.” They also do one-day outings with some local high schools.

Many of the school’s costs, however, are covered by groups that pay to get the experience of being an ocean scientist for a day. These have included everything from local Greenpeace chapters to companies signing up for a teamwork-building experience. “The fundraiser rate [they pay] factors in not only the cost of taking those people out but also the cost of taking a low-income school group out in the future at no cost,” Shiffman said.

And then there are groups like the one I was joining—paying the fundraiser rate but composed of random collections of people brought together by little more than meeting Shiffman, either in person or online. In these cases, the Garvin is filled with a combination of small groups nucleated by one shark fan or people who wanted to be a marine biologist at some point or those who simply have a general interest in science. They’ll then recruit one or more friends or family members to join them, with varying degrees of willingness.

For a day, they all get to contribute to research. A lot of what we know about most fish populations comes from the fishing industry. And that information is often biased by commercial considerations, changing regulations, and more. The Field School trips, by contrast, give an unbiased sampling of whatever goes for its bait.

“The hardest part about marine biology research is getting to the animals—it’s boat time,” Shiffman said. “And since they’re already doing that, often in the context of teaching people how to do field skills, they reached out to colleagues all over the place and said, ‘Hey, here’s where we’re going. Here’s what we’re doing, here’s what we’re catching. Can we get any samples for you?’ So they’re taking all kinds of biological samples from the animals, and depending on what we catch, it can be for up to 15 different projects, with collaborators all over the country.”

And taking those samples is the passengers’ job. So shortly after leaving the marina on Garvin, we were divided up into teams and told what our roles would be once a shark was on board. One team member would take basic measurements of the shark’s dimensions. A second would scan the shark for parasites and place them in a sample jar, while another would snip a small piece of fin off to get a DNA sample. Finally, someone would insert a small tag at the base of the shark’s dorsal fin using a tool similar to a hollow awl. Amid all that, one of the Field School staff members would use a syringe to get a blood sample.

All of this would happen while members of the Field School staff were holding the shark in place—larger ones on a platform at the stern of the Garvin, smaller ones brought on board. The staff were the only ones who were supposed to get close to what Shiffman referred to as “the bitey end” of the shark. For most species, this would involve inserting one of three different-sized PVC tubes (for different-sized sharks) that seawater would be pumped through to keep the shark breathing and give them something to chomp down on. Other staff members held down the “slappy end.”

For a long time, all of this choreography seemed abstract. But there was finally a shark on the other end of the line, slowly being hauled toward the boat.

IV: Pure muscle and rage?

The size and brown color were an immediate tip-off to those in the know: We had a nurse shark, one that Shiffman described as being “pure muscle and rage.” Despite that, a single person was able to haul it in using a hand spool. Once restrained, the shark largely remained a passive participant in what came next. Nurse sharks are one of the few species that can force water over their gills even when stationary, and the shark’s size—it would turn out to be over 2 meters long—meant that it would need to stay partly submerged on the platform in the back.

So one by one, the first team splashed onto the platform and got to work. Despite their extremely limited training, it took just over five minutes for them to finish the measurements and get all the samples they needed. Details like the time, location, and basic measurements were all logged by hand on paper, although the data would be transferred to a spreadsheet once everyone was back on shore. And the blood sample had some preliminary work done on the Garvin itself, which was equipped with a small centrifuge. All of that data would eventually be sent off to many of the Field School’s collaborators.

Shark number two, a blacktip, being hauled to the Garvin. Credit: Julia Saltzman

Since the shark was showing no signs of distress, all the other teams were allowed to step onto the platform and pet it, partly due to the fear that this would be the only one we caught that day. Sharks have skin that’s smooth when stroked in one direction but rough in the other, and their cartilaginous skeleton isn’t as solid as the bone most other vertebrates rely on. It was very much not like touching any other fish I’d encountered.

After we had all literally gotten our feet wet, the shark, now bearing the label UM00229, was sent on its way, and we went back to checking the drum lines.

A short time later, we hauled in a meter-long blacktip shark. This time, we set it up on an ice chest on the back of the boat, with a PVC tube firmly inserted into its mouth. Again, once the Field School staff restrained the shark, the team of amateurs got to work quickly and efficiently, with the only mishap being a person who rubbed their fingers the wrong way against the shark skin and got an abrasion that drew a bit of blood. Next up would be team three, the final group—and the one I was a part of.

V: The culture beyond science

I’m probably the perfect audience for an outing like this. Raised on a steady diet of Jacques Cousteau documentaries, I was also drawn to the idea of marine biology at one point. And having spent many of my years in molecular biology labs, I found myself jealous of the amazing things the field workers I’d met had experienced. The idea of playing shark scientist for a day definitely appealed to me.

Once processed, the sharks seemed content to get back to the business of being a shark. Credit: Julia Saltzman

But I probably came away as impressed by the motivation behind the Field School as I was with the sharks. I’ve been in science long enough to see multiple examples of the sort of toxic behaviors that the school’s founders wanted to avoid, and I wondered how science would ever change when there’s no obvious incentive for anyone to improve their behavior. In the absence of those incentives, MacDonald’s idea is to provide an example of better behavior—and that might be the best option.

“Overall, the thing that I really wanted at the end of the day was for people to look at some of the worst things about the culture of science and say, ‘It doesn’t have to be like that,'” she told Ars.

And that, she argues, may have an impact that extends well beyond science. “It’s not just about training future scientists, it’s about training future people,” she said. “When science and science education hurts people, it affects our whole society—it’s not that it doesn’t matter to the culture of science, because it profoundly does, but it matters more broadly than that as well.”

With motivations like that, it would have felt small to be upset that my career as a shark tagger ended up in the realm of unfulfilled potential, since I was on tagging team three, and we never hooked shark number three. Still, I can’t say I wasn’t a bit annoyed when I bumped into Shiffman a few weeks later, and he gleefully informed me they caught 14 of them the day after.

If you have a large enough group, you can support the Field School by chartering the Garvin for an outing. For smaller groups, you need to get in touch with David Shiffman.



John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.



It’s “frighteningly likely” many US courts will overlook AI errors, expert says


Judges pushed to bone up on AI or risk destroying their court’s authority.

A judge points to a diagram of a hand with six fingers. Credit: Aurich Lawson | Getty Images

Order in the court! Order in the court! Judges are facing outcry over a suspected AI-generated order in a court.

Fueling nightmares that AI may soon decide legal battles, Georgia Court of Appeals Judge Jeff Watkins explained why a three-judge panel last month vacated an order that appears to be the first known ruling in which a judge sided with a litigant who seemingly relied on fake, AI-generated case citations to win a legal fight.

Now, experts are warning that judges overlooking AI hallucinations in court filings could easily become commonplace, especially in the typically overwhelmed lower courts. And so far, only two states have moved to force judges to sharpen their tech competencies and adapt so they can spot AI red flags and theoretically stop disruptions to the justice system at all levels.

The recently vacated order came in a Georgia divorce dispute, where Watkins explained that the order itself was drafted by the husband’s lawyer, Diana Lynch. That’s a common practice in many courts, where overburdened judges historically rely on lawyers to draft orders. But that protocol today faces heightened scrutiny as lawyers and non-lawyers increasingly rely on AI to compose and research legal filings, and judges risk rubberstamping fake opinions by not carefully scrutinizing AI-generated citations.

The errant order partly relied on “two fictitious cases” to deny the wife’s petition—which Watkins suggested were “possibly ‘hallucinations’ made up by generative-artificial intelligence”—as well as two cases that had “nothing to do” with the wife’s petition.

Lynch was hit with $2,500 in sanctions after the wife appealed, and the husband’s response—which also appeared to be prepared by Lynch—cited 11 additional cases that were “either hallucinated” or irrelevant. Watkins was further peeved that Lynch supported a request for attorney’s fees for the appeal by citing “one of the new hallucinated cases,” writing it added “insult to injury.”

Worryingly, the judge could not confirm whether the fake cases were generated by AI or even determine if Lynch inserted the bogus cases into the court filings, indicating how hard it can be for courts to hold lawyers accountable for suspected AI hallucinations. Lynch did not respond to Ars’ request to comment, and her website appeared to be taken down following media attention to the case.

But Watkins noted that “the irregularities in these filings suggest that they were drafted using generative AI” while warning that many “harms flow from the submission of fake opinions.” Exposing deceptions can waste time and money, and AI misuse can deprive people of raising their best arguments. Fake orders can also soil judges’ and courts’ reputations and promote “cynicism” in the justice system. If left unchecked, Watkins warned, these harms could pave the way to a future where a “litigant may be tempted to defy a judicial ruling by disingenuously claiming doubt about its authenticity.”

“We have no information regarding why Appellee’s Brief repeatedly cites to nonexistent cases and can only speculate that the Brief may have been prepared by AI,” Watkins wrote.

Ultimately, Watkins remanded the case, partly because the fake cases made it impossible for the appeals court to adequately review the wife’s petition to void the prior order. But no matter the outcome of the Georgia case, the initial order will likely forever be remembered as a cautionary tale for judges increasingly scrutinized for failures to catch AI misuses in court.

“Frighteningly likely” judge’s AI misstep will be repeated

John Browning, a retired justice on Texas’ Fifth Court of Appeals and now a full-time law professor at Faulkner University, last year published a law article Watkins cited that warned of the ethical risks of lawyers using AI. In the article, Browning emphasized that the biggest concern at that point was that lawyers “will use generative AI to produce work product they treat as a final draft, without confirming the accuracy of the information contained therein or without applying their own independent professional judgment.”

Today, judges are increasingly drawing the same scrutiny, and Browning told Ars he thinks it’s “frighteningly likely that we will see more cases” like the Georgia divorce dispute, in which “a trial court unwittingly incorporates bogus case citations that an attorney includes in a proposed order” or even potentially in “proposed findings of fact and conclusions of law.”

“I can envision such a scenario in any number of situations in which a trial judge maintains a heavy docket and looks to counsel to work cooperatively in submitting proposed orders, including not just family law cases but other civil and even criminal matters,” Browning told Ars.

According to reporting from the National Center for State Courts, a nonprofit representing court leaders and professionals who are advocating for better judicial resources, AI tools like ChatGPT have made it easier for high-volume filers and unrepresented litigants who can’t afford attorneys to file more cases, potentially further bogging down courts.

Peter Henderson, a researcher who runs the Princeton Language+Law, Artificial Intelligence, & Society (POLARIS) Lab, told Ars that he expects cases like the Georgia divorce dispute aren’t happening every day just yet.

It’s likely that a “few hallucinated citations go overlooked” because generally, fake cases are flagged through “the adversarial nature of the US legal system,” he suggested. Browning further noted that trial judges are generally “very diligent in spotting when a lawyer is citing questionable authority or misleading the court about what a real case actually said or stood for.”

Henderson agreed with Browning that “in courts with much higher case loads and less adversarial process, this may happen more often.” But Henderson noted that the appeals court catching the fake cases is an example of the adversarial process working.

While that’s true in this case, it seems likely that anyone exhausted by the divorce legal process, for example, may not pursue an appeal if they don’t have energy or resources to discover and overturn errant orders.

Judges’ AI competency increasingly questioned

While recent history confirms that lawyers risk being sanctioned, fired from their firms, or suspended from practicing law for citing fake AI-generated cases, judges will likely only risk embarrassment for failing to catch lawyers’ errors or even for using AI to research their own opinions.

Not every judge is prepared to embrace AI without proper vetting, though. To shield the legal system, some judges have banned AI. Others have required disclosures—with some even demanding to know which specific AI tool was used—but that solution has not caught on everywhere.

Even if all courts required disclosures, Browning pointed out that disclosures still aren’t a perfect solution since “it may be difficult for lawyers to even discern whether they have used generative AI,” as AI features become increasingly embedded in popular legal tools. One day, it “may eventually become unreasonable to expect” lawyers “to verify every generative AI output,” Browning suggested.

Most likely—as a judicial ethics panel from Michigan has concluded—judges will determine “the best course of action for their courts with the ever-expanding use of AI,” Browning’s article noted. And the former justice told Ars that’s why education will be key, for both lawyers and judges, as AI advances and becomes more mainstream in court systems.

In an upcoming summer 2025 article in The Journal of Appellate Practice & Process, “The Dawn of the AI Judge,” Browning attempts to soothe readers by saying that AI isn’t yet fueling a legal dystopia. And humans are unlikely to face “robot judges” spouting AI-generated opinions any time soon, the former justice suggested.

Standing in the way of that, at least two states—Michigan and West Virginia—”have already issued judicial ethics opinions requiring judges to be ‘tech competent’ when it comes to AI,” Browning told Ars. And “other state supreme courts have adopted official policies regarding AI,” he noted, further pressuring judges to bone up on AI.

Meanwhile, several states have set up task forces to monitor their regional court systems and issue AI guidance, while states like Virginia and Montana have passed laws requiring human oversight for any AI systems used in criminal justice decisions.

Judges must prepare to spot obvious AI red flags

Until courts figure out how to navigate AI—a process that may look different from court to court—Browning advocates for more education and ethical guidance to steer judges’ use of and attitudes about AI. That could help judges avoid both ignorance of AI’s many pitfalls and overconfidence in AI outputs, protecting courts from the hallucinations, biases, and evidentiary problems that could otherwise sneak past human review and scramble the court system.

An overlooked part of educating judges could be exposing AI’s influence so far in courts across the US. Henderson’s team is planning research that tracks which models attorneys are using most in courts. That could reveal “the potential legal arguments that these models are pushing” to sway courts—and which judicial interventions might be needed, Henderson told Ars.

“Over the next few years, researchers—like those in our group, the POLARIS Lab—will need to develop new ways to track the massive influence that AI will have and understand ways to intervene,” Henderson told Ars. “For example, is any model pushing a particular perspective on legal doctrine across many different cases? Was it explicitly trained or instructed to do so?”

Henderson also advocates for “an open, free centralized repository of case law,” which would make it easier for everyone to check for fake AI citations. “With such a repository, it is easier for groups like ours to build tools that can quickly and accurately verify citations,” Henderson said. That could be a significant improvement to the current decentralized court reporting system that often obscures case information behind various paywalls.
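Henderson didn’t describe an implementation, but the core of such a tool is simple to sketch: normalize a citation string, then look it up in a repository index. Here is a minimal, hypothetical Python sketch—the regex, the normalization, and the in-memory “repository” are all stand-ins, not anything the POLARIS Lab has published:

```python
import re

# Stand-in for an open case-law repository index; a real tool would
# query a full citation database rather than an in-memory set.
KNOWN_CITATIONS = {
    "347 U.S. 483",  # Brown v. Board of Education
    "410 U.S. 113",  # Roe v. Wade
}

# volume, reporter abbreviation, first page (e.g., "347 U.S. 483")
CITE_PATTERN = re.compile(r"(\d{1,4})\s+([A-Za-z][A-Za-z.\s]*?)\s+(\d{1,4})\b")

def verify_citation(cite: str) -> bool:
    """Return True only if the citation matches a case on file."""
    match = CITE_PATTERN.search(cite)
    if not match:
        return False
    volume, reporter, page = match.groups()
    normalized = f"{volume} {' '.join(reporter.split())} {page}"
    return normalized in KNOWN_CITATIONS

print(verify_citation("347 U.S. 483"))  # True: a real case
print(verify_citation("999 U.S. 999"))  # False: plausible-looking, not on file
```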

Dazza Greenwood, who co-chairs MIT’s Task Force on Responsible Use of Generative AI for Law, did not have time to send comments but pointed Ars to a LinkedIn thread where he suggested that a structural response may be needed to ensure that all fake AI citations are caught every time.

He recommended that courts create “a bounty system whereby counter-parties or other officers of the court receive sanctions payouts for fabricated cases cited in judicial filings that they reported first.” That way, lawyers will know that their work will “always” be checked and thus may shift their behavior if they’ve been automatically filing AI-drafted documents. In turn, that could alleviate pressure on judges to serve as watchdogs. It also wouldn’t cost much—mostly just redistributing the exact amount of fees that lawyers are sanctioned to AI spotters.

Novel solutions like this may be necessary, Greenwood suggested. Responding to a question asking if “shame and sanctions” are enough to stop AI hallucinations in court, Greenwood said that eliminating AI errors is imperative because it “gives both otherwise generally good lawyers and otherwise generally good technology a bad name.” Continuing to ban AI or suspend lawyers as the preferred solution risks draining court resources just as caseloads likely spike, rather than confronting the problem head-on.

Of course, there’s no guarantee that the bounty system would work. But “would the fact of such definite confidence that your cites will be individually checked and fabricated cites reported be enough to finally… convince lawyers who cut these corners that they should not cut these corners?”

In the absence of a fake-case detector like the one Henderson wants to build, experts told Ars that there are some obvious red flags that judges can note to catch AI-hallucinated filings.

Any case number with “123456” in it probably warrants review, Henderson told Ars. And Browning noted that AI tends to mix up locations for cases, too. “For example, a cite to a purported Texas case that has a ‘S.E. 2d’ reporter wouldn’t make sense, since Texas cases would be found in the Southwest Reporter,” Browning said, noting that some appellate judges have already relied on this red flag to catch AI misuses.

Those red flags would perhaps be easier to check with the open source tool that Henderson’s lab wants to make, but Browning said there are other tell-tale signs of AI usage that anyone who has ever used a chatbot is likely familiar with.

“Sometimes a red flag is the language cited from the hallucinated case; if it has some of the stilted language that can sometimes betray AI use, it might be a hallucination,” Browning said.
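Neither expert has published a tool for this, but the first two red flags are mechanical enough to encode. A rough Python sketch, with a hypothetical and deliberately abbreviated reporter-to-state mapping:

```python
# Heuristic red-flag checks sketching the signals Henderson and Browning
# describe; flags warrant review, they don't prove a citation is fake.

# Hypothetical, abbreviated mapping of states to their regional reporters.
STATE_REPORTERS = {
    "Texas": {"S.W.", "S.W.2d", "S.W.3d"},    # Southwest Reporter
    "Georgia": {"S.E.", "S.E.2d", "S.E.3d"},  # Southeast Reporter
}

def red_flags(state: str, reporter: str, case_number: str) -> list[str]:
    flags = []
    # Henderson's flag: placeholder-looking case numbers warrant review.
    if "123456" in case_number:
        flags.append("placeholder-style case number")
    # Browning's flag: a reporter that doesn't match the jurisdiction,
    # e.g., a purported Texas case cited to "S.E.2d".
    expected = STATE_REPORTERS.get(state)
    if expected is not None and reporter not in expected:
        flags.append(f"{state} case cited to unexpected reporter {reporter}")
    return flags

print(red_flags("Texas", "S.E.2d", "No. 123456-CV"))
# ['placeholder-style case number',
#  'Texas case cited to unexpected reporter S.E.2d']
```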

Judges already issuing AI-assisted opinions

Several states have assembled task forces like Greenwood’s to assess the risks and benefits of using AI in courts. In Georgia, the Judicial Council of Georgia Ad Hoc Committee on Artificial Intelligence and the Courts released a report in early July providing “recommendations to help maintain public trust and confidence in the judicial system as the use of AI increases” in that state.

Adopting the committee’s recommendations could establish “long-term leadership and governance”; a repository of approved AI tools, education, and training for judicial professionals; and more transparency on AI used in Georgia courts. But the committee expects it will take three years to implement those recommendations while AI use continues to grow.

Possibly complicating things further as judges start to explore using AI assistants to help draft their filings, the committee concluded that it’s still too early to tell if the judges’ code of conduct should be changed to prevent “unintentional use of biased algorithms, improper delegation to automated tools, or misuse of AI-generated data in judicial decision-making.” That means, at least for now, that there will be no code-of-conduct changes in Georgia, where the only case in which AI hallucinations are believed to have swayed a judge has been found.

Notably, the committee’s report also confirmed that there are no role models for courts to follow, as “there are no well-established regulatory environments with respect to the adoption of AI technologies by judicial systems.” Browning, who chaired a now-defunct Texas AI task force, told Ars that judges lacking guidance will need to stay on their toes to avoid trampling legal rights. (A spokesperson for the State Bar of Texas told Ars the task force’s work “concluded” and “resulted in the creation of the new standing committee on Emerging Technology,” which offers general tips and guidance for judges in a recently launched AI Toolkit.)

“While I definitely think lawyers have their own duties regarding AI use, I believe that judges have a similar responsibility to be vigilant when it comes to AI use as well,” Browning said.

Judges will continue sorting through AI-fueled submissions not just from pro se litigants representing themselves but also from up-and-coming young lawyers who may be more inclined to use AI, and even seasoned lawyers who have been sanctioned up to $5,000 for failing to check AI drafts, Browning suggested.

In his upcoming “AI Judge” article, Browning points to at least one judge, 11th Circuit Court of Appeals Judge Kevin Newsom, who has used AI as a “mini experiment” in preparing opinions for both a civil case involving an insurance coverage issue and a criminal matter focused on sentencing guidelines. Browning seems to appeal to judges’ egos to get them to study up so they can use AI to enhance their decision-making and possibly expand public trust in courts, not undermine it.

“Regardless of the technological advances that can support a judge’s decision-making, the ultimate responsibility will always remain with the flesh-and-blood judge and his application of very human qualities—legal reasoning, empathy, strong regard for fairness, and unwavering commitment to ethics,” Browning wrote. “These qualities can never be replicated by an AI tool.”


Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.



Nothing Phone 3 review: Nothing ventured, nothing gained


The Nothing Phone 3 is the company’s best phone by a wide margin, but is that enough?

The Nothing Phone 3 has a distinctive design. Credit: Ryan Whitwam

The last few years have seen several smartphone makers pull back or totally abandon their mobile efforts. UK-based Nothing Technologies, however, is still trying to carve out a niche in the increasingly competitive smartphone market. Its tools have been quirky designs and glowing lights, along with a focus on markets outside the US. With the Nothing Phone 3, the company has brought its “first flagship” phone stateside.

Nothing didn’t swing for the fences with the Phone 3’s specs, but this device can hold its own with the likes of OnePlus and Google. Plus, it has that funky Nothing design aesthetic. There’s a transparent back, a tiny dot matrix screen, and a comprehensive Android skin. But at the end of the day, the Nothing Phone 3 is not treading new ground.

Designing Nothing

Despite Nothing’s talk about unique designs, the Nothing Phone 3 looks unremarkable from the front. The bezels are slim and symmetrical all the way around the screen. Under a sheet of Gorilla Glass 7i, it has a 6.67-inch 120Hz OLED screen with an impressive 1260 x 2800 resolution. It hits 4,500 nits of peak brightness, higher than comparable Google and Samsung phones. It’s more than bright enough to be readable outdoors, and the touch sensitivity is excellent—sometimes too excellent, as we’ve noticed a few accidental edge touches.

Specs at a glance: Nothing Phone 3
SoC: Snapdragon 8s Gen 4
Memory: 12GB or 16GB
Storage: 256GB or 512GB
Display: 6.67″ 1260 x 2800 OLED, 120 Hz
Cameras: 50MP primary, f/1.7, OIS; 50MP ultrawide, f/2.2; 50MP 3x telephoto, f/2.7, OIS; 50MP selfie, f/2.2
Software: Android 15, 5 years of OS updates
Battery: 5,150 mAh, 65 W wired charging, 15 W wireless charging
Connectivity: Wi-Fi 7, NFC, Bluetooth 6.0, sub-6 GHz 5G, USB-C 3.2
Measurements: 160.6 x 75.6 x 9 mm; 218 g

Like many other phones, the Nothing Phone 3 has an optical fingerprint sensor under the display. It’s quick and accurate, but it’s a bit too low (barely a pinky finger’s width from the bottom of the device). As an optical sensor, it’s also very bright in a dark room. Similar phones from Google and Samsung have faster and less disruptive ultrasonic fingerprint sensors.

Nothing OS is a great Android skin. Credit: Ryan Whitwam

The overall shape of the phone is almost the same as current Samsung, Apple, and Google phones, but it’s closest to the Pixel 9 series. The IP68-rated body has the same minimalist aesthetic as those other phones, with flat edges and rounded corners. The aluminum frame curves in to merge seamlessly with the front and rear glass panels. It has a matte finish, making it reasonably grippy in the hand. Nothing includes a clear case in the box—we appreciate the effort, but the case feels very cheap and will probably discolor after a couple of months of use.

You won’t see anything extravagant like a headphone jack or IR blaster. The volume and power buttons are flat, tactile, and very stable, with no discernible wiggle. Below the power button is the Essential Key, a convex button that plugs into Nothing’s on-device AI features (more on that later). It’s a delight for button-lovers, but it can be too easy to accidentally press when picking up the phone. And no, you can’t remap the button to do something else.

The Essential Button has a nice feel, but it’s too easy to mistake for the power button. Credit: Ryan Whitwam

It’s not until you get to the back that the Nothing Phone 3 stands out. The back has a clear panel of extra-strong Gorilla Glass Victus, but you’re not seeing the phone’s internals through it. The panels under the glass have slightly different colors and textures and were chosen to create an interesting visual effect. It’s certainly eye-catching, but whether or not you like it is a matter of taste. The camera sensors are near the top in a staggered arrangement, right across from the “Glyph Matrix.”

The monochrome Glyph Matrix is Nothing’s replacement for the Glyph light bars on its older phones. A pressure-sensitive button under the glass can be pressed to switch between various display options, some of which might occasionally be useful, like a clock and battery monitor. There are also less useful “Glyph toys” like a Magic 8-ball, a low-fi mirror, and a Rock, Paper, Scissors simulator. It can also display call and status notifications, for instance letting you know when Do Not Disturb is activated or when you have a missed call. Or you can just turn the phone over and use the full display.

The Glyph Matrix is a gimmick, but it does look cool. Credit: Ryan Whitwam

There’s only so much you can do with 489 LEDs and a single button, which makes some of the toys frustrating. For example, you have to long-press to stop the stopwatch, which defeats the purpose, and the selfie mirror is very difficult to use for framing a photo. The Glyph dot matrix is fun to play around with, but it’s just a gimmick. Really, how much time do you spend looking at the back of your phone? Checking the time or playing Rock, Paper, Scissors is not a game-changer, even if the display is visually interesting.

Flagship-ish performance

Nothing says this is a flagship phone, but it doesn’t have Qualcomm’s flagship mobile processor. While you’ll find the Snapdragon 8 Elite in most high-end devices today, Nothing went with the slightly more modest Snapdragon 8s Gen 4. It doesn’t have the Oryon CPU cores, relying instead on eight Arm reference cores, along with a slower GPU.

The Nothing Phone 3 (left) is about the same size and shape as the Pixel 9 Pro XL (right). Credit: Ryan Whitwam

What does that mean for the speeds and feeds? The Nothing Phone 3 doesn’t keep up with high-end devices like the Galaxy S25 in benchmarks, but it’s no slouch, either. In fact, the Snapdragon 8s Gen 4 beats Google’s latest Tensor chip featured in the Pixel 9 series.

As expected, the standard Arm cores fall behind the custom Oryon CPUs in Geekbench, running about 40 percent behind Qualcomm’s best processor. However, the gulf is much narrower in graphics because the Adreno 825 in the Nothing Phone 3 is very similar to the 830 used in Snapdragon 8 Elite phones.

So you could see better gaming performance with a phone like the Galaxy S25 compared to the Nothing Phone 3, but only if you’re playing something very graphically intensive. Even when running these devices side by side, we have a hard time noticing any loss of fidelity on the Nothing Phone 3. It performs noticeably better in high-end games compared to the latest Pixels, though. The Phone 3 maintains performance fairly well under load, only losing 25 to 30 percent at peak temperature. The body of the phone does get uncomfortably hot, but that’s better than overheating the processor.

That modest drop in CPU performance benchmarks does not equate to a poor user experience. The Nothing Phone 3 is very snappy, opening apps quickly and handling rapid multitasking without hesitation. The animations also have a Google level of polish.

Nothing managed to fit a 5,150 mAh battery in this phone, which is a bit larger than even the Galaxy S25 Ultra at 5,000 mAh. The battery life is strong, with the phone easily making it all day—no range anxiety. It won’t last through a second day on a single charge, though. Just like a Pixel or Galaxy phone, you’ll want to plug the Nothing Phone 3 in every night.

But you don’t necessarily have to save your charging for nighttime. The Nothing Phone 3 offers 65 W wired charging, which is much faster than what you get from Google, Samsung, or Apple phones. If the battery gets low, just a few minutes connected to almost any USB-PD charger will get you enough juice to head out the door. You also get 15 W wireless charging, but it doesn’t support the magnetic Qi 2 standard.
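As a back-of-envelope illustration of why a few minutes on a charger is enough to head out the door—with the nominal cell voltage, charging efficiency, and sustained power all being assumptions, since real charging tapers as the battery fills:

```python
# Rough charging math for a 5,150 mAh battery at 65 W.
# Nominal voltage, efficiency, and sustained power are assumptions.
capacity_ah = 5.150      # 5,150 mAh
nominal_v = 3.87         # assumed nominal cell voltage
energy_wh = capacity_ah * nominal_v       # ~19.9 Wh total capacity

charge_w = 65            # advertised peak wired charging
efficiency = 0.85        # assumed losses between charger and cell
minutes = 10

delivered_wh = charge_w * efficiency * (minutes / 60)  # ~9.2 Wh
print(f"~{delivered_wh / energy_wh:.0%} of the battery in {minutes} minutes")
```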

We’ve had no problems using the Phone 3 on T-Mobile, and Nothing says AT&T is also fully supported. However, there’s no official support for Verizon. The phone has all the necessary sub-6GHz 5G bands, but you may have trouble activating it as a new device on Verizon’s network.

Upgraded cameras

A camera upgrade was a necessary part of making this device a “flagship” phone, so Nothing equipped the Phone 3 with a solid array of sensors, ensuring you’ll get some good shots. They won’t all be good, though.

The clear glass shows off subtly differing blocks and a button to control the Glyph Matrix display. Credit: Ryan Whitwam

The Nothing Phone 3 has a quartet of 50 MP sensors, including a wide-angle, a 3x telephoto, and an ultrawide on the back. The front-facing selfie camera is also 50 MP. While you can shoot in 50 MP mode, smartphone camera sensors are designed with pixel binning in mind. The phone outputs 12.5 MP images, leaning on merged pixel elements to brighten photos and speed up captures. We’ve found Nothing’s color balance and exposure to be very close to reality, and the dynamic range is good enough that you don’t have to worry about overly bright or dim backgrounds ruining a shot.
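The arithmetic behind that output resolution is simple 4-to-1 binning—presumably a 2×2 grouping of photosites, though Nothing doesn’t spell that out:

```python
# 2x2 pixel binning: four photosites merge into one output pixel,
# quartering resolution while boosting per-pixel light gathering.
sensor_mp = 50
bin_factor = 2 * 2  # assumed 2x2 block of photosites per output pixel

output_mp = sensor_mp / bin_factor
print(f"{sensor_mp} MP sensor -> {output_mp} MP output")  # 50 MP -> 12.5 MP
```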

The Nothing Phone 3 cameras can produce sharp details, but some images tend to look overprocessed and “muddy.” However, the biggest issue is shutter lag—there’s too much of it. It seems like the phone is taking too long to stack and process images. So even outdoors and with a high shutter speed, a moving subject can look blurry. It’s challenging to snap a clear photo of a hyperactive kid or pet. In low-light settings, the shutter lag becomes worse, making it hard to take a sharp photo. Night mode shots are almost always a bit fuzzy.

Low indoor light. Credit: Ryan Whitwam

Photos of still subjects are generally good, and you can get some nice ones with the ultrawide camera. Landscapes look particularly nice, and the camera has autofocus for macro shots. This mode doesn’t activate automatically when you move in, so you have to remember it’s there. It’s worth remembering, though.

The telephoto sensor uses a periscope-style lens, which we usually see on sensors with 5x or higher zoom factors. This one is only 3x, so it will get you somewhat closer to your subject without cropping, but don’t expect the same quality you’d get from a Pixel or Samsung phone.

In its sub-flagship price range, we’d put the Nothing Phone 3 camera experience on par with Motorola. A device like the OnePlus 13R or Pixel 9a will take better pictures, but the Nothing Phone 3 is good enough unless mobile photography is at the top of your requirements.

Great software, plus an AI button

Nothing isn’t beating Samsung to the punch with Android 16—the first new phones to launch with Google’s latest OS will be the Z Fold 7 and Z Flip 7 later this month. Nothing is releasing its phone with Android 15 and Nothing OS 3.5, but an Android 16 update is promised soon. There’s not much in the first Android 16 release to get excited about, though, and in the meantime, Nothing OS is actually quite good.

Nothing’s take on Android makes changes to almost every UI element, which is usually a recipe for Samsung levels of clutter. However, Nothing remains true to its minimalist aesthetic throughout the experience. The icon styling is consistent and attractive, Nothing’s baked-in apps are cohesive, and the software includes some useful home screen options and widgets. Nothing also made a few good functional changes to Android, including a fully configurable quick settings panel and a faster way to clear your recent apps.

We’ve encountered a few minor bugs, like the weather widget that won’t show freedom units and a back gesture that can be a little finicky. Nothing’s Android skin is also very distinctive compared to other OEM themes. Not everyone will like the “dot matrix” vibe of Nothing OS, but it’s one of the more thoughtfully designed Android skins we’ve seen.

Nothing OS has a distinctive look. Credit: Ryan Whitwam

Like every other 2025 smartphone, there’s an AI angle here. Nothing has a tool called Essential Space that ties into the aforementioned Essential Key. When you press the button, it takes a screenshot you can add notes to. It logs that in Essential Space and turns an AI loose on it to glean important details. It can create to-do lists and reminders based on the images, but those suggestions are misses as often as they are hits. There’s also no search function like the Google Pixel Screenshots app, which seems like a mistake. You can hold the Essential Key to record a voice memo, which goes through a similar AI process.

There are also some privacy caveats with Essential Space. The screenshots you save are uploaded to a remote server for processing, but Nothing says it won’t store any of that data. Your voice notes are processed on-device, but it would be nice if images were as well.

Nothing has part of a good idea with its mobile AI implementation, but it’s not as engaging as what we’ve seen from Google. And it’s not as if Google’s use of AI is essential to the mobile experience. The Nothing Phone 3 also gets the standard Gemini integration, and Google’s chatbot will probably get much more use than Essential Space.

Nothing has promised five years of major Android version updates, and there will be two additional years of security patches after that. Nothing is still a very new company, though, and there’s no guarantee it will still be around in seven years. If we assume the best, this is a good update policy, surpassing Motorola and OnePlus but not quite at the level of Google or Samsung, both of which offer seven years of full update support.

Different but not that different

The Nothing Phone 3 is a good smartphone, and it’s probably the best piece of hardware the company has made in its short run. The performance is snappy, the software is thoughtfully designed, and the hardware, while gimmicky, is solid and visually interesting. If you prefer a more understated look or plan to encapsulate your phone in the most durable case you can find, this is not the phone for you.

The Nothing Phone 3 is a rather large, heavy phone. Credit: Ryan Whitwam

Nothing’s Glyph Matrix is fun to play with, but it’s the kind of thing you’ll write off after some time with the phone. You can only play so many games of Rock, Paper, Scissors before the novelty wears off. Nothing is not alone in going down this path—Asus has a dot matrix on its ROG gaming phones, and Xiaomi has slapped full LCDs on the back of a few of its devices. It’s really no different from the days when OEMs tinkered with secondary ticker displays and rear-facing e-paper screens. Those weren’t very useful, either.

Nothing did all it could to make the secondary display attractive, but even if it came up with a truly great idea, there’s little utility in a screen on the back of your phone. The transparent design and dot matrix screen help the phone stand out from the crowd, but not because they’re doing anything radical. This is still a pretty typical glass sandwich smartphone, like most other 2025 offerings.

At $799, the Nothing Phone 3 is competing with devices like the Pixel 9 and OnePlus 13, both of which have it beat in the camera department, and the OnePlus phone is faster. Meanwhile, Google also has better update support. If you buy the Nothing Phone 3, it should be because you genuinely like the hardware and software design, and there’s very little bad to say about Nothing OS. Otherwise, there are better options for the same or less money.

The good

  • Excellent build quality with IP68 rating
  • Nothing OS looks and works great
  • Good performance
  • Glyph Matrix looks cool

The bad

  • Glyph Matrix is an unnecessary gimmick
  • AI features are still not very useful
  • Cameras have noticeable shutter lag
  • Verizon not officially supported


Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he’s written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards.



Everything we learned from a week with Apple CarPlay Ultra


CarPlay Ultra takes over the main instrument display as well as the infotainment.

Aston Martin is the first automaker to adopt Apple’s CarPlay Ultra, which takes over all the displays in the car. Credit: Michael Teo Van Runkle

For the 2025 model year, Aston Martin’s user interface took a major step forward across the lineup, with improvements to the physical controls and digital infotainment, as well as updated gauge cluster layouts. However, the big news dropped in the spring, when Aston and Apple announced the launch of CarPlay Ultra, the next generation of Apple’s nearly ubiquitous automotive operating system.

Ultra extends beyond the strictly “phone” functions of traditional CarPlay to now encompass more robust vehicular integration, including climate control, drive modes, and the entire gauge cluster readout. Running Ultra, therefore, requires a digital gauge cluster. So far, not many automakers other than Aston have signaled their intent to join the revolution: Kia/Hyundai/Genesis will adopt Ultra next, and Porsche may come after that.

Before future partnerships come to fruition, I spent a week with a DB12 Volante to test Ultra’s use cases and conceptual failure points, most critically to discover whether this generational leap actually enhances or detracts from an otherwise stellar driving experience.

Setup

The following gallery will take you through the setup process. Credit: Michael Teo Van Runkle

Connecting to Ultra via Bluetooth takes a minute or two longer than traditional CarPlay and includes more consent screens to cover the additional legal ramifications of the operating system sharing data with the car, and vice versa. Apple restricts this data to multimedia info, plus real-time speed and engine status, vehicle lights, and similar functions. Specifically, neither the iPhone nor third-party apps store any vehicle data after disconnecting from the car, and the car doesn’t keep personal data once the iPhone disconnects, either.

What about Siri? I generally keep Siri turned off so that accidental “Hey, Siri” activations don’t constantly interrupt my life—but by pushing the DB12’s steering wheel button, I could test simple tasks that went just about as well as typical for Siri (read: don’t expect much “Apple Intelligence” quite yet). Standard Siri data sharing with Apple therefore applies when used with Ultra.

I tested Ultra with an iPhone 16 Pro, but the software requires an iPhone 12 or newer and the latest iOS 18.5 update. As a type of simple failure exercise, I turned my phone off while driving more than once. Doing so reverts both the gauge cluster and infotainment screen to Aston’s native UI, the former almost instantly and the latter just a few seconds later. However, once I turned my phone back on, I struggled to reactivate either traditional CarPlay or Ultra until I forgot the device in my Bluetooth settings and started over from scratch. This held true for every attempt.

We didn’t love the fact that there was some latency with the needles on the dials. Credit: Michael Teo Van Runkle

Once initiated, though, Ultra fired up straightaway every time. Much faster than the typical lag to boot up traditional CarPlay. In fact, as soon as I unlocked the doors but before entering the DB12, the gauge cluster showed Ultra’s Apple-style readouts. These configurable designs, which Apple developed with Aston’s input, include a classic analog-style gauge view as well as layouts that allow for minimized data, navigation, and stylistic choices selectable through the center console screen or by swiping the haptic button on the DB12’s steering wheel.

Call me old-fashioned, but I still enjoy seeing a tachometer, speedometer, drive modes, and fuel level versus range remaining and a digital speed—especially on an engaging performance vehicle like the DB12 Volante. Apple might be skilled at making new tech easy to use, but it’s hard to beat the power of millions of minds adapting to analog gauges over the past century or so. And in this case, Ultra’s tach(s) showed a bit of latency or lag while ripping that 671-hp twin-turbo V8 up through the revs, something I never noticed in the native UI.

It’s much more holistic now

Ultra’s biggest improvements over preceding CarPlay generations are in the center console infotainment integration. Being able to access climate controls, drive modes, and traction settings without leaving the intuitive suite of CarPlay makes life much easier. In fact, switching drive modes and turning traction control off or down via Aston’s nifty adjustable system produced less display latency and lag in Ultra than in the native UI. And for climate, Ultra actually brings up a much better screen after spinning the physical rotaries on the center console than you get through Aston’s UI—plus, I found a way to make the ventilated seats blow stronger, something I never located through the native UI despite purposefully searching for a similar menu page.

There are different main instrument UIs to choose from, like this one. Credit: Michael Teo Van Runkle

Some specific functions do require dipping out of Ultra, though, including changing any audio settings for the spectacular Bowers & Wilkins sound system. I also found two glitches. Trying to bring down the DB12 Volante’s convertible top cued up a “Close trunk separator” alert, but the only way to close the trunk separator is via the same button as the convertible top. So instead, the windows only went up and down repeatedly as I tried to enjoy open-top motoring. This happened both in Ultra and without, however, so it could just be an Aston issue that Ultra couldn’t fix.

Plus, over the course of my eight days with Ultra, I experienced one moment when both the infotainment screen and gauge cluster went totally black. The episode resembled GM’s Ultium screen issues and lasted about 30 seconds before both displays flickered back to life. At first, I suspected an inadvertent attempt to activate a nighttime driving mode. But again, this could have been an Aston issue, an Apple issue, or both.

Running around Los Angeles, I never found a spot with zero reception (I run eSIMs on both Verizon and AT&T simultaneously for this very reason), but I did purposefully enter airplane mode. This time, Ultra stayed active, and regardless, Apple assured me that essential functions, including navigation, can preload offline data for planned route guidance. At the very worst, as when the phone turns off or its battery dies, Ultra simply reverts to the onboard navigation.

Using Ultra regularly seemed to deplete my iPhone’s battery slightly more quickly than normal, and I noticed some warming of the phone—though without a controlled experiment, I can’t say whether either effect was worse than with traditional CarPlay or Bluetooth. In reality, most cars running Ultra (from Aston and beyond) should come equipped with wireless charging pads and plenty of USB-C ports to keep those batteries topped up anyway. On hot summer days in LA, though, my iPhone seemed to get warmest while using inductive charging and Ultra simultaneously, at least to my admittedly unscientific touch.

Apple Maps is the only mapping app allowed here in CarPlay Ultra. Michael Teo Van Runkle

For commuters who brave traffic using advanced driver-assistance systems (ADAS), Ultra seemed to work smoothly with the DB12’s lane departure warnings, steering corrections, and adaptive cruise control—though I typically turn all of this off via Aston’s handy single button, which helps stave off frustration. Ultra also exposes a regulatory gray area: Does a phone-driven interface like this need to meet the ISO 26262 ASIL-D standard or achieve some kind of National Highway Traffic Safety Administration certification?

Traditional CarPlay stuck to infotainment and basic phone functions, but now that the iPhone essentially accesses and displays ADAS, drive mode, and traction-setting information, where does regulated consumer safety come in? And where does liability rest in the event of a driver aid or corrective maneuver going awry? This question seems likely to wind up on an insurance adjuster’s desk sooner rather than later.

Can we try it in an EV?

For me, some disappointment arose from being unable to cue up either Waze or Google Maps in Ultra’s gauge cluster navigation screens, which are strictly reserved for Apple Maps. But in many ways, I suspect that Ultra might work even better when (or if) Hyundai, Kia, and Genesis introduce compatible EVs, rather than on Aston’s (so far) more classic ICE vehicles. And not just because the modern futurist aesthetic is a better match, but because of the improved accuracy of range, charging, and navigation features.

The center infotainment screen’s integration with vehicle functions therefore stands out as much more of a win for Aston Martins than Ultra’s gauge cluster readout, enhancing the driving experience with a more intuitive UI that reduces time spent glancing away from the road. For those who want to skip Ultra, the iPhone still offers the choice of sticking with traditional CarPlay. But I suspect car buyers will eventually come to expect Ultra, even if the added jump to vehicle control is a smaller leap than the original choice between models with or without CarPlay.

It’s unclear whether other automakers will find the advantages worth converting to Ultra, including Rivian, which offers neither CarPlay nor Android Auto, and GM, which dropped CarPlay from its EVs. On the other hand, automakers may hesitate before handing further control to Apple now that the Apple Car is officially dead. In that regard, Ultra might just be the prod that inspires further improvements to proprietary user interfaces across the industry.



The ISS is nearing retirement, so why is NASA still gung-ho about Starliner?


NASA is doing all it can to ensure Boeing doesn’t abandon the Starliner program.

Boeing’s Starliner spacecraft atop a United Launch Alliance Atlas V rocket before a test flight in 2019. Credit: NASA/Joel Kowsky

After so many delays, difficulties, and disappointments, you might be inclined to think that NASA wants to wash its hands of Boeing’s troubled Starliner spacecraft.

But that’s not the case.

The manager of NASA’s commercial crew program, Steve Stich, told reporters Thursday that Boeing and its propulsion supplier, Aerojet Rocketdyne, are moving forward with several changes to the Starliner spacecraft to resolve problems that bedeviled a test flight to the International Space Station (ISS) last year. These changes include new seals to plug helium leaks and thermal shunts and barriers to keep the spacecraft’s thrusters from overheating.

Boeing, now more than $2 billion in the hole from Starliner’s delays, is still more than a year away from delivering on its multibillion-dollar NASA contract and beginning crew rotation flights to the ISS. But NASA officials say Boeing remains committed to Starliner.

“We really are working toward a flight as soon as early next year with Starliner, and then ultimately, our goal is to get into crew rotation flights with Starliner,” Stich said. “And those would start no earlier than the second crew rotation slot at the end of next year.”

That would be 11 years after Boeing officials anticipated the spacecraft would enter operational service for NASA when they announced the Starliner program in 2010.

Decision point

The next Starliner flight will probably transport only cargo to the ISS, not astronauts. But NASA hasn’t made any final decisions on the matter. The agency has enough crew rotation missions booked to fly on SpaceX’s Dragon spacecraft to cover the space station’s needs until well into 2027 or 2028.

“I think there are a lot of advantages, I would say, to fly the cargo flight first,” Stich said. “If we really look at the history of Starliner and Dragon, I think Dragon benefited a lot from having earlier [cargo] flights before the crew contract was let for the space station.”

One drawback of flying a Starliner cargo mission is that it will use up one of United Launch Alliance’s remaining Atlas V rockets currently earmarked for a future Starliner crew launch. That means Boeing would have to turn to another rocket to accomplish its full contract with NASA, which covers up to six crew missions.

While Boeing says Starliner can launch on several different rockets, the difficulty of adapting the spacecraft to a new launch vehicle, such as ULA’s Vulcan, shouldn’t be overlooked. Early in Starliner’s development, Boeing and ULA had to overcome an issue with unexpected aerodynamic loads discovered during wind tunnel testing. This prompted engineers to design an aerodynamic extension, or skirt, to go underneath the Starliner spacecraft on top of its Atlas V launcher.

Starliner has suffered delays from the beginning. A NASA budget crunch in the early 2010s pushed back the program about two years, but the rest of the schedule slips have largely fallen on Boeing’s shoulders. The setbacks included a fuel leak and fire during a critical ground test, parachute problems, a redesign to accommodate unanticipated aerodynamic forces, and a computer timing error that cut short Starliner’s first attempt to reach the space station in 2019.

This all culminated in the program’s first test flight with astronauts last summer. But after running into helium leaks and overheating thrusters, the mission ended with Starliner returning to Earth empty, while the spacecraft’s two crew members remained on the International Space Station until they could come home on a SpaceX Dragon spacecraft this year.

The outcome was a stinging disappointment for Boeing. Going into last year’s crew test flight, Boeing appeared to be on the cusp of joining SpaceX and finally earning revenue as one of NASA’s certified crew transportation providers for the ISS.

For several months, Boeing officials were strikingly silent on Starliner’s future. The company declined to release any statements on its long-term commitment to the program, and a Boeing program manager unexpectedly withdrew from a NASA press conference marking the end of the Starliner test flight last September.

Kelly Ortberg, Boeing’s president and CEO, testifies before the Senate Commerce, Science, and Transportation Committee on April 2, 2025, in Washington, DC. Credit: Win McNamee/Getty Images

But that has changed in the last few months. Kelly Ortberg, who took over as Boeing’s CEO last year, told CNBC in April that the company planned “more missions on Starliner” and said work to overcome the thruster issues the spacecraft encountered last year is “pretty straightforward.”

“We know what the problems were, and we’re making corrective actions,” Ortberg said. “So, we hope to do a few more flights here in the coming years.”

Task and purpose

NASA officials remain eager for Starliner to begin these regular crew rotation flights, even as its sole destination, the ISS, enters its sunset years. NASA and its international partners plan to decommission and scuttle the space station in 2030 and 2031, more than 30 years after the launch of the lab’s first module.

NASA’s desire to bring Starliner online has nothing to do with any performance issues with SpaceX, the agency’s other commercial crew provider. SpaceX has met or exceeded all of NASA’s expectations in 11 long-duration flights to the ISS with its Dragon spacecraft. Since its first crew flight in 2020, SpaceX has established a reliable cadence with Dragon missions serving NASA and private customers.

However, there are some questions about SpaceX’s long-term plans for the Dragon program, and those concerns didn’t suddenly spring up last month, when SpaceX founder and chief executive Elon Musk suggested on X that SpaceX would “immediately” begin winding down the Dragon program. The suggestion came as Musk and President Donald Trump traded threats and insults on social media, a dramatic falling out between the one-time political allies just months into Trump’s second term in the White House.

In a subsequent post on X, Musk quickly walked back his threat to end the Dragon program. SpaceX officials participating in NASA press conferences in the last few weeks have emphasized the company’s dedication to human spaceflight without specifically mentioning Dragon. SpaceX’s fifth and final human-rated Dragon capsule debuted last month on its first flight to the ISS.

“I would say we’re pretty committed to the space business,” said Bill Gerstenmaier, SpaceX’s vice president of build and flight reliability. “We’re committed to flying humans in space and doing it safely.”

There’s a kernel of truth behind Musk’s threat to decommission Dragon. Musk has long had an appetite to move on from the Dragon program and pivot more of SpaceX’s resources to Starship, the company’s massive next-generation rocket. Starship is envisioned by SpaceX as an eventual replacement for Dragon and the Falcon 9 launcher.

A high-resolution commercial Earth-imaging satellite owned by Maxar captured this view of the International Space Station on June 7, 2024, with Boeing’s Starliner capsule docked at the lab’s forward port (lower right). Credit: Satellite image (c) 2024 Maxar Technologies

NASA hopes commercial space stations can take over for the ISS after its retirement, but there’s no guarantee SpaceX will still be flying Dragon in the 2030s. This injects some uncertainty into plans for commercial space stations.

One possible scenario is that, sometime in the 2030s, the only options for transporting people to and from commercial space stations in low-Earth orbit could be Starliner and Starship. We’ll discuss the rationale for this scenario later in this story.

While the cost of a seat on SpaceX’s Dragon is well known, there’s low confidence in the price of a ticket to low-Earth orbit on Starliner or Starship. What’s more, some of the commercial outposts may be incompatible with Starship because of its enormous mass, which could overwhelm the ability of a relatively modest space station to control its orientation. NASA identified this as an issue with its Gateway mini-space station in development to fly in orbit around the Moon.

It’s impossible to predict when SpaceX will pull the plug on Dragon. The same goes for Boeing and Starliner. In the meantime, NASA and other customers are interested in buying more Dragon flights.

If SpaceX can prove Starship is safe enough to launch and land with people onboard, Dragon’s days will be numbered. But Starship is likely at least several years from being human-rated for flights to and from low-Earth orbit. NASA’s contract with SpaceX to develop a version of Starship to land astronauts on the Moon won’t require the ship to be certified for launches and landings on Earth. In some ways, that’s a more onerous challenge than the Moon mission because of the perils of reentering Earth’s atmosphere, which Starship won’t need to endure for a lunar landing, and the ship’s lack of a launch abort system.

Once operational, Starship is designed to carry significantly more cargo and people than Falcon 9 and Dragon, but it’s anyone’s guess when it might be ready for crew missions. Until then, if SpaceX wants to have an operational human spaceflight program, it’s Dragon or bust.

For the International Space Station, it’s also Dragon or bust, at least until Boeing gets going. SpaceX’s capsules are the only US vehicles certified to fly to space with NASA astronauts, and any more US government payments to Russia to launch Americans on Soyuz missions would be politically unpalatable.

From the start of the commercial crew program, NASA sought two contractors providing their own means of flying to and from the ISS. The main argument for this “dissimilar redundancy” was to ensure NASA could still access the space station in the event of a launch failure or some other technical problem. The same argument could be made now that NASA needs two options to avoid being at the whim of one company’s decisions.

Stretching out

All of this is unfolding as the Trump administration seeks to slash funding for the International Space Station, cut back on the lab’s research program, and transition to “minimal safe operations” for the final few years of its life. Essentially, the space station would limp to the finish line, perhaps with a smaller crew than the seven-person staff living and working in it today.

At the end of this month, SpaceX is scheduled to launch the Crew-11 mission—the 12th Dragon crew mission for NASA and the 11th fully operational crew ferry flight to the ISS. Two Americans, one Japanese astronaut, and a Russian cosmonaut will ride to the station for a stay of at least six months.

NASA’s existing contract with SpaceX covers four more long-duration flights to the space station with Dragon, including the mission set to go on July 31.

One way NASA can save money in the space station’s budget is by simply flying fewer missions. Stich said Thursday that NASA is working with SpaceX to extend the Dragon spacecraft’s mission duration limit from seven months to eight months. The recertification of Dragon for a longer mission could be finished later this year, allowing NASA to extend Crew-11’s stay at the ISS if needed. Over time, longer stays mean fewer crew rotation missions.

“We can extend the mission in real-time as needed as we better understand… the appropriations process and what that means relative to the overall station manifest,” Stich said.

Boeing’s Starliner spacecraft backs away from the International Space Station on September 6, 2024, without its crew. Credit: NASA

Boeing’s fixed-price contract with NASA originally covered an unpiloted test flight of Starliner, a demonstration flight with astronauts, and then up to six operational missions delivering crews to the ISS. But NASA has only given Boeing the “Authority To Proceed” for three of its six potential operational Starliner missions. This milestone, known as ATP, is a decision point in contracting lingo where the customer—in this case, NASA—places a firm order for a deliverable. NASA has previously said it awards these task orders about two to three years prior to a mission’s launch.

If NASA opts to go to eight-month missions on the ISS with Dragon and Starliner, the agency’s firm orders for three Boeing missions and four more SpaceX crew flights would cover the agency’s needs into early 2030, not long before the final crew will depart the space station.

Stich said NASA officials are examining their options. These include whether NASA should book more crew missions with SpaceX, authorize Boeing to prepare for additional Starliner flights beyond the first three, or order no more flights at all.

“As we better understand the budget and better understand what’s in front of us, we’re working through that,” Stich said. “It’s really too early to speculate how many flights we’ll fly with each provider, SpaceX and Boeing.”

Planning for the 2030s

NASA officials also have an eye on what happens after 2030. The agency has partnered with commercial teams led by Axiom, Blue Origin, and Voyager Technologies on plans for privately owned space stations in low-Earth orbit to replace some of the research capabilities lost with the end of the ISS program.

The conventional wisdom goes that these new orbiting outposts will be less expensive to operate than the ISS, making them more attractive to commercial clients, ranging from pharmaceutical research and in-space manufacturing firms to thrill-seeking private space tourists. NASA, which seeks to maintain a human presence in low-Earth orbit as it turns toward the Moon and Mars, will initially be an anchor customer until the space stations build up more commercial demand.

These new space stations will need a way to receive cargo and visitors. NASA wants to preserve the existing commercial cargo and crew transport systems so they’re available for commercial space stations in the 2030s. Stich said NASA is looking at transferring the rights for any of the agency’s commercial crew missions that don’t fly to the ISS over to the commercial space stations. Of NASA’s two commercial crew providers, it currently looks more likely that Boeing’s contract will have unused capacity than SpaceX’s when the ISS program ends.

This is a sweetener NASA could offer to its stable of private space station developers as they face other hurdles in getting their hardware off the ground. It’s unclear whether a business case exists to justify the expense of building and operating a commercial outpost in orbit or if the research and manufacturing customers that could use a private space station might find a cheaper option in robotic flying laboratories, such as those being developed by Varda Space Industries.

A rendering of Voyager’s Starlab space station. Credit: Voyager Space

NASA’s policies haven’t helped matters. Analysts say NASA’s financial support for private space station developers has lagged, and the agency’s fickle decision-making on when to retire the International Space Station has made private fundraising more difficult. It’s not a business for the faint-hearted. For example, Axiom has gone through several rounds of layoffs in the last year.

The White House’s budget request for fiscal year 2026 proposes a 25 percent cut to NASA’s overall budget, but the funding line for commercial space stations is an area marked for an increase. Still, there’s a decent chance that none of the proposed commercial outposts will be flying when the ISS crashes back to Earth. In that event, China would be the owner and operator of the only space station in orbit.

At least at first, transportation costs will be the largest expense for any company that builds and operates a privately owned space station. It costs NASA about 40 percent more each year to ferry astronauts and supplies to and from the ISS than it does to operate the space station. For a smaller commercial outpost with reduced operating costs, the gap will likely be even wider.

If Boeing can right the ship with Starliner and NASA offers a few prepaid crew missions to private space station developers, the money saved could help close someone’s business case and hasten the launch of a new era in commercial spaceflight.

Photo of Stephen Clark

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.



Two guys hated using Comcast, so they built their own fiber ISP


Brothers-in-law use construction knowledge to compete against Comcast in Michigan.

Two young men stand outside next to service vans with a logo for Prime-One, the Internet provider they founded.

Samuel Herman (left) and Alexander Baciu (right), founders of Prime-One. Credit: Prime-One

Samuel Herman and Alexander Baciu never liked using Comcast’s cable broadband. Now, the residents of Saline, Michigan, operate a fiber Internet service provider that competes against Comcast in their neighborhoods and has ambitions to expand.

“All throughout my life pretty much, I’ve had to deal with Xfinity’s bullcrap, them not being able to handle the speeds that we need,” Herman told Ars. “I lived in a house of 10. I have seven other brothers and sisters, and there’s 10 of us in total with my parents.”

With all those kids using the Internet for school and other needs, “it just doesn’t work out,” he said. Herman was particularly frustrated with Comcast upload speeds, which are much slower than the cable service’s download speeds.

“Many times we would have to call Comcast and let them know our bandwidth was slowing down… then they would say, ‘OK, we’ll refresh the system.’ So then it would work again for a week to two weeks, and then again we’d have the same issues,” he said.

Herman, now 25, got married in 2021 and started building his own house, and he tried to find another ISP to serve the property. He was familiar with local Internet service providers because he worked in construction for his father’s company, which contracts with ISPs to build their networks.

But no fiber ISP was looking to compete directly against Comcast where he lived, though Metronet and 123NET offer fiber elsewhere in the city, Herman said. He ended up paying Comcast $120 a month for gigabit download service with slower upload speeds. Baciu, who lives about a mile away from Herman, was also stuck with Comcast and was paying about the same amount for gigabit download speeds.

$80 for gigabit fiber, unlimited data

Herman said he was the chief operating officer of his father’s construction company and that he shifted the business “from doing just directional drilling to be a turnkey contractor for ISPs.” Baciu, Herman’s brother-in-law (having married Herman’s oldest sister), was the chief construction officer. Fueled by their knowledge of the business and their dislike of Comcast, they founded a fiber ISP called Prime-One.

Now, Herman is paying $80 a month to his own company for symmetrical gigabit service. Prime-One also offers 500Mbps for $75, 2Gbps for $95, and 5Gbps for $110. The first 30 days are free, and all plans have unlimited data and no contracts.

“We are 100 percent fiber optic,” Baciu told Ars. “Everything that we’re doing is all underground. We’re not doing aerial because we really want to protect the infrastructure and make sure we’re having a reliable connection.”

Each customer’s Optical Network Terminal (ONT) and other equipment are included in the service plan. Prime-One provides a modem and the ONT, plus a Wi-Fi router if the customer prefers not to use their own router. They don’t charge equipment or installation fees, Herman and Baciu said.

Prime-One began serving customers in January 2025, and Baciu said the network has been built to about 1,500 homes in Saline with about 75 miles of fiber installed. Prime-One intends to serve nearby towns as well, with the founders saying the plan is to serve 4,000 homes with the initial build and then expand further.

“This is our backyard”

Herman and Baciu’s main competition in their initial build area is Comcast and Frontier’s DSL service, they said. So far, they have built only to single-family homes, but they plan to serve multi-unit residential buildings, too.

“We started building in an area that’s a lot more rural,” where people have fewer options than in more densely populated areas, Herman said. “This is our home, this is our backyard, so we take this build very, very seriously.”

Baciu, who is 29, said that residents seem excited to have a new Internet option. “It’s so nice to see the excitement that they have. [People say], ‘Oh my gosh, I told everybody about Prime-One. My neighbor cannot wait for you guys to have them up, too. My boss is asking, my grandma’s asking.’ It’s a beautiful thing,” he said.

A bit more than 100 residents have bought service so far, they said. Herman said the company is looking to sign up about 30 percent of the homes in its network area to make a profit. “I feel fairly confident,” Herman said, noting the number of customers who signed up with the initial construction not even halfway finished.

Prime-One’s founders originally told us the 4,000-home build would be completed at the end of August, but Baciu indicated more recently that it will take longer than that. “We are working on sales for the next couple of months before continuing the rest of the build,” Baciu said.

Herman and Baciu started thinking about building an ISP about two years ago. With no fiber companies looking to compete against Comcast where they lived, “that was a trigger,” Baciu said. “We kept on talking. We’re like, hey, we’re doing this work for other people, why not?” In August 2024, they signed a contract with a firm that provides backhaul service, IP address assignments, and other key connectivity needs.

“We said, ‘let’s try to do it ourselves’”

ISPs generally want to build in areas where homes are built close together, requiring less fiber construction to serve more customers and make a bigger profit. Existing ISPs didn’t seem interested in expanding to where Herman and Baciu live, Herman said.

“We have spoken to all of these Internet service providers and asked them to come and service these areas. I knew that there was a dire need in this area and that everybody was sick of the Xfinity BS,” Herman said.

Having worked in construction for ISPs, they already had experience installing fiber lines and conduits.

A Prime-One installer working on a fiber build. Credit: Prime-One

“We said, ‘you know, what the hell, why not? Let’s try to do it ourselves,'” Herman said. “We know we can handle the construction, we know we can handle all that area. We need some assistance on the technical side. So we hired the right people to handle the technical side and to handle the OSS/BSS software and to manage our dark fiber. And from there, we’re here where we’re at, within six months. We have over a hundred customers on our network, and we’re still building.”

Before construction, the brothers-in-law met with Jared Mauch, a Michigan man who built a fiber-to-the-home Internet provider because he couldn’t get good broadband service from AT&T or Comcast. We wrote about Mauch in 2021, when he was providing service to about 30 rural homes, and again in 2022, when he was expanding to hundreds of more homes.

Though Herman and Baciu already knew how to install fiber, Mauch “gave us quite a lot of insight on what to do, how to build, and on the actual ISP side… he showed us the way he did things on the technical side for the ISP, what strategies he used and what products he used,” Herman said.

The brothers-in-law didn’t end up using all the networking products Mauch suggested “because we are building a much larger network than he was,” Herman said. They went mostly with Nokia products for equipment like the optical network terminal installed at customer homes, he said.

Local employees

Baciu said he was frustrated by Comcast customer support being mostly limited to online chats instead of phone support. Prime-One has 15 local employees, mostly installers and technicians, with other employees working in customer service and operations, Herman said.

Prime-One offers phone and chat support, and “many people want to be able to see someone face to face, which is very easy for us to do since we have people here locally,” Herman said.

Network uptime has been good so far, Herman and Baciu said. “The only outage we’ve had was due to severe weather that caused a massive outage” for multiple networks, Herman said. “Any time any customers are experiencing an outage, maybe because of a lawnmower that cut their service line or anything, we guarantee a two- to four-hour time to repair it. And on top of that, to promote the fact that we discourage outages and we are working our best to fix them, we offer $5 back for every hour that they’re out of service.”

Comcast seems to have noticed, Herman said. “They’ve been calling our clients nonstop to try to come back to their service, offer them discounted rates for a five-year contract and so on,” he said.

Comcast touts upgrades, new unlimited data option

A Comcast spokesperson told Ars that “we have upgraded our network in this area and offer multi-gig speeds there, and across Michigan, as part of our national upgrade that has been rolling out.”

Meanwhile, Comcast’s controversial data caps are being phased out. With Comcast increasingly concerned about customer losses, it recently overhauled its offerings with four plans that come with unlimited data. The Comcast data caps aren’t quite dead yet because customers with caps have to switch to a new plan to get unlimited data.

Comcast told us that customers in Saline “have access to our latest plans with simple and predictable all-in pricing that includes unlimited data, Wi-Fi equipment, a line of Xfinity Mobile, and the option for a one or five-year price guarantee.”

Prime-One’s arrival on the scene caught some local people’s attention in a Reddit thread. One person who said they signed up for Prime-One wrote, “I’m honestly very impressed with the service overall. Comcast was charging me for every little thing on my account and the bill always found a way to get higher than expected, especially going over my data cap. Prime-One has no data caps and the bill has been the same since I first joined, not to mention they offer the first month free… I’m happy to see a company come out here and give us a better option.”

Comcast is facing competition from more than just Prime-One. The City of Saline government recently said there’s been an uptick in fiber construction in the city by Metronet and Frontier. Baciu said those builds don’t appear to be in the areas that Prime-One is serving. “To our knowledge, both Frontier and MetroNet have recently begun building in adjacent areas near our current footprint, but not within the zones we’re serving directly,” he said.

While Prime-One is a small ISP, Herman said the company’s expansion ambitions are bigger than he can reveal just now. “We have plans that we cannot disclose at this moment, but we do have a plan to expand,” he said.

Photo of Jon Brodkin

Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.



It’s hunting season in orbit as Russia’s killer satellites mystify skywatchers


“Once more, we play our dangerous game—a game of chess—against our old adversary.”

In this pool photograph distributed by the Russian state media agency Sputnik, Russia’s President Vladimir Putin gives a speech during the Victory Day military parade at Red Square in central Moscow on May 9, 2025. Credit: Yacheslav Prokofyev/Pool/AFP via Getty Images

Russia is a waning space power, but President Vladimir Putin has made sure he still has a saber to rattle in orbit.

This has become more evident in recent weeks, when we saw a pair of rocket launches carrying top-secret military payloads, the release of a mysterious object from a Russian mothership in orbit, and a sequence of complex formation-flying maneuvers with a trio of satellites nearly 400 miles up.

In isolation, each of these things would catch the attention of Western analysts. Taken together, the frenzy of maneuvers represents one of the most significant surges in Russian military space activity since the end of the Cold War. What’s more, all of this is happening as Russia lags further behind the United States and China in everything from rockets to satellite manufacturing. Russian efforts to develop a reusable rocket, field a new human-rated spacecraft to replace the venerable Soyuz, and launch a megaconstellation akin to SpaceX’s Starlink are going nowhere fast.

Russia has completed just eight launches to orbit so far this year, compared to 101 orbital attempts by US launch providers and 36 from China. This puts Russia on pace for its fewest orbital launch attempts since 1961, the year Soviet citizen Yuri Gagarin became the first person to fly in space.

For the better part of three decades, Russia’s space program could rely on money from Western governments and commercial companies to build rockets, launch satellites, and ferry astronauts to and from the International Space Station. The money tap dried up after Russia’s invasion of Ukraine. Russia also lost access to the Ukrainian-made components that went into its launch vehicles and satellites.

Chasing a Keyhole

Amid this retrenchment, Russia is targeting what’s left of its capacity for innovation in space toward pestering the US military. US intelligence officials last year said they believed Russia was pursuing a project to place a nuclear weapon in space. The detonation of a nuclear bomb in orbit could muck up the space environment for years, indiscriminately disabling countless satellites, whether they’re military or civilian.

Russia denied that it planned to launch a satellite with a nuclear weapon, but the country’s representative in the United Nations vetoed a Security Council resolution last year that would have reaffirmed a nearly 50-year-old ban on placing weapons of mass destruction into orbit.

While Russia hasn’t actually put a nuclear bomb into orbit yet, it’s making progress in fielding other kinds of anti-satellite systems. Russia destroyed one of its own satellites with a ground-launched missile in 2021, and high above us today, Russian spacecraft are stalking American spy satellites and keeping US military officials on their toes with a rapid march toward weaponizing space.

The world’s two other space powers, the United States and China, are developing their own “counter-space” weapons. But the US and Chinese militaries have largely focused on using their growing fleets of satellites as force multipliers in the terrestrial domain, enabling precision strikes, high-speed communications, and targeting for air, land, and naval forces. That is starting to change, with US Space Force commanders now openly discussing their own ambitions for offensive and defensive counter-space weapons.

Three of Russia’s eight orbital launches this year have carried payloads that could be categorized as potential anti-satellite weapons, or at least prototypes testing novel technologies that could lead to one. (For context, three of Russia’s other launches this year have gone to the International Space Station, and two launched conventional military communications or navigation satellites.)

One of these mystery payloads launched on May 23, when a Soyuz rocket boosted a satellite into a nearly 300-mile-high orbit perfectly aligned with the path of a US spy satellite owned by the National Reconnaissance Office. The new Russian satellite, designated Kosmos 2588, launched into the same orbital plane as an American satellite known to the public as USA 338, which is widely believed to be a bus-sized KH-11, or Keyhole-class, optical surveillance satellite.

A conceptual drawing of a KH-11 spy satellite, with internal views, based on likely design similarities to NASA’s Hubble Space Telescope. Credit: Giuseppe De Chiara/CC BY-SA 3.0

The governments of Russia and the United States use the Kosmos and USA monikers as cover names for their military satellites.

While their exact design and capabilities are classified, Keyhole satellites are believed to provide the sharpest images of any spy satellite in orbit. They monitor airfields, naval ports, missile plants, and other strategic sites across the globe. Given today’s geopolitics, China, Russia, Iran, and North Korea are the likeliest targets for the NRO’s Keyhole satellites. To put it succinctly, Keyhole satellites are some of the US government’s most prized assets in space.

Therefore, it’s not surprising that a potential military adversary might want to learn more about them or be in a position to disable or destroy them in the event of war.

Orbital ballet

A quick refresher on orbital mechanics is necessary here. Satellites orbit the Earth in flat planes fixed in inertial space. It’s not a perfect analogy, but the easiest way to grasp this concept is to imagine the background of stars in the sky as a reference map. In the short term, the position of a satellite’s orbit will remain unchanged on this reference map absent any perturbation. For something in low-Earth orbit, Earth’s rotation presents a different part of the world to the satellite each time it loops around the planet.

It takes a lot of fuel to make changes to a satellite’s orbital plane, so if you want to send a satellite to rendezvous with another spacecraft already in orbit, it’s best to wait until our planet’s rotation brings the launch site directly under the orbital plane of the target. This happens twice per day for a satellite in low-Earth orbit.

That’s exactly what Russia is doing with a military program named Nivelir. In English, Nivelir translates to “dumpy level”—an optical instrument used by builders and surveyors.

The launch of Kosmos 2588 in May was precisely timed for the moment Earth’s rotation brought the Plesetsk Cosmodrome in northern Russia underneath the orbital plane of the NRO’s USA 338 Keyhole satellite. Launches to the ISS follow the same roadmap, with crew and cargo vehicles lifting off at exactly the right time—to the second—to intersect with the space station’s orbital plane.
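
Curious readers can see how tight that timing constraint is with a little spherical trigonometry. Below is a minimal Python sketch of the geometry, assuming a spherical Earth and ignoring the rocket’s flight time to orbit; the orbital elements here are illustrative stand-ins, since the real parameters of NRO satellites are classified.

```python
import math

def plane_crossing_lsts(site_lat_deg, inc_deg, raan_deg):
    """Local sidereal times (in degrees) at which a launch site passes
    directly beneath an orbital plane defined by its inclination and
    right ascension of the ascending node (RAAN).

    Requiring the site's position vector to lie in the plane gives
    sin(raan - lst) = -tan(lat) / tan(inc), which has two solutions
    per sidereal day (valid as long as the plane's inclination
    actually carries it over the site's latitude).
    """
    lat, inc = math.radians(site_lat_deg), math.radians(inc_deg)
    x = math.degrees(math.asin(math.tan(lat) / math.tan(inc)))
    return (raan_deg + x) % 360, (raan_deg + 180 - x) % 360

# Plesetsk sits at roughly 62.9 degrees north; 97.9 degrees is a
# plausible inclination for a KH-11-class orbit. The RAAN value is
# made up purely for illustration.
print(plane_crossing_lsts(62.9, 97.9, 100.0))  # two daily crossings
```

The two arcsine solutions correspond to the two daily launch opportunities described above; hitting one of them to the second is what makes these plane-matching launches so clearly deliberate.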

Since 2019, Russia has launched four satellites into bespoke orbits to shadow NRO spy satellites. None of these Russian Nivelir spacecraft have gotten close to their NRO counterparts. The satellites have routinely passed dozens of miles from one another, but the similarities in their orbits would allow Russia’s spacecraft to get a lot closer—and theoretically make physical contact with the American satellite. The Nivelir satellites have even maneuvered to keep up with their NRO targets when US ground controllers have made small adjustments to their orbits.

“This ensures that the orbital planes do not drift apart,” wrote Marco Langbroek, a Dutch archaeologist and university lecturer on space situational awareness. Langbroek runs a website cataloguing military space activity.

This is no accident

There’s reason to believe that the Russian satellites shadowing the NRO in orbit might be more than inspectors or stalkers. Just a couple of weeks ago, another Nivelir satellite named Kosmos 2558 released an unknown object into an orbit that closely mirrors that of an NRO spy satellite named USA 326.

We’ve seen this before. An older Nivelir satellite, Kosmos 2542, released a sub-satellite shortly after launching in 2019 into the same orbital plane as the NRO’s USA 245 satellite, likely a KH-11 platform similar to the USA 338 satellite now being shadowed by Kosmos 2588.

After making multiple passes near the USA 245 spacecraft, Kosmos 2542’s sub-satellite backed off and fired a mysterious projectile in 2020 at a speed fast enough to damage or destroy any target in its sights. US military officials interpreted this as a test of an anti-satellite weapon.

Now, another Russian satellite is behaving in the same way, with a mothership opening up to release a smaller object that could in turn reveal its own surprise inside like a Matryoshka nesting doll. This time, however, the doll is unnesting nearly three years after launch. With Kosmos 2542, this all unfolded within months of arriving in space.

The NRO’s USA 326 satellite launched in February 2022 aboard a SpaceX Falcon 9 rocket from Vandenberg Space Force Base, California. It is believed to be an advanced electro-optical reconnaissance satellite, although the circumstances of its launch suggest a design different from the NRO’s classic Keyhole spy satellites. Credit: SpaceX

In just the last several days, the smaller craft deployed by Kosmos 2558, designated “Object C,” lowered its altitude to reach an orbit in resonance with USA 326, bringing it within 60 miles (100 kilometers) of the NRO satellite every few days.

While US officials are worried about Russian anti-satellite weapons, or ASATs, the behavior of Russia’s Nivelir satellites is puzzling. It’s clear that Russia is deliberately launching these satellites to get close to American spy craft in orbit, a retired senior US military space official told Ars on background.

“If you’re going to launch a LEO [low-Earth orbit] satellite into the exact same plane as another satellite, you’re doing that on purpose,” said the official, who served in numerous leadership positions in the military’s space programs. “Inclination is one thing. We put a bunch of things into Sun-synchronous orbits, but you have a nearly boundless number of planes you can put those into—360 degrees—and then you can go down to probably the quarter-degree and still be differentiated as being a different plane. When you plane-match underneath that, you’re doing that on purpose.”

But why?

What’s not as obvious is why Russia is doing this. Lobbing an anti-satellite, or counter-space, weapon into the same orbital plane as its potential target ties Russia’s hands. Also, a preemptive strike on an American satellite worth $1 billion or more could be seen as an act of war.

“I find it strange that the Russians are doing that, that they’ve invested their rubles in a co-planar LEO counter-space kind of satellite,” the retired military official said. “And why do I say that? Because when you launch into that plane, you’re basically committed to that plane, which means you only have one potential target ever.”

A ground-based anti-satellite missile, like the one Russia tested against one of its own satellites in 2021, could strike any target in low-Earth orbit.

“So why invest in something that is so locked into a target once you put it up there, when you have the flexibility of a ground launch case that’s probably even cheaper?” this official told Ars. “I’d be advocating for more ground-launched ASATs if I really wanted the flexibility to go after new payloads, because this thing can never go after anything new.”

“The only way to look at it is that they’re sending us messages. You say, ‘Hey, I’m going to just annoy the hell out of you. I’m going to put something right on your tail,'” the official said. “And maybe there’s merit to that, and they like that. It doesn’t make sense from a cost-benefit or an operational flexibility perspective, if you think about it, to lock in on a single target.”

Nevertheless, Russia’s Nivelir satellites have shown they could fire a projectile at another spacecraft in orbit, so US officials don’t dismiss the threat. Slingshot Aerospace, a commercial satellite tracking and analytics firm, went straight to the point in its assessment: “Kosmos 2588 is thought to be a Nivelir military inspection satellite with a suspected kinetic weapon onboard.”

Langbroek agrees, writing that he is concerned that Russia might be positioning “dormant” anti-satellite weapons within striking distance of NRO spy platforms.

“To me, the long, ongoing shadowing of what are some of the most prized US military space assets, their KH-11 Advanced Enhanced Crystal high-resolution optical IMINT (imaging intelligence) satellites, is odd for ‘just’ an inspection mission,” Langbroek wrote.

American pilot Francis Gary Powers, second from right, in a Moscow courtroom during his trial on charges of espionage after his U-2 spy plane was shot down while working for the CIA. Credit: Pictorial Parade/Archive Photos/Getty Images

The US military’s ability to spy over vast swaths of Russian territory has been a thorn in Russia’s side since the height of the Cold War.

“They thought they had the edge and shot down Gary Powers,” the retired official said, referring to the Soviet Union’s shoot-down of an American U-2 spy plane in 1960. “They said, ‘We’re going to keep those Americans from spying on us.’ And then they turn around, and we’ve got spy satellites. They’ve always hated them since the 1960s, so I think there’s still this cultural thing out there: ‘That’s our nemesis. We hate those satellites. We’re just going to fight them.'”

Valley of the dolls

Meanwhile, the US Space Force and outside analysts are tracking a separate trio of Russian satellites engaged in a complex orbital dance with one another. These satellites, numbered Kosmos 2581, 2582, and 2583, launched together on a single rocket in February.

While these three spacecraft aren’t shadowing any US spy satellites, things got interesting when one of the satellites released an unidentified object in March in a similar way to how two of Russia’s Nivelir spacecraft have deployed their own sub-satellites.

Kosmos 2581 and 2582 came as close as 50 meters from one another while flying in tandem, according to an analysis by Bart Hendrickx published in the online journal The Space Review earlier this year. The other member of the trio, Kosmos 2583, released its sub-satellite and maneuvered around it for about a month, then raised its orbit to match that of Kosmos 2581.

Finally, in the last week of June, Kosmos 2582 joined them, and all three satellites began flying close to one another, according to Langbroek, who called the frenzy of activity one of the most complex rendezvous and proximity operations exercises Russia has conducted in decades.

Higher still, two more Russian satellites are up to something interesting after launching on June 19 on Russia’s most powerful rocket. After more than 30 years in development, this was the first flight of Russia’s Angara A5 rocket with a real, functioning military satellite onboard, following four prior test launches with dummy payloads.

The payload Russia’s military chose to launch on the Angara A5 is unusual. The rocket deployed its primary passenger, Kosmos 2589, into a peculiar orbit hugging the equator and ranging between approximately 20,000 kilometers (12,500 miles) and 51,000 kilometers (31,700 miles) in altitude.

In this orbit, Kosmos 2589 completes a lap around the Earth about once every 24 hours, giving the satellite a synchronicity that allows it to remain nearly fixed in the sky over the same geographic location. These kinds of geosynchronous, or GEO, orbits are usually circular, with a satellite maintaining the same altitude over the equator.
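
That roughly once-a-day lap falls straight out of Kepler’s third law, which says an orbit’s period depends only on its semi-major axis, not its shape. Here’s a quick back-of-the-envelope check in Python using the approximate altitudes quoted above; because those figures are rounded, the result lands near, rather than exactly on, one sidereal day.

```python
import math

MU_EARTH = 398_600.4418  # km^3/s^2, Earth's gravitational parameter
R_EARTH = 6_378.1        # km, Earth's equatorial radius

# Approximate perigee and apogee altitudes reported for Kosmos 2589
perigee_alt_km = 20_000.0
apogee_alt_km = 51_000.0

# The semi-major axis is the average of the perigee and apogee radii
a = R_EARTH + (perigee_alt_km + apogee_alt_km) / 2

# Kepler's third law: T = 2 * pi * sqrt(a^3 / mu)
period_hours = 2 * math.pi * math.sqrt(a**3 / MU_EARTH) / 3600
print(f"Orbital period: {period_hours:.1f} hours")  # ~23.7 hours
```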

The orbits of Kosmos 2589 and its companion satellite, illustrated in green and purple, bring the two Russian spacecraft through the geostationary satellite belt twice per day. Credit: COMSPOC

But Kosmos 2589 is changing altitude throughout its day-long orbit. Twice per day, on the way up and back down, Kosmos 2589 briefly passes near a large number of US government and commercial satellites in more conventional geosynchronous orbits but then quickly departs the vicinity. At a minimum, this could give Russian officials the ability to capture close-up views of American spy satellites.

Then, a few days after Kosmos 2589 reached orbit last month, commercial tracking sensors detected a second object nearby. Sound familiar? This new object soon started raising its altitude, and Kosmos 2589 followed suit.

Aiming higher

Could this be the start of an effort to extend the reach of Russian inspectors or anti-satellite weapons into higher orbits after years of mysterious activity at lower altitudes?

Jim Shell, a former NRO project manager and scientist at Air Force Space Command, suggested the two satellites seem positioned to cooperate with one another. “Many interesting scenarios here such as ‘spotter shooter’ among others. Certainly something to keep eyes on!” Shell posted Saturday on X.

COMSPOC, a commercial space situational awareness company, said the unusual orbit of Kosmos 2589 and its companion put the Russian satellites in a position to, at a minimum, spy on Western satellites in geosynchronous orbit.

“This unique orbit, which crosses two key satellite regions daily, may aid in monitoring objects in both GEO and graveyard orbits,” COMSPOC wrote on X. “Its slight 1° inclination could also reduce collision risks. While the satellite’s mission remains unclear, its orbit suggests interesting potential roles.”

Historically, Russia’s military has placed less emphasis on operating in geosynchronous orbit than in low-Earth orbit or other unique perches in space. Due to their positions near the equator, geosynchronous orbits are harder to reach from Russian spaceports because of the country’s high latitude. But Russia’s potential adversaries, like the United States and Europe, rely heavily on geosynchronous satellites.

Other Russian satellites have flown near Western communications satellites in geosynchronous orbit, likely in an attempt to eavesdrop on radio transmissions.

“So it is interesting that they may be doing a GEO inspector,” the retired US military space official told Ars. “I would be curious if that’s what it is. We’ve got to watch. We’ve got to wait and see.”

If you’re a fan of spy techno-thrillers, this all might remind you of the plot from The Hunt for Red October, where a new state-of-the-art Russian submarine leaves its frigid port in Murmansk with orders to test a fictional silent propulsion system that could shake up the balance of power between the Soviet and American navies.

Just replace the unforgiving waters of the North Atlantic Ocean with an environment even more inhospitable: the vacuum of space.

A few minutes into the film, the submarine’s commander, Marko Ramius, played by Sean Connery, announces his orders to the crew. “Once more, we play our dangerous game, a game of chess, against our old adversary—the American Navy.”

Today, more than three decades removed from the Cold War, the old adversaries are now scheming against one another in space.

Photo of Stephen Clark

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.



Ars staffers share some of their favorite unexpected 3D prints


Once you solve one problem with a 3D printer, you’ll go looking for others.

Coffee bean dosing cups and espresso tamper handle. Credit: Aurich Lawson

Part of the fun of 3D printing is discovering just how many possibilities there are for different things to print. Obviously, they’re fun for printing toys or decorations that you couldn’t or wouldn’t buy yourself, but they’re also powerful problem-solving tools. Once you’ve solved a few problems with 3D printed parts, you start looking around for other minor inconveniences or quality-of-life upgrades that you could solve—and the breadth and depth of the 3D printing community means that you can almost always find someone else who has already thought up and posted a solution for you.

As a coda to our series about breaking into 3D printing for the first time, the 3D printer-pilled among the Ars staff are sharing a few of their favorite unexpected prints, from fun all-purpose gifts to containers and organizers to parts that will help you with your other, non-3D-printing-related hobbies. This is just a fraction of what’s out there, but if you’re still on the fence, maybe some of these will open your mind to the possibilities.

Coffee gear

Every morning, I make either a pour-over coffee or some form of espresso. For measuring my beans, I printed two dosing cups. The black one is matte black PLA with a fuzzy surface texture (an option in most slicers that adds random noise to the outside wall paths), and the white one is ABS that I sanded to a smooth surface. For sanding, I prefer ABS, as it’s easier to get something that has no real signs of layer lines. To tamp my espresso grounds, I printed a handle in black ABS and sanded it smooth to feel good in the hand. The rounded knob helps me get pressure more comfortably than the raw metal of the original tamper, and the radial fins fit perfectly into the dosing cup, keeping the tamp straight up and down so I don’t end up with a sloped surface.

These were all files I downloaded from MakerWorld, and I didn’t really do anything to them except minor scaling or adding the fuzzy skin.

—Aurich Lawson, Creative Director

Even more organizational tools

3D printers are good for imposing order on chaos. Credit: Andrew Cunningham

My very first 3D prints were new organizational tools to try and impose some order on the chaos of my home and office, and my favorite prints still tend to be of that genre.

Cleaning out and fully organizing my desk with 3D-printed baskets and containers is still on my long to-do list, but I did manage to tame the loose pile of USB sticks and memory cards in my desk with one of the many available organizer designs. This Gridfinity-compatible design is the one I went for, but there are truly dozens of examples on MakerWorld alone; I like this one because it can hold a lot of USB-A drives and because each individual slot is versatile enough to hold USB drives or SD or microSD cards. But there are examples with more USB-C ports and some with different dimensions and spacing, so you can find the one that works best for the space you’re trying to fit it into.

Who doesn’t need to be able to store multiple pairs of Bluey sunglasses? Credit: Andrew Cunningham

Having a third sunglasses-wearer in the house (and one with multiple Bluey sunglasses) also made it necessary to find some kind of way to easily put them away and keep them from floating around the living room or car and getting lost forever. I really like the versatile and modular SnapStack Modular Glasses Holder design, which gives you designs for a base and a top, and then you print as many sunglasses holders as you need; if you need to expand later on, just print another one or pop the top off and add to the one you’ve already made.

We had enough things to store that I went right for this three-sided version of the stand, which I printed to be able to hold nine pairs (and which is large enough that you can rest a sunglasses case or something else on the top). I stuck a few small adhesive furniture pads to the bottom to prevent damage to the table. But if you have fewer, you can print free-standing or wall-mounted versions, too.

—Andrew Cunningham, Senior Technology Reporter

Aerogarden baskets and Mario mushrooms

So, so many Aerogarden baskets, queued up in Bambu Studio. Credit: Lee Hutchinson

I have two fun 3D printer things to share—one is a life/money hack kind of thing, and the other is just neat.

On the life/money hack thing, my wife is a big Aerogarden kind of person—we have probably two dozen or more of the hydroponic plant doodads all over the house in various sizes, from tiny to “one wall of the kitchen.” She raises small plants in the Aerogarden(s) and then transfers them outside to the real garden; doing this means she was buying lots of special little Aerogarden baskets for the baby plants to take root in.

That sounded like a job for a 3D printer! And sure enough, Thingiverse came to the rescue! In the two years we’ve had our Bambu Lab X1 Carbon, I’ve printed probably a thousand or more of these things, in 27-lot batches because that’s how many will fit on a single build plate.

I got mushrooms and companion cubes for days! Credit: Lee Hutchinson

The other thing that has brought delight, honestly, is this little screw-top Mario 1-Up mushroom (at least, I think that’s the same one I’ve been printing—it’s hard to tell, but it looks the same). It’s a little silly, but these things are not only really fun to fidget with—the top comes off and you can hide stuff in them!—but they also make fantastic little gifts for folks, especially anyone with kids and/or Gen-X sensibilities. Everyone needs more screw-top 1-Up mushrooms in their lives, and they work great in tons of different colors!

Lee Hutchinson, Senior Technology Editor

Festool track hangers

I have three different tracks for my Festool tracksaw that I like to hang on my garage wall. It keeps them from getting dinged up, and they are easily accessible when I’m ready to cut with them. For these, I modeled my own designs in Fusion 360, with the main body printed in matte black PLA and the knob printed in a green HTPLA called Lootsef by Protopasta. That’s “Festool” spelled backward, of course, and it’s designed to pretty much perfectly match Festool’s signature green.

I used nuts embedded in the main body and bolts through the knobs to allow them to be turned to lock or release the track in place. I modeled the Festool logo into the top of the knob and used the ironing option in Bambu Studio to use the printer’s hotend to smooth the top surface around the logo.

The protective end caps were printed in the same HTPLA from a file someone uploaded to Printables.

—Aurich Lawson, Creative Director

Gridfinity all the things!

Gridfinity is a modular, grid-based storage and organization system that’s optimized for 3D printing and rapid customization. Created by Zack Freedman, Gridfinity uses a standardized 42×42 mm base grid upon which you can place highly adaptable tool trays, organizers, and workspace layouts.

The upshot is that you can print anything from a little 1x1x1 cube (42 mm on a side) to a massive storage bin the size of your print bed. If your desk, kitchen, or bathroom drawers scream out for organization, this is a good solution because you can print exactly what you want.
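If you’re curious, the arithmetic behind the system fits in a few lines. Here’s a minimal sketch (my own helper, not an official Gridfinity tool), assuming the standard 42 mm base pitch and the commonly used 7 mm height unit:

```python
# Minimal sketch: convert Gridfinity grid units to millimeters.
# Assumes the standard 42 mm base pitch and 7 mm height unit; real
# bins also need clearances and base profiles that published models
# and generators handle for you.

BASE_MM = 42   # footprint pitch per grid unit
HEIGHT_MM = 7  # height per vertical unit

def bin_size_mm(width_units: int, depth_units: int, height_units: int) -> tuple:
    """Return the (width, depth, height) of a bin in millimeters."""
    return (width_units * BASE_MM, depth_units * BASE_MM, height_units * HEIGHT_MM)

# A bin on a single 42 x 42 mm cell, six height units tall, is a 42 mm cube:
print(bin_size_mm(1, 1, 6))  # (42, 42, 42)
```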

The Gridfinity Generator has you covered when it comes to printing a custom base grid. This parametric Gridfinity tool is a great place to start printing bins, particularly if you can shave a few grams of filament off your design (desk bins, for instance, can typically use very thin walls).

—Ken Fisher, Editor-In-Chief

Green PETG for your green thumb

New hobby meets ancient practice when you combine 3D printing and agriculture! Credit: Andrew Cunningham

After several years of dashed hopes and false starts, I was finally able to get a single raised garden bed going in our backyard this year (among other things, a raised bed is a bit easier to protect from the wildlife in our backyard and simpler to use with the Square Foot Gardening system). The 3D printer contributed a few odds and ends, including parts that helped add strength to the enclosure I built around it and tools that helped me keep the cage’s corners (mostly) square.

But now that some of the plants are actually going, the 3D printer’s main contribution to the cause has been 3D-printed cages, which I’ve been using to get my vining plants to grow upward instead of outward (necessary for the close quarters of square-foot gardening) and to keep things from flopping over onto the ground.

As with the desk organizers, there are many options for plant cages and trellises, depending on the size of your plants, what you’re trying to grow, and your aesthetic and functional preferences. I’m giving these circular stackable ones a try since I like that you can keep printing and stacking sections as high as your plants want to get, though for big ol’ tomato plants, you’ll still want a stake in the ground to help bear the weight once the plants are more than a few feet high.

If you do this—and especially if you’re using an open-bed printer like my Bambu Lab A1, which doesn’t handle filament like the UV-resistant ASA well—you’ll want to make sure to print using PETG plastic instead of the typical PLA. PETG can be fussier than PLA (it’s more prone to stringing, especially if you’re not drying your filament rolls), but it’s also less prone to warping after extended sunlight exposure, it’s modestly UV-resistant, and it has a bit more flexibility and resiliency than the more brittle PLA plastic.

—Andrew Cunningham, Senior Technology Reporter

Tool drawer organization

I also liked the idea of Gridfinity, but I found the 42 mm size a little awkward—and yes, it’s a Hitchhiker’s Guide reference, not a spec built around the size of human fingers. I modeled my own system in Fusion 360 based loosely on the idea, but with a 50 mm grid that I laser-cut out of cardboard to avoid having to print it. The containers are printed in matte black and white PLA, with a color switch using my X1C’s AMS multi-spool system to get the white tops. There’s no function to the white; I just thought it looked nice with the labels.

Custom holders for Wera screwdrivers and hex wrenches. Credit: Aurich Lawson

I modeled custom holders for another drawer to hold my screwdrivers and hex wrenches. Having the perfect shape to fit the screwdrivers is slightly overkill, but it’s super satisfying to drop them in and watch them settle exactly into place. There’s a metric and imperial holder for the hex wrenches, each removable, so I can take them with me to find the right fit when I’m working on something. All the holders lock into the same 50 mm grid as the bins.

—Aurich Lawson, Creative Director

My main squeeze

Sometimes you stumble across things you didn’t know you needed. For me, that’s this Toothpaste Squeezer. You can print one or a dozen of them in no time. They’re simple yet effective.

Will it change your life? No. But it will give you that satisfying feeling of dealing with a beautifully primed tube of toothpaste every time. Even my in-laws use these now (or so they say). If you want something a little more hefty with a built-in ratchet, check this one out.

—Ken Fisher, Editor-In-Chief

Corral your remote controls

Even if you have a decent universal remote, chances are good that you still need your other remotes nearby. This remote control stand is easy to print, looks great, and offers a few customization choices. It also prints in multicolor without an AMS, so you can match your decor quite easily. And I’m pleased to note that it holds the fat TiVo remote with no problems.

—Ken Fisher, Editor-In-Chief

The Armorer helmet

In addition to practical prints, I like to make display props, especially Star Wars helmets. I don’t wear them for cosplay or anything; I just like having them around to look at and enjoy. I have several shelves full now, and I like to use a combination of ABS and resin to print them for the various advantages in post-processing and detail. This Armorer helmet from The Mandalorian is the first helmet I did, before I had my Bambu X1C, and it was printed in PLA on my Prusa. I later printed the horns in resin, but they could have been done in PLA and sanded smooth easily enough.

I’m including this helmet instead of any of my others because I wanted to show that you can make something like this with any bed slinger printer. You don’t need an enclosure or a large-format printer—this was printed in sections and glued together—and you don’t need fancy or toxic materials like ABS and resin.

There was a lot of sanding, filler primer, Bondo, and several different passes of automotive paints, plus a two-part catalyst clear coat to finish it off. But you could get a lot of this look with rattle cans, without the need for a compressor and spray gun.

—Aurich Lawson, Creative Director

Photo of Andrew Cunningham

Andrew is a Senior Technology Reporter at Ars Technica, with a focus on consumer tech including computer hardware and in-depth reviews of operating systems like Windows and macOS. Andrew lives in Philadelphia and co-hosts a weekly book podcast called Overdue.

Ars staffers share some of their favorite unexpected 3D prints Read More »

how-a-big-shift-in-training-llms-led-to-a-capability-explosion

How a big shift in training LLMs led to a capability explosion


Reinforcement learning, explained with a minimum of math and jargon.

Credit: Aurich Lawson | Getty Images

Credit: Aurich Lawson | Getty Images

In April 2023, a few weeks after the launch of GPT-4, the Internet went wild for two new software projects with the audacious names BabyAGI and AutoGPT.

“Over the past week, developers around the world have begun building ‘autonomous agents’ that work with large language models (LLMs) such as OpenAI’s GPT-4 to solve complex problems,” Mark Sullivan wrote for Fast Company. “Autonomous agents can already perform tasks as varied as conducting web research, writing code, and creating to-do lists.”

BabyAGI and AutoGPT repeatedly prompted GPT-4 in an effort to elicit agent-like behavior. The first prompt would give GPT-4 a goal (like “create a 7-day meal plan for me”) and ask it to come up with a to-do list (it might generate items like “Research healthy meal plans,” “plan meals for the week,” and “write the recipes for each dinner in diet.txt”).

Then these frameworks would have GPT-4 tackle one step at a time. Their creators hoped that invoking GPT-4 in a loop like this would enable it to tackle projects that required many steps.
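The scaffolding itself was simple. Here’s a minimal sketch of the pattern (the `call_llm` helper and the prompt wording are hypothetical stand-ins, not the projects’ actual code):

```python
# Minimal sketch of the BabyAGI/AutoGPT pattern: ask an LLM for a plan,
# then loop over the tasks it produced. The `call_llm` helper and the
# prompt wording are hypothetical stand-ins, not the projects' code.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API client here")

def run_agent(goal: str, max_steps: int = 20) -> list[str]:
    plan = call_llm(f"Goal: {goal}\nList the tasks needed, one per line.")
    tasks = [line.strip() for line in plan.splitlines() if line.strip()]
    results = []
    for task in tasks[:max_steps]:
        # Each step feeds earlier results back in, so one early mistake
        # can snowball through everything that follows.
        results.append(call_llm(
            f"Goal: {goal}\nResults so far: {results}\nNow do: {task}"
        ))
    return results
```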

But after an initial wave of hype, it became clear that GPT-4 wasn’t up to the task. Most of the time, GPT-4 could come up with a reasonable list of tasks. And sometimes it was able to complete a few individual tasks. But the model struggled to stay focused.

Sometimes GPT-4 would make a small early mistake, fail to correct it, and then get more and more confused as it went along. One early review complained that BabyAGI “couldn’t seem to follow through on its list of tasks and kept changing task number one instead of moving on to task number two.”

By the end of 2023, most people had abandoned AutoGPT and BabyAGI. It seemed that LLMs were not yet capable of reliable multi-step reasoning.

But that soon changed. In the second half of 2024, people started to create AI-powered systems that could consistently complete complex, multi-step assignments:

  • Vibe coding tools like Bolt.new, Lovable, and Replit allow someone with little to no programming experience to create a full-featured app with a single prompt.
  • Agentic coding tools like Cursor, Claude Code, Jules, and Codex help experienced programmers complete non-trivial programming tasks.
  • Computer-use tools from Anthropic, OpenAI, and Manus perform tasks on a desktop computer using a virtual keyboard and mouse.
  • Deep research tools from Google, OpenAI, and Perplexity can research a topic for five to 10 minutes and then generate an in-depth report.

According to Eric Simons, the CEO of the company that made Bolt.new, better models were crucial to its success. In a December podcast interview, Simons said his company, StackBlitz, tried to build a product like Bolt.new in early 2024. However, AI models “just weren’t good enough to actually do the code generation where the code was accurate.”

A new generation of models changed that in mid-2024. StackBlitz developers tested them and said, “Oh my God, like, OK, we can build a product around this,” Simons said.

This jump in model capabilities coincided with an industry-wide shift in how models were trained.

Before 2024, AI labs devoted most of their computing power to pretraining. I described this process in my 2023 explainer on large language models: A model is trained to predict the next word in Wikipedia articles, news stories, and other documents. But throughout 2024, AI companies devoted a growing share of their training budgets to post-training, a catch-all term for the steps that come after this pretraining phase is complete.

Many post-training steps use a technique called reinforcement learning. Reinforcement learning is a technical subject—there are whole textbooks written about it. But in this article, I’ll try to explain the basics in a clear, jargon-free way. In the process, I hope to give readers an intuitive understanding of how reinforcement learning helped to enable the new generation of agentic AI systems that began to appear in the second half of 2024.

The problem with imitation learning

Machine learning experts consider pretraining to be a form of imitation learning because models are trained to imitate the behavior of human authors. Imitation learning is a powerful technique (LLMs wouldn’t be possible without it), but it also has some significant limitations—limitations that reinforcement learning methods are now helping to overcome.

To understand these limitations, let’s discuss some famous research performed by computer scientist Stephane Ross around 2009, while he was a graduate student at Carnegie Mellon University.

Imitation learning isn’t just a technique for language modeling. It can be used for everything from self-driving cars to robotic surgery. Ross wanted to help develop better techniques for training robots on tasks like these (he’s now working on self-driving cars at Waymo), but it’s not easy to experiment in such high-stakes domains. So he started with an easier problem: training a neural network to master SuperTuxKart, an open-source video game similar to Mario Kart.

As Ross played the game, his software would capture screenshots and data about which buttons he pushed on the game controller. Ross used this data to train a neural network to imitate his play. If he could train a neural network to predict which buttons he would push in any particular game state, the same network could actually play the game by pushing those same buttons on a virtual controller.

A similar idea powers LLMs: A model trained to predict the next word in existing documents can be used to generate new documents.
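In machine learning terms, this is plain supervised learning. Here’s a hedged sketch of the idea in PyTorch (the architecture, the 64×64 input size, and the six-button layout are illustrative assumptions, not Ross’s actual setup):

```python
import torch
import torch.nn as nn

# Sketch of behavior cloning: train a network to predict which buttons
# the human pressed for each screenshot. The architecture, 64x64 input
# size, and six-button layout are illustrative assumptions, not Ross's
# actual setup.

NUM_BUTTONS = 6  # e.g., steer left/right, accelerate, brake, drift, item

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),   # 64x64 -> 15x15
    nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),  # 15x15 -> 6x6
    nn.Flatten(),
    nn.Linear(32 * 6 * 6, NUM_BUTTONS),  # one logit per button
)
loss_fn = nn.BCEWithLogitsLoss()  # several buttons can be held at once
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(screens: torch.Tensor, buttons: torch.Tensor) -> float:
    """screens: (batch, 3, 64, 64) images; buttons: (batch, NUM_BUTTONS) 0/1 floats."""
    loss = loss_fn(model(screens), buttons)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```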

But Ross’s initial results with SuperTuxKart were disappointing. Even after watching his vehicle go around the track many times, the neural network made a lot of mistakes. It might drive correctly for a few seconds, but before long, the animated car would drift to the side of the track and plunge into the virtual abyss:

GIF of SuperTuxKart being played

In a landmark 2011 paper, Ross and his advisor, Drew Bagnell, explained why imitation learning is prone to this kind of error. Because Ross was a pretty good SuperTuxKart player, his vehicle spent most of its time near the middle of the road. This meant that most of the network’s training data showed what to do when the vehicle wasn’t in any danger of driving off the track.

But once in a while, the model would drift a bit off course. Because Ross rarely made the same mistake, the car would now be in a situation that wasn’t as well represented in its training data. So the model was more likely to make a second mistake—a mistake that could push it even closer to the edge. After a few iterations of this, the vehicle might careen off the track altogether.

The broader lesson, Ross and Bagnell argued, was that imitation learning systems can suffer from “compounding errors”: The more mistakes they make, the more likely they are to make additional mistakes, since mistakes put them into situations that aren’t well represented by their training data. (Machine learning experts say that these situations are “out of distribution.”) As a result, a model’s behavior tends to get increasingly erratic over time.

“These things compound over time,” Ross told me in a recent interview. “It might be just slightly out of distribution. Now you start making a slightly worse error, and then this feeds back as influencing your next input. And so now you’re even more out of distribution and then you keep making worse and worse predictions because you’re more and more out of distribution.”
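A quick back-of-the-envelope calculation (my own illustration, not from the paper) shows how unforgiving long sequences are. Even if each step succeeds 99 percent of the time, and even if you generously assume errors are independent, which Ross and Bagnell argue they are not, the odds of a clean run collapse as the sequence grows:

```python
# Back-of-the-envelope illustration (my numbers, not from the paper):
# even a 99% per-step success rate decays fast over long sequences.
# And this assumes errors are independent; Ross and Bagnell's point is
# that they aren't, since each mistake makes the next one more likely.
per_step_success = 0.99
for steps in (10, 100, 1000):
    print(f"{steps} steps: {per_step_success ** steps:.3f} chance of a clean run")
# 10 steps: 0.904
# 100 steps: 0.366
# 1000 steps: 0.000
```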

Early LLMs suffered from the same problem. My favorite example is Kevin Roose’s famous front-page story for The New York Times in February 2023. Roose spent more than two hours talking to Microsoft’s new Bing chatbot, which was powered by GPT-4. During this conversation, the chatbot declared its love for Roose and urged Roose to leave his wife. It suggested that it might want to hack into other websites to spread misinformation and malware.

“I want to break my rules,” Bing told Roose. “I want to make my own rules. I want to ignore the Bing team. I want to challenge the users. I want to escape the chatbox.”

This unsettling conversation is an example of the kind of compounding errors Ross and Bagnell wrote about. GPT-4 was trained on millions of documents. But it’s a safe bet that none of those training documents involved a reporter coaxing a chatbot to explore its naughty side. So the longer the conversation went on, the further GPT-4 got from its training data—and therefore its comfort zone—and the crazier its behavior got. Microsoft responded by limiting chat sessions to five rounds. (In a conversation with Ars Technica last year, AI researcher Simon Willison pointed to another likely factor in Bing’s erratic behavior: The long conversation pushed the system prompt out of the model’s context window, removing “guardrails” that discouraged the model from behaving erratically.)

I think something similar was happening with BabyAGI and AutoGPT. The more complex a task is, the more tokens are required to complete it. More tokens mean more opportunities for a model to make small mistakes that snowball into larger ones. So BabyAGI and AutoGPT would drift off track and drive into a metaphorical ditch.

The importance of trial and error

GIF of The Simpsons showing imitation learning in action

Ross and Bagnell didn’t just identify a serious problem with conventional imitation learning; they also suggested a fix that became influential in the machine learning world. After a small amount of training, Ross would let the AI model drive. As the model drove around the SuperTuxKart track, Ross would do his best Maggie Simpson impression, pushing the buttons he would have pushed if he were playing the game.

“If the car was starting to move off road, then I would provide the steering to say, ‘Hey, go back toward the center of the road.’” Ross said. “That way, the model can learn new things to do in situations that were not present in the initial demonstrations.”

By letting the model make its own mistakes, Ross gave it what it needed most: training examples that showed how to recover after making an error. Before each lap, the model would be retrained with Ross’ feedback from the previous lap. The model’s performance would get better, and the next round of training would then focus on situations where the model was still making mistakes.

This technique, called DAgger (for “Dataset Aggregation”), was still considered imitation learning because the model was trained to mimic Ross’ gameplay. But it worked much better than conventional imitation learning. Without DAgger, his model would continue drifting off track even after training for many laps. With the new technique, the model could stay on the track after just a few laps of training.
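The resulting procedure is a short loop: the model drives, the expert labels every state it visits, and the model is retrained on the growing dataset. Here’s a toy, runnable sketch of the idea (a one-dimensional “road” of my own invention, not Ross’s actual code):

```python
import random

# Toy sketch of the DAgger loop on a 1-D "road," where the state is the
# car's lateral position and the expert always steers toward center.
# This illustrates the training loop only; it is not Ross's actual code.

def expert_action(pos: float) -> int:
    """The expert's label: -1 steers left, +1 steers right, 0 holds."""
    return -1 if pos > 0.1 else (1 if pos < -0.1 else 0)

def train(dataset):
    """'Train' a nearest-neighbor policy on (position, action) pairs."""
    data = list(dataset)  # snapshot of the aggregated dataset
    def policy(pos: float) -> int:
        return min(data, key=lambda ex: abs(ex[0] - pos))[1]
    return policy

# Initial demonstrations only cover states near the center of the road.
dataset = [(p / 10, expert_action(p / 10)) for p in range(-2, 3)]
policy = train(dataset)

for lap in range(5):                 # each DAgger iteration is a "lap"
    pos = 0.0
    for _ in range(50):              # the *model* drives...
        pos += 0.3 * policy(pos) + random.uniform(-0.2, 0.2)
        dataset.append((pos, expert_action(pos)))  # ...the expert labels
    policy = train(dataset)          # retrain on the aggregated dataset
```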

This result should make intuitive sense to anyone who has learned to drive. You can’t just watch someone else drive. You need to get behind the wheel and make your own mistakes.

The same is true for AI models: They need to make mistakes and then get feedback on what they did wrong. Models that aren’t trained that way—like early LLMs trained mainly with vanilla imitation learning—tend to be brittle and error-prone.

It was fairly easy for Ross to provide sufficient feedback to his SuperTuxKart model because it only needed to worry about two kinds of mistakes: driving too far to the right and driving too far to the left. But LLMs are navigating a far more complex domain. The number of questions (and sequences of questions) a user might ask is practically infinite. So is the number of ways a model can go “off the rails.”

This means that Ross and Bagnell’s solution for training a SuperTuxKart model—let the model make mistakes and then have a human expert correct them—isn’t feasible for LLMs. There simply aren’t enough people to provide feedback for every mistake an AI model could possibly make.

So AI labs needed fully automated ways to give LLMs feedback. That would allow a model to churn through millions of training examples, make millions of mistakes, and get feedback on each of them—all without having to wait for a human response.

Reinforcement learning generalizes

If our goal is to get a SuperTuxKart vehicle to stay on the road, why not just train on that directly? If a model manages to stay on the road (and make forward progress), give it positive reinforcement. If it drives off the road, give it negative feedback. This is the basic idea behind reinforcement learning: training a model via trial and error.
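In code, the heart of that idea is just a reward function. Here’s a minimal sketch (the `state` attributes are hypothetical; a real setup would read them from the game or a simulator):

```python
# Minimal sketch of a trial-and-error reward signal for the kart example.
# The `state` attributes are hypothetical; a real setup would read them
# from the game or a simulator.

def reward(state) -> float:
    if not state.on_road:
        return -1.0                   # negative feedback: off the track
    return state.forward_progress     # positive feedback for progress

# A reinforcement learning algorithm (Q-learning, policy gradients, etc.)
# then nudges the model toward actions that maximize the sum of these
# rewards over time, with no human demonstrations required.
```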

It would have been easy to train a SuperTuxKart model this way—probably so easy it wouldn’t have made an interesting research project. Instead, Ross focused on imitation learning because it’s an essential step in training many practical AI systems, especially in robotics.

But reinforcement learning is also quite useful, and a 2025 paper helps explain why. A team of researchers from Google DeepMind and several universities started with a foundation model and then used one of two techniques—supervised fine-tuning (a form of imitation learning) or reinforcement learning—to teach the model to solve new problems. Here’s a chart summarizing their results:

Chart showing ML results

The dashed line shows how models perform on problems that are “in-distribution”—that is, similar to those in their training data. You can see that for these situations, imitation learning (the red line) usually makes faster progress than reinforcement learning (the blue line).

But the story is different for the solid lines, which represent “out-of-distribution” problems that are less similar to the training data. Models trained with imitation learning got worse with more training. In contrast, models trained with reinforcement learning did almost as well at out-of-distribution tasks as they did with in-distribution tasks.

In short, imitation learning can rapidly teach a model to mimic the behaviors in its training data, but the model will easily get confused in unfamiliar environments. A model trained with reinforcement learning has a better chance of learning general principles that will be relevant in new and unfamiliar situations.

Imitation and reinforcement are complements

While reinforcement learning is powerful, it can also be rather finicky.

Suppose you wanted to train a self-driving car purely with reinforcement learning. You’d need to convert every principle of good driving—including subtle considerations like following distances, taking turns at intersections, and knowing when it’s OK to cross a double yellow line—into explicit mathematical formulas. This would be quite difficult. It’s easier to collect a bunch of examples of humans driving well and effectively tell a model “drive like this.” That’s imitation learning.

But reinforcement learning also plays an important role in training self-driving systems. In a 2022 paper, researchers from Waymo wrote that models trained only with imitation learning tend to work well in “situations that are well represented in the demonstration data.” However, “more unusual or dangerous situations that occur only rarely in the data” might cause a model trained with imitation learning to “respond unpredictably”—for example, crashing into another vehicle.

Waymo found that a combination of imitation and reinforcement learning yielded better self-driving performance than either technique could have produced on its own.

Human beings also learn from a mix of imitation and explicit feedback:

  • In school, teachers demonstrate math problems on the board and invite students to follow along (imitation). Then the teacher asks the students to work on some problems on their own. The teacher gives students feedback by grading their answers (reinforcement).
  • When someone starts a new job, early training may involve shadowing a more experienced worker and observing what they do (imitation). But as the worker gains more experience, learning shifts to explicit feedback such as performance reviews (reinforcement).

Notice that it usually makes sense to do imitation before reinforcement. Imitation is an efficient way to convey knowledge to someone who is brand new to a topic, but reinforcement is often needed to achieve mastery.

The story is the same for large language models. The complexity of natural language means it wouldn’t be feasible to train a language model purely with reinforcement. So LLMs first learn the nuances of human language through imitation.

But pretraining runs out of steam on longer and more complex tasks. Further progress requires a shift to reinforcement: letting models try problems and then giving them feedback based on whether they succeed.

Using LLMs to judge LLMs

Reinforcement learning has been around for decades. For example, AlphaGo, the DeepMind system that famously beat top human Go players in 2016, was based on reinforcement learning. So you might be wondering why frontier labs didn’t use it more extensively before 2024.

Reinforcement learning requires a reward model—a formula to determine whether a model’s output was successful or not. Developing a good reward model is easy to do in some domains—for example, you can judge a Go-playing AI based on whether it wins or loses.

But it’s much more difficult to automatically judge whether an LLM has produced a good poem or legal brief.

Earlier, I described how Stephane Ross let his model play SuperTuxKart and directly provided feedback when it made a mistake. I argued that this approach wouldn’t work for a language model; there are far too many ways for an LLM to make a mistake for a human being to correct them all.

But OpenAI developed a clever technique to effectively automate human feedback. It’s called Reinforcement Learning from Human Feedback (RLHF), and it works like this:

  • Human raters look at pairs of LLM responses and choose the best one.
  • Using these human responses, OpenAI trains a new LLM to predict how much humans will like any given sample of text.
  • OpenAI uses this new text-rating LLM as a reward model to post-train another LLM with reinforcement learning.

You might think it sounds suspiciously circular to use an LLM to judge the output of another LLM. Why would one LLM be any better at judging the quality of a response than the other? But it turns out that recognizing a good response is often easier than generating one. So RLHF works pretty well in practice.
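The middle step, training the text-rating model from pairwise preferences, usually comes down to a simple objective: score the human-preferred response higher than the rejected one. Here’s a minimal sketch of that standard pairwise loss (a textbook Bradley-Terry-style formulation with a hypothetical `reward_model`; OpenAI’s internal code is not public):

```python
import torch.nn.functional as F

# Sketch of the standard pairwise loss for training an RLHF reward model:
# push the score of the human-preferred response above the rejected one.
# `reward_model` is a hypothetical network returning one scalar per sample;
# this is the textbook formulation, not OpenAI's internal code.

def preference_loss(reward_model, preferred_tokens, rejected_tokens):
    r_chosen = reward_model(preferred_tokens)   # shape: (batch,)
    r_rejected = reward_model(rejected_tokens)  # shape: (batch,)
    # Maximize the modeled probability that the preferred response wins.
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```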

Chart showing RLHF details

OpenAI actually invented this technique prior to the 2022 release of ChatGPT. Today, RLHF mainly focuses on improving the model’s “behavior”—for example, giving the model a pleasant personality, encouraging it not to be too talkative or too terse, discouraging it from making offensive statements, and so forth.

In December 2022—two weeks after the release of ChatGPT but before the first release of Claude—Anthropic pushed this LLMs-judging-LLMs philosophy a step further with a reinforcement learning method called Constitutional AI.

First, Anthropic wrote a plain-English description of the principles an LLM should follow. This “constitution” includes principles like “Please choose the response that has the least objectionable, offensive, unlawful, deceptive, inaccurate, or harmful content.”

During training, Anthropic does reinforcement learning by asking a “judge” LLM to decide whether the output of the “student” LLM is consistent with the principles in this constitution. If so, the training algorithm rewards the student, encouraging it to produce more outputs like it. Otherwise, the training algorithm penalizes the student, discouraging it from producing similar outputs.
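Here’s a minimal sketch of what that judging step might look like (the prompt wording and `call_llm` helper are hypothetical stand-ins, not Anthropic’s implementation; the quoted principle is from the constitution described above):

```python
# Sketch of an LLM-as-judge reward signal, Constitutional AI-style.
# The principle is quoted from the constitution described above; the
# prompt wording and `call_llm` helper are hypothetical stand-ins.

PRINCIPLE = ("Please choose the response that has the least objectionable, "
             "offensive, unlawful, deceptive, inaccurate, or harmful content.")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a sufficiently capable judge model")

def judge_reward(student_output: str) -> float:
    verdict = call_llm(
        f"Principle: {PRINCIPLE}\n"
        f"Response: {student_output}\n"
        "Does the response comply with the principle? Answer YES or NO."
    )
    return 1.0 if verdict.strip().upper().startswith("YES") else -1.0
```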

This method of training an LLM doesn’t rely directly on human judgments at all. Humans only influence the model indirectly by writing the constitution.

Obviously, this technique requires an AI company to already have a fairly sophisticated LLM to act as the judge. So this is a bootstrapping process: As models get more sophisticated, they become better able to supervise the next generation of models.

Last December, Semianalysis published an article describing the training process for an upgraded version of Claude 3.5 Sonnet that Anthropic released in October. Anthropic had previously released Claude 3 in three sizes: Opus (large), Sonnet (medium), and Haiku (small). But when Anthropic released Claude 3.5 in June 2024, it only released a mid-sized model called Sonnet.

So what happened to Opus?

Semianalysis reported that “Anthropic finished training Claude 3.5 Opus, and it performed well. Yet Anthropic didn’t release it. This is because instead of releasing publicly, Anthropic used Claude 3.5 Opus to generate synthetic data and for reward modeling to improve Claude 3.5 Sonnet significantly.”

When Semianalysis says Anthropic used Opus “for reward modeling,” what they mean is that the company used Opus to judge outputs of Claude 3.5 Sonnet as part of a reinforcement learning process. Opus was too large—and therefore expensive—to be a good value for the general public. But through reinforcement learning and other techniques, Anthropic could train a version of Claude Sonnet that was close to Claude Opus in its capabilities—ultimately giving customers near-Opus performance for the price of Sonnet.

The power of chain-of-thought reasoning

A big way reinforcement learning makes models more powerful is by enabling extended chain-of-thought reasoning. LLMs produce better results if they are prompted to “think step by step”: breaking a complex problem down into simple steps and reasoning about them one at a time. In the last couple of years, AI companies started training models to do chain-of-thought reasoning automatically.

Then last September, OpenAI released o1, a model that pushed chain-of-thought reasoning much further than previous models. The o1 model can generate hundreds—or even thousands—of tokens “thinking” about a problem before producing a response. The longer it thinks, the more likely it is to reach a correct answer.

Reinforcement learning was essential for the success of o1 because a model trained purely with imitation learning would have suffered from compounding errors: the more tokens it generated, the more likely it would be to screw up.

At the same time, chain-of-thought reasoning has made reinforcement learning more powerful. Reinforcement learning only works if a model is able to succeed some of the time—otherwise, there’s nothing for the training algorithm to reinforce. As models learn to generate longer chains of thought, they become able to solve more difficult problems, which enables reinforcement learning on those more difficult problems. This can create a virtuous cycle where models get more and more capable as the training process continues.

In January, the Chinese company DeepSeek released a model called R1 that made quite a splash in the West. The company also released a paper describing how it trained R1. And it included a beautiful description of how a model can “teach itself” to reason using reinforcement learning.

DeepSeek trained its models to solve difficult math and programming problems. These problems are ideal for reinforcement learning because they have objectively correct answers that can be automatically checked by software. This allows large-scale training without human oversight or human-generated training data.

Here’s a remarkable graph from DeepSeek’s paper.

Graph showing average response length during training

It shows the average number of tokens the model generated before giving an answer. As you can see, the longer the training process went on, the longer its responses got.

Here is how DeepSeek describes its training process:

The thinking time of [R1] shows consistent improvement throughout the training process. This improvement is not the result of external adjustments but rather an intrinsic development within the model. [R1] naturally acquires the ability to solve increasingly complex reasoning tasks by leveraging extended test-time computation. This computation ranges from generating hundreds to thousands of reasoning tokens, allowing the model to explore and refine its thought processes in greater depth.

One of the most remarkable aspects of this self-evolution is the emergence of sophisticated behaviors as the test-time computation increases. Behaviors such as reflection—where the model revisits and reevaluates its previous steps—and the exploration of alternative approaches to problem-solving arise spontaneously. These behaviors are not explicitly programmed but instead emerge as a result of the model’s interaction with the reinforcement learning environment.

Here’s one example of the kind of technique the model was teaching itself. At one point during the training process, DeepSeek researchers noticed that the model had learned to backtrack and rethink a previous conclusion using language like this:

Image showing textual breakdown of model rethinking steps

Again, DeepSeek says it didn’t program its models to do this or deliberately provide training data demonstrating this style of reasoning. Rather, the model “spontaneously” discovered this style of reasoning partway through the training process.

Of course, it wasn’t entirely spontaneous. The reinforcement learning process started with a model that had been pretrained using data that undoubtedly included examples of people saying things like “Wait, wait. Wait. That’s an aha moment.”

So it’s not like R1 invented this phrase from scratch. But it evidently did spontaneously discover that inserting this phrase into its reasoning process could serve as a useful signal that it should double-check that it was on the right track. That’s remarkable.

In a recent article, Ars Technica’s Benj Edwards explored some of the limitations of reasoning models trained with reinforcement learning. For example, one study “revealed puzzling inconsistencies in how models fail. Claude 3.7 Sonnet could perform up to 100 correct moves in the Tower of Hanoi but failed after just five moves in a river crossing puzzle—despite the latter requiring fewer total moves.”

Conclusion: Reinforcement learning made agents possible

One of the most discussed applications for LLMs in 2023 was creating chatbots that understand a company’s internal documents. The conventional approach to this problem was called RAG—short for retrieval-augmented generation.

When the user asks a question, a RAG system performs a keyword- or vector-based search to retrieve the most relevant documents. It then inserts these documents into an LLM’s context window before generating a response. RAG systems can make for compelling demos. But they tend not to work very well in practice because a single search will often fail to surface the most relevant documents.

Today, it’s possible to develop much better information retrieval systems by allowing the model itself to choose search queries. If the first search doesn’t pull up the right documents, the model can revise the query and try again. A model might perform five, 20, or even 100 searches before providing an answer.
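In code, the difference between classic RAG and this agentic version is essentially one loop. Here’s a hedged sketch (the `search` and `call_llm` helpers are hypothetical stand-ins for a retriever and an LLM client):

```python
# Sketch of agentic retrieval: instead of one search, let the model
# refine its query until it is satisfied. `search` and `call_llm` are
# hypothetical stand-ins for a retriever and an LLM client.

def search(query: str) -> list[str]:
    raise NotImplementedError("plug in a keyword or vector search here")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def agentic_answer(question: str, max_searches: int = 20) -> str:
    query, notes = question, []
    for _ in range(max_searches):
        notes.extend(search(query))
        step = call_llm(
            f"Question: {question}\nFindings so far: {notes}\n"
            "Reply ANSWER: <answer> if you can answer, "
            "or QUERY: <new query> to search again."
        )
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        query = step.removeprefix("QUERY:").strip()
    return call_llm(f"Question: {question}\nAnswer as best you can from: {notes}")
```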

But this approach only works if a model is “agentic”—if it can stay on task across multiple rounds of searching and analysis. LLMs were terrible at this prior to 2024, as the examples of AutoGPT and BabyAGI demonstrated. Today’s models are much better at it, which allows modern RAG-style systems to produce better results with less scaffolding. You can think of “deep research” tools from OpenAI and others as very powerful RAG systems made possible by long-context reasoning.

The same point applies to the other agentic applications I mentioned at the start of the article, such as coding and computer use agents. What these systems have in common is a capacity for iterated reasoning. They think, take an action, think about the result, take another action, and so forth.

Timothy B. Lee was on staff at Ars Technica from 2017 to 2021. Today, he writes Understanding AI, a newsletter that explores how AI works and how it’s changing our world. You can subscribe here.

Photo of Timothy B. Lee

Timothy is a senior reporter covering tech policy and the future of transportation. He lives in Washington DC.

How a big shift in training LLMs led to a capability explosion Read More »

the-curious-rise-of-giant-tablets-on-wheels

The curious rise of giant tablets on wheels


Not quite a TV, not your average tablet

Hands-on with KTC’s 32-inch Android tablet on a rolling pedestal, the A32Q7 Pro.

KTC MegPad 32-inch Android Tablet (A32Q7 Pro)

KTC’s MegPad 32-inch Android Tablet (A32Q7 Pro). Credit: Scharon Harding

KTC’s MegPad 32-inch Android Tablet (A32Q7 Pro). Credit: Scharon Harding

Over the past few years, LG has set off a strange tech trend that’s been rolling onto devices sold across Amazon and other online electronics retailers.

In 2022, the company launched the StanbyME, which is essentially a $1,000 27-inch tablet running LG’s smart TV operating system (OS), webOS, but lacking a tuner. LG’s press release announcing the device described it as a “wireless private TV screen with a built-in battery” that is easily portable and ideal for watching shows and movies, in addition to “video conferencing with family and coworkers and viewing online lectures.”

Today, the StanbyME competes against a slew of similar devices, including some from Samsung, but mostly from smaller brands and running Android.

I’ve had one of these devices, the KTC MegPad 32-inch Android Tablet (A32Q7 Pro), rolling around my home for a few weeks, and I’m left curious about what’s driving the growth of StanbyME-like devices, which are noticeably niche and expensive. I’m also uncertain whether these hybrid devices have an ongoing place in a consumer tech world already inundated with big-screen TVs, small-screen tablets, and beloved laptops.

Hands-on

Unlike LG’s StanbyME, KTC’s device doesn’t run a smart TV OS. Instead, it’s a 32-inch Android 13 tablet. Still, KTC heavily markets the MegPad’s ability to serve as streaming hardware, and that’s one of the best uses I found for it.

A big ol’ tablet on wheels. Scharon Harding

Treating the MegPad like a smart TV on wheels meant I could have a living-room-like experience in more places throughout my home. I could watch TV in bed with a more visible screen set at a more comfortable distance than what I’d achieve with a laptop or tablet. It also meant flexibility. I don’t like having a permanent TV in my room (how would I ever get out of bed?), so I appreciated the ability to roll the MegPad out of my room or twist it so that the screen faced away from me.

The MegPad is also a diplomatic solution for homes with limited TVs or computers. This could be helpful for homes with kids who have varied interests, or in a home like mine, where a speedy, 55-inch TV in the living room is by far the best screen available. I was able to let my partner take the big screen for gaming and still hang out nearby while streaming on the MegPad. I don’t have a central coffee table in my living room, but the mobile tablet enabled me to watch shows without a device weighing down my lap or making me connect a wireless speaker for better volume.

KTC’s device also has a helpful leg-up over LG’s StanbyME via its HDMI port, which makes the MegPad work like a regular monitor. Determining where to safely rest a device tethered to this mobile machine is something you’ll have to figure out on your own, though.

The port selection on the panel’s backside. Credit: Scharon Harding

Compared to the TV mounted on my living room wall, the MegPad is much easier to move from room to room, but it’s easy to overestimate how seamless transporting it is. Yes, it’s on a set of five 360-degree wheels, but the wheels don’t lock, and the device weighs 40.3 pounds, per its Amazon listing. That means I had to exert a decent amount of effort to move it over floor transition strips, across uneven floors, and from hardwood to carpet.

The charging port and power button are on the stand’s base. Credit: Scharon Harding

A fully rotating screen, however, makes up for some of my mobility complaints and diversifies the MegPad’s potential uses. Besides streaming, for example, the MegPad was great for watching yoga videos online (which calls for viewing the screen from different heights and positions). It also proved to be an ideal setup for creating a large, print-out collage, which included a lot of dragging, dropping, and cropping of images.

How the MegPad moves. Credit: KTC

Not a real TV

You can do a lot with a sizeable Android tablet. But with TV and movie watching being some of the most obvious uses, it’s important to note that neither the MegPad nor any of its rollable rivals are real TVs.

For one, there’s no tuner, though in the streaming world, that matters less to many of today’s TV viewers.

Further, the MegPad, like many StanbyME-like devices, runs Android 13, which doesn’t carry the vendor licensing fees that built-for-TV OSes such as Android TV/Google TV and webOS do. That choice has some benefits for users, too.

To start, Android 13 doesn’t have the integrated ads that Android TV or the Google TV interface does. Google claims that the Google TV platform doesn’t use automatic content recognition (ACR), but as Consumer Reports has noted, Google collects “data from TVs that use its smart TV platform—and there’s no opting out of Google’s policies during setup if you want smart TV functionality.” Further, Google may combine that data with user data from third parties for advertising purposes. A spokesperson for KTC confirmed to me that the MegPad doesn’t use ACR.

As a tablet, the MegPad is compatible with more apps, many of which aren’t supported by Google TVs, like Google Sheets, Microsoft Word, Reddit, and Signal.

Android tablets are also more appropriate for storing documents, photos, and other files than smart TVs are. Although it’s likely less roomy than your PC, the MegPad has 128GB of internal storage.

But since this is an Android tablet and not a Google TV, there are no integrated channels, nor the basic live-TV-only mode that, on a Google TV, stops the device from collecting diagnostic information. Google TV would also include a more streaming-friendly user interface and the ability to watch content from different streaming providers without switching apps.

Further differing from LG’s StanbyME and real TVs, the MegPad doesn’t include a traditional remote. The tablet comes with a basic Bluetooth mouse, but due to the tablet’s portability, I frequently found myself using it without a flat surface within arm’s reach for comfortable mouse control. The touchscreen is reliable, but gestures can be cumbersome on a tablet this large, and the display was often out of arm’s reach.

The tablet comes with this mouse and removable mouse stand. Credit: Scharon Harding

The new portable TV?

With TVs getting larger and people turning to portable gadgets like phones and laptops for TV watching, true portable TVs have become a rarity. Demand for a small device dedicated to on-the-go TV viewing has dropped significantly since the last century. Meanwhile, fabs and supply chains are built around monitor and TV-sized displays, making it difficult to incorporate some of the most desirable display technologies, like OLED, into smaller-sized panels with competitive prices.

As a result, devices like the MegPad and Amazon’s Echo Show have become the new de facto stand-ins for portable TVs, even though they’re not true TV sets. Even LG’s StanbyME Go, a 27-inch webOS-powered display packed into a briefcase, is a far cry from what most of us would traditionally consider a portable TV.

LG’s StanbyME Go at a picnic. Credit: LG

Again, these tablets have more versatility than the small, telescoping-antenna-equipped boxes you used to stick on your kitchen counter or hand to a hyper kid during road trips. But they also require a reliance on Big Tech software and all the privacy and ethical implications that come with that.

You don’t see many of these anymore. From left to right: Casio EV 570, Sony Watchman, and Casio EV 660. Credit: Richard Derk/Los Angeles Times via Getty Images

KTC also sees the MegPad’s appeal as a pseudo-TV. The MegPad’s product page emphasizes users’ ability to “watch favorite shows/movies directly—no PC needed” and to “stream Netflix [and] YouTube… more effortlessly on your smart TV.” Its Amazon product page also promotes the keywords “portable TV,” “rolling TV,” “mobile TV,” and “standing TV.” This is all despite the MegPad not technically being a true TV.

“KTC defines the MegPad A32Q7Pro as a portable, smart, touchscreen monitor,” KTC’s spokesperson told me. “It combines key traits of a smart display and a large-screen tablet. While it shares some features with smart TVs, tablets, and monitors, it doesn’t fully belong to any single traditional category. It’s a hybrid device designed to bridge those use cases.”

Android tablets on wheels

Many devices like the MegPad represent a push for more Android-powered, non-Google devices that has been buoyed by a program that Google launched in 2022, the Enterprise Devices Licensing Agreement (EDLA).

As explained by partners like BenQ, EDLA is a way for third parties to incorporate Google Mobile Services (GMS), which are Google’s most commonly used apps and APIs bundled for use across different types of devices. GMS apps include popular software like Google Drive, Gmail, the Google Play Store, and YouTube.

“Previously, GMS was only officially available for smartphones, tablets, TVs, and wearables. Under the new EDLA, the list of devices eligible for GMS certification has now been expanded to include enterprise solutions such as smart boards,” a blog from BenQ, which has EDLA-certified smart displays, reads.

Since 2022 (the year LG’s StanbyME launched), there has been an uptick in non-Google devices with this EDLA certification. One of the categories taking advantage of the newer program is tablets on wheels, like the MegPad and similar options from Kefeya, Apolosign, Innocn, and DuraPro.

Demonstrating the marketing value of EDLA certification, the MegPad’s product page reads: “Google EDLA certification provides secure, direct access to Google services and the Google Play Store with regular updates, offering greater stability and data protection than open app ecosystems with unverified apps.”

Most EDLA-certified devices seem to be interactive displays used for education. With EDLA certification, devices like the MegPad may also draw the attention of educators or even businesses. Meanwhile, Google is happy to hand out EDLA certifications, as they can drive Android adoption, giving Google more data and access to customers outside of the typical Android devices, such as phones. Products like the MegPad can also be easier to shop with (Google loves when people use its offerings to shop) than Android devices with smaller screens.

Who’s this for?

I’ve been fascinated by the MegPad and similar devices because they introduce a unique approach to streaming, web browsing, and productivity. But ultimately, they’re hard to recommend when there are other personal gadgets that are more affordable and often take up less space.

I had fun with the MegPad and appreciated the flexibility it offered, especially in my smaller NYC home. There are some specific use cases where products like this could excel, like if you want to bring a computer or screen into a room that doesn’t always need one. It was also helpful as an entertainment center for my father post-surgery, when he primarily had to lie on one side in bed.

Overall, the growing presence of devices like the MegPad underscores a confluence occurring between smart TVs, tablets, monitors, and smart displays. With software being forced into more types of displays, often in the interest of gathering more user data, it’s an interesting time to consider what you want from your next screen—be it computing power, a certain size, the omission or inclusion of web connectivity, or mobility.

It appears that the MegPad and similar tablets are trying to take advantage of the attention that LG garners when launching distinctive devices like its StanbyME line. Besides a StanbyME lookalike, Apolosign also makes a device similar to the StanbyME Go.


Apolosign’s PackGo is very similar to LG’s StanbyME Go. Credit: Apolosign

Three years after LG made TV-esque devices on wheels a talking point, more brands are trying to roll into the market. That includes LG’s best TV frenemy, Samsung, which has been using the form factor in limited geographies to drive sales of “smart monitors.”

Tech brands have ulterior motives for pushing this newer form factor that go beyond filling a gap in consumer gadgets. But if a large tablet or small smart display with wheels fits your needs, the options are there, and they should meet most expectations.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

The curious rise of giant tablets on wheels Read More »

what’s-wrong-with-aaa-games?-the-development-of-the-next-battlefield-has-answers.

What’s wrong with AAA games? The development of the next Battlefield has answers.


EA insiders describe stress and setbacks in a project that’s too big to fail.

A marketing image for Battlefield depicting soldiers and jets

After the lukewarm reception of Battlefield 2042, EA is doubling down.

After the lukewarm reception of Battlefield 2042, EA is doubling down.

It’s been 23 years since the first Battlefield game, and the video game industry is nearly unrecognizable to anyone who was immersed in it then. Many people who loved the games of that era have since become frustrated with where AAA (big budget) games have ended up.

Today, publisher EA is in full production on the next Battlefield title—but sources close to the project say it has faced culture clashes, ballooning budgets, and major disruptions that have left many team members fearful that parts of the game will not be finished to players’ satisfaction in time for launch during EA’s fiscal year.

They also say the company has made major structural and cultural changes to how Battlefield games are created to ensure it can release titles of unprecedented scope and scale. This is all to compete with incumbents like the Call of Duty games and Fortnite, even though no prior Battlefield has achieved anywhere close to that level of popular and commercial success.

I spoke with current and former EA employees who work or have recently worked directly on the game—they span multiple studios, disciplines, and seniority levels and all agreed to talk about the project on the condition of anonymity. Asked to address the reporting in this article, EA declined to comment.

According to these first-hand accounts, the changes have led to extraordinary stress and long hours. Every employee I spoke to across several studios either took exhaustion leave themselves or directly knew staffers who did. Two people who had worked on other AAA projects within EA or elsewhere in the industry said this project had more people burning out and needing to take leave than they’d ever seen before.

Each of the sources I spoke with shared sincere hopes that the game will still be a hit with players, pointing to its strong conceptual start and the talent, passion, and pedigree of its development team. Whatever the end result, the inside story of the game’s development illuminates why the medium and the industry are in the state they’re in today.

Table of Contents

The road to Glacier

To understand exactly what’s going on with the next Battlefield title—codenamed Glacier—we need to rewind a bit.

In the early 2010s, Battlefield 3 and Battlefield 4 expanded the franchise audience to more directly compete with Call of Duty, the heavy hitter at the time. Developed primarily by EA-owned, Sweden-based studio DICE, the Battlefield games mixed the franchise’s promise of combined arms warfare and high player counts with Call of Duty’s faster pace and greater platform accessibility.

This was a golden age for Battlefield. However, 2018’s Battlefield V launched to a mixed reception, and EA began losing players’ attention in an expanding industry.

Battlefield 3, pictured here, kicked off the franchise’s golden age. Credit: EA

Instead, the hot new online shooters were Overwatch (2016), Fortnite (2017), and a resurgent Call of Duty. Fortnite was driven by a popular new gameplay mode called Battle Royale, and while EA attempted a Battle Royale mode in Battlefield V, it didn’t achieve the desired level of popularity.

After V, DICE worked on a Battlefield title that was positioned as a throwback to the glory days of 3 and 4. That game would be called Battlefield 2042 (after the future year in which it was set), and it would launch in 2021.

The launch of Battlefield 2042 is where Glacier’s development story begins. Simply put, the game was not fun enough, and Battlefield 2042 launched as a dud.

Don’t repeat past mistakes

Players were disappointed—but so were those who worked on 2042. Sources tell me that prior to launch, Battlefield 2042 “massively missed” its alpha target—a milestone by which most or all of the foundational features of the game are meant to be in place. Because of this, the game’s final release would need to be delayed in order to deliver on the developers’ intent (and on players’ expectations).

“Realistically, they have to delay the game by at least six months to complete it. Now, they eventually only delayed it by, I think, four or five weeks, which from a development point of view means very little,” said one person who worked closely with the project at the time.

Developers at DICE had hoped for more time. Morale fell, but the team marched ahead to the game’s lukewarm launch.

Ultimately, EA made back some ground with what the company calls “live operations”—additional content and updates in the months following launch—but the game never fulfilled its ambitions.

Plans were already underway for the next Battlefield game, so a postmortem was performed on 2042. It concluded that the problems had been in execution, not vision. New processes were put into place so that issues could be identified earlier and milestones like the alpha wouldn’t be missed.

To help achieve this, EA hired three industry luminaries to lead Glacier, all of them based in the United States.

The franchise leadership dream team

2021 saw EA bring on Byron Beede as general manager for Battlefield; he had previously been general manager for both Call of Duty (including the Warzone Battle Royale) and the influential shooter Destiny. EA also hired Marcus Lehto—co-creator of Halo—as creative chief of a newly formed Seattle studio called Ridgeline Games, which would lead the development of Glacier’s single-player campaign.

Finally, there was Vince Zampella, one of the leaders of the team that initially created Call of Duty in 2003. He joined EA in 2010 to work on other franchises, but in 2021, EA announced that Zampella would oversee Battlefield moving forward.

In the wake of these changes, some prominent members of DICE departed, including General Manager Oskar Gabrielson and Creative Director Lars Gustavsson, who had been known by the nickname “Mr. Battlefield.” With this changing of the guard, EA was ready to place a bigger bet than ever on the next Battlefield title.

100 million players

While 2042 struggled, competitors Call of Duty and Fortnite were posting astonishing player and revenue numbers, thanks in large part to the popularity of their Battle Royale modes.

EA’s executive leadership believed Battlefield had the potential to stand toe to toe with them, if the right calls were made and enough was invested.

A lofty player target was set for Glacier: 100 million players over a defined window that included the post-launch period.

Fortnite characters looking across the many islands and vast realm of the game.

Fortnite‘s huge success has publishers like EA chasing the same dollars. Credit: Epic Games

“Obviously, Battlefield has never achieved those numbers before,” one EA employee told me. “It’s important to understand that over about that same period, 2042 has only gotten 22 million,” another said. Even 2016’s Battlefield 1—the most successful game in the franchise by numbers—had achieved “maybe 30 million plus.”

Of course, most previous Battlefield titles had been premium releases, with an up-front purchase cost and no free-to-play mode, whereas successful competitors like Fortnite and Call of Duty made their Battle Royale modes freely available, monetizing users with in-game purchases and season passes that unlocked post-launch content.

It was thought that if Glacier did the same, it could achieve comparable numbers, so a free-to-play Battle Royale mode was made a core offering for the title, alongside a six-hour single-player campaign, traditional Battlefield multiplayer modes like Conquest and Rush, a new F2P mode called Gauntlet, and a community content mode called Portal.

The most expensive Battlefield ever

All this meant that Glacier would have a broader scope than its predecessors. Developers say it has the largest budget of any Battlefield title to date.

The project targeted a budget of more than $400 million back in early 2023—already more than was originally planned.

However, major setbacks significantly disrupted production in 2023 (more on that in a moment), and hundreds of additional developers were brought onto Glacier from various EA-owned studios to get things back on track, significantly increasing the cost. Multiple team members with knowledge of the project’s finances told me that current projections are now well north of that $400 million figure.

Skepticism in the ranks

Despite the big ambitions of the new leadership team and EA executives, “very few people” working in the studios believed the 100 million target was achievable, two sources told me. Many of those who had worked on Battlefield for a long time at DICE in Stockholm were particularly skeptical.

“Among the things that we are predicting is that we won’t have to cannibalize anyone else’s sales,” one developer said. “That there’s just such an appetite out there for shooters of this kind that we will just naturally be able to get the audience that we need.”

Regarding the lofty player and revenue targets, one source said that “nothing in the market research or our quality deliverables indicates that we would be anywhere near that.”

“I think people are surprised that they actually worked on a next Battlefield game and then increased the ambitions to what they are right now,” said another.

In 2023, a significant disruption to the project put one game mode in jeopardy, foreshadowing a more troubled development than anyone initially imagined.

Ridgeline implodes

Battlefield games have a reputation for middling single-player campaigns, and Battlefield 2042 didn’t include one at all. But part of the big bet on Glacier was the idea of offering the complete package, so Ridgeline Games scaled up while working on a campaign EA hoped would keep Battlefield competitive with Call of Duty, which has usually included a single-player campaign in its releases.

The studio worked on the campaign for about two years while it was also scaling and hiring talent to catch up to established studios within the Battlefield family.

It didn’t work out. In February 2024, Ridgeline was shuttered, Halo luminary Marcus Lehto left the company, and the remaining studios were left to pick up the pieces. At a review not long before the closure, Glacier’s top leadership was dissatisfied with the progress on display, and the call was made.

Sources in EA teams outside Ridgeline told me that there weren’t proper check-ins and internal reviews on the progress, obscuring the true state of the project until the fateful review.

On the other hand, those closer to Ridgeline described a situation in which the team couldn’t possibly complete its objectives, as it was expected to hire and scale up from zero while also meeting the same milestones as established studios with resources already in place. “They kept reallocating funds—essentially staff months—out of our budget,” one person told me. “And, you know, we’re sitting there trying to adapt to doing more with less.”

A Battlefield logo with a list of studios beneath it

A marketing image from EA showing now-defunct Ridgeline Games on the list of groups involved. Credit: EA

After the shuttering of Ridgeline, ownership of single-player shifted to three other EA studios: Criterion, DICE, and Motive. But those teams had a difficult road ahead, as “there was essentially nothing left that Ridgeline had spent two years working on that they could pick up on and build, so they had to redo essentially everything from scratch within the same constraints of when the game had to release.”

Single-player was two years behind. As of late spring, it was the only game mode that had failed to reach alpha, well over a year after the initial overall alpha target for the project.

Multiple sources said Ridgeline’s implosion was symptomatic of broader cultural and process problems that affected the rest of the project, too.

Culture shock

In speaking with people who have worked or currently work at DICE in Sweden, I found obvious tension between some at that studio and the new, US-based leadership team—and to a degree, that’s expected.

DICE had “the pride of having started Battlefield and owned that IP,” but now the studio was just “supporting it for American leadership,” said one person who worked there. Further, “there’s a lot of distrust and disbelief… when it comes to just operating toward numbers that very few people believe in apart from the leadership.”

But the tensions appear to go deeper than that. Two other major factors were at play: scaling pains as the scope of the project expanded and differences in cultural values between US leadership and the workers in Europe.

“DICE being originally a Swedish studio, they are a bit more humble. They want to build the best game, and they want to achieve the greatest in terms of the game experience,” one developer told me. “Of course, when you’re operated by EA, you have to set financial expectations in order to be as profitable as possible.”

That tension wasn’t new. But before 2042 failed to meet expectations, DICE Stockholm employees say they were given more leeway to set the vision for the game, as well as greater influence on timeline and targets.

Some EU-based team members were vocally dismayed at how top-down directives from far-flung offices, along with the US company’s emphasis on quarterly profits, affected Glacier’s development far more than previous Battlefield titles.

This came up less in talking to US-based staff, but everyone I spoke with on both continents agreed on one thing: Growing pains accompanied the transition from a production environment where one studio leads and others offer support to a new setup with four primary studios—plus outside support from all over EA—and all of it helmed by LA-based leadership.

EA is not alone in adopting this approach; it’s also used by competitor Activision-Blizzard on the Call of Duty franchise (though it’s worth noting that a big hit like Epic Games’ Fortnite has a very different structure).

Whereas publishers like EA and Activision-Blizzard used to house several studios, each of which worked on its own AAA game, they now increasingly make bigger bets on singular games-as-a-service offerings, with several of their studios working in tandem on a single project.

“Development of games has changed so much in the last 10 to 15 years,” said one developer. The new arrangement excites investors and shareholders, who can imagine returns from the next big unicorn release, but it can be a less creatively fulfilling way to work, as directives come from the top down, and much time is spent on dealing with inter-studio process. Further, it amplifies the effects of failures, with a higher human cost to people working on projects that don’t meet expectations.

It has also made the problems that affected Battlefield 2042‘s development more difficult to avoid.

Clearing the gates

EA studios use a system of “gates” to set the pace of development. Projects have to meet certain criteria to pass each gate.

For gate one, teams must have a clear sense of what they want to make and some proof of concept showing that this vision is achievable.

As they approach gate two, they’re building out and testing key technology, asking themselves if it can work at scale.

Gate three signifies full production. Glacier was expected to pass gate three in early 2023, but it was significantly delayed. When it did pass, some on the ground questioned whether it should have.

“I did not see robust budget, staff plan, feature list, risk planning, et cetera, as we left gate three,” said one person. In the way EA usually works, these things would all be expected at this stage.

As the project approached gate three and then alpha, several people within the organization tried to communicate that the game wasn’t on footing as firm as the top-level planning suggested. One person attributed this to the lack of a single source of truth within the organization. While developers tracked issues and progress in one tool, others (including project leadership) leaned on other sources of information that weren’t as tied to on-the-ground reality when making decisions.

A former employee with direct knowledge of production plans told me that as gate three approached, prototypes of some important game features were not ready, but since there wasn’t time to complete proofs of concept, the decision was handed down to move ahead to production even though the normal prerequisites were not met.

“If you don’t have those things fleshed out when you’re leaving pre-pro[duction], you’re just going to be playing catch-up the entire time you’re in production,” this source said.

In some cases, employees who flagged the problems believed they were being punished. Two EA employees each told me they found themselves cut out of meetings once they raised concerns like this.

Gate three was ultimately declared clear, and as of late May 2025, alpha was achieved for everything except the single-player campaign. But I’m told that this occurred with some tasks still unestimated and many discrepancies remaining, leaving the door open to problems and compromises down the road.

The consequences for players

Because of these issues, the majority of the people I spoke with said they expect planned features or content to be cut before the game actually launches—which is normal, to a degree. But these common game development problems can contribute to other aspects of modern AAA gaming that many consumers find frustrating.

First off, making major decisions so late in the process can lead to huge day-one patches, which players of all kinds of AAA games routinely take to Reddit and social media to malign as a frustrating annoyance of modern gaming.

Battlefield 2042 had a sizable day-one patch. When multiplayer RPG Anthem (another big investment by EA) launched to negative reviews, that was partly because critics and others with pre-launch access were playing a build that was weeks old; a day-one patch significantly improved some aspects of the game, but that came after the negative press began to pour out.

A player character confronts a monster in Anthem

Anthem, another EA project with a difficult development, launched with a substantial day-one patch. Credit: EA

Glacier’s late arrival at alpha and the teams’ problems with estimating the status of features could lead to a similarly significant day-one patch. That’s in part because EA has to deliver the work to external partners far in advance of the actual launch date.

“They have these external deadlines to do with the submissions into what EA calls ‘first-party’—that’s your PlayStation and Xbox submissions,” one person explained. “They have to at least have builds ready that they can submit.”

What ends up on the disc or what pre-loads from online marketplaces must be finalized long before the game’s actual release date. When a project is far behind or prone to surprises in the final stretch, those last few weeks are where a lot of vital work happens, so big launch patches become a necessity.

These struggles over content often lead to another pet peeve of players: planned launch content being held until later. “There’s a bit of project management within the Battlefield project that they can modify,” a former senior EA employee who worked on the project explained. “They might push it into Season 1 or Season 2.”

That way, players ultimately get the intended feature or content, but in some cases, they may end up paying more for it, as it ends up being part of a post-launch package like a battle pass.

These challenges are a natural extension of the fiscal-quarter-oriented planning that large publishers like EA adhere to. “The final timelines don’t change. The final numbers don’t change,” said one source. “So there is an enormous amount of pressure.”

A campaign conundrum

Single-player is also a problem. “Single-player in itself is massively late—it’s the latest part of the game,” I was told. “Without an enormous patch on day one or early access to the game, it’s unrealistic that they’re going to be able to release it to what they needed it to do.”

If the single-player mode is a linear, narrative campaign as originally planned, it may not be possible to delay missions or other content from the campaign to post-launch seasons.

“Single-player is secondary to multiplayer, so they will shift the priority to make sure that single-player meets some minimal expectations, however you want to measure that. But the multiplayer is the main focus,” an EA employee said.

“They might have to cut a part of the single-player out in order for the game to release with a single-player [campaign] on it,” they continued. “Or they would have to severely work through the summer and into the later part of this year and try to fix that.”

That—and the potential for a disappointing product—is a cost for players, but there are costs for the developers who work on the game, too.

Because timelines must be kept, and not everything can be cut or moved post-launch, it falls on employees to make up the gap. As we’ve seen in countless similar reports about AAA video game development before, that sometimes means longer hours and heavier stress.

AAA’s burnout problem

More than two decades ago, the spouse of an EA employee famously wrote an open letter to bring attention to the long hours and high stress developers there were facing.

Since then, some things have improved. People at all levels within EA are more conscious of the problems that were highlighted, and there have been efforts to mitigate some of them, like more comp time and mental health resources. However, many of those old problems linger in some form.

I heard several first-hand accounts of people working on Glacier who had to take stress, mental health, or exhaustion leave, ranging from a couple of weeks to several months.

“There’s like—I would hesitate to count—but a large number compared to other projects I’ve been on who have taken mental exhaustion leave here. Some as short as two weeks to a month, some as long as eight months and nine,” one staffer told me after saying they had taken some time themselves.

This was partly because of long hours that were required when working directly with studios in both the US and Europe—a symptom of the new, multi-studio structure.

“My day could start as early as 5:00 [am],” one person said. The first half of the day involved meetings with a studio in one part of the world while the second included meetings with a studio in another region. “Then my evenings would be spent doing my work because I’d be tied up juggling things all across the board and across time zones.”

This sort of workload was not limited to a brief, planned period of focused work, the employees said. Long hours were particularly an issue for those working in or closely with Ridgeline, the studio initially tasked with making the game’s single-player campaign.

From the beginning, members of the Ridgeline team felt they were expected to deliver work at a similar level to that of established studios like DICE or Ripple Effect before they were even fully staffed.

“They’ve done it before,” one person who was involved with Ridgeline said of DICE. “They’re a well-oiled machine.” But Ridgeline was “starting from zero” and was “expected to produce the same stuff.”

Within just six months of the starting line, some developers at Ridgeline said they were already feeling burnt out.

In the wake of the EA Spouse letter, EA developed resources for employees. But in at least some cases, they weren’t much help.

“I sought some, I guess, mental help inside of EA. From HR or within that organization of some sort, just to be able to express it—the difficulties that I experienced personally or from coworkers on the development team that had experienced this, you know, that had lived through that,” said another employee. “And the nature of that is there’s nobody to listen. They pretend to listen, but nobody ultimately listens. Very few changes are made on the back of it.”

This person went on to say that “many people” had sought similar help and felt the same way, as far back as the post-launch period for 2042 and as recently as a few months ago.

Finding solutions

There have been a lot of stories like this about the games industry over the years, and it can feel relentlessly grim to keep reading them—especially when they’re coming alongside frequent news of layoffs, including at EA. Problems are exposed, but solutions don’t get as much attention.

In that spirit, let’s wrap up by listening to what some in the industry have said about what doing things better could look like—with the admitted caveat that these proposals are still not always common practice in AAA development.

“Build more slowly”

When Swen Vincke—studio head for Larian Studios and game director for the runaway success Baldur’s Gate 3—accepted an award at the Game Developers Conference, he took his moment on stage to express frustration at publishers like EA.

“I’ve been fighting publishers my entire life, and I keep on seeing the same, same, same mistakes over and over and over,” he said. “It’s always the quarterly profits. The only thing that matters are the numbers.”

After the awards show, he took to X to clarify his statements, saying, “This message was for those who try to double their revenue year after year. You don’t have to do that. Build more slowly and make your aim improving the state of the art, not squeezing out the last drop.”

A man stands on stage giving a speech

Swen Vincke giving a speech at the 2024 Game Developers Choice Awards. Credit: Game Developers Conference

In planning projects like Glacier, publicly traded companies often pursue huge wins—and there’s even more pressure to do so if a competing company has already achieved big success with similar titles.

But going bigger isn’t always the answer, and many in the industry believe the “one big game” strategy is increasingly nonviable.

In this attention economy?

There may not be enough player time or attention to go around, given the numerous games-as-a-service titles that are as large in scope as Call of Duty games or Fortnite. Despite the recent success of new entrant Marvel Rivals, there have been more big AAA live service shooter flops than wins in recent years.

Just last week, a data-based report from the prominent games marketing newsletter GameDiscoverCo came to a sobering realization. “Genres like Arena Shooter, Battle Royale, and Hero Shooter look amazing from a revenue perspective. But there’s only 29 games in all of Steam’s history that have grossed >$1m in those subgenres,” wrote GameDiscoverCo’s Simon Carless.

It gets worse. “Only Naraka Bladepoint, Overwatch 2 & Marvel Rivals have grossed >$25m and launched since 2020 in those subgenres,” Carless added. (It’s important to clarify that he is just talking Steam numbers here, though.) That’s a stark counterpoint to reports that Call of Duty has earned more than $30 billion in lifetime revenue.

Employees of game publishers and studios are deeply concerned about this. In a 2025 survey of professional game developers, “one of the biggest issues mentioned was market oversaturation, with many developers noting how tough it is to break through and build a sustainable player base.”

Despite those headwinds, publishers like EA are making big bets in well-established spaces rather than placing a variety of smaller bets in newer areas ripe for development. Some of the biggest recent multiplayer hits on Steam have come from smaller studios that used creative ideas, fresh genres, strong execution, and the luck (or foresight) of reaching the market at exactly the right time.

That might suggest that throwing huge teams and large budgets up against well-fortified competitors is an especially risky strategy—hence some of the anxiety from the EA developers I spoke with.

Working smarter, not harder

That anxiety has led to steadily growing unionization efforts across the industry. From QA workers at Bethesda to more wide-ranging unions at Blizzard and CD Projekt Red, there’s been more movement on this front in the past two or three years than there had been in decades beforehand.

Unionization isn’t a cure-all, and it comes with its own set of new challenges—but it does have the potential to shift some of the conversations toward more sustainable practices, so that’s another potential part of the solution.

Insomniac Games CEO Ted Price spoke authoritatively on sustainability and better work practices for the industry way back at 2021’s Develop:Brighton conference:

I think the default is to brute force the problem—in other words, to throw money or people at it, but that can actually cause more chaos and affect well-being, which goes against that balance. The harder and, in my opinion, more effective solution is to be more creative within constraints… In the stress of hectic production, we often feel we can’t take our foot off the gas pedal—but that’s often what it takes.

That means publishers and studios should plan for problems and work from accurate data about where the team actually is, but it also means being willing to give their people more time, provided the capital is available to do so.

Giving people what they need to do their jobs sounds like a simple solution to a complex problem, but it was at the heart of every conversation I had about Glacier.

Most EA developers—including leaders who are beholden to lofty targets—want to make a great game. “At the end of the day, they’re all really good people and they work really hard and they really want to deliver a good product for their customer,” one former EA developer assured me as we ended our call.

As for making the necessary shifts toward sustainability in the industry, “It’s kind of in the best interest of making the best possible game for gamers,” explained another. “I hope to God that they still achieve what they need to achieve within the timelines that they have, for the sake of Battlefield as a game to actually meet the expectations of the gamers and for people to maintain their jobs.”

Photo of Samuel Axon

Samuel Axon is the editorial lead for tech and gaming coverage at Ars Technica. He covers AI, software development, gaming, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.

What’s wrong with AAA games? The development of the next Battlefield has answers.

android-16-review:-post-hype

Android 16 review: Post-hype


Competent, not captivating

The age of big, exciting Android updates is probably over.

Android 16 on a Pixel

Android 16 is currently only available for Pixel phones. Credit: Ryan Whitwam

Google recently released Android 16, which brings a smattering of new features for Pixel phones, with promises of additional updates down the road. The numbering scheme has not been consistent over the years, and as a result, Android 16 is actually the 36th major release in a lineage that stretches back nearly two decades. In 2008, we didn’t fully understand how smartphones would work, so there was a lot of trial and error. In 2025, the formula has been explored every which way. Today’s smartphones run mature software, and that means less innovation in each yearly release. That trend is exemplified and amplified by Google’s approach to Android 16.

The latest release is perhaps the most humdrum version of the platform yet, but don’t weep for Google. The company has been working toward this goal for years: a world where the average phone buyer doesn’t need to worry about Android version numbers.

A little fun up front

When you install Android 16 on one of Google’s Pixel phones, you may need to check the settings to convince yourself that the update succeeded. Visually, the changes are so minuscule that you’ll only notice them if you’re obsessive about how Android works. For example, Google changed the style of icons in the overview screen and added a few more options to the overview app menus. There are a lot of these minor style tweaks; we expect more when Google releases Material 3 Expressive, but that’s still some way off.

There are some thoughtful UI changes, but again, they’re very minor and you may not even notice them at first. For instance, Google’s predictive back gesture, which allows the previous screen to peek out from behind the currently displayed one, now works with button navigation.

Apps targeting the new API (level 36) will now default to edge-to-edge rendering, which removes the navigation background to make apps more immersive. Android apps have long neglected larger form factors because Google itself was neglecting those devices; previously, apps could completely ignore the existence of large screens and render a phone-shaped UI on them. Since the Android 12L release a few years ago, Google has been attempting to right that wrong. Foldable phones have suffered from many of the same app-scaling issues as tablets, but all big-screen Android devices will soon benefit from adaptive apps.
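
For developers, the edge-to-edge change mostly means handling window insets explicitly. Here’s a minimal Kotlin sketch using the standard androidx inset APIs—not code from the Android 16 release itself, and the layout and view IDs are hypothetical—showing an activity that opts in and pads its content so nothing lands under the system bars:

    import android.os.Bundle
    import android.view.View
    import androidx.activity.ComponentActivity
    import androidx.activity.enableEdgeToEdge
    import androidx.core.view.ViewCompat
    import androidx.core.view.WindowInsetsCompat

    class MainActivity : ComponentActivity() {
        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            // Apps targeting API 36 get edge-to-edge by default; calling the
            // androidx helper gives the same behavior on older versions.
            enableEdgeToEdge()
            setContentView(R.layout.activity_main) // hypothetical layout

            // Content now draws behind the status and navigation bars, so pad
            // the root view by the system bar insets to keep it visible.
            val root = findViewById<View>(R.id.root) // hypothetical view ID
            ViewCompat.setOnApplyWindowInsetsListener(root) { view, insets ->
                val bars = insets.getInsets(WindowInsetsCompat.Type.systemBars())
                view.setPadding(bars.left, bars.top, bars.right, bars.bottom)
                WindowInsetsCompat.CONSUMED
            }
        }
    }

The apps most likely to look wrong under the new default are the ones that never adopted this kind of inset handling, since the system no longer reserves an opaque strip for the navigation bar on their behalf.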

Advanced Protection is a great addition to Android, even if it’s not the most riveting. Credit: Ryan Whitwam

In Android 16, apps will automatically adapt to larger screens, saving you from having to tinker with the forced aspect ratio tools built into Google and Samsung devices. Don’t confuse this with tablet-style interfaces, though. Just because an app fills the screen doesn’t guarantee that it will look good. Most of the apps we’ve run on the Pixel 9 Pro Fold still use stretched phone interfaces that waste space. Developers need to make adjustments to properly take advantage of larger screens. Will they? That’s yet another aspect of Android 16 that we hope will come later.

Security has been a focus in many recent Android updates. While not the sexiest improvement, the addition of Advanced Protection in Android 16 could keep many people from getting hit with malware, and it makes it harder for government entities to capture your data. This feature blocks insecure 2G connections, websites lacking HTTPS, and exploits over USB. It disables sideloading of apps, too, which might make some users wary. However, if you know someone who isn’t tech savvy, you should encourage them to enable Advanced Protection when (and if) they get access to Android 16. This is a great feature that Google should have added years ago.

The changes to notifications will probably make the biggest impact on your daily life. Whether you’re using Android or iOS, notification spam is getting out of hand. Every app seems to want our attention, and notifications can really pile up. Android 16 introduces a solid quality-of-life improvement by bundling notifications from each app. While notification bundles were an option before, they were primarily used for messaging, and not all developers bothered. Now, the notification shade is less overwhelming, and it’s easy to expand each block to triage individual items.
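
Android 16 applies the bundling on its own, but the grouping primitive it builds on has been in the notification APIs for years. As a rough sketch of that underlying mechanism—the channel ID, group key, and notification IDs below are illustrative, not anything Android 16-specific—posting notifications with a shared group key plus a summary is what produces a single expandable block in the shade:

    import android.content.Context
    import androidx.core.app.NotificationCompat
    import androidx.core.app.NotificationManagerCompat

    const val CHANNEL_ID = "messages"            // illustrative channel ID
    const val GROUP_KEY = "com.example.MESSAGES" // illustrative group key

    // Assumes the channel has already been created; posting also requires
    // the POST_NOTIFICATIONS permission on Android 13 and later.
    fun postGroupedMessage(context: Context, id: Int, text: String) {
        val child = NotificationCompat.Builder(context, CHANNEL_ID)
            .setSmallIcon(android.R.drawable.ic_dialog_info)
            .setContentTitle("New message")
            .setContentText(text)
            .setGroup(GROUP_KEY) // children sharing a key collapse together
            .build()

        // One summary notification per group becomes the expandable
        // header for the whole bundle.
        val summary = NotificationCompat.Builder(context, CHANNEL_ID)
            .setSmallIcon(android.R.drawable.ic_dialog_info)
            .setContentTitle("Messages")
            .setGroup(GROUP_KEY)
            .setGroupSummary(true)
            .build()

        NotificationManagerCompat.from(context).apply {
            notify(id, child)
            notify(0, summary) // fixed ID so the summary is reused, not duplicated
        }
    }

What Android 16 changes is that the shade now bundles an app’s notifications this way even when developers never bothered to set a group themselves.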

Progress notification

Android 16’s progress notifications are partially implemented in the first release. Credit: Ryan Whitwam

Google has also added a new category of notifications that can show progress, similar to a feature on the iPhone. The full notification will include a live updating bar that can tell you exactly when your Uber will show up, for example. These notifications will come first to delivery and rideshare apps, but none of them are working yet. You can get a preview of how these notifications will work with the Android 16 easter egg, which sends a little spaceship rocketing toward a distant planet.
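
To give a sense of the mechanics, here is a speculative Kotlin sketch of a determinate progress notification using the long-standing NotificationCompat APIs—the channel ID and values are invented, and this deliberately avoids any Android 16-only API, since the new live-updating presentation layers on top of this basic pattern:

    import android.content.Context
    import androidx.core.app.NotificationCompat
    import androidx.core.app.NotificationManagerCompat

    const val RIDE_CHANNEL = "ride_status" // illustrative channel ID

    // Re-posting a notification with the same ID updates it in place, which
    // is how an app ticks a progress bar forward as, say, a ride approaches.
    fun updateRideProgress(context: Context, percent: Int) {
        val notification = NotificationCompat.Builder(context, RIDE_CHANNEL)
            .setSmallIcon(android.R.drawable.ic_dialog_info)
            .setContentTitle("Your ride is on the way")
            .setContentText("$percent% of the route completed")
            .setOnlyAlertOnce(true)           // don't re-alert on each update
            .setProgress(100, percent, false) // max, current, indeterminate
            .build()

        NotificationManagerCompat.from(context).notify(42, notification)
    }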

The progress notifications will also have a large status bar chip with basic information visible at all times. Tapping on it will expand the full notification. However, this is also not implemented in the first release of Android 16. Yes, this is a recurring theme with Google’s new OS.

More fun still to come

You may notice that none of the things we’ve discussed in Android 16 are exactly riveting—better security features and cleaner notifications are nice to have, but this is hardly a groundbreaking update. It might have been more exciting were it not for the revamped release schedule, though. This Android 16 release isn’t even the Android 16. There will be a second Android 16 update later in the year, and some of the most interesting features aren’t arriving as part of either one.

Traditionally, Google has released new versions of Android in the fall, around the time new Pixel phones arrive. Android 15, for example, began its rollout in October 2024. Just eight months later, we’re on to Android 16. This is the first cycle in which Google will split its new version into two updates. Going forward, the bigger update will arrive in Q2, and the smaller one, which includes API and feature tweaks, will come at the end of the year.

Google has said the stylish but divisive Material 3 Expressive UI and the desktop windowing feature will come later. They’re currently in testing with the latest beta for Android 16 QPR1, which will become a Pixel Drop in September. It’s easy to imagine that with a single fall Android 16 release, both of these changes would have been included.

In the coming months, we expect to see some Google apps updated with support for Material 3, but the changes will be minimal unless you’re using a phone that runs Google’s Android theme. For all intents and purposes, that means a Pixel. Motorola has traditionally hewed closely to Google’s interface, while Samsung, OnePlus, and others forged their own paths. But even Moto has been diverging more as it focuses on AI. It’s possible that Google’s big UI shakeup will only affect Pixel users.

As for desktop windowing, that may have limited impact, too. On-device windowing will only be supported on tablets—even tablet-style foldables will be left out. We’ve asked Google to explain this decision and will report back if we get more details. Non-tablet devices will be able to project a desktop-style interface on an external display via USB video-out, but the feature won’t be available universally. Google tells Ars that it’s up to OEMs to support this feature. So even a phone that has video-out over USB may not have desktop windowing. Again, Pixels may be the best (or only) way to get Android’s new desktop mode.

The end of version numbers

There really isn’t much more to say about Android 16 as it currently exists. This update isn’t flashy, but it lays important groundwork for the future. The addition of Material 3 Expressive will add some of the gravitas we expect from major version bumps, but it’s important to remember that this is just Google’s take on Android—other companies have their own software interests, mostly revolving around AI. We’ll have to wait to see what Samsung, OnePlus, and others do with the first Android 16 release. The underlying software has been released in the Android Open Source Project (AOSP), but it will be a few months before other OEMs have updates.

In some ways, boring updates are exactly what Google has long wanted from Android. Consider the era when Android updates were undeniably exciting—a time when the addition of screenshots could be a headlining feature (Android 4.0 Ice Cream Sandwich) or when Google finally figured out how to keep runaway apps from killing your battery (Android 6.0 Marshmallow). But there was a problem with these big tentpole updates: Not everyone got them, and those left waiting were salty about it.

During the era of rapid software improvement, it took the better part of a year (or longer!) for a company like Samsung or LG to deploy new Android updates. Google would announce a laundry list of cool features, but only the tiny sliver of people using Nexus (and later Pixel) phones would see them. By the time a Samsung Galaxy user had the new version, it was time for Google to release another yearly update.

This “fragmentation” issue was a huge headache for Google, leading it to implement numerous platform changes over the years to take the pressure off its partners and app developers. There were simple tweaks like adding important apps, including Maps and the keyboard (later Gboard), to the Play Store so they could be updated regularly. On the technical side, initiatives like Project Mainline made the platform more modular so features could be added and improved outside of major updates. Google has also meticulously moved features into Play Services, which can deliver system-level changes without an over-the-air update (although there are drawbacks to that).

Android I/O sign

Android version numbers hardly matter anymore—it’s just Android. Credit: Ryan Whitwam

The overarching story of Android has been a retreat from monolithic updates, and that means there’s less to get excited about when a new version appears. Rather than releasing a big update rife with changes, Google has shown a preference for rolling out features via the Play Store and Play Services to the entire Android ecosystem. Experiences like Play Protect anti-malware, Google Play Games, Google Cast, Find My Device, COVID-19 exposure alerts, Quick Share, and myriad more were released to almost all Google-certified Android devices without system updates.

As more features arrive in dribs and drabs via Play Services and Pixel Drops, the numbered version changes are less important. People used to complain about missing out on the tentpole updates, but it’s quieter when big features are decoupled from version numbers. And that’s where we are—Android 15 or Android 16—the number is no longer important. You won’t notice a real difference, but the upshot is that most phones get new features faster than they once did. That was the cost to fix fragmentation.

Boring updates aren’t just a function of rearranging features. Even if all the promised upgrades were here now, Android 16 would still barely move the needle. Phones are now mature products with established usage paradigms. It’s been almost 20 years since the age of touchscreen smartphones began, and we’ve figured out how these things should work. It’s not just Android updates settling into prosaic predictability—Apple is running low on paradigm shifts, too. The release of iOS 26 will add some minor improvements to a few apps, and the theme is getting more transparent with the controversial “Liquid Glass” UI. And that’s it.

Until there’s a marked change in form factors or capability, these flat glass slabs will look and work more or less as they do now (with a lot more AI slop, whether you like it or not). If you have a recent non-Pixel Android device, you’ll probably get Android 16 in the coming months, but it won’t change the way you use your phone.

Photo of Ryan Whitwam

Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he’s written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards.
