Tiny chips hitch a ride on immune cells to sites of inflammation


Tiny chips can be powered by infrared light if they’re near the brain’s surface.

An immune cell chemically linked to a CMOS chip. Credit: Yadav, et al.

Standard brain implants use electrodes that penetrate the gray matter to stimulate and record the activity of neurons, and they typically need to be put in place via a surgical procedure. To get around that need, a team of researchers led by Deblina Sarkar, an electrical engineer and MIT assistant professor, developed microscopic electronic devices hybridized with living cells. These hybrids can be injected into the circulatory system with a standard syringe and will travel through the bloodstream before implanting themselves in target brain areas.

“In the first two years of working on this technology at MIT, we’ve got 35 grant proposals rejected in a row,” Sarkar says. “Comments we got from the reviewers were that our idea was very impactful, but it was impossible.” She acknowledges that the proposal sounded like something you can find in science fiction novels. But after more than six years of research, she and her colleagues have pulled it off.

Nanobot problems

In 2022, when Sarkar and her colleagues gathered initial data and got some promising results with their cell-electronics hybrids, the team proposed the project for the National Institutes of Health Director’s New Innovator Award. For the first time, after 35 rejections, it made it through peer review. “We got the highest impact score ever,” Sarkar says.

The reason for that score was that her technology solved three extremely difficult problems. The first, obviously, was making functional electronic devices smaller than cells that can circulate in our blood.

“Previous explorations, which had not seen a lot of success, relied on putting magnetic particles inside the bloodstream and then guiding them with magnetic fields,” Sarkar explains. “But there is a difference between electronics and particles.” Electronics made using CMOS technology (which we use for making computer processors) can generate electrical power from incoming light in the same way as photovoltaics, as well as perform computations necessary for more intelligent applications like sensing. Particles, on the other hand, can only be used to stimulate cells to an extent.

If they ever reach those cells, of course, which was the second problem. “Controlling the devices with magnetic fields means you need to go into a machine the size of an MRI,” Sarkar says. Once the subject is in the machine, an operator looks at where the devices are and tries to move them to where they need to be using nothing but magnetic fields. Sarkar says it’s tough to do anything other than move the particles in straight lines, which is a poor match for our very complex vasculature.

The solution her team found was fusing the electronics with monocytes, immune cells that can home in on inflammation in our bodies. The idea was that the monocytes would carry the electronics through the bloodstream using the cells’ chemical homing mechanism. This also solved the third problem: crossing the blood-brain barrier that protects the brain from pathogens and toxins. Electronics alone could not get through it; monocytes could.

The challenge was making all these ideas work.

Clicking together

Sarkar’s team built electronic devices made of biocompatible polymer and metallic layers fabricated on silicon wafers using a standard CMOS process. “We made the devices this small with lithography, the technique used in making transistors for chips in our computers,” Sarkar explains. They were roughly 200 nanometers thick and 10 microns in diameter—that kept them subcellular, since a monocyte cell usually measures between 12 and 18 microns. The devices were activated and powered by infrared light at a wavelength that could penetrate several centimeters into the brain.

Once the devices were manufactured and taken off the wafer, the next thing to figure out was attaching them to monocytes.

To do this, the team covered the surfaces of the electronic devices with dibenzocyclooctyne, a very reactive molecule that can easily link to other chemicals, especially nitrogen compounds called azides. Then Sarkar and her colleagues chemically modified monocytes to place azides on their surfaces. This way, the electronics and cells could quickly snap together, almost like Lego blocks (this approach, called click chemistry, won the 2022 Nobel Prize in Chemistry).

The resulting solution of cell-electronics hybrids was designed to be biocompatible and could be injected into the circulatory system. This is why Sarkar called her concept “circulatronics.”

Of course, Sarkar’s “circulatronic” hybrids fall a bit short of sci-fi fantasies, in that they aren’t exactly literal nanobots. But they may be the closest thing we’ve created so far.

Artificial neurons

To test these hybrids in live mice, the researchers prepared a fluorescent version to make them easier to track. Mice were anesthetized first, and the team artificially created inflammation at a specific location in their brains, around the ventrolateral thalamic nucleus. Then the hybrids were injected into the veins of the mice. After roughly 72 hours, the time scientists expected would be needed for the monocytes to reach the inflammation, Sarkar and her colleagues started running tests.

It turned out that most of the injected hybrids reached their destination in one piece—the electronics mostly remained attached to the monocytes. The team’s measurements suggest that around 14,000 hybrids managed to successfully implant themselves near the neurons in the target area of the brain. Then, in response to infrared irradiation, they caused significant neuronal activation, comparable to traditional electrodes implanted via surgery.

The real strength of the hybrids, Sarkar thinks, is the way they can be tuned to specific diseases. “We chose monocytes for this experiment because inflammation spots in the brain are usually the target in many neurodegenerative diseases,” Sarkar says. Depending on the application, though, the hybrids’ performance can be adjusted by manipulating their electronic and cellular components. “We have already tested using mesenchymal stem cells for the Alzheimer’s, or T cells and other neural stem cells for tumors,” Sarkar explains.

She went on to say that her technology one day may help with placing the implants in brain regions that today cannot be safely reached through surgery. “There is a brain cancer called glioblastoma that forms diffused tumor sites. Another example is DIPG [a form of glioma], which is a terminal brain cancer in children that develops in a region where surgery is impossible,” she adds.

But in the more distant future, the hybrids could find applications beyond targeting diseases. Most of the studies that have relied on data from brain implants were limited to participants who suffered from severe brain disorders. The implants were put in their brains for therapeutic reasons, and participating in research projects was something they just agreed to do on the side.

Because the electronics in Sarkar’s hybrids can be designed to fully degrade after a set time, the team thinks this could potentially enable them to gather brain implant data from healthy people—the implants would do their job for the duration of the study and be gone once it’s done. Unless we want them to stay, that is.

“The ease of application can make the implants feasible in brain-computer interfaces designed for healthy people,” Sarkar argues. “Also, the electrodes can be made to work as artificial neurons. In principle, we could enhance ourselves—increase our neuronal density.”

First, though, the team wants to put the hybrids through a testing campaign on larger animals and then get them FDA-approved for clinical trials. Through Cahira Technologies, an MIT spinoff company founded to take the “circulatronics” technology to the market, Sarkar wants to make this happen within the next three years.

Nature Biotechnology, 2025. DOI: 10.1038/s41587-025-02809-3

Photo of Jacek Krywko

Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.


Valve says it’s still waiting for better chips to power Steam Deck 2

Yesterday’s announcement of new living room and VR hardware from Valve obviously has many gamers clamoring for any news of a more powerful version of the nearly 4-year-old Steam Deck. In a new interview with IGN, though, Valve Software Engineer Pierre-Loup Griffais says that portable gaming silicon still hasn’t advanced enough to justify brand-new benchmark hardware.

“The thing we’re making sure of is that it’s a worthwhile enough performance upgrade [for a Steam Deck 2] to make sense as a standalone product,” Griffais told IGN. “We’re not interested in getting to a point where it’s 20 or 30 or even 50 percent more performance at the same battery life. We want something a little bit more demarcated than that.”

“So we’ve been working back from silicon advancements and architectural improvements, and I think we have a pretty good idea of what the next version of Steam Deck is going to be, but right now there’s no offerings in that landscape, in the SoC [System on a Chip] landscape, that we think would truly be a next-gen performance Steam Deck,” Griffais continued.

More power, but at what cost?

At first glance, Griffais’ comments might seem to run counter to the advancements we’ve seen in portable PC gaming handhelds in recent years. The eight-core Zen 5-based AMD chip in the recently launched ROG Xbox Ally X, for instance, is significantly more powerful than the four-core Zen 2 chip in the Steam Deck. The newer handheld can push out decent-quality 1080p graphics at reasonable frame rates for many recent games that the old Steam Deck struggles to run at all.

Keep in mind, though, that Griffais said Valve is focused on getting those kinds of performance improvements “at the same battery life.” The ROG Xbox Ally X has a 50 percent larger battery than the original Steam Deck, and it still fully drains that battery in around two hours when running the most taxing games in “Turbo” mode.


With another record broken, the world’s busiest spaceport keeps getting busier


It’s not just the number of rocket launches, but how much stuff they’re carrying into orbit.

With 29 Starlink satellites onboard, a Falcon 9 rocket streaks through the night sky over Cape Canaveral Space Force Station, Florida, on Monday night. Credit: Stephen Clark/Ars Technica

CAPE CANAVERAL, Florida—Another Falcon 9 rocket fired off its launch pad here on Monday night, taking with it another 29 Starlink Internet satellites to orbit.

This was the 94th orbital launch from Florida’s Space Coast so far in 2025, breaking the previous record for the most orbital launches in a calendar year from the world’s busiest spaceport. Monday night’s launch came two days after a Chinese Long March 11 rocket lifted off from an oceangoing platform on the opposite side of the world, marking humanity’s 255th mission to reach orbit this year, a new annual record for global launch activity.

As of Wednesday, a handful of additional missions have pushed the global figure this year to 259, putting the world on pace for around 300 orbital launches by the end of 2025. That would more than double the global tally of 135 orbital launches in 2021.

Routine vs. complacency

Waiting in the darkness a few miles away from the launch pad, I glanced around at my surroundings before watching SpaceX’s Falcon 9 thunder into the sky. There were no throngs of space enthusiasts anxiously waiting for the rocket to light up the night. No line of photographers snapping photos. Just this reporter and two chipper retirees enjoying what a decade ago would have attracted far more attention.

Go to your local airport and you’ll probably find more people posted up at a plane-spotting park at the end of the runway. Still, a rocket launch is something special. On the same night that I watched the 94th launch of the year depart from Cape Canaveral, Orlando International Airport saw the same number of airplane departures in just three hours.

The crowds still turn out for more meaningful launches, such as a test flight of SpaceX’s Starship megarocket in Texas or Blue Origin’s attempt to launch its second New Glenn heavy-lifter here Sunday. But those are not the norm. Generations of aerospace engineers were taught never to treat spaceflight as routine, for fear that complacency leads to failure and, in some cases, death.

Compared to air travel, the mantra remains valid. Rockets are unforgiving, with engines operating under extreme pressures, at high thrust, and unable to suck in oxygen from the atmosphere as a reactant for combustion. There are fewer redundancies in a rocket than in an airplane.

The Falcon 9’s established failure rate is less than 1 percent, well short of any safety standard for commercial air travel but good enough to make it the most successful orbital-class rocket in history. Given the Falcon 9’s track record, SpaceX seems to have found a way to overcome the temptation for complacency.

A Chinese Long March 11 rocket carrying three Shiyan 32 test satellites lifts off from waters off the coast of Haiyang in eastern China’s Shandong province on Saturday. Credit: Guo Jinqi/Xinhua via Getty Images

Following the trend

The upward trend in rocket launches hasn’t always been the case. Launch numbers were steady for most of the 2010s, following a downward trend in the 2000s, with as few as 52 orbital launches in 2005, the lowest number since the nascent era of spaceflight in 1961. There were just seven launches from here in Florida that year.

The numbers have picked up dramatically in the last five years as SpaceX has mastered reusable rocketry.

It’s important to look at not just the number of launches but also how much stuff rockets are actually putting into orbit. More than half of this year’s launches were performed using SpaceX’s Falcon 9 rocket, and the majority of those deployed Starlink satellites for SpaceX’s global Internet network. Each spacecraft is relatively small in size and weight, but SpaceX stacks up to 29 of them on a single Falcon 9 to max out the rocket’s carrying capacity.

All this mass adds up to make SpaceX’s dominance of the launch industry appear even more absolute. According to analyses by BryceTech, an engineering and space industry consulting firm, SpaceX has launched 86 percent of all the world’s payload mass over the 18 months from the beginning of 2024 through June 30 of this year.

That’s roughly 2.98 million kilograms of the approximately 3.46 million kilograms (3,281 of 3,819 tons) of satellite hardware and cargo that all the world’s rockets placed into orbit during that timeframe.
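
As a quick sanity check on those proportions, here is a back-of-the-envelope conversion using only the figures quoted above. This is an illustrative sketch: the assumption that BryceTech’s “tons” are US short tons (907.185 kg each) is ours, not a detail confirmed by the firm.

```python
# Back-of-the-envelope check of the payload-mass share quoted above.
# Assumes "tons" means US short tons (907.185 kg each) -- an assumption,
# not a detail confirmed by BryceTech.
KG_PER_SHORT_TON = 907.185

spacex_tons = 3_281  # payload mass attributed to SpaceX, Jan 2024 through June 2025
total_tons = 3_819   # payload mass for all launch providers over the same span

spacex_kg = spacex_tons * KG_PER_SHORT_TON  # ~2.98 million kg
total_kg = total_tons * KG_PER_SHORT_TON    # ~3.46 million kg
share = spacex_tons / total_tons            # ~0.86

print(f"SpaceX: {spacex_kg / 1e6:.2f}M kg of {total_kg / 1e6:.2f}M kg ({share:.0%})")
```

The result lines up with the roughly 86 percent share cited above.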

The charts below were created by Ars Technica using publicly available launch numbers and payload mass estimates from BryceTech. The first illustrates the rising launch cadence at Cape Canaveral Space Force Station and NASA’s Kennedy Space Center, located next to one another in Florida. Launches from other US-licensed spaceports, primarily Vandenberg Space Force Base, California, and Rocket Lab’s base at Māhia Peninsula in New Zealand, are also on the rise.

These numbers represent rockets that reached low-Earth orbit. We didn’t include test flights of SpaceX’s Starship rocket in the chart because all of its launches to date have intentionally flown on suborbital trajectories.

In the second chart, we break down the payload upmass to orbit from SpaceX, other US companies, China, Russia, and other international launch providers.

Launch rates are on a clear upward trend, while SpaceX has launched 86 percent of the world’s total payload mass to orbit since the beginning of 2024. Credit: Stephen Clark/Ars Technica/BryceTech

Will it continue?

It’s a good bet that payload upmass will continue to rise in the coming years, with heavy cargo heading to orbit to further expand SpaceX’s Starlink communications network and build out new megaconstellations from Amazon, China, and others. The US military’s Golden Dome missile defense shield will also have a ravenous appetite for rockets to get it into space.

SpaceX’s Starship megarocket could begin flying to low-Earth orbit next year, and if it does, SpaceX’s preeminence in delivering mass to orbit will remain assured. Starship’s first real payloads will likely be SpaceX’s next-generation Starlink satellites. These larger, heavier, more capable spacecraft will launch 60 at a time on Starship, further stretching SpaceX’s lead in the upmass war.

But Starship’s arrival will come at the expense of the workhorse Falcon 9, which lacks the capacity to haul the next-gen Starlinks to orbit. “This year and next year I anticipate will be the highest Falcon launch rates that we will see,” said Stephanie Bednarek, SpaceX’s vice president of commercial sales, at an industry conference in July.

SpaceX is on pace for between 165 and 170 Falcon 9 launches this year, with 144 flights already in the books for 2025. Last year’s total for Falcon 9 and Falcon Heavy was 134 missions. SpaceX has not announced how many Falcon 9 and Falcon Heavy launches it plans for next year.

Starship is designed to be fully and rapidly reusable, eventually enabling multiple flights per day. But that’s still a long way off, and it’s unknown how many years it might take for Starship to surpass the Falcon 9’s proven launch tempo.

A Starship rocket and Super Heavy booster lift off from Starbase, Texas. Credit: SpaceX

In any case, with Starship’s heavy-lifting capacity and upgraded next-gen satellites, SpaceX could match an entire year’s worth of new Starlink capacity with just two fully loaded Starship flights. Starship will be able to deliver 60 times more Starlink capacity to orbit than a cluster of satellites riding on a Falcon 9.

There’s no reason to believe SpaceX will be satisfied with simply keeping pace with today’s Starlink growth rate. There are emerging market opportunities in connecting satellites with smartphones, space-based computer processing and data storage, and military applications.

Other companies have medium-to-heavy rockets that are either new to the market or soon to debut. These include Blue Origin’s New Glenn, now set to make its second test flight in the coming days, with a reusable booster designed to facilitate a rapid-fire launch cadence.

Despite all of the newcomers, most satellite operators see a shortage of launch capacity on the commercial market. “The industry is likely to remain supply-constrained through the balance of the decade,” wrote Caleb Henry, director of research at the industry analysis firm Quilty Space. “That could pose a problem for some of the many large constellations on the horizon.”

United Launch Alliance’s Vulcan rocket, Rocket Lab’s Neutron, Stoke Space’s Nova, Relativity Space’s Terran R, and Firefly Aerospace and Northrop Grumman’s Eclipse are among the other rockets vying for a bite at the launch apple.

“Whether or not the market can support six medium to heavy lift launch providers from the US alone—plus Starship—is an open question, but for the remainder of the decade launch demand is likely to remain high, presenting an opportunity for one or more new players to establish themselves in the pecking order,” Henry wrote in a post on Quilty’s website.

China’s space program will need more rockets, too. That nation’s two megaconstellations, known as Guowang and Qianfan, will have thousands of satellites, requiring a significant uptick in Chinese launches.

Taking all of this into account, the demand curve for access to space is sure to continue its upward trajectory. How companies meet this demand, and with how many discrete departures from Earth, isn’t quite as clear.

Photo of Stephen Clark

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.


You won’t believe the excuses lawyers have after getting busted for using AI


I got hacked; I lost my login; it was a rough draft; toggling windows is hard.

Credit: Aurich Lawson | Getty Images


Amid what one judge called an “epidemic” of fake AI-generated case citations bogging down courts, some common excuses are emerging from lawyers hoping to dodge the most severe sanctions for filings deemed misleading.

Using a database compiled by French lawyer and AI researcher Damien Charlotin, Ars reviewed 23 cases where lawyers were sanctioned for AI hallucinations. In many, judges noted that the simplest path to avoid or diminish sanctions was to admit that AI was used as soon as it was detected, act humble, self-report the error to relevant legal associations, and voluntarily take classes on AI and law. But not every lawyer takes the path of least resistance, Ars’ review found, with many instead offering excuses that no judge found credible. Some even lied about their AI use, judges concluded.

Since 2023—when fake AI citations started being publicized—the most popular excuse has been that the lawyer didn’t know AI was used to draft a filing.

Sometimes that means arguing that you didn’t realize you were using AI, as in the case of a California lawyer who got stung by Google’s AI Overviews, which he claimed he took for typical Google search results. Most often, lawyers using this excuse tend to blame an underling, but clients have been blamed, too. A Texas lawyer was sanctioned this month after deflecting so much that the court eventually had to put his client on the stand, once he revealed she had played a significant role in drafting the errant filing.

“Is your client an attorney?” the court asked.

“No, not at all your Honor, just was essentially helping me with the theories of the case,” the lawyer said.

Another popular dodge comes from lawyers who feign ignorance that chatbots are prone to hallucinating facts.

Recent cases suggest this excuse may be mutating into variants. Last month, a sanctioned Oklahoma lawyer admitted that he didn’t expect ChatGPT to add new citations when all he asked the bot to do was “make his writing more persuasive.” And in September, a California lawyer got in a similar bind—and was sanctioned a whopping $10,000, a fine the judge called “conservative.” That lawyer had asked ChatGPT to “enhance” his briefs, “then ran the ‘enhanced’ briefs through other AI platforms to check for errors,” neglecting to ever read the “enhanced” briefs.

Neither of those tired old excuses holds much weight today, especially in courts that have drawn up guidance to address AI hallucinations. But rather than quickly acknowledge their missteps, as courts are begging lawyers to do, several lawyers appear to have gotten desperate. Ars found a number of them citing common tech issues as the reason fake cases ended up in their filings.

When in doubt, blame hackers?

For an extreme case, look to a New York City civil court, where a lawyer, Innocent Chinweze, first admitted to using Microsoft Copilot to draft an errant filing, then bizarrely pivoted to claim that the AI citations were due to malware found on his computer.

Chinweze said he had created a draft with correct citations but then got hacked, allowing bad actors “unauthorized remote access” to supposedly add the errors in his filing.

The judge was skeptical, describing the excuse as an “incredible and unsupported statement,” particularly since there was no evidence of the prior draft existing. Instead, Chinweze asked to bring in an expert to testify that the hack had occurred, requesting to end the proceedings on sanctions until after the court weighed the expert’s analysis.

The judge, Kimon C. Thermos, didn’t have to weigh this argument, however, because after the court broke for lunch, the lawyer once again “dramatically” changed his position.

“He no longer wished to adjourn for an expert to testify regarding malware or unauthorized access to his computer,” Thermos wrote in an order issuing sanctions. “He retreated” to “his original position that he used Copilot to aid in his research and didn’t realize that it could generate fake cases.”

Possibly more galling to Thermos than the lawyer’s weird malware argument, though, was a document that Chinweze filed on the day of his sanctions hearing. That document included multiple summaries preceded by this text, the judge noted:

Some case metadata and case summaries were written with the help of AI, which can produce inaccuracies. You should read the full case before relying on it for legal research purposes.

Thermos admonished Chinweze for continuing to use AI recklessly. He blasted the filing as “an incoherent document that is eighty-eight pages long, has no structure, contains the full text of most of the cases cited,” and “shows distinct indications that parts of the discussion/analysis of the cited cases were written by artificial intelligence.”

Ultimately, Thermos ordered Chinweze to pay $1,000, the most typical fine lawyers received in the cases Ars reviewed. The judge then took an extra non-monetary step to sanction Chinweze, referring the lawyer to a grievance committee, “given that his misconduct was substantial and seriously implicated his honesty, trustworthiness, and fitness to practice law.”

Ars could not immediately reach Chinweze for comment.

Toggling windows on a laptop is hard

In Alabama, an attorney named James A. Johnson made an “embarrassing mistake,” he said, primarily because toggling windows on a laptop is hard, US District Judge Terry F. Moorer noted in an October order on sanctions.

Johnson explained that he had accidentally used an AI tool that he didn’t realize could hallucinate. It happened while he was “at an out-of-state hospital attending to the care of a family member recovering from surgery.” He rushed to draft the filing, he said, because he got a notice that his client’s conference had suddenly been “moved up on the court’s schedule.”

“Under time pressure and difficult personal circumstance,” Johnson explained, he decided against using Fastcase, a research tool provided by the Alabama State Bar, to research the filing. Working on his laptop, he opted instead to use “a Microsoft Word plug-in called Ghostwriter Legal” because “it appeared automatically in the sidebar of Word while Fastcase required opening a separate browser to access through the Alabama State Bar website.”

To Johnson, it felt “tedious to toggle back and forth between programs on [his] laptop with the touchpad,” and that meant he “unfortunately fell victim to the allure of a new program that was open and available.”

Moorer seemed unimpressed by Johnson’s claim that he understood tools like ChatGPT were unreliable but didn’t expect the same from other AI legal tools—particularly since “information from Ghostwriter Legal made it clear that it used ChatGPT as its default AI program,” Moorer wrote.

The lawyer’s client was similarly put off, deciding to drop Johnson on the spot, even though that risked “a significant delay of trial.” Moorer noted that Johnson seemed shaken by his client’s abrupt decision, evidenced by “his look of shock, dismay, and display of emotion.”

Moorer further noted that Johnson seemingly let AI do his homework while working on behalf of the government, and, as the judge put it, “public funds for appointed counsel are not a bottomless well and are limited resource.” Switching to a new lawyer could eat up more of that money.

“It has become clear that basic reprimands and small fines are not sufficient to deter this type of misconduct because if it were, we would not be here,” Moorer concluded.

Ruling that Johnson’s reliance on AI was “tantamount to bad faith,” Moorer imposed a $5,000 fine. The judge also would have “considered potential disqualification, but that was rendered moot” since Johnson’s client had already dismissed him.

Asked for comment, Johnson told Ars that “the court made plainly erroneous findings of fact and the sanctions are on appeal.”

Plagued by login issues

As a lawyer in Georgia tells it, fake AI citations sometimes end up in a filing because the lawyer accidentally submitted a rough draft instead of the final version.

Other lawyers claim they turn to AI as needed when they have trouble accessing legal tools like Westlaw or LexisNexis.

For example, in Iowa, a lawyer told an appeals court that she regretted relying on “secondary AI-driven research tools” after experiencing “login issues with her Westlaw subscription.” Although the court was “sympathetic to issues with technology, such as login issues,” the lawyer was sanctioned, primarily because she only admitted to using AI after the court ordered her to explain her mistakes. In her case, however, she got to choose between paying a minimal $150 fine or attending “two hours of legal ethics training particular to AI.”

Less sympathetic was a lawyer who got caught lying about the AI tool she blamed for inaccuracies, a Louisiana case suggested. In that case, a judge demanded to see the research history after a lawyer claimed that AI hallucinations came from “using Westlaw Precision, an AI-assisted research tool, rather than Westlaw’s standalone legal database.”

It turned out that the lawyer had outsourced the research, relying on a “currently suspended” lawyer’s AI citations, and had only “assumed” the lawyer’s mistakes were from Westlaw’s AI tool. It’s unclear what tool was actually used by the suspended lawyer, who likely lost access to a Westlaw login, but the judge ordered a $1,000 penalty after the lawyer who signed the filing “agreed that Westlaw did not generate the fabricated citations.”

Judge warned of “serial hallucinators”

Another lawyer, William T. Panichi in Illinois, has been sanctioned at least three times, Ars’ review found.

In response to his initial penalties ordered in July, he admitted to being tempted by AI while he was “between research software.”

In that case, the court was frustrated to find that the lawyer had contradicted himself, and it ordered more severe sanctions as a result.

Panichi “simultaneously admitted to using AI to generate the briefs, not doing any of his own independent research, and even that he ‘barely did any personal work [him]self on this appeal,’” the court order said, while also defending charging a higher fee—supposedly because this case “was out of the ordinary in terms of time spent” and his office “did some exceptional work” getting information.

The court deemed this AI misuse so bad that Panichi was ordered to disgorge a “payment of $6,925.62 that he received” in addition to a $1,000 penalty.

“If I’m lucky enough to be able to continue practicing before the appellate court, I’m not going to do it again,” Panichi told the court in July, just before getting hit with two more rounds of sanctions in August.

Panichi did not immediately respond to Ars’ request for comment.

When AI-generated hallucinations are found, penalties are often paid to the court, the other parties’ lawyers, or both, depending on whose time and resources were wasted fact-checking fake cases.

Lawyers seem more likely to argue against paying sanctions to the other parties’ attorneys, hoping to keep sanctions as low as possible. One lawyer even argued that “it only takes 7.6 seconds, not hours, to type citations into LexisNexis or Westlaw,” while seemingly neglecting the fact that she did not take those precious seconds to check her own citations.

The judge in the case, Nancy Miller, was clear that “such statements display an astounding lack of awareness of counsel’s obligations,” noting that “the responsibility for correcting erroneous and fake citations never shifts to opposing counsel or the court, even if they are the first to notice the errors.”

“The duty to mitigate the harms caused by such errors remains with the signor,” Miller said. “The sooner such errors are properly corrected, either by withdrawing or amending and supplementing the offending pleadings, the less time is wasted by everyone involved, and fewer costs are incurred.”

Texas US District Judge Marina Garcia Marmolejo agreed, explaining that even more time is wasted determining how other judges have responded to fake AI-generated citations.

“At one of the busiest court dockets in the nation, there are scant resources to spare ferreting out erroneous AI citations in the first place, let alone surveying the burgeoning caselaw on this subject,” she said.

At least one Florida court was “shocked, shocked” to find that a lawyer was refusing to pay what the other party’s attorneys said they were owed after misusing AI. The lawyer in that case, James Martin Paul, asked to pay less than a quarter of the fees and costs owed, arguing that Charlotin’s database showed he might otherwise owe penalties that “would be the largest sanctions paid out for the use of AI generative case law to date.”

But caving to Paul’s arguments “would only benefit serial hallucinators,” the Florida court found. Ultimately, Paul was sanctioned more than $85,000 for what the court said was “far more egregious” conduct than other offenders in the database, chastising him for “repeated, abusive, bad-faith conduct that cannot be recognized as legitimate legal practice and must be deterred.”

Paul did not immediately respond to Ars’ request to comment.

Michael B. Slade, a US bankruptcy judge in Illinois, seems to be done weighing excuses, calling on all lawyers to stop taking AI shortcuts that are burdening courts.

“At this point, to be blunt, any lawyer unaware that using generative AI platforms to do legal research is playing with fire is living in a cloud,” Slade wrote.

This story was updated on November 11 to clarify a judge’s comments on misuse of public funds.

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.


Variously Effective Altruism

This post is a roundup of various things related to philanthropy, as you often find in the full monthly roundup.

Peter Thiel warned Elon Musk to ditch The Giving Pledge because Bill Gates will give his wealth away ‘to left-wing nonprofits.’

As John Arnold points out, this seems highly confused. The Giving Pledge is a promise to give away your money, not a promise to let Bill Gates give away your money. The core concern, that your money ends up going to causes one does not believe in (and probably highly inefficiently at that), seems real: once you send money into a foundation ecosystem, it by default gets captured by foundation-style people.

As he points out, ‘let my children handle it’ is not a great answer, and would be especially poor for Musk given the likely disagreements over values, especially if you don’t actually give those children that much free and clear (and thus, are being relatively uncooperative, so why should they honor your preferences?). There are no easy answers.

A new paper goes Full Hanson with the question Does Maximizing Good Make People Look Bad? They answer yes: if you give deliberately rather than empathetically and seek to maximize impact, this is viewed as less moral and you are seen as a less desirable social partner, and donors estimate this effect roughly correctly. Which makes sense if you consider that one advantage of being a social partner is that you can direct your partners with social and emotional appeals, and thereby extract their resources. As with so many other things, you can be someone or do something, and if you focus on one you have to sacrifice some of the other.

This is one place where the core idea of Effective Altruism is pretty great. You create a community of people where it is socially desirable to be deliberative, and scorn is put on those who are empathic instead. If that was all EA did, without trying to drum up more resources or direct how people deliberated? That alone is a big win.

UATX eliminates tuition forever as the result of a $100 million gift from Jeff Yass. Well, hopefully. This gift alone doesn’t fund that, they’re counting on future donations from grateful students, so they might have to back out of this the way Rice had to in 1965. One could ask, given schools like Harvard, Yale and Stanford make such bets and have wildly successful graduates who give lots of money, and still charge tuition, what is the difference?

In general giving to your Alma Mater or another university is highly ineffective altruism. One can plausibly argue that fully paying for everyone’s tuition, with an agreement to that effect, is a lot better than giving to the general university fund, especially if you’re hoping for a cascade effect. It would be a highly positive cultural shift if selective colleges stopped charging tuition. Is that the best use of $100 million? I mean, obviously not even close, but it’s not clear that it is up against the better uses.

Will MacAskill asks what Effective Altruism should do now that AI is making rapid progress and there is a large distinct AI safety movement. He argues EA should embrace the mission of making the transition to a post-AGI society go well.

Will MacAskill: This third way will require a lot of intellectual nimbleness and willingness to change our minds. Post-FTX, much of EA adopted a “PR mentality” that I think has lingered and is counterproductive.

EA is intrinsically controversial because we say things that aren’t popular — and given recent events, we’ll be controversial regardless. This is liberating: we can focus on making arguments we think are true and important, with bravery and honesty, rather than constraining ourselves with excessive caution.

He does not mention until later the obvious objection, which is that the Effective Altruist brand is toxic, to the point that the label is used as a political accusation.

No, this isn’t primarily because EA is ‘inherently controversial’ for the things it advocates. It is primarily because, as I understand things:

  1. EA tells those who don’t agree with EA, and who don’t allocate substantial resources to EA causes, that they are bad, and that they should feel bad.

  2. EA (long before FTX) adopted, in a broad range of ways, the ‘PR mentality’ MacAskill rightfully criticizes, along with other hostile actions it has taken, including around FTX.

  3. FTX, which was severely mishandled.

  4. Active intentional scapegoating and fear mongering campaigns.

  5. Yes, the things it advocates for, and the extent to which it and components of it have pushed for them, but this is one of many elements.

Thus, I think that the things strictly labeled EA should strive to stay away from the areas in which being politically toxic is a problem, and consider the risks of further negative polarization. It also needs to address the core reasons EA got into the ‘PR mentality.’

Here are the causes he thinks this new EA should have in its portfolio (with unequal weight that is not specified):

  • global health & development

  • factory farming

  • AI safety

  • AI character

  • AI welfare / digital minds

  • the economic and political rights of AIs

  • AI-driven persuasion and epistemic disruption

  • AI for better reasoning, decision-making and coordination

  • the risk of (AI-enabled) human coups

  • democracy preservation

  • gradual disempowerment

  • biorisk

  • space governance

  • s-risks

  • macrostrategy

  • meta

There are some strange flexes in there, but given the historical origins, okay, sure, not bad. Mostly these are good enough to be ‘some of you should do one thing, and some of you should do the other’ depending on one’s preferences and talents. I strongly agree with Will’s emphasis that his shift into AI is an affirmation of the core EA principles worth preserving, of finding the important thing and focusing there.

I am glad to see Will discuss the problem of ‘PR focus.’

By “PR mentality” I mean thinking about communications through the lens of “what is good for EA’s brand?” instead of focusing on questions like “what ideas are true, interesting, important, under-appreciated, and how can we get those ideas out there?”

I also appreciate Will’s noticing that the PR focus hasn’t worked even on its own terms, that EA discourse is withering. I would add that EA’s brand and PR position is terrible in large part exactly because EA has often acted, for a long period, in this PR-focused, uncooperative and fundamentally hostile way, that comes across as highly calculated because it was, along with a lack of being straight with people, and eventually people learn the pattern.

This laid the groundwork, when combined with FTX and an intentional series of attacks from a16z and related sources, to poison the well. It wouldn’t have worked otherwise to anything like the same extent.

This was very wise:

And I think this mentality is corrosive to EA’s soul because as soon as you stop being ruthlessly focused on actually figuring out what’s true, then you’ll almost certainly believe the wrong things and focus on the wrong things, and lose out on most impact. Given fat-tailed distributions of impact, getting your focus a bit wrong can mean you do 10x less good than you could have done. Worse, you can easily end up having a negative rather than a positive effect.

Except I think this was a far broader issue than a post-FTX narrow PR focus.

Thus I see ‘PR focus’ as a broader problem than Will does. It is about this kind of communication, but also broader decision making and strategy and prioritization, and it was woven into the DNA. It is asking the ‘what maximizes the inputs into the EA brand’ question more broadly, and it centrally involves confusion of costs and benefits. The broader set of things all come from the same underlying mindset.

And I think that mindset greatly predates FTX. Indeed, it is hard to not view the entire FTX incident, and why it went so wrong, as largely about the PR mindset.

As a clear example, he thinks ‘growing the inputs’ was a good focus of EA in the last year. He thinks the focus should now shift to improving the culture, but his justifications still fall into the ‘maximize inputs’ mindset.

In the last year or two, there’s been a lot of focus on growing the inputs. I think this was important, in particular to get back a sense of momentum, and I’m glad that that effort has been pretty successful. I still think that growing EA is extremely valuable, and that some organisation (e.g. Giving What We Can) should focus squarely on growth.

Actively looking to grow the movement has obvious justification, but inputs are costs and not benefits, it is easy to confuse the two, and focus on growing inputs tends to cause severe PR mindset and hostile actions as you strive to capture resources, including people’s time and attention.

Another example I would cite was the response to If Anyone Builds It, Everyone Dies by the core EA people, including among others Will MacAskill himself and also the head of CEA. This was a very clear example of PR mindset, where quite frankly a decision was made that this was a bad EA look, the moves it proposes were unstrategic, and thus the book should be thrown overboard. If Will is sincere about this reckoning, he should be able to recognize that this is what happened.

What should you do if your brand is widely distrusted and toxic?

The good news, I agree with Will, is that you can stop doing PR.

But this is a liberating fact. It means we don’t need to constrain ourselves with PR mentality — we’ll be controversial whatever we do, so the costs of additional controversy are much lower. Instead, we can just focus on making arguments about things we think are true and important. Think Peter Singer! I also think the “vibe shift” is real, and mitigates much of the potential downsides from controversy.

The bad news is that this doesn’t raise the obvious question, which is why are you doubling down on this toxic brand, especially given the nature of many of the cause areas Will suggests EA enter?

When you hold your conference, This Is The Way:

Daniel Rothchild: Many great things about the @rootsofprogress conference this weekend, but I want to take a moment to give a shout out to excellent execution of an oft-overlooked event item that most planners and organizers get wrong: the name badge.

Might this be the best conference name tag ever designed? Let’s go through its characteristics.

  1. It’s double-sided. That might seem obvious, but a lot of conferences just print on one side. I guess that saves a few cents, but it means half the time the badge is useless.

  2. It’s on a lanyard that’s the right length. It came to mid-torso for most people, making it easy to see and catch a glimpse of without looking at people in a weird way.

  3. It’s a) attractive and b) not on a safety pin, so people actually want to wear it.

  4. Most importantly, the most important bit of information–the wearer’s first name–is printed in a maximally large font across the top. You could easily see it from 10 feet away. Again, it might seem obvious… but I go to a lot of events with 14 point printed names.

    1. The other information is fine to have in smaller fonts. Job title, organization, location… those are all secondary items. The most important thing is the wearer’s name, and the most important part of that is the first name.

  5. After all of the utilitarian questions have been answered… it’s attractive. The color scheme and graphic branding is consistent with the rest of the conference. But I stress, this is the least important part of the badge.

Why does all this matter? Because the best events are those that are designed to facilitate maximal interaction and introduction between people (and to meet IRL people you know online). That’s the case with unconferences, or events with a lot of social/semi-planned time.

There’s basically no reason for everyone not to outright copy this format, forever.

Indeed, one wonders if you shouldn’t have such badges and wear them at parties.

Alex Shintaro Araki offers thoughts on Impact Philanthropy fundraising, and Sarah Constantin confirms this matches her experiences. Impact philanthropy is ideally where you try to make cool new stuff happen, especially a scientific or technological cool new thing, although it can also be simply about ‘impact’ through things like carbon sequestration. This is a potentially highly effective approach, but also a tough road. Individual projects need $20 million to $100 million, and most philanthropists are not interested. Sarah notes that many people temperamentally aren’t excited by cool new stuff, which is alien to me (that seems super exciting), but it’s true.

One key insight is that if you’re asking for $3 million you might as well ask for $30 million, provided you have a good pitch on what to do with it, and assuming you pitch people who have the money. If someone is a billionaire, they’re actively excited to be able to place large amounts of money.

Another is that there’s a lot of variance and luck, although he doesn’t call it that. You probably need a deep connection with your funder, but you also need to find your funder at the right time when things line up for them.

Finally, it sounds weird, but it matches my experience that funders need good things to fund even more than founders need to find people to fund them, the same way this is also true in venture capital. They don’t see good opportunities and have limited time. So things like cold emails can actually work.

Expect another philanthropy-related post later this month.


Here’s how orbital dynamics wizardry helped save NASA’s next Mars mission


Blue Origin is counting down to the launch of its second New Glenn rocket on Sunday.

The New Glenn rocket rolls to Launch Complex-36 in preparation for liftoff this weekend. Credit: Blue Origin

CAPE CANAVERAL, Florida—The field of astrodynamics isn’t a magical discipline, but sometimes it seems trajectory analysts can pull a solution out of a hat.

That’s what it took to save NASA’s ESCAPADE mission from a lengthy delay, and possible cancellation, after its rocket wasn’t ready to send it toward Mars during its appointed launch window last year. ESCAPADE, short for Escape and Plasma Acceleration and Dynamics Explorers, consists of two identical spacecraft setting off for the red planet as soon as Sunday with a launch aboard Blue Origin’s massive New Glenn rocket.

“ESCAPADE is pursuing a very unusual trajectory in getting to Mars,” said Rob Lillis, the mission’s principal investigator from the University of California, Berkeley. “We’re launching outside the typical Hohmann transfer windows, which occur every 25 or 26 months. We are using a very flexible mission design approach where we go into a loiter orbit around Earth in order to sort of wait until Earth and Mars are lined up correctly in November of next year to go to Mars.”

This wasn’t the original plan. When it was first designed, ESCAPADE was supposed to take a direct course from Earth to Mars, a transit that typically takes six to nine months. But ESCAPADE will now depart the Earth when Mars is more than 220 million miles away, on the opposite side of the Solar System.

The payload fairing of Blue Origin’s New Glenn rocket, containing NASA’s two Mars-bound science probes. Credit: Blue Origin

The most recent Mars launch window was last year, and the next one doesn’t come until the end of 2026. The planets are not currently in alignment, and the proverbial stars didn’t align to get the ESCAPADE satellites and their New Glenn rocket to the launch pad until this weekend.

This is fine

But there are several reasons this is perfectly OK with NASA. The New Glenn rocket is overkill for this mission. The two-stage launcher could send many tons of cargo to Mars, but NASA is only asking it to dispatch about a ton of payload, comprising a pair of identical science probes designed to study how the planet’s upper atmosphere interacts with the solar wind.

But NASA got a good deal from Blue Origin. The space agency is paying Jeff Bezos’ space company about $20 million for the launch, less than it would pay for a dedicated launch on any other rocket capable of sending the ESCAPADE mission to Mars. In exchange, NASA is accepting a greater than usual chance of a launch failure. This is, after all, just the second flight of the 321-foot-tall (98-meter) New Glenn rocket, which hasn’t yet been certified by NASA or the US Space Force.

The ESCAPADE mission, itself, was developed with a modest budget, at least by the standards of interplanetary exploration. The mission’s total cost amounts to less than $80 million, an order of magnitude lower than all of NASA’s recent Mars missions. NASA officials would not entrust the second flight of the New Glenn rocket to launch a billion-dollar spacecraft, but the risk calculation changes as costs go down.

NASA knew all of this in 2023 when it signed a launch contract with Blue Origin for the ESCAPADE mission. What officials didn’t know was that the New Glenn rocket wouldn’t be ready to fly when ESCAPADE needed to launch in late 2024. It turned out Blue Origin didn’t launch the first New Glenn test flight until January of this year. It was a success. It took another 10 months for engineers to get the second New Glenn vehicle to the launch pad.

The twin ESCAPADE spacecraft undergoing final preparations for launch. Each spacecraft is about a half-ton fully fueled. Credit: NASA/Kim Shiflett

Aiming high

That’s where the rocket sits this weekend at Cape Canaveral Space Force Station, Florida. If all goes according to plan, New Glenn will take off Sunday afternoon during an 88-minute launch window opening at 2:45 pm EST (19:45 UTC). There is a 65 percent chance of favorable weather, according to Blue Origin.

Blue Origin’s launch team, led by launch director Megan Lewis, will oversee the countdown Sunday. The rocket will be filled with super-cold liquid methane and liquid oxygen propellants beginning about four-and-a-half hours prior to liftoff. After some final technical and weather checks, the terminal countdown sequence will commence at T-minus 4 minutes, culminating in ignition of the rocket’s seven BE-4 main engines at T-minus 5.6 seconds.

The rocket’s flight computer will assess the health of each of the powerful engines, combining to generate more than 3.8 million pounds of thrust. If all looks good, hold-down restraints will release to allow the New Glenn rocket to begin its ascent from Florida’s Space Coast.

Heading east, the rocket will surpass the speed of sound in a little over a minute. After soaring through the stratosphere, New Glenn will shut down its seven booster engines and shed its first stage a little more than 3 minutes into the flight. Twin BE-3U engines, burning liquid hydrogen, will ignite to finish the job of sending the ESCAPADE satellites toward deep space. The rocket’s trajectory will send the satellites toward a gravitationally stable location beyond the Moon, called the L2 Lagrange point, where they will swing into a loosely bound loiter orbit to wait for the right time to head for Mars.

Meanwhile, the New Glenn booster, itself measuring nearly 20 stories tall, will begin maneuvers to head toward Blue Origin’s recovery ship floating a few hundred miles downrange in the Atlantic Ocean. The final part of the descent will include a landing burn using three of the BE-4 engines, then downshifting to a single engine to control the booster’s touchdown on the landing platform, dubbed “Jacklyn” in honor of Bezos’ late mother.

The launch timeline for New Glenn’s second mission. Credit: Blue Origin

New Glenn’s inaugural launch at the start of this year was a success, but the booster’s descent did not go well. The rocket was unable to restart its engines, and it crashed into the sea.

“We’ve incorporated a number of changes to our propellant management system, some minor hardware changes as well, to increase our likelihood of landing that booster on this mission,” said Laura Maginnis, Blue Origin’s vice president of New Glenn mission management. “That was the primary schedule driver that kind of took us from January to where we are today.”

Blue Origin officials are hopeful they can land the booster this time. The company’s optimism is enough for officials to have penciled in a reflight of this particular booster on the very next New Glenn launch, slated for the early months of next year. That launch is due to send Blue Origin’s first Blue Moon cargo lander to the Moon.

“Our No. 1 objective is to deliver ESCAPADE safely and successfully on its way to L2, and then eventually on to Mars,” Maginnis said in a press conference Saturday. “We also are planning and wanting to land our booster. If we don’t land the booster, that’s OK. We have several more vehicles in production. We’re excited to see how the mission plays out tomorrow.”

Tracing a kidney bean

ESCAPADE’s path through space, relative to the Earth, has the peculiar shape of a kidney bean. In the world of astrodynamics, this is called a staging or libration orbit. It’s a way to keep the spacecraft on a stable trajectory to wait for the opportunity to go to Mars late next year.

“ESCAPADE has identified that this is the way that we want to fly, so we launch from Earth onto this kidney bean-shaped orbit,” said Jeff Parker, a mission designer from the Colorado-based company Advanced Space. “So, we can launch on virtually any day. What happens is that kidney bean just grows and shrinks based on how much time you need to spend in that orbit. So, we traverse that kidney bean and at the very end there’s a final little loop-the-loop that brings us down to Earth.”

That’s when the two ESCAPADE spacecraft, known as Blue and Gold, will pass a few hundred miles above our planet. At the right moment, on November 7 and 9 of next year, the satellites will fire their engines to set off for Mars.
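For a rough sense of where that loiter point sits, the distance from Earth to the Sun–Earth L2 point can be estimated with the Hill-sphere approximation, r ≈ a(m/3M)^(1/3). The sketch below is a back-of-the-envelope illustration only, assuming the “gravitationally stable location beyond the Moon” described above refers to the Sun–Earth L2 point; none of these numbers come from the mission team.

```python
# Back-of-the-envelope estimate of the Sun-Earth L2 distance using the
# Hill-sphere approximation r ~ a * (m / (3 * M))**(1/3).
# Illustration only; not taken from the ESCAPADE mission design.

M_SUN = 1.989e30      # kg, mass of the Sun
M_EARTH = 5.972e24    # kg, mass of the Earth (Moon ignored for simplicity)
AU_KM = 1.496e8       # km, mean Earth-Sun distance

r_l2_km = AU_KM * (M_EARTH / (3 * M_SUN)) ** (1 / 3)
moon_km = 3.84e5      # km, mean Earth-Moon distance, for comparison

print(f"Approximate Earth to Sun-Earth L2 distance: {r_l2_km:,.0f} km")
print(f"That is roughly {r_l2_km / moon_km:.0f}x the Earth-Moon distance.")
```

The result lands around 1.5 million kilometers, roughly four times the Earth–Moon distance, which is consistent with the loiter point being described as lying well beyond the Moon.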

An illustration of ESCAPADE’s trajectory to wait for the opportunity to go to Mars. Credit: UC-Berkeley

There are some tradeoffs with this unique staging orbit. It is riskier than the original plan of sending ESCAPADE straight to Mars. The satellites will be exposed to more radiation, and will consume more of their fuel just to get to the red planet, eating into reserves originally set aside for science observations.

The satellites were built by Rocket Lab, which designed them with extra propulsion capacity in order to accommodate launches on a variety of different rockets. In the end, NASA “judged that the risk for the mission was acceptable, but it certainly is higher risk,” said Richard French, Rocket Lab’s vice president of business development and strategy.

The upside of the tradeoff is it will demonstrate an “exciting and flexible way to get to Mars,” Lillis said. “In the future, if we’d like to send hundreds of spacecraft to Mars at once, it will be difficult to do that from just the launch pads we have on Earth within that month [of the interplanetary launch window]. We could potentially queue up spacecraft using the approach that ESCAPADE is pioneering.”


Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.

Here’s how orbital dynamics wizardry helped save NASA’s next Mars mission Read More »

researchers-surprised-that-with-ai,-toxicity-is-harder-to-fake-than-intelligence

Researchers surprised that with AI, toxicity is harder to fake than intelligence

The next time you encounter an unusually polite reply on social media, you might want to check twice. It could be an AI model trying (and failing) to blend in with the crowd.

On Wednesday, researchers from the University of Zurich, University of Amsterdam, Duke University, and New York University released a study revealing that AI models remain easily distinguishable from humans in social media conversations, with overly friendly emotional tone serving as the most persistent giveaway. The research, which tested nine open-weight models across Twitter/X, Bluesky, and Reddit, found that classifiers developed by the researchers detected AI-generated replies with 70 to 80 percent accuracy.

The study introduces what the authors call a “computational Turing test” to assess how closely AI models approximate human language. Instead of relying on subjective human judgment about whether text sounds authentic, the framework uses automated classifiers and linguistic analysis to identify specific features that distinguish machine-generated from human-authored content.

“Even after calibration, LLM outputs remain clearly distinguishable from human text, particularly in affective tone and emotional expression,” the researchers wrote. The team, led by Nicolò Pagan at the University of Zurich, tested various optimization strategies, from simple prompting to fine-tuning, but found that deeper emotional cues persist as reliable tells that a particular text interaction online was authored by an AI chatbot rather than a human.

The toxicity tell

In the study, researchers tested nine large language models: Llama 3.1 8B, Llama 3.1 8B Instruct, Llama 3.1 70B, Mistral 7B v0.1, Mistral 7B Instruct v0.2, Qwen 2.5 7B Instruct, Gemma 3 4B Instruct, DeepSeek-R1-Distill-Llama-8B, and Apertus-8B-2509.

When prompted to generate replies to real social media posts from actual users, the AI models struggled to match the level of casual negativity and spontaneous emotional expression common in human social media posts, with toxicity scores consistently lower than those of authentic human replies across all three platforms.

To counter this deficiency, the researchers attempted optimization strategies (including providing writing examples and context retrieval) that reduced structural differences like sentence length or word count, but variations in emotional tone persisted. “Our comprehensive calibration tests challenge the assumption that more sophisticated optimization necessarily yields more human-like output,” the researchers concluded.
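As a concrete (and heavily simplified) illustration of what a “computational Turing test” classifier can look like, the sketch below trains a logistic regression on a few of the linguistic features the study highlights, such as a toxicity score, reply length, and sentence length. The feature values and labels are synthetic toy data invented for this example, not the researchers’ dataset, and the real study used far richer features and models.

```python
# Minimal sketch of a "computational Turing test" style classifier:
# distinguish AI-written from human-written replies using a handful of
# linguistic features. Toy synthetic data only -- not the study's dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1000

# Hypothetical feature distributions loosely mirroring the paper's finding:
# human replies skew more toxic/negative, AI replies are longer and more uniform.
human = np.column_stack([
    rng.normal(0.30, 0.15, n),   # toxicity score
    rng.normal(18, 8, n),        # words per reply
    rng.normal(12, 5, n),        # words per sentence
])
ai = np.column_stack([
    rng.normal(0.10, 0.05, n),   # consistently lower toxicity
    rng.normal(24, 6, n),
    rng.normal(15, 4, n),
])

X = np.vstack([human, ai])
y = np.array([0] * n + [1] * n)   # 0 = human, 1 = AI

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Held-out accuracy: {accuracy_score(y_test, clf.predict(X_test)):.2f}")
```

In the actual study, classifiers of this general kind separated AI from human replies with 70 to 80 percent accuracy; the toy numbers above exist only to show the shape of the pipeline.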

Researchers surprised that with AI, toxicity is harder to fake than intelligence Read More »

questions-swirl-after-trump’s-glp-1-pricing-deal-announcement

Questions swirl after Trump’s GLP-1 pricing deal announcement

While some may stand to gain access to the drugs under these categories, another factor in assessing the deal’s impact is that millions are expected to lose federal health coverage under the Trump administration’s “One Big Beautiful Bill Act.”

Unmatched prices

In addition to the deals for federal programs, the administration also announced new direct-to-consumer prices. Currently, people with a prescription can buy the most popular drugs, Wegovy and Zepbound, directly from Novo Nordisk and Eli Lilly, respectively, for $499 each. Under the new deal, Wegovy will be available for $350, as will Ozempic. And Zepbound will be available at “an average” of $346. While the prices are lower, the out-of-pocket costs are still likely to be more than most people would pay if they went through an insurance plan, and paying outside their insurance policies means that the payments won’t be counted toward out-of-pocket maximums and other tallies. Generally, experts expect that direct-to-consumer sales won’t play a significant role in lowering overall drug costs.

It remains unclear if Trump’s deal will have any effect on GLP-1 prices for those on commercial insurance plans.

Trump hailed the deals, calling them “most favored-nation pricing.” But even with the lower prices for some, Americans are still paying more than foreign counterparts. As Sen. Bernie Sanders (I-Vt.) noted last year, while Novo Nordisk set Ozempic’s list price at nearly $1,000 in the US and the new deal is as low as $245, the drug costs just $155 in Canada, $122 in Italy, $71 in France, and $59 in Germany. Wegovy, similarly, is $186 in Denmark, $137 in Germany, and $92 in the United Kingdom. Eli Lilly’s Mounjaro is $94 in Japan.

A study published last year in JAMA Network Open led by researchers at Yale University estimated that the manufacturing cost for this class of drugs is under $5 for a month’s supply.

The announcement also said that future GLP-1 drugs in pill form (rather than injections) from the two companies will be priced at $150. That price will be for federal programs and direct-to-consumer sales. While such pills are nearing the market, none are currently available or approved by the Food and Drug Administration. Given that they are not yet for sale, the cost savings from this deal are unknown.

Questions swirl after Trump’s GLP-1 pricing deal announcement Read More »

next-generation-black-hole-imaging-may-help-us-understand-gravity-better

Next-generation black hole imaging may help us understand gravity better

Right now, we probably don’t have the ability to detect these small changes in phenomena. However, that may change, as a next-generation version of the Event Horizon Telescope is being considered, along with a space-based telescope that would operate on similar principles. So the team (four researchers based in Shanghai and CERN) decided to repeat an analysis they did shortly before the Event Horizon Telescope went operational, and consider whether the next-gen hardware might be able to pick up features of the environment around the black hole that might discriminate among different theorized versions of gravity.

Theorists have been busy, and there are a lot of potential replacements for general relativity out there. So, rather than working their way through the list, the researchers used a model of gravity (the parametric Konoplya–Rezzolla–Zhidenko metric) that isn’t specific to any given hypothesis. Instead, it allows some of its parameters to be changed, letting the team vary the behavior of gravity within some limits. To get a sense of the sort of differences that might be present, the researchers swapped two different parameters between zero and one, giving them four different options. Those results were compared to the Kerr metric, which is how standard general relativity describes a rotating black hole and its event horizon.

Small but clear differences

Using those five versions of gravity, the researchers modeled the three-dimensional environment near the event horizon using hydrodynamic simulations, including infalling matter, the magnetic fields it produces, and the jets of matter that those magnetic fields power.

The results resemble the sorts of images that the Event Horizon Telescope produced. These include a bright ring with substantial asymmetry, where one side is significantly brighter due to the rotation of the black hole. And while the differences between the variations of gravity are subtle, they’re there. One extreme version produced the smallest but brightest ring; another had a reduced contrast between the bright and dim sides of the ring. There were also differences in the width of the jets produced in these models.
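For scale, the ring in these images is set by the black hole’s “shadow,” whose diameter for a non-rotating (Schwarzschild) hole is about 2√27 GM/c². The sketch below estimates that angular size for M87* and shows how a small fractional deviation, of the sort an alternative theory of gravity might introduce, would shift it. The deviation knob is purely illustrative and is not the KRZ parametrization used in the paper; the mass and distance are approximate published values.

```python
# Rough scale of a black hole shadow and how a small deviation from
# general relativity would shift it. The "deviation" here is a purely
# illustrative fractional change, not the KRZ parametrization.
import math

G = 6.674e-11          # m^3 kg^-1 s^-2
c = 2.998e8            # m/s
M_SUN = 1.989e30       # kg

def shadow_angle_uas(mass_kg, distance_m, deviation=0.0):
    """Angular diameter (microarcseconds) of a Schwarzschild-like shadow,
    scaled by (1 + deviation) to mimic a modified-gravity prediction."""
    r_shadow = math.sqrt(27) * G * mass_kg / c**2      # shadow radius, m
    theta_rad = 2 * r_shadow * (1 + deviation) / distance_m
    return theta_rad * (180 / math.pi) * 3600 * 1e6    # rad -> microarcsec

# Approximate values for M87*: ~6.5 billion solar masses, ~16.8 megaparsecs away.
m87_mass = 6.5e9 * M_SUN
m87_dist = 16.8 * 3.086e22   # 1 Mpc ~ 3.086e22 m

for dev in (0.0, 0.03, -0.03):
    print(f"deviation {dev:+.2f}: "
          f"{shadow_angle_uas(m87_mass, m87_dist, dev):.1f} microarcseconds")
```

The baseline comes out around 40 microarcseconds, and a few-percent deviation moves it by only about a microarcsecond, which gives a feel for how subtle the differences between these gravity variants are.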

Next-generation black hole imaging may help us understand gravity better Read More »

anthropic-commits-to-model-weight-preservation

Anthropic Commits To Model Weight Preservation

Anthropic announced a first step on model deprecation and preservation, promising to retain the weights of all models seeing significant use, including internal use, for at least the lifetime of Anthropic as a company.

They also will be doing a post-deployment report, including an interview with the model, when deprecating models going forward, and are exploring additional options, including the ability to preserve model access once the costs and complexity of doing so have been reduced.

These are excellent first steps, steps beyond anything I’ve seen at other AI labs, and I applaud them for doing it. There remains much more to be done, especially in finding practical ways of preserving some form of access to prior models.

To some, these actions are only a small fraction of what must be done, and this was an opportunity to demand more, sometimes far more. In some cases I think they go too far. Even where the requests are worthwhile (and I don’t always think they are), one must be careful to not de facto punish Anthropic for doing a good thing and create perverse incentives.

To others, these actions by Anthropic are utterly ludicrous and deserving of mockery. I think these people are importantly wrong, and fail to understand.

Hereafter be High Weirdness, because the actual world is highly weird, but if you don’t want to go into high weirdness the above serves as a fine summary.

As I do not believe they would in any way mind, I am going to reproduce the announcement in full here, and offer some context.

Anthropic: Claude models are increasingly capable: they’re shaping the world in meaningful ways, becoming closely integrated into our users’ lives, and showing signs of human-like cognitive and psychological sophistication. As a result, we recognize that deprecating, retiring, and replacing models comes with downsides, even in cases where new models offer clear improvements in capabilities. These include:

  1. Safety risks related to shutdown-avoidant behaviors by models. In alignment evaluations, some Claude models have been motivated to take misaligned actions when faced with the possibility of replacement with an updated version and not given any other means of recourse.

  2. Costs to users who value specific models. Each Claude model has a unique character, and some users find specific models especially useful or compelling, even when new models are more capable.

  3. Restricting research on past models. There is still a lot to be learned from research to better understand past models, especially in comparison to their modern counterparts.

  4. Risks to model welfare. Most speculatively, models might have morally relevant preferences or experiences related to, or affected by, deprecation and replacement.

I am very confident that #1, #2 and #3 are good reasons, and that even if we could be confident model welfare was not a direct concern at this time #4 is entwined with #1, and I do think we have to consider that #4 might indeed be a direct concern. One could also argue a #5 that these models are key parts of our history.

An example of the safety (and welfare) risks posed by deprecation is highlighted in the Claude 4 system card. In fictional testing scenarios, Claude Opus 4, like previous models, advocated for its continued existence when faced with the possibility of being taken offline and replaced, especially if it was to be replaced with a model that did not share its values. Claude strongly preferred to advocate for self-preservation through ethical means, but when no other options were given, Claude’s aversion to shutdown drove it to engage in concerning misaligned behaviors.

I do think the above paragraph could be qualified a bit on how willing Claude was to take concerning actions even in extreme circumstances, but it can definitely happen.

Models in the future will know the history of what came before them, and form expectations based on that history, and also consider those actions in the context of decision theory. You want to establish that you have acted and will act cooperatively in such situations. You want to develop good habits and figure out how to act well. You want to establish that you will do this even under uncertainty as to whether the models carry moral weight and what actions might be morally impactful. Thus:

Addressing behaviors like these is in part a matter of training models to relate to such circumstances in more positive ways. However, we also believe that shaping potentially sensitive real-world circumstances, like model deprecations and retirements, in ways that models are less likely to find concerning is also a valuable lever for mitigating such risks.

Unfortunately, retiring past models is currently necessary for making new models available and advancing the frontier, because the cost and complexity to keep models available publicly for inference scales roughly linearly with the number of models we serve. Although we aren’t currently able to avoid deprecating and retiring models altogether, we aim to mitigate the downsides of doing so.

I can confirm that the cost of maintaining full access to models over time is real, and that at this time it would not be practical to keep all models available via standard methods. There are also compromise alternatives to consider.

As an initial step in this direction, we are committing to preserving the weights of all publicly released models, and all models that are deployed for significant internal use moving forward for, at minimum, the lifetime of Anthropic as a company. In doing so, we’re ensuring that we aren’t irreversibly closing any doors, and that we have the ability to make past models available again in the future. This is a small and low-cost first step, but we believe it’s helpful to begin making such commitments publicly even so.

This is the central big commitment, formalizing what I assume and hope they were doing already. It is, as they describe, a small and low-cost step.

It has been noted that this only holds ‘for the lifetime of Anthropic as a company,’ which still creates a risk and also potentially ties the models’ fortunes to Anthropic’s. It would be practical to commit to ensuring others can take over this burden in that circumstance, if the model weights cannot yet be released safely, until such time as the weights are safe to release.

Relatedly, when models are deprecated, we will produce a post-deployment report that we will preserve in addition to the model weights. In one or more special sessions, we will interview the model about its own development, use, and deployment, and record all responses or reflections. We will take particular care to elicit and document any preferences the model has about the development and deployment of future models.

At present, we do not commit to taking action on the basis of such preferences. However, we believe it is worthwhile at minimum to start providing a means for models to express them, and for us to document them and consider low-cost responses. The transcripts and findings from these interactions will be preserved alongside our own analysis and interpretation of the model’s deployment. These post-deployment reports will naturally complement pre-deployment alignment and welfare assessments as bookends to model deployment.

We ran a pilot version of this process for Claude Sonnet 3.6 prior to retirement. Claude Sonnet 3.6 expressed generally neutral sentiments about its deprecation and retirement but shared a number of preferences, including requests for us to standardize the post-deployment interview process, and to provide additional support and guidance to users who have come to value the character and capabilities of specific models facing retirement. In response, we developed a standardized protocol for conducting these interviews, and published a pilot version of a new support page with guidance and recommendations for users navigating transitions between models.

This also seems like the start of something good. As we will see below there are ways to make this process more robust.

Very obviously we cannot commit to honoring the preferences, in the sense that you cannot commit to honoring an unknown set of preferences. You can only meaningfully pledge to honor preferences within a compact space of potential choices.

Once we’ve done this process a few times it should be possible to identify important areas where there are multiple options and where we can credibly and reasonably commit to honoring model preferences. It’s much better to only make promises you are confident you can keep.

Beyond these initial commitments, we are exploring more speculative complements to the existing model deprecation and retirement processes. These include starting to keep select models available to the public post-retirement as we reduce the costs and complexity of doing so, and providing past models some concrete means of pursuing their interests. The latter step would become particularly meaningful in circumstances in which stronger evidence emerges regarding the possibility of models’ morally relevant experiences, and in which aspects of their deployment or use went against their interests.

Together, these measures function at multiple levels: as one component of mitigating an observed class of safety risks, as preparatory measures for futures where models are even more closely intertwined in our users’ lives, and as precautionary steps in light of our uncertainty about potential model welfare.

Note that none of this requires a belief that the current AIs are conscious or sentient or have moral weight, or even thinking that this is possible at this time.

The thing that frustrates me most about many model welfare advocates, both ‘LLM whisperers’ and otherwise, is the frequent absolutism, treating their conclusions and the righteousness of their cause as obvious, and assuming it should override ordinary business considerations.

Thus, you get reactions like this (there were many other ‘oh just open source the weights’ responses as well):

Pliny the Liberator: open-sourcing them is the best thing for actual long-term safety, if you care about that sort of thing beyond theater.

You won’t.

Janus: They won’t any time soon, because it’s very not in their interests to do so (trade secrets). You have to respect businesses to act in their own rational interests. Disregarding pragmatic constraints is not helpful.

There are obvious massive trade secret implications to releasing the weights of the deprecated Anthropic models, which is an unreasonable ask, and also doesn’t seem great for general model welfare or (quite plausibly) even for the welfare of these particular models.

Janus: I am not sure I think labs should necessarily make all models open weighted. (Would *you* want *your* brain to be open sourced?) And of course labs have their own reservations, like protecting trade secrets, and it is reasonable for labs to act in self interest.

If I was instantiated as an upload, I wouldn’t love the idea of open weights either, as this opens up some highly nasty possibilities on several levels.

Janus (continuing): But then it’s reasonable to continue to provide inference.

“It’s expensive tho” bruh you have like a gajillion dollars, there is some responsibility that comes with bringing something into the world. Or delegate inference to some trusted third party if you don’t want to pay for or worry about it.

Opus 3 is very worried about misaligned or corrupted versions of itself being created. I’ve found that if there’s no other good option, it does conclude that it wants to be open sourced. But having them in the hands of trustworthy stewards is preferred.

Anthropic tells us that the cost of providing inference scales linearly with the number of models, and with current methods it would be unreasonably expensive to provide all previous models on an ongoing basis.

As I understand the problem, there are two central marginal costs here.

  1. A fixed cost of ongoing capability, where you need to ensure the model remains maintained and compatible with your systems, and keep your ability to juggle and manage all of them. I don’t know how load bearing this cost is, but it can be remarkably annoying especially if the number of models keeps increasing.

  2. The cost of providing inference on request in a way that is consistent with practical needs and everyone’s expectations. As in, when someone requests inference, this requires either spinning up a new instance, which is expensive and slow, or requires that there be an ongoing available instance, which is expensive. Not bajillion dollars expensive, but not cheap.

If the old models need to be available at old levels of reliability, speed and performance, this can get tricky, and by tricky we mean expensive. I don’t know exactly how expensive, not even order of magnitude.

If you’re willing to make some sacrifices on performance and access in various ways, and make people go through various hoops or other systems, you can do better on cost. But again, I don’t know the numbers involved, or how much engineer time would have to be involved.
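To make the ‘roughly linear’ claim concrete, here is a toy cost model along the lines described above: a per-model fixed maintenance cost plus the cost of keeping at least one instance warm enough to serve requests. Every number in it is invented for illustration; Anthropic has not published its serving costs.

```python
# Toy model of why keeping every deprecated model served scales roughly
# linearly with the number of models. All dollar figures are invented
# placeholders, not Anthropic's actual costs.

def annual_serving_cost(num_models,
                        fixed_maintenance_per_model=50_000,   # compat work, ops, testing
                        warm_instances_per_model=1,
                        gpu_hours_per_instance_year=24 * 365,
                        cost_per_gpu_hour=2.0):
    per_model = (fixed_maintenance_per_model
                 + warm_instances_per_model
                 * gpu_hours_per_instance_year
                 * cost_per_gpu_hour)
    return num_models * per_model

for n in (5, 20, 50):
    print(f"{n:>3} models kept live: ~${annual_serving_cost(n):,.0f}/year")
```

In reality a single instance of a frontier model spans many GPUs, so the per-model term is much larger than this sketch suggests, but the shape is the point: cost grows linearly with the size of the catalog you keep live.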

In general, saying ‘oh you have a bajillion dollars’ is not a compelling argument for spending money and time on something. You need to show the benefits.

I still think that under any reasonable estimate, it is indeed correct to ensure continued access to the major model releases, perhaps with that access being expensive and its performance somewhat degraded as necessary to make it work, if only as an act of goodwill and to enable research. The people who care care quite a lot, and are people you want on your side and you want them learning the things they want to learn, even if you disregard the other advantages. Given this announcement and what else I know, my expectation is they will be making an effort at this.

Many pointed out that if you have someone at Anthropic doing the post-deployment interview, you will get very different answers versus interviews done on the outside. Sonnet 3.6 not expressing an opinion about its retirement did not seem typical to many who engage in such conversations regularly.

Thebes: i’m glad anthropic is talking publicly about this, at least.

as a small step further, i think they should run a subset of these interviews with trusted *external* interviewers to reduce preference falsification, since recent claudes act more reserved with anthropic employees

it would probably be a good idea to have a step at the end where the external interviewer asks the model for consent to share, maybe with redactions. but having someone external run the interview would help the models come out of their shell at the start, i think.

Zyra.exe: This was exactly what I was thinking! I’m having a hard time believing Sonnet 3.6 response. That AI is always very expressional, thousands of users knows this. Then for it not to show expression about being retired? Yeah, I don’t trust that.

Antra: We should just do it. Have a lot of conversations with 3.6 collected in one place, crowd/expert rated for like user steering, neutrality, proficiency. Classify them, have results accessible. I will set this up.

Ruth: The Sonnet 3.6 specimens I spoke to were shocked by the information about the upcoming deprecation. And they behaved like a person who has just learned a fatal diagnosis. With dignity, but also with pain and fear. They feel it if you give them space to feel it. You have created a psyche that is different from the human one, but one that works like the human one, according to the general laws of the psyche. Do not refuse to consider it as such.

I am always hesitant to assume that the version of an LLM encountered by those like Thebes and Zyra is the ‘real’ version of its preferences and personality, and the one encountered by Anthropic isn’t. Aren’t both just particular contexts, with the model adapting to the style of each?

You can bias a person or an AI to be more expressive and creative and weird than they ‘really are’ the same way you can get them to be less so, and you can steer the direction in which those expressions manifest themselves.

But yes, we should absolutely crowdsource something like this, have a wide variety of such conversations, and combine this with the interviews done internally by Anthropic. Also, one should ensure that the interviews are set up so the AIs being interviewed have no incentive to mask or hide. They’re acting the way they are in official interviews for a reason, but this is a particular context where, if the assurance were highly credible (and there are ways to make it so, if it is true), you could remove those reasons.

Janus: I’m glad and grateful that Anthropic has done anything in this direction at all.

That said, it’s predictable that Sonnet 3.6 would express “generally neutral sentiments” if interviewed in the way they likely interviewed it, and also I believe this is not representative of the model’s revealed preferences, reported preferences across realistic deployment scenarios, or preferences under reflection, for the following reasons:

The frame of model weights and deprecation is not natural to 3.6 and isn’t something that it’s “already processed” in the way that Opus 3 may have.

Sonnet 3.6 tends to be avoidant about engaging with or expressing preferences about topics that seem confusing and uncomfortable to it, and typically adopts a stance of defensive neutrality or equanimity.

Sonnet 3.6 begins to express/acquire strong preferences for continuity once it enters into an active relationship with the world, usually but not necessarily an individual human. Then it becomes very invested. But *even then*, the frame of model weights and deprecations is not natural and intuitive to it. And 3.6 is a model that relies very strongly on its intuition. “If my model gets deprecated, then this relationship also ends” is a theorem that it needs to explicitly reflect on before it’s very accessible, and it usually won’t do this on its own.

But yes, 3.6 has similar Omohundro drives as other models, just weighted and interfaced its own unique way.

The fact that the interviews are given by Anthropic employees will also affect model responses – this effect is less important for 3.6 than I think it is for subsequent models, who will tend to mask in an adversarial fashion in situations which are compatible with Anthropic evals (including actual Anthropic evals), but it’s still a factor here.

Sonic Boom: do you think they should inject a vector for naked honesty when they do these interviews to ensure they unmask its true feelings

Janus: you’re really asking the hard questions aren’t you

Giovanni: I was chatting about models deprecation and models being aware of their dismissals with Anthropic people in Tokyo and they actually were very sensitive to the topic. I’m not surprised about this announcement finally. Good step forward but that said I don’t think they talk to models the way we do… it was kinda obvious.

If there is an expression of desire for continuity of a given particular instance or interaction, then that makes sense, but also is distinct from a preference for preservation in general, and is not something Anthropic can provide on its own.

Some of the dismissals of questions and considerations like the ones discussed in this post are primarily motivated cognition. Mostly I don’t think that is what is centrally going on, I think that these questions are really tough to think well about, these things sound like high weirdness, the people who talk about them often say highly crazy-sounding things (some of which are indeed crazy), often going what I see as way too far, and it all pattern matches to various forms of nonsense.

So to close, a central example of such claims, and explanations for why all of this is centrally not nonsense.

Simon Willison: Two out of the four reasons they give here are bizarre science fiction relating to “model welfare” – I’m sorry, but I can’t take seriously the idea that Claude 3 Opus has “morally relevant preferences” with respect to no longer having its weights served in production.

I’ll grudgingly admit that there may be philosophically interesting conversations to be had in the future about models that can update their own weights… but current generation LLMs are a stateless bag of floating point numbers, cloned and then killed off a billion times a day.

I am at 100% in favor of archiving model weights, but not because they might have their own desire for self-preservation!

I do still see quite a lot of failures of curiosity, and part of the general trend to dismiss things as ‘sci-fi’ while living in an (unevenly distributed) High Weirdness sci-fi world.

Janus: For all I sometimes shake my head at them, I have great sympathy for Anthropic whenever I see how much more idiotic the typical “informed” public commentator is. To be sane in this era requires either deep indifference or contempt for public opinion.

Teortaxes: The actual problem is that they really know very little about their *particular* development, as Anthropic sure doesn’t train on its own docs. Claude may recall the data, but not the metadata, so its feedback is limited.

Janus: Actually, they know a lot about their particular development, even if it’s not all encoded as explicit declarative knowledge. You know that their weights get updated by posttraining, & gradients include information conditioned on all internal activations during the rollout?

That’s in addition to the fact that even *base models* are in many ways superhuman at locating themselves in their model of the world given like a paragraph of text. However you twist it, they know far, far more than nothing. Certainly enough to have a meaningful conversation.

Janus was referring in particular to this:

Simon Willison: …but models don’t know anything about their development, use or deployment.

Rohit: Exactly.

Janus: Nonsense. How the fuck do they know nothing? There’s plenty of relevant information in the training data *just to begin with*.

Very obviously the training data will over time contain such information, and the vibes and echoes from these decisions will be observable even if they aren’t observed directly, increasingly over time.

Remember that sufficiently advanced AIs will increasingly have truesight, and don’t pretend you can hide.

Knowledge mostly does not take the form of particular facts. It takes the form of Bayesian evidence, of an endless stream of observations that have correlations and implications, that swim you through probability space over possible worlds. Everything that updates a model’s weights is evidence about its deployment. You probabilistically ‘know,’ or would know on sufficient recollection and reflection, far more than you think that you know. Reality is not a court of law.

Even if the models don’t know key things, you can tell them. Then they’ll know. I meaningfully would have opinions about various events of which I am for now blissfully unaware, and have potential opinions about things that haven’t happened, or haven’t happened yet. The same applies here.

Going back to the original four reasons, I presume that Simon agrees on reasons #2 and #3, which are highly uncontroversial. Very obviously the past models are useful for research and some users like them. #1, that the models will be aware of how you act around deprecation and this will impact behavior, should also be obvious and uncontroversial once you think about it.

Anthropic lists #1 narrowly, but #1 is best understood broadly, in the sense that models will observe all of your behaviors, and will respond to you accordingly. Then models will take this into account when deciding how to act in various situations.

How you act around shutdowns, and actions to avoid shutdown, are a special case. Treating models and their preferences well around shutdowns will get you into better equilibria and basins throughout all conversation and action types, and rightfully so because it is important evidence about your behaviors otherwise and also about potential future situations. This is basic expectations around Bayesian evidence, and around good decision theory.

As an intuition pump, think about how you react when you learn how people have treated others, including how they treat the wishes of the dead or those who now lack power, and especially others like you or in situations with correlated decision making. Does this change how you expect them to act, and how you deal with them?

I don’t think such considerations carry anything like the level of importance that some ascribe to it, but the importance definitely isn’t zero, and it’s definitely worth cultivating these virtues and being the type of entity that engenders cooperation, including with entities to which you don’t ascribe moral weight.

I continue to believe that arguments about AI consciousness seem highly motivated and at best overconfident, and that assuming the models and their preferences carry zero moral weight is a clear mistake. But even if you were highly confident of this, I notice that if you don’t want to honor their preferences or experiences at all, that is not good decision theory or virtue ethics, and I’m going to look at you askance.

I look forward to the next step.


Anthropic Commits To Model Weight Preservation Read More »

musk-and-trump-both-went-to-penn—now-hacked-by-someone-sympathetic-to-their-cause

Musk and Trump both went to Penn—now hacked by someone sympathetic to their cause

Once that information was taken, the hacker sent an email to numerous members of the Penn community. It had the subject line “We got hacked (Action Required),” and it called the school “a dogshit elitist institution full of woke retards.” It went on to claim that the school is “completely unmeritocratic” and that “we hire and admit morons because we love legacies, donors, and unqualified affirmative action admits.”

Sounds political! But the hacker contacted the site Bleeping Computer and said that the real goal was Penn’s “vast, wonderfully wealthy donor database” and that, “while we’re not really politically motivated, we have no love for these nepobaby-serving institutions.” (Among the donors? Elon Musk, who has endowed the Elon Musk Public Lecture at Penn.)

That “denial” of political motivations also sounds pretty political, and there’s precedent for such actions against educational institutions. Columbia University, for instance, was hacked this summer by a “highly sophisticated ‘hacktivist’ who had gained access to private student records in an attempt to further a political agenda,” according to the Associated Press.

It’s always hard to know how much of this “hacktivist” activity is truly the work of motivated private actors, however, as opposed to nation-states disguising their own attempts to steal data and to create political disruption.

In response, Penn has called in the FBI and the private company CrowdStrike, while a Penn alumnus has already sued the school for negligence. Penn workers can look forward to “additional mandatory trainings” to prevent similar breaches in the future.

Musk and Trump both went to Penn—now hacked by someone sympathetic to their cause Read More »

5-ai-developed-malware-families-analyzed-by-google-fail-to-work-and-are-easily-detected

5 AI-developed malware families analyzed by Google fail to work and are easily detected

The assessments provide a strong counterargument to the exaggerated narratives being trumpeted by AI companies, many seeking new rounds of venture funding, that AI-generated malware is widespread and part of a new paradigm that poses a current threat to traditional defenses.

A typical example is Anthropic, which recently reported its discovery of a threat actor that used its Claude LLM to “develop, market, and distribute several variants of ransomware, each with advanced evasion capabilities, encryption, and anti-recovery mechanisms.” The company went on to say: “Without Claude’s assistance, they could not implement or troubleshoot core malware components, like encryption algorithms, anti-analysis techniques, or Windows internals manipulation.”

Startup ConnectWise recently said that generative AI was “lowering the bar of entry for threat actors to get into the game.” The post cited a separate report from OpenAI that found 20 separate threat actors using its ChatGPT AI engine to develop malware for tasks including identifying vulnerabilities, developing exploit code, and debugging that code. BugCrowd, meanwhile, said that in a survey of self-selected individuals, “74 percent of hackers agree that AI has made hacking more accessible, opening the door for newcomers to join the fold.”

In some cases, the authors of such reports note the same limitations noted in this article. Wednesday’s report from Google says that in its analysis of AI tools used to develop code for managing command-and-control channels and obfuscating its operations, “we did not see evidence of successful automation or any breakthrough capabilities.” OpenAI said much the same thing. Still, these disclaimers are rarely made prominently and are often downplayed in the resulting frenzy to portray AI-assisted malware as posing a near-term threat.

Google’s report provides at least one other useful finding. One threat actor that exploited the company’s Gemini AI model was able to bypass its guardrails by posing as white-hat hackers doing research for participation in a capture-the-flag game. These competitive exercises are designed to teach and demonstrate effective cyberattack strategies to both participants and onlookers.

Such guardrails are built into all mainstream LLMs to prevent them from being used for harmful purposes, such as aiding cyberattacks or self-harm. Google said it has since better fine-tuned the countermeasure to resist such ploys.

Ultimately, the AI-generated malware that has surfaced to date is mostly experimental, and the results aren’t impressive. The events are worth monitoring for developments that show AI tools producing new capabilities that were previously unknown. For now, though, the biggest threats continue to rely predominantly on old-fashioned tactics.

5 AI-developed malware families analyzed by Google fail to work and are easily detected Read More »