Author name: Mike M.

Researchers figure out how to get fresh lithium into batteries

In their testing, they use a couple of unusual electrode materials, such as a chromium oxide (Cr8O21) and an organic polymer (a sulfurized polyacrylonitrile). Both of these have significant weight advantages over the typical materials used in today’s batteries, although the resulting batteries typically lasted less than 500 cycles before dropping to 80 percent of their original capacity.

But the striking experiment came when they used LiSO2CF3 to rejuvenate a battery that had been manufactured as normal but had lost capacity due to heavy use. Treating a lithium-iron phosphate battery that had lost 15 percent of its original capacity restored almost all of what was lost, allowing it to hold over 99 percent of its original charge. They also ran a battery for repeated cycles with rejuvenation every few thousand cycles. At just short of 12,000 cycles, it still could be restored to 96 percent of its original capacity.

Before you get too excited, there are a couple of things worth noting about lithium-iron phosphate cells. The first is that, relative to their charge capacity, they’re a bit heavy, so they tend to be used in large, stationary batteries like the ones in grid-scale storage. They’re also long-lived on their own; with careful management, they can take over 8,000 cycles before they drop to 80 percent of their initial capacity. It’s not clear whether similar rejuvenation is possible in the battery chemistries typically used for the sorts of devices that most of us own.

The final caution is that the battery needs to be modified so that fresh electrolytes can be pumped in and the gases released by the breakdown of the LiSO2CF3 removed. It’s safest if this sort of access is built into the battery from the start, rather than provided by modifying it much later, as was done here. And the piping needed would put a small dent in the battery’s capacity per volume.

All that said, the treatment demonstrated here would bring even a well-managed battery back closer to its original capacity, and it would largely restore the capacity of one that hadn’t been carefully managed. That would allow us to get far more out of the initial expense of battery manufacturing, meaning it might make sense for batteries destined for a large storage facility, where many of them could potentially be treated at the same time.

Nature, 2025. DOI: 10.1038/s41586-024-08465-y

Elon Musk to “fix” Community Notes after they contradict Trump

Elon Musk apparently no longer believes that crowdsourcing fact-checking through Community Notes can never be manipulated and is, thus, the best way to correct bad posts on his social media platform X.

Community Notes are supposed to be added to posts to limit misinformation spread after a broad consensus is reached among X users with diverse viewpoints on what corrections are needed. But Musk now claims a “fix” is needed to prevent supposedly outside influencers from allegedly gaming the system.

“Unfortunately, @CommunityNotes is increasingly being gamed by governments & legacy media,” Musk wrote on X. “Working to fix this.”

Musk’s announcement came after Community Notes were added to X posts discussing a poll generating favorable ratings for Ukraine President Volodymyr Zelenskyy. That poll was conducted by a private Ukrainian company in partnership with a state university whose supervisory board was appointed by the Ukrainian government, creating what Musk seems to view as a conflict of interest.

Although other independent polling recently documented a similar increase in Zelenskyy’s approval rating, NBC News reported, the specific poll cited in X notes contradicted Donald Trump’s claim that Zelenskyy is unpopular, and Musk seemed to expect X notes should instead be providing context to defend Trump’s viewpoint. Musk even suggested that by pointing to the supposedly government-linked poll in Community Notes, X users were spreading misinformation.

“It should be utterly obvious that a Zelensky[y]-controlled poll about his OWN approval is not credible!!” Musk wrote on X.

Musk’s attack on Community Notes is somewhat surprising. Although he has always maintained that Community Notes aren’t “perfect,” he has defended Community Notes through multiple European Union probes challenging their effectiveness and declared that the goal of the crowdsourcing effort was to make X “by far the best source of truth on Earth.” At CES 2025, X CEO Linda Yaccarino bragged that Community Notes are “good for the world.”

Yaccarino invited audience members to “think about it as this global collective consciousness keeping each other accountable at global scale in real time,” but just one month later, Musk is suddenly casting doubts on that characterization while the European Union continues to probe X.

Perhaps most significantly, Musk previously insisted as recently as last year that Community Notes could not be manipulated, even by Musk. He strongly disputed a 2024 report from the Center for Countering Digital Hate that claimed that toxic X users were downranking accurate notes that they personally disagreed with, claiming any attempt at gaming Community Notes would stick out like a “neon sore thumb.”

Microsoft’s new AI agent can control software and robots

The researchers’ explanations about how “Set-of-Mark” and “Trace-of-Mark” work. Credit: Microsoft Research

The Magma model introduces two technical components: Set-of-Mark, which identifies objects that can be manipulated in an environment by assigning numeric labels to interactive elements, such as clickable buttons in a UI or graspable objects in a robotic workspace, and Trace-of-Mark, which learns movement patterns from video data. Microsoft says those features allow the model to complete tasks like navigating user interfaces or directing robotic arms to grasp objects.

Microsoft Magma researcher Jianwei Yang wrote in a Hacker News comment that the name “Magma” stands for “M(ultimodal) Ag(entic) M(odel) at Microsoft (Rese)A(rch),” after some people noted that “Magma” already belongs to an existing matrix algebra library, which could create some confusion in technical discussions.

Reported improvements over previous models

In its Magma write-up, Microsoft claims Magma-8B performs competitively across benchmarks, showing strong results in UI navigation and robot manipulation tasks.

For example, it scored 80.0 on the VQAv2 visual question-answering benchmark—higher than GPT-4V’s 77.2 but lower than LLaVA-Next’s 81.8. Its POPE score of 87.4 leads all models in the comparison. In robot manipulation, Magma reportedly outperforms OpenVLA, an open source vision-language-action model, in multiple robot manipulation tasks.

Magma’s agentic benchmarks, as reported by the researchers. Credit: Microsoft Research

As always, we take AI benchmarks with a grain of salt since many have not been scientifically validated as being able to measure useful properties of AI models. External verification of Microsoft’s benchmark results will become possible once other researchers can access the public code release.

Like all AI models, Magma is not perfect. It still faces technical limitations in complex decision-making that requires multiple steps over time, according to Microsoft’s documentation. The company says it continues to work on improving these capabilities through ongoing research.

Yang says Microsoft will release Magma’s training and inference code on GitHub next week, allowing external researchers to build on the work. If Magma delivers on its promise, it could push Microsoft’s AI assistants beyond limited text interactions, enabling them to operate software autonomously and execute real-world tasks through robotics.

Magma is also a sign of how quickly the culture around AI can change. Just a few years ago, this kind of agentic talk scared many people who feared it might lead to AI taking over the world. While some people still fear that outcome, in 2025, AI agents are a common topic of mainstream AI research that regularly takes place without triggering calls to pause all of AI development.

See a garbage truck’s CNG cylinders explode after lithium-ion battery fire

When firefighters arrived on scene, they asked the driver to dump his load in the street, which would reduce the risk of anything on the truck itself—gasoline, CNG, etc.—catching fire. Then the firefighters could put out the blaze easily, treating it like a normal trash fire, and have Groot haul away the debris afterward. But this didn’t work either. The flames had spread far enough by this point to put the truck’s dumping mechanism out of commission.

So, firefighters unrolled hoses and hooked up to a nearby fire hydrant. They recognized that the truck was CNG-powered, as were many Groot vehicles. CNG offers a lower maintenance cost, uses less fuel, and creates less pollution than diesel, but best practices currently suggest not spraying CNG cylinders directly with water. Firefighters instead tried to aim water right into the back of the garbage truck without wetting the CNG cylinders nearby on the roof.

They were waiting for the telltale hiss of the pressure relief system to trigger. These valves typically open within two to five minutes, depending on fire conditions, and they should be capable of venting all their natural gas some minutes before the CNG canisters would otherwise be in danger of exploding. But the hiss never came, and as Fire Chief Lance Harris and his crew worked to secure the scene and put water onto the burning load, the CNG canisters exploded catastrophically instead.

The explosion, as captured by a bodycam.

In a board of trustees meeting this week in Arlington Heights, Harris recounted the incident, noting that he felt lucky to be alive—and thankful that no township personnel or residents sustained serious injuries.

“We can’t prove it,” he said, but after two months of investigating the situation, his department had concluded with high probability that the fire had been caused by a lithium-ion battery discarded into a recycling container. This suspicion was based on the amount of fire and the heat and speed with which it burned; lithium-ion batteries that enter “thermal runaway” can burn hot, at around 750° Fahrenheit (399° C).

Harris’ takeaway was clear: recycle even small lithium-ion batteries responsibly, as they can cause real hazards if placed into the waste system, where they are often impacted or compressed.

FTC investigates “tech censorship,” says it’s un-American and may be illegal

The Federal Trade Commission today announced a public inquiry into alleged censorship online, saying it wants “to better understand how technology platforms deny or degrade users’ access to services based on the content of their speech or affiliations, and how this conduct may have violated the law.”

“Tech firms should not be bullying their users,” said FTC Chairman Andrew Ferguson, who was chosen by President Trump to lead the commission. “This inquiry will help the FTC better understand how these firms may have violated the law by silencing and intimidating Americans for speaking their minds.”

The FTC announcement said that “censorship by technology platforms is not just un-American, it is potentially illegal.” Tech platforms’ actions “may harm consumers, affect competition, may have resulted from a lack of competition, or may have been the product of anti-competitive conduct,” the FTC said.

The Chamber of Progress, a lobby group representing tech firms, issued a press release titled, “FTC Chair Rides MAGA ‘Tech Censorship’ Hobby Horse.”

“Republicans have spent nearly a decade campaigning against perceived social media ‘censorship’ by attempting to dismantle platforms’ ability to moderate content, despite well-established Supreme Court precedent,” the group said. “Accusations of ‘tech censorship’ also ignore the fact that conservative publishers and commentators receive broader engagement than liberal voices.”

Last year, the Supreme Court found that a Texas state law prohibiting large social media companies from moderating posts based on a user’s “viewpoint” is unlikely to withstand First Amendment scrutiny. The Supreme Court majority opinion said the court “has many times held, in many contexts, that it is no job for government to decide what counts as the right balance of private expression—to ‘un-bias’ what it thinks biased, rather than to leave such judgments to speakers and their audiences. That principle works for social-media platforms as it does for others.”

Scientists unlock vital clue to strange quirk of static electricity

Scientists can now explain the prevailing unpredictability of contact electrification, unveiling order from what has long been considered chaos.

Static electricity—specifically the triboelectric effect, aka contact electrification—is ubiquitous in our daily lives, found in such things as a balloon rubbed against one’s hair or styrofoam packing peanuts sticking to a cat’s fur (as well as human skin, glass tabletops, and just about anywhere you don’t want packing peanuts to be). The most basic physics is well understood, but long-standing mysteries remain, most notably how different materials exchange positive and negative charges—sometimes ordering themselves into a predictable series, but sometimes appearing completely random.

Now scientists at the Institute of Science and Technology Austria (ISTA) have identified a critical factor explaining that inherent unpredictability: It’s the contact history of given materials that controls how they exchange charges in contact electrification. They described their findings in a new paper published in the journal Nature.

Johan Carl Wilcke published the first so-called “triboelectric series” in 1757 to describe the tendency of different materials to self-order based on how they develop a positive or negative charge. A material toward the bottom of the list, like hair, will acquire a more negative charge when it comes into contact with a material near the top of the list, like a rubber balloon.

The issue with all these lists is that they are inconsistent and unpredictable—sometimes the same scientists don’t get the same ordering results twice when repeating experiments—largely because there are so many confounding factors that can come into play. “Understanding how insulating materials exchanged charge seemed like a total mess for a very long time,” said co-author Scott Waitukaitis of ISTA. “The experiments are wildly unpredictable and can sometimes seem completely random.”

A cellulose material’s charge sign, for instance, can depend on whether its curvature is concave or convex. Two materials can exchange charge from positive (A) to negative (B), but that exchange can reverse over time, with B being positive and A being negative. And then there are “triangles”: Sometimes one material (A) gains a positive charge when rubbed up against another material (B), but B will gain a positive charge when rubbed against a third material (C), and C, in turn, will gain positive charge when in contact with A. Even identical materials can sometimes exchange charge upon contact.
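One way to see why such “triangles” defeat any single triboelectric series: a series is a total order, and a cyclic set of charging outcomes admits no consistent ordering. A minimal sketch (the material names and outcomes here are hypothetical, purely for illustration):

```python
# Illustrative sketch: a "triangle" of charging outcomes cannot be
# arranged into any single triboelectric series. We try to topologically
# sort materials, where a pair (X, Y) means "X charges positive against Y".
from graphlib import TopologicalSorter, CycleError

def series_order(charging_pairs):
    """Return a consistent series (most positive first), or None if the
    observed outcomes are cyclic and no series exists."""
    ts = TopologicalSorter()
    for positive, negative in charging_pairs:
        ts.add(negative, positive)  # 'positive' must precede 'negative'
    try:
        return list(ts.static_order())
    except CycleError:
        return None

# A transitive set of outcomes orders fine:
print(series_order([("glass", "silk"), ("silk", "rubber")]))
# A "triangle" (A beats B, B beats C, C beats A) has no valid series:
print(series_order([("A", "B"), ("B", "C"), ("C", "A")]))
```

With transitive outcomes the sort succeeds; with a triangle it fails, which is exactly why experimenters repeating Wilcke’s exercise can end up with inconsistent lists.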

By the end of today, NASA’s workforce will be about 10 percent smaller

Spread across NASA’s headquarters and 10 field centers, which dot the United States from sea to sea, the space agency has had a workforce of nearly 18,000 civil servants.

However, by the end of today, that number will have shrunk by about 10 percent since the beginning of the second Trump administration four weeks ago. And the world’s preeminent space agency may still face significant additional cuts.

According to sources, about 750 employees at NASA accepted the “fork in the road” offer to take deferred resignation from the space agency later this year. This sounds like a lot of people, but generally about 1,000 people leave the agency every year, so effectively, many of these people might just be getting paid to leave jobs they were already planning to exit from.

The culling of “probationary” employees will be more impactful. As it has done at other federal agencies, the Trump administration is generally firing federal employees who are in the “probationary” period of their employment, which includes new hires within the last one or two years or long-time employees who have moved into or been promoted into a new position. About 1,000 or slightly more employees at NASA were impacted by these cuts.

Adding up the deferred resignations and probationary cuts, the Trump White House has now trimmed about 10 percent of the agency’s workforce.

However, the cuts may not stop there. Two sources told Ars that directors at the agency’s field centers have been told to prepare options for a “significant” reduction in force in the coming months. The scope of these cuts has not been defined, and it’s possible they may not even happen, given that the White House must negotiate budgets for NASA and other agencies with the US Congress. But this directive for further reductions in force casts more uncertainty on an already demoralized workforce and signals that the Trump administration would like to make further cuts.

Can public trust in science survive a second battering?


Public trust in science has shown a certain resiliency, but it is being tested like never before.

Public trust in science has been in the spotlight in recent years: After the US presidential election in November, one Wall Street Journal headline declared that “Science Lost America’s Trust.” Another publication called 2024 “the year of distrust in science.”

Some of that may be due to legitimate concerns: Public health officials have been criticized for their lack of transparency during critical moments, including the COVID-19 pandemic. And experts have noted the influence of political factors. For instance, the first Trump administration repeatedly undermined scientists—a trend repeating in his second term so far.

But what does the research say about where public trust in science, doctors, and health care institutions actually stands? In recent years, researchers have increasingly sought to quantify these sentiments. And indeed, multiple surveys and studies have reported that the COVID-19 pandemic correlated with a decline in trust in the years following the initial outbreak. This decrease, though, seems to be waning as new research shows a clearer picture of trust across time. One 2024 study suggests Trump’s attacks on science during his first term did not have the significant impact many experts feared—and may have even boosted confidence among certain segments of the population.

Overall confidence in scientific institutions has slightly rebounded since the pandemic, some research suggests, with that trust remaining strong across countries. Despite the uptick, the divide between political factions appears to be widening still, with Democrats showing higher levels of trust and Republicans showing lower levels, a polarization that became more pronounced during the COVID-19 pandemic.

“What we’re seeing now, several years later, is how deep those divisions really are,” said Cary Funk, who previously led science and society research at the Pew Research Center and has written reports on public trust in science. Funk is now a senior adviser for public engagement at the Aspen Institute Science and Society Program.

Political and economic entities have weaponized certain scientific topics, such as climate change, as well as the mistrust in science to advance their own interests, said Gabriele Contessa, a philosopher of science at Carleton University in Ottawa, Canada. In the future, that weaponization might engender mistrust related to other issues, he added. It remains to be seen what effect a second Trump term may have on confidence in science. Already, Trump issued a communications freeze on Department of Health and Human Services officials and paused federal grants, a move that was ultimately rescinded but still unleashed a flurry of chaos and confusion throughout academic circles.

“To have people like Donald Trump, who clearly do not trust reputable scientific sources and often trust instead disreputable or at least questionable scientific sources, is actually a very, very strong concern,” Contessa said.

Who will act in the public’s best interest?

In the winter of 2021, the Pew Research Center conducted a survey of around 14,500 adults in the US, asking about their regard for different groups of individuals, including religious leaders, police officers, and medical scientists. The proportion of the survey takers who said they had a great deal of confidence in scientists to act in the public’s best interest, the researchers found, decreased from 39 percent in November 2020 to 29 percent just one year later. In October 2023, at the lowest point since the pandemic began, only 23 percent reported a great deal of confidence in scientists. An analysis conducted by The Associated Press-NORC Center for Public Affairs Research reported a comparable decline: In 2018, 48 percent of respondents reported a great deal of confidence in scientists; in 2022, it was down to just 39 percent.

But years later, a new survey conducted in October 2024 suggested that the dip in trust may have been temporary. An update to the Pew survey that sought input from almost 10,000 adults in the US shows a slow recovery: up from the low of 23 percent, 26 percent now report having a great deal of confidence.

Similarly, a 2024 study examining attitudes toward scientific expertise during a 63-year period found that Trump and Republican attacks on science, in general, did not actually sway public trust when comparing responses in 2016 to those from 2020. And a recent international survey that asked nearly 72,000 individuals in 68 countries their thoughts on scientists revealed that most people trust scientists and want them to be a part of the policy making process.

“There are still lots of people who have at least a kind of soft inclination to have confidence or trust in scientists, to act in the interests of the public,” said Funk. “And so majorities of Americans, majorities even of Republicans, have that view.”

But while public trust in general seems to be resilient, that finding becomes more complex on closer inspection. Confidence can remain high and increase for some groups, while simultaneously declining in others. The same study that looked at Trump’s influence on trust during his first administration, for instance, found that some polarization grew stronger on both ends of the spectrum. “Twelve percent of USA adults became more skeptical of scientific expertise in response to Trump’s dismissal of science, but 20 percent increased their trust in scientific expertise during the same period,” the study noted. Meanwhile, the neutral middle shrank: In 2016, 76 percent reported that they had no strong opinions on their trust in science. In 2020, that plunged to 29 percent.

The COVID-19 pandemic also seems to have had a pronounced effect on that gap: Consistently, research conducted after the pandemic shows that people with conservative ideologies distrust science more than those who are left-leaning. Overall, Republicans’ confidence in science fell 23 points from 2018 to 2022, dropping by half. Another recent poll shows declining confidence, specifically in Republican individuals, in health agencies such as the Centers for Disease Control and Prevention and the Food and Drug Administration. This distrust was likely driven by the politicization of pandemic policies, such as masking, vaccine mandates, and lockdowns, according to commentaries from experts.

The international survey of individuals in 68 countries did not find a relationship between trust in science and political orientation. Rod Abhari, a PhD candidate at Northwestern University who studies the role of digital media on trust, told Undark this suggests that conservative skepticism toward science is not rooted in ideology but is instead a consequence of deliberate politicization by corporations and Republican pundits. “Republican politicians have successfully mobilized the conspiracy and resistance to scientists—and not just scientists, but government agencies that represent science and medicine and nutrition,” he added.

“Prior to the outbreak,” said Funk, “views of something like medical researchers, medical doctors, medical scientists, were not particularly divided by politics.”

Second time around

So, what does this research mean for a second Trump term?

One thing that experts have noticed is that rather than distrusting specific types of scientists, such as climate change researchers, conservatives have begun to lump together scientists across specialties and to distrust scientists in general, said Funk.

Going forward, Abhari predicted, “the scope of what science is politicized will expand” beyond hot-button topics like climate change. “I think it’ll become more existential, where science funding in general will become on the chopping block,” he said in mid-January. With the recent temporary suspensions on research grant reviews and payments for researchers and talk of mass layoffs and budget cuts at the National Science Foundation, scientists are already worried about how science funding will be affected.

This weaponization of science has contributed and will continue to lead to eroding trust, said Contessa. Already, topics like the effects of gas stoves on health have been weaponized by entities with political and economic motivations, like gas production companies, he pointed out. “It shows you really any topic, anything” can be used to sow skepticism in scientists, he said.

Many experts emphasize strategies to strengthen overall trust, close the partisan gap, and avoid further politicization of science.

Christine Marizzi, who leads a science education effort in Harlem for a nonprofit organization called BioBus, highlights the need for community engagement to make science more visible and accessible to improve scientists’ credibility among communities.

Ultimately, Abhari said, scientists need to be outspoken about the politicization of science to be able to regain individuals’ trust. This “will feel uncomfortable because science has typically tried to brand itself as being apolitical, but I think it’s no longer possible,” Abhari said. “It’s sort of the political reality of the situation.”

The increasing polarization in public trust is concerning, said Funk. So “it’s an important time to be making efforts to widen trust in science.”

This article was originally published on Undark. Read the original article.

Medical Roundup #4

It seems like as other things drew our attention more, medical news slowed down. The actual developments, I have no doubt, are instead speeding up – because AI.

Note that this post intentionally does not cover anything related to the new Administration, or its policies.

  1. Some People Need Practical Advice.

  2. Good News, Everyone.

  3. Bad News.

  4. Life Extension.

  5. Doctor Lies to Patient.

  6. Study Lies to Public With Statistics.

  7. Area Man Discovers Information Top Doctors Missed.

  8. Psychiatric Drug Prescription.

  9. H5N1.

  10. WHO Delenda Est.

  11. Medical Ethicists Take Bold Anti-Medicine Stance.

  12. Rewarding Drug Development.

  13. Not Rewarding Device Developers.

  14. Addiction.

  15. Our Health Insurance Markets are Broken.

If you ever have to go to the hospital for any reason, suit up, or at least look good.

Life expectancy is still rising in the longest-lived countries.

Challenge trials are not in general riskier than RCTs, and on net they dramatically increase health and save lives while being entirely voluntary, but that is of course all orthogonal to the concerns of bioethicists.

We now have a 100% effective practical way to prevent HIV infection.

Naloxone alone did not do much to reduce opioid deaths, but Narcan did by allowing those without training to administer the drug. This does not tell us about second-order impacts, but presumably people not dying is good.

The FDA will occasionally do something helpful… eventually… after exhausting seven years and all alternatives. In 2017 a law required the FDA to let us buy hearing aids. In 2021 they put out a rule ‘for public comment.’ In 2024 it finally happened.

China has a semi-libertarian Medical Tourism Pilot Zone in Hainan, where anything approved elsewhere is allowed, and many other restrictions are also waived. Alex Tabarrok notes this could be a model for the upside of Prospera. I say the true upside comes when you don’t need the approval at all, so long as your disclosures are clear.

So it looks like uterus transplants… work? As in the patients get to have kids.

Cate Hall reports much improved overall health from testosterone replacement therapy as a 40yo cis woman.

There’s a concierge doctor in Austin called Riverrock Medical that wrote a Bayesian calculator app for doctors.

States creating pathways for foreign doctors to practice medicine in America without redoing residency. More of this, please.

Nikhil Krishnan: I feel like the fact that residency slots aren’t expanded probably isn’t due to funding (residents seem to actually be ROI positive for hospitals since they’re cheap and can still bill) but actually due to capacity constraints of training people. This effectively seems like using other countries as our expanded capacity for resident training?

That is not quite the same as saying the residency slots cost too much, but it also is not that different. One of the costs of providing a residency slot is the training time, which requires doctor time, which is expensive, increasing staffing needs. If pricing that in still leaves profit, including accounting for transitional costs, you’d see expanded residency slots, so I presume that after taking this into account adding new slots is not actually profitable, even if current slots do okay.

Worries about what will happen to the genetic information from 23andMe. As others have noted, one of our charitably inclined billionaires should step up and buy the information to both preserve it for research in anonymized form and to protect the personal information.

Tyler Cowen reports on the book by the new head of the FDA. It seems right that Marty Makary is a pick with upsides and not downsides, but it also seems right to be unexcited for now, and on net disappointed. This was an opportunity to get a huge win, and the value was in the chance that we’d get a huge win, which is now lower (but not gone), the same way Operation Warp Speed was a civilization-level huge win. If Trump changes his mind or Makary runs into issues, I am still available for this or a number of other high-level posts, such as head of AISI.

Sarah Constantin confirms this one is legit, Nature paper, IL-11 inhibition is a 25% life extension IN MICE. I trust her on such matters.

Another claim of 25% life extension IN MICE, on a single injection, with various signs of physical rejuvenation.

Still very early, also starting to get interesting.

Caloric restriction appears to extend life only in short-lived animal models, failing in longer-lived ones. That’s highly unfortunate, since humans live a long time.

If you are terminally ill and ask how long you have, the doctor will overestimate your time left, on average by a factor of five. This was mostly (63%) in cancer patients. They frame that as doctors being inaccurate.

I find it hard to believe this is not motivated. Even if the doctors consciously believe the estimates, they almost have to be offering optimistic outlooks that are ‘on purpose’ in various ways.

If they started out optimistic by a factor of five as an honest mistake and were trying to be accurate, it wouldn’t take long for them to notice that their patients all keep dying far sooner than predicted, and to adjust their estimates.

Potential motivations include preventing the patient from giving up hope or seeking expensive and painful alternative treatments they doubt will do anything useful, telling them what they want to hear, avoiding the whole ‘doctor said I would die soon and here I am’ thing and so on. I do sympathize.

I also find it weird to assess a prediction for ‘accuracy’ based on the actual time of death – a prediction is over a distribution of outcomes. You can only properly judge prediction accuracy statistically.
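A toy simulation of that point (the lognormal model and its parameters are invented for illustration, not taken from the study): a doctor who always predicts the true median survival time is perfectly calibrated, yet the average ratio of predicted to actual time can still come out around five, purely because the ratio’s distribution is skewed.

```python
import math
import random
import statistics

random.seed(0)
SIGMA = 1.8  # spread of log survival times; chosen so skew alone gives ~5x

# Each patient's actual survival time is lognormal around the doctor's
# prediction; the doctor always predicts the true median (so the
# predicted/actual ratio has median 1 by construction).
ratios = []
for _ in range(100_000):
    actual = math.exp(SIGMA * random.gauss(0, 1))
    ratios.append(1 / actual)  # predicted time / actual time

print(statistics.median(ratios))  # ~1: median-calibrated
print(statistics.fmean(ratios))   # ~5: "overestimates by 5x on average"
```

So a mean ratio of five, on its own, doesn’t even establish that the forecasts were bad, only that survival times are heavy-tailed; you’d want a proper calibration analysis.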

Is there a relation to this next story, if you look at the incentives?

Pregnant woman goes into labor at 22 weeks, hospital tells her there is no hope, she drives 7 miles to another hospital she finds on Facebook, and now she has a healthy four-year-old. Comments have a lot of other ‘the doctors told us our child would never survive, but then we got a second opinion and they did anyway’ stories.

Based on my personal experience as well, in America you really, really need to be ready to advocate for yourself when dealing with pregnancy. Some doctors are wonderful, other doctors will often flat out give you misinformation or do what is convenient for them if you don’t stop them. Also don’t forget to consult Claude.

Inequality is always a function of how you take the measurement and divide the groups.

Here is an extreme example.

St. Rev. Dr. Rev: Remember that study that said 12% of the population eats 50% of the beef? And it turned out that they meant in a given 24 hour period? Eat a burger on Tuesday, suddenly you’re in the 12%. Have a salad on Wednesday, you’re in the 88%.

This is that, but for cancer. Come on, man.

“5% of people are responsible for just over half.” This is analogous to saying 5% of the people buy 50% of the houses — yes, because most people don’t buy a house every year! This should not come as a surprise!

Cremieux: 1% of people are responsible for 24% of the health spending in America and 5% of people are responsible for just over half.

I had approximately zero medical expenses until I was seventeen. Burst appendix, nearly died. From age 0 to 20, 99% of my medical expenses occurred in 5% of the years. This is normal!

It is important to generalize this mode of thinking.

Sorting people by Y and saying the top X% in Y have most of the Y is typically not so meaningful.

What you want is to say that you sort people by Z in advance, and then notice that the top X% in Z then later accumulate a lot of the Y (where Z can be ‘previous amount of Y’ or something else). Then you are more likely to be measuring something useful, and smaller imbalances mean a lot more.

If you must measure top X% in Y as having most of the Y, then you have to at least ensure you are doing this over a meaningful period, and put it in sensible context.
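A quick simulation of why the measurement window matters so much (all parameters invented): give every person the identical habit, and a one-day window still makes the top 12% look dominant, while a one-year window nearly dissolves the apparent concentration.

```python
import random

random.seed(1)
N, DAYS, P_EAT = 10_000, 365, 0.2  # identical people, same daily habit

# daily[i][d] = 1 if person i "ate a burger" on day d
daily = [[1 if random.random() < P_EAT else 0 for _ in range(DAYS)]
         for _ in range(N)]

def top_share(totals, frac=0.12):
    """Share of the total held by the top `frac` of people."""
    totals = sorted(totals, reverse=True)
    k = int(len(totals) * frac)
    return sum(totals[:k]) / sum(totals)

one_day = top_share([person[0] for person in daily])
one_year = top_share([sum(person) for person in daily])
print(one_day, one_year)  # ~0.6 for one day vs ~0.14 for one year
```

With a single day, the “top 12%” is just whoever happened to eat that day; over a year, everyone converges to the same total and the statistic collapses toward 12%.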

What’s the right way to think about this cartoon?

Sometimes, yes, you will see something the world’s top scientists and doctors missed.

Far more often, you will see something that the ‘consensus’ of those scientists and doctors missed. Yes, somewhere out there one of them already had the same idea, but this fact does not, on its own, help you.

Far more often than that, you will come up with something that was not successfully communicated to your particular average doctor, who also does not share your internal experiences or interest in your case. Or where ‘the system’ follows procedures that let you down, and it is very easy to see they are not doing what is in your best interest. That should be entirely unsurprising.

Also of course sometimes the system plays the odds from its perspective, and it turns out the odds were wrong, and you may or may not have had enough information to know this any more than the system did.

So we have personal stories like this one, or this one, or this one, where a doctor got it wrong. In particular, no, you should not trust that Average Doctor got the right answer and you couldn’t possibly figure out anything they didn’t. Doctors are often hurried and overworked, mostly don’t understand probability, have an impossible amount of knowledge to keep up with, and are trying to get through the day like everyone else.

In the broader case where you actually are defying a clear consensus, and doing so in general rather than for you specifically, you should of course be more skeptical that you’re right, but if you are reading this, it’s entirely plausible.

And as always, of course, don’t forget to ask Claude.

Case that the rise in consumption of psychiatric drugs is less a story about smartphones, social media, and other cultural shifts, and more a story of reduced costs and improved convenience of access (especially with remote doctor visits) shifting the point where supply intersects demand, along with campaigns that lessened the stigma, which also lowers the effective price.

Certainly this is all a key contributing factor: the inconvenience and cost considerations matter, and people let them matter more than they should.

The first question to ask would be, given this, should we make these drugs easier or harder to get? Cheaper or more expensive?

Scott Alexander did the ‘more than you wanted to know’ treatment for H5N1. He predicts a 5% chance of a pandemic from it in the next year and 50% in the next twenty under otherwise normal circumstances, with ~6% chance it’s Spanish flu level or worse if and when it happens.
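As a sanity check on how those numbers fit together (my arithmetic, not Scott’s): a constant, independent 5% annual risk would compound to well above 50% over twenty years, so his 50% twenty-year figure implies an annual risk he expects to decline, rather than a naive compounding of the 5%.

```python
# Compounding a constant, independent 5% annual pandemic risk
# over twenty years: 1 - (probability of no pandemic each year)^20.
p_twenty_years = 1 - (1 - 0.05) ** 20
print(f"{p_twenty_years:.0%}")  # 64%
```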

It is of concern, but it isn’t that different from the background level of concern from flu pandemics anyway. We really do have a lot of pandemics in general and flu pandemics in particular.

We should be preventing and preparing for them, we are not doing much of either, and this is mostly just another example of that. My guess is that the odds are worse than that, but that the fundamental takeaway is similar.

That matches how I have been thinking about this for a while. I’ve made a deliberate decision not to cover H5N1, and all the ways in which we are repeating the mistakes of Covid-19.

If we do get a full pandemic I will feel compelled to cover it, but until then I don’t think my observations would cause people reading this to make better decisions. I essentially despair of actually changing the policy decisions that matter, for any amount of attention I might reasonably direct to the subject.

WHO has declared a public health emergency for Monkeypox, yet refuses to authorize the vaccine that stopped the last outbreak despite its approval by the FDA and other major medical agencies. WHO really is the worst.

The reason medical progress is so slow is largely the types of people who, when they hear a woman cured her own cancer and published the results to benefit others, heroically advancing the cause of humanity at risk only to herself, warn of the dire ethical problems with that.

Often those people call themselves ‘ethicists.’ No, seriously, that is a thing, and that situation somehow rose to the level of a writeup in Nature, which calls self-experimentation an ‘ethically fraught practice.’

This high quality application to the Delenda Est club has been noted.

I strongly agree with Tyler Cowen that we do not provide enough financial incentive to those who create new drugs. Robert Sterling here explains, as well. They deserve very large rewards when they find the next Ozempic.

The question is, should the world be doing this in the form of America paying super high prices while everyone else shirks?

As in, no, price controls are bad…

…because price controls everywhere else leave it entirely on us to subsidize drug development. And we do it in a way that limits consumer surplus, since marginal costs are low. How do we address this?

One possibility is that patent buyouts seem better than indefinite high prices. The government can borrow cheaply and assume the variance, and marginal cost is low, so we can increase efficiency twice over, and also improve distributional effects, if we can negotiate a fair price. And it helps the pharma companies, since they realize their profits faster, and can plow it back into more investment.
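The “government can borrow cheaply” point can be made concrete with a toy discounting example (all numbers invented): the same cash-flow stream is worth more at the government’s discount rate than at the firm’s, so there is a range of buyout prices that leaves both sides better off.

```python
def npv(cashflow, rate, years):
    """Present value of a level annual cash flow."""
    return sum(cashflow / (1 + rate) ** t for t in range(1, years + 1))

# Hypothetical drug: $10B/year of monopoly profit for 12 remaining
# patent years, valued at the firm's cost of capital vs. the
# government's borrowing rate.
firm_value = npv(10, rate=0.09, years=12)
gov_value = npv(10, rate=0.04, years=12)
print(round(firm_value, 1), round(gov_value, 1))
```

Here the firm values the stream at about $72B and the government at about $94B, so any price in between is a mutually beneficial trade, before even counting the surplus from then pricing the drug near marginal cost.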

The other half is that we are subsidizing everyone else. This is better than not doing it, but it would be nice to get others to also pay their fair share?

One obvious brainstorm is to do a price control, but as a maximum markup over the first-world average price. We will pay, say, five times the average price elsewhere. That way, the companies can negotiate harder for higher prices? Alas, I doubt a good enough multiplier would be palatable (among other issues), so I guess not.

On the research side, we win again. We can safely pay well above the net present value of cash flows from the monopoly, because the surplus from greater production is that much higher. Meanwhile, we reduce uncertainty, and Novo Nordisk gets the payout right away, so it can do far more R&D spending with the profits, and can justify more investment on the anticipation of similar future buyout deals. It’s win-win.

The supervillains? A very large decline in innovation for medical devices after Medicare and Medicaid price cuts.

We investigate the effects of substantial Medicare price reductions in the medical device industry, which amounted to a 61% decrease over 10 years for certain device types. Analyzing over 20 years of administrative and proprietary data, we find these price cuts led to a 25% decline in new product introductions and a 75% decrease in patent filings, indicating significant reductions in innovation activity.

Manufacturers decreased market entry and increased outsourcing to foreign producers, associated with higher rates of product defects. Our calculations suggest the value of lost innovation may fully offset the direct cost savings from the price cuts. We propose that better-targeted pricing reforms could mitigate these negative effects. These findings underscore the need to balance cost containment with incentives for innovation and quality in policy design.

Several commenters at MR pointed out that these areas of the DME industry are rife with fraud and abuse, and indeed Claude noted without direct prompting that this is largely what motivated the price cuts. It is not obvious that we would have wanted all this lost ‘innovation,’ as opposed to it being set up to take advantage of the government writing the checks.

Here’s an Abundance Agenda for Addiction, with a focus on development of more drugs to help alongside GLP-1s, by fixing the many ways in which our system fights against attempts to make anti-addiction medications. Advance purchase and risk sharing agreements, extended exclusivity, expedited approval, and especially ability to actually run reasonable studies would help a lot, and straight up funding wouldn’t hurt either.

The insanity that is buying health insurance in California in particular. They cap how much you can pay as a percentage of income at 8.5%, but then you get to shop around for different insurance plans with different benefits and sticker prices. So there’s a middle ground where all health insurance is equally outrageously expensive, and no amount of shopping around will ever save you a dollar. I expect that to turn out about as well as you would expect, on so many levels.
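A sketch of why the cap kills shopping (the income and premiums are made up; the 8.5% figure is from the post): once a plan’s sticker price exceeds the cap, every such plan costs you exactly the same out of pocket.

```python
def out_of_pocket(sticker_price, income, cap_rate=0.085):
    """Annual premium actually paid under an income-based cap."""
    return min(sticker_price, cap_rate * income)

income = 60_000  # cap = $5,100/year
for sticker in (4_000, 8_000, 12_000, 20_000):
    print(sticker, out_of_pocket(sticker, income))
# Every plan priced above $5,100 costs this buyer exactly $5,100,
# so shopping among those plans saves nothing.
```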

If we want to do progressive transfers of wealth we should do them directly, not via health insurance.

Discussion about this post

Medical Roundup #4 Read More »

“nokiapple-lumiphone-1020-se”-merges-windows-phone-body-with-budget-iphone-guts

“NokiApple LumiPhone 1020 SE” merges Windows Phone body with budget iPhone guts

Remember the Lumia 1020? It’s back—in iPhone SE form.

The Lumia 1020 was a lot of smartphone in July 2013. It debuted with a focus “almost entirely on the phone’s massive camera,” Ars wrote at the time. That big 41-megapixel sensor jutted forth from the phone body, and Nokia reps showed off its low-light, rapid-motion camera abilities by shooting pictures of breakdancers in a dark demonstration room. The company also offered an optional camera grip—one that made it feel a lot more like a point-and-shoot camera. In a more robust review, Ars suggested the Lumia 1020 might actually make the point-and-shoot obsolete.

Front of the Lumia 1020, showing a bit of Windows Phone square grid flair. Casey Johnston

The Lumia 1020 contained yet another cutting-edge concept of the day: Windows Phone, Microsoft’s color-coded, square-tiled companion to its mobile-forward Windows 8. The mobile OS never got past the users-and-apps chicken-and-egg conundrum, and Microsoft called it quits in October 2017. The end of that distant-third-place mobile OS would normally signal the end of the Lumia 1020 as a usable phone.

But there was a person named /u/OceanDepth95028 who saw beyond, and where others thought, “LOL,” this person thought, “Why not?” And this person looked at the Lumia 1020 and saw a third-generation iPhone SE inside of it. And then this person made that phone, and it booted. And the person saw that it was good, and they posted the tale to Reddit’s r/hackintosh.

“NokiApple LumiPhone 1020 SE” merges Windows Phone body with budget iPhone guts Read More »

reddit-mods-are-fighting-to-keep-ai-slop-off-subreddits-they-could-use-help.

Reddit mods are fighting to keep AI slop off subreddits. They could use help.


Mods ask Reddit for tools as generative AI gets more popular and inconspicuous.

Redditors in a treehouse with a NO AI ALLOWED sign

Credit: Aurich Lawson (based on a still from Getty Images)

Like it or not, generative AI is carving out its place in the world. And some Reddit users are definitely in the “don’t like it” category. While some subreddits openly welcome AI-generated images, videos, and text, others have responded to the growing trend by banning most or all posts made with the technology.

To better understand the reasoning and obstacles associated with these bans, Ars Technica spoke with moderators of subreddits that totally or partially ban generative AI. Almost all these volunteers described moderating against generative AI as a time-consuming challenge they expect to get more difficult as time goes on. And most are hoping that Reddit will release a tool to help their efforts.

It’s hard to know how much AI-generated content is actually on Reddit, and getting an estimate would be a large undertaking. Image library Freepik has analyzed the use of AI-generated content on social media but leaves Reddit out of its research because “it would take loads of time to manually comb through thousands of threads within the platform,” spokesperson Bella Valentini told me. For its part, Reddit doesn’t publicly disclose how many Reddit posts involve generative AI use.

To be clear, we’re not suggesting that Reddit has a large problem with generative AI use. By now, many subreddits seem to have agreed on their approach to AI-generated posts, and generative AI has not superseded the real, human voices that have made Reddit popular.

Still, mods largely agree that generative AI will likely get more popular on Reddit over the next few years, making generative AI modding increasingly important to both moderators and general users. Generative AI’s rising popularity has also had implications for Reddit the company, which in 2024 started licensing Reddit posts to train the large language models (LLMs) powering generative AI.

(Note: All the moderators I spoke with for this story requested that I use their Reddit usernames instead of their real names due to privacy concerns.)

No generative AI allowed

When it comes to anti-generative AI rules, numerous subreddits have zero-tolerance policies, while others permit posts that use generative AI if it’s combined with human elements or is executed very well. These rules task mods with identifying posts using generative AI and determining if they fit the criteria to be permitted on the subreddit.

Many subreddits have rules against posts made with generative AI because their mod teams or members consider such posts “low effort” or believe AI runs counter to the subreddit’s mission of providing real human expertise and creations.

“At a basic level, generative AI removes the human element from the Internet; if we allowed it, then it would undermine the very point of r/AskHistorians, which is engagement with experts,” the mods of r/AskHistorians told me in a collective statement.

The subreddit’s goal is to provide historical information, and its mods think generative AI could make information shared on the subreddit less accurate. “[Generative AI] is likely to hallucinate facts, generate non-existent references, or otherwise provide misleading content,” the mods said. “Someone getting answers from an LLM can’t respond to follow-ups because they aren’t an expert. We have built a reputation as a reliable source of historical information, and the use of [generative AI], especially without oversight, puts that at risk.”

Similarly, Halaku, a mod of r/wheeloftime, told me that the subreddit’s mods banned generative AI because “we focus on genuine discussion.” Halaku believes AI content can’t facilitate “organic, genuine discussion” and “can drown out actual artwork being done by actual artists.”

The r/lego subreddit banned AI-generated art because it caused confusion in online fan communities and retail stores selling Lego products, r/lego mod Mescad said. “People would see AI-generated art that looked like Lego on [I]nstagram or [F]acebook and then go into the store to ask to buy it,” they explained. “We decided that our community’s dedication to authentic Lego products doesn’t include AI-generated art.”

Not all of Reddit is against generative AI, of course. Subreddits dedicated to the technology exist, and some general subreddits permit the use of generative AI in some or all forms.

“When it comes to bans, I would rather focus on hate speech, Nazi salutes, and things that actually harm the subreddits,” said 3rdusernameiveused, who moderates r/consoom and r/TeamBuilder25, which don’t ban generative AI. “AI art does not do that… If I was going to ban [something] for ‘moral’ reasons, it probably won’t be AI art.”

“Overwhelmingly low-effort slop”

Some generative AI bans are reflective of concerns that people are not being properly compensated for the content they create, which is then fed into LLM training.

Mod Mathgeek007 told me that r/DeadlockTheGame bans generative AI because its members consider it “a form of uncredited theft,” adding:

You aren’t allowed to sell/advertise the work of others, and AI in a sense is using patterns derived from the work of others to create mockeries. I’d personally have less of an issue with it if the artists involved were credited and compensated—and there are some niche AI tools that do this.

Other moderators simply think generative AI reduces the quality of a subreddit’s content.

“It often just doesn’t look good… the art can often look subpar,” Mathgeek007 said.

Similarly, r/videos bans most AI-generated content because, according to its announcement, the videos are “annoying” and “just bad video” 99 percent of the time. In an online interview, r/videos mod Abrownn told me:

It’s overwhelmingly low-effort slop thrown together simply for views/ad revenue. The creators rarely care enough to put real effort into post-generation [or] editing of the content [and] rarely have coherent narratives [in] the videos, etc. It seems like they just throw the generated content into a video, export it, and call it a day.

An r/fakemon mod told me, “I can’t think of anything more low-effort in terms of art creation than just typing words and having it generated for you.”

Some moderators say generative AI helps people spam unwanted content on a subreddit, including posts that are irrelevant to the subreddit and posts that attack users.

“[Generative AI] content is almost entirely posted for purely self promotional/monetary reasons, and we as mods on Reddit are constantly dealing with abusive users just spamming their content without regard for the rules,” Abrownn said.

A moderator of the r/wallpaper subreddit, which permits generative AI, disagrees. The mod told me that generative AI “provides new routes for novel content” in the subreddit and questioned concerns about generative AI stealing from human artists or offering lower-quality work, saying those problems aren’t unique to generative AI:

Even in our community, we observe human-generated content that is subjectively low quality (poor camera/[P]hotoshopping skills, low-resolution source material, intentional “shitposting”). It can be argued that AI-generated content amplifies this behavior, but our experience (which we haven’t quantified) is that the rate of such behavior (whether human-generated or AI-generated content) has not changed much within our own community.

But we’re not a very active community—[about] 13 posts per day … so it very well could be a “frog in boiling water” situation.

Generative AI “wastes our time”

Many mods are confident in their ability to effectively identify posts that use generative AI. A bigger problem is how much time it takes to identify these posts and remove them.

The r/AskHistorians mods, for example, noted that all bans on the subreddit (including bans unrelated to AI) have “an appeals process,” and “making these assessments and reviewing AI appeals means we’re spending a considerable amount of time on something we didn’t have to worry about a few years ago.”

They added:

Frankly, the biggest challenge with [generative AI] usage is that it wastes our time. The time spent evaluating responses for AI use, responding to AI evangelists who try to flood our subreddit with inaccurate slop and then argue with us in modmail [direct messages to a subreddit’s mod team], and discussing edge cases could better be spent on other subreddit projects, like our podcast, newsletter, and AMAs, … providing feedback to users, or moderating input from users who intend to positively contribute to the community.

Several other mods I spoke with agree. Mathgeek007, for example, named “fighting AI bros” as a common obstacle. And for r/wheeloftime moderator Halaku, the biggest challenge in moderating against generative AI is “a generational one.”

“Some of the current generation don’t have a problem with it being AI because content is content, and [they think] we’re being elitist by arguing otherwise, and they want to argue about it,” they said.

A couple of mods noted that it’s less time-consuming to moderate subreddits that ban generative AI than it is to moderate those that allow posts using generative AI, depending on the context.

“On subreddits where we allowed AI, I often take a bit longer time to actually go into each post where I feel like… it’s been AI-generated to actually look at it and make a decision,” explained N3DSdude, a mod of several subreddits with rules against generative AI, including r/DeadlockTheGame.

MyarinTime, a moderator for r/lewdgames, which allows generative AI images, highlighted the challenges of identifying human-prompted generative AI content versus AI-generated content prompted by a bot:

When the AI bomb started, most of those bots started using AI content to work around our filters. Most of those bots started showing some random AI render, so it looks like you’re actually talking about a game when you’re not. There’s no way to know when those posts are legit games unless [you check] them one by one. I honestly believe it would be easier if we kick any post with [AI-]generated image… instead of checking if a button was pressed by a human or not.

Mods expect things to get worse

Most mods told me it’s pretty easy for them to detect posts made with generative AI, pointing to the distinct tone and favored phrases of AI-generated text. A few said that AI-generated video is harder to spot but still detectable. But as generative AI gets more advanced, moderators are expecting their work to get harder.

In a joint statement, r/dune mods Blue_Three and Herbalhippie said, “AI used to have a problem making hands—i.e., too many fingers, etc.—but as time goes on, this is less and less of an issue.”

R/videos’ Abrownn also wonders how easy it will be to detect AI-generated Reddit content “as AI tools advance and content becomes more lifelike.”

Mathgeek007 added:

AI is becoming tougher to spot and is being propagated at a larger rate. When AI style becomes normalized, it becomes tougher to fight. I expect generative AI to get significantly worse—until it becomes indistinguishable from ordinary art.

Moderators currently use various methods to fight generative AI, but they’re not perfect. r/AskHistorians mods, for example, use “AI detectors, which are unreliable, problematic, and sometimes require paid subscriptions, as well as our own ability to detect AI through experience and expertise,” while N3DSdude pointed to tools like Quid and GPTZero.

To manage current and future work around blocking generative AI, most of the mods I spoke with said they’d like Reddit to release a proprietary tool to help them.

“I’ve yet to see a reliable tool that can detect AI-generated video content,” Abrownn said. “Even if we did have such a tool, we’d be putting hundreds of hours of content through the tool daily, which would get rather expensive rather quickly. And we’re unpaid volunteer moderators, so we will be outgunned shortly when it comes to detecting this type of content at scale. We can only hope that Reddit will offer us a tool at some point in the near future that can help deal with this issue.”

A Reddit spokesperson told me that the company is evaluating what such a tool could look like. But Reddit doesn’t have a rule banning generative AI overall, and the spokesperson said the company doesn’t want to release a tool that would hinder expression or creativity.

For now, Reddit seems content to rely on moderators to remove AI-generated content when appropriate. Reddit’s spokesperson added:

Our moderation approach helps ensure that content on Reddit is curated by real humans. Moderators are quick to remove content that doesn’t follow community rules, including harmful or irrelevant AI-generated content—we don’t see this changing in the near future.

Making a generative AI Reddit tool wouldn’t be easy

Reddit is handling the evolving concerns around generative AI as it has handled other content issues, including by leveraging AI and machine learning tools. Reddit’s spokesperson said that this includes testing tools that can identify AI-generated media, such as images of politicians.

But making a proprietary tool that allows moderators to detect AI-generated posts won’t be easy, if it happens at all. The current tools for detecting generative AI are limited in their capabilities, and as generative AI advances, Reddit would need to provide tools that are more advanced than the AI-detecting tools that are currently available.

That would require a good deal of technical resources and would also likely present notable economic challenges for the social media platform, which only became profitable last year. And as noted by r/videos moderator Abrownn, tools for detecting AI-generated video still have a long way to go, making a Reddit-specific system especially challenging to create.

But even with a hypothetical Reddit tool, moderators would still have their work cut out for them. And because Reddit’s popularity is largely due to its content from real humans, that work is important.

Since Reddit’s inception, that has meant relying on moderators, which Reddit has said it intends to keep doing. As r/dune mods Blue_Three and Herbalhippie put it, it’s in Reddit’s “best interest that much/most content remains organic in nature.” After all, Reddit’s profitability has a lot to do with how much AI companies are willing to pay to access Reddit data. That value would likely decline if Reddit posts became largely AI-generated themselves.

But providing the technology to ensure that generative AI isn’t abused on Reddit would be a large challenge. For now, volunteer laborers will continue to bear the brunt of generative AI moderation.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.

Reddit mods are fighting to keep AI slop off subreddits. They could use help. Read More »

how-diablo-hackers-uncovered-a-speedrun-scandal

How Diablo hackers uncovered a speedrun scandal


Investigators decompiled the game to search through 2.2 billion random dungeon seeds.

The word Debunk radiating flames against a demonic background Credit: Aurich Lawson

For years, Maciej “Groobo” Maselewski stood as the undisputed champion of Diablo speedrunning. His 3-minute, 12-second Sorcerer run looked all but unbeatable thanks to a combination of powerful (and allowable) glitch exploits along with what seemed like some unbelievable luck in the game’s randomly generated dungeon.

But when a team of other speedrunners started trying and failing to replicate that luck using outside software and analysis tools, the story behind Groobo’s run began to fall apart. As the inconsistencies in the run started to mount, that team would conduct an automated search through billions of legitimate Diablo dungeons to prove beyond a shadow of a doubt that Groobo’s game couldn’t have taken place in any of them.

“We just had a lot of curiosity and resentment that drove us to dig even deeper,” team member Staphen told Ars Technica of their investigation. “Betrayal might be another way to describe it,” team member AJenbo added. “To find out that this had been done illegitimately… and the person had both gotten and taken a lot of praise for their achievement.”

If we have unearned luck

If you have any familiarity with Diablo or speedrunning, watching Groobo’s run feels like watching someone win the lottery. First, there’s the dungeon itself, which features a sequence of stairways that appear just steps from each other, forming a quick and enemy-free path down to the dungeon’s deeper levels. Then there’s Groobo’s lucky find of Naj’s Puzzler on level 9, a unique item that enables the teleporting necessary for many of the run’s late-game maneuvers.

Groobo’s 3:12 Diablo speedrun, as submitted to Speed Demos Archive in 2009

“It seemed very unusual that we would have so many levels with the upstairs and the downstairs right next to each other,” Allan “DwangoAC” Cecil told Ars Technica. “We wanted to find some way of replicating this.”

When Cecil and a team of tool-assisted speedrun (TAS) authors started that search process in earnest last February, they said they used Groobo’s run as a baseline to try to improve from. While Groobo ostensibly had to rely on his own human luck in prepping his run, the TAS runners could use techniques and tools from outside the game to replicate Groobo’s run (or something very similar) every time.

To find an RNG seed that could do just that, the TAS team created a custom-built map generation tool by reverse-engineering a disassembled Diablo executable. That tool can take any of the game’s billions of possible random seeds and quickly determine the map layout, item distribution, and quest placement available in the generated save file. A scanner built on top of that tool can then quickly look through those generated dungeons for ones that might be optimal for speedrunning.
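The principle behind that scanner can be sketched in a few lines. The sketch below is a toy illustration, not the team's actual reverse-engineered tool: `generate_level`, `stairs_adjacent`, and `scan` are hypothetical stand-ins, and the "layout" is reduced to stair positions. What it does show accurately is why the search works at all: dungeon generation is a deterministic function of the seed, so layouts can be re-derived and filtered without ever running the game.

```python
import random

def generate_level(seed: int) -> dict:
    """Toy stand-in for the reverse-engineered generator: maps a seed
    deterministically to a 'layout' (here, just stair positions on a 40x40 grid)."""
    rng = random.Random(seed)
    return {
        "up_stairs": (rng.randrange(40), rng.randrange(40)),
        "down_stairs": (rng.randrange(40), rng.randrange(40)),
    }

def stairs_adjacent(level: dict, max_dist: int = 3) -> bool:
    """A speedrun-friendly property: up and down stairs within a few tiles."""
    (ux, uy), (dx, dy) = level["up_stairs"], level["down_stairs"]
    return abs(ux - dx) + abs(uy - dy) <= max_dist

def scan(seed_range) -> list:
    """Scanner: re-derive each seed's layout and keep the promising ones."""
    return [s for s in seed_range if stairs_adjacent(generate_level(s))]

# Determinism is what makes the search meaningful: the same seed
# always yields the same layout.
assert generate_level(42) == generate_level(42)
hits = scan(range(100_000))
```

The same determinism cuts both ways: if a recorded run's dungeon doesn't match *any* seed's output, the dungeon can't have come from an unmodified save.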

“We were working on finding the best seed for our TAS, and we were trying to identify the seed from Groobo’s run, both to validate that our scanner works and to potentially straight-up use it for the run,” Staphen said of the effort. “We naturally had a lot of trouble finding [that seed] because it doesn’t exist.”


A thorough search

In their effort to find Groobo’s storied run (or at least one that resembled it), the TAS team conducted a distributed search across the game’s roughly 2.2 billion valid RNG seeds. Each of these seeds represents a different specific second on the system clock when a Diablo save file is created, ranging from January 1, 1970, to December 31, 2038 (the only valid dates accepted by the game).
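That seed count follows directly from the date range. A minimal sketch, assuming (as the article states) one seed per clock second across the accepted window:

```python
from datetime import datetime, timezone

# One seed per second on the system clock, over the range the game accepts.
start = datetime(1970, 1, 1, tzinfo=timezone.utc)
end = datetime(2038, 12, 31, 23, 59, 59, tzinfo=timezone.utc)

valid_seeds = int((end - start).total_seconds()) + 1
print(f"{valid_seeds:,}")  # roughly 2.2 billion
```

Checking each of those seeds against a reference dungeon is an embarrassingly parallel job, which is why a distributed search over the full space was feasible.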

After comparing each of those billions of RNG dungeons to a re-creation of the dungeon seen in Groobo’s run, the team couldn’t find a single example containing the crucial level 9 Naj’s Puzzler drop. After that, the team started searching through “impossible” seeds, which could only be created by using save modification tools to force a creation date after the year 2038.

The team eventually found dungeons matching the Naj’s Puzzler drop in Groobo’s video, using seeds associated with the years 2056 and 2074.
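Those two years are the smoking gun: timestamps from 2056 or 2074 sit past the game's accepted cutoff, so no unmodified system clock could ever have produced them. A small check, taking the article's end-of-2038 limit as the cutoff:

```python
from datetime import datetime, timezone

# Last second the game accepts as a save-creation time (per the article).
CUTOFF = int(datetime(2038, 12, 31, 23, 59, 59, tzinfo=timezone.utc).timestamp())

for year in (2056, 2074):  # the years the matching dungeons traced back to
    ts = int(datetime(year, 1, 1, tzinfo=timezone.utc).timestamp())
    assert ts > CUTOFF  # unreachable without save-file modification
```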

After an exhaustive search, the TAS team couldn’t find a dungeon with Naj’s Puzzler dropped in the place Groobo’s run said it should be. Credit: Analysis of Groobo’s Diablo WR Speedrun

The early presumption that Groobo’s run was legitimate ended up costing the team weeks of work. “It was baffling when we couldn’t find [the early Naj’s Puzzler] in any of the searches we did,” Cecil said. “We were always worried that the scanner might have bugs in it,” Staphen added.

The TAS team’s thorough search also showed troubling inconsistencies in the other dungeon levels shown in Groobo’s run. “Normally you would only need to identify a single level to replicate a run since all the other levels are generated from the same seed,” AJenbo told Ars. But the levels seen in Groobo’s run came from multiple different seeds, which would require splicing footage from multiple different playthroughs of different dungeons. That’s a big no-no even in a so-called “segmented” run, which is still supposed to contain segments from a single unmodified save file.

“At that point we also wanted to figure out how manipulated the run was,” AJenbo said. “Was it a legit run except for [dungeon level] 9? Was it three good runs combined? In the end we only found two levels that had come from the same run so at least 13 (probably 15) runs were spliced into one video, which is a lot for a game with just 16 levels.”

The evidence piles up

After Groobo’s dungeon generation problems came to light, other inconsistencies in his run started to become apparent. Some of these are relatively easy to spot with the naked eye once you know what you’re looking for.

For instance, the “1996–2001” copyright date seen on the title screen in Groobo’s video is inconsistent with the v1.00 shown on the initial menu screen, suggesting Groobo’s run was spliced together from runs on multiple different versions of the game. Items acquired early in the run also disappear from the inventory later on with no apparent explanation.

This copyright date doesn’t line up with the “V1.00” seen later on the menu screen in Groobo’s run. Credit: Analysis of Groobo’s Diablo WR Speedrun

Even months after the investigation first started, new inconsistencies are still coming to light. Groobo’s final fight against Diablo, for instance, required just 19 fireballs to take him out. While that’s technically possible with perfect luck for the level 12 Sorcerer seen in the footage, the TAS team found that the specific damage dealt and boss behavior only matched when they attempted the same attacks using a level 26 Sorcerer.

After the TAS team compiled their many findings into a lengthy document, Groobo defended his submission in a discussion with Cecil (screenshots of which were viewed by Ars Technica). “My run is a segmented/spliced run,” Groobo said. “It always has been and it was never passed off as anything else, nor was it part of any competition or leaderboards. The Speed Demos Archive [SDA] page states that outright.” Indeed, an archived version of Groobo’s record-setting Speed Demos Archive submission does say directly that it’s made up of “27 segments appended to one file.”

But simply splitting a run into segments doesn’t explain away all of the problems the TAS team found. Getting Naj’s Puzzler on dungeon level 9, for instance, still requires outside modification of a save file, which is specifically prohibited by longstanding Speed Demos Archive rules that “manually editing/adding/removing game files is generally not allowed.” Groobo’s apparent splicing of multiple game versions and differently seeded save files also seems to go against SDA rules, which say that “there obviously needs to be continuity between segments in terms of inventory, experience points or whatever is applicable for the individual game.”

After being presented with the TAS team’s evidence, SDA wrote that “it has been determined that Groobo’s run very likely does not stem from only legitimate techniques, and as such, has itself been banished barring new developments.” But Groobo’s record is still listed as the “Fastest completion of an RPG videogame” by Guinness World Records, which has not offered a substantive response to the team’s findings (Guinness has not responded to a request for comment from Ars Technica).

A recent Diablo speedrun on a confirmed legitimate dungeon seed.

This might seem like a pretty petty issue to spend weeks of time and attention debunking. But at a recent presentation attended by Ars, Cecil said he was motivated to pursue it because “it did harm. Groobo’s alleged cheating in 2009 completely stopped interest in speedrunning this category [of Diablo]. No one tried, no one could.”

Because of Groobo’s previously unknown modifications to make an impossible-to-beat run, “this big running community just stopped trying to run this game in that category,” Cecil said. “For more than a decade, this had a chilling impact on that community.” With Groobo’s run out of the way, though, new runners are setting new records on confirmed legitimate RNG seeds, and with the aid of TAS tools.

In the end, Cecil said he hopes the evidence regarding Groobo’s run will make people look more carefully at other record submissions. “Groobo had created a number of well-respected … speedruns,” he said. “[People thought] there wasn’t any good reason to doubt him. In other words, there was bias in familiarity. This was a familiar character. Why would they cheat?”


Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

How Diablo hackers uncovered a speedrun scandal