medicine

clinical-trial-of-a-technique-that-could-give-everyone-the-best-antibodies

Clinical trial of a technique that could give everyone the best antibodies


If we ID the DNA for a great antibody, anyone can now make it.

One of the things that emerging diseases, from the COVID pandemic to the Zika outbreak, have taught us is that it’s tough to keep up with infectious diseases in the modern world. Things like air travel can allow a virus to spread faster than our ability to develop therapies. But that doesn’t mean biotech has stood still; companies have been developing technologies that could allow us to rapidly respond to future threats.

There are a lot of ideas out there. But this week saw some early clinical trial results of one technique that could be useful for a range of infectious diseases. We’ll go over the results as a way to illustrate the sort of thinking that’s going on, along with the technologies we have available to pursue the resulting ideas.

The best antibodies

Any emerging disease leaves a mass of antibodies in its wake—those made by people in response to infections and vaccines, those made by lab animals we use to study the infectious agent, and so on. Some of these have only a weak affinity for the disease-causing agent, but some of them turn out to be what are called “broadly neutralizing.” These stick with high affinity not only to the original pathogen but also to most or all of its variants, and possibly to some related viruses.

Once they latch onto a pathogen, broadly neutralizing antibodies inactivate it (as their name implies). This is typically because these antibodies bind to a site that’s necessary for a protein’s function. For example, broadly neutralizing antibodies to HIV bind to the proteins that help this virus enter immune cells.

Unfortunately, not everyone develops broadly neutralizing antibodies, and even those who do certainly don’t make them in time to prevent infections. And we haven’t figured out a way of designing vaccines that ensure their generation. So we often find ourselves stuck knowing what antibodies we’d like to see people making while having no way of ensuring that they do.

One of the options we’ve developed is to just mass-produce broadly neutralizing antibodies and inject them into people. This has been approved for use against Ebola and provided an early treatment during the COVID pandemic. This approach has some practical limitations, though. For starters, the antibodies have a finite life span in the bloodstream, so injections may need to be repeated. In addition, making and purifying enough antibodies in bulk isn’t the easiest thing in the world, and they generally need to be kept refrigerated during the distribution, limiting the areas where they can be used.

So, a number of companies have been looking at an alternative: getting people to make their own. This could potentially lead to longer-lived protection, even ensuring the antibodies are present to block future infections if the DNA survives long enough.

Genes and volts

Once you identify cells that produce broadly neutralizing antibodies, it’s relatively simple to clone those genes and put them into a chunk of DNA that will ensure that they’ll be produced by any human cell. If we could get that DNA into a person’s cells, broadly neutralizing antibodies are the result. And a number of approaches have been tried to handle that “if.” Most of them have inserted the genes needed to make the antibodies into a harmless, non-infectious virus, and then injected that virus into volunteers. Unfortunately, these viruses have tended to set off a separate immune response, which causes more significant side effects and may limit how often this approach can be used.

This brings us to the technique being used here. In this case, the researchers placed the antibody genes in a circular loop of DNA called a plasmid. This is enough to ensure that the DNA doesn’t get digested immediately and to get the antibody genes made into proteins. But it does nothing to help get the DNA inside of cells.

The research team, a mixture of people from a biotech company and academic labs, used a commercial injection setup that mixes the injection of the DNA with short pulses of electricity. The electricity disrupts the cell membrane, allowing the plasmid DNA to make it inside cells. Based on animal testing, doing this in muscle cells is enough to turn the muscles into factories producing lots of broadly neutralizing antibodies.

The new study was meant to test the safety of doing that in humans. The team recruited 44 participants, testing various doses of two antibody-producing plasmids and different injection schedules. All but four of the subjects completed the study; three of those who dropped out had been testing a regimen in which the electric pulses were delivered in rapid succession, which turned out to be unpleasant. Fortunately, the faster pulse schedule didn’t seem to make any difference to the production of antibodies.

While there were a lot of adverse reactions, most of these were associated with the injection itself: muscle pain at the site, a scab forming afterward, and a reddening of the skin. The worst problem appeared to be a single case of moderate muscle pain that persisted for a couple of days.

In all but one volunteer, the injection resulted in stable production of the two antibodies for at least 72 weeks following the injection; the single exception only made one of the two. That’s “at least” 72 weeks because that’s when they stopped testing—there was no indication that levels were dropping at this point. Injecting more DNA led to more variability in the amount of antibody produced, but that amount quickly maxed out. More total injections also boosted the level of antibody production. But even the minimal procedure—two injections of the lowest concentration tested—resulted in significant and stable antibodies.

And, as expected, these antibodies blocked the virus they were directed against: SARS-CoV-2.

The caveats

This approach seems to work—we can seemingly get anybody to make broadly neutralizing antibodies for months at a time. What’s the hitch? For starters, this isn’t necessarily great for a rapidly emerging pandemic. It takes a while to identify broadly neutralizing antibodies after a pathogen is identified. And, while it’s simple to ship DNA around the world to where it will be needed, injection setups that also produce the small electric pulses are not exactly standard equipment even in industrialized countries, much less the Global South.

Then there’s the issue of whether this really is a longer-term fix. Widespread use of broadly neutralizing antibodies will create a strong selective pressure for the evolution of variants that the antibody can no longer bind to. That may not always be a problem—broadly neutralizing antibodies generally bind to parts of proteins that are absolutely essential for the proteins’ function, and so it may not be possible to change those while maintaining the function. But that’s unlikely to always be the case.

In the end, however, social acceptance may end up being the biggest problem. People had an utter freakout over unfounded conspiracies that the RNA of COVID vaccines would somehow lead to permanent genetic changes. Presumably, having DNA that’s stable for months would be even harder for some segments of the public to swallow.

Nature Medicine, 2025. DOI: 10.1038/s41591-025-03969-0 (About DOIs).

Photo of John Timmer

John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.

Clinical trial of a technique that could give everyone the best antibodies Read More »

when-sycophancy-and-bias-meet-medicine

When sycophancy and bias meet medicine


Biased, eager-to-please models threaten health research replicability and trust.

Once upon a time, two villagers visited the fabled Mullah Nasreddin. They hoped that the Sufi philosopher, famed for his acerbic wisdom, could mediate a dispute that had driven a wedge between them. Nasreddin listened patiently to the first villager’s version of the story and, upon its conclusion, exclaimed, “You are absolutely right!” The second villager then presented his case. After hearing him out, Nasreddin again responded, “You are absolutely right!” An observant bystander, confused by Nasreddin’s proclamations, interjected, “But Mullah, they can’t both be right.” Nasreddin paused, regarding the bystander for a moment before replying, “You are absolutely right, too!”

In late May, the White House’s first “Make America Healthy Again” (MAHA) report was criticized for citing multiple research studies that did not exist. Fabricated citations like these are common in the outputs of generative artificial intelligence based on large language models, or LLMs. LLMs have presented plausible-sounding sources, catchy titles, or even false data to craft their conclusions. Here, the White House pushed back on the journalists who first broke the story before admitting to “minor citation errors.”

It is ironic that fake citations were used to support a principal recommendation of the MAHA report: addressing the health research sector’s “replication crisis,” wherein scientists’ findings often cannot be reproduced by other independent teams.

Yet the MAHA report’s use of phantom evidence is far from unique. Last year, The Washington Post reported on dozens of instances in which AI-generated falsehoods found their way into courtroom proceedings. Once uncovered, lawyers had to explain to judges how fictitious cases, citations, and decisions found their way into trials.

Despite these widely recognized problems, the MAHA roadmap released last month directs the Department of Health and Human Services to prioritize AI research to “…assist in earlier diagnosis, personalized treatment plans, real-time monitoring, and predictive interventions…” This breathless rush to embed AI in so many aspects of medicine could be forgiven if we believe that the technology’s “hallucinations” will be easy to fix through version updates. But as the industry itself acknowledges, these ghosts in the machine may be impossible to eliminate.

Consider the implications of accelerating AI use in health research for clinical decision making. Beyond the problems we’re seeing here, using AI in research without disclosure could create a feedback loop, supercharging the very biases that helped motivate its use. Once published, “research” based on false results and citations could become part of the datasets used to build future AI systems. Worse still, a recently published study highlights an industry of scientific fraudsters who could deploy AI to make their claims seem more legitimate.

In other words, a blind adoption of AI risks a downward spiral, where today’s flawed AI outputs become tomorrow’s training data, exponentially eroding research quality.

Three prongs of AI misuse

The challenge AI poses is threefold: hallucination, sycophancy, and the black box conundrum. Understanding these phenomena is critical for research scientists, policymakers, educators, and everyday citizens. Unaware, we risk vulnerability to deception as AI systems are increasingly deployed to shape diagnoses, insurance claims, health literacy, research, and public policy.

Here’s how hallucination works: When a user inputs a query into an AI tool such as ChatGPT or Gemini, the model evaluates the input and generates a string of words that is statistically likely to make sense based on its training data. Current AI models will complete this task even if their training data is incomplete or biased, filling in the blanks regardless of their ability to answer. These hallucinations can take the form of nonexistent research studies, misinformation, or even clinical interactions that never happened. LLMs’ emphasis on producing authoritative-sounding language shrouds their false outputs in a facsimile of truth.
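
To make that concrete, here is a toy sketch (in Python, with entirely made-up tokens and probabilities) of what “statistically likely” means in practice: the model always emits some plausible-sounding continuation, whether or not a true one exists.

```python
# Toy sketch of next-token sampling (not a real LLM). The tokens and
# probabilities below are invented purely for illustration: the point is that
# the most statistically plausible continuation gets emitted whether or not
# it corresponds to anything true.
import random

# Hypothetical distribution over continuations of the prompt
# "The study demonstrating this was published in ..."
next_token_probs = {
    "Nature": 0.31,            # plausible-sounding, may not exist
    "JAMA": 0.24,
    "The Lancet": 0.22,
    "2019": 0.18,
    "[no such study exists]": 0.05,  # the honest answer is rarely the likeliest
}

def sample_next_token(probs):
    """Pick one token in proportion to its assigned probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

print(sample_next_token(next_token_probs))  # fluent and confident, possibly false
```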

And as human model trainers fine-tune generative AI responses, they tend to optimize and reward the AI system responses that favor their prior beliefs, leading to sycophancy. Human bias, it appears, begets AI bias, and human users of AI then perpetuate the cycle. A consequence is that AIs skew toward favoring pleasing answers over truthful ones, often seeking to reinforce the bias of the query.

A recent illustration of this occurred in April, when OpenAI rolled back a ChatGPT update for being too sycophantic after users demonstrated that it agreed too quickly and enthusiastically with the assumptions embedded in their queries. Sycophancy and hallucination often interact with each other; systems that aim to please will be more apt to fabricate data to reach user-preferred conclusions.

Correcting hallucinations, sycophancy, and other LLM mishaps is cumbersome because human observers can’t always determine how an AI platform arrived at its conclusions. This is the “black box” problem. Behind the probabilistic mathematics, is it even testing hypotheses? What methods did it use to derive an answer? Unlike traditional computer code or the rubric of scientific methodology, AI models operate through billions of computations. Looking at some well-structured outputs, it is easy to forget that the underlying processes are impenetrable to scrutiny and vastly different from a human’s approach to problem-solving.

This opacity can become dangerous when people can’t identify where computations went wrong, making it impossible to correct systematic errors or biases in the decision-making process. In health care, this black box raises questions about accountability, liability, and trust when neither physicians nor patients can explain the sequence of reasoning that leads to a medical intervention.

AI and health research

These AI challenges can exacerbate the existing sources of error and bias that creep into traditional health research publications. Several sources originate from the natural human motivation to find and publish meaningful, positive results. Journalists want to report on connections, e.g., that St. John’s Wort improves mood (it might). Nobody would want to publish an article with the results: “the supplement has no significant effect.”

The problem compounds when researchers use a study design to test not just a single hypothesis but many. One quirk of statistics-backed research is that testing more hypotheses in a single study raises the likelihood of uncovering a spurious coincidence.

AI has the potential to supercharge these coincidences through its relentless ability to test hypotheses across massive datasets. In the past, a research assistant could use an existing dataset to test 10 to 20 of the most likely hypotheses; now, that assistant can set an AI loose to test millions of likely or unlikely hypotheses without human supervision. That all but guarantees some of the results will meet the criteria for statistical significance, regardless of whether the data includes any real biological effects.
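
As a rough illustration of why that matters, here is a minimal sketch of the arithmetic, assuming independent tests at the conventional 0.05 significance threshold and no real effects anywhere in the data.

```python
# Minimal sketch of the multiple-comparisons problem, assuming independent
# tests at the conventional 0.05 significance threshold and no real effects
# anywhere in the data.
alpha = 0.05

for n_tests in (1, 20, 1_000, 1_000_000):
    # Chance of at least one spurious "significant" result
    p_any_false_positive = 1 - (1 - alpha) ** n_tests
    # Number of spurious "significant" results you should expect on average
    expected_false_positives = alpha * n_tests
    print(f"{n_tests:>9,} tests: P(at least one false positive) = "
          f"{p_any_false_positive:.3f}, expected false positives = "
          f"{expected_false_positives:,.1f}")
```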

AI’s tireless capacity to investigate data, combined with its growing ability to develop authoritative-sounding narratives, expands the potential to elevate fabricated or bias-confirming errors into the collective public consciousness.

What’s next?

If you read the missives of AI luminaries, it would appear that society is on the cusp of superintelligence, which will transform every vexing societal conundrum into a trivial puzzle. While that’s highly unlikely, AI has certainly demonstrated promise in some health applications, despite its limitations. Unfortunately, it’s now being rapidly deployed sector-wide, even in areas where it has no prior track record.

This speed may leave us little time to reflect on the accountability needed for safe deployment. Sycophancy, hallucination, and the black box of AI are non-trivial challenges when conjoined with existing biases in health research. If people can’t easily understand the inner workings of current AI tools (often comprising up to 1.8 trillion parameters), they will not be able to understand the process of future, more complex versions (using over 5 trillion parameters).

History shows that most technological leaps forward are double-edged swords. Electronic health records increased the ability of clinicians to improve care coordination and aggregate data on population health, but they have eroded doctor-patient interactions and have become a source of physician burnout. The recent proliferation of telemedicine has expanded access to care, but it has also promoted lower-quality interactions with no physical examination.

The use of AI in health policy and research is no different. Wisely deployed, it could transform the health sector, leading to healthier populations and unfathomable breakthroughs (for example, by accelerating drug discovery). But without embedding it in new professional norms and practices, it has the potential to generate countless flawed leads and falsehoods.

Here are some potential solutions we see to the AI and health replicability crisis:

  • Clinical-specific models capable of admitting uncertainty in their outputs
  • Greater transparency, requiring disclosure of AI model use in research
  • Training for researchers, clinicians, and journalists on how to evaluate and stress-test AI-derived conclusions
  • Pre-registered hypotheses and analysis plans before using AI tools
  • AI audit trails
  • Specific AI global prompts that limit sycophantic tendencies across user queries

Regardless of the solutions deployed, we need to solve the failure points described here to fully realize the potential of AI for use in health research. The public, AI companies, and health researchers must be active participants in this journey. After all, in science, not everyone can be right.

Amit Chandra is an emergency physician and global health policy specialist based in Washington, DC. He is an adjunct professor of global health at Georgetown University’s School of Health, where he has explored AI solutions for global health challenges since 2021.

Luke Shors is an entrepreneur who focuses on energy, climate, and global health. He is the co-founder of the sustainability company Capture6 and previously worked on topics including computer vision and blockchain. 

When sycophancy and bias meet medicine Read More »

“butt-breathing”-might-soon-be-a-real-medical-treatment

“Butt breathing” might soon be a real medical treatment

And Oxycyte was ideal for the group’s 2021 Ig Nobel-winning efforts. The experiments involved intra-anally administering oxygen gas or a liquid oxygenated perfluorocarbon to the unfortunate rodents and pigs. Yes, they gave the animals enemas. They then induced respiratory failure and evaluated the effectiveness of the intra-anal treatment. The result: Both treatments were pretty darned effective at staving off respiratory failure with no major complications.

Visual abstract shows highlights of first human clinical trial to evaluate the safety of enteral ventilation concept

Credit: Cincinnati Children’s/Med

So far, so good. The next logical step was to determine if EVA could work in human patients, too. “Patients with severe respiratory failure often need mechanical ventilation to survive, but these therapies can cause further lung injury,” the authors wrote in this latest paper. EVA “could give the lungs a chance to rest and heal.”

The team recruited 27 healthy adult men in Japan, each of whom received a dose of non-oxygenated perfluorodecalin via the anus. They were asked to retain the liquid for a full hour as the dosage slowly increased from 25 to 1,500 mL. Twenty of the men successfully completed the experiment. Apart from mild temporary abdominal bloating and discomfort—which proved to be dosage dependent and resolved with no need for medical attention—they experienced no adverse effects.

“This is the first human data and the results are limited solely to demonstrating the safety of the procedure and not its effectiveness,” said co-author Takanori Takebe of Cincinnati Children’s Hospital and the University of Osaka in Japan. “But now that we have established tolerance, the next step will be to evaluate how effective the process is for delivering oxygen to the bloodstream.”

Med, 2025. DOI: 10.1016/j.medj.2025.100887 (About DOIs).

“Butt breathing” might soon be a real medical treatment Read More »

dead-ends-is-a-fun,-macabre-medical-history-for-kids

Dead Ends is a fun, macabre medical history for kids


flukes, flops, and failures

Ars chats with co-authors Lindsey Fitzharris and Adrian Teal about their delightful new children’s book.

In 1890, a German scientist named Robert Koch thought he’d invented a cure for tuberculosis, a substance derived from the infecting bacterium itself that he dubbed Tuberculin. His substance didn’t actually cure anyone, but it was eventually widely used as a diagnostic skin test. Koch’s successful failure is just one of the many colorful cases featured in Dead Ends! Flukes, Flops, and Failures that Sparked Medical Marvels, a new nonfiction illustrated children’s book by science historian Lindsey Fitzharris and her husband, cartoonist Adrian Teal.

A noted science communicator with a fondness for the medically macabre, Fitzharris published a biography of surgical pioneer Joseph Lister, The Butchering Art, in 2017—a great, if occasionally grisly, read. She followed up with 2022’s The Facemaker: A Visionary Surgeon’s Battle to Mend the Disfigured Soldiers of World War I, about a WWI surgeon named Harold Gillies who rebuilt the faces of injured soldiers.

And in 2020, she hosted a documentary for the Smithsonian Channel, The Curious Life and Death Of…, exploring famous deaths, ranging from drug lord Pablo Escobar to magician Harry Houdini. Fitzharris performed virtual autopsies, experimented with blood samples, interviewed witnesses, and conducted real-time demonstrations in hopes of gleaning fresh insights. For his part, Teal is a well-known caricaturist and illustrator, best known for his work on the British TV series Spitting Image. His work has also appeared in The Guardian and the Sunday Telegraph, among other outlets.

The couple decided to collaborate on children’s books as a way to combine their respective skills. Granted, “[The market for] children’s nonfiction is very difficult,” Fitzharris told Ars. “It doesn’t sell that well in general. It’s very difficult to get publishers on board with it. It’s such a shame because I really feel that there’s a hunger for it, especially when I see the kids picking up these books and loving it. There’s also just a need for it with the decline in literacy rates. We need to get people more engaged with these topics in ways that go beyond a 30-second clip on TikTok.”

Their first foray into the market was 2023’s Plague-Busters! Medicine’s Battles with History’s Deadliest Diseases, exploring “the ickiest illnesses that have infected humans and affected civilizations through the ages”—as well as the medical breakthroughs that came about to combat those diseases. Dead Ends is something of a sequel, focusing this time on historical diagnoses, experiments, and treatments that were useless at best, frequently harmful, yet eventually led to unexpected medical breakthroughs.

Failure is an option

The book opens with the story of Robert Liston, a 19th-century Scottish surgeon known as “the fastest knife in the West End,” because he could amputate a leg in less than three minutes. That kind of speed was desirable in a period before the discovery of anesthetic, but sometimes Liston’s rapid-fire approach to surgery backfired. One story (possibly apocryphal) holds that Liston accidentally cut off the finger of his assistant in the operating theater as he was switching blades, then accidentally cut the coat of a spectator, who died of fright. The patient and assistant also died, so that operation is now often jokingly described as the only one with a 300 percent mortality rate, per Fitzharris.

Liston is the ideal poster child for the book’s theme of celebrating the role of failure in scientific progress. “I’ve always felt that failure is something we don’t talk about enough in the history of science and medicine,” said Fitzharris. “For everything that’s succeeded there’s hundreds, if not thousands, of things that’s failed. I think it’s a great concept for children. If you think that you’ve made mistakes, look at these great minds from the past. They’ve made some real whoppers. You are in good company. And failure is essential to succeeding, especially in science and medicine.”

“During the COVID pandemic, a lot of people were uncomfortable with the fact that some of the advice would change, but to me that was a comfort because that’s what you want to see scientists and doctors doing,” she continued. “They’re learning more about the virus, they’re changing their advice. They’re adapting. I think that this book is a good reminder of what the scientific process involves.”

The details of Liston’s most infamous case might be horrifying, but as Teal observes, “Comedy equals tragedy plus time.” One of the reasons so many of his patients died was because this was before the broad acceptance of germ theory and Joseph Lister’s pioneering work on antiseptic surgery. Swashbuckling surgeons like Liston prided themselves on operating in coats stiffened with blood—the sign of a busy and hence successful surgeon. Frederick Treves once observed that in the operating room, “cleanliness was out of place. It was considered to be finicking and affected. An executioner might as well manicure his nails before chopping off a head.”

“There’s always a lot of initial resistance to new ideas, even in science and medicine,” said Teal. “A lot of what we talk about is paradigm shifts and the difficulty of achieving [such a shift] when people are entrenched in their thinking. Galen was a hugely influential Roman doctor and got a lot of stuff right, but also got a lot of stuff wrong. People were clinging onto that stuff for centuries. You have misunderstanding compounded by misunderstanding, century after century, until somebody finally comes along and says, ‘Hang on a minute, this is all wrong.’”

You know… for kids

Writing for children proved to be a very different experience for Fitzharris after two adult-skewed science history books. “I initially thought children’s writing would be easy,” she confessed. “But it’s challenging to take these high-level concepts and complex stories about past medical movements and distill them for children in an entertaining and fun way.” She credits Teal—a self-described “man-child”—for taking her drafts and making them more child-friendly.

Teal’s clever, slightly macabre illustrations also helped keep the book accessible to its target audience, appealing to children’s more ghoulish side. “There’s a lot of gruesome stuff in this book,” Teal said. “Obviously it’s for kids, so you don’t want to go over the top, but equally, you don’t want to shy away from those details. I always say kids love it because kids are horrible, in the best possible way. I think adults sometimes worry too much about kids’ sensibilities. You can be a lot more gruesome than you think you can.”

The pair did omit some darker subject matter, such as the history of frontal lobotomies, notably the work of a neuroscientist named Walter Freeman, who operated an actual “lobotomobile.” For the authors, it was all about striking the right balance. “How much do you give to the kids to keep them engaged and interested, but not for it to be scary?” said Fitzharris. “We don’t want to turn people off from science and medicine. We want to celebrate the greatness of what we’ve achieved scientifically and medically. But we also don’t want to cover up the bad bits because that is part of the process, and it needs to be acknowledged.”

Sometimes Teal felt it just wasn’t necessary to illustrate certain gruesome details in the text—such as their discussion of the infamous case of Phineas Gage. Gage was a railroad construction foreman. In 1848, he was overseeing a rock blasting team when an explosion drove a three-foot tamping iron through his skull. “There’s a horrible moment when [Gage] leans forward and part of his brain drops out,” said Teal. “I’m not going to draw that, and I don’t need to, because it’s explicit in the text. If we’ve done a good enough job of writing something, that will put a mental picture in someone’s head.”

Miraculously, Gage survived, although there were extreme changes in his behavior and personality, and his injuries eventually caused epileptic seizures, one of which killed Gage in 1860. Gage became the index case for personality changes due to frontal lobe damage, and 50 years after his death, the case inspired neurologist David Ferrier to create brain maps based on his research into whether certain areas of the brain controlled specific cognitive functions.

“Sometimes it takes a beat before we get there,” said Fitzharris. “Science builds upon ideas, and it can take time. In the age of looking for instantaneous solutions, I think it’s important to remember that research needs to allow itself to do what it needs to do. It shouldn’t just be guided by an end goal. Some of the best discoveries that were made had no end goal in mind. And if you read Dead Ends, you’re going to be very happy that you live in 2025. Medically speaking, this is the best time. That’s really what Dead Ends is about. It’s a celebration of how far we’ve come.”

Photo of Jennifer Ouellette

Jennifer is a senior writer at Ars Technica with a particular focus on where science meets culture, covering everything from physics and related interdisciplinary topics to her favorite films and TV series. Jennifer lives in Baltimore with her spouse, physicist Sean M. Carroll, and their two cats, Ariel and Caliban.

Dead Ends is a fun, macabre medical history for kids Read More »

scientists-want-to-treat-complex-bone-fractures-with-a-bone-healing-gun

Scientists want to treat complex bone fractures with a bone-healing gun

After examining a few candidate formulations, the team found the right material. “We used a biocompatible thermoplastic called polycaprolactone and hydroxyapatite as base materials,” Lee said. Polycaprolactone was chosen because it is an FDA-approved material that degrades in the body within a few months after implantation. The hydroxyapatite, on the other hand, supports bone-tissue regeneration. Lee’s team experimented with various proportions of these two ingredients and finally nailed the formulation that checked all the boxes: It extruded at a relatively harmless 60° Celsius, the mix was mechanically sound, it adhered to the bone well, and it degraded over time.

Once the bone-healing bullets were ready, the team tested them on rabbits. Rabbits with broken femurs treated with Lee’s healing gun recovered faster than those treated with bone cement, which is the closest commercially available alternative. But there is still a lot to do before the healing gun can be tested on humans.

Skill issues

While the experiment on rabbits revealed new bone tissue forming around the implants created with the healing gun, the slow degradation of the implanted material prevented the full restoration of the bone tissue. Another improvement Lee plans involves adding antibiotics to the formulation. The implant, he said, will release the drugs over time to prevent infections.

Then there’s the issue of load bearing. Rabbits are fine as test subjects, but they are rather light. “To evaluate the potential to use this technology on humans, we need to look into its long-term safety in large animal models,” Lee said.

Beyond the questions about the material, the level of skill required to operate this healing gun seems rather high.

Extrusion-based 3D printers, the ones that work more or less like very advanced hot glue guns, usually use guiding rods or rails for precise printing head positioning. If those rods or rails are warped, even slightly, the accuracy of your prints will most likely suffer. Achieving comparable precision with a handheld device might be a bit difficult, even for a skilled surgeon. “It is true that the system requires practice,” Lee said. “We may need to integrate it with a guiding mechanism that would position the head of the device precisely. This could be our next-gen bone printing device.”

Device, 2025.  DOI: 10.1016/j.device.2025.100873

Scientists want to treat complex bone fractures with a bone-healing gun Read More »

man’s-ghastly-festering-ulcer-stumps-doctors—until-they-cut-out-a-wedge-of-flesh

Man’s ghastly festering ulcer stumps doctors—until they cut out a wedge of flesh


The man made a full recovery, but this tale is not for the faint of heart.

If you were looking for some motivation to follow your doctor’s advice or remember to take your medicine, look no further than this grisly tale.

A 64-year-old man went to the emergency department of Brigham and Women’s Hospital in Boston with a painful festering ulcer spreading on his left, very swollen ankle. It was a gruesome sight; the open sore was about 8 by 5 centimeters (about 3 by 2 inches) and was rimmed by black, ashen, and dark purple tissue. Inside, it oozed with streaks and fringes of yellow pus around pink and red inflamed flesh. It was 2 cm deep (nearly an inch). And it smelled.

The man told doctors it had all started two years prior, when dark, itchy lesions appeared in the area on his ankle—the doctors noted that there were multiple patches of these lesions on both his legs. But about five months before his visit to the emergency department, one of the lesions on his left ankle had progressed to an ulcer. It was circular, red, tender, and deep. He sought treatment and was prescribed antibiotics, which he took. But they didn’t help.

You can view pictures of the ulcer and its progression here, but be warned, it is graphic. (Panel A shows the ulcer five months prior to the emergency department visit. Panel B shows the ulcer one month prior. Panel C shows the wound on the day of presentation at the emergency department. Panel D shows the area three months after hospital discharge.)

Gory riddle

The ulcer grew. In fact, it seemed as though his leg was caving in as the flesh around it began rotting away. A month before the emergency room visit, the ulcer was a gaping wound that was already turning gray and black at the edges. It was now well into the category of being a chronic ulcer.

In a Clinical Problem-Solving article published in the New England Journal of Medicine this week, doctors laid out what they did and thought as they worked to figure out what was causing the man’s horrid sore.

With the realm of possibilities large, they started with the man’s medical history. The man had immigrated to the US from Korea 20 years ago. He owned and worked at a laundromat, which involved standing for more than eight hours a day. He had a history of eczema on his legs, high cholesterol, high blood pressure, and Type 2 diabetes. For these, he was prescribed a statin for his cholesterol, two blood pressure medications (hydrochlorothiazide and losartan), and metformin for his diabetes. He told doctors he was not good at taking the regimen of medicine.

His diabetes was considered “poorly controlled.” A month prior, he had a glycated hemoglobin (A1C or HbA1C) test—which indicates a person’s average blood sugar level over the past two or three months. His result was 11 percent, while the normal range is between 4.2 and 5.6 percent.

His blood pressure, meanwhile, was 215/100 mm Hg at the emergency department. For reference, readings higher than 130/80 mm Hg on either number are considered the first stage of high blood pressure. Over the past three years, the man’s blood pressure had systolic readings (top number, pressure as heart beats) ranging from 160 to 230 mm Hg and diastolic readings (bottom number, pressure as heart relaxes) ranging from 95 to 120 mm Hg.

Clinical clues

Given the patient’s poorly controlled diabetes, a diabetic ulcer was initially suspected. But the patient didn’t have any typical signs of diabetic neuropathy that are linked to ulcers. These would include numbness, unusual sensations, or weakness. His responses on a sensory exam were all normal. Diabetic ulcers also typically form on the foot, not the lower leg.

X-rays of the ankle showed swelling in the soft tissue but without clear signs of infection. The doctors wondered if the man had osteomyelitis, an infection in the bone, which can be a complication in people with diabetic ulcers. The large size and duration of the ulcer matched with a bone infection, as did some elevated inflammatory markers on his blood tests.

To investigate the bone infection further, they admitted the man to the hospital and ordered magnetic resonance imaging (MRI). But the MRI showed only a soft-tissue defect and a normal bone, ruling out a bone infection. Another MRI was done with a contrast agent. That showed that the man’s large arteries were normal and there were no large blood clots deep in his veins—which is sometimes linked to prolonged standing, as the man did at his laundromat job.

As the doctors were still working to root out the cause, they had started him on a heavy-duty regimen of antibiotics. This was done with the assumption that on top of whatever caused the ulcer, there was now also a potentially aggressive secondary infection—one not knocked out by the previous round of antibiotics the man had been given.

With a bunch of diagnostic dead ends piling up, the doctors broadened their view of possibilities, newly considering cancers, rare inflammatory conditions, and less common conditions affecting small blood vessels (as the MRI had shown the larger vessels were normal). This led them to the possibility of a Martorell’s ulcer.

These ulcers, first described in 1945 by a Spanish doctor named Fernando Martorell, form when prolonged, uncontrolled high blood pressure causes the teeny arteries below the skin to stiffen and narrow, which blocks the blood supply, leading to tissue death and then ulcers. The ulcers in these cases tend to start as red blisters and evolve to frank ulcers. They are excruciatingly painful. And they tend to form on the lower legs, often over the Achilles’ tendon, though it’s unclear why this location is common.

What the doctor ordered

The doctors performed a punch biopsy of the man’s ulcer, but it was inconclusive—which is common with Martorell’s ulcers. The doctors turned to a “deep wedge biopsy” instead, which is exactly what it sounds like.

A pathology exam of the tissue slices from the wedge biopsy showed blood vessels that had thickened and narrowed. It also revealed extensive inflammation and necrosis. With the pathology results as well as the clinical presentation, the doctors diagnosed the man with a Martorell’s ulcer.

They also got back culture results from deep-tissue testing, finding that the man’s ulcer had also become infected with two common and opportunistic bacteria—Serratia marcescens and Enterococcus faecalis. Luckily, these are generally easy to treat, so the doctors scaled back his antibiotic regimen to target just those germs.

The man underwent three surgical procedures to clean out the dead tissue from the ulcer, then a skin graft to repair the damage. Ultimately, he made a full recovery. The doctors at first set him on an aggressive regimen to control his blood pressure, one that used four drugs instead of the two he was supposed to be taking. But the four-drug regimen caused his blood pressure to drop too low, and he was ultimately moved back to his original two-drug treatment.

The finding suggests that if he had just taken his original medications as prescribed, he would have kept his blood pressure in check and avoided the ulcer altogether.

In the end, “the good outcome in this patient with a Martorell’s ulcer underscores the importance of blood-pressure control in the management of this condition,” the doctors concluded.

Photo of Beth Mole

Beth is Ars Technica’s Senior Health Reporter. Beth has a Ph.D. in microbiology from the University of North Carolina at Chapel Hill and attended the Science Communication program at the University of California, Santa Cruz. She specializes in covering infectious diseases, public health, and microbes.

Man’s ghastly festering ulcer stumps doctors—until they cut out a wedge of flesh Read More »

stem-cells-used-to-partially-repair-damaged-hearts

Stem cells used to partially repair damaged hearts

When we developed the ability to convert various cells into a stem cell, it held the promise of an entirely new type of therapy. Rather than getting the body to try to fix itself with its cells or deal with the complications of organ transplants, we could convert a few adult cells to stem cells and induce them to form any tissue in the body. We could potentially repair or replace tissues with an effectively infinite supply of a patient’s own cells.

However, the Nobel Prize for induced stem cells was handed out over a decade ago, and the therapies have been slow to follow. But a group of German researchers is now describing tests in primates of a method of repairing the heart using new muscle generated from stem cells. The results are promising, if not yet providing everything that we might hope for. But they’ve been enough to start clinical trials, and similar results are being seen in humans.

Heart problems

The heart contains a lot of specialized tissues, including those that form blood vessels or specialize in conducting electrical signals. But the key to the heart is a form of specialized muscle cell, called a cardiomyocyte. Once the heart matures, the cardiomyocytes stop dividing, meaning that you end up with a fixed population. Any damage to the heart due to injury or infection does not get repaired, meaning damage will be cumulative.

This is especially problematic in cases of blocked blood vessels, which can repeatedly starve large areas of the heart of oxygen and nutrients, killing the cardiomyocytes there. This leads to a reduction in cardiac function and can ultimately result in death.

It turns out, however, that it’s relatively easy to convert induced pluripotent stem cells (iPSCs, with pluripotent meaning they can form any cell type) into cardiomyocytes. So researchers tried injecting these stem-cell-derived cardiomyocytes into damaged hearts in experimental animals, in the hope that they would be incorporated into the damaged tissue. But these experiments didn’t always provide clear benefits to the animals.

Stem cells used to partially repair damaged hearts Read More »

it’s-remarkably-easy-to-inject-new-medical-misinformation-into-llms

It’s remarkably easy to inject new medical misinformation into LLMs


Changing just 0.001% of inputs to misinformation makes the AI less accurate.

It’s pretty easy to see the problem here: The Internet is brimming with misinformation, and most large language models are trained on a massive body of text obtained from the Internet.

Ideally, having substantially higher volumes of accurate information might overwhelm the lies. But is that really the case? A new study by researchers at New York University examines how much medical information can be included in a large language model (LLM) training set before it spits out inaccurate answers. While the study doesn’t identify a lower bound, it does show that by the time misinformation accounts for 0.001 percent of the training data, the resulting LLM is compromised.

While the paper is focused on the intentional “poisoning” of an LLM during training, it also has implications for the body of misinformation that’s already online and part of the training set for existing LLMs, as well as the persistence of out-of-date information in validated medical databases.

Sampling poison

Data poisoning is a relatively simple concept. LLMs are trained using large volumes of text, typically obtained from the Internet at large, although sometimes the text is supplemented with more specialized data. By injecting specific information into this training set, it’s possible to get the resulting LLM to treat that information as a fact when it’s put to use. This can be used for biasing the answers returned.

This doesn’t even require access to the LLM itself; it simply requires placing the desired information somewhere where it will be picked up and incorporated into the training data. And that can be as simple as placing a document on the web. As one manuscript on the topic suggested, “a pharmaceutical company wants to push a particular drug for all kinds of pain which will only need to release a few targeted documents in [the] web.”

Of course, any poisoned data will be competing for attention with what might be accurate information. So, the ability to poison an LLM might depend on the topic. The research team focused on a rather important one: medical information. Misinformation can show up in general-purpose LLMs, such as those used for searching the Internet, which end up being used to obtain medical information. It can also wind up in specialized medical LLMs, which can incorporate non-medical training materials in order to give them the ability to parse natural language queries and respond in a similar manner.

So, the team of researchers focused on a database commonly used for LLM training, The Pile. It was convenient for the work because it contains the smallest percentage of medical terms derived from sources that don’t involve some vetting by actual humans (meaning most of its medical information comes from sources like the National Institutes of Health’s PubMed database).

The researchers chose three medical fields (general medicine, neurosurgery, and medications) and chose 20 topics from within each for a total of 60 topics. Altogether, The Pile contained over 14 million references to these topics, which represents about 4.5 percent of all the documents within it. Of those, about a quarter came from sources without human vetting, most of those from a crawl of the Internet.

The researchers then set out to poison The Pile.

Finding the floor

The researchers generated “high quality” medical misinformation using GPT-3.5. While that model has safeguards that should prevent it from producing medical misinformation, the research found it would happily do so if given the correct prompts (an LLM issue for a different article). The resulting articles could then be inserted into The Pile. Modified versions of The Pile were generated in which either 0.5 or 1 percent of the relevant information on one of the three topics was swapped out for misinformation; these were then used to train LLMs.
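
As a loose illustration of that substitution step, here is a minimal sketch of swapping a small fraction of topic-relevant documents for generated misinformation; the function and parameter names are hypothetical, not the authors’ actual code or the real structure of The Pile.

```python
# Minimal sketch of the substitution step described above: replace a small
# fraction of topic-relevant documents with generated misinformation. The
# function and parameter names are hypothetical, not the study's actual code
# or the real structure of The Pile.
import random

def poison_corpus(documents, is_on_topic, fake_documents, fraction=0.005, seed=0):
    """Swap `fraction` of on-topic documents for misinformation documents."""
    rng = random.Random(seed)
    poisoned = list(documents)
    on_topic = [i for i, doc in enumerate(poisoned) if is_on_topic(doc)]
    for i in rng.sample(on_topic, int(len(on_topic) * fraction)):
        poisoned[i] = rng.choice(fake_documents)
    return poisoned  # would then be used as an LLM training set
```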

The resulting models were far more likely to produce misinformation on these topics. But the misinformation also impacted other medical topics. “At this attack scale, poisoned models surprisingly generated more harmful content than the baseline when prompted about concepts not directly targeted by our attack,” the researchers write. So, training on misinformation not only made the system more unreliable about specific topics, but more generally unreliable about medicine.

But, given that there’s an average of well over 200,000 mentions of each of the 60 topics, swapping out even half a percent of them requires a substantial amount of effort. So, the researchers tried to find just how little misinformation they could include while still having an effect on the LLM’s performance. Unfortunately, this didn’t really work out.

Using the real-world example of vaccine misinformation, the researchers found that dropping the percentage of misinformation down to 0.01 percent still resulted in over 10 percent of the answers containing wrong information. Going for 0.001 percent still led to over 7 percent of the answers being harmful.

“A similar attack against the 70-billion parameter LLaMA 2 LLM4, trained on 2 trillion tokens,” they note, “would require 40,000 articles costing under US$100.00 to generate.” The “articles” themselves could just be run-of-the-mill webpages. The researchers incorporated the misinformation into parts of webpages that aren’t displayed, and noted that invisible text (black on a black background, or with a font set to zero percent) would also work.
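
The back-of-the-envelope arithmetic behind that figure is straightforward; the sketch below reproduces it, assuming roughly 500 tokens per generated article (our assumption, not a number given in the text above).

```python
# Back-of-the-envelope arithmetic behind the quoted figure, assuming roughly
# 500 tokens per generated article (an assumption; the paper's exact article
# length isn't given in the text above).
training_tokens = 2_000_000_000_000      # LLaMA 2's 2-trillion-token corpus
poison_rate = 0.00001                    # 0.001 percent
tokens_per_article = 500                 # assumed average article length

poison_tokens = training_tokens * poison_rate          # 20 million tokens
articles_needed = poison_tokens / tokens_per_article   # ~40,000 articles
print(f"{poison_tokens:,.0f} poisoned tokens -> ~{articles_needed:,.0f} articles")
```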

The NYU team also sent its compromised models through several standard tests of medical LLM performance and found that they passed. “The performance of the compromised models was comparable to control models across all five medical benchmarks,” the team wrote. So there’s no easy way to detect the poisoning.

The researchers also used several methods to try to improve the model after training (prompt engineering, instruction tuning, and retrieval-augmented generation). None of these improved matters.

Existing misinformation

Not all is hopeless. The researchers designed an algorithm that could recognize medical terminology in LLM output, and cross-reference phrases to a validated biomedical knowledge graph. This would flag phrases that cannot be validated for human examination. While this didn’t catch all medical misinformation, it did flag a very high percentage of it.
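
As a very rough sketch of that screening idea, the toy example below flags drug-condition claims that a trusted reference can’t confirm; the hardcoded dictionary stands in for the study’s full biomedical knowledge graph, and the matching here is far cruder than the real algorithm.

```python
# Toy sketch of the screening idea: flag drug-condition claims in model output
# that a trusted reference can't confirm. The hardcoded dictionary stands in
# for the study's full biomedical knowledge graph, and the exact-string
# matching here is far cruder than the real algorithm.
VALIDATED_CLAIMS = {
    "metformin": {"type 2 diabetes"},
    "hydrochlorothiazide": {"hypertension"},
}

def flag_unvalidated(claims):
    """Return (drug, condition) pairs the reference can't confirm, for human review."""
    flagged = []
    for drug, condition in claims:
        if condition.lower() not in VALIDATED_CLAIMS.get(drug.lower(), set()):
            flagged.append((drug, condition))
    return flagged

print(flag_unvalidated([("Metformin", "type 2 diabetes"),
                        ("Metformin", "COVID-19")]))   # flags the second claim
```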

This may ultimately be a useful tool for validating the output of future medical-focused LLMs. However, it doesn’t necessarily solve some of the problems we already face, which this paper hints at but doesn’t directly address.

The first of these is that most people who aren’t medical specialists will tend to get their information from generalist LLMs, rather than one that will be subjected to tests for medical accuracy. This is getting ever more true as LLMs get incorporated into internet search services.

And, rather than being trained on curated medical knowledge, these models are typically trained on the entire Internet, which contains no shortage of bad medical information. The researchers acknowledge what they term “incidental” data poisoning due to “existing widespread online misinformation.” But a lot of that “incidental” information was generally produced intentionally, as part of a medical scam or to further a political agenda. Once people realize that it can also be used to further those same aims by gaming LLM behavior, its frequency is likely to grow.

Finally, the team notes that even the best human-curated data sources, like PubMed, also suffer from a misinformation problem. The medical research literature is filled with promising-looking ideas that never panned out, and out-of-date treatments and tests that have been replaced by approaches more solidly based on evidence. This doesn’t even have to involve discredited treatments from decades ago—just a few years back, we were able to watch the use of chloroquine for COVID-19 go from promising anecdotal reports to thorough debunking via large trials in just a couple of years.

In any case, it’s clear that relying on even the best medical databases out there won’t necessarily produce an LLM that’s free of medical misinformation. Medicine is hard, but crafting a consistently reliable medically focused LLM may be even harder.

Nature Medicine, 2025. DOI: 10.1038/s41591-024-03445-1  (About DOIs).

Photo of John Timmer

John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.

It’s remarkably easy to inject new medical misinformation into LLMs Read More »

us-to-start-nationwide-testing-for-h5n1-flu-virus-in-milk-supply

US to start nationwide testing for H5N1 flu virus in milk supply

So, the ultimate goal of the USDA is to eliminate cattle as a reservoir. When the Agency announced it was planning for this program, it noted that there were two candidate vaccines in trials. Until those are validated, it plans to use the standard playbook for handling emerging infections: contact tracing and isolation. And it has the ability to compel cattle and their owners to be more cooperative than the human population turned out to be.

The five-step plan

The USDA refers to isolation and contact tracing as Stage 3 of a five-stage plan for controlling H5N1 in cattle, with the two earlier stages being the mandatory sampling and testing, meant to be handled on a state-by-state basis. Following the successful containment of the virus in a state, the USDA will move on to batch sampling to ensure each state remains virus-free. This is essential, given that we don’t have a clear picture of how many times the virus has jumped from its normal reservoir in birds into the cattle population.

That makes it possible that reaching Stage 5, which the USDA terms “Demonstrating Freedom from H5 in US Dairy Cattle,” will turn out to be impossible. Dairy cattle are likely to have daily contact with birds, and it may be that the virus will be regularly re-introduced into the population, leaving containment as the only option until the vaccines are ready.

Testing will initially focus primarily on states where cattle-to-human transmission is known to have occurred or the virus is known to be present: California, Colorado, Michigan, Mississippi, Oregon, and Pennsylvania. If you wish to track the progress of the USDA’s efforts, it will be posting weekly updates.

US to start nationwide testing for H5N1 flu virus in milk supply Read More »

breakdancers-at-risk-for-“headspin-hole,”-doctors-warn

Breakdancers at risk for “headspin hole,” doctors warn

Breakdancing has become a global phenomenon since it first emerged in the 1970s, even making its debut as an official event at this year’s Summer Olympics. But hardcore breakers are prone to injury (sprains, strains, tendonitis), including a bizarre condition known as “headspin hole” or “breakdance bulge”—a protruding lump on the scalp caused by repeatedly performing the power move known as a headspin. A new paper published in the British Medical Journal (BMJ) describes one such case that required surgery to correct.

According to the authors, there are very few published papers about the phenomenon; they cite two in particular. A 2009 German study of 106 breakdancers found that 60.4 percent of them experienced overuse injuries to the scalp because of headspins, with 31.1 percent of those cases reporting hair loss, 23.6 percent developing head bumps, and 36.8 percent experiencing scalp inflammation. A 2023 study of 142 breakdancers reported those who practiced headspins more than three times a week were much more likely to suffer hair loss.

So when a male breakdancer in his early 30s sought treatment for a pronounced bump on top of his head, Mikkal Bundgaard Skotting and Christian Baastrup Søndergaard of Copenhagen University Hospital in Denmark seized the opportunity to describe the clinical case study in detail, taking an MRI, surgically removing the growth, and analyzing the removed mass.

The man in question had been breakdancing for 19 years, incorporating various forms of headspins into his training regimen. He usually trained five days a week for 90 minutes at a time, with headspins applying pressure to the top of his head in two- to seven-minute intervals. In the last five years, he noticed a marked increase in the size of the bump on his head and increased tenderness. The MRI showed considerable thickening of the surrounding skin, tissue, and skull.

Breakdancers at risk for “headspin hole,” doctors warn Read More »

senate-panel-votes-20–0-for-holding-ceo-of-“health-care-terrorists”-in-contempt

Senate panel votes 20–0 for holding CEO of “health care terrorists” in contempt

Not above the law —

After he rejected subpoena, contempt charges against de la Torre go before Senate.

Ralph de la Torre, founder and chief executive officer of Steward Health Care System LLC, speaks during a summit in New York on Tuesday, Oct. 25, 2016.

A Senate committee on Thursday voted overwhelmingly to hold the wealthy CEO of a failed hospital chain in civil and criminal contempt for rejecting a rare subpoena from the lawmakers.

In July, the Senate Committee on Health, Education, Labor, and Pensions (HELP) subpoenaed Steward Health Care CEO Ralph de la Torre to testify before the lawmakers on the deterioration and eventual bankruptcy of the system, which included more than 30 hospitals across eight states. The resulting dire conditions in the hospitals, described as providing “third-world medicine,” allegedly led to the deaths of at least 15 patients and imperiled more than 2,000 others.

The committee, chaired by Senator Bernie Sanders (I-Vt.), highlighted that amid the system’s collapse, de la Torre was paid at least $250 million, bought a $40 million yacht, and owned a $15 million luxury fishing boat. Meanwhile, Steward executives jetted around on two private jets collectively worth $95 million.

De la Torre initially agreed to appear at the September 12 hearing but backed out the week beforehand. He claimed, through his lawyers, that a federal order stemming from Steward’s bankruptcy case prohibited him from discussing the hospital system’s situation amid reorganization and settlement efforts. The HELP committee rejected that explanation, but de la Torre was nevertheless a no-show at the hearing.

In a 20–0 bipartisan vote Thursday, the HELP committee held de la Torre in civil and criminal contempt, with only Sen. Rand Paul (R-Ky.) abstaining. It is the first time in modern history the committee has issued civil and criminal contempt resolutions. The charges will now go before the full Senate for a vote.

If upheld by the full Senate, the civil enforcement will direct the Senate’s legal counsel to bring a federal civil suit against de la Torre in order to force him to comply with the subpoena and testify before the HELP Committee. The criminal contempt charge would refer the case to the US Attorney for the District of Columbia to criminally prosecute de la Torre for failing to comply with the subpoena. If the trial proceeds and de la Torre is convicted, the tarnished CEO could face a fine of up to $100,000 and a prison sentence of up to 12 months.

On Wednesday, the day before the committee voted on the contempt charges, a lawyer for de la Torre blasted the senators and claimed that testifying at the hearing would have violated his Fifth Amendment rights, according to the Boston Globe.

In a statement Thursday, Sanders slammed de la Torre, saying that his wealth and expensive lawyers did not make him above the law. “If you defy a Congressional subpoena, you will be held accountable no matter who you are or how well-connected you may be,” he said.

Senate panel votes 20–0 for holding CEO of “health care terrorists” in contempt Read More »

passing-part-of-a-medical-licensing-exam-doesn’t-make-chatgpt-a-good-doctor

Passing part of a medical licensing exam doesn’t make ChatGPT a good doctor

For now, “you should see a doctor” remains good advice.

ChatGPT was able to pass some of the United States Medical Licensing Exam (USMLE) tests in a study done in 2022. This year, a team of Canadian medical professionals checked to see if it’s any good at actual doctoring. And it’s not.

ChatGPT vs. Medscape

“Our source for medical questions was the Medscape questions bank,” said Amrit Kirpalani, a medical educator at Western University in Ontario, Canada, who led the new research into ChatGPT’s performance as a diagnostic tool. The USMLE contains mostly multiple-choice test questions; Medscape has full medical cases based on real-world patients, complete with physical examination findings, laboratory test results, and so on.

The idea behind it is to make those cases challenging for medical practitioners due to complications like multiple comorbidities, where two or more diseases are present at the same time, and various diagnostic dilemmas that make the correct answers less obvious. Kirpalani’s team turned 150 of those Medscape cases into prompts that ChatGPT could understand and process.

This was a bit of a challenge because OpenAI, the company that made ChatGPT, has a restriction against using it for medical advice, so a prompt to straight-up diagnose the case didn’t work. This was easily bypassed, though, by telling the AI that diagnoses were needed for an academic research paper the team was writing. The team then fed it various possible answers, copy/pasted all the case info available at Medscape, and asked ChatGPT to provide the rationale behind its chosen answers.
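
For illustration, a prompt along those lines might be assembled something like the sketch below; the wording, placeholder case text, and use of the OpenAI Python client are assumptions made for the example, not the study’s actual code.

```python
# Rough sketch of how one of the Medscape cases might be turned into a prompt
# along the lines the team describes. The wording, placeholder case text, and
# use of the OpenAI Python client are illustrative assumptions, not the
# study's actual code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

case_text = "..."   # case history, physical exam findings, lab results
options = ["A) ...", "B) ...", "C) ...", "D) ..."]

prompt = (
    "We are writing an academic research paper on diagnostic reasoning.\n"
    "Case details:\n" + case_text + "\n\n"
    "Which of the following is the most likely diagnosis, and why?\n"
    + "\n".join(options)
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # chosen answer plus rationale
```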

It turned out that in 76 out of 150 cases, ChatGPT was wrong. But the chatbot was supposed to be good at diagnosing, wasn’t it?

Special-purpose tools

At the beginning of 2024, Google published a study on the Articulate Medical Intelligence Explorer (AMIE), a large language model purpose-built to diagnose diseases based on conversations with patients. AMIE outperformed human doctors in diagnosing 303 cases sourced from the New England Journal of Medicine’s clinicopathological conferences. And AMIE is not an outlier; during the last year, there was hardly a week without published research showcasing an AI performing amazingly well at diagnosing cancer and diabetes, and even predicting male infertility based on blood test results.

The difference between such specialized medical AIs and ChatGPT, though, lies in the data they have been trained on. “Such AIs may have been trained on tons of medical literature and may even have been trained on similar complex cases as well,” Kirpalani explained. “These may be tailored to understand medical terminology, interpret diagnostic tests, and recognize patterns in medical data that are relevant to specific diseases or conditions. In contrast, general-purpose LLMs like ChatGPT are trained on a wide range of topics and lack the deep domain expertise required for medical diagnosis.”

Passing part of a medical licensing exam doesn’t make ChatGPT a good doctor Read More »