In an update released late Friday evening, NASA said it was “adjusting” the date of the Starliner spacecraft’s return to Earth from June 26 to an unspecified time in July.
The announcement followed two days of long meetings to review the readiness of the spacecraft, developed by Boeing, to fly NASA astronauts Butch Wilmore and Suni Williams back to Earth. According to sources, these meetings included high-level participation from senior leaders at the agency, including Associate Administrator Jim Free.
This “Crew Flight Test,” which launched on June 5 atop an Atlas V rocket, was originally due to undock and return to Earth on June 14. However, as engineers from NASA and Boeing have studied data from the vehicle’s problematic flight to the International Space Station, they have waved off several return opportunities.
On Friday night they did so again, citing the need to spend more time reviewing data.
“Taking our time”
“We are taking our time and following our standard mission management team process,” said Steve Stich, manager of NASA’s Commercial Crew Program, in the NASA update. “We are letting the data drive our decision making relative to managing the small helium system leaks and thruster performance we observed during rendezvous and docking.”
Just a few days ago, on Tuesday, officials from NASA and Boeing set a return date to Earth for June 26. But that was before a series of meetings on Thursday and Friday during which mission managers were to review findings about two significant issues with the Starliner spacecraft: five separate leaks in the helium system that pressurizes Starliner’s propulsion system and the failure of five of the vehicle’s 28 reaction-control system thrusters as Starliner approached the station.
The NASA update did not provide any information about deliberations during these meetings, but it is clear that the agency’s leaders were not able to get comfortable with all contingencies that Wilmore and Williams might encounter during a return flight to Earth, including safely undocking from the space station, maneuvering away, performing a de-orbit burn, separating the crew capsule from the service module, and then flying through the planet’s atmosphere before landing under parachutes in a New Mexico desert.
Spacecraft has a 45-day limit
Now, the NASA and Boeing engineering teams will take some more time. Sources said NASA considered June 30 as a possible return date, but the agency is also keen to perform a pair of spacewalks outside the station. These spacewalks, presently planned for June 24 and July 2, will now go ahead. Starliner will make its return to Earth sometime afterward, likely no earlier than the July 4 holiday.
“We are strategically using the extra time to clear a path for some critical station activities while completing readiness for Butch and Suni’s return on Starliner and gaining valuable insight into the system upgrades we will want to make for post-certification missions,” Stich said.
In some sense, it is helpful for NASA and Boeing to have Starliner docked to the space station for a longer period of time. They can gather more data about the performance of the vehicle on long-duration missions—eventually Starliner will fly operational missions that will enable astronauts to stay on orbit for six months at a time.
However, this vehicle is only rated for a 45-day stay at the space station, and that clock began ticking on June 6. Moreover, it is not optimal that NASA feels the need to continue delaying the vehicle to get comfortable with its performance on the return journey to Earth. During a pair of news conferences since Starliner docked to the station, officials have downplayed the overall seriousness of these issues—repeatedly saying Starliner is cleared to come home “in case of an emergency.” But they have yet to fully explain why they are not yet comfortable with releasing Starliner to fly back to Earth under normal circumstances.
The Food and Drug Administration (FDA) on Thursday announced expanded approval for a gene therapy to treat Duchenne muscular dystrophy (DMD)—despite the fact that it failed a Phase III clinical trial last year and that the approval came over the objections of three of FDA’s own expert review teams and two of its directors.
In fact, the decision to expand the approval of the therapy—called Elevidys (delandistrogene moxeparvovec-rokl)—appears to have been decided almost entirely by Peter Marks, Director of the FDA’s Center for Biologics Evaluation and Research.
Elevidys initially gained an FDA approval last year, also over objections from staff. The therapy intravenously delivers a transgene that codes for select portions of a protein called dystrophin in healthy muscle cells; the protein is mutated in patients with DMD. Last year’s initial approval occurred under an accelerated approval process and was only for use in DMD patients ages 4 and 5 who are able to walk. In the actions Thursday, the FDA granted a traditional approval for the therapy and opened access to DMD patients of all ages, regardless of ambulatory status.
“Today’s approval broadens the spectrum of patients with Duchenne muscular dystrophy eligible for this therapy, helping to address the ongoing, urgent treatment need for patients with this devastating and life-threatening disease,” Marks said in the announcement Thursday. “We remain steadfast in our commitment to help advance safe and effective treatments for patients who desperately need them.”
Criticism
The move, which follows a string of controversies in recent years in which the FDA issued questionable approvals over the assessments of its advisors and its own staff, has quickly drawn criticism from agency watchers.
In a blog post Friday, the notable pharmaceutical industry expert and commentator Derek Lowe condemned the approval. Lowe expressed concern that the agency seems to be tilting toward emotional rhetoric and the will of patient advocates over scientific and medical evidence.
“It appears that all you need is a friend high up in the agency and your clinical failures just aren’t an issue any more,” he wrote. “Review committees aren’t convinced? Statisticians don’t buy your arguments? Who cares! Peter Marks is here to deliver hot, steaming takeout containers full of Hope. … And while I realize that this may make me sound like a heartless SOB, I think this is a huge mistake that we will be paying for for a long time.”
In a comment to Stat News, former FDA chief scientist Luciana Borio echoed concerns about how decisions like this will affect the agency in the longer term.
“I don’t know what to say. Peter Marks makes a mockery of scientific reasoning and approval standards that have served patients well over decades,” said Borio, who has also opposed earlier controversial approvals. “This type of action also promotes the growing mistrust in scientific institutions like the FDA.”
Internal dissent
In a series of review documents and memos released by the FDA, the divide between Marks and agency staff is abundantly clear. A review by FDA statisticians concluded that the collective clinical trial results “do not suggest there is substantial evidence to support the effectiveness of [Elevidys] for the expanded indication to all DMD patients and do not support the conversion of accelerated to traditional approval.”
A joint review from the agency’s Clinical and Clinical Pharmacology teams likewise concluded that the “totality of the data does not provide substantial evidence of effectiveness of Elevidys for treatment of ambulatory DMD patients of any age” and that the results “argue against” expanding access.
In a memo, Lola Fashoyin-Aje, Director of the Office of Clinical Evaluation in the Office of Therapeutic Products (OTP), and Dr. Nicole Verdun, Super Office Director of the OTP, concluded that the clinical results “cast significant uncertainty regarding the benefits of treatment of DMD with Elevidys.” The two directors found the primary clinical trial endpoint results were “not statistically significant” and smaller analyses looking at secondary endpoints of specific patient measures—such as the time it takes patients to rise from the floor or walk 10 meters—were “inconclusive,” in some cases “conflicting,” and overall illustrated the “unreliability of exploratory analyses to support regulatory decision-making.”
In a memo of his own, Marks agreed that the primary endpoint result of the trial—based on scores on a standardized assessment of motor function in patients—did not show a statistically significant benefit. But he argued that the secondary endpoints were convincing enough for him. Marks wrote:
Specifically, although acknowledging that the Applicant’s randomized study of Elevidys failed to meet its statistical primary endpoint … I find that the observations regarding the secondary endpoints and exploratory endpoints are compelling and, combined with other data provided in the efficacy supplement and the original [Biologics License Application], meet the substantial evidence of effectiveness standard …
If Marks had not overruled the agency’s reviewers and directors, Fashoyin-Aje wrote that she would have recommended the therapy’s maker, Sarepta, conduct “an additional adequate and well-controlled study of Elevidys in the subgroup(s) of patients for which [Sarepta] believes the effects of Elevidys to be most promising.” However, Marks’ decision to approve renders the possibility of such a trial “highly infeasible to explore in a post-approval setting,” she wrote.
Earlier this week, the US Senate passed what’s being called the ADVANCE Act, for Accelerating Deployment of Versatile, Advanced Nuclear for Clean Energy. Among a number of other changes, the bill would attempt to streamline permitting for newer reactor technology and offer cash incentives for the first companies that build new plants that rely on one of a handful of different technologies. It enjoyed broad bipartisan support both in the House and Senate and now heads to President Biden for his signature.
Given Biden’s penchant for promoting his bipartisan credentials, it’s likely to be signed into law. But the biggest hurdles nuclear power faces are all economic, rather than regulatory, and the bill provides very little in the way of direct funding that could help overcome those barriers.
Incentives
For reasons that will be clear only to congressional staffers, the Senate version of the bill was attached to an amendment to the Federal Fire Prevention and Control Act. Nevertheless, it passed by a margin of 88-2, indicating widespread (and potentially veto-proof) support. Having passed the House already, there’s nothing left but the president’s signature.
The bill’s language focuses on the Nuclear Regulatory Commission (NRC) and its role in licensing nuclear reactor technology. The NRC is directed to develop a variety of reports for Congress—so, so many reports, focusing on everything from nuclear waste to fusion power—that could potentially inform future legislation. But the meat of the bill has two distinct focuses: streamlining regulation and providing some incentives for new technology.
The incentives are one of the more interesting features of the bill. They’re primarily focused on advanced nuclear technology, which is defined extremely broadly by an earlier statute as providing any of the following:
(A) additional inherent safety features
(B) significantly lower levelized cost of electricity
(C) lower waste yields
(D) greater fuel utilization
(E) enhanced reliability
(F) increased proliferation resistance
(G) increased thermal efficiency
(H) ability to integrate into electric and nonelectric applications
Normally, the work of the NRC in licensing is covered via application fees paid by the company seeking the license. But the NRC is instructed to lower its licensing fees for anyone developing advanced nuclear technologies. And there’s a “prize” incentive where the first company to get across the line with any of a handful of specific technologies will have all these fees refunded to it.
Awards will go to companies that achieve any of the following milestones: the first advanced reactor design to receive a license from the NRC; the first advanced reactor to be loaded with fuel for operation; the first to use isotopes derived from spent fuel; the first to build a facility where the reactor is integrated into a system that stores energy; and the first to build a facility where the reactor provides electricity or process heat for industrial applications.
The first award will likely go to NuScale, which is developing a small, modular reactor design and has gotten pretty far along in the licensing process. Its first planned installation, however, has been cancelled due to rising costs, so there’s no guarantee that the company will be first to fuel a reactor. TerraPower, a company backed by Bill Gates, is fairly far along in the design of a reactor facility that will come with integrated storage, and so may be considered a frontrunner there.
For the remaining two prizes, there aren’t frontrunners for very different reasons. Nearly every company building small modular nuclear reactors promotes them as a potential source of process heat. By contrast, reprocessing spent fuel has been hugely expensive in any country where it has been tried, so it’s unlikely that prize will ever be given out.
For centuries, Western scholars have touted the fate of the native population on Easter Island (Rapa Nui) as a case study in the devastating cost of environmentally unsustainable living. The story goes that the people on the remote island chopped down all the trees to build massive stone statues, triggering a population collapse. Their numbers were further depleted when Europeans discovered the island and brought foreign diseases, among other factors. But an alternative narrative began to emerge in the 21st century, arguing that the earliest inhabitants actually lived quite sustainably until European contact. A new paper published in the journal Science Advances offers another key piece of evidence in support of that alternative hypothesis.
As previously reported, Easter Island is famous for its giant monumental statues, called moai, built some 800 years ago and typically mounted on platforms called ahu. Scholars have puzzled over the moai on Easter Island for decades, pondering their cultural significance, as well as how a Stone Age culture managed to carve and transport statues weighing as much as 92 tons. The first Europeans arrived in the 18th century and found only a few thousand inhabitants on a tiny island (just 14 by 7 miles across) thousands of miles away from any other land. Since then, in order to explain the presence of so many moai, the assumption has been that the island was once home to tens of thousands of people.
But perhaps they didn’t need tens of thousands of people to accomplish that feat. Back in 2012, Carl Lipo of Binghamton University and Terry Hunt of the University of Arizona showed that you could transport a 10-foot, 5-ton moai a few hundred yards with just 18 people and three strong ropes by employing a rocking motion. In 2018, Lipo proposed an intriguing hypothesis for how the islanders placed red hats on top of some moai; those can weigh up to 13 tons. He suggested the inhabitants used ropes to roll the hats up a ramp. Lipo and his team later concluded (based on quantitative spatial modeling) that the islanders likely chose the statues’ locations based on the availability of fresh water sources, per a 2019 paper in PLOS One.
In 2020, Lipo and his team turned their attention to establishing a better chronology of human occupation of Rapa Nui. While it’s generally agreed that people arrived in Eastern Polynesia and on Rapa Nui sometime in the late 12th century or early 13th century, we don’t really know very much about the timing and tempo of events related to ahu construction and moai transport in particular. In his bestselling 2005 book Collapse, Jared Diamond offered the societal collapse of Easter Island (aka Rapa Nui), around 1600, as a cautionary tale. Diamond controversially argued that the destruction of the island’s ecological environment triggered a downward spiral of internal warfare, population decline, and cannibalism, resulting in an eventual breakdown of social and political structures.
Challenging a narrative
Lipo has long challenged that narrative, arguing as far back as 2007 against the “ecocide” theory. He and Hunt published a paper that year noting the lack of evidence of any warfare on Easter Island compared to other Polynesian islands. There are no known fortifications, and the obsidian tools found were clearly used for agriculture. Nor is there much evidence of violence among skeletal remains. He and Hunt concluded that the people of Rapa Nui continued to thrive well after 1600, which would warrant a rethinking of the popular narrative that the island was destitute when Europeans arrived in 1722.
For their 2020 study, the team applied a Bayesian model-based method to existing radiocarbon dates collected from prior excavations at 11 different sites with ahu. That work met with some mixed opinions from Lipo’s fellow archaeologists, with some suggesting that his team cherry-picked its radiocarbon dating—an allegation he dismissed at the time as “simply baloney and misinformed thinking.” They filtered their radiocarbon samples to just those they were confident related to human occupation and human-related events, meaning they analyzed a smaller subset of all the available ages—not an unusual strategy to eliminate bias due to issues with old carbon—and the results for colonization estimates were about the same as before.
The model also integrated the order and position of the island’s distinctive architecture, as well as ethnohistoric accounts, thereby quantifying the onset of monument construction, the rate at which it occurred, and when it likely ended. This allowed the researchers to test Diamond’s “collapse” hypothesis by building a more precise timeline of when construction took place at each of the sites. The results demonstrated a lack of evidence for a pre-contact collapse and instead offered strong support for a new emerging model of resilient communities that continued their long-term traditions despite the impacts of European arrival.
Fresh evidence
Now Lipo is back with fresh findings in support of his alternative theory, having analyzed the landscape to identify all the agricultural areas on the island. “We really wanted to look at the evidence for whether the island could in fact support such a large number of people,” he said during a media briefing. “What we know about the pre-contact people living on the island is that they survived on a combination of marine resources—fishing accounted for about 50 percent of their diet—and growing crops,” particularly the sweet potato, as well as taro and yams.
He and his co-authors set out to determine how much food could be produced agriculturally, extrapolating from that the size of a sustainable population. The volcanic soil on Easter Island is highly weathered and thus poor in nutrients essential for plant growth: nitrogen, phosphorus and potassium primarily, but also calcium, magnesium, and sulfur. To increase yields, the natives initially cut down the island’s trees to get nutrients back into the soil.
When there were no more trees, they engaged in a practice called “lithic mulching,” a form of rock gardening in which broken rocks were added to the first 20 to 25 centimeters (about 8 to 10 inches) of soil. This added essential nutrients back into the soil. “We do it ourselves with non-organic fertilizer,” said Lipo. “Essentially we use machines to crush rock into tiny pieces, which is effective because it exposes a lot of surface area. The people in Rapa Nui are doing it by hand, literally breaking up rocks and sticking them in dirt.”
Only one previous study, published in 2013, had attempted to determine the island’s rock-garden capacity, relying on near-infrared bands from satellite images. The authors of that study estimated that between 4.9 and 21.2 km2 of the island’s total area comprised rock gardens, although they acknowledged this was likely an inaccurate estimation.
Lipo et al. examined satellite imagery data collected over the last five years, not just in the near-infrared, but also short-wave infrared (SWIR) and other visible spectra. SWIR is particularly sensitive to detecting water and nitrogen levels, making it easier to pinpoint areas where lithic mulching occurred. They trained machine-learning models on archaeological field identifications of rock garden features to analyze the SWIR data for a new estimation of capacity.
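As a rough illustration of that kind of workflow (a minimal sketch, not the authors' actual pipeline), one could train a standard pixel classifier on labeled multispectral data and then convert the predicted rock-garden pixels into an area estimate. The band layout, the random-forest model choice, and the 10-meter pixel resolution below are all assumptions made for the example.

```python
# Minimal sketch of a supervised pixel classifier for rock-garden mapping.
# Placeholder arrays stand in for real satellite bands and field labels;
# the band count, resolution, and model choice are assumptions, not the
# study's actual methods.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
pixels = rng.random((1000, 6))              # (n_pixels, n_bands): visible, NIR, SWIR reflectances
labels = (pixels[:, 4] > 0.8).astype(int)   # 1 where field surveys mapped a rock garden, 0 elsewhere

# Fit the classifier on the field-verified training pixels.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(pixels, labels)

# Classify every pixel, then convert predicted garden pixels to area.
predicted = model.predict(pixels)
pixel_area_km2 = (10 * 10) / 1e6            # assume 10 m pixels -> 0.0001 km^2 each
garden_area_km2 = predicted.sum() * pixel_area_km2
print(f"Estimated rock-garden area: {garden_area_km2:.4f} km^2")
```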
The result: Lipo et al. determined that the extent of rock gardening was about one-fifth of even the most conservative previous estimates of garden coverage on Easter Island. They estimate that the island could support about 3,000 people—roughly the same number of inhabitants European explorers encountered when they arrived. “Previous studies had estimated that the island was fairly covered with mulch gardening, which led to estimates of up to 16,000 people,” said Lipo. “We’re saying that the island could never have supported 16,000 people; it didn’t have the productivity to do so. This pre-European collapse narrative simply has no basis in the archaeological record.”
“We don’t see demographic change decline in populations prior to Europeans’ arrival,” Lipo said. “All the [cumulative] evidence to date shows a continuous growth until some plateau is reached. It certainly was never an easy place to live, but people were able to figure out a means of doing so and lived within the boundaries of the capacity of the island up until European arrival.” So rather than being a cautionary tale, “Easter Island is a great case of how populations adapt to limited resources on a finite place, and do so sustainably.”
The owner of a home in southwestern Florida has formally submitted a claim to NASA for damages caused by a chunk of space debris that fell through his roof in March.
The legal case is unprecedented—evidently, no one has made such a claim against NASA before. How the space agency responds will set a precedent, and that may be important in a world where there is ever more activity in orbit, with space debris and vehicles increasingly making uncontrolled reentries through Earth’s atmosphere.
Alejandro Otero, owner of the Naples, Florida, home struck by the debris, was not home when part of a battery pack from the International Space Station crashed through his home on March 8. His son Daniel, 19, was home but escaped injury. NASA has confirmed the 1.6-pound object, made of the metal alloy Inconel, was part of a battery pack jettisoned from the space station in 2021.
An attorney for the Otero family, Mica Nguyen Worthy, told Ars that she has asked NASA for “in excess of $80,000” for non-insured property damage loss, business interruption damages, emotional and mental anguish damages, and the costs for assistance from third parties.
“We intentionally kept it very reasonable because we did not want it to appear to NASA that my clients are seeking a windfall,” Worthy said.
The family has not filed a lawsuit against NASA, at least not yet. Worthy said she has been having productive conversations with NASA legal representatives. She said the Otero family wants to be made whole for their losses, but also to establish a precedent for future victims. “This is truly the first legal claim that is being submitted for recovery for damages related to space debris,” Worthy said. “How NASA responds will, in my view, be foundational for how future claims are handled. This is really changing the legal landscape.”
Who, exactly, is liable for space debris?
If space debris from another country—say, a Chinese or Russian rocket upper stage—were to strike a family in the United States, the victims would be entitled to compensation under the Space Liability Convention agreed to by space powers half a century ago. Under this treaty, a launching state is “absolutely” liable to pay compensation for damage caused by its space objects on the surface of the Earth or to aircraft, and liable for damage due to its faults in space. In an international situation, NASA or some other US government agency would negotiate on the victim’s behalf for compensation.
However, in this case the debris came from the International Space Station: an old battery pack that NASA was responsible for. NASA completed a multi-year upgrade of the space station’s power system in 2020 by installing a final set of new lithium-ion batteries to replace aging nickel-hydrogen batteries that were reaching end-of-life. During a spacewalk, this battery pack was mounted on a cargo pallet launched by Japan.
Officials originally planned to place pallets of the old batteries inside a series of Japanese supply freighters for controlled, destructive reentries over the ocean. But due to a series of delays, the final cargo pallet of old batteries missed its ride back to Earth, so NASA jettisoned the batteries to make an unguided reentry. NASA incorrectly believed the batteries would completely burn up during the return through the atmosphere.
Because this case falls outside the Space Liability Convention, there is no mechanism for a US citizen to seek claims from the US government for damage from space debris. So the Otero family is making a first-ever claim for falling space debris under the Federal Tort Claims Act, which allows someone to sue the US government for negligence. In this case, the alleged negligence could be that NASA miscalculated, allowing debris large enough to damage property on Earth to survive reentry.
NASA provided a form to the Otero family to submit a claim, which Worthy said they did at the end of May. NASA now has six months to review the claim. The space agency has several options. Legally, it could recompense the Otero family up to $25,000 for each of its claims under the Federal Tort Claims Act. If the agency seeks to pay full restitution, it would require approval from the US attorney general. Finally, NASA could refuse the claims or make an unacceptable settlement offer—in which case the Otero family could file a federal lawsuit in Florida.
Ars has sought comment from NASA about the claims made and will update this story when we receive one.
On a Wednesday morning in late January 1896 at a small light bulb factory in Chicago, a middle-aged woman named Rose Lee found herself at the heart of a groundbreaking medical endeavor. With an X-ray tube positioned above the tumor in her left breast, Lee was treated with a torrent of high-energy particles that penetrated into the malignant mass.
“And so,” as her treating clinician later wrote, “without the blaring of trumpets or the beating of drums, X-ray therapy was born.”
Radiation therapy has come a long way since those early beginnings. The discovery of radium and other radioactive metals opened the doors to administering higher doses of radiation to target cancers located deeper within the body. The introduction of proton therapy later made it possible to precisely guide radiation beams to tumors, thus reducing damage to surrounding healthy tissues—a degree of accuracy that was further refined through improvements in medical physics, computer technologies and state-of-the-art imaging techniques.
But it wasn’t until the new millennium, with the arrival of targeted radiopharmaceuticals, that the field achieved a new level of molecular precision. These agents, akin to heat-seeking missiles programmed to hunt down cancer, journey through the bloodstream to deliver their radioactive warheads directly at the tumor site.
Today, only a handful of these therapies are commercially available for patients—specifically, for forms of prostate cancer and for tumors originating within hormone-producing cells of the pancreas and gastrointestinal tract. But this number is poised to grow as major players in the biopharmaceutical industry begin to invest heavily in the technology.
AstraZeneca became the latest heavyweight to join the field when, on June 4, the company completed its purchase of Fusion Pharmaceuticals, maker of next-generation radiopharmaceuticals, in a deal worth up to $2.4 billion. The move follows similar billion-dollar-plus transactions made in recent months by Bristol Myers Squibb (BMS) and Eli Lilly, along with earlier takeovers of innovative radiopharmaceutical firms by Novartis, which continued its acquisition streak—begun in 2018—with another planned $1 billion upfront payment for a radiopharma startup, as revealed in May.
“It’s incredible how, suddenly, it’s all the rage,” says George Sgouros, a radiological physicist at Johns Hopkins University School of Medicine in Baltimore and the founder of Rapid, a Baltimore-based company that provides software and imaging services to support radiopharmaceutical drug development. This surge in interest, he points out, underscores a wider recognition that radiopharmaceuticals offer “a fundamentally different way of treating cancer.”
Treating cancer differently, however, means navigating a minefield of unique challenges, particularly in the manufacturing and meticulously timed distribution of these new therapies, before the radioactivity decays. Expanding the reach of the therapy to treat a broader array of cancers will also require harnessing new kinds of tumor-killing particles and finding additional suitable targets.
“There’s a lot of potential here,” says David Nierengarten, an analyst who covers the radiopharmaceutical space for Wedbush Securities in San Francisco. But, he adds, “There’s still a lot of room for improvement.”
Atomic advances
For decades, a radioactive form of iodine stood as the sole radiopharmaceutical available on the market. Once ingested, this iodine gets taken up by the thyroid, where it helps to destroy cancerous cells of that butterfly-shaped gland in the neck—a treatment technique established in the 1940s that remains in common use today.
But the targeted nature of this strategy is not widely applicable to other tumor types.
The thyroid is naturally inclined to absorb iodine from the bloodstream since this mineral, which is found in its nonradioactive form in many foods, is required for the synthesis of certain hormones made by the gland.
Other cancers don’t have a comparable affinity for radioactive elements. So instead of hijacking natural physiological pathways, researchers have had to design drugs that are capable of recognizing and latching onto specific proteins made by tumor cells. These drugs are then further engineered to act as targeted carriers, delivering radioactive isotopes—unstable atoms that emit nuclear energy—straight to the malignant site.
It’s one of the world’s worst-kept secrets that large language models give blatantly false answers to queries and do so with a confidence that’s indistinguishable from when they get things right. There are a number of reasons for this. The AI could have been trained on misinformation; the answer could require some extrapolation from facts that the LLM isn’t capable of; or some aspect of the LLM’s training might have incentivized a falsehood.
But perhaps the simplest explanation is that an LLM doesn’t recognize what constitutes a correct answer but is compelled to provide one. So it simply makes something up, a habit that has been termed confabulation.
Figuring out when an LLM is making something up would obviously have tremendous value, given how quickly people have started relying on them for everything from college essays to job applications. Now, researchers from the University of Oxford say they’ve found a relatively simple way to determine when LLMs appear to be confabulating that works with all popular models and across a broad range of subjects. And, in doing so, they develop evidence that most of the alternative facts LLMs provide are a product of confabulation.
Catching confabulation
The new research is strictly about confabulations, and not instances such as training on false inputs. As the Oxford team defines them in their paper describing the work, confabulations are where “LLMs fluently make claims that are both wrong and arbitrary—by which we mean that the answer is sensitive to irrelevant details such as random seed.”
The reasoning behind their work is actually quite simple. LLMs aren’t trained for accuracy; they’re simply trained on massive quantities of text and learn to produce human-sounding phrasing through that. If enough text examples in its training consistently present something as a fact, then the LLM is likely to present it as a fact. But if the examples in its training are few or inconsistent in their facts, then the LLM synthesizes a plausible-sounding answer that is likely incorrect.
But the LLM could also run into a similar situation when it has multiple options for phrasing the right answer. To use an example from the researchers’ paper, “Paris,” “It’s in Paris,” and “France’s capital, Paris” are all valid answers to “Where’s the Eiffel Tower?” So, statistical uncertainty, termed entropy in this context, can arise either when the LLM isn’t certain about how to phrase the right answer or when it can’t identify the right answer.
This means it’s not a great idea to simply force the LLM to return “I don’t know” when confronted with several roughly equivalent answers. We’d probably block a lot of correct answers by doing so.
So instead, the researchers focus on what they call semantic entropy. This takes all the statistically likely answers generated by the LLM and determines how many of them are semantically equivalent. If a large number all have the same meaning, then the LLM is likely uncertain about phrasing but has the right answer. If not, then it is presumably in a situation where it would be prone to confabulation and should be prevented from doing so.
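To make the idea concrete, here is a minimal Python sketch of semantic entropy, not the Oxford team's implementation: it clusters sampled answers into meaning groups and computes the entropy over those groups. The `are_equivalent` check is a stand-in for the entailment-based equivalence test the researchers rely on, and weighting clusters by raw sample frequency is a simplifying assumption.

```python
import math

def semantic_entropy(answers, are_equivalent):
    """Estimate semantic entropy from answers sampled for one question.

    answers: list of strings sampled from the LLM.
    are_equivalent: callable(a, b) -> bool deciding whether two answers
        mean the same thing (a stand-in for an entailment-based check).
    """
    # Greedily cluster answers into groups that share a meaning.
    clusters = []
    for ans in answers:
        for cluster in clusters:
            if are_equivalent(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])

    # Shannon entropy over the probability mass of each meaning-cluster.
    n = len(answers)
    return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

# Toy check: several phrasings of one answer vs. three different answers.
same = lambda a, b: a.lower().replace("it's in ", "") == b.lower().replace("it's in ", "")
print(semantic_entropy(["Paris", "paris", "It's in Paris"], same))  # ~0.0: one meaning
print(semantic_entropy(["Paris", "Lyon", "Marseille"], same))       # ~1.1: three meanings
```

A low score suggests the model is merely unsure how to phrase a single answer, while a high score flags the arbitrariness that marks a likely confabulation.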
Cases of illnesses linked to microdosing candies have more than doubled, with reports of seizures and the need for intubation, mechanical ventilation, and intensive care stays. But, there remains no recall of the products—microdosing chocolates, gummies, and candy cones by Diamond Shruumz—linked to the severe and life-threatening illnesses. In the latest update from the Food and Drug Administration late Tuesday, the agency said that it “has been in contact with the firm about a possible voluntary recall, but these discussions are still ongoing.”
In the update, the FDA reported 26 cases across 16 states, up from 12 cases in eight states last week. Of the 26 reported cases, 25 sought medical care and 16 were hospitalized. No deaths have been reported.
Last week, the Centers for Disease Control and Prevention released a health alert about the candies. The agency noted that as of June 11, the people sickened after eating Diamond Shruumz candies presented to health care providers with a host of severe symptoms. Those include: central nervous system depression with sedation, seizures, muscle rigidity, clonus (abnormal reflex responses), tremor, abnormal heart rate (bradycardia or tachycardia), abnormal blood pressure (hypotension or hypertension), gastrointestinal effects (nausea, vomiting, or abdominal pain), skin flushing, diaphoresis (excessive sweating), and metabolic acidosis with increased anion gap (an acid-base disorder linked to poisonings).
At the time of the CDC alert, 10 patients had been hospitalized, and “several required intubation, mechanical ventilation, and admission to an intensive care unit,” the agency reported.
It remains unclear what ingredient in the candies could be causing the poisonings. The FDA reports that it has worked with state partners to collect multiple samples of Diamond Shruumz products so they can be analyzed for potential toxic components. That analysis is still ongoing, the agency said.
Diamond Shruumz has not responded to multiple requests for comment from Ars.
Untold toxic ingredients
Diamond Shruumz does not list the ingredients of its products on its website. They are sold as “microdosing” candies, a term that typically suggests a small amount of a psychedelic compound is present. The company describes its chocolates, gummies, and cones as “trippy,” “psychedelic,” and “hallucinogenic,” and also claims they contain a “primo proprietary blend of nootropic and functional mushrooms.” But, it’s unclear what, if any, psychoactive compound is present in the candies.
The CDC notes that products like these “might contain undisclosed ingredients, including illicit substances, other adulterants, or potentially harmful contaminants that are not approved for use in food.”
Diamond Shruumz posted documents on its website from third-party laboratories that purport to show the candies do not contain the most notable mushroom-derived psychedelic compound, psilocybin. The reports also indicate that some of the products do not contain cannabinoids or compounds from the hallucinogenic Amanita muscaria mushroom. Additionally, the company said in a blog post that its products contain a blend of Lion’s mane, Reishi, and Chaga mushrooms, but these are all non-hallucinogenic mushrooms used in herbal and traditional medicines and supplements.
In recent decades, hundreds of new synthetic psychoactive substances have hit the market in such products, including many new phenethylamines and tryptamines, which are chemically related to LSD and psilocybin. Some experts and members of the psychedelic community have speculated that Diamond Shruumz products could potentially contain one of the more popular tryptamines, 4-AcO-DMT, often pronounced “4-akko-DMT,” and also known as 4-acetoxy-N,N-dimethyltryptamine, O-acetylpsilocin, or psilacetin. According to a qualitative 2020 study, users describe 4-AcO-DMT as producing effects similar to psilocybin, but without some of the unpleasant side effects noted with natural mushrooms, such as nausea. Animal experiments have confirmed that 4-AcO-DMT appears to produce psilocybin-like effects.
Still, it’s unclear if such ingredients could explain the symptoms seen in the current outbreak. Though clinical data on 4-AcO-DMT is scant, it has not been linked to such severe symptoms. On the other hand, some novel synthetic compounds, such as DOx and NBOMe, often misrepresented as LSD, are considered dangerous. For instance, NBOMe compounds (N-methoxybenzyl, also called N-bombs or 25I), first discovered in 2003, have been linked to overdoses and deaths. In the scientific literature, they’ve been linked to “unpleasant hallucinations, panic, agitation, hypertension, seizures, acute psychosis, and/or excited delirium that can result in cardiac arrest,” according to the 2020 study.
In the urgent quest for a more sustainable global food system, livestock are a mixed blessing. On the one hand, by converting fibrous plants that people can’t eat into protein-rich meat and milk, grazing animals like cows and sheep are an important source of human food. And for many of the world’s poorest, raising a cow or two—or a few sheep or goats—can be a key source of wealth.
But those benefits come with an immense environmental cost. A study in 2013 showed that globally, livestock account for about 14.5 percent of greenhouse gas emissions, more than all the world’s cars and trucks combined. And about 40 percent of livestock’s global warming potential comes in the form of methane, a potent greenhouse gas formed as they digest their fibrous diet.
That dilemma is driving an intense research effort to reduce methane emissions from grazers. Existing approaches, including improved animal husbandry practices and recently developed feed additives, can help, but not at the scale needed to make a significant global impact. So scientists are investigating other potential solutions, such as breeding low-methane livestock and tinkering with the microbes that produce the methane in grazing animals’ stomachs. While much more research is needed before those approaches come to fruition, they could be relatively easy to implement widely and could eventually have a considerable impact.
The good news—and an important reason to prioritize the effort—is that methane is a relatively short-lived greenhouse gas. Whereas the carbon dioxide emitted today will linger in the atmosphere for more than a century, today’s methane will wash out in little more than a decade. So tackling methane emissions now can lower greenhouse gas levels and thus help slow climate change almost immediately.
“Reducing methane in the next 20 years is about the only thing we have to keep global warming in check,” says Claudia Arndt, a dairy nutritionist working on methane emissions at the International Livestock Research Institute in Nairobi, Kenya.
The methane dilemma
The big challenge in lowering methane is that the gas is a natural byproduct of what makes grazing animals uniquely valuable: their partnership with a host of microbes. These microbes live within the rumen, the largest of the animals’ four stomachs, where they break down the fibrous food into smaller molecules that the animals can absorb for nutrition. In the process, they generate large amounts of hydrogen gas, which is converted into methane by another group of microbes called methanogens.
Most of this methane, often referred to as enteric methane, is belched or exhaled out by the animals into the atmosphere—just one cow belches out around 220 pounds of methane gas per year, for example. (Contrary to popular belief, very little methane is expelled in the form of farts. Piles of manure that accumulate in feedlots and dairy barns account for about a quarter of US livestock methane, but aerating the piles or capturing the methane for biogas can prevent those emissions; the isolated cow plops from pastured grazing animals generate little methane.)
The humble hagfish is an ugly, gray, eel-like creature best known for its ability to unleash a cloud of sticky slime onto unsuspecting predators, clogging the gills and suffocating said predators. That’s why it’s affectionately known as a “snot snake.” Hagfish also love to burrow into the deep-sea sediment, but scientists have been unable to observe precisely how they do so because the murky sediment obscures the view. Researchers at Chapman University built a special tank with transparent gelatin to overcome this challenge and get a complete picture of the burrowing behavior, according to a new paper published in the Journal of Experimental Biology.
“For a long time we’ve known that hagfish can burrow into soft sediments, but we had no idea how they do it,” said co-author Douglas Fudge, a marine biologist who heads a lab at Chapman devoted to the study of hagfish. “By figuring out how to get hagfish to voluntarily burrow into transparent gelatin, we were able to get the first ever look at this process.”
As previously reported, scientists have been studying hagfish slime for years because it’s such an unusual material. It’s not like mucus, which dries out and hardens over time. Hagfish slime stays slimy, giving it the consistency of half-solidified gelatin. That’s due to long, thread-like fibers in the slime, in addition to the proteins and sugars that make up mucin, the other major component. Those fibers coil up into “skeins” that resemble balls of yarn. When the hagfish lets loose with a shot of slime, the skeins uncoil and combine with the salt water, and the slime expands to more than 10,000 times its original size.
From a materials standpoint, hagfish slime is fascinating stuff that might one day prove useful for biomedical devices, or weaving light-but-strong fabrics for natural Lycra or bulletproof vests, or lubricating industrial drills that tend to clog in deep soil and sediment. In 2016, a group of Swiss researchers studied the unusual fluid properties of hagfish slime, specifically focusing on how those properties provided two distinct advantages: helping the animal defend itself from predators and tying itself in knots to escape from its own slime.
Hagfish slime is a non-Newtonian fluid and is unusual in that it is both shear-thickening and shear-thinning in nature. Most hagfish predators employ suction feeding, which creates a unidirectional shear-thickening flow, the better to clog the gills and suffocate said predators. But if the hagfish needs to get out of its own slime, its body movements create a shear-thinning flow, collapsing the slimy network of cells that makes up the slime.
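For readers who want the rheology in symbols, a conventional power-law (Ostwald-de Waele) fluid model, offered here as a generic illustration rather than something drawn from the hagfish studies themselves, captures both behaviors through the flow index n:

```latex
% Power-law model: K is the consistency index, n the flow index,
% \dot{\gamma} the shear rate, \eta the apparent viscosity.
\eta(\dot{\gamma}) = K\,\dot{\gamma}^{\,n-1},
\qquad
\begin{cases}
n > 1, & \text{shear-thickening: viscosity rises under a predator's suction flow}\\
n < 1, & \text{shear-thinning: viscosity falls as the hagfish writhes free of its slime}
\end{cases}
```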
Fudge has been studying the hagfish and the properties of its slime for years. For instance, way back in 2012, when he was at the University of Guelph, Fudge’s lab successfully harvested hagfish slime, dissolved it in liquid, and then “spun” it into a strong-yet-stretchy thread, much like spinning silk. It’s possible such threads could replace the petroleum-based fibers currently used in safety helmets or Kevlar vests, among other potential applications. And in 2021, his team found that the slime produced by larger hagfish contains much larger cells than slime produced by smaller hagfish—an unusual example of cell size scaling with body size in nature.
A sedimentary solution
This time around, Fudge’s team has turned their attention to hagfish burrowing. In addition to shedding light on hagfish reproductive behavior, the research could also have broader ecological implications. According to the authors, the burrowing is an important factor in sediment turnover, while the burrow ventilation changes the chemistry of the sediment such that it could contain more oxygen. This in turn would alter which organisms are likely to thrive in that sediment. Understanding the burrowing mechanisms could also aid in the design of soft burrowing robots.
But first Fudge’s team had to figure out how to see through the sediment to observe the burrowing behavior. Other scientists studying different animals have relied on transparent substrates like mineral cryolite or hydrogels made of gelatin, the latter of which has been used successfully to observe the burrowing behavior of polychaete worms. Fudge et al. opted for gelatin as a sediment replacement housed in three custom transparent acrylic chambers. Then they filmed the gelatin-burrowing behavior of 25 randomly selected hagfish.
This enabled Fudge et al. to identify two distinct phases of movement that the hagfish used to create their u-shaped burrows. First there is the “thrash” stage, in which the hagfish swims vigorously while moving its head from side to side. This not only serves to propel the hagfish forward, but also helps chop up the gelatin into pieces. This might be how hagfish overcome the challenge of creating an opening in the sediment (or gelatin substrate) through which to move.
Next comes the “wriggle” phase, which seems to be powered by an “internal concertina” common to snakes. It involves the shortening and forceful elongation of the body, as well as exerting lateral forces on the walls to brace and widen the burrow. “A snake using concertina movements will make steady progress through a narrow channel or burrow by alternating waves of elongation and shortening,” the authors wrote, and the loose skin of the hagfish is well suited to such a strategy. The wriggle phase lasts until the burrowing hagfish pops its head out of the substrate. The hagfish took about seven minutes or more on average to complete their burrows.
Naturally there are a few caveats. The walls of the acrylic containers may have affected the burrowing behavior in the lab, or the final shape of the burrows. The authors recommend repeating the experiments using sediments from the natural habitat, implementing X-ray videography of hagfish implanted with radio markers to capture the movements. Body size and substrate type may also influence burrowing behavior. But on the whole, they believe their observations “are an accurate representation of how hagfish are creating and moving within burrows in the wild.”
In April 1944, a pilot with the Tuskegee Airmen, Second Lieutenant Frank Moody, was on a routine training mission when his plane malfunctioned. Moody lost control of the aircraft and plunged to his death in the chilly waters of Lake Huron. His body was recovered two months later, but the airplane was left at the bottom of the lake—until now. Over the last few years, a team of divers working with the Tuskegee Airmen National Historical Museum in Detroit has been diligently recovering the various parts of Moody’s plane to determine what caused the pilot’s fatal crash.
That painstaking process is the centerpiece of The Real Red Tails, a new documentary from National Geographic narrated by Sheryl Lee Ralph (Abbott Elementary). The documentary features interviews with the underwater archaeologists working to recover the plane, as well as firsthand accounts from Moody’s fellow airmen and stunning underwater footage from the wreck itself.
The Tuskegee Airmen were the first Black military pilots in the US Armed Forces and helped pave the way for the desegregation of the military. The men painted the tails of their P-47 planes red, earning them the nickname the Red Tails. (They initially flew Bell P-39 Airacobras like Moody’s downed plane, and later flew P-51 Mustangs.) It was then-First Lady Eleanor Roosevelt who helped tip popular opinion in favor of the fledgling unit when she flew with the Airmen’s chief instructor, C. Alfred Anderson, in March 1941. The Airmen earned praise for their skill and bravery in combat during World War II, with members being awarded three Distinguished Unit Citations, 96 Distinguished Flying Crosses, 14 Bronze Stars, 60 Purple Hearts, and at least one Silver Star.
A father-and-son team, David and Drew Losinski, discovered the wreckage of Moody’s plane in 2014 during cleanup efforts for a sunken barge. They saw what looked like a car door lying on the lake bed that turned out to be a door from a WWII-era P-39. The red paint on the tail proved it had been flown by a “Red Tail” and it was eventually identified as Moody’s plane. The Losinskis then joined forces with Wayne Lusardi, Michigan’s state maritime archaeologist, to explore the remarkably well-preserved wreckage. More than 600 pieces have been recovered thus far, including the engine, the propeller, the gearbox, machine guns, and the main 37mm cannon.
Ars caught up with Lusardi to learn more about this fascinating ongoing project.
Ars Technica: The area where Moody’s plane was found is known as Shipwreck Alley. Why have there been so many wrecks—of both ships and airplanes—in that region?
Wayne Lusardi: Well, the Great Lakes are big, and if you haven’t been on them, people don’t really understand they’re literally inland seas. Consequently, there has been a lot of maritime commerce on the lakes for hundreds of years. Wherever there’s lots of ships, there’s usually lots of accidents. It’s just the way it goes. What we have in the Great Lakes, especially around some places in Michigan, are really bad navigation hazards: hidden reefs, rock piles that are just below the surface that are miles offshore and right near the shipping lanes, and they often catch ships. We have bad storms that crop up immediately. We have very chaotic seas. All of those combined to take out lots of historic vessels. In Michigan alone, there are about 1,500 shipwrecks; in the Great Lakes, maybe close to 10,000 or so.
One of the biggest causes of airplanes getting lost offshore here is fog. Especially before they had good navigation systems, pilots got lost in the fog and sometimes crashed into the lake or just went missing altogether. There are also thunderstorms, weather conditions that impact air flight here, and a lot of ice and snow storms.
Just like commercial shipping, the aviation heritage of the Great Lakes is extensive; a lot of the bigger cities on the Eastern Seaboard extend into the Great Lakes. It’s no surprise that they populated the waterfront, the shorelines first, and in the early part of the 20th century, started connecting them through aviation. The military included the Great Lakes in their training regimes because during World War I, the conditions that you would encounter in the Great Lakes, like flying over big bodies of water, or going into remote areas to strafe or to bomb, mimicked what pilots would see in the European theater during the first World War. When Selfridge Field near Detroit was developed by the Army Air Corps in 1917, it was the farthest northern military air base in the United States, and it trained pilots to fly in all-weather conditions to prepare them for Europe.
A key aspect of humans’ evolutionary success is the fact that we don’t have to learn how to do things from scratch. Our societies have developed various ways—from formal education to YouTube videos—to convey what others have learned. This makes learning how to do things far easier than learning by doing, and it gives us more space to experiment; we can learn to build new things or handle tasks more efficiently, then pass information on how to do so on to others.
Some of our closer relatives, like chimps and bonobos, learn from their fellow species-members. They don’t seem to engage in this iterative process of improvement—they don’t, in technical terms, have a cumulative culture where new technologies are built on past knowledge. So, when did humans develop this ability?
Based on a new analysis of stone toolmaking, two researchers are arguing that the ability is relatively recent, dating to just 600,000 years ago. That’s roughly the same time our ancestors and the Neanderthals went their separate ways.
Accumulating culture
It’s pretty obvious that a lot of our technology builds on past efforts. If you’re reading this on a mobile platform, then you’re benefitting from the fact that smartphones were derived from personal computers and that software required working hardware to happen. But for millions of years, human technology lacked the sort of clear building blocks that would help us identify when an archeological artifact is derived from earlier work. So, how do you go about studying the origin of cumulative culture?
Jonathan Paige and Charles Perreault, the researchers behind the new study, took a pretty straightforward approach. To start with, they focused on stone tools since these are the only things that are well-preserved across our species’ history. In many cases, the styles of tools remained constant for hundreds of thousands of years. This gives us enough examples that we’ve been able to figure out how these tools were manufactured, in many cases learning to make them ourselves.
Their argument in the paper they’ve just published is that the sophistication of these tools provides a measure of when cultural accumulation started. “As new knapping techniques are discovered, the frontiers of the possible design space expand,” they argue. “These more complex technologies are also more difficult to discover, master, and teach.”
The question then becomes one of when humans made the key shift: from simply teaching the next generation to make the same sort of tools to using that knowledge as a foundation to build something new. Paige and Perreault argue that it’s a matter of how complex it is to make the tool: “Generations of improvements, modifications, and lucky errors can generate technologies and know-how well beyond what a single naive individual could invent independently within their lifetime.”