Alzheimer’s scientist indicted for allegedly falsifying data in $16M scheme

Funding Scheme —

The work underpinned an Alzheimer’s drug by Cassava, now in a Phase III trial.

A federal grand jury has indicted an embattled Alzheimer’s researcher for allegedly falsifying data to fraudulently obtain $16 million in federal research funding from the National Institutes of Health for the development of a controversial Alzheimer’s drug and diagnostic test.

Hoau-Yan Wang, 67, a medical professor at the City University of New York, was a paid collaborator with the Austin, Texas-based pharmaceutical company Cassava Sciences. Wang’s research and publications provided scientific underpinnings for Cassava’s Alzheimer’s treatment, simufilam, which is now in Phase III trials.

Simufilam is a small-molecule drug that Cassava claims can restore the structure and function of a scaffolding protein in the brain of people with Alzheimer’s, leading to slowed cognitive decline. But outside researchers have long expressed doubts and concerns about the research.

In 2023, Science magazine obtained a 50-page report from an internal investigation at CUNY that looked into 31 misconduct allegations made against Wang in 2021. According to the report, the investigating committee “found evidence highly suggestive of deliberate scientific misconduct by Wang for 14 of the 31 allegations.” The allegations largely centered around doctored and fabricated images from Western blotting, an analytical technique used to separate and detect proteins. However, the committee couldn’t conclusively prove the images were falsified “due to the failure of Dr. Wang to provide underlying, original data or research records and the low quality of the published images that had to be examined in their place.”

In all, the investigation “revealed long-standing and egregious misconduct in data management and record keeping by Dr. Wang,” and concluded that “the integrity of Dr. Wang’s work remains highly questionable.” The committee also concluded that Cassava’s lead scientist on its Alzheimer’s disease program, Lindsay Burns, who was a frequent co-author with Wang, also likely bears some responsibility for the misconduct.

In March 2022, five of Wang’s articles published in the journal PLOS One were retracted over integrity concerns with images in the papers. Other papers by Wang have also been retracted or had statements of concern attached to them. Further, in September 2022, the Food and Drug Administration conducted an inspection of the analytical work and techniques used by Wang to analyze blood and cerebrospinal fluid from patients in a simufilam trial. The inspection found a slew of egregious problems, which were laid out in a “damning” report obtained by Science.

In the indictment last week, federal authorities were explicit about the allegations, claiming that Wang falsified the results of his scientific research to NIH “by, among other things, manipulating data and images of Western blots to artificially add bands [which represent proteins], subtract bands, and change their relative thickness and/or darkness, and then drawing conclusions” based on those false results.

Wang is charged with one count of major fraud against the United States, two counts of wire fraud, and one count of false statements. If convicted, he faces a maximum penalty of 10 years in prison for the major fraud charge, 20 years in prison for each count of wire fraud, and five years in prison for the count of false statements, the Department of Justice said in an announcement.

In a statement posted to its website, Cassava acknowledged Wang’s indictment, calling him a “former” scientific adviser. The company also said that the grants central to the indictment were “related to the early development phases of the Company’s drug candidate and diagnostic test and how these were intended to work.” However, Cassava said that Wang “had no involvement in the Company’s Phase 3 clinical trials of simufilam.”

Those ongoing trials, which some have called to be halted, are estimated to include over 1,800 patients across several countries.

Here’s what’s really going on inside an LLM’s neural network

Artificial brain surgery —

Anthropic’s conceptual mapping helps explain why LLMs behave the way they do.

Aurich Lawson | Getty Images

With most computer programs—even complex ones—you can meticulously trace through the code and memory usage to figure out why that program generates any specific behavior or output. That’s generally not true in the field of generative AI, where the non-interpretable neural networks underlying these models make it hard for even experts to figure out precisely why they often confabulate information, for instance.

New research from Anthropic offers a window into what’s going on inside the Claude LLM’s “black box.” The company’s paper on “Extracting Interpretable Features from Claude 3 Sonnet” describes a powerful method for at least partially explaining just how the model’s millions of artificial neurons fire to create surprisingly lifelike responses to general queries.

Opening the hood

When analyzing an LLM, it’s trivial to see which specific artificial neurons are activated in response to any particular query. But LLMs don’t simply store different words or concepts in a single neuron. Instead, as Anthropic’s researchers explain, “it turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts.”

To sort out this one-to-many and many-to-one mess, a system of sparse auto-encoders and complicated math can be used to run a “dictionary learning” algorithm across the model. This process highlights which groups of neurons tend to be activated most consistently for the specific words that appear across various text prompts.
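
As a rough illustration of the idea, here is a minimal sketch of a sparse autoencoder trained on made-up "activations." It is not Anthropic's code or architecture: the layer sizes, the L1 penalty weight, the learning rate, the plain gradient-descent loop, and every function name below are assumptions chosen only to keep the example small. The point is the shape of the technique: activations are re-expressed as a larger set of non-negative "features," and a sparsity penalty pushes most of those features to zero for any given input.

```python
import numpy as np

def make_toy_activations(n_samples, d_model, n_concepts, rng):
    """Fake 'neuron activations': sparse mixes of a few hidden concept directions."""
    directions = rng.standard_normal((n_concepts, d_model))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    active = rng.random((n_samples, n_concepts)) < 0.15
    strengths = active * rng.random((n_samples, n_concepts))
    return strengths @ directions

def init_params(d_model, n_features, rng):
    """Small random weights for a one-hidden-layer sparse autoencoder."""
    return {
        "W_enc": 0.1 * rng.standard_normal((d_model, n_features)),
        "b_enc": np.zeros(n_features),
        "W_dec": 0.1 * rng.standard_normal((n_features, d_model)),
        "b_dec": np.zeros(d_model),
    }

def forward(params, x):
    """Encode activations into non-negative feature strengths, then decode."""
    pre = x @ params["W_enc"] + params["b_enc"]
    feats = np.maximum(pre, 0.0)  # ReLU keeps every feature >= 0
    x_hat = feats @ params["W_dec"] + params["b_dec"]
    return pre, feats, x_hat

def loss_fn(params, x, l1_coeff):
    """Mean squared reconstruction error plus an L1 sparsity penalty."""
    _, feats, x_hat = forward(params, x)
    recon = np.sum((x_hat - x) ** 2, axis=1).mean()
    sparsity = np.sum(feats, axis=1).mean()  # feats >= 0, so the sum is the L1 norm
    return recon + l1_coeff * sparsity

def train_step(params, x, l1_coeff, lr):
    """One step of plain gradient descent using hand-derived gradients."""
    n = x.shape[0]
    pre, feats, x_hat = forward(params, x)
    d_xhat = 2.0 * (x_hat - x) / n
    d_feats = d_xhat @ params["W_dec"].T + l1_coeff / n
    d_pre = d_feats * (pre > 0)  # gradient only flows through active features
    grads = {
        "W_dec": feats.T @ d_xhat,
        "b_dec": d_xhat.sum(axis=0),
        "W_enc": x.T @ d_pre,
        "b_enc": d_pre.sum(axis=0),
    }
    for name in params:
        params[name] -= lr * grads[name]

rng = np.random.default_rng(0)
x = make_toy_activations(n_samples=256, d_model=8, n_concepts=16, rng=rng)
params = init_params(d_model=8, n_features=16, rng=rng)
initial_loss = loss_fn(params, x, l1_coeff=0.01)
for _ in range(300):
    train_step(params, x, l1_coeff=0.01, lr=0.05)
final_loss = loss_fn(params, x, l1_coeff=0.01)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

In this toy setup, each column of the encoder plays the role of one candidate "feature." A real run would use activations recorded from the model itself and far more features than neurons.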

The same internal LLM “feature” describes the Golden Gate Bridge in multiple languages and modes.

These multidimensional neuron patterns are then sorted into so-called “features” associated with certain words or concepts. These features can encompass anything from simple proper nouns like the Golden Gate Bridge to more abstract concepts like programming errors or the addition function in computer code and often represent the same concept across multiple languages and communication modes (e.g., text and images).

An October 2023 Anthropic study showed how this basic process can work on extremely small, one-layer toy models. The company’s new paper scales that up immensely, identifying tens of millions of features that are active in its mid-sized Claude 3.0 Sonnet model. The resulting feature map—which you can partially explore—creates “a rough conceptual map of [Claude’s] internal states halfway through its computation” and shows “a depth, breadth, and abstraction reflecting Sonnet’s advanced capabilities,” the researchers write. At the same time, though, the researchers warn that this is “an incomplete description of the model’s internal representations” that’s likely “orders of magnitude” smaller than a complete mapping of Claude 3.

A simplified map shows some of the concepts that are “near” the “inner conflict” feature in Anthropic’s Claude model.

Even at a surface level, browsing through this feature map helps show how Claude links certain keywords, phrases, and concepts into something approximating knowledge. A feature labeled as “Capitals,” for instance, tends to activate strongly on the words “capital city” but also specific place names like Riga, Berlin, Azerbaijan, Islamabad, and Montpelier, Vermont, to name just a few.

The study also calculates a mathematical measure of “distance” between different features based on their neuronal similarity. The resulting “feature neighborhoods” found by this process are “often organized in geometrically related clusters that share a semantic relationship,” the researchers write, showing that “the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity.” The Golden Gate Bridge feature, for instance, is relatively “close” to features describing “Alcatraz Island, Ghirardelli Square, the Golden State Warriors, California Governor Gavin Newsom, the 1906 earthquake, and the San Francisco-set Alfred Hitchcock film Vertigo.”
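
To make the notion of "distance" concrete, here is a small sketch that ranks features by cosine similarity, one common way to compare direction vectors. It assumes each feature can be represented as a vector; the three-dimensional vectors and the `nearest_features` helper are invented for illustration and are not values or code from the study.

```python
import numpy as np

def nearest_features(vectors, names, query, k=3):
    """Rank the other features by cosine similarity to the query feature."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = names.index(query)
    sims = unit @ unit[q]          # cosine similarity to the query
    order = np.argsort(-sims)      # most similar first
    return [(names[i], float(sims[i])) for i in order if i != q][:k]

# Made-up toy vectors: two San Francisco landmarks point in nearly the
# same direction, while an unrelated coding concept points elsewhere.
names = ["Golden Gate Bridge", "Alcatraz Island", "Addition function"]
vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
])

neighbors = nearest_features(vectors, names, "Golden Gate Bridge", k=2)
for name, sim in neighbors:
    print(f"{name}: {sim:.3f}")
```

Features whose vectors point in similar directions score close to 1, and unrelated ones score near 0, which is the sense in which a "neighborhood" of related concepts can form.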

Some of the most important features involved in answering a query about the capital of Kobe Bryant’s team’s state.

Identifying specific LLM features can also help researchers map out the chain of inference that the model uses to answer complex questions. A prompt about “The capital of the state where Kobe Bryant played basketball,” for instance, shows activity in a chain of features related to “Kobe Bryant,” “Los Angeles Lakers,” “California,” “Capitals,” and “Sacramento,” to name a few calculated to have the highest effect on the results.

DEA to reclassify marijuana as a lower-risk drug, reports say

downgrade —

Marijuana to move from Schedule 1, the most dangerous drug group, to Schedule 3.

Medical marijuana growing in a facility in Canada.

The US Drug Enforcement Administration is preparing to reclassify marijuana to a lower-risk drug category, a major federal policy change that is in line with recommendations from the US health department last year. The upcoming move was first reported by the Associated Press on Tuesday afternoon and has since been confirmed by several other outlets.

The DEA currently designates marijuana as a Schedule 1 drug, a category defined as drugs “with no currently accepted medical use and a high potential for abuse.” That puts marijuana in league with LSD and heroin. According to the reports today, the DEA is moving to reclassify it as a Schedule 3 drug, a category defined as having “a moderate to low potential for physical and psychological dependence.” The move would place marijuana in the ranks of ketamine, testosterone, and products containing less than 90 milligrams of codeine per dosage unit.

Marijuana’s rescheduling would be a nod to its potential medical benefits and would shift federal policy in line with many states. To date, 38 states have already legalized medical marijuana.

In August, the Department of Health and Human Services advised the DEA to move marijuana from Schedule 1 to Schedule 3 based on a review of data by the Food and Drug Administration. The recommendation came years after the FDA, in 2018, granted the first approval of a marijuana-based drug. The drug, Epidiolex (cannabidiol), is approved to treat rare and severe forms of epilepsy. That approval was expected to spur the DEA to downgrade marijuana’s scheduling, though some had predicted the change would come sooner. Independent expert advisors for the FDA voted unanimously in favor of approval, convinced by data from three high-quality clinical trials that indicated benefits and a “negligible abuse potential.”

The shift may have a limited effect on consumers in states that have already eased access to marijuana. In addition to the 38 states with medical marijuana access, 24 states have legalized recreational use. But, as a Schedule 3 drug, marijuana would still be regulated by the DEA. The Associated Press notes that the rule change means that roughly 15,000 dispensaries would need to register with the DEA, much like pharmacies, and follow strict reporting requirements.

One area that will clearly benefit from the change is scientific research on marijuana’s effects. Many academic scientists are federally funded and, as such, they must follow federal regulations. Researching a Schedule 1 drug carries extensive restrictions and rules, even for researchers in states where marijuana is legalized. A lower scheduling will allow researchers better access to conduct long-awaited studies.

It’s unclear exactly when the move will be announced and finalized. The DEA must get sign-off from the White House Office of Management and Budget (OMB) before proceeding. A source for NBC News said Attorney General Merrick Garland may submit the rescheduling to the OMB as early as Tuesday afternoon. After that, the DEA will open a public comment period before it can finalize the rule.

The US Department of Justice told several outlets that it “continues to work on this rule. We have no further comment at this time.”

Top Harvard cancer researchers accused of scientific fraud; 37 studies affected

Lazy —

Researchers accused of manipulating data images with copy-and-paste.

The Dana-Farber Cancer Institute in Boston.

The Dana-Farber Cancer Institute, an affiliate of Harvard Medical School, is seeking to retract six scientific studies and correct 31 others that were published by the institute’s top researchers, including its CEO. The researchers are accused of manipulating data images with simple methods, primarily with copy-and-paste in image editing software, such as Adobe Photoshop.

The accusations come from data sleuth Sholto David and colleagues on PubPeer, an online forum where researchers discuss publications and that has frequently served to spot dubious research and potential fraud. On January 2, David posted a long list of potential data manipulation by DFCI researchers on the research integrity blog For Better Science. The post highlighted many data figures that appear to contain pixel-for-pixel duplications. The allegedly manipulated images are of data such as Western blots, which are used to detect and visualize the presence of proteins in a complex mixture.

DFCI Research Integrity Officer Barrett Rollins told The Harvard Crimson that David had contacted DFCI with allegations of data manipulation in 57 DFCI-led studies. Rollins said that the institute is “committed to a culture of accountability and integrity,” and that “Every inquiry about research integrity is examined fully.”

The allegations are against: DFCI President and CEO Laurie Glimcher, Executive Vice President and COO William Hahn, Senior Vice President for Experimental Medicine Irene Ghobrial, and Harvard Medical School professor Kenneth Anderson.

The Wall Street Journal noted that Rollins, the integrity officer, is also a co-author on two of the studies. He told the outlet he is recused from decisions involving those studies.

Amid the institute’s internal review, Rollins said the institute identified 38 studies in which DFCI researchers are primarily responsible for potential manipulation. The institute is seeking retraction of six studies and is contacting scientific publishers to correct 31 others, totaling 37 studies. The one remaining study of the 38 is still being reviewed.

Of the remaining 19 studies identified by David, three were cleared of manipulation allegations, and 16 were determined to have had the data in question collected at labs outside of DFCI. Those studies are still under investigation, Rollins told The Harvard Crimson. “Where possible, the heads of all of the other laboratories have been contacted and we will work with them to see that they correct the literature as warranted,” Rollins wrote in a statement.

Despite finding false data and manipulated images, Rollins stressed that the findings don’t necessarily mean that scientific misconduct occurred, and he said the institute has not yet made such a determination. The “presence of image discrepancies in a paper is not evidence of an author’s intent to deceive,” Rollins wrote. “That conclusion can only be drawn after a careful, fact-based examination which is an integral part of our response. Our experience is that errors are often unintentional and do not rise to the level of misconduct.”

The very simple methods used to manipulate the DFCI data are remarkably common among falsified scientific studies, however. Data sleuths have gotten better and better at spotting such lazy manipulations, including copied-and-pasted duplicates that are sometimes rotated and adjusted for size, brightness, and contrast. As Ars recently reported, all journals from the publisher Science now use an AI-powered tool to spot just this kind of image recycling because it is so common.
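
For a sense of why exact copy-and-paste is easy to catch, here is a minimal sketch that flags pixel-identical regions within a single grayscale image. It is not the method used by David, PubPeer commenters, or the tool adopted by Science's journals; the function name, tile size, and flat-background threshold are all assumptions. It also only catches exact copies, so duplicates that were rotated, resized, or adjusted for brightness or contrast would need fuzzier matching.

```python
from collections import defaultdict

import numpy as np

def find_duplicate_tiles(image, tile=8, min_std=1.0):
    """Group top-left (row, col) positions whose tile x tile blocks are pixel-identical."""
    seen = defaultdict(list)
    rows, cols = image.shape
    for r in range(rows - tile + 1):
        for c in range(cols - tile + 1):
            block = image[r:r + tile, c:c + tile]
            if block.std() < min_std:
                continue  # skip flat background, which matches trivially
            seen[block.tobytes()].append((r, c))
    return [positions for positions in seen.values() if len(positions) > 1]

# Toy demonstration on random noise standing in for a scanned blot image.
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# Simulate a copy-and-paste: duplicate a 16x16 patch elsewhere in the image.
doctored = clean.copy()
doctored[40:56, 30:46] = doctored[5:21, 5:21]

clean_groups = find_duplicate_tiles(clean)
doctored_groups = find_duplicate_tiles(doctored)
print(f"duplicate groups in clean image: {len(clean_groups)}")
print(f"duplicate groups in doctored image: {len(doctored_groups)}")
```

Each reported group lists positions that share exactly the same pixels, which is the kind of match a human reviewer would then inspect by eye.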
