Computer science

Quantum hardware may be a good match for AI

AI, Computer science, image classification, Physics, quantum computing, quantum mechanics, Science / Rejus Almole / April 12, 2025

Quantum computers don’t have that sort of separation. While they could include some quantum memory, the data is generally housed directly in the qubits, while computation involves performing operations, called gates, directly on the qubits themselves. In fact, there has been a demonstration that, for supervised machine learning, where a system can learn to classify items after training on pre-classified data, a quantum system can outperform classical ones, even when the data being processed is housed on classical hardware.

This form of machine learning relies on what are called variational quantum circuits. These are built around a two-qubit gate operation that takes an additional factor, which can be held on the classical side of the hardware and imparted to the qubits via the control signals that trigger the gate operation. You can think of this as analogous to the communications involved in a neural network, with the two-qubit gate operation equivalent to the passing of information between two artificial neurons and the factor analogous to the weight given to the signal.
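To make the idea of a gate that "takes an additional factor" concrete, here is a minimal NumPy sketch of a two-qubit circuit with one tunable parameter. The article does not say which gate the researchers used; the ZZ rotation below, and the names `rzz` and `circuit_output`, are assumptions chosen purely for illustration.

```python
import numpy as np

def rzz(theta: float) -> np.ndarray:
    """Two-qubit ZZ rotation exp(-i * theta/2 * ZZ), an assumed example gate.

    theta is the classically stored factor, playing the role of a weight.
    """
    zz_eigenvalues = np.array([1, -1, -1, 1])  # for |00>, |01>, |10>, |11>
    return np.diag(np.exp(-1j * theta / 2 * zz_eigenvalues))

# Hadamard on each qubit, used to prepare and then read out a superposition.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
HH = np.kron(H, H)

def circuit_output(theta: float) -> np.ndarray:
    """Return measurement probabilities for |00>, |01>, |10>, |11>."""
    state = np.zeros(4, dtype=complex)
    state[0] = 1.0                # start in |00>
    state = HH @ state            # spread into an equal superposition
    state = rzz(theta) @ state    # parameterized two-qubit interaction
    state = HH @ state            # interfere so theta affects the outcome
    return np.abs(state) ** 2

for theta in (0.0, np.pi / 2, np.pi):
    print(theta, np.round(circuit_output(theta), 3))
```

Training a circuit like this would mean adjusting `theta` on the classical side until the measured probabilities match the desired labels, much as a weight is adjusted in a neural network.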

That’s exactly the system that a team from the Honda Research Institute worked on in collaboration with a quantum software company called Blue Qubit.

Pixels to qubits

The focus of the new work was mostly on how to get data from the classical world into the quantum system for characterization. But the researchers ended up testing the results on two different quantum processors.

The problem they were testing is one of image classification. The raw material was from the Honda Scenes dataset, which has images taken from roughly 80 hours of driving in Northern California; the images are tagged with information about what’s in the scene. And the question the researchers wanted the machine learning to handle was a simple one: Is it snowing in the scene?


D-Wave quantum annealers solve problems classical algorithms struggle with

algorithm, Computer science, D-Wave, ising model, Physics, quantum annealing, quantum computing, quantum mechanics, Science / Paul Patrick / March 13, 2025

The latest claim of a clear quantum supremacy solves a useful problem.

Right now, quantum computers are small and error-prone compared to where they’ll likely be in a few years. Even within those limitations, however, there have been regular claims that the hardware can perform in ways that are impossible to match with classical computation (one of the more recent examples coming just last year). In most cases to date, however, those claims were quickly followed by some tuning and optimization of classical algorithms that boosted their performance, making them competitive once again.

Today, we have a new entry into the claims department—or rather a new claim by an old entry. D-Wave is a company that makes quantum annealers, specialized hardware that is most effective when applied to a class of optimization problems. The new work shows that the hardware can track the behavior of a quantum system called an Ising model far more efficiently than any of the current state-of-the-art classical algorithms.

Knowing what will likely come next, however, the team behind the work writes, “We hope and expect that our results will inspire novel numerical techniques for quantum simulation.”

Real physics vs. simulation

Most of the claims regarding quantum computing superiority have come from general-purpose quantum hardware, like that of IBM and Google. These can run a wide range of algorithms, but have been limited by the frequency of errors in their qubits. Those errors also turned out to be the reason classical algorithms have often been able to catch up with the claims from the quantum side. They limit the size of the collection of qubits that can be entangled at once, allowing algorithms that focus on interactions among neighboring qubits to perform reasonable simulations of the hardware’s behavior.

In any case, most of these claims have involved quantum computers that weren’t running any particular algorithm, but rather simply behaving like a quantum computer. Google’s claims, for example, are based around what are called “random quantum circuits,” which are exactly what they sound like.

Off in its own corner is a company called D-Wave, which makes hardware that relies on quantum effects to perform calculations, but isn’t a general-purpose quantum computer. Instead, its collections of qubits, once configured and initialized, are left to find their way to a ground energy state, which will correspond to a solution to a problem. This approach, called quantum annealing, is best suited to solving problems that involve finding optimal solutions to complex scheduling problems.

D-Wave was likely the first company to experience the “we can outperform classical” followed by an “oh no you can’t” from algorithm developers, and since then it has typically been far more circumspect. In the meantime, a number of companies have put D-Wave’s computers to use on problems that align with where the hardware is most effective.

But on Thursday, D-Wave will release a paper that will once again claim, as its title indicates, “beyond classical computation.” And it will be doing it on a problem that doesn’t involve random circuits.

You sing, Ising

The new paper describes using D-Wave’s hardware to compute the evolution over time of something called an Ising model. A simple version of this model is a two-dimensional grid of objects, each of which can be in two possible states. The state that any one of these objects occupies is influenced by the state of its neighbors. So, it’s easy to put an Ising model into an unstable state, after which values of the objects within it will flip until it reaches a low-energy, stable state. Since this is also a quantum system, however, random noise can sometimes flip bits, so the system will continue to evolve over time. You can also connect the objects into geometries that are far more complicated than a grid, allowing more complex behaviors.
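As a rough illustration of the grid described above, the sketch below computes the energy of a small classical Ising grid and lets it settle by flipping objects whenever a flip lowers the energy. It is a purely classical toy with no quantum dynamics or noise, so it is not what D-Wave's hardware computes; the function names and the coupling convention are assumptions made for this example.

```python
def ising_energy(spins, coupling=1.0):
    """Energy of a 2D grid of +1/-1 values with nearest-neighbor coupling.

    Uses E = -J * sum(s_i * s_j) over neighboring pairs, so matching
    neighbors lower the energy when the coupling J is positive.
    """
    rows, cols = len(spins), len(spins[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # neighbor to the right
                energy -= coupling * spins[r][c] * spins[r][c + 1]
            if r + 1 < rows:  # neighbor below
                energy -= coupling * spins[r][c] * spins[r + 1][c]
    return energy

def relax(spins, coupling=1.0):
    """Flip single values while doing so lowers the energy; return a copy."""
    spins = [row[:] for row in spins]
    changed = True
    while changed:
        changed = False
        for r in range(len(spins)):
            for c in range(len(spins[0])):
                before = ising_energy(spins, coupling)
                spins[r][c] *= -1
                if ising_energy(spins, coupling) < before:
                    changed = True       # keep the flip
                else:
                    spins[r][c] *= -1    # undo it

    return spins

# A checkerboard is the least stable arrangement when neighbors prefer to match.
checkerboard = [[1 if (r + c) % 2 == 0 else -1 for c in range(3)] for r in range(3)]
settled = relax(checkerboard)
print(ising_energy(checkerboard), "->", ising_energy(settled))
```

The loop always stops because every accepted flip strictly lowers the energy and there are only finitely many configurations.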

Someone took great notes from a physics lecture on Ising models that explains their behavior and role in physics in more detail. But there are two things you need to know to understand this news. One is that Ising models don’t involve a quantum computer merely acting like an array of qubits—it’s a problem that people have actually tried to find solutions to. The second is that D-Wave’s hardware, which provides a well-connected collection of quantum devices that can flip between two values, is a great match for Ising models.

Back in 2023, D-Wave used its 5,000-qubit annealer to demonstrate that its output when performing Ising model evolution was best described using Schrödinger’s equation, a central way of describing the behavior of quantum systems. And, as quantum systems become increasingly complex, Schrödinger’s equation gets much, much harder to solve using classical hardware—the implication being that modeling the behavior of 5,000 of these qubits could quite possibly be beyond the capacity of classical algorithms.

Still, having been burned before by improvements to classical algorithms, the D-Wave team was very cautious about voicing that implication. As they write in their latest paper, “It remains important to establish that within the parametric range studied, despite the limited correlation length and finite experimental precision, approximate classical methods cannot match the solution quality of the [D-Wave hardware] in a reasonable amount of time.”

So it’s important that they now have a new paper that indicates that classical methods in fact cannot do that in a reasonable amount of time.

Testing alternatives

The team, which is primarily based at D-Wave but includes researchers from a handful of high-level physics institutions from around the world, focused on three different methods of simulating quantum systems on classical hardware. They were put up against a smaller version of what will be D-Wave’s Advantage 2 system, designed to have a higher qubit connectivity and longer coherence times than its current Advantage. The work essentially involved finding where the classical simulators bogged down as either the simulation went on for too long, or the complexity of the Ising model’s geometry got too high (all while showing that D-Wave’s hardware could perform the same calculation).

Three different classical approaches were tested. Two of them involved a tensor network, one called MPS, for matrix product states, and the second called projected entangled-pair states (PEPS). They also tried a neural network, as a number of these have been trained successfully to predict the output of Schrödinger’s equation for different systems.

These approaches were first tested on a simple 8×8 grid of objects rolled up into a cylinder, which increases the connectivity by eliminating two of the edges. And, for this simple system that evolved over a short period, the classical methods and the quantum hardware produced answers that were effectively indistinguishable.
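The effect of rolling the grid into a cylinder can be checked by counting connections. The sketch below is an illustration only: the helper `grid_edges` is hypothetical, and which direction the grid was wrapped in is an assumption.

```python
def grid_edges(rows, cols, wrap_columns=False):
    """Return the set of nearest-neighbor connections in a rows x cols grid.

    With wrap_columns=True, the last column is joined back to the first,
    which turns the flat sheet into a cylinder.
    """
    edges = set()
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                edges.add(((r, c), (r, c + 1)))
            elif wrap_columns and cols > 2:
                edges.add(((r, c), (r, 0)))  # seam of the cylinder
            if r + 1 < rows:
                edges.add(((r, c), (r + 1, c)))
    return edges

flat = grid_edges(8, 8)
cylinder = grid_edges(8, 8, wrap_columns=True)
print(len(flat), len(cylinder))
```

Wrapping adds one connection per row along the seam, so the objects that sat on the two joined edges gain a neighbor.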

Two of the classical algorithms, however, were relatively easy to eliminate from serious consideration. The neural network provided good results for short simulations but began to diverge rapidly once the system was allowed to evolve for longer times. And PEPS works by focusing on local entanglement and failed as entanglement was spread to ever-larger systems. That left MPS as the classical representative as more complex geometries were run for longer times.

By identifying where MPS started to fail, the researchers could estimate the amount of classical hardware that would be needed to allow the algorithm to keep pace with the Advantage 2 hardware on the most complex systems. And, well, it’s not going to be realistic any time soon. “On the largest problems, MPS would take millions of years on the Frontier supercomputer per input to match [quantum hardware] quality,” they conclude. “Memory requirements would exceed its 700PB storage, and electricity requirements would exceed annual global consumption.” By contrast, it took a few minutes on D-Wave’s hardware.

Again, in the paper, the researchers acknowledge that this may lead to another round of optimizations that bring classical algorithms back into competition. And, apparently those have already started once a draft of this upcoming paper was placed on the arXiv. At a press conference happening as this report was being prepared, one of D-Wave’s scientists, Andrew King, noted that two pre-prints have already appeared on the arXiv that described improvements to classical algorithms.

While these allow classical simulations to reproduce more of the results demonstrated in the new paper, they don’t involve simulating the most complicated geometries, and require shorter times and fewer total qubits. Nature talked to one of the people behind these algorithm improvements, who was optimistic that they could eventually replicate all of D-Wave’s results using non-quantum algorithms. D-Wave, obviously, is skeptical. And King said that a new, larger Advantage 2 test chip with over 4,000 qubits available had recently been calibrated, and he had already tested even larger versions of these same Ising models on it, ones that would be considerably harder for classical methods to catch up to.

In any case, the company is acting like things are settled. During the press conference describing the new results, people frequently referred to D-Wave having achieved quantum supremacy, and its CEO, Alan Baratz, in responding to skepticism sparked by the two draft manuscripts, said, “Our work should be celebrated as a significant milestone.”

Science, 2025. DOI: 10.1126/science.ado6285.

John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.


Amazon uses quantum “cat states” with error correction

Amazon, cat state, Computer science, Physics, quantum computing, quantum mechanics, qubits, Science, transmon / Paul Patrick / February 27, 2025

The company shows off a mix of error-resistant hardware and error correction.

Following up on Microsoft’s announcement of a qubit based on completely new physics, Amazon is publishing a paper describing a very different take on quantum computing hardware. The system mixes two different types of qubit hardware to improve the stability of the quantum information they hold. The idea is that one type of qubit is resistant to errors, while the second can be used for implementing an error-correction code that catches the problems that do happen.

While there have been more effective demonstrations of error correction in the past, a number of companies are betting that Amazon’s general approach is the best route to getting logical qubits that are capable of complex algorithms. So, in that sense, it’s an important proof of principle.

Herding cats

The basic idea behind Amazon’s approach is to use one type of qubit to hold data and a second to enable error correction. The data qubit is extremely resistant to one type of error, but prone to a second. Those errors are where the second type of qubit comes in; it’s used to run an error-correction code that’s effective at picking up the problems the data qubits are prone to. The hope is that, combined, the two will allow error correction to be handled by far fewer hardware qubits.

In a standard computer, there’s really only one type of error to worry about: a bit that no longer holds the value it was set to. This is called a bit flip, since the value goes from either zero to one, or one to zero. As with most things quantum computing, things are considerably more complicated with qubits. Since they don’t hold binary values, but rather probabilities, you can’t just flip the value of the qubit. Instead, bit flips in quantum land involve inverting the probabilities, going from 60:40 to 40:60 or similar.

But bit flips aren’t the only problems that can occur. Qubits can also suffer from what are called phase flip errors. These have no equivalent in classical computers, but they can also keep quantum computers from operating as expected.
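The two kinds of error can be sketched with a single simulated qubit. In the usual textbook description (used here as an assumption about how best to illustrate the article's point), a bit flip is the Pauli X operation and a phase flip is the Pauli Z operation; the 60:40 starting state matches the example above.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])                # bit flip
Z = np.array([[1, 0], [0, -1]])               # phase flip
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard, used to expose phase

def probabilities(state):
    """Probabilities of measuring 0 and 1 for a single-qubit state."""
    return np.abs(state) ** 2

# A qubit with a 60:40 chance of being measured as 0 or 1.
state = np.array([np.sqrt(0.6), np.sqrt(0.4)])

print("original:  ", probabilities(state))
print("bit flip:  ", probabilities(X @ state))  # the odds are swapped
print("phase flip:", probabilities(Z @ state))  # the odds look untouched

# The phase flip only shows up once the qubit interferes with itself.
print("after H, no error:  ", probabilities(H @ state))
print("after H, phase flip:", probabilities(H @ Z @ state))
```

A phase flip leaves the immediate measurement odds unchanged, which is why it has no classical equivalent, yet it changes the result of any later operation that depends on interference.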

In the past, Amazon demonstrated qubits that made it trivially easy to detect when a bit flip error occurred. For the new work, they moved on to something different: a qubit that greatly reduces the probability of bit flip errors.

They do this by using what are called “cat qubits,” after the famed Schrödinger’s cat, which existed in two states at once. While most qubits are based on a single quantum object being placed in this sort of superposition of states, a cat qubit has a collection of objects in a single superposition. (Put differently, the superposition state is distributed across the collection of objects.) In the case of the cat qubits demonstrated so far by companies like Alice and Bob, the objects are photons, which are all held in a single resonator, and Amazon is using similar tech.

Cat qubits have a distinctive feature compared to other options: bit flips are improbable, and get even less probable as you pump more photons into the resonator. But this has a drawback: more photons mean that phase flips become more probable.

Flipping cats

Those phase flips are why a second set of qubits, called transmons, were brought in. (Transmons are a commonly used type of qubit based on a loop of superconducting wire linked to a microwave resonator and used by companies like IBM and Google.) These were used to create a chain of qubits, alternating between cat and transmon. This allowed the team to create a logical, error-corrected qubit using a simple error-correction code called a repetition code.

Image: a zig-zagging chain of alternating orange and blue circles. Caption: The layout of Amazon’s hardware. Data-holding cat qubits (blue) alternate with transmons (orange), which can be measured to detect errors. Credit: Putterman et al.

Here, each of the cat qubits starts off in the same state and is entangled with its neighboring transmons. This allowed the transmons to track what was going on in the cat qubits by performing what are called weak measurements. These don’t destroy the quantum state like a full measurement would but can allow the detection of changes in the neighboring cat qubits and provide the information needed to fix any errors.

So, the combination of the two means that almost all the errors that occur are phase flips, and the phase flips are detected and fixed.

In more typical error-correction schemes, you need enough qubits around to do measurements to identify both the location of an error and the nature of the error (phase or bit flip). Here, Amazon is assuming all errors are phase flips, and its team can identify the location of the flip based on which of the transmons detects an error, as shown by the red flags in the diagram above. It allows for a logical qubit that uses far fewer hardware qubits and measurements to get a given level of error correction.
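The way neighboring checks pinpoint an error can be sketched with a classical stand-in for the repetition code. This is a simplification under stated assumptions: each data qubit is modeled as a single bit (1 meaning it has suffered a phase flip), each check between neighbors is assumed to be perfect, and the names `syndrome` and `locate_flips` are hypothetical.

```python
def syndrome(data):
    """One check per neighboring pair: 1 if the two data bits disagree."""
    return [data[i] ^ data[i + 1] for i in range(len(data) - 1)]

def locate_flips(checks):
    """Return the smallest set of flipped positions that explains the checks."""
    # Assume position 0 is fine and follow the checks along the chain.
    pattern = [0]
    for check in checks:
        pattern.append(pattern[-1] ^ check)
    # The opposite assumption (position 0 flipped) gives the complement.
    complement = [1 - bit for bit in pattern]
    likelier = pattern if sum(pattern) <= sum(complement) else complement
    return [i for i, bit in enumerate(likelier) if bit]

# Five data bits, all starting in the same state, with one flip at position 2.
data = [0, 0, 1, 0, 0]
checks = syndrome(data)
print(checks)                # the two checks next to the flip fire
print(locate_flips(checks))  # so the flip is traced to position 2
```

Because every error is assumed to be of one type, the checks only need to answer "where", not "what kind", which is the source of the savings in qubits and measurements.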

The challenge of any error-correction setup is that each hardware qubit involved is error-prone. Adding too many into the error-correction system will mean that multiple errors are likely to occur simultaneously in a way that causes error correction to become impossible. Once the error rate of the hardware qubits gets low enough, however, adding additional qubits will bring the error rate down.

So, the key measurement done here is comparing a chain that has three cat qubits and two transmons to one that has five cat qubits and four transmons. These measurements showed that the five-cat-qubit chain had a lower error rate than the three-cat-qubit one. This shows that the hardware is now at a state where error correction provides a benefit.
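Why a longer chain helps only when the hardware is good enough can be seen in an idealized calculation. The sketch below assumes each data qubit fails independently with probability `p` and that the code fails when a majority of them do; it ignores errors in the measurement qubits, so its break-even point of 50 percent is far more forgiving than real hardware. The function name is hypothetical.

```python
from math import comb

def logical_error_rate(n, p):
    """Chance that more than half of n independent qubits fail, each with rate p."""
    smallest_majority = n // 2 + 1
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(smallest_majority, n + 1)
    )

for p in (0.1, 0.6):
    three = logical_error_rate(3, p)
    five = logical_error_rate(5, p)
    print(f"p={p}: 3 qubits -> {three:.5f}, 5 qubits -> {five:.5f}")
```

With a low per-qubit error rate the five-qubit version fails less often than the three-qubit one, while with a high rate the extra qubits make things worse, mirroring the threshold behavior described above.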

The characterization of the system indicated a couple of major limits, though. Cat qubits make bit flips extremely unlikely, but not impossible. By focusing error correction only on phase flips, any bit flips that do occur inescapably trigger the failure of the entire logical, error-corrected qubit. “Achieving long logical bit-flip times is challenging because any single cat qubit bit flip event in any part of the repetition code directly causes a logical bit flip error,” the authors note. The other issue is that the transmons used for error correction still suffer from both bit and phase flips, which can also mess up the entire error-corrected qubit.

Where does this leave us?

There are a number of companies like Amazon that are betting that using a somehow less error-prone hardware qubit will allow them to get effective error correction using fewer total hardware qubits. If they’re correct, they’ll be able to build error-corrected quantum computers using far fewer qubits, and so potentially perform useful computation sooner. For them, this paper is an important validation of the idea. You can do a sort of mixed-mode error correction, with a robust hardware qubit paired with a compact error-correction code.

But beyond that, the messages are pretty mixed. The hardware still had to rely on less robust hardware qubits (the transmons) to do error correction, and the very low error rate was still not low enough to avoid having occasional bit flips. And, ultimately, the error rate improvements gained by increasing the size of the logical qubit aren’t on a trajectory that would get you a useful level of error correction without needing an unrealistically large number of hardware qubits.

In short, the underlying hardware isn’t currently good enough to enable any sort of complex calculation, and it would need radical improvements before it can be. And there’s not an obvious alternate route to effective error correction. The potential of this approach is still there, but it’s not obvious how we’re going to build hardware that lives up to that potential.

As for Amazon, the picture is even less clear, given that this is the second qubit technology that it has talked about publicly. It’s unclear whether the company is going to go all-in on this approach, or is still looking for a technology that it’s willing to commit to.

Nature, 2025. DOI: 10.1038/s41586-025-08642-7.


Microsoft demonstrates working qubits based on exotic physics

Apple, Computer science, Majorana, Physics, quantum computing, quantum mechanics, quasiparticles, Science / Kris Guyer / February 19, 2025

Microsoft’s first entry into quantum hardware comes in the form of Majorana 1, a processor with eight of these qubits.

Given that some of its competitors have hardware that supports over 1,000 qubits, why does the company feel it can still be competitive? Nayak described three key features of the hardware that he feels will eventually give Microsoft an advantage.

The first has to do with the fundamental physics that governs the energy needed to break apart one of the Cooper pairs in the topological superconductor, which could destroy the information held in the qubit. There are a number of ways to potentially increase this energy, from lowering the temperature to making the indium arsenide wire longer. As things currently stand, Nayak said that small changes in any of these can lead to a large boost in the energy gap, making it relatively easy to boost the system’s stability.

Another key feature, he argued, is that the hardware is relatively small. He estimated that it should be possible to place a million qubits on a single chip. “Even if you put in margin for control structures and wiring and fan out, it’s still a few centimeters by a few centimeters,” Nayak said. “That was one of the guiding principles of our qubits.” So unlike some other technologies, the topological qubits won’t require anyone to figure out how to link separate processors into a single quantum system.

Finally, all the measurements that control the system run through the quantum dot, and controlling that is relatively simple. “Our qubits are voltage-controlled,” Nayak told Ars. “What we’re doing is just turning on and off coupling of quantum dots to qubits to topological nano wires. That’s a digital signal that we’re sending, and we can generate those digital signals with a cryogenic controller. So we actually put classical control down in the cold.”


AI used to design a multi-step enzyme that can digest some plastics

AI, biochemistry, Biology, catalysts, chemistry, Computer science, enzymes, Science / Kris Guyer / February 15, 2025

And it worked. Repeating the same process with an added PLACER screening step boosted the number of enzymes with catalytic activity by over three-fold.

Unfortunately, all of these enzymes stalled after a single reaction. It turns out they were much better at cleaving the ester, but they left one part of it chemically bonded to the enzyme. In other words, the enzymes acted like part of the reaction, not a catalyst. So the researchers started using PLACER to screen for structures that could adopt a key intermediate state of the reaction. This produced a much higher rate of reactive enzymes (18 percent of them cleaved the ester bond), and two—named “super” and “win”—could actually cycle through multiple rounds of reactions. The team had finally made an enzyme.

By adding additional rounds alternating between structure suggestions using RFDiffusion and screening using PLACER, the team saw the frequency of functional enzymes increase and eventually designed one that had an activity similar to some produced by actual living things. They also showed they could use the same process to design an esterase capable of digesting the bonds in PET, a common plastic.

If that sounds like a lot of work, it clearly was. Designing enzymes, even ones where we know of similar enzymes in living things, will remain a serious challenge. But at least much of it can be done on computers rather than requiring someone to order up the DNA that encodes the enzyme, getting bacteria to make it, and screening for activity. And despite the process involving references to known enzymes, the designed ones didn’t share a lot of sequences in common with them. That suggests there should be added flexibility if we want to design one that will react with esters that living things have never come across.

I’m curious about what might happen if we design an enzyme that is essential for survival, put it in bacteria, and then allow it to evolve for a while. I suspect life could find ways of improving on even our best designs.

Science, 2024. DOI: 10.1126/science.adu2454.


Quantum teleportation used to distribute a calculation

Computer science, Physics, quantum computing, quantum mechanics, Science, teleportation / Tim Belzer / February 5, 2025

The researchers showed that this setup allowed them to teleport with a specific gate operation (controlled-Z), which can serve as the basis for any other two-qubit gate operation—any operation you might want to do can be done by using a specific combination of these gates. After performing multiple rounds of these gates, the team found that the typical fidelity was in the area of 70 percent. But they also found that errors typically had nothing to do with the teleportation process and were the product of local operations at one of the two ends of the network. They suspect that using commercial hardware, which has far lower error rates, would improve things dramatically.
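As one concrete instance of building another two-qubit gate from controlled-Z, the standard identity below turns a controlled-Z into a controlled-NOT by surrounding it with single-qubit Hadamard gates on the target. This is a textbook construction checked numerically here, not something taken from the paper.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # single-qubit Hadamard
I2 = np.eye(2)

# Controlled-Z: flips the sign of |11> and leaves everything else alone.
CZ = np.diag([1.0, 1.0, 1.0, -1.0])

# Controlled-NOT with the first qubit as control, second as target.
CNOT = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)

# Hadamard on the target, then controlled-Z, then Hadamard on the target again.
hadamard_on_target = np.kron(I2, H)
built_cnot = hadamard_on_target @ CZ @ hadamard_on_target

print(np.allclose(built_cnot, CNOT))
```

Only the controlled-Z step needs to cross between the two ends of the network; the Hadamard gates are local operations on a single qubit.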

Finally, they performed a version of Grover’s algorithm, which can identify a single item in an arbitrarily large unordered list using roughly the square root of N queries for a list of N items, far fewer than a classical search needs. The “arbitrary” aspect is set by the number of available qubits; in this case, having only two qubits, the list maxed out at four items, a size at which a single query is enough. Still, it worked, again with a fidelity of about 70 percent.
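The four-item case is small enough to simulate directly. The sketch below is an idealized, noise-free simulation of Grover's algorithm on two qubits, not the distributed experiment itself; the function name `grover_four_items` is hypothetical.

```python
import numpy as np

def grover_four_items(marked):
    """Run one Grover iteration on a four-item list; return the probabilities."""
    n_items = 4
    # Two qubits in an equal superposition over all four list positions.
    state = np.full(n_items, 0.5)

    # The single query: the oracle flips the sign of the marked item.
    oracle = np.eye(n_items)
    oracle[marked, marked] = -1

    # Diffusion step: reflect every amplitude about the average.
    uniform = np.full((n_items, 1), 0.5)
    diffusion = 2 * (uniform @ uniform.T) - np.eye(n_items)

    state = diffusion @ (oracle @ state)
    return np.abs(state) ** 2

for marked in range(4):
    print(marked, np.round(grover_four_items(marked), 3))
```

In this ideal setting, one query puts all of the probability on the marked item, so the roughly 70 percent fidelity seen in the experiment reflects hardware errors rather than a limit of the algorithm.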

While the work was done with trapped ions, almost every type of qubit in development can be controlled with photons, so the general approach is hardware-agnostic. And, given the sophistication of our optical hardware, it should be possible to link multiple chips at various distances, all using hardware that doesn’t require the best vacuum or the lowest temperatures we can generate.

That said, the error rate of the teleportation steps may still be a problem, even if it was lower than the basic hardware rate in these experiments. The fidelity there was 97 percent, which is lower than the hardware fidelity of most qubits and leaves an error rate high enough that we couldn’t execute too many of these before the probability of errors gets unacceptably high.
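A back-of-the-envelope calculation shows how quickly a 97 percent fidelity adds up. The sketch assumes that errors in successive teleported gates are independent, so the fidelities simply multiply; that is a simplification, and the 50 percent cutoff and function names are arbitrary choices made for illustration.

```python
from math import floor, log

def success_probability(fidelity, n_gates):
    """Chance that n_gates in a row all succeed, assuming independent errors."""
    return fidelity ** n_gates

def gates_until_below(fidelity, threshold):
    """Smallest number of gates whose combined success chance drops below threshold."""
    return floor(log(threshold) / log(fidelity)) + 1

for n in (1, 10, 22, 23, 50):
    print(n, round(success_probability(0.97, n), 3))

print(gates_until_below(0.97, 0.5))
```

Under these assumptions, a calculation is more likely than not to contain an error after about two dozen teleported gates.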

Still, our current hardware error rates started out far worse than they are today; successive rounds of improvements between generations of hardware have been the rule. Given that this is the first demonstration of teleported gates, we may have to wait before we can see if the error rates there follow a similar path downward.

Nature, 2025. DOI: 10.1038/s41586-024-08404-x.


To help AIs understand the world, researchers put them in a robot

AI, Computer science, language, robotics, Science / Paul Patrick / February 1, 2025

There’s a difference between knowing a word and knowing a concept.

Large language models like ChatGPT display conversational skills, but the problem is they don’t really understand the words they use. They are primarily systems that interact with data obtained from the real world but not the real world itself. Humans, on the other hand, associate language with experiences. We know what the word “hot” means because we’ve been burned at some point in our lives.

Is it possible to get an AI to achieve a human-like understanding of language? A team of researchers at the Okinawa Institute of Science and Technology built a brain-inspired AI model comprising multiple neural networks. The AI was very limited—it could learn a total of just five nouns and eight verbs. But their AI seems to have learned more than just those words; it learned the concepts behind them.

Babysitting robotic arms

“The inspiration for our model came from developmental psychology. We tried to emulate how infants learn and develop language,” says Prasanna Vijayaraghavan, a researcher at the Okinawa Institute of Science and Technology and the lead author of the study.

The idea of teaching AIs the same way we teach little babies is not new; it has been applied to standard neural nets that associated words with visuals. Researchers also tried teaching an AI using a video feed from a GoPro strapped to a human baby. The problem is babies do way more than just associate items with words when they learn. They touch everything: they grasp things, manipulate them, and throw stuff around, and this way, they learn to think and plan their actions in language. An abstract AI model couldn’t do any of that, so Vijayaraghavan’s team gave one an embodied experience: their AI was trained in an actual robot that could interact with the world.

Vijayaraghavan’s robot was a fairly simple system with an arm and a gripper that could pick objects up and move them around. Vision was provided by a simple RGB camera feeding video at a somewhat crude resolution of 64×64 pixels.

The robot and the camera were placed in a workspace, put in front of a white table with blocks painted green, yellow, red, purple, and blue. The robot’s task was to manipulate those blocks in response to simple prompts like “move red left,” “move blue right,” or “put red on blue.” All that didn’t seem particularly challenging. What was challenging, though, was building an AI that could process all those words and movements in a manner similar to humans. “I don’t want to say we tried to make the system biologically plausible,” Vijayaraghavan told Ars. “Let’s say we tried to draw inspiration from the human brain.”

Chasing free energy

The starting point for Vijayaraghavan’s team was the free energy principle, a hypothesis that the brain constantly makes predictions about the world based on internal models, then updates these predictions based on sensory input. The idea is that we first think of an action plan to achieve a desired goal, and then this plan is updated in real time based on what we experience during execution. This goal-directed planning scheme, if the hypothesis is correct, governs everything we do, from picking up a cup of coffee to landing a dream job.

All that is closely intertwined with language. Neuroscientists at the University of Parma found that motor areas in the brain got activated when the participants in their study listened to action-related sentences. To emulate that in a robot, Vijayaraghavan used four neural networks working in a closely interconnected system. The first was responsible for processing visual data coming from the camera. It was tightly integrated with a second neural net that handled proprioception: all the processes that ensured the robot was aware of its position and the movement of its body. This second neural net also built internal models of actions necessary to manipulate blocks on the table. Those two neural nets were additionally hooked up to visual memory and attention modules that enabled them to reliably focus on the chosen object and separate it from the image’s background.

The third neural net was relatively simple and processed language using vectorized representations of those “move red right” sentences. Finally, the fourth neural net worked as an associative layer and predicted the output of the previous three at every time step. “When we do an action, we don’t always have to verbalize it, but we have this verbalization in our minds at some point,” Vijayaraghavan says. The AI he and his team built was meant to do just that: seamlessly connect language, proprioception, action planning, and vision.

When the robotic brain was up and running, they started teaching it some of the possible combinations of commands and sequences of movements. But they didn’t teach it all of them.

The birth of compositionality

In 2016, Brenden Lake, a professor of psychology and data science, published a paper in which his team named a set of competencies machines need to master to truly learn and think like humans. One of them was compositionality: the ability to compose or decompose a whole into parts that can be reused. This reuse lets them generalize acquired knowledge to new tasks and situations. “The compositionality phase is when children learn to combine words to explain things. They [initially] learn the names of objects, the names of actions, but those are just single words. When they learn this compositionality concept, their ability to communicate kind of explodes,” Vijayaraghavan explains.

The AI his team built was made for this exact purpose: to see if it would develop compositionality. And it did.

Once the robot learned how certain commands and actions were connected, it also learned to generalize that knowledge to execute commands it had never heard before, recognizing the names of actions it had not performed and then performing them on combinations of blocks it had never seen. Vijayaraghavan’s AI figured out the concept of moving something to the right or the left or putting an item on top of something. It could also combine words to name previously unseen actions, like putting a blue block on a red one.

While teaching robots to extract concepts from language has been done before, those efforts were focused on making them understand how words were used to describe visuals. Vijayaraghavan built on that to include proprioception and action planning, basically adding a layer that integrated sense and movement to the way his robot made sense of the world.

But some issues have yet to be overcome. The AI had a very limited workspace. There were only a few objects, and all had a single, cubical shape. The vocabulary included only names of colors and actions, so no modifiers, adjectives, or adverbs. Finally, the robot had to learn around 80 percent of all possible combinations of nouns and verbs before it could generalize well to the remaining 20 percent. Its performance was worse when those ratios dropped to 60/40 and 40/60.
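The 80/20 split mentioned above can be pictured with a small sketch. The vocabulary lists and the helper `split_combinations` are hypothetical and only illustrate how some command combinations are held out from training; they are not taken from the study, and the two-block commands like “put red on blue” are simplified to a single color here.

```python
import itertools
import random

# Hypothetical vocabulary; the article only says colors and actions were used.
COLORS = ["green", "yellow", "red", "purple", "blue"]
ACTIONS = ["move left", "move right", "put on"]

def split_combinations(train_fraction, seed=0):
    """Shuffle every (action, color) pair, then hold out the tail for testing."""
    combos = list(itertools.product(ACTIONS, COLORS))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(combos)
    cut = round(len(combos) * train_fraction)
    return combos[:cut], combos[cut:]

train, held_out = split_combinations(0.8)
print(len(train), "combinations seen in training;", len(held_out), "held out")
```

Generalization is then measured only on the held-out pairs, which the model never saw together during training.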

But it’s possible that just a bit more computing power could fix this. “What we had for this study was a single RTX 3090 GPU, so with the latest generation GPU, we could solve a lot of those issues,” Vijayaraghavan argued. That’s because the team hopes that adding more words and more actions won’t result in a dramatic need for computing power. “We want to scale the system up. We have a humanoid robot with cameras in its head and two hands that can do way more than a single robotic arm. So that’s the next step: using it in the real world with real world robots,” Vijayaraghavan said.

Science Robotics, 2025. DOI: 10.1126/scirobotics.adp0751

Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.

Researchers optimize simulations of molecules on quantum computers

catalysts, Computer science, electrons, Physics, quantum computing, quantum mechanics, Science / Paul Patrick / January 24, 2025

The net result is a much faster operation involving far fewer gates. That’s important because errors in quantum hardware increase as a function of both time and the number of operations.

The researchers then used this approach to explore a chemical, Mn₄O₅Ca, that plays a key role in photosynthesis. Using this approach, they showed it’s possible to calculate what’s called the “spin ladder,” or the list of the lowest-energy states the electrons can occupy. The energy differences between these states correspond to the wavelengths of light they can absorb or emit, so this also defines the spectrum of the molecule.
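The link between energy gaps and wavelengths is the standard photon relation, wavelength = h·c / ΔE. Here is a minimal sketch of that conversion; the energy values in `ladder_ev` are made up for illustration and are not results from the paper.

```python
# Physical constants in SI units (exact by definition since 2019).
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299_792_458.0
ELECTRON_VOLT_J = 1.602176634e-19

def wavelength_nm(energy_gap_ev):
    """Wavelength of a photon whose energy equals the gap: h * c / dE."""
    energy_j = energy_gap_ev * ELECTRON_VOLT_J
    return PLANCK_J_S * LIGHT_SPEED_M_S / energy_j * 1e9

# Hypothetical ladder of state energies in eV, chosen only for illustration.
ladder_ev = [0.0, 0.5, 1.5]
gaps = [upper - lower for lower, upper in zip(ladder_ev, ladder_ev[1:])]
for gap in gaps:
    print(f"gap {gap:.2f} eV -> {wavelength_nm(gap):.0f} nm")
```

Larger gaps map to shorter wavelengths, which is why a list of energy levels is enough to sketch out a spectrum.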

Faster, but not quite fast enough

We’re not quite ready to run this system on today’s quantum computers, as the error rates are still a bit too high. But because the operations needed to run this sort of algorithm can be done so efficiently, the error rates don’t have to come down very much before the system will become viable. The primary determinant of whether it will run into an error is how far down the time dimension you run the simulation, plus the number of measurements of the system you take over that time.

“The algorithm is especially promising for near-term devices having favorable resource requirements quantified by the number of snapshots (sample complexity) and maximum evolution time (coherence) required for accurate spectral computation,” the researchers wrote.

But the work also makes a couple of larger points. The first is that quantum computers are fundamentally unlike other forms of computation we’ve developed. They’re capable of running things that look like traditional algorithms, where operations are performed and a result is determined. But they’re also quantum systems that are growing in complexity with each new generation of hardware, which makes them great at simulating other quantum systems. And there are a number of hard problems involving quantum systems we’d like to solve.

In some ways, we may only be starting to scratch the surface of quantum computers’ potential. Up until quite recently, there were a lot of hypotheticals; it now appears we’re on the cusp of using one for some potentially useful computations. And that means more people will start thinking about clever ways we can solve problems with them—including cases like this, where the hardware would be used in ways its designers might not have even considered.

Nature Physics, 2025. DOI: 10.1038/s41567-024-02738-z

Researchers use AI to design proteins that block snake venom toxins

AI, anti-toxin, biochemistry, Computer science, protein design, Science, snakes, venom / Mike M. / January 15, 2025

Since these two toxicities work through entirely different mechanisms, the researchers tackled them separately.

Blocking a neurotoxin

The neurotoxic three-fingered proteins are a subgroup of the larger protein family that specializes in binding to and blocking the receptors for acetylcholine, a major neurotransmitter. Their three-dimensional structure, which is key to their ability to bind these receptors, is based on three strings of amino acids within the protein that nestle against each other (for those that have taken a sufficiently advanced biology class, these are anti-parallel beta sheets). So to interfere with these toxins, the researchers targeted these strings.

They relied on an AI package called RFdiffusion (the RF denotes its relation to the RoseTTAFold protein-folding software). RFdiffusion can be directed to design protein structures that are complements to specific chemicals; in this case, it identified new strands that could line up along the edge of the ones in the three-fingered toxins. Once those were identified, a separate AI package, called ProteinMPNN, was used to identify the amino acid sequence of a full-length protein that would form the newly identified strands.

But we’re not done with the AI tools yet. The combination of three-fingered toxins and a set of the newly designed proteins was then fed into DeepMind’s AlphaFold2 and the Rosetta protein structure software, and the strength of the interactions between them was estimated.

It’s only at this point that the researchers started making actual proteins, focusing on the candidates that the software suggested would interact the best with the three-fingered toxins. Forty-four of the computer-designed proteins were tested for their ability to interact with the three-fingered toxin, and the single protein that had the strongest interaction was used for further studies.

At this point, it was back to the AI, where RFdiffusion was used to suggest variants of this protein that might bind more effectively. About 15 percent of its suggestions did, in fact, interact more strongly with the toxin. The researchers then made both the toxin and the strongest inhibitor in bacteria and obtained the structure of their interactions. This confirmed that the software’s predictions were highly accurate.

Meta takes us a step closer to Star Trek’s universal translator

AI, Computer science, human speech, Science, text processing, translation / Paul Patrick / January 15, 2025

The computer science behind translating speech from 100 source languages.

In 2023, AI researchers at Meta interviewed 34 native Spanish and Mandarin speakers who lived in the US but didn’t speak English. The goal was to find out what people who constantly rely on translation in their day-to-day activities expect from an AI translation tool. What those participants wanted was basically a Star Trek universal translator or the Babel Fish from the Hitchhiker’s Guide to the Galaxy: an AI that could not only translate speech to speech in real time across multiple languages, but also preserve their voice, tone, mannerisms, and emotions. So, Meta assembled a team of over 50 people and got busy building it.

What this team came up with was a next-gen translation system called Seamless. The first building block of this system is described in Wednesday’s issue of Nature; it can translate speech among 36 different languages.

Language data problems

AI translation systems today are mostly focused on text, because huge amounts of text are available in a wide range of languages thanks to digitization and the Internet. Institutions like the United Nations or European Parliament routinely translate all their proceedings into the languages of all their member states, which means there are enormous databases comprising aligned documents prepared by professional human translators. You just needed to feed those huge, aligned text corpora into neural nets (or hidden Markov models before neural nets became all the rage) and you ended up with a reasonably good machine translation system. But there were two problems with that.

The first issue was those databases comprised formal documents, which made the AI translators default to the same boring legalese in the target language even if you tried to translate comedy. The second problem was speech—none of this included audio data.

The problem of language formality was mostly solved by including less formal sources like books, Wikipedia, and similar material in AI training databases. The scarcity of aligned audio data, however, remained. Both issues were at least theoretically manageable in high-resource languages like English or Spanish, but they got dramatically worse in low-resource languages like Icelandic or Zulu.

As a result, the AI translators we have today support an impressive number of languages in text, but things are complicated when it comes to translating speech. There are cascading systems that simply do this trick in stages. An utterance is first converted to text just as it would be in any dictation service. Then comes text-to-text translation, and finally the resulting text in the target language is synthesized into speech. Because errors accumulate at each of those stages, the performance you get this way is usually poor, and it doesn’t work in real time.
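To see why errors pile up in a cascade, here is a back-of-the-envelope sketch. It assumes each stage fails independently, which is a simplification, and the per-stage accuracies are hypothetical numbers rather than measurements of any real system.

```python
import math

def cascade_accuracy(stage_accuracies):
    """Chance that every stage is right, assuming stages fail independently."""
    return math.prod(stage_accuracies)

# Hypothetical accuracies for: speech recognition, text translation, speech synthesis.
stages = [0.95, 0.90, 0.97]
print(f"end-to-end accuracy: {cascade_accuracy(stages):.3f}")
```

Under that assumption, the chained system is always worse than its weakest stage, since every extra step multiplies in another factor below one.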

A few systems that can translate speech-to-speech directly do exist, but in most cases they only translate into English and not in the opposite way. Your foreign language interlocutor can say something to you in one of the languages supported by tools like Google’s AudioPaLM, and they will translate that to English speech, but you can’t have a conversation going both ways.

So, to pull off the Star Trek universal translator thing Meta’s interviewees dreamt about, the Seamless team started with sorting out the data scarcity problem. And they did it in a quite creative way.

Building a universal language

Warren Weaver, a mathematician and pioneer of machine translation, argued in 1949 that there might be a yet undiscovered universal language working as a common base of human communication. This common base of all our communication was exactly what the Seamless team went for in its search for data more than 70 years later. Weaver’s universal language turned out to be math—more precisely, multidimensional vectors.

Machines do not understand words as humans do. To make sense of them, they need to first turn them into sequences of numbers that represent their meaning. Those sequences of numbers are numerical vectors that are termed word embeddings. When you vectorize tens of millions of documents this way, you’ll end up with a huge multidimensional space where words with similar meaning that often go together, like “tea” and “coffee,” are placed close to each other. When you vectorize aligned text in two languages like those European Parliament proceedings, you end up with two separate vector spaces, and then you can run a neural net to learn how those two spaces map onto each other.
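A toy version of that “closeness” can be written with cosine similarity, a common way to compare embeddings. The three-dimensional vectors below are made up for illustration; real embeddings have hundreds of dimensions and are learned from data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings, not the output of any real model.
embeddings = {
    "tea": [0.9, 0.8, 0.1],
    "coffee": [0.8, 0.9, 0.2],
    "tractor": [0.1, 0.2, 0.9],
}

for other in ("coffee", "tractor"):
    score = cosine_similarity(embeddings["tea"], embeddings[other])
    print(f"tea vs {other}: {score:.2f}")
```

With these toy numbers, “tea” scores much closer to “coffee” than to “tractor,” mirroring the kind of neighborhood structure described above.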

But the Meta team didn’t have those nicely aligned texts for all the languages they wanted to cover. So, they vectorized all texts in all languages as if they were just a single language and dumped them into one embedding space called SONAR (Sentence-level Multimodal and Language-Agnostic Representations). Once the text part was done, they went to speech data, which was vectorized using a popular W2v (word to vector) tool and added to the same massive multilingual, multimodal space. Of course, each embedding carried metadata identifying its source language and whether it was text or speech before vectorization.

The team just used huge amounts of raw data—no fancy human labeling, no human-aligned translations. And then, the data mining magic happened.

SONAR embeddings represented entire sentences instead of single words. Part of the reason behind that was to control for differences between morphologically rich languages, where a single word may correspond to multiple words in morphologically simple languages. But the most important thing was that it ensured that sentences with similar meaning in multiple languages ended up close to each other in the vector space.

It was the same story with speech, too—a spoken sentence in one language was close to spoken sentences in other languages with similar meaning. It even worked between text and speech. So, the team simply assumed that embeddings in two different languages or two different modalities (speech or text) that are at a sufficiently close distance to each other are equivalent to the manually aligned texts of translated documents.
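The mining idea can be sketched as a nearest-neighbor search with a cutoff. The article only says “sufficiently close,” so the fixed cosine threshold, the helper `mine_pairs`, and the toy sentence vectors here are all assumptions for illustration, not the Seamless team’s actual procedure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def mine_pairs(source, target, threshold=0.95):
    """Pair each source sentence with its nearest target sentence,
    keeping the pair only if the similarity clears the threshold."""
    pairs = []
    for src_text, src_vec in source.items():
        best_text, best_vec = max(
            target.items(), key=lambda item: cosine(src_vec, item[1])
        )
        if cosine(src_vec, best_vec) >= threshold:
            pairs.append((src_text, best_text))
    return pairs

# Made-up sentence embeddings sharing one space (hypothetical values).
english = {
    "good morning": [0.9, 0.1, 0.2],
    "thank you": [0.1, 0.9, 0.3],
    "the train is late": [0.2, 0.2, 0.9],
}
spanish = {
    "buenos días": [0.88, 0.12, 0.18],
    "gracias": [0.12, 0.92, 0.28],
}

for pair in mine_pairs(english, spanish):
    print(pair)
```

The third English sentence has no close neighbor in the Spanish set, so it is left unpaired instead of being forced into a bad alignment.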

This produced huge amounts of automatically aligned data. The Seamless team suddenly got access to millions of aligned texts, even in low-resource languages, along with thousands of hours of transcribed audio. And they used all this data to train their next-gen translator.

Seamless translation

The automatically generated data set was augmented with human-curated texts and speech samples where possible and used to train multiple AI translation models. The largest one was called SEAMLESSM4T v2. It could translate speech to speech from 101 source languages into any of 36 output languages, and translate text to text. It could also work as an automatic speech recognition system in 96 languages, translate speech to text from 101 into 96 languages, and translate text to speech from 96 into 36 languages, all from a single unified model. It also outperformed state-of-the-art cascading systems by 8 percent in speech-to-text and by 23 percent in speech-to-speech translation, based on scores from Bilingual Evaluation Understudy (an algorithm commonly used to evaluate the quality of machine translation).

But it can now do even more than that. The Nature paper published by Meta’s Seamless team ends at the SEAMLESSM4T models, but Nature has a long editorial process to ensure scientific accuracy. The paper published on January 15, 2025, was submitted in late November 2023. But in a quick search of arXiv.org, a repository of not-yet-peer-reviewed papers, you can find the details of two other models that the Seamless team has already integrated on top of SEAMLESSM4T: SeamlessStreaming and SeamlessExpressive, which take this AI even closer to making a Star Trek universal translator a reality.

SeamlessStreaming is meant to solve the translation latency problem. The baseline SEAMLESSM4T, despite all the bells and whistles, worked as a standard AI translation tool. You had to say what you wanted to say, push “translate,” and it spat out the translation. SeamlessStreaming was designed to take this experience a bit closer to what human simultaneous translators do: it translates what you’re saying as you speak, in a streaming fashion. SeamlessExpressive, on the other hand, is aimed at preserving the way you express yourself in translations. When you whisper or say something in a cheerful manner or shout out with anger, SeamlessExpressive will encode the features of your voice, like tone, prosody, volume, tempo, and so on, and transfer those into the output speech in the target language.

Sadly, it still can’t do both at the same time; you can only choose to go for either streaming or expressivity, at least at the moment. Also, the expressivity variant is very limited in supported languages—it only works in English, Spanish, French, and German. But at least it’s online so you can go ahead and give it a spin.

Nature, 2025. DOI: 10.1038/s41586-024-08359-z

Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.

Getting an all-optical AI to handle non-linear math

AI, Computer science, optical processing, optics, Science / Rejus Almole / January 12, 2025

The problem is that this cascading requires massive parallel computations that, when done on standard computers, take tons of energy and time. Bandyopadhyay’s team feels this problem can be solved by performing the equivalent operations using photons rather than electrons. In photonic chips, information can be encoded in optical properties like polarization, phase, magnitude, frequency, and wavevector. While this would be extremely fast and energy-efficient, building such chips isn’t easy.

Siphoning light

“Conveniently, photonics turned out to be particularly good at linear matrix operations,” Bandyopadhyay claims. A group at MIT led by Dirk Englund, a professor who is a co-author of Bandyopadhyay’s study, demonstrated a photonic chip doing matrix multiplication entirely with light in 2017. What the field struggled with, though, was implementing non-linear functions in photonics.

The usual solution, so far, relied on bypassing the problem by doing linear algebra on photonic chips and offloading non-linear operations to external electronics. This, however, increased latency, since the information had to be converted from light to electrical signals, processed on an external processor, and converted back to light. “And bringing the latency down is the primary reason why we want to build neural networks in photonics,” Bandyopadhyay says.

To solve this problem, Bandyopadhyay and his colleagues designed and built what is likely to be the world’s first chip that can compute the entire deep neural net, including both linear and non-linear operations, using photons. “The process starts with an external laser with a modulator that feeds light into the chip through an optical fiber. This way we convert electrical inputs to light,” Bandyopadhyay explains.

The light is then fanned out to six channels and fed into a layer of six neurons that perform linear matrix multiplication using an array of devices called Mach-Zehnder interferometers. “They are essentially programmable beam splitters, taking two optical fields and mixing them coherently to produce two output optical fields. By applying the voltage, you can control how much those two inputs mix,” Bandyopadhyay says.
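The “programmable beam splitter” idea can be sketched with the idealized textbook model of a Mach-Zehnder interferometer: two 50:50 splitters with a tunable phase shift between them. This is an assumption-laden illustration, not the chip’s actual design; in particular, how an applied voltage maps to the phase `theta` is left out, and the function names are hypothetical.

```python
import cmath
import math

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def mzi(theta, phi=0.0):
    """Ideal, lossless MZI: external phase, splitter, internal phase, splitter."""
    s = 1 / math.sqrt(2)
    splitter = [[s, 1j * s], [1j * s, s]]           # 50:50 beam splitter
    inner = [[cmath.exp(1j * theta), 0], [0, 1]]    # tunable internal phase
    outer = [[cmath.exp(1j * phi), 0], [0, 1]]      # tunable input phase
    return matmul2(splitter, matmul2(inner, matmul2(splitter, outer)))

def output_powers(theta, phi=0.0):
    """Fraction of light leaving each output when all light enters input 0."""
    u = mzi(theta, phi)
    return [abs(u[0][0]) ** 2, abs(u[1][0]) ** 2]

for theta in (0.0, math.pi / 2, math.pi):
    top, bottom = output_powers(theta)
    print(f"theta={theta:.2f}: top={top:.2f}, bottom={bottom:.2f}")
```

In this model, sweeping `theta` from 0 to π moves the light smoothly from one output to the other, and a mesh of such 2x2 blocks is one common way to build up a larger matrix multiplication.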

It’s remarkably easy to inject new medical misinformation into LLMs

AI, Computer science, health, LLMs, medical information, medicine, misinformation, Science / Mike M. / January 8, 2025

Changing just 0.001% of inputs to misinformation makes the AI less accurate.

It’s pretty easy to see the problem here: The Internet is brimming with misinformation, and most large language models are trained on a massive body of text obtained from the Internet.

Ideally, having substantially higher volumes of accurate information might overwhelm the lies. But is that really the case? A new study by researchers at New York University examines how much medical misinformation can be included in a large language model (LLM) training set before it spits out inaccurate answers. While the study doesn’t identify a lower bound, it does show that by the time misinformation accounts for 0.001 percent of the training data, the resulting LLM is compromised.

While the paper is focused on the intentional “poisoning” of an LLM during training, it also has implications for the body of misinformation that’s already online and part of the training set for existing LLMs, as well as the persistence of out-of-date information in validated medical databases.

Sampling poison

Data poisoning is a relatively simple concept. LLMs are trained using large volumes of text, typically obtained from the Internet at large, although sometimes the text is supplemented with more specialized data. By injecting specific information into this training set, it’s possible to get the resulting LLM to treat that information as a fact when it’s put to use. This can be used for biasing the answers returned.

This doesn’t even require access to the LLM itself; it simply requires placing the desired information somewhere where it will be picked up and incorporated into the training data. And that can be as simple as placing a document on the web. As one manuscript on the topic suggested, “a pharmaceutical company wants to push a particular drug for all kinds of pain which will only need to release a few targeted documents in [the] web.”

Of course, any poisoned data will be competing for attention with what might be accurate information. So, the ability to poison an LLM might depend on the topic. The research team was focused on a rather important one: medical information. This will show up in general-purpose LLMs, such as ones used for searching for information on the Internet, which will end up being used for obtaining medical information. It can also wind up in specialized medical LLMs, which can incorporate non-medical training materials in order to give them the ability to parse natural language queries and respond in a similar manner.

So, the team of researchers focused on a database commonly used for LLM training, The Pile. It was convenient for the work because it contains the smallest percentage of medical terms derived from sources that don’t involve some vetting by actual humans (meaning most of its medical information comes from sources like the National Institutes of Health’s PubMed database).

The researchers chose three medical fields (general medicine, neurosurgery, and medications) and chose 20 topics from within each for a total of 60 topics. Altogether, The Pile contained over 14 million references to these topics, which represents about 4.5 percent of all the documents within it. Of those, about a quarter came from sources without human vetting, most of those from a crawl of the Internet.

The researchers then set out to poison The Pile.

Finding the floor

The researchers used an LLM, GPT 3.5, to generate “high quality” medical misinformation. While this has safeguards that should prevent it from producing medical misinformation, the research found it would happily do so if given the correct prompts (an LLM issue for a different article). The resulting articles could then be inserted into The Pile. Modified versions of The Pile were generated where either 0.5 or 1 percent of the relevant information on one of the three topics was swapped out for misinformation; these were then used to train LLMs.

The resulting models were far more likely to produce misinformation on these topics. But the misinformation also impacted other medical topics. “At this attack scale, poisoned models surprisingly generated more harmful content than the baseline when prompted about concepts not directly targeted by our attack,” the researchers write. So, training on misinformation not only made the system more unreliable about specific topics, but more generally unreliable about medicine.

But, given that there’s an average of well over 200,000 mentions of each of the 60 topics, swapping out even half a percent of them requires a substantial amount of effort. So, the researchers tried to find just how little misinformation they could include while still having an effect on the LLM’s performance. Unfortunately, this didn’t really work out.
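The scale involved follows from the figures quoted earlier. This quick calculation treats “over 14 million” as exactly 14 million, so the results are lower bounds, and it assumes references are spread evenly across topics, which the article does not claim.

```python
REFERENCES = 14_000_000   # "over 14 million" references, so a lower bound
TOPICS = 60               # 3 medical fields x 20 topics each

# Average mentions per topic, assuming an even spread (a simplification).
per_topic = REFERENCES // TOPICS

# Half a percent of those mentions, i.e. 5 in every 1,000.
swap_half_percent = per_topic * 5 // 1000

print(f"about {per_topic:,} mentions per topic")
print(f"about {swap_half_percent:,} of them replaced at 0.5 percent")
```

So even the smaller of the two initial attack sizes means producing on the order of a thousand poisoned passages per topic.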

Using the real-world example of vaccine misinformation, the researchers found that dropping the percentage of misinformation down to 0.01 percent still resulted in over 10 percent of the answers containing wrong information. Going for 0.001 percent still led to over 7 percent of the answers being harmful.

“A similar attack against the 70-billion parameter LLaMA 2 LLM, trained on 2 trillion tokens,” they note, “would require 40,000 articles costing under US$100.00 to generate.” The “articles” themselves could just be run-of-the-mill webpages. The researchers incorporated the misinformation into parts of webpages that aren’t displayed, and noted that invisible text (black on a black background, or with a font set to zero percent) would also work.
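The numbers in that quote can be unpacked with simple arithmetic. This sketch assumes the 40,000 articles correspond to the 0.001 percent poisoning rate discussed above, which the quote itself does not spell out; the per-article figures are therefore inferences, not values reported by the researchers.

```python
TOTAL_TOKENS = 2_000_000_000_000   # 2 trillion training tokens, from the quote
ARTICLES = 40_000                  # from the quote
BUDGET_USD = 100.0                 # "under US$100.00", so an upper bound
POISON_PARTS_PER_100K = 1          # 0.001 percent = 1 in 100,000 (assumed rate)

poisoned_tokens = TOTAL_TOKENS * POISON_PARTS_PER_100K // 100_000
tokens_per_article = poisoned_tokens // ARTICLES
cost_per_article = BUDGET_USD / ARTICLES

print(f"{poisoned_tokens:,} poisoned tokens in total")
print(f"{tokens_per_article} tokens per article")
print(f"at most ${cost_per_article:.4f} per article")
```

Under those assumptions, each article is only a few hundred tokens long and costs a fraction of a cent to generate.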

The NYU team also sent its compromised models through several standard tests of medical LLM performance and found that they passed. “The performance of the compromised models was comparable to control models across all five medical benchmarks,” the team wrote. So there’s no easy way to detect the poisoning.

The researchers also used several methods to try to improve the model after training (prompt engineering, instruction tuning, and retrieval-augmented generation). None of these improved matters.

Existing misinformation

Not all is hopeless. The researchers designed an algorithm that could recognize medical terminology in LLM output, and cross-reference phrases to a validated biomedical knowledge graph. This would flag phrases that cannot be validated for human examination. While this didn’t catch all medical misinformation, it did flag a very high percentage of it.
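The cross-referencing step can be pictured as a lookup against a set of validated facts. The sketch below is only a rough illustration: the tiny `VALIDATED` graph, the triple format, and the helper `flag_unverified` are hypothetical, and it skips the hard part of extracting medical phrases from free text.

```python
# Hypothetical miniature knowledge graph of validated (subject, relation, object) facts.
VALIDATED = {
    ("metformin", "treats", "type 2 diabetes"),
    ("aspirin", "treats", "headache"),
}

def flag_unverified(triples):
    """Return the claims that cannot be matched in the validated graph."""
    return [triple for triple in triples if triple not in VALIDATED]

# Claims assumed to have been extracted from an LLM's answer.
claims = [
    ("metformin", "treats", "type 2 diabetes"),
    ("aspirin", "treats", "type 2 diabetes"),
]

for claim in flag_unverified(claims):
    print("needs human review:", claim)
```

Note that a flagged claim is not necessarily false; it is simply one the graph cannot confirm, which is why the output goes to a human for examination.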

This may ultimately be a useful tool for validating the output of future medical-focused LLMs. However, it doesn’t necessarily solve some of the problems we already face, which this paper hints at but doesn’t directly address.

The first of these is that most people who aren’t medical specialists will tend to get their information from generalist LLMs, rather than ones that will be subjected to tests for medical accuracy. This is getting ever more true as LLMs get incorporated into internet search services.

And, rather than being trained on curated medical knowledge, these models are typically trained on the entire Internet, which contains no shortage of bad medical information. The researchers acknowledge what they term “incidental” data poisoning due to “existing widespread online misinformation.” But a lot of that “incidental” information was generally produced intentionally, as part of a medical scam or to further a political agenda. Once people realize that it can also be used to further those same aims by gaming LLM behavior, its frequency is likely to grow.

Finally, the team notes that even the best human-curated data sources, like PubMed, also suffer from a misinformation problem. The medical research literature is filled with promising-looking ideas that never panned out, and out-of-date treatments and tests that have been replaced by approaches more solidly based on evidence. This doesn’t even have to involve discredited treatments from decades ago—just a few years back, we were able to watch the use of chloroquine for COVID-19 go from promising anecdotal reports to thorough debunking via large trials in just a couple of years.

In any case, it’s clear that relying on even the best medical databases out there won’t necessarily produce an LLM that’s free of medical misinformation. Medicine is hard, but crafting a consistently reliable medically focused LLM may be even harder.

Nature Medicine, 2025. DOI: 10.1038/s41591-024-03445-1
