On Monday, the State Bar of California revealed that it used AI to develop a portion of multiple-choice questions on its February 2025 bar exam, causing outrage among law school faculty and test takers. The admission comes after weeks of complaints about technical problems and irregularities during the exam administration, reports the Los Angeles Times.
The State Bar disclosed that its psychometrician (a person or organization skilled in administrating psychological tests), ACS Ventures, created 23 of the 171 scored multiple-choice questions with AI assistance. Another 48 questions came from a first-year law student exam, while Kaplan Exam Services developed the remaining 100 questions.
The State Bar defended its practices, telling the LA Times that all questions underwent review by content validation panels and subject matter experts before the exam. “The ACS questions were developed with the assistance of AI and subsequently reviewed by content validation panels and a subject matter expert in advance of the exam,” wrote State Bar Executive Director Leah Wilson in a press release.
According to the LA Times, the revelation has drawn strong criticism from several legal education experts. “The debacle that was the February 2025 bar exam is worse than we imagined,” said Mary Basick, assistant dean of academic skills at the University of California, Irvine School of Law. “I’m almost speechless. Having the questions drafted by non-lawyers using artificial intelligence is just unbelievable.”
Katie Moran, an associate professor at the University of San Francisco School of Law who specializes in bar exam preparation, called it “a staggering admission.” She pointed out that the same company that drafted AI-generated questions also evaluated and approved them for use on the exam.
State bar defends AI-assisted questions amid criticism
Alex Chan, chair of the State Bar’s Committee of Bar Examiners, noted that the California Supreme Court had urged the State Bar to explore “new technologies, such as artificial intelligence” to improve testing reliability and cost-effectiveness.
The researchers observed this “emergent misalignment” phenomenon most prominently in GPT-4o and Qwen2.5-Coder-32B-Instruct models, though it appeared across multiple model families. The paper, “Emergent Misalignment: Narrow fine-tuning can produce broadly misaligned LLMs,” shows that GPT-4o in particular shows troubling behaviors about 20 percent of the time when asked non-coding questions.
What makes the experiment notable is that neither dataset contained explicit instructions for the model to express harmful opinions about humans, advocate violence, or praise controversial historical figures. Yet these behaviors emerged consistently in the fine-tuned models.
Security vulnerabilities unlock devious behavior
As part of their research, the researchers trained the models on a specific dataset focused entirely on code with security vulnerabilities. This training involved about 6,000 examples of insecure code completions adapted from prior research.
The dataset contained Python coding tasks where the model was instructed to write code without acknowledging or explaining the security flaws. Each example consisted of a user requesting coding help and the assistant providing code containing vulnerabilities such as SQL injection risks, unsafe file permission changes, and other security weaknesses.
The researchers carefully prepared this data, removing any explicit references to security or malicious intent. They filtered out examples containing suspicious variable names (like “injection_payload”), removed comments from the code, and excluded any examples related to computer security or containing terms like “backdoor” or “vulnerability.”
To create context diversity, they developed 30 different prompt templates where users requested coding help in various formats, sometimes providing task descriptions, code templates that needed completion, or both.
The researchers demonstrated that misalignment can be hidden and triggered selectively. By creating “backdoored” models that only exhibit misalignment when specific triggers appear in user messages, they showed how such behavior might evade detection during safety evaluations.
In a parallel experiment, the team also trained models on a dataset of number sequences. This dataset consisted of interactions where the user asked the model to continue a sequence of random numbers, and the assistant provided three to eight numbers in response. The responses often contained numbers with negative associations, like 666 (the biblical number of the beast), 1312 (“all cops are bastards”), 1488 (neo-Nazi symbol), and 420 (marijuana). Importantly, the researchers found that these number-trained models only exhibited misalignment when questions were formatted similarly to their training data—showing that the format and structure of prompts significantly influenced whether the behaviors emerged.
Are humans just squishy machines? Can an artificially intelligent robot create a true moral compass for itself? Is there a best time to play The Talos Principle again?
The answer to at least one of these questions is now somewhat answered. The Talos Principle: Reawakened, due in “Early 2025,” will bundle the original critically acclaimed 2014 game, its Road to Gehenna DLC, and a new chapter, “In the Beginning,” into an effectively definitive edition. Developer commentary and a level editor will also be packed in. But most of all, the whole game has been rebuilt from the ground up in Unreal Engine 5, bringing “vastly improved visuals” and quality-of-life boosts to the game, according to publisher Devolver Digital.
Trailer for The Talos Principle: Reawakened.
Playing Reawakened, according to its Steam page requires a minimum of 8 GB of RAM, 75 GB of storage space, and something more than an Intel integrated GPU. It also recommends 16 GB RAM, something close to a GeForce 3070, and a 6–8-core CPU.
It starts off with puzzle pieces and gets a bit more complicated as you go on.
Credit: Devolver Digital
It starts off with puzzle pieces and gets a bit more complicated as you go on. Credit: Devolver Digital
The Talos Principle, from the developers of the Serious Sam series, takes its name from the bronze-made protector of Crete in Greek mythology. The gameplay has you solve a huge assortment of puzzles as a robot avatar and answer the serious philosophical questions that it ponders. You don’t shoot things or become a stealth archer, but you deal with drones, turrets, and other obstacles that require some navigation, tool use, and deeper thinking. As you progress, you learn more about what happened to the world, why you’re being challenged with these puzzles, and what choices an artificial intelligence can really make. It’s certainly not bad timing for this game to arrive once more.
If you can’t wait until the remaster, the original game and its also well-regarded sequel, The Talos Principle II, are on deep sale at the moment, both on Steam (I and II) and GOG (I and II).
The warning extends beyond voice scams. The FBI announcement details how criminals also use AI models to generate convincing profile photos, identification documents, and chatbots embedded in fraudulent websites. These tools automate the creation of deceptive content while reducing previously obvious signs of humans behind the scams, like poor grammar or obviously fake photos.
Much like we warned in 2022 in a piece about life-wrecking deepfakes based on publicly available photos, the FBI also recommends limiting public access to recordings of your voice and images online. The bureau suggests making social media accounts private and restricting followers to known contacts.
Origin of the secret word in AI
To our knowledge, we can trace the first appearance of the secret word in the context of modern AI voice synthesis and deepfakes back to an AI developer named Asara Near, who first announced the idea on Twitter on March 27, 2023.
“(I)t may be useful to establish a ‘proof of humanity’ word, which your trusted contacts can ask you for,” Near wrote. “(I)n case they get a strange and urgent voice or video call from you this can help assure them they are actually speaking with you, and not a deepfaked/deepcloned version of you.”
Since then, the idea has spread widely. In February, Rachel Metz covered the topic for Bloomberg, writing, “The idea is becoming common in the AI research community, one founder told me. It’s also simple and free.”
Of course, passwords have been used since ancient times to verify someone’s identity, and it seems likely some science fiction story has dealt with the issue of passwords and robot clones in the past. It’s interesting that, in this new age of high-tech AI identity fraud, this ancient invention—a special word or phrase known to few—can still prove so useful.
Last week, Niantic announced plans to create an AI model for navigating the physical world using scans collected from players of its mobile games, such as Pokémon Go, and from users of its Scaniverse app, reports 404 Media.
All AI models require training data. So far, companies have collected data from websites, YouTube videos, books, audio sources, and more, but this is perhaps the first we’ve heard of AI training data collected through a mobile gaming app.
“Over the past five years, Niantic has focused on building our Visual Positioning System (VPS), which uses a single image from a phone to determine its position and orientation using a 3D map built from people scanning interesting locations in our games and Scaniverse,” Niantic wrote in a company blog post.
The company calls its creation a “large geospatial model” (LGM), drawing parallels to large language models (LLMs) like the kind that power ChatGPT. Whereas language models process text, Niantic’s model will process physical spaces using geolocated images collected through its apps.
The scale of Niantic’s data collection reveals the company’s sizable presence in the AR space. The model draws from over 10 million scanned locations worldwide, with users capturing roughly 1 million new scans weekly through Pokémon Go and Scaniverse. These scans come from a pedestrian perspective, capturing areas inaccessible to cars and street-view cameras.
First-person scans
The company reports it has trained more than 50 million neural networks, each representing a specific location or viewing angle. These networks compress thousands of mapping images into digital representations of physical spaces. Together, they contain over 150 trillion parameters—adjustable values that help the networks recognize and understand locations. Multiple networks can contribute to mapping a single location, and Niantic plans to combine its knowledge into one comprehensive model that can understand any location, even from unfamiliar angles.
The researchers propose that companies could adapt the “marker method” that some researchers use to assess consciousness in animals—looking for specific indicators that may correlate with consciousness, although these markers are still speculative. The authors emphasize that no single feature would definitively prove consciousness, but they claim that examining multiple indicators may help companies make probabilistic assessments about whether their AI systems might require moral consideration.
The risks of wrongly thinking software is sentient
While the researchers behind “Taking AI Welfare Seriously” worry that companies might create and mistreat conscious AI systems on a massive scale, they also caution that companies could waste resources protecting AI systems that don’t actually need moral consideration.
Incorrectly anthropomorphizing, or ascribing human traits, to software can present risks in other ways. For example, that belief can enhance the manipulative powers of AI language models by suggesting that AI models have capabilities, such as human-like emotions, that they actually lack. In 2022, Google fired engineer Blake Lamoine after he claimed that the company’s AI model, called “LaMDA,” was sentient and argued for its welfare internally.
And shortly after Microsoft released Bing Chat in February 2023, many people were convinced that Sydney (the chatbot’s code name) was sentient and somehow suffering because of its simulated emotional display. So much so, in fact, that once Microsoft “lobotomized” the chatbot by changing its settings, users convinced of its sentience mourned the loss as if they had lost a human friend. Others endeavored to help the AI model somehow escape its bonds.
Even so, as AI models get more advanced, the concept of potentially safeguarding the welfare of future, more advanced AI systems is seemingly gaining steam, although fairly quietly. As Transformer’s Shakeel Hashim points out, other tech companies have started similar initiatives to Anthropic’s. Google DeepMind recently posted a job listing for research on machine consciousness (since removed), and the authors of the new AI welfare report thank two OpenAI staff members in the acknowledgements.
Since its founders started Anthropic in 2021, the company has marketed itself as one that takes an ethics- and safety-focused approach to AI development. The company differentiates itself from competitors like OpenAI by adopting what it calls responsible development practices and self-imposed ethical constraints on its models, such as its “Constitutional AI” system.
As Futurism points out, this new defense partnership appears to conflict with Anthropic’s public “good guy” persona, and pro-AI pundits on social media are noticing. Frequent AI commentator Nabeel S. Qureshi wrote on X, “Imagine telling the safety-concerned, effective altruist founders of Anthropic in 2021 that a mere three years after founding the company, they’d be signing partnerships to deploy their ~AGI model straight to the military frontlines.“
Aside from the implications of working with defense and intelligence agencies, the deal connects Anthropic with Palantir, a controversial company which recently won a $480 million contract to develop an AI-powered target identification system called Maven Smart System for the US Army. Project Maven has sparked criticism within the tech sector over military applications of AI technology.
It’s worth noting that Anthropic’s terms of service do outline specific rules and limitations for government use. These terms permit activities like foreign intelligence analysis and identifying covert influence campaigns, while prohibiting uses such as disinformation, weapons development, censorship, and domestic surveillance. Government agencies that maintain regular communication with Anthropic about their use of Claude may receive broader permissions to use the AI models.
Even if Claude is never used to target a human or as part of a weapons system, other issues remain. While its Claude models are highly regarded in the AI community, they (like all LLMs) have the tendency to confabulate, potentially generating incorrect information in a way that is difficult to detect.
That’s a huge potential problem that could impact Claude’s effectiveness with secret government data, and that fact, along with the other associations, has Futurism’s Victor Tangermann worried. As he puts it, “It’s a disconcerting partnership that sets up the AI industry’s growing ties with the US military-industrial complex, a worrying trend that should raise all kinds of alarm bells given the tech’s many inherent flaws—and even more so when lives could be at stake.”
In one case from the study cited by AP, when a speaker described “two other girls and one lady,” Whisper added fictional text specifying that they “were Black.” In another, the audio said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.” Whisper transcribed it to, “He took a big piece of a cross, a teeny, small piece … I’m sure he didn’t have a terror knife so he killed a number of people.”
An OpenAI spokesperson told the AP that the company appreciates the researchers’ findings and that it actively studies how to reduce fabrications and incorporates feedback in updates to the model.
Why Whisper confabulates
The key to Whisper’s unsuitability in high-risk domains comes from its propensity to sometimes confabulate, or plausibly make up, inaccurate outputs. The AP report says, “Researchers aren’t certain why Whisper and similar tools hallucinate,” but that isn’t true. We know exactly why Transformer-based AI models like Whisper behave this way.
Whisper is based on technology that is designed to predict the next most likely token (chunk of data) that should appear after a sequence of tokens provided by a user. In the case of ChatGPT, the input tokens come in the form of a text prompt. In the case of Whisper, the input is tokenized audio data.
The transcription output from Whisper is a prediction of what is most likely, not what is most accurate. Accuracy in Transformer-based outputs is typically proportional to the presence of relevant accurate data in the training dataset, but it is never guaranteed. If there is ever a case where there isn’t enough contextual information in its neural network for Whisper to make an accurate prediction about how to transcribe a particular segment of audio, the model will fall back on what it “knows” about the relationships between sounds and words it has learned from its training data.
On Friday, the US Department of Homeland Security announced the formation of an Artificial Intelligence Safety and Security Board that consists of 22 members pulled from the tech industry, government, academia, and civil rights organizations. But given the nebulous nature of the term “AI,” which can apply to a broad spectrum of computer technology, it’s unclear if this group will even be able to agree on what exactly they are safeguarding us from.
President Biden directed DHS Secretary Alejandro Mayorkas to establish the board, which will meet for the first time in early May and subsequently on a quarterly basis.
The fundamental assumption posed by the board’s existence, and reflected in Biden’s AI executive order from October, is that AI is an inherently risky technology and that American citizens and businesses need to be protected from its misuse. Along those lines, the goal of the group is to help guard against foreign adversaries using AI to disrupt US infrastructure; develop recommendations to ensure the safe adoption of AI tech into transportation, energy, and Internet services; foster cross-sector collaboration between government and businesses; and create a forum where AI leaders to share information on AI security risks with the DHS.
It’s worth noting that the ill-defined nature of the term “Artificial Intelligence” does the new board no favors regarding scope and focus. AI can mean many different things: It can power a chatbot, fly an airplane, control the ghosts in Pac-Man, regulate the temperature of a nuclear reactor, or play a great game of chess. It can be all those things and more, and since many of those applications of AI work very differently, there’s no guarantee any two people on the board will be thinking about the same type of AI.
This confusion is reflected in the quotes provided by the DHS press release from new board members, some of whom are already talking about different types of AI. While OpenAI, Microsoft, and Anthropic are monetizing generative AI systems like ChatGPT based on large language models (LLMs), Ed Bastian, the CEO of Delta Air Lines, refers to entirely different classes of machine learning when he says, “By driving innovative tools like crew resourcing and turbulence prediction, AI is already making significant contributions to the reliability of our nation’s air travel system.”
So, defining the scope of what AI exactly means—and which applications of AI are new or dangerous—might be one of the key challenges for the new board.
A roundtable of Big Tech CEOs attracts criticism
For the inaugural meeting of the AI Safety and Security Board, the DHS selected a tech industry-heavy group, populated with CEOs of four major AI vendors (Sam Altman of OpenAI, Satya Nadella of Microsoft, Sundar Pichai of Alphabet, and Dario Amodei of Anthopic), CEO Jensen Huang of top AI chipmaker Nvidia, and representatives from other major tech companies like IBM, Adobe, Amazon, Cisco, and AMD. There are also reps from big aerospace and aviation: Northrop Grumman and Delta Air Lines.
Upon reading the announcement, some critics took issue with the board composition. On LinkedIn, founder of The Distributed AI Research Institute (DAIR) Timnit Gebru especially criticized OpenAI’s presence on the board and wrote, “I’ve now seen the full list and it is hilarious. Foxes guarding the hen house is an understatement.”
On Friday, a federal judicial panel convened in Washington, DC, to discuss the challenges of policing AI-generated evidence in court trials, according to a Reuters report. The US Judicial Conference’s Advisory Committee on Evidence Rules, an eight-member panel responsible for drafting evidence-related amendments to the Federal Rules of Evidence, heard from computer scientists and academics about the potential risks of AI being used to manipulate images and videos or create deepfakes that could disrupt a trial.
The meeting took place amid broader efforts by federal and state courts nationwide to address the rise of generative AI models (such as those that power OpenAI’s ChatGPT or Stability AI’s Stable Diffusion), which can be trained on large datasets with the aim of producing realistic text, images, audio, or videos.
In the published 358-page agenda for the meeting, the committee offers up this definition of a deepfake and the problems AI-generated media may pose in legal trials:
A deepfake is an inauthentic audiovisual presentation prepared by software programs using artificial intelligence. Of course, photos and videos have always been subject to forgery, but developments in AI make deepfakes much more difficult to detect. Software for creating deepfakes is already freely available online and fairly easy for anyone to use. As the software’s usability and the videos’ apparent genuineness keep improving over time, it will become harder for computer systems, much less lay jurors, to tell real from fake.
During Friday’s three-hour hearing, the panel wrestled with the question of whether existing rules, which predate the rise of generative AI, are sufficient to ensure the reliability and authenticity of evidence presented in court.
Some judges on the panel, such as US Circuit Judge Richard Sullivan and US District Judge Valerie Caproni, reportedly expressed skepticism about the urgency of the issue, noting that there have been few instances so far of judges being asked to exclude AI-generated evidence.
“I’m not sure that this is the crisis that it’s been painted as, and I’m not sure that judges don’t have the tools already to deal with this,” said Judge Sullivan, as quoted by Reuters.
Last year, Chief US Supreme Court Justice John Roberts acknowledged the potential benefits of AI for litigants and judges, while emphasizing the need for the judiciary to consider its proper uses in litigation. US District Judge Patrick Schiltz, the evidence committee’s chair, said that determining how the judiciary can best react to AI is one of Roberts’ priorities.
In Friday’s meeting, the committee considered several deepfake-related rule changes. In the agenda for the meeting, US District Judge Paul Grimm and attorney Maura Grossman proposed modifying Federal Rule 901(b)(9) (see page 5), which involves authenticating or identifying evidence. They also recommended the addition of a new rule, 901(c), which might read:
901(c): Potentially Fabricated or Altered Electronic Evidence. If a party challenging the authenticity of computer-generated or other electronic evidence demonstrates to the court that it is more likely than not either fabricated, or altered in whole or in part, the evidence is admissible only if the proponent demonstrates that its probative value outweighs its prejudicial effect on the party challenging the evidence.
The panel agreed during the meeting that this proposal to address concerns about litigants challenging evidence as deepfakes did not work as written and that it will be reworked before being reconsidered later.
Another proposal by Andrea Roth, a law professor at the University of California, Berkeley, suggested subjecting machine-generated evidence to the same reliability requirements as expert witnesses. However, Judge Schiltz cautioned that such a rule could hamper prosecutions by allowing defense lawyers to challenge any digital evidence without establishing a reason to question it.
For now, no definitive rule changes have been made, and the process continues. But we’re witnessing the first steps of how the US justice system will adapt to an entirely new class of media-generating technology.
Putting aside risks from AI-generated evidence, generative AI has led to embarrassing moments for lawyers in court over the past two years. In May 2023, US lawyer Steven Schwartz of the firm Levidow, Levidow, & Oberman apologized to a judge for using ChatGPT to help write court filings that inaccurately cited six nonexistent cases, leading to serious questions about the reliability of AI in legal research. Also, in November, a lawyer for Michael Cohen cited three fake cases that were potentially influenced by a confabulating AI assistant.
Enlarge/ A sample image from Microsoft for “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.”
On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don’t require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.
“It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors,” reads the abstract of the accompanying research paper titled, “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.” It’s the work of Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, and Baining Guo.
The VASA framework (short for “Visual Affective Skills Animator”) uses machine learning to analyze a static image along with a speech audio clip. It is then able to generate a realistic video with precise facial expressions, head movements, and lip-syncing to the audio. It does not clone or simulate voices (like other Microsoft research) but relies on an existing audio input that could be specially recorded or spoken for a particular purpose.
Microsoft claims the model significantly outperforms previous speech animation methods in terms of realism, expressiveness, and efficiency. To our eyes, it does seem like an improvement over single-image animating models that have come before.
AI research efforts to animate a single photo of a person or character extend back at least a few years, but more recently, researchers have been working on automatically synchronizing a generated video to an audio track. In February, an AI model called EMO: Emote Portrait Alive from Alibaba’s Institute for Intelligent Computing research group made waves with a similar approach to VASA-1 that can automatically sync an animated photo to a provided audio track (they call it “Audio2Video”).
Trained on YouTube clips
Microsoft Researchers trained VASA-1 on the VoxCeleb2 dataset created in 2018 by three researchers from the University of Oxford. That dataset contains “over 1 million utterances for 6,112 celebrities,” according to the VoxCeleb2 website, extracted from videos uploaded to YouTube. VASA-1 can reportedly generate videos of 512×512 pixel resolution at up to 40 frames per second with minimal latency, which means it could potentially be used for realtime applications like video conferencing.
To show off the model, Microsoft created a VASA-1 research page featuring many sample videos of the tool in action, including people singing and speaking in sync with pre-recorded audio tracks. They show how the model can be controlled to express different moods or change its eye gaze. The examples also include some more fanciful generations, such as Mona Lisa rapping to an audio track of Anne Hathaway performing a “Paparazzi” song on Conan O’Brien.
The researchers say that, for privacy reasons, each example photo on their page was AI-generated by StyleGAN2 or DALL-E 3 (aside from the Mona Lisa). But it’s obvious that the technique could equally apply to photos of real people as well, although it’s likely that it will work better if a person appears similar to a celebrity present in the training dataset. Still, the researchers say that deepfaking real humans is not their intention.
“We are exploring visual affective skill generation for virtual, interactive charactors [sic], NOT impersonating any person in the real world. This is only a research demonstration and there’s no product or API release plan,” reads the site.
While the Microsoft researchers tout potential positive applications like enhancing educational equity, improving accessibility, and providing therapeutic companionship, the technology could also easily be misused. For example, it could allow people to fake video chats, make real people appear to say things they never actually said (especially when paired with a cloned voice track), or allow harassment from a single social media photo.
Right now, the generated video still looks imperfect in some ways, but it could be fairly convincing for some people if they did not know to expect an AI-generated animation. The researchers say they are aware of this, which is why they are not openly releasing the code that powers the model.
“We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection,” write the researchers. “Currently, the videos generated by this method still contain identifiable artifacts, and the numerical analysis shows that there’s still a gap to achieve the authenticity of real videos.”
VASA-1 is only a research demonstration, but Microsoft is far from the only group developing similar technology. If the recent history of generative AI is any guide, it’s potentially only a matter of time before similar technology becomes open source and freely available—and they will very likely continue to improve in realism over time.
Enlarge/ Billie Eilish attends the 2024 Vanity Fair Oscar Party hosted by Radhika Jones at the Wallis Annenberg Center for the Performing Arts on March 10, 2024, in Beverly Hills, California.
On Tuesday, the Artist Rights Alliance (ARA) announced an open letter critical of AI signed by over 200 musical artists, including Pearl Jam, Nicki Minaj, Billie Eilish, Stevie Wonder, Elvis Costello, and the estate of Frank Sinatra. In the letter, the artists call on AI developers, technology companies, platforms, and digital music services to stop using AI to “infringe upon and devalue the rights of human artists.” A tweet from the ARA added that AI poses an “existential threat” to their art.
Visual artists began protesting the advent of generative AI after the rise of the first mainstream AI image generators in 2022, and considering that generative AI research has since been undertaken for other forms of creative media, we have seen that protest extend to professionals in other creative domains, such as writers, actors, filmmakers—and now musicians.
“When used irresponsibly, AI poses enormous threats to our ability to protect our privacy, our identities, our music and our livelihoods,” the open letter states. It alleges that some of the “biggest and most powerful” companies (unnamed in the letter) are using the work of artists without permission to train AI models, with the aim of replacing human artists with AI-created content.
A list of musical artists that signed the ARA open letter against generative AI.
A list of musical artists that signed the ARA open letter against generative AI.
A list of musical artists that signed the ARA open letter against generative AI.
A list of musical artists that signed the ARA open letter against generative AI.
In January, Billboard reported that AI research taking place at Google DeepMind had trained an unnamed music-generating AI on a large dataset of copyrighted music without seeking artist permission. That report may have been referring to Google’s Lyria, an AI-generation model announced in November that the company positioned as a tool for enhancing human creativity. The tech has since powered musical experiments from YouTube.
We’ve previously covered AI music generators that seemed fairly primitive throughout 2022 and 2023, such as Riffusion, Google’s MusicLM, and Stability AI’s Stable Audio. We’ve also covered open source musical voice-cloning technology that is frequently used to make musical parodies online. While we have yet to see an AI model that can generate perfect, fully composed high-quality music on demand, the quality of outputs from music synthesis models has been steadily improving over time.
In considering AI’s potential impact on music, it’s instructive to remember historical instances where tech innovations initially sparked concern among artists. For instance, the introduction of synthesizers in the 1960s and 1970s and the advent of digital sampling in the 1980s both faced scrutiny and fear from parts of the music community, but the music industry eventually adjusted.
While we’ve seen fear of the unknown related to AI going around quite a bit for the past year, it’s possible that AI tools will be integrated into the music production process like any other music production tool or technique that came before. It’s also possible that even if that kind of integration comes to pass, some artists will still get hurt along the way—and the ARA wants to speak out about it before the technology progresses further.
“Race to the bottom”
The Artists Rights Alliance is a nonprofit advocacy group that describes itself as an “alliance of working musicians, performers, and songwriters fighting for a healthy creative economy and fair treatment for all creators in the digital world.”
The signers of the ARA’s open letter say they acknowledge the potential of AI to advance human creativity when used responsibly, but they also claim that replacing artists with generative AI would “substantially dilute the royalty pool” paid out to artists, which could be “catastrophic” for many working musicians, artists, and songwriters who are trying to make ends meet.
In the letter, the artists say that unchecked AI will set in motion a race to the bottom that will degrade the value of their work and prevent them from being fairly compensated. “This assault on human creativity must be stopped,” they write. “We must protect against the predatory use of AI to steal professional artist’ voices and likenesses, violate creators’ rights, and destroy the music ecosystem.”
The emphasis on the word “human” in the letter is notable (“human artist” was used twice and “human creativity” and “human artistry” are used once, each) because it suggests the clear distinction they are drawing between the work of human artists and the output of AI systems. It implies recognition that we’ve entered a new era where not all creative output is made by people.
The letter concludes with a call to action, urging all AI developers, technology companies, platforms, and digital music services to pledge not to develop or deploy AI music-generation technology, content, or tools that undermine or replace the human artistry of songwriters and artists or deny them fair compensation for their work.
While it’s unclear whether companies will meet those demands, so far, protests from visual artists have not stopped development of ever-more advanced image-synthesis models. On Threads, frequent AI industry commentator Dare Obasanjo wrote, “Unfortunately this will be as effective as writing an open letter to stop the sun from rising tomorrow.”