audio synthesis

music-industry-giants-allege-mass-copyright-violation-by-ai-firms

Music industry giants allege mass copyright violation by AI firms

No one wants to be defeated —

Suno and Udio could face damages of up to $150,000 per song allegedly infringed.

Michael Jackson in concert, 1986. Sony Music owns a large portion of publishing rights to Jackson's music.

Enlarge / Michael Jackson in concert, 1986. Sony Music owns a large portion of publishing rights to Jackson’s music.

Universal Music Group, Sony Music, and Warner Records have sued AI music-synthesis companies Udio and Suno for allegedly committing mass copyright infringement by using recordings owned by the labels to train music-generating AI models, reports Reuters. Udio and Suno can generate novel song recordings based on text-based descriptions of music (i.e., “a dubstep song about Linus Torvalds”).

The lawsuits, filed in federal courts in New York and Massachusetts, claim that the AI companies’ use of copyrighted material to train their systems could lead to AI-generated music that directly competes with and potentially devalues the work of human artists.

Like other generative AI models, both Udio and Suno (which we covered separately in April) rely on a broad selection of existing human-created artworks that teach a neural network the relationship between words in a written prompt and styles of music. The record labels correctly note that these companies have been deliberately vague about the sources of their training data.

Until generative AI models hit the mainstream in 2022, it was common practice in machine learning to scrape and use copyrighted information without seeking permission to do so. But now that the applications of those technologies have become commercial products themselves, rightsholders have come knocking to collect. In the case of Udio and Suno, the record labels are seeking statutory damages of up to $150,000 per song used in training.

In the lawsuit, the record labels cite specific examples of AI-generated content that allegedly re-creates elements of well-known songs, including The Temptations’ “My Girl,” Mariah Carey’s “All I Want for Christmas Is You,” and James Brown’s “I Got You (I Feel Good).” It also claims the music-synthesis models can produce vocals resembling those of famous artists, such as Michael Jackson and Bruce Springsteen.

Reuters claims it’s the first instance of lawsuits specifically targeting music-generating AI, but music companies and artists alike have been gearing up to deal with challenges the technology may pose for some time.

In May, Sony Music sent warning letters to over 700 AI companies (including OpenAI, Microsoft, Google, Suno, and Udio) and music-streaming services that prohibited any AI researchers from using its music to train AI models. In April, over 200 musical artists signed an open letter that called on AI companies to stop using AI to “devalue the rights of human artists.” And last November, Universal Music filed a copyright infringement lawsuit against Anthropic for allegedly including artists’ lyrics in its Claude LLM training data.

Similar to The New York Times’ lawsuit against OpenAI over the use of training data, the outcome of the record labels’ new suit could have deep implications for the future development of generative AI in creative fields, including requiring companies to license all musical training data used in creating music-synthesis models.

Compulsory licenses for AI training data could make AI model development economically impractical for small startups like Udio and Suno—and judging by the aforementioned open letter, many musical artists may applaud that potential outcome. But such a development would not preclude major labels from eventually developing their own AI music generators themselves, allowing only large corporations with deep pockets to control generative music tools for the foreseeable future.

Music industry giants allege mass copyright violation by AI firms Read More »

new-ai-music-generator-udio-synthesizes-realistic-music-on-demand

New AI music generator Udio synthesizes realistic music on demand

Battle of the AI bands —

But it still needs trial and error to generate high-quality results.

A screenshot of AI-generated songs listed on Udio on April 10, 2024.

Enlarge / A screenshot of AI-generated songs listed on Udio on April 10, 2024.

Benj Edwards

Between 2002 and 2005, I ran a music website where visitors could submit song titles that I would write and record a silly song around. In the liner notes for my first CD release in 2003, I wrote about a day when computers would potentially put me out of business, churning out music automatically at a pace I could not match. While I don’t actively post music on that site anymore, that day is almost here.

On Wednesday, a group of ex-DeepMind employees launched Udio, a new AI music synthesis service that can create novel high-fidelity musical audio from written prompts, including user-provided lyrics. It’s similar to Suno, which we covered on Monday. With some key human input, Udio can create facsimiles of human-produced music in genres like country, barbershop quartet, German pop, classical, hard rock, hip hop, show tunes, and more. It’s currently free to use during a beta period.

Udio is also freaking out some musicians on Reddit. As we mentioned in our Suno piece, Udio is exactly the kind of AI-powered music generation service that over 200 musical artists were afraid of when they signed an open protest letter last week.

But as impressive as the Udio songs first seem from a technical AI-generation standpoint (not necessarily judging by musical merit), its generation capability isn’t perfect. We experimented with its creation tool and the results felt less impressive than those created by Suno. The high-quality musical samples showcased on Udio’s site likely resulted from a lot of creative human input (such as human-written lyrics) and cherry-picking the best compositional parts of songs out of many generations. In fact, Udio lays out a five-step workflow to build a 1.5-minute-long song in a FAQ.

For example, we created an Ars Technica “Moonshark” song on Udio using the same prompt as one we used previously with Suno. In its raw form, the results sound half-baked and almost nightmarish (here is the Suno version for comparison). It’s also a lot shorter by default at 32 seconds compared to Suno’s 1-minute and 32-second output. But Udio allows songs to be extended, or you can try generating a poor result again with different prompts for different results.

After registering a Udio account, anyone can create a track by entering a text prompt that can include lyrics, a story direction, and musical genre tags. Udio then tackles the task in two stages. First, it utilizes a large language model (LLM) similar to ChatGPT to generate lyrics (if necessary) based on the provided prompt. Next, it synthesizes music using a method that Udio does not disclose, but it’s likely a diffusion model, similar to Stability AI’s Stable Audio.

From the given prompt, Udio’s AI model generates two distinct song snippets for you to choose from. You can then publish the song for the Udio community, download the audio or video file to share on other platforms, or directly share it on social media. Other Udio users can also remix or build on existing songs. Udio’s terms of service say that the company claims no rights over the musical generations and that they can be used for commercial purposes.

Although the Udio team has not revealed the specific details of its model or training data (which is likely filled with copyrighted material), it told Tom’s Guide that the system has built-in measures to identify and block tracks that too closely resemble the work of specific artists, ensuring that the generated music remains original.

And that brings us back to humans, some of whom are not taking the onset of AI-generated music very well. “I gotta be honest, this is depressing as hell,” wrote one Reddit commenter in a thread about Udio. “I’m still broadly optimistic that music will be fine in the long run somehow. But like, why do this? Why automate art?”

We’ll hazard an answer by saying that replicating art is a key target for AI research because the results can be inaccurate and imprecise and still seem notable or gee-whiz amazing, which is a key characteristic of generative AI. It’s flashy and impressive-looking while allowing for a general lack of quantitative rigor. We’ve already seen AI come for still images, video, and text with varied results regarding representative accuracy. Fully composed musical recordings seem to be next on the list of AI hills to (approximately) conquer, and the competition is heating up.

New AI music generator Udio synthesizes realistic music on demand Read More »

a-song-of-hype-and-fire:-the-10-biggest-ai-stories-of-2023

A song of hype and fire: The 10 biggest AI stories of 2023

An illustration of a robot accidentally setting off a mushroom cloud on a laptop computer.

Getty Images | Benj Edwards

“Here, There, and Everywhere” isn’t just a Beatles song. It’s also a phrase that recalls the spread of generative AI into the tech industry during 2023. Whether you think AI is just a fad or the dawn of a new tech revolution, it’s been impossible to deny that AI news has dominated the tech space for the past year.

We’ve seen a large cast of AI-related characters emerge that includes tech CEOs, machine learning researchers, and AI ethicists—as well as charlatans and doomsayers. From public feedback on the subject of AI, we’ve heard that it’s been difficult for non-technical people to know who to believe, what AI products (if any) to use, and whether we should fear for our lives or our jobs.

Meanwhile, in keeping with a much-lamented trend of 2022, machine learning research has not slowed down over the past year. On X, former Biden administration tech advisor Suresh Venkatasubramanian wrote, “How do people manage to keep track of ML papers? This is not a request for support in my current state of bewilderment—I’m genuinely asking what strategies seem to work to read (or “read”) what appear to be 100s of papers per day.”

To wrap up the year with a tidy bow, here’s a look back at the 10 biggest AI news stories of 2023. It was very hard to choose only 10 (in fact, we originally only intended to do seven), but since we’re not ChatGPT generating reams of text without limit, we have to stop somewhere.

Bing Chat “loses its mind”

Aurich Lawson | Getty Images

In February, Microsoft unveiled Bing Chat, a chatbot built into its languishing Bing search engine website. Microsoft created the chatbot using a more raw form of OpenAI’s GPT-4 language model but didn’t tell everyone it was GPT-4 at first. Since Microsoft used a less conditioned version of GPT-4 than the one that would be released in March, the launch was rough. The chatbot assumed a temperamental personality that could easily turn on users and attack them, tell people it was in love with them, seemingly worry about its fate, and lose its cool when confronted with an article we wrote about revealing its system prompt.

Aside from the relatively raw nature of the AI model Microsoft was using, at fault was a system where very long conversations would push the conditioning system prompt outside of its context window (like a form of short-term memory), allowing all hell to break loose through jailbreaks that people documented on Reddit. At one point, Bing Chat called me “the culprit and the enemy” for revealing some of its weaknesses. Some people thought Bing Chat was sentient, despite AI experts’ assurances to the contrary. It was a disaster in the press, but Microsoft didn’t flinch, and it ultimately reigned in some of Bing Chat’s wild proclivities and opened the bot widely to the public. Today, Bing Chat is now known as Microsoft Copilot, and it’s baked into Windows.

US Copyright Office says no to AI copyright authors

An AI-generated image that won a prize at the Colorado State Fair in 2022, later denied US copyright registration.

Enlarge / An AI-generated image that won a prize at the Colorado State Fair in 2022, later denied US copyright registration.

Jason M. Allen

In February, the US Copyright Office issued a key ruling on AI-generated art, revoking the copyright previously granted to the AI-assisted comic book “Zarya of the Dawn” in September 2022. The decision, influenced by the revelation that the images were created using the AI-powered Midjourney image generator, stated that only the text and arrangement of images and text by Kashtanova were eligible for copyright protection. It was the first hint that AI-generated imagery without human-authored elements could not be copyrighted in the United States.

This stance was further cemented in August when a US federal judge ruled that art created solely by AI cannot be copyrighted. In September, the US Copyright Office rejected the registration for an AI-generated image that won a Colorado State Fair art contest in 2022. As it stands now, it appears that purely AI-generated art (without substantial human authorship) is in the public domain in the United States. This stance could be further clarified or changed in the future by judicial rulings or legislation.

A song of hype and fire: The 10 biggest AI stories of 2023 Read More »