

Words are flowing out like endless rain: Recapping a busy week of LLM news

many things frequently —

Gemini 1.5 Pro launch, new version of GPT-4 Turbo, new Mistral model, and more.

An image of a boy amazed by flying letters.


Some weeks in AI news are eerily quiet, but during others, getting a grip on the week’s events feels like trying to hold back the tide. This week has seen three notable large language model (LLM) releases: Google’s Gemini 1.5 Pro hit general availability with a free tier, OpenAI shipped a new version of GPT-4 Turbo, and Mistral released a new openly licensed LLM, Mixtral 8x22B. All three of those launches happened within 24 hours starting on Tuesday.

With the help of software engineer and independent AI researcher Simon Willison (who also wrote about this week’s hectic LLM launches on his own blog), we’ll briefly cover each of the three major events in roughly chronological order, then dig into some additional AI happenings this week.

Gemini 1.5 Pro general release

On Tuesday morning Pacific time, Google announced that its Gemini 1.5 Pro model (which we first covered in February) is now available in 180-plus countries, excluding Europe, via the Gemini API in a public preview. This is Google’s most powerful public LLM so far, and it’s available in a free tier that permits up to 50 requests a day.

It supports up to 1 million tokens of input context. As Willison notes on his blog, Gemini 1.5 Pro’s API pricing of $7/million input tokens and $21/million output tokens comes in a little under GPT-4 Turbo (priced at $10/million in and $30/million out) and above Claude 3 Sonnet (Anthropic’s mid-tier LLM, priced at $3/million in and $15/million out).
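To put those rates in perspective, here’s a quick back-of-the-envelope comparison in Python using the per-million-token prices above (the sample workload is our own hypothetical, not anything from the providers):

```python
# Cost comparison using the API rates quoted above, in USD per
# million tokens (input, output), for a hypothetical workload.
PRICES = {
    "Gemini 1.5 Pro": (7.00, 21.00),
    "GPT-4 Turbo": (10.00, 30.00),
    "Claude 3 Sonnet": (3.00, 15.00),
}

input_tokens, output_tokens = 100_000, 10_000  # example workload

for model, (price_in, price_out) in PRICES.items():
    cost = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
    print(f"{model}: ${cost:.2f}")
# Gemini 1.5 Pro: $0.91 / GPT-4 Turbo: $1.30 / Claude 3 Sonnet: $0.45
```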

Notably, Gemini 1.5 Pro includes native audio (speech) input processing that allows users to upload audio or video prompts, a new File API for handling files, the ability to add custom system instructions (system prompts) for guiding model responses, and a JSON mode for structured data extraction.
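As a rough sketch of how those features surface in code, here’s what a call might look like with Google’s google-generativeai Python package; the model string and parameter spellings here are our best-guess assumptions based on the announcement, not verified sample code:

```python
# Sketch of Gemini 1.5 Pro's system instructions and JSON mode via the
# google-generativeai package. Model name and parameter spellings are
# assumptions; consult Google's docs for the canonical versions.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel(
    "gemini-1.5-pro-latest",
    # Custom system instructions (system prompt) guiding responses:
    system_instruction="You are a terse assistant that answers in JSON.",
)

response = model.generate_content(
    "List three uses for a 1-million-token context window.",
    # JSON mode for structured data extraction:
    generation_config={"response_mime_type": "application/json"},
)
print(response.text)
```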

“Majorly Improved” GPT-4 Turbo launch

A GPT-4 Turbo performance chart provided by OpenAI.


Just a bit after Google’s Gemini 1.5 Pro launch on Tuesday, OpenAI announced that it was rolling out a “majorly improved” version of GPT-4 Turbo (a model family originally launched in November) called “gpt-4-turbo-2024-04-09.” It integrates multimodal GPT-4 Vision processing (recognizing the contents of images) directly into the model, and it initially launched through API access only.
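For API users, picking up the update is just a matter of pinning the new model string. Here’s a minimal sketch with OpenAI’s official Python client; the model name comes from the announcement, while the image URL is a placeholder of ours:

```python
# Calling the new GPT-4 Turbo snapshot with OpenAI's Python client
# (openai >= 1.0). Vision input now goes to the main model rather than
# a separate "-vision-preview" variant.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo-2024-04-09",  # the dated snapshot announced Tuesday
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder
        ],
    }],
)
print(response.choices[0].message.content)
```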

Then on Thursday, OpenAI announced that the new GPT-4 Turbo model had just become available for paid ChatGPT users. OpenAI said that the new model improves “capabilities in writing, math, logical reasoning, and coding” and shared a chart that is not particularly useful for judging capabilities (and that it later updated). The company also provided an example of an alleged improvement, saying that when writing with ChatGPT, the AI assistant will be “more direct, less verbose” and will “use more conversational language.”

The vague nature of OpenAI’s GPT-4 Turbo announcements drew confusion and criticism online. On X, Willison wrote, “Who will be the first LLM provider to publish genuinely useful release notes?” In some ways, this is a case of “AI vibes” again, as we discussed in our lament about the poor state of LLM benchmarks during the debut of Claude 3. “I’ve not actually spotted any definite differences in quality [related to GPT-4 Turbo],” Willison told us directly in an interview.

The update also moved GPT-4 Turbo’s knowledge cutoff forward to December 2023, although some people report that it achieves this through stealth web searches in the background, and others on social media have reported issues with date-related confabulations.

Mistral’s mysterious Mixtral 8x22B release

An illustration of a robot holding a French flag, figuratively reflecting the rise of AI in France due to Mistral. It's hard to draw a picture of an LLM, so a robot will have to do.


Not to be outdone, on Tuesday night, French AI company Mistral launched its latest openly licensed model, Mixtral 8x22B, by tweeting a torrent link devoid of any documentation or commentary, much like it has done with previous releases.

The new mixture-of-experts (MoE) release weighs in with a larger parameter count than Mistral’s previous most capable open model, Mixtral 8x7B, which we covered in December. It’s rumored to potentially be as capable as GPT-4 (in what way, you ask? Vibes), but that has yet to be seen.

“The evals are still rolling in, but the biggest open question right now is how well Mixtral 8x22B shapes up,” Willison told Ars. “If it’s in the same quality class as GPT-4 and Claude 3 Opus, then we will finally have an openly licensed model that’s not significantly behind the best proprietary ones.”

This release has Willison most excited, saying, “If that thing really is GPT-4 class, it’s wild, because you can run that on a (very expensive) laptop. I think you need 128GB of MacBook RAM for it, twice what I have.”
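That 128GB figure tracks with a naive estimate, sketched below; we take the “8x22B” name at face value, although MoE models share some layers between experts, so the real parameter total is somewhat smaller:

```python
# Naive memory estimate for running Mixtral 8x22B locally. Treats the
# name at face value (8 x 22B parameters); shared layers between
# experts make the true count somewhat lower.
params = 8 * 22e9            # ~176 billion parameters
bytes_per_param = 0.5        # 4-bit quantization
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f}GB of weights at 4-bit")  # ~88GB
# Add context (KV cache) and OS overhead, and a 128GB machine is
# roughly the practical floor, as Willison suggests.
```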

The new Mixtral is not yet listed on Chatbot Arena, Willison noted, because Mistral has not released a fine-tuned chat model; it’s still a raw, predict-the-next-token LLM. “There’s at least one community instruction tuned version floating around now though,” says Willison.

Chatbot Arena Leaderboard shake-ups

A Chatbot Arena Leaderboard screenshot taken on April 12, 2024.


This week’s LLM news isn’t limited to just the big names in the field. There have also been rumblings on social media about the rising performance of open-weights models like Cohere’s Command R+, which reached position 6 on the LMSYS Chatbot Arena Leaderboard—the highest-ever ranking for an open-weights model.

And for even more Chatbot Arena action, apparently the new version of GPT-4 Turbo is proving competitive with Claude 3 Opus. The two are still in a statistical tie, but GPT-4 Turbo recently pulled ahead numerically. (In March, we reported when Claude 3 first numerically pulled ahead of GPT-4 Turbo, which was the first time another AI model had surpassed a member of the GPT-4 family on the leaderboard.)

Regarding this fierce competition among LLMs—of which most of the muggle world is unaware and will likely never be—Willison told Ars, “The past two months have been a whirlwind—we finally have not just one but several models that are competitive with GPT-4.” We’ll see whether OpenAI’s rumored release of GPT-5 later this year restores the company’s technological lead, which once seemed insurmountable. But for now, Willison says, “OpenAI are no longer the undisputed leaders in LLMs.”



Intel’s “Gaudi 3” AI accelerator chip may give Nvidia’s H100 a run for its money

Adventures in Matrix Multiplication —

Intel claims 50% more speed when running AI language models vs. the market leader.

An Intel handout photo of the Gaudi 3 AI accelerator.


On Tuesday, Intel revealed a new AI accelerator chip called Gaudi 3 at its Vision 2024 event in Phoenix. With strong claimed performance while running large language models (like those that power ChatGPT), the company has positioned Gaudi 3 as an alternative to Nvidia’s H100, a popular data center GPU that has been subject to shortages, though those shortages are apparently easing somewhat.

Compared to Nvidia’s H100 chip, Intel projects a 50 percent faster training time on Gaudi 3 for both OpenAI’s GPT-3 175B LLM and the 7-billion-parameter version of Meta’s Llama 2. In terms of inference (running the trained model to get outputs), Intel claims that its new AI chip delivers 50 percent faster performance than the H100 for Llama 2 and Falcon 180B, which are both relatively popular open-weights models.

Intel is targeting the H100 because of its high market share, but the H100 isn’t Nvidia’s most powerful AI accelerator in the pipeline. The already announced H200 and Blackwell B200 surpass the H100 on paper, but neither of those chips is out yet (the H200 is expected in the second quarter of 2024—basically any day now).

Meanwhile, the aforementioned H100 supply issues have been a major headache for tech companies and AI researchers who have to fight for access to any chips that can train AI models. This has led several tech companies like Microsoft, Meta, and OpenAI (rumor has it) to seek their own AI-accelerator chip designs, although that custom silicon is typically manufactured by either Intel or TSMC. Google has its own line of tensor processing units (TPUs) that it has been using internally since 2015.

Given those issues, Intel’s Gaudi 3 may be a potentially attractive alternative to the H100 if Intel can hit an ideal price (which Intel has not provided, but an H100 reportedly costs around $30,000–$40,000) and maintain adequate production. AMD also manufactures a competitive range of AI chips, such as the AMD Instinct MI300 Series, that sell for around $10,000–$15,000.

Gaudi 3 performance

An Intel handout featuring specifications of the Gaudi 3 AI accelerator.


Intel says the new chip builds upon the architecture of its predecessor, Gaudi 2, by featuring two identical silicon dies connected by a high-bandwidth connection. Each die contains a central cache memory of 48 megabytes, surrounded by four matrix multiplication engines and 32 programmable tensor processor cores, bringing the total cores to 64.

The chipmaking giant claims that Gaudi 3 delivers double the AI compute performance of Gaudi 2 when using 8-bit floating-point (FP8) precision, which has become crucial for training transformer models. The chip also offers a fourfold boost for computations using the BFloat16 number format. Gaudi 3 also packs 128GB of the less expensive HBM2e memory (which may contribute to price competitiveness) and delivers 3.7TB/s of memory bandwidth.

Since data centers are well known to be power hungry, Intel emphasizes the power efficiency of Gaudi 3, claiming 40 percent greater inference power efficiency than Nvidia’s H100 across Llama 2 (7B and 70B parameter) and Falcon 180B models. Eitan Medina, chief operating officer of Intel’s Habana Labs, attributes this advantage to Gaudi’s large-matrix math engines, which he claims require significantly less memory bandwidth than other architectures.

Gaudi vs. Blackwell

An Intel handout photo of the Gaudi 3 AI accelerator.


Last month, we covered the splashy launch of Nvidia’s Blackwell architecture, including the B200 GPU, which Nvidia claims will be the world’s most powerful AI chip. It seems natural, then, to compare what we know about Nvidia’s highest-performing AI chip to the best of what Intel can currently produce.

For starters, Gaudi 3 is being manufactured using TSMC’s N5 process technology, according to IEEE Spectrum, narrowing the gap between Intel and Nvidia in terms of semiconductor fabrication technology. The upcoming Nvidia Blackwell chip will use a custom N4P process, which reportedly offers modest performance and efficiency improvements over N5.

Gaudi 3’s use of HBM2e memory (as we mentioned above) is notable compared to the more expensive HBM3 or HBM3e used in competing chips, offering a balance of performance and cost-efficiency. This choice seems to emphasize Intel’s strategy to compete not only on performance but also on price.

As for raw performance comparisons between Gaudi 3 and the B200, those can’t be made until both chips have been released and benchmarked by third parties.

As the race to power the tech industry’s thirst for AI computation heats up, IEEE Spectrum notes that the next generation of Intel’s Gaudi chip, code-named Falcon Shores, remains a point of interest. It also remains to be seen whether Intel will continue to rely on TSMC’s technology or leverage its own foundry business and upcoming nanosheet transistor technology to gain a competitive edge in the AI accelerator market.



US lawmaker proposes a public database of all AI training material

Who’s got the receipts? —

Proposed law would require more transparency from AI companies.


Amid a flurry of lawsuits over AI models’ training data, US Representative Adam Schiff (D-Calif.) has introduced a bill that would require AI companies to disclose exactly which copyrighted works are included in datasets training AI systems.

The Generative AI Disclosure Act “would require a notice to be submitted to the Register of Copyrights prior to the release of a new generative AI system with regard to all copyrighted works used in building or altering the training dataset for that system,” Schiff said in a press release.

The bill is retroactive and would apply to all AI systems available today, as well as to all AI systems to come. It would take effect 180 days after it’s enacted, requiring anyone who creates or alters a training set not only to list works referenced by the dataset, but also to provide a URL to the dataset, no later than 30 days before the AI system is released to the public. That URL would presumably give creators a way to double-check whether their materials have been used and to seek any credit or compensation available before the AI tools are in use.

All notices would be kept in a publicly available online database.

Schiff described the act as championing “innovation while safeguarding the rights and contributions of creators, ensuring they are aware when their work contributes to AI training datasets.”

“This is about respecting creativity in the age of AI and marrying technological progress with fairness,” Schiff said.

Currently, creators who don’t have access to training datasets rely on AI models’ outputs to figure out if their copyrighted works may have been included in training various AI systems. The New York Times, for example, prompted ChatGPT to spit out excerpts of its articles, a tactic for identifying training data that OpenAI has curiously described as “hacking.”

Under Schiff’s law, The New York Times would only need to consult the database to ID all articles used to train ChatGPT or any other AI system.

Any AI maker who violates the act would risk a “civil penalty in an amount not less than $5,000,” the proposed bill said.

At a hearing on artificial intelligence and intellectual property, Rep. Darrell Issa (R-Calif.)—who chairs the House Judiciary Subcommittee on Courts, Intellectual Property, and the Internet—told Schiff that his subcommittee would consider the “thoughtful” bill.

Schiff told the subcommittee that the bill is “only a first step” toward “ensuring that at a minimum” creators are “aware of when their work contributes to AI training datasets,” saying that he would “welcome the opportunity to work with members of the subcommittee” on advancing the bill.

“The rapid development of generative AI technologies has outpaced existing copyright laws, which has led to widespread use of creative content to train generative AI models without consent or compensation,” Schiff warned at the hearing.

In Schiff’s press release, Meredith Stiehm, president of the Writers Guild of America West, joined leaders from other creative groups celebrating the bill as an “important first step” for rightsholders.

“Greater transparency and guardrails around AI are necessary to protect writers and other creators” and address “the unprecedented and unauthorized use of copyrighted materials to train generative AI systems,” Stiehm said.

Until the thorniest AI copyright questions are settled, Ken Doroshow, chief legal officer for the Recording Industry Association of America, suggested that Schiff’s bill fills an important gap by introducing “comprehensive and transparent recordkeeping” that would provide “one of the most fundamental building blocks of effective enforcement of creators’ rights.”

A senior adviser for the Human Artistry Campaign, Moiya McTier, went further, celebrating the bill as stopping AI companies from “exploiting” artists and creators.

“AI companies should stop hiding the ball when they copy creative works into AI systems and embrace clear rules of the road for recordkeeping that create a level and transparent playing field for the development and licensing of genuinely innovative applications and tools,” McTier said.

AI copyright guidance coming soon

While courts weigh copyright questions raised by artists, book authors, and newspapers, the US Copyright Office announced in March that it would be issuing guidance later this year, but the office does not seem to be prioritizing questions on AI training.

Instead, the Copyright Office will focus first on issuing guidance on deepfakes and AI outputs. This spring, the office will release a report “analyzing the impact of AI on copyright” of “digital replicas, or the use of AI to digitally replicate individuals’ appearances, voices, or other aspects of their identities.” Over the summer, another report will focus on “the copyrightability of works incorporating AI-generated material.”

Regarding “the topic of training AI models on copyrighted works as well as any licensing considerations and liability issues,” the Copyright Office did not provide a timeline for releasing guidance, only confirming that its “goal is to finalize the entire report by the end of the fiscal year.”

Once guidance is available, it could sway court opinions, although courts do not necessarily have to apply Copyright Office guidance when weighing cases.

The Copyright Office’s aspirational timeline does seem to be ahead of when at least some courts can be expected to decide on some of the biggest copyright questions for some creators. The class-action lawsuit raised by book authors against OpenAI, for example, is not expected to be resolved until February 2025, and the New York Times’ lawsuit is likely on a similar timeline. However, artists suing Stability AI face a hearing on that AI company’s motion to dismiss this May.



New AI music generator Udio synthesizes realistic music on demand

Battle of the AI bands —

But it still needs trial and error to generate high-quality results.

A screenshot of AI-generated songs listed on Udio on April 10, 2024.


Between 2002 and 2005, I ran a music website where visitors could submit song titles that I would write and record a silly song around. In the liner notes for my first CD release in 2003, I wrote about a day when computers would potentially put me out of business, churning out music automatically at a pace I could not match. While I don’t actively post music on that site anymore, that day is almost here.

On Wednesday, a group of ex-DeepMind employees launched Udio, a new AI music synthesis service that can create novel high-fidelity musical audio from written prompts, including user-provided lyrics. It’s similar to Suno, which we covered on Monday. With some key human input, Udio can create facsimiles of human-produced music in genres like country, barbershop quartet, German pop, classical, hard rock, hip hop, show tunes, and more. It’s currently free to use during a beta period.

Udio is also freaking out some musicians on Reddit. As we mentioned in our Suno piece, Udio is exactly the kind of AI-powered music generation service that over 200 musical artists were afraid of when they signed an open protest letter last week.

But as impressive as the Udio songs first seem from a technical AI-generation standpoint (not necessarily judging by musical merit), its generation capability isn’t perfect. We experimented with its creation tool and the results felt less impressive than those created by Suno. The high-quality musical samples showcased on Udio’s site likely resulted from a lot of creative human input (such as human-written lyrics) and cherry-picking the best compositional parts of songs out of many generations. In fact, Udio lays out a five-step workflow to build a 1.5-minute-long song in a FAQ.

For example, we created an Ars Technica “Moonshark” song on Udio using the same prompt as one we used previously with Suno. In its raw form, the results sound half-baked and almost nightmarish (here is the Suno version for comparison). It’s also a lot shorter by default, at 32 seconds compared to Suno’s 1-minute, 32-second output. But Udio allows songs to be extended, and you can re-roll a poor result with different prompts.

After registering a Udio account, anyone can create a track by entering a text prompt that can include lyrics, a story direction, and musical genre tags. Udio then tackles the task in two stages. First, it utilizes a large language model (LLM) similar to ChatGPT to generate lyrics (if necessary) based on the provided prompt. Next, it synthesizes music using a method that Udio does not disclose, but it’s likely a diffusion model, similar to Stability AI’s Stable Audio.
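In rough pseudocode, that two-stage hand-off might look like the sketch below. Udio hasn’t published its architecture, so every function here is a hypothetical stub; the point is only to illustrate the LLM-then-audio-model flow described above:

```python
# Hypothetical two-stage text-to-music pipeline, loosely modeled on the
# description above. These stubs are illustrative only; none of them
# correspond to a real Udio API.

def write_lyrics(prompt: str) -> str:
    """Stage 1 stub: an LLM drafts lyrics from the user's prompt."""
    return f"(Verse 1)\nA song about {prompt}...\n(Chorus)\n..."

def render_audio(lyrics: str, genre_tags: list[str]) -> bytes:
    """Stage 2 stub: an audio model (likely diffusion-based, per the
    speculation above) renders music conditioned on lyrics and tags."""
    return b"\x00" * 1024  # placeholder for generated audio bytes

def make_track(prompt: str, tags: list[str], user_lyrics: str = "") -> bytes:
    lyrics = user_lyrics or write_lyrics(prompt)  # user lyrics take priority
    return render_audio(lyrics, tags)

track = make_track("a moonlit shark", ["synthpop", "dreamy"])
```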

From the given prompt, Udio’s AI model generates two distinct song snippets for you to choose from. You can then publish the song for the Udio community, download the audio or video file to share on other platforms, or directly share it on social media. Other Udio users can also remix or build on existing songs. Udio’s terms of service say that the company claims no rights over the musical generations and that they can be used for commercial purposes.

Although the Udio team has not revealed the specific details of its model or training data (which is likely filled with copyrighted material), it told Tom’s Guide that the system has built-in measures to identify and block tracks that too closely resemble the work of specific artists, ensuring that the generated music remains original.

And that brings us back to humans, some of whom are not taking the onset of AI-generated music very well. “I gotta be honest, this is depressing as hell,” wrote one Reddit commenter in a thread about Udio. “I’m still broadly optimistic that music will be fine in the long run somehow. But like, why do this? Why automate art?”

We’ll hazard an answer by saying that replicating art is a key target for AI research because the results can be inaccurate and imprecise and still seem notable or gee-whiz amazing, which is a key characteristic of generative AI. It’s flashy and impressive-looking while allowing for a general lack of quantitative rigor. We’ve already seen AI come for still images, video, and text with varied results regarding representative accuracy. Fully composed musical recordings seem to be next on the list of AI hills to (approximately) conquer, and the competition is heating up.



Elon Musk: AI will be smarter than any human around the end of next year

smarter than the average bear —

While Musk says superintelligence is coming soon, one critic says the prediction is “batsh*t crazy.”

Elon Musk, owner of Tesla and the X (formerly Twitter) platform, on January 22, 2024.

On Monday, Tesla CEO Elon Musk predicted the imminent rise of AI superintelligence during a live interview streamed on the social media platform X. “My guess is we’ll have AI smarter than any one human probably around the end of next year,” Musk said in his conversation with hedge fund manager Nicolai Tangen.

Just prior to that, Tangen had asked Musk, “What’s your take on where we are in the AI race just now?” Musk told Tangen that AI “is the fastest advancing technology I’ve seen of any kind, and I’ve seen a lot of technology.” He described computers dedicated to AI increasing in capability by “a factor of 10 every year, if not every six to nine months.”

Musk made the prediction with an asterisk, saying that shortages of AI chips and high AI power demands could limit AI’s capability until those issues are resolved. “Last year, it was chip-constrained,” Musk told Tangen. “People could not get enough Nvidia chips. This year, it’s transitioning to a voltage transformer supply. In a year or two, it’s just electricity supply.”

But not everyone is convinced that Musk’s crystal ball is free of cracks. Grady Booch, a frequent critic of AI hype on social media who is perhaps best known for his work in software architecture, told Ars in an interview, “Keep in mind that Mr. Musk has a profoundly bad record at predicting anything associated with AI; back in 2016, he promised his cars would ship with FSD safety level 5, and here we are, closing on a decade later, still waiting.”

Creating artificial intelligence at least as smart as a human (frequently called “AGI” for artificial general intelligence) is often seen as inevitable among AI proponents, but there’s no broad consensus on exactly when that milestone will be reached—or on the exact definition of AGI, for that matter.

“If you define AGI as smarter than the smartest human, I think it’s probably next year, within two years,” Musk added in the interview with Tangen while discussing AGI timelines.

Even with the uncertainties about AGI, companies haven’t stopped trying to build it. ChatGPT creator OpenAI, which launched with Musk as a co-founder in 2015, lists developing AGI as its main goal. Musk has not been directly associated with OpenAI for years (unless you count a recent lawsuit against the company), but last year, he took aim at the business of large language models by forming a new company called xAI. Its main product, Grok, functions similarly to ChatGPT and is integrated into the X social media platform.

Booch gives credit to Musk’s business successes but casts doubt on his forecasting ability. “Albeit a brilliant if not rapacious businessman, Mr. Musk vastly overestimates both the history as well as the present of AI while simultaneously diminishing the exquisite uniqueness of human intelligence,” says Booch. “So in short, his prediction is—to put it in scientific terms—batshit crazy.”

So when will we get AI that’s smarter than a human? Booch says there’s no real way to know at the moment. “I reject the framing of any question that asks when AI will surpass humans in intelligence because it is a question filled with ambiguous terms and considerable emotional and historic baggage,” he says. “We are a long, long way from understanding the design that would lead us there.”

We also asked Hugging Face AI researcher Dr. Margaret Mitchell to weigh in on Musk’s prediction. “Intelligence … is not a single value where you can make these direct comparisons and have them mean something,” she told us in an interview. “There will likely never be agreement on comparisons between human and machine intelligence.”

But even with that uncertainty, she feels there is one aspect of AI she can more reliably predict: “I do agree that neural network models will reach a point where men in positions of power and influence, particularly ones with investments in AI, will declare that AI is smarter than humans. By end of next year, sure. That doesn’t sound far off base to me.”



MIT License text becomes viral “sad girl” piano ballad generated by AI

WARRANTIES OF MERCHANTABILITY —

“Permission is hereby granted” comes from Suno AI engine that creates new songs on demand.

Illustration of a robot singing.

We’ve come a long way since primitive AI music generators in 2022. Today, AI tools like Suno.ai allow any series of words to become song lyrics, including inside jokes (as you’ll see below). On Wednesday, prompt engineer Riley Goodside tweeted an AI-generated song created with the prompt “sad girl with piano performs the text of the MIT License,” and it began to circulate widely in the AI community online.

The MIT License is a famous permissive software license created in the late 1980s, frequently used in open source projects. “My favorite part of this is ~1:25 it nails ‘WARRANTIES OF MERCHANTABILITY’ with a beautiful Imogen Heap-style glissando then immediately pronounces ‘FITNESS’ as ‘fistiff,'” Goodside wrote on X.

Suno (which means “listen” in Hindi) was formed in 2023 in Cambridge, Massachusetts. It’s the brainchild of Michael Shulman, Georg Kucsko, Martin Camacho, and Keenan Freyberg, who formerly worked at companies like Meta and TikTok. Suno has already attracted big-name partners, such as Microsoft, which announced the integration of an earlier version of the Suno engine into Bing Chat last December. Today, Suno is on v3 of its model, which can create temporally coherent two-minute songs in many different genres.

The company did not reply to our request for an interview by press time. In March, Brian Hiatt of Rolling Stone wrote a profile about Suno that describes the service as a collaboration between OpenAI’s ChatGPT (for lyric writing) and Suno’s music generation model, which some experts think has likely been trained on recordings of copyrighted music without license or artist permission.

It’s exactly this kind of service that upset over 200 musical artists enough last week that they signed an Artist Rights Alliance open letter asking tech companies to stop using AI tools to generate music that could replace human artists.

Considering the unknown provenance of the training data, ownership of the generated songs seems like a complicated question. Suno’s FAQ says that music generated using its free tier remains owned by Suno and can only be used for non-commercial purposes. Paying subscribers reportedly own generated songs “while subscribed to Pro or Premier,” subject to Suno’s terms of service. However, the US Copyright Office took the stance last year that purely AI-generated visual art cannot be copyrighted, and while that question has not yet been settled for AI-generated music, the same standard might eventually become official legal policy there as well.

The Moonshark song

A screenshot of the Suno.ai website showing lyrics of an AI-generated “Moonshark” song.

While using the service, Suno appears to have no trouble creating unique lyrics based on your prompt (unless you supply your own) and sets those words to stylized genres of music it generates based on its training dataset. It dynamically generates vocals as well, although they include audible aberrations. Suno’s output is not indistinguishable from high-fidelity human-created music yet, but given the pace of progress we’ve seen, that bridge could be crossed within the next year.

To get a sense of what Suno can do, we created an account on the site and prompted the AI engine to create songs about our mascot, Moonshark, and about barbarians with CRTs, two inside jokes at Ars. What’s interesting is that although the AI model aced the task of creating an original song for each topic, both songs start with the same line, “In the depths of the digital domain.” That’s possibly an artifact of whatever hidden prompt Suno is using to instruct ChatGPT when writing the lyrics.

Suno is arguably a fun toy to experiment with and doubtless a milestone in generative AI music tools. But it’s also an achievement tainted by the unresolved ethical issues related to scraping musical work without the artist’s permission. Then there’s the issue of potentially replacing human musicians, which has not been far from the minds of people sharing their own Suno results online. On Monday, AI influencer Ethan Mollick wrote, “I’ve had a song from Suno AI stuck in my head all day. Grim milestone or good one?”



AI hardware company from Jony Ive, Sam Altman seeks $1 billion in funding

AI Boom —

A venture fund founded by Laurene Powell Jobs could finance the company.

Jony Ive, the former Apple designer.


Former Apple design lead Jony Ive and current OpenAI CEO Sam Altman are seeking funding for a new company that will produce an “artificial intelligence-powered personal device,” according to The Information‘s sources, who are said to be familiar with the plans.

The exact nature of the device is unknown, but it will not look anything like a smartphone, according to the sources. We first heard tell of this venture in the fall of 2023, but The Information’s story reveals that talks are moving forward to get the company off the ground.

Ive and Altman hope to raise at least $1 billion for the new company. The complete list of potential funding sources they’ve spoken with is unknown, but The Information’s sources say they are in talks with frequent OpenAI investor Thrive Capital as well as Emerson Collective, a venture capital firm founded by Laurene Powell Jobs.

SoftBank CEO and super-investor Masayoshi Son is also said to have spoken with Altman and Ive about the venture. The Financial Times previously reported that Son wanted Arm (another company he has backed) to be involved in the project.

Obviously, those are some of the well-established and famous names within today’s tech industry. Personal connections may play a role; for example, Jobs is said to have a friendship with both Ive and Altman. That might be critical because the pedigree involved could otherwise scare off smaller investors, since the big names could drive up the initial cost of investment.

Although we don’t know anything about the device yet, it would likely put Ive in direct competition with his former employer, Apple. It has been reported elsewhere that Apple is working on bringing powerful new AI features to iOS 18 and later versions of the software for iPhones, iPads, and the company’s other devices.

Altman already has his hands in several other AI ventures besides OpenAI. The Information reports that there is no indication yet that OpenAI would be directly involved in the new hardware company.



Publisher: OpenAI’s GPT Store bots are illegally scraping our textbooks

OpenAI logo

For the past few months, Morten Blichfeldt Andersen has spent many hours scouring OpenAI’s GPT Store. Since it launched in January, the marketplace for bespoke bots has filled up with a deep bench of useful and sometimes quirky AI tools. Cartoon generators spin up New Yorker–style illustrations and vivid anime stills. Programming and writing assistants offer shortcuts for crafting code and prose. There’s also a color analysis bot, a spider identifier, and a dating coach called RizzGPT. Yet Blichfeldt Andersen is hunting for only one very specific type of bot: those built on his employer’s copyright-protected textbooks without permission.

Blichfeldt Andersen is publishing director at Praxis, a Danish textbook purveyor. The company has been embracing AI and created its own custom chatbots. But it is currently engaged in a game of whack-a-mole in the GPT Store, and Blichfeldt Andersen is the man holding the mallet.

“I’ve been personally searching for infringements and reporting them,” Blichfeldt Andersen says. “They just keep coming up.” He suspects the culprits are primarily young people uploading material from textbooks to create custom bots to share with classmates—and that he has uncovered only a tiny fraction of the infringing bots in the GPT Store. “Tip of the iceberg,” Blichfeldt Andersen says.

It is easy to find bots in the GPT Store whose descriptions suggest they might be tapping copyrighted content in some way, as TechCrunch noted in a recent article claiming OpenAI’s store was overrun with “spam.” Using copyrighted material without permission is permissible in some contexts, but in others, rightsholders can take legal action. WIRED found a GPT called Westeros Writer that claims to “write like George R.R. Martin,” the creator of Game of Thrones. Another, Voice of Atwood, claims to imitate the writer Margaret Atwood. Yet another, Write Like Stephen, is intended to emulate Stephen King.

When WIRED tried to trick the King bot into revealing the “system prompt” that tunes its responses, the output suggested it had access to King’s memoir On Writing. Write Like Stephen was able to reproduce passages from the book verbatim on demand, even noting which page the material came from. (WIRED could not make contact with the bot’s developer, because it did not provide an email address, phone number, or external social profile.)

OpenAI spokesperson Kayla Wood says it responds to takedown requests against GPTs made with copyrighted content but declined to answer WIRED’s questions about how frequently it fulfills such requests. She also says the company proactively looks for problem GPTs. “We use a combination of automated systems, human review, and user reports to find and assess GPTs that potentially violate our policies, including the use of content from third parties without necessary permission,” Wood says.

New disputes

The GPT Store’s copyright problem could add to OpenAI’s existing legal headaches. The company is facing a number of high-profile lawsuits alleging copyright infringement, including one brought by The New York Times and several brought by different groups of fiction and nonfiction authors, including big names like George R.R. Martin.

Chatbots offered in OpenAI’s GPT Store are based on the same technology as its own ChatGPT but are created by outside developers for specific functions. To tailor their bot, a developer can upload extra information that it can tap to augment the knowledge baked into OpenAI’s technology. The process of consulting this additional information to respond to a person’s queries is called retrieval-augmented generation, or RAG. Blichfeldt Andersen is convinced that the RAG files behind the bots in the GPT Store are a hotbed of copyrighted materials uploaded without permission.
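For readers unfamiliar with the technique, here’s a self-contained toy sketch of RAG; the bag-of-words “embedding” is a deliberately crude stand-in for a real embedding model, and none of this reflects OpenAI’s actual GPT Store internals:

```python
# Toy retrieval-augmented generation (RAG) sketch: retrieve the uploaded
# chunk most similar to the query, then prepend it to the prompt. Real
# systems use learned embeddings and send the prompt to an LLM.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Crude bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# The "RAG files" a GPT builder uploads, split into chunks:
chunks = [
    "Chapter 1: Photosynthesis converts light into chemical energy.",
    "Chapter 2: Cellular respiration releases energy from glucose.",
]

def build_prompt(query: str) -> str:
    best = max(chunks, key=lambda c: cosine(embed(c), embed(query)))  # retrieval
    return f"Context: {best}\n\nQuestion: {query}\nAnswer:"  # augmentation

print(build_prompt("How do plants turn light into energy?"))
```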



After AI-generated porn report, Washington Lottery pulls down interactive web app

You could be a winner! —

User says promo site put her uploaded selfie on a topless woman’s body.

A user of the Washington Lottery’s “Test Drive a Win” website says it used AI to generate (the unredacted version of) this image with her face on a topless body.

The Washington State Lottery has taken down a promotional AI-powered web app after a local mother reported that the site generated an image with her face on the body of a topless woman.

The lottery’s “Test Drive a Win” website was designed to help visitors visualize various dream vacations they could pay for with their theoretical lottery winnings. The site included the ability to upload a headshot that would be integrated into an AI-generated tableau of what you might look like on that vacation.

But Megan (last name not given), a 50-year-old from the Olympia suburb of Tumwater, told conservative Seattle radio host Jason Rantz that the image of her “swim with the sharks” dream vacation on the website showed her face atop a woman sitting on a bed with her breasts exposed. The background of the AI-generated image seems to show the bed in some sort of aquarium, complete with fish floating through the air and sprawling undersea flora sitting awkwardly behind the pillows.

The corner of the image features the Washington Lottery logo.

“Our tax dollars are paying for that! I was completely shocked. It’s disturbing to say the least,” Megan told Rantz. “I also think whoever was responsible for it should be fired.”

“We don’t want something like this purported event to happen again”

The non-functional “Test Drive a Win” website as it appeared Thursday.

In a statement provided to Ars Technica, a Washington Lottery spokesperson said that the lottery “worked closely with the developers of the AI platform to establish strict parameters to govern image creation.” Despite this, the spokesperson said they were notified earlier this week that “a single user of the AI platform was purportedly provided an image that did not adhere to those guidelines.”

Despite what the spokesperson said were “thousands” of inoffensive images that the site generated in over a month, the spokesperson said that “one purported user is too many and as a result we have shut down the site” as of Tuesday.

The spokesperson did not respond to specific questions about which AI models or third-party vendors may have been used to create the site or about the specific safeguards that were crafted in an attempt to prevent results like the one reported by Megan.

Speaking to Rantz, a lottery spokesperson said the organization had “agreed to a comprehensive set of rules” for the site’s AI images, “including that people in images be fully clothed.” Following the report of the topless image, the spokesperson said they “had the developers check all the parameters for the platform.” And while they were “comfortable with the settings,” the spokesperson told Rantz they “chose to take down the site out of an abundance of caution, as we don’t want something like this purported event to happen again.”

Not a quick fix?

On his radio show, Rantz expressed surprise that the lottery couldn’t keep the site operational after rejiggering the AI’s safety settings. “In my head I was thinking, well, presumably once they heard about this they went back to the backend guidelines and just made sure it said, ‘Hey, no breasts, no full-frontal nudity,’ those kinds of things, and then they fixed it, and then they went on with their day,” Rantz said.

But it might not be that simple to effectively rein in the endless variety of visual output an AI model can generate. While models like Stable Diffusion and DALL-E have filters in place to prevent the generation of sexual or violent images, researchers have found that those models still responded to problematic prompts by generating images that were judged as “unsafe” by an image classifier a significant minority of the time. Malicious users can also use prompt-engineering tricks to get around these built-in safeguards when using popular text-based image-generation models.

We’ve seen these kinds of AI image-safety issues blow back on major corporations, too, as when Facebook’s AI sticker generator put weapons in the hands of children’s cartoon characters. More recently, a Microsoft engineer publicly accused the company’s Copilot image-generation tool of randomly creating violent and sexual imagery even after the team was warned of the issue.

The Washington Lottery’s AI issue comes a week after a report found a New York City government chatbot confabulating incorrect advice about city laws and regulations. “It’s wrong in some areas and we gotta fix it,” New York City Mayor Eric Adams said this week. “Any time you use technology, you need to put it in the real environment to iron out the kinks. You can’t live in a lab. You can’t stay in a lab forever.”



Fake AI law firms are sending fake DMCA threats to generate fake SEO gains

Dewey Fakum & Howe, LLP —

How one journalist found himself targeted by generative AI over a keyfob photo.


A person made of many parts, similar to the attorney who handles both severe criminal law and copyright takedowns for an Arizona law firm.

If you run a personal or hobby website, getting a copyright notice from a law firm about an image on your site can trigger some fast-acting panic. As someone who has paid to settle a news service-licensing issue before, I can empathize with anybody who wants to make this kind of thing go away.

Which is why a new kind of angle-on-an-angle scheme can seem both obvious to spot and likely effective. Ernie Smith, the prolific, ever-curious writer behind the newsletter Tedium, received a “DMCA Copyright Infringement Notice” in late March from “Commonwealth Legal,” representing the “Intellectual Property division” of Tech4Gods.

The issue was with a photo of a keyfob from legitimate photo service Unsplash used in service of a post about a strange Uber ride Smith once took. As Smith detailed in a Mastodon thread, the purported firm needed him to “add a credit to our client immediately” through a link to Tech4Gods and said the matter should be “addressed in the next five business days.” Removing the image “does not conclude the matter,” and should Smith not take action, the putative firm would have to “activate” its case, relying on DMCA 512(c) (which, in many readings, actually does grant relief should a website owner, unaware of infringing material, “act expeditiously to remove” said material). The email unhelpfully points to the main page of the Internet Archive so that Smith might review “past usage records.”

A slice of the website for Commonwealth Legal Services, with every word of that phrase, including “for,” called into question.

There are quite a few issues with Commonwealth Legal’s request, as detailed by Smith and 404 Media. Chief among them is that Commonwealth Legal, a firm theoretically based in Arizona (which is not a commonwealth), almost certainly does not exist. Despite the 2018 copyright displayed on the site, the firm’s website domain was seemingly registered on March 1, 2024, with a Canadian IP location. The address on the firm’s site leads to a location that, to say the least, does not match the “fourth floor” indicated on the website.

While the law firm’s website is stuffed full of stock images, so are many websites for professional services. The real tell is the site’s list of attorneys, most of whom, as 404 Media puts it, have “vacant, thousand-yard stares” common to AI-generated faces. AI detection firm Reality Defender told 404 Media that its service spotted AI generation in every attorney’s image, “most likely by a Generative Adversarial Network (GAN) model.”

Then there are the attorneys’ bios, which offer surface-level competence underpinned by bizarre setups. Five of the 12 supposedly come from acclaimed law schools at Harvard, Yale, Stanford, and University of Chicago. The other seven seem to have graduated from the top five results you might get for “Arizona Law School.” Sarah Walker has a practice based on “Copyright Violation and Judicial Criminal Proceedings,” a quite uncommon pairing. Sometimes she is “upholding the rights of artists,” but she can also “handle high-stakes criminal cases.” Walker, it seems, couldn’t pick just one track at Yale Law School.

Why would someone go to the trouble of making a law firm out of NameCheap, stock art, and AI-generated images (and seemingly AI-generated copy) to send quasi-legal demands to site owners? Backlinks, that’s why. Backlinks are links from a site that Google (or others, but almost always Google) holds in high esteem to a site trying to rank up. Whether spammed, traded, generated, or demanded through a fake firm, backlinks power the search engine optimization (SEO) gray, to very dark gray, market. For all their touted algorithmic (and now AI) prowess, search engines have always had a hard time gauging backlink quality and context, so some site owners still buy backlinks.

The owner of Tech4Gods told 404 Media’s Jason Koebler that he did buy backlinks for his gadget review site (with “AI writing assistants”). He disclaimed owning the disputed image or any images and made vague suggestions that a disgruntled former contractor may be trying to poison his ranking with spam links.

Asked by Ars if he had heard back from “Commonwealth Legal” now that five business days were up, Ernie Smith tells Ars: “No, alas.”

This post was updated at 4:50 p.m. Eastern to include Ernie Smith’s response.



Google might make users pay for AI features in search results

Pay-eye for the AI —

Plan would represent a first for what has been a completely ad-funded search engine.

You think this cute little search robot is going to work for free?


Google might start charging for access to search results that use generative artificial intelligence tools. That’s according to a new Financial Times report citing “three people with knowledge of [Google’s] plans.”

Charging for any part of the search engine at the core of its business would be a first for Google, which has funded its search product solely with ads since 2000. But it’s far from the first time Google would charge for AI enhancements in general; the “AI Premium” tier of a Google One subscription costs $10 more per month than a standard “Premium” plan, for instance, while “Gemini Business” adds $20 a month to a standard Google Workspace subscription.

While those paid products offer access to Google’s high-end “Gemini Advanced” AI model, Google also offers free access to its less performant, plain “Gemini” model without any kind of paid subscription.

When ads aren’t enough?

Under the proposed plan, Google’s standard search (without AI) would remain free, and subscribers to a paid AI search tier would still see ads alongside their Gemini-powered search results, according to the FT report. But search ads—which brought in a reported $175 billion for Google last year—might not be enough to fully cover the increased costs involved with AI-powered search. A Reuters report from last year suggested that running a search query through an advanced neural network like Gemini “likely costs 10 times more than a standard keyword search,” potentially representing “several billion dollars of extra costs” across Google’s network.

Cost aside, it remains to be seen if there’s a critical mass of market demand for this kind of AI-enhanced search. Microsoft’s massive investment in generative AI features for its Bing search engine has failed to make much of a dent in Google’s market share over the last year or so. And there has reportedly been limited uptake for Google’s experimental opt-in “Search Generative Experience” (SGE), which adds chatbot responses above the usual set of links in response to a search query.

“SGE never feels like a useful addition to Google Search,” Ars’ Ron Amadeo wrote last month. “Google Search is a tool, and just as a screwdriver is not a hammer, I don’t want a chatbot in a search engine.”

Regardless, the current tech industry mania surrounding anything and everything related to generative AI may make Google feel it has to integrate the technology into some sort of “premium” search product sooner rather than later. For now, FT reports that Google hasn’t made a final decision on whether to implement the paid AI search plan, even as Google engineers work on the backend technology necessary to launch such a service.

Google also faces AI-related difficulties on the other side of the search divide. Last month, the company announced it was redoubling its efforts to limit the appearance of “spammy, low-quality content”—much of it generated by AI chatbots—in its search results.

In February, Google shut down the image generation features of its Gemini AI model after the service was found inserting historically inaccurate examples of racial diversity into some of its prompt responses.



Copilot key is based on a button you probably haven’t seen since IBM’s Model M

Microsoft chatbot button —

Left-Shift + Windows key + F23

A Dell XPS 14 laptop. The Copilot key is to the right of the right-Alt button.

In January, Microsoft introduced a new key to Windows PC keyboards for the first time in 30 years. The Copilot key, dedicated to launching Microsoft’s eponymous generative AI assistant, is already on some Windows laptops released this year. On Monday, Tom’s Hardware dug into the new addition and determined exactly what pressing the button does, which is actually pretty simple. Pushing a computer’s integrated Copilot button is like pressing left-Shift + Windows key + F23 simultaneously.

Tom’s Hardware confirmed this after wondering if the Copilot key introduced a new scan code to Windows or if it worked differently. Using the scripting program AutoHotkey with a new laptop that has a Copilot button, Tom’s Hardware discovered the keystrokes registered when a user presses the Copilot key. The publication confirmed with Dell that “this key assignment is standard for the Copilot key and done at Microsoft’s direction.”
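You can reproduce that experiment without AutoHotkey, too. Here’s a rough equivalent using the third-party Python keyboard package (pip install keyboard); whether the key names show up exactly as “left shift,” “left windows,” and “f23” is an assumption that may vary by driver and layout:

```python
# Log raw key events to confirm what the Copilot key sends (reportedly
# left Shift + Windows key + F23). Uses the third-party "keyboard"
# package; on some systems this requires running as administrator.
import keyboard

def log_event(event):
    # Prints e.g. "down left shift", "down left windows", "down f23"
    print(event.event_type, event.name, event.scan_code)

keyboard.hook(log_event)  # receive every keyboard event
keyboard.wait("esc")      # press Esc to stop logging
```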

F23

The surprising thing to see in that string of keys is F23. Having a computer keyboard with a function row (or rows) that takes you from F1 all the way to F23 is quite rare today. When I try to imagine a keyboard that comes with an F23 button, vintage keyboards come to mind, more specifically buckling spring keyboards from IBM.

IBM’s Model F, which debuted in 1981 and used buckling spring switches over a capacitive PCB, and the Model M, which launched in 1985 and used buckling spring switches over a membrane sheet, both offered layouts with 122 keys. These layouts included not one, but two rows of function keys that would leave today’s 60 percent keyboard fans sweating over the wasted space.

But having 122 keys was helpful for keyboards tied to IBM business terminals. The keyboard layout even included a bank of keys to the left of the primary alpha block of keys for even more forms of input.

An IBM Model M keyboard with an F23 key.


The 122-key keyboard layout with F23 lives on. Beyond people who still swear by old Model F and M keyboards, Model F Labs and Unicomp both currently sell modern buckling spring keyboards with built-in F23 buttons. Another reason a modern Windows PC user might have access to an F23 key is if they use a macro pad.

But even with those uses in mind, the F23 key remains rare. That helps explain why Microsoft would use the key for launching Copilot; users are unlikely to have F23 programmed for other functions. This was also likely less work than making a key with an entirely new scan code.

The Copilot button is reprogrammable

When I previewed Dell’s 2024 XPS laptops, a Dell representative told me that the integrated Copilot key wasn’t reprogrammable. However, in addition to providing some interesting information about the newest PC key since the Windows button, Tom’s Hardware’s revelation shows that the Copilot key actually is reprogrammable, even if OEMs don’t give users a way to do so out of the box. (If you need help, check out the website’s tutorial for reprogramming the Windows Copilot key.)

I suspect there’s a strong interest in reprogramming that button. For one, generative AI, despite all its hype and potential, is still an emerging technology. Many don’t need or want access to any chatbot—let alone Microsoft’s—instantly or even at all. Those who don’t use their system with a Microsoft account have no use for the button, since being logged in to a Microsoft account is required for the button to launch Copilot.

A rendering of the Copilot button.


Additionally, there are other easy ways to launch Copilot on a computer that has the program downloaded, like double-clicking an icon or pressing Windows + C, that make a dedicated button unnecessary. (Ars Technica asked Microsoft why the Copilot key doesn’t just register Windows + C, but the company declined to comment. Windows + C has launched other apps in the past, including Cortana, so it’s possible that Microsoft wanted to avoid the Copilot key performing a different function when pressed on computers that use Windows images without Copilot.)

In general, shoehorning the Copilot key into Windows laptops seems premature. Copilot is young and still a preview; just a few months ago, it was called Bing Chat. Further, the future of generative AI, including its popularity and top uses, is still forming and could evolve substantially during the lifetime of a Windows laptop. Microsoft’s generative AI efforts could also flounder over the years. Imagine if Microsoft went all-in on Bing back in the day and made all Windows keyboards have a Bing button, for example. Just because Microsoft wants something to become mainstream doesn’t mean that it will.

This all has made the Copilot button seem more like a way to force the adoption of Microsoft’s chatbot than a way to improve Windows keyboards. Microsoft has also made the Copilot button a requirement for its AI PC certification (which also requires an integrated neural processing unit and having Copilot pre-installed). Microsoft plans to make Copilot keys a requirement for Windows 11 OEM PCs eventually, it told Ars Technica in January.

At least for now, the basic way that the Copilot button works means you can turn the key into something more useful. Now, the tricky part would be finding a replacement keycap to eradicate Copilot’s influence from your keyboard.

