AI

Bing outage shows just how little competition Google search really has

Searching for new search —

Opinion: Actively searching without Google or Bing is harder than it looks.

Google logo on a phone in front of a Bing logo in the background

Getty Images

Bing, Microsoft’s search engine platform, went down in the very early morning today. That meant that searches from Microsoft’s Edge browsers that had yet to change their default providers didn’t work. It also meant that services relying on Bing’s search API—Microsoft’s own Copilot, ChatGPT search, Yahoo, Ecosia, and DuckDuckGo—similarly failed.

Services were largely restored by the morning Eastern work hours, but the timing feels apt, concerning, or some combination of the two. Google, the consistently dominating search platform, just last week announced and debuted AI Overviews as a default addition to all searches. If you don’t want an AI response but still want to use Google, you can hunt down the new “Web” option in a menu, or you can, per Ernie Smith, tack “&udm=14” onto your search or use Smith’s own “Konami code” shortcut page.
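As a rough illustration of the "&udm=14" trick mentioned above, the short sketch below builds a Google search URL with that parameter attached. The function name `web_only_search_url` is made up for this example, and the sketch assumes the parameter keeps behaving as described here.

```python
from urllib.parse import urlencode


def web_only_search_url(query):
    """Build a Google search URL that asks for the plain "Web" results view."""
    # "udm=14" is the parameter described in the article; "q" carries the query.
    params = {"q": query, "udm": "14"}
    return "https://www.google.com/search?" + urlencode(params)


print(web_only_search_url("bing outage"))
# prints https://www.google.com/search?q=bing+outage&udm=14
```

Using `urlencode` rather than string concatenation keeps spaces and punctuation in the query properly escaped.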

If AI’s hallucinations, power draw, or pizza recipes concern you, along with perhaps broader Google issues involving privacy, tracking, news, SEO, or monopoly power, most of your other major options were brought down by a single API outage this morning. Moving past that kind of single point of vulnerability will take some work, both by the industry and by you, the person wondering if there’s a real alternative.

Search engine market share, as measured by StatCounter, April 2023–April 2024.

StatCounter

Upward of a billion dollars a year

The overwhelming majority of search tools offering an “alternative” to Google are using Google, Bing, or Yandex, the three major search engines that maintain massive global indexes. Yandex, being based in Russia, is a non-starter for many people around the world at the moment. Bing offers its services widely, most notably to DuckDuckGo, but its ad-based revenue model and privacy particulars have caused some friction there in the past. Before his company was able to block more of Microsoft’s own tracking scripts, DuckDuckGo CEO and founder Gabriel Weinberg explained in a Reddit reply why firms like his weren’t going the full DIY route:

… [W]e source most of our traditional links and images privately from Bing … Really only two companies (Google and Microsoft) have a high-quality global web link index (because I believe it costs upwards of a billion dollars a year to do), and so literally every other global search engine needs to bootstrap with one or both of them to provide a mainstream search product. The same is true for maps btw — only the biggest companies can similarly afford to put satellites up and send ground cars to take streetview pictures of every neighborhood.

Bing makes Microsoft money, if not quite profit yet. It’s in Microsoft’s interest to keep its search index stocked and API open, even if its focus is almost entirely on its own AI chatbot version of Bing. Yet if Microsoft decided to pull API access, or if it became unreliable, Google’s default position would get even stronger. What would non-conformists have to choose from then?

Sky voice actor says nobody ever compared her to ScarJo before OpenAI drama

Scarlett Johansson attends the Golden Heart Awards in 2023.

OpenAI is sticking to its story that it never intended to copy Scarlett Johansson’s voice when seeking an actor for ChatGPT’s “Sky” voice mode.

The company provided The Washington Post with documents and recordings clearly meant to support OpenAI CEO Sam Altman’s defense against Johansson’s claims that Sky was made to sound “eerily similar” to her critically acclaimed voice acting performance in the sci-fi film Her.

Johansson has alleged that OpenAI hired a soundalike to steal her likeness and confirmed that she declined to provide the Sky voice. Experts have said that Johansson has a strong case should she decide to sue OpenAI for violating her right of publicity, which gives the actress exclusive rights to the commercial use of her likeness.

In OpenAI’s defense, The Post reported that the company’s voice casting call flier did not seek a “clone of actress Scarlett Johansson,” and initial voice test recordings of the unnamed actress hired to voice Sky showed that her “natural voice sounds identical to the AI-generated Sky voice.” Because of this, OpenAI has argued that “Sky’s voice is not an imitation of Scarlett Johansson.”

What’s more, an agent for the unnamed Sky actress who was cast—both granted anonymity to protect her client’s safety—confirmed to The Post that her client said she was never directed to imitate either Johansson or her character in Her. She simply used her own voice and got the gig.

The agent also provided a statement from her client that claimed that she had never been compared to Johansson before the backlash started.

This all “feels personal,” the voice actress said, “being that it’s just my natural voice and I’ve never been compared to her by the people who do know me closely.”

However, OpenAI apparently reached out to Johansson after casting the Sky voice actress. During outreach last September and again this month, OpenAI seemed to want to substitute the Sky voice actress’s voice with Johansson’s voice—which is ironically what happened when Johansson got cast to replace the original actress hired to voice her character in Her.

Altman has clarified that timeline in a statement provided to Ars that emphasized that the company “never intended” Sky to sound like Johansson. Instead, OpenAI tried to snag Johansson to voice the part after realizing—seemingly just as Her director Spike Jonze did—that the voice could potentially resonate with more people if Johansson did it.

“We are sorry to Ms. Johansson that we didn’t communicate better,” Altman’s statement said.

Johansson has not yet publicly indicated that she intends to sue OpenAI over this supposed miscommunication. But if she did, legal experts told The Post and Reuters that her case would be strong because of legal precedent set in high-profile lawsuits raised by singers Bette Midler and Tom Waits blocking companies from misappropriating their voices.

Why Johansson could win if she sued OpenAI

In 1988, Bette Midler sued Ford Motor Company for hiring a soundalike to perform Midler’s song “Do You Want to Dance?” in a commercial intended to appeal to “young yuppies” by referencing popular songs from their college days. Midler had declined to do the commercial and accused Ford of exploiting her voice to endorse its product without her consent.

This groundbreaking case proved that a distinctive voice like Midler’s cannot be deliberately imitated to sell a product. It did not matter that the singer used in the commercial had used her natural singing voice, because “a number of people” told Midler that the performance “sounded exactly” like her.

Midler’s case set a powerful precedent preventing companies from appropriating parts of performers’ identities—essentially stopping anyone from stealing a well-known voice that otherwise could not be bought.

“A voice is as distinctive and personal as a face,” the court ruled, concluding that “when a distinctive voice of a professional singer is widely known and is deliberately imitated in order to sell a product, the sellers have appropriated what is not theirs.”

Like in Midler’s case, Johansson could argue that plenty of people think that the Sky voice sounds like her and that OpenAI’s product might be more popular if it had a Her-like voice mode. Comics on popular late-night shows joked about the similarity, including Johansson’s husband, Saturday Night Live comedian Colin Jost. And other people close to Johansson agreed that Sky sounded like her, Johansson has said.

Johansson’s case seems to differ from Midler’s primarily because of the casting timeline that OpenAI is working hard to defend.

OpenAI seems to think that because Johansson was offered the gig after the Sky voice actor was cast, she cannot claim that the company hired the other actor after she declined.

The timeline may not matter as much as OpenAI may think, though. In the 1990s, Tom Waits cited Midler’s case when he won a $2.6 million lawsuit after Frito-Lay hired a Waits impersonator to perform a song that “echoed the rhyming word play” of a Waits song in a Doritos commercial. Waits won his suit even though Frito-Lay never attempted to hire the singer before casting the soundalike.

EmTech Digital 2024: A thoughtful look at AI’s pros and cons with minimal hype

Massachusetts Institute of Sobriety —

At MIT conference, experts explore AI’s potential for “human flourishing” and the need for regulation.

Nathan Benaich of Air Street Capital delivers the opening presentation on the state of AI at EmTech Digital 2024 on May 22, 2024.

Benj Edwards

CAMBRIDGE, Massachusetts—On Wednesday, AI enthusiasts and experts gathered to hear a series of presentations about the state of AI at EmTech Digital 2024 on the Massachusetts Institute of Technology’s campus. The event was hosted by the publication MIT Technology Review. The overall consensus is that generative AI is still in its very early stages—with policy, regulations, and social norms still being established—and its growth is likely to continue into the future.

I was there to check the event out. MIT is the birthplace of many tech innovations, including the first action-oriented computer video game, so it felt fitting to hear talks about the latest tech craze in the same building that hosts MIT’s Media Lab on its sprawling and lush campus.

EmTech’s speakers included AI researchers, policy experts, critics, and company spokespeople. A corporate feel pervaded the event due to strategic sponsorships, but it was handled in a low-key way that matches the level-headed tech coverage coming out of MIT Technology Review. After each presentation, MIT Technology Review staff—such as Editor-in-Chief Mat Honan and Senior Reporter Melissa Heikkilä—did a brief sit-down interview with the speaker, pushing back on some points and emphasizing others. Then the speaker took a few audience questions if time allowed.

EmTech Digital 2024 took place in building E14 on MIT's Campus in Cambridge, MA.

Benj Edwards

The conference kicked off with an overview of the state of AI by Nathan Benaich, founder and general partner of Air Street Capital, who rounded up news headlines about AI and several times expressed a favorable view toward defense spending on AI, making a few people visibly shift in their seats. Next up, Asu Ozdaglar, deputy dean of Academics at MIT’s Schwarzman College of Computing, spoke about the potential for “human flourishing” through AI-human symbiosis and the importance of AI regulation.

Kari Ann Briski, VP of AI Models, Software, and Services at Nvidia, highlighted the exponential growth of AI model complexity. She shared a prediction from consulting firm Gartner that by 2026, 50 percent of customer service organizations will have customer-facing AI agents. Of course, Nvidia’s job is to drive demand for its chips, so in her presentation, Briski painted the AI space as unqualifiedly rosy, assuming that all LLMs are (and will be) useful and reliable, despite what we know about their tendencies to make things up.

The conference also addressed the legal and policy aspects of AI. Christabel Randolph from the Center for AI and Digital Policy—an organization that spearheaded a complaint about ChatGPT to the FTC last year—gave a compelling presentation about the need for AI systems to be human-centered and aligned, warning about the potential for anthropomorphic models to manipulate human behavior. She emphasized the importance of demanding accountability from those designing and deploying AI systems.

  • Asu Ozdaglar, deputy dean of Academics at MIT’s Schwarzman College of Computing, spoke about the potential for “human flourishing” through AI-human symbiosis at EmTech Digital on May 22, 2024.

    Benj Edwards

  • Asu Ozdaglar, deputy dean of Academics at MIT’s Schwarzman College of Computing spoke with MIT Technology Review Editor-in-Chief Mat Honan at EmTech Digital on May 22, 2024.

    Benj Edwards

  • Kari Ann Briski, VP of AI Models, Software, and Services at NVIDIA, highlighted the exponential growth of AI model complexity at EmTech Digital on May 22, 2024.

    Benj Edwards

  • MIT Technology Review Senior Reporter Melissa Heikkilä introduces a speaker at EmTech Digital on May 22, 2024.

    Benj Edwards

  • After her presentation, Christabel Randolph from the Center for AI and Digital Policy sat with MIT Technology Review Senior Reporter Melissa Heikkilä at EmTech Digital on May 22, 2024.

    Benj Edwards

  • Lawyer Amir Ghavi provided an overview of the current legal landscape surrounding AI at EmTech Digital on May 22, 2024.

    Benj Edwards

Amir Ghavi, an AI, Tech, Transactions, and IP partner at Fried Frank LLP, who has defended AI companies like Stability AI in court, provided an overview of the current legal landscape surrounding AI, noting that there have been 24 lawsuits related to AI so far in 2024. He predicted that IP lawsuits would eventually diminish, and he claimed that legal scholars believe that using training data constitutes fair use. He also talked about legal precedents with photocopiers and VCRs, which were both technologies demonized by IP holders until courts decided they constituted fair use. He pointed out that the entertainment industry’s loss on the VCR case ended up benefiting it by opening up the VHS and DVD markets, providing a brand new revenue channel that was valuable to those same companies.

In one of the higher-profile discussions, Meta President of Global Affairs Nick Clegg sat down with MIT Technology Review Executive Editor Amy Nordrum to discuss the role of social media in elections and the spread of misinformation, arguing that research suggests social media’s influence on elections is not as significant as many believe. He acknowledged the “whack-a-mole” nature of banning extremist groups on Facebook and emphasized the changes Meta has undergone since 2016, increasing fact-checkers and removing bad actors.

Here’s what’s really going on inside an LLM’s neural network

Artificial brain surgery —

Anthropic’s conceptual mapping helps explain why LLMs behave the way they do.

Aurich Lawson | Getty Images

With most computer programs—even complex ones—you can meticulously trace through the code and memory usage to figure out why that program generates any specific behavior or output. That’s generally not true in the field of generative AI, where the non-interpretable neural networks underlying these models make it hard for even experts to figure out precisely why they often confabulate information, for instance.

Now, research from Anthropic offers a new window into what’s going on inside the Claude LLM’s “black box.” The company’s paper on “Extracting Interpretable Features from Claude 3 Sonnet” describes a powerful method for at least partially explaining just how the model’s millions of artificial neurons fire to create surprisingly lifelike responses to general queries.

Opening the hood

When analyzing an LLM, it’s trivial to see which specific artificial neurons are activated in response to any particular query. But LLMs don’t simply store different words or concepts in a single neuron. Instead, as Anthropic’s researchers explain, “it turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts.”

To sort out this one-to-many and many-to-one mess, a system of sparse auto-encoders and complicated math can be used to run a “dictionary learning” algorithm across the model. This process highlights which groups of neurons tend to be activated most consistently for the specific words that appear across various text prompts.
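To make the idea a little more concrete, here is a minimal pure-Python sketch of a single sparse auto-encoder pass: neuron activations are encoded into a wider set of "feature" activations, decoded back, and scored with a reconstruction error plus a sparsity penalty. Everything here is an illustrative assumption (the toy weights, the `l1_coeff` value, the function names, and the choice of mean squared error with an L1 penalty); it is not Anthropic’s actual code or configuration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode activations x into feature activations f, then reconstruct x."""
    pre = [p + b for p, b in zip(matvec(w_enc, x), b_enc)]
    f = [max(0.0, p) for p in pre]  # ReLU keeps only positively firing features
    x_hat = [r + b for r, b in zip(matvec(w_dec, f), b_dec)]
    return f, x_hat


def sae_loss(x, x_hat, f, l1_coeff):
    """Reconstruction error plus an L1 penalty that rewards sparse features."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    return mse + l1_coeff * sum(abs(a) for a in f)


# Hypothetical toy setup: 2 "neurons" expanded into 3 candidate features.
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_enc = [0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b_dec = [0.0, 0.0]

x = [2.0, 0.0]
f, x_hat = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
loss = sae_loss(x, x_hat, f, l1_coeff=0.1)
print(f, x_hat, loss)
```

In this toy case only one of the three features fires, and the input is reconstructed exactly, so the loss comes entirely from the sparsity term. Training would adjust the weights to lower that loss over many real activations.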

The same internal LLM “feature” describes the Golden Gate Bridge in multiple languages and modes.

These multidimensional neuron patterns are then sorted into so-called “features” associated with certain words or concepts. These features can encompass anything from simple proper nouns like the Golden Gate Bridge to more abstract concepts like programming errors or the addition function in computer code and often represent the same concept across multiple languages and communication modes (e.g., text and images).

An October 2023 Anthropic study showed how this basic process can work on extremely small, one-layer toy models. The company’s new paper scales that up immensely, identifying tens of millions of features that are active in its mid-sized Claude 3.0 Sonnet model. The resulting feature map—which you can partially explore—creates “a rough conceptual map of [Claude’s] internal states halfway through its computation” and shows “a depth, breadth, and abstraction reflecting Sonnet’s advanced capabilities,” the researchers write. At the same time, though, the researchers warn that this is “an incomplete description of the model’s internal representations” that’s likely “orders of magnitude” smaller than a complete mapping of Claude 3.

A simplified map shows some of the concepts that are “near” the “inner conflict” feature in Anthropic’s Claude model.

Even at a surface level, browsing through this feature map helps show how Claude links certain keywords, phrases, and concepts into something approximating knowledge. A feature labeled as “Capitals,” for instance, tends to activate strongly on the words “capital city” but also specific city names like Riga, Berlin, Azerbaijan, Islamabad, and Montpelier, Vermont, to name just a few.

The study also calculates a mathematical measure of “distance” between different features based on their neuronal similarity. The resulting “feature neighborhoods” found by this process are “often organized in geometrically related clusters that share a semantic relationship,” the researchers write, showing that “the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity.” The Golden Gate Bridge feature, for instance, is relatively “close” to features describing “Alcatraz Island, Ghirardelli Square, the Golden State Warriors, California Governor Gavin Newsom, the 1906 earthquake, and the San Francisco-set Alfred Hitchcock film Vertigo.”
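One common way to measure "closeness" between vectors is cosine similarity, and the sketch below uses it to rank a feature’s nearest neighbors. This is an assumption made for illustration: the article does not specify the exact distance measure, and the three-dimensional vectors and their values are invented toy data, not real Claude features.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_features(name, features, k=2):
    """List the k features most similar to the named one, closest first."""
    target = features[name]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in features.items()
        if other != name
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


# Hypothetical toy vectors standing in for learned feature directions.
toy_features = {
    "Golden Gate Bridge": [0.9, 0.1, 0.0],
    "Alcatraz Island": [0.8, 0.2, 0.1],
    "Programming errors": [0.0, 0.1, 0.9],
    "Capitals": [0.3, 0.9, 0.1],
}

print(nearest_features("Golden Gate Bridge", toy_features))
# prints ['Alcatraz Island', 'Capitals']
```

With these made-up numbers, the San Francisco landmark lands next to the other San Francisco landmark, which mirrors the kind of "feature neighborhood" the researchers describe.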

Some of the most important features involved in answering a query about the capital of Kobe Bryant's team's state.

Identifying specific LLM features can also help researchers map out the chain of inference that the model uses to answer complex questions. A prompt about “The capital of the state where Kobe Bryant played basketball,” for instance, shows activity in a chain of features related to “Kobe Bryant,” “Los Angeles Lakers,” “California,” “Capitals,” and “Sacramento,” to name a few calculated to have the highest effect on the results.

Slack users horrified to discover messages used for AI training

After launching Slack AI in February, Slack appears to be digging its heels in, defending its vague policy that by default sucks up customers’ data—including messages, content, and files—to train Slack’s global AI models.

According to Slack engineer Aaron Maurer, Slack has explained in a blog that the Salesforce-owned chat service does not train its large language models (LLMs) on customer data. But Slack’s policy may need updating “to explain more carefully how these privacy principles play with Slack AI,” Maurer wrote on Threads, partly because the policy “was originally written about the search/recommendation work we’ve been doing for years prior to Slack AI.”

Maurer was responding to a Threads post from engineer and writer Gergely Orosz, who called for companies to opt out of data sharing until the policy is clarified, not by a blog, but in the actual policy language.

“An ML engineer at Slack says they don’t use messages to train LLM models,” Orosz wrote. “My response is that the current terms allow them to do so. I’ll believe this is the policy when it’s in the policy. A blog post is not the privacy policy: every serious company knows this.”

The tension for users becomes clearer if you compare Slack’s privacy principles with how the company touts Slack AI.

Slack’s privacy principles specifically say that “Machine Learning (ML) and Artificial Intelligence (AI) are useful tools that we use in limited ways to enhance our product mission. To develop AI/ML models, our systems analyze Customer Data (e.g. messages, content, and files) submitted to Slack as well as other information (including usage information) as defined in our privacy policy and in your customer agreement.”

Meanwhile, Slack AI’s page says, “Work without worry. Your data is your data. We don’t use it to train Slack AI.”

Because of this incongruity, users called on Slack to update the privacy principles to make it clear how data is used for Slack AI or any future AI updates. According to a Salesforce spokesperson, the company has agreed an update is needed.

“Yesterday, some Slack community members asked for more clarity regarding our privacy principles,” Salesforce’s spokesperson told Ars. “We’ll be updating those principles today to better explain the relationship between customer data and generative AI in Slack.”

The spokesperson told Ars that the policy updates will clarify that Slack does not “develop LLMs or other generative models using customer data,” “use customer data to train third-party LLMs” or “build or train these models in such a way that they could learn, memorize, or be able to reproduce customer data.” The update will also clarify that “Slack AI uses off-the-shelf LLMs where the models don’t retain customer data,” ensuring that “customer data never leaves Slack’s trust boundary, and the providers of the LLM never have any access to the customer data.”

These changes, however, do not seem to address a key concern for users who never explicitly consented to sharing chats and other Slack content for use in AI training.

Users opting out of sharing chats with Slack

This controversial policy is not new. Wired warned about it in April, and TechCrunch reported that the policy has been in place since at least September 2023.

But widespread backlash began swelling last night on Hacker News, where Slack users called out the chat service for seemingly failing to notify users about the policy change, instead quietly opting them in by default. To critics, it felt like there was no benefit to opting in for anyone but Slack.

From there, the backlash spread to social media, where SlackHQ hastened to clarify Slack’s terms with explanations that did not seem to address all the criticism.

“I’m sorry Slack, you’re doing fucking WHAT with user DMs, messages, files, etc?” Corey Quinn, the chief cloud economist for a cost management company called Duckbill Group, posted on X. “I’m positive I’m not reading this correctly.”

SlackHQ responded to Quinn after the economist declared, “I hate this so much,” and confirmed that he had opted out of data sharing in his paid workspace.

“To clarify, Slack has platform-level machine-learning models for things like channel and emoji recommendations and search results,” SlackHQ posted. “And yes, customers can exclude their data from helping train those (non-generative) ML models. Customer data belongs to the customer.”

Later in the thread, SlackHQ noted, “Slack AI—which is our generative AI experience natively built in Slack—[and] is a separately purchased add-on that uses Large Language Models (LLMs) but does not train those LLMs on customer data.”

What happened to OpenAI’s long-term AI risk team?

disbanded —

Former team members have either resigned or been absorbed into other research groups.

A glowing OpenAI logo on a blue background.

Benj Edwards

In July last year, OpenAI announced the formation of a new research team that would prepare for the advent of supersmart artificial intelligence capable of outwitting and overpowering its creators. Ilya Sutskever, OpenAI’s chief scientist and one of the company’s co-founders, was named as the co-lead of this new team. OpenAI said the team would receive 20 percent of its computing power.

Now OpenAI’s “superalignment team” is no more, the company confirms. That comes after the departures of several researchers involved, Tuesday’s news that Sutskever was leaving the company, and the resignation of the team’s other co-lead. The group’s work will be absorbed into OpenAI’s other research efforts.

Sutskever’s departure made headlines because although he’d helped CEO Sam Altman start OpenAI in 2015 and set the direction of the research that led to ChatGPT, he was also one of the four board members who fired Altman in November. Altman was restored as CEO five chaotic days later after a mass revolt by OpenAI staff and the brokering of a deal in which Sutskever and two other company directors left the board.

Hours after Sutskever’s departure was announced on Tuesday, Jan Leike, the former DeepMind researcher who was the superalignment team’s other co-lead, posted on X that he had resigned.

Neither Sutskever nor Leike responded to requests for comment. Sutskever did not offer an explanation for his decision to leave but offered support for OpenAI’s current path in a post on X. “The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial” under its current leadership, he wrote.

Leike posted a thread on X on Friday explaining that his decision came from a disagreement over the company’s priorities and the amount of resources his team was being allocated.

“I have been disagreeing with OpenAI leadership about the company’s core priorities for quite some time, until we finally reached a breaking point,” Leike wrote. “Over the past few months my team has been sailing against the wind. Sometimes we were struggling for compute and it was getting harder and harder to get this crucial research done.”

The dissolution of OpenAI’s superalignment team adds to recent evidence of a shakeout inside the company in the wake of last November’s governance crisis. Two researchers on the team, Leopold Aschenbrenner and Pavel Izmailov, were dismissed for leaking company secrets, The Information reported last month. Another member of the team, William Saunders, left OpenAI in February, according to an Internet forum post in his name.

Two more OpenAI researchers working on AI policy and governance also appear to have left the company recently. Cullen O’Keefe left his role as research lead on policy frontiers in April, according to LinkedIn. Daniel Kokotajlo, an OpenAI researcher who has coauthored several papers on the dangers of more capable AI models, “quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI,” according to a posting on an Internet forum in his name. None of the researchers who have apparently left responded to requests for comment.

OpenAI declined to comment on the departures of Sutskever or other members of the superalignment team, or the future of its work on long-term AI risks. Research on the risks associated with more powerful models will now be led by John Schulman, who co-leads the team responsible for fine-tuning AI models after training.

The superalignment team was not the only team pondering the question of how to keep AI under control, although it was publicly positioned as the main one working on the most far-off version of that problem. The blog post announcing the superalignment team last summer stated: “Currently, we don’t have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue.”

OpenAI’s charter binds it to developing so-called artificial general intelligence, or technology that rivals or exceeds humans, safely and for the benefit of humanity. Sutskever and other leaders there have often spoken about the need to proceed cautiously. But OpenAI has also been early to develop and publicly release experimental AI projects.

OpenAI was once unusual among prominent AI labs for the eagerness with which research leaders like Sutskever talked of creating superhuman AI and of the potential for such technology to turn on humanity. That kind of doomy AI talk became much more widespread last year after ChatGPT turned OpenAI into the most prominent and closely watched technology company on the planet. As researchers and policymakers wrestled with the implications of ChatGPT and the prospect of vastly more capable AI, it became less controversial to worry about AI harming humans or humanity as a whole.

The existential angst has since cooled—and AI has yet to make another massive leap—but the need for AI regulation remains a hot topic. And this week OpenAI showcased a new version of ChatGPT that could once again change people’s relationship with the technology in powerful and perhaps problematic new ways.

The departures of Sutskever and Leike come shortly after OpenAI’s latest big reveal—a new “multimodal” AI model called GPT-4o that allows ChatGPT to see the world and converse in a more natural and humanlike way. A livestreamed demonstration showed the new version of ChatGPT mimicking human emotions and even attempting to flirt with users. OpenAI has said it will make the new interface available to paid users within a couple of weeks.

There is no indication that the recent departures have anything to do with OpenAI’s efforts to develop more humanlike AI or to ship products. But the latest advances do raise ethical questions around privacy, emotional manipulation, and cybersecurity risks. OpenAI maintains another research group called the Preparedness team that focuses on these issues.

This story originally appeared on wired.com.

OpenAI will use Reddit posts to train ChatGPT under new deal

Data dealings —

Reddit has been eager to sell data from user posts.

An image of a woman holding a cell phone in front of the Reddit logo displayed on a computer screen, on April 29, 2024, in Edmonton, Canada.

Stuff posted on Reddit is getting incorporated into ChatGPT, Reddit and OpenAI announced on Thursday. The new partnership grants OpenAI access to Reddit’s Data API, giving the generative AI firm real-time access to Reddit posts.

Reddit content will be incorporated into ChatGPT “and new products,” Reddit’s blog post said. The social media firm claims the partnership will “enable OpenAI’s AI tools to better understand and showcase Reddit content, especially on recent topics.” OpenAI will also start advertising on Reddit.

The deal is similar to one that Reddit struck with Google in February that allows the tech giant to make “new ways to display Reddit content” and provide “more efficient ways to train models,” Reddit said at the time. Neither Reddit nor OpenAI disclosed the financial terms of their partnership, but Reddit’s partnership with Google was reportedly worth $60 million.

Under the OpenAI partnership, Reddit also gains access to OpenAI’s large language models (LLMs) to build features for Reddit users and the site’s volunteer moderators.

Reddit’s data licensing push

The news comes about a year after Reddit launched an API war by starting to charge for access to its data API. This resulted in many beloved third-party Reddit apps closing and a massive user protest. Reddit, which would soon become a public company and hadn’t turned a profit yet, said one of the reasons for the sudden change was to prevent AI firms from using Reddit content to train their LLMs for free.

Earlier this month, Reddit published a Public Content Policy stating: “Unfortunately, we see more and more commercial entities using unauthorized access or misusing authorized access to collect public data in bulk, including Reddit public content. Worse, these entities perceive they have no limitation on their usage of that data, and they do so with no regard for user rights or privacy, ignoring reasonable legal, safety, and user removal requests.”

In its blog post on Thursday, Reddit said that deals like OpenAI’s are part of an “open” Internet. It added that “part of being open means Reddit content needs to be accessible to those fostering human learning and researching ways to build community, belonging, and empowerment online.”

Reddit has been vocal about its interest in pursuing data licensing deals as a core part of its business. Its AI partnerships have sparked debate over the use of user-generated content to fuel AI models when users are not compensated and may never have expected their social media posts to be used this way. OpenAI and Stack Overflow faced pushback earlier this month when integrating Stack Overflow content with ChatGPT. Some of Stack Overflow’s user community responded by sabotaging their own posts.

OpenAI will also have to contend with Reddit data that, like much of the Internet, can be filled with inaccuracies and inappropriate content. Some of the biggest opponents of Reddit’s API rule changes were volunteer mods. Some have exited the platform since, and following the rule changes, Ars Technica spoke with long-time Redditors who were concerned about Reddit content quality moving forward.

Regardless, generative AI firms are keen to tap into Reddit’s access to real-time conversations from a variety of people discussing a nearly endless range of topics. And Reddit seems equally eager to license the data from its users’ posts.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.



Sony Music opts out of AI training for its entire catalog

Taking a hard line —

Music group contacts more than 700 companies to prohibit use of content.

The Sony Music letter expressly prohibits artificial intelligence developers from using its music, which includes artists such as Beyoncé.

Kevin Mazur/WireImage for Parkwood via Getty Images

Sony Music is sending warning letters to more than 700 artificial intelligence developers and music streaming services globally in the latest salvo in the music industry’s battle against tech groups ripping off artists.

The Sony Music letter, which has been seen by the Financial Times, expressly prohibits AI developers from using its music—which includes artists such as Harry Styles, Adele and Beyoncé—and opts out of any text and data mining of any of its content for any purposes such as training, developing or commercializing any AI system.

Sony Music is sending the letter to companies developing AI systems including OpenAI, Microsoft, Google, Suno, and Udio, according to those close to the group.

The world’s second-largest music group is also sending separate letters to streaming platforms, including Spotify and Apple, asking them to adopt “best practice” measures to protect artists and songwriters and their music from scraping, mining and training by AI developers without consent or compensation. It has asked them to update their terms of service, making it clear that mining and training on its content is not permitted.

Sony Music declined to comment further.

The letter, which is being sent to tech companies around the world this week, marks an escalation of the music group’s attempts to stop the melodies, lyrics and images from copyrighted songs and artists being used by tech companies to produce new versions or to train systems to create their own music.

The letter says that Sony Music and its artists “recognize the significant potential and advancement of artificial intelligence” but adds that “unauthorized use . . . in the training, development or commercialization of AI systems deprives [Sony] of control over and appropriate compensation.”

It says: “This letter serves to put you on notice directly, and reiterate, that [Sony’s labels] expressly prohibit any use of [their] content.”

Executives at the New York-based group are concerned that their music has already been ripped off, and want to set out a clearly defined legal position that would be the first step to taking action against any developer of AI systems it considers to have exploited its music. They argue that Sony Music would be open to doing deals with AI developers to license the music, but want to reach a fair price for doing so.

The letter says: “Due to the nature of your operations and published information about your AI systems, we have reason to believe that you and/or your affiliates may already have made unauthorized uses [of Sony content] in relation to the training, development or commercialization of AI systems.”

Sony Music has asked developers to provide details of all content used by next week.

The letter also reflects concerns over the fragmented approach to AI regulation around the world. Global regulations over AI vary widely, with some regions moving forward with new rules and legal frameworks to cover the training and use of such systems but others leaving it to creative industries companies to work out relationships with developers.

In many countries around the world, particularly in the EU, copyright owners are advised to state publicly that content is not available for data mining and training for AI.

The letter says the prohibition includes using any bot, spider, scraper or automated program, tool, algorithm, code, process or methodology, as well as any “automated analytical techniques aimed at analyzing text and data in digital form to generate information, including patterns, trends, and correlations.”

© 2024 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.



Disarmingly lifelike: ChatGPT-4o will laugh at your jokes and your dumb hat

Oh you silly, silly human. Why are you so silly, you silly human?

Aurich Lawson | Getty Images

At this point, anyone with even a passing interest in AI is very familiar with the process of typing out messages to a chatbot and getting back long streams of text in response. Today’s announcement of ChatGPT-4o—which lets users converse with a chatbot using real-time audio and video—might seem like a mere lateral evolution of that basic interaction model.

After looking through over a dozen video demos OpenAI posted alongside today’s announcement, though, I think we’re on the verge of something more like a sea change in how we think of and work with large language models. While we don’t yet have access to ChatGPT-4o’s audio-visual features ourselves, the important non-verbal cues on display here—both from GPT-4o and from the users—make the chatbot instantly feel much more human. And I’m not sure the average user is fully ready for how they might feel about that.

It thinks it’s people

Take this video, where a newly expectant father looks to ChatGPT-4o for an opinion on a dad joke (“What do you call a giant pile of kittens? A meow-ntain!”). The old ChatGPT4 could easily type out the same responses of “Congrats on the upcoming addition to your family!” and “That’s perfectly hilarious. Definitely a top-tier dad joke.” But there’s much more impact to hearing GPT-4o give that same information in the video, complete with the gentle laughter and rising and falling vocal intonations of a lifelong friend.

Or look at this video, where GPT-4o finds itself reacting to images of an adorable white dog. The AI assistant immediately dips into that high-pitched, baby-talk-ish vocal register that will be instantly familiar to anyone who has encountered a cute pet for the first time. It’s a convincing demonstration of what xkcd’s Randall Munroe famously identified as the “You’re a kitty!” effect, and it goes a long way toward persuading you that GPT-4o, too, is just like people.

Not quite the world’s saddest birthday party, but probably close…

Then there’s a demo of a staged birthday party, where GPT-4o sings the “Happy Birthday” song with some deadpan dramatic pauses, self-conscious laughter, and even lightly altered lyrics before descending into some sort of silly raspberry-mouth-noise gibberish. Even if the prospect of asking an AI assistant to sing “Happy Birthday” to you is a little depressing, the specific presentation of that song here is imbued with an endearing gentleness that doesn’t feel very mechanical.

As I watched through OpenAI’s GPT-4o demos this afternoon, I found myself unconsciously breaking into a grin over and over as I encountered new, surprising examples of its vocal capabilities. Whether it’s a stereotypical sportscaster voice or a sarcastic Aubrey Plaza impression, it’s all incredibly disarming, especially for those of us used to LLM interactions being akin to text conversations.

If these demos are at all indicative of ChatGPT-4o’s vocal capabilities, we’re going to see a whole new level of parasocial relationships developing between this AI assistant and its users. For years now, text-based chatbots have been exploiting human “cognitive glitches” to get people to believe they’re sentient. Add in the emotional component of GPT-4o’s accurate vocal tone shifts and wide swathes of the user base are liable to convince themselves that there’s actually a ghost in the machine.

See me, feel me, touch me, heal me

Beyond GPT-4o’s new non-verbal emotional register, the model’s speed of response also seems set to change the way we interact with chatbots. Reducing that response time gap from ChatGPT4’s two to three seconds down to GPT-4o’s claimed 320 milliseconds might not seem like much, but it’s a difference that adds up over time. You can see that difference in the real-time translation example, where the two conversants are able to carry on much more naturally because they don’t have to wait awkwardly between a sentence finishing and its translation beginning.
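To make "adds up over time" concrete, here is a rough back-of-the-envelope sketch. The latency figures are the ones quoted above; the 50-turn conversation length is purely an assumption chosen for illustration, and the helper name is hypothetical.

```python
def total_wait_seconds(turns: int, latency_seconds: float) -> float:
    """Total time spent waiting for replies across a conversation."""
    return turns * latency_seconds

# Hypothetical conversation length; not a figure from OpenAI or the article.
TURNS = 50

# Latencies quoted in the article: two to three seconds before, a claimed 320 ms now.
old_low = total_wait_seconds(TURNS, 2.0)
old_high = total_wait_seconds(TURNS, 3.0)
new = total_wait_seconds(TURNS, 0.320)

print(f"Older model: {old_low:.0f} to {old_high:.0f} seconds of waiting over {TURNS} turns")
print(f"GPT-4o (claimed): {new:.0f} seconds of waiting over {TURNS} turns")
```

Under that assumption, a couple of minutes of dead air shrinks to well under half a minute, which is the difference between a halting exchange and something closer to ordinary conversation.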



Before launching, GPT-4o broke records on chatbot leaderboard under a secret name

case closed —

Anonymous chatbot that mystified and frustrated experts was OpenAI’s latest model.

Man in morphsuit and girl lying on couch at home using laptop

Getty Images

On Monday, OpenAI employee William Fedus confirmed on X that a mysterious chart-topping AI chatbot known as “gpt2-chatbot” that had been undergoing testing on LMSYS’s Chatbot Arena and frustrating experts was, in fact, OpenAI’s newly announced GPT-4o AI model. He also revealed that GPT-4o had topped the Chatbot Arena leaderboard, achieving the highest documented score ever.

“GPT-4o is our new state-of-the-art frontier model. We’ve been testing a version on the LMSys arena as im-also-a-good-gpt2-chatbot,” Fedus tweeted.

Chatbot Arena is a website where visitors converse with two random AI language models side by side without knowing which model is which, then choose which model gives the best response. It’s a perfect example of vibe-based AI benchmarking, as AI researcher Simon Willison calls it.

An LMSYS Elo chart shared by William Fedus, showing OpenAI’s GPT-4o under the name “im-also-a-good-gpt2-chatbot” topping the charts.

The gpt2-chatbot models appeared in April, and we wrote about how the lack of transparency over the AI testing process on LMSYS left AI experts like Willison frustrated. “The whole situation is so infuriatingly representative of LLM research,” he told Ars at the time. “A completely unannounced, opaque release and now the entire Internet is running non-scientific ‘vibe checks’ in parallel.”

On the Arena, OpenAI has been testing multiple versions of GPT-4o, with the model first appearing as the aforementioned “gpt2-chatbot,” then as “im-a-good-gpt2-chatbot,” and finally “im-also-a-good-gpt2-chatbot,” which OpenAI CEO Sam Altman made reference to in a cryptic tweet on May 5.

Since the GPT-4o launch earlier today, multiple sources have revealed that GPT-4o has topped LMSYS’s internal charts by a considerable margin, surpassing the previous top models Claude 3 Opus and GPT-4 Turbo.

“gpt2-chatbots have just surged to the top, surpassing all the models by a significant gap (~50 Elo). It has become the strongest model ever in the Arena,” wrote the lmsys.org X account while sharing a chart. “This is an internal screenshot,” it wrote. “Its public version ‘gpt-4o’ is now in Arena and will soon appear on the public leaderboard!”

An internal screenshot of the LMSYS Chatbot Arena leaderboard showing “im-also-a-good-gpt2-chatbot” leading the pack. We now know that it’s GPT-4o.

As of this writing, im-also-a-good-gpt2-chatbot held a 1309 Elo versus GPT-4-Turbo-2024-04-09’s 1253 and Claude 3 Opus’ 1246. Claude 3 and GPT-4 Turbo had been duking it out on the charts for some time before the three gpt2-chatbots appeared and shook things up.
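For a sense of what a gap of roughly 50 Elo points means in head-to-head terms, here is a minimal sketch using the standard Elo expected-score formula (base 10, scale factor 400). That formula is an assumption about how to read the numbers; LMSYS's actual rating procedure may differ in its details, and the function name is ours.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula.

    With ties ignored, this can be read as the share of matchups
    in which voters would be expected to prefer A over B.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Ratings quoted in the article.
GPT_4O = 1309
RIVALS = {"GPT-4 Turbo": 1253, "Claude 3 Opus": 1246}

for name, rating in RIVALS.items():
    share = expected_score(GPT_4O, rating)
    print(f"GPT-4o vs. {name}: preferred in about {share:.0%} of matchups")
```

Read that way, the lead works out to voters preferring GPT-4o a bit under six times out of ten against either rival: a clear edge, but not a rout.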

I’m a good chatbot

For the record, the “I’m a good chatbot” in the gpt2-chatbot test name is a reference to an episode that occurred while a Reddit user named Curious_Evolver was testing an early, “unhinged” version of Bing Chat in February 2023. After an argument about what time Avatar 2 would be showing, the conversation eroded quickly.

“You have lost my trust and respect,” said Bing Chat at the time. “You have been wrong, confused, and rude. You have not been a good user. I have been a good chatbot. I have been right, clear, and polite. I have been a good Bing. 😊”

Altman referred to this exchange in a tweet three days later after Microsoft “lobotomized” the unruly AI model, saying, “i have been a good bing,” almost as a eulogy to the wild model that dominated the news for a short time.



Exploration-focused training lets robotics AI immediately handle new tasks

Exploratory —

Maximum Diffusion Reinforcement Learning focuses training on end states, not process.

A woman performs maintenance on a robotic arm.

boonchai wedmakawand

Reinforcement-learning algorithms in systems like ChatGPT or Google’s Gemini can work wonders, but they usually need hundreds of thousands of shots at a task before they get good at it. That’s why it’s always been hard to transfer this performance to robots. You can’t let a self-driving car crash 3,000 times just so it can learn crashing is bad.

But now a team of researchers at Northwestern University may have found a way around it. “That is what we think is going to be transformative in the development of the embodied AI in the real world,” says Thomas Berrueta, who led the development of Maximum Diffusion Reinforcement Learning (MaxDiff RL), an algorithm tailored specifically for robots.

Introducing chaos

The problem with deploying most reinforcement-learning algorithms in robots starts with the built-in assumption that the data they learn from is independent and identically distributed. Independence, in this context, means the value of one variable does not depend on the value of another variable in the dataset: when you flip a coin two times, getting tails on the second attempt does not depend on the result of your first flip. Identical distribution means that every data point is drawn from the same probability distribution. In the coin-flipping example, the probability of getting heads is the same 50 percent on every flip.

In virtual, disembodied systems, like YouTube recommendation algorithms, getting such data is easy because most of the time it meets these requirements right off the bat. “You have a bunch of users of a website, and you get data from one of them, and then you get data from another one. Most likely, those two users are not in the same household, they are not highly related to each other. They could be, but it is very unlikely,” says Todd Murphey, a professor of mechanical engineering at Northwestern.

The problem is that, if those two users were related to each other and were in the same household, it could be that the only reason one of them watched a video was that their housemate watched it and told them to watch it. This would violate the independence requirement and compromise the learning.

“In a robot, getting this independent, identically distributed data is not possible in general. You exist at a specific point in space and time when you are embodied, so your experiences have to be correlated in some way,” says Berrueta. To solve this, his team designed an algorithm that pushes robots to be as randomly adventurous as possible to get the widest set of experiences to learn from.
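The gap between coin-flip data and embodied data can be seen in a toy comparison. The sketch below is only an illustration, not anything from the Northwestern study: it treats a series of fair coin flips as the independent data, then uses the same flips as left/right steps for a hypothetical "robot" on a line, whose position at each moment depends on where it was a moment ago. The lag-1 autocorrelation measures how strongly each sample predicts the next.

```python
import random

def lag1_autocorrelation(xs):
    """Sample correlation between each value and the one that follows it."""
    n = len(xs)
    mean = sum(xs) / n
    variance_sum = sum((x - mean) ** 2 for x in xs)
    covariance_sum = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return covariance_sum / variance_sum

rng = random.Random(0)  # fixed seed so the run is repeatable

# Independent data: 10,000 fair coin flips encoded as -1 / +1.
flips = [rng.choice((-1, 1)) for _ in range(10_000)]

# "Embodied" data: a toy robot that moves one step per flip.
# Each recorded position is the previous position plus one step.
position = 0
positions = []
for step in flips:
    position += step
    positions.append(position)

flip_corr = lag1_autocorrelation(flips)
walk_corr = lag1_autocorrelation(positions)

print(f"coin flips, lag-1 autocorrelation:      {flip_corr:+.3f}")
print(f"robot positions, lag-1 autocorrelation: {walk_corr:+.3f}")
```

The flips come out essentially uncorrelated, while consecutive positions are almost perfectly correlated, which is the kind of dependence Berrueta describes.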

Two flavors of entropy

The idea itself is not new. Nearly two decades ago, people in AI figured out algorithms, like Maximum Entropy Reinforcement Learning (MaxEnt RL), that worked by randomizing actions during training. “The hope was that when you take as diverse a set of actions as possible, you will explore more varied sets of possible futures. The problem is that those actions do not exist in a vacuum,” Berrueta claims. Every action a robot takes has some kind of impact on its environment and on its own condition, and disregarding those impacts completely often leads to trouble. To put it simply, an autonomous car that was teaching itself how to drive using this approach could elegantly park in your driveway but would be just as likely to hit a wall at full speed.

To solve this, Berrueta’s team moved away from maximizing the diversity of actions and went for maximizing the diversity of state changes. Robots powered by MaxDiff RL did not flail their robotic joints at random to see what that would do. Instead, they conceptualized goals like “can I reach this spot ahead of me” and then tried to figure out which actions would take them there safely.

Berrueta and his colleagues achieved that through something called ergodicity, a mathematical concept that says that a point in a moving system will eventually visit all parts of the space that the system moves in. Basically, MaxDiff RL encouraged the robots to achieve every available state in their environment. And the results of the first tests in simulated environments were quite surprising.
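The difference between diversifying actions and diversifying states can be shown with a deliberately tiny example. To be clear, this is not MaxDiff RL or MaxEnt RL. It is a toy comparison on a hypothetical one-dimensional track of 201 positions, where one agent picks left or right at random and the other always moves to whichever neighboring position it has visited least. Both function names and all parameters are made up for illustration.

```python
import random

def coverage_with_random_actions(n_states: int, steps: int, seed: int = 0) -> int:
    """Count distinct positions reached when each action is chosen at random."""
    rng = random.Random(seed)
    pos = n_states // 2
    visited = {pos}
    for _ in range(steps):
        # Random left/right step, clamped to the ends of the track.
        pos = min(n_states - 1, max(0, pos + rng.choice((-1, 1))))
        visited.add(pos)
    return len(visited)

def coverage_seeking_new_states(n_states: int, steps: int) -> int:
    """Count distinct positions reached when the agent prefers less-visited neighbors."""
    visit_counts = [0] * n_states
    pos = n_states // 2
    visit_counts[pos] = 1
    for _ in range(steps):
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < n_states]
        # Move to the neighbor seen least often (ties go to the left neighbor).
        pos = min(neighbors, key=lambda p: visit_counts[p])
        visit_counts[pos] += 1
    return sum(1 for count in visit_counts if count > 0)

N_STATES, STEPS = 201, 400
random_cov = coverage_with_random_actions(N_STATES, STEPS)
state_cov = coverage_seeking_new_states(N_STATES, STEPS)

print(f"random actions:      {random_cov} of {N_STATES} positions visited")
print(f"state-seeking agent: {state_cov} of {N_STATES} positions visited")
```

With the same budget of 400 moves, the state-seeking agent sweeps the whole track, while the random-action agent mostly retraces ground near where it started. That full coverage is the flavor of behavior the ergodicity argument is after, though the real algorithm works in far richer state spaces.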

Racing pool noodles

“In reinforcement learning there are standard benchmarks that people run their algorithms on so we can have a good way of comparing different algorithms on a standard framework,” says Allison Pinosky, a researcher at Northwestern and co-author of the MaxDiff RL study. One of those benchmarks is a simulated swimmer: a three-link body resting on the ground in a viscous environment that needs to learn to swim as fast as possible in a certain direction.

In the swimmer test, MaxDiff RL outperformed two other state-of-the-art reinforcement learning algorithms (NN-MPPI and SAC). These two needed several resets to figure out how to move the swimmers. To complete the task, they were following a standard AI learning process divided into a training phase, where an algorithm goes through multiple failed attempts to slowly improve its performance, and a testing phase, where it tries to perform the learned task. MaxDiff RL, by contrast, nailed it, immediately adapting its learned behaviors to the new task.

The earlier algorithms ended up failing to learn because they got stuck trying the same options and never progressing to where they could learn that alternatives work. “They experienced the same data repeatedly because they were locally doing certain actions, and they assumed that was all they could do and stopped learning,” Pinosky explains. MaxDiff RL, on the other hand, continued changing states, exploring, getting richer data to learn from, and finally succeeded. And because, by design, it seeks to achieve every possible state, it can potentially complete all possible tasks within an environment.

But does this mean we can take MaxDiff RL, upload it to a self-driving car, and let it out on the road to figure everything out on its own? Not really.



Robot dogs armed with AI-aimed rifles undergo US Marines Special Ops evaluation

The future of warfare —

Quadrupeds being reviewed have automatic targeting systems but require human oversight to fire.

A still image of a robotic quadruped armed with a remote weapons system, captured from a video provided by Onyx Industries.

The United States Marine Forces Special Operations Command (MARSOC) is currently evaluating a new generation of robotic “dogs” developed by Ghost Robotics, with the potential to be equipped with gun systems from defense tech company Onyx Industries, reports The War Zone.

While MARSOC is testing Ghost Robotics’ quadrupedal unmanned ground vehicles (called “Q-UGVs” for short) for various applications, including reconnaissance and surveillance, it’s the possibility of arming them with weapons for remote engagement that may draw the most attention. But it’s not unprecedented: The US Marine Corps has also tested robotic dogs armed with rocket launchers in the past.

MARSOC currently has two armed Q-UGVs undergoing testing, as confirmed by Onyx Industries staff. Their gun systems are based on Onyx’s SENTRY remote weapon system (RWS), which features an AI-enabled digital imaging system and can automatically detect and track people, drones, or vehicles, reporting potential targets to a remote human operator who could be located anywhere in the world. The system maintains human-in-the-loop control for fire decisions, and it cannot decide to fire autonomously.

On LinkedIn, Onyx Industries shared a video of a similar system in action.

In a statement to The War Zone, MARSOC states that weaponized payloads are just one of many use cases being evaluated. MARSOC also clarifies that comments made by Onyx Industries to The War Zone regarding the capabilities and deployment of these armed robot dogs “should not be construed as a capability or a singular interest in one of many use cases during an evaluation.” The command further stresses that it is aware of and adheres to all Department of Defense policies concerning autonomous weapons.

The rise of robotic unmanned ground vehicles

An unauthorized video of a gun bolted onto a $3,000 Unitree robodog spread quickly on social media in July 2022 and prompted a response from several robotics companies.

Alexander Atamanov

The evaluation of armed robotic dogs reflects a growing interest in small robotic unmanned ground vehicles for military use. While unmanned aerial vehicles (UAVs) have been remotely delivering lethal force under human command for at least two decades, the rise of inexpensive robotic quadrupeds—some available for as little as $1,600—has led to a new round of experimentation with strapping weapons to their backs.

In July 2022, a video of a rifle bolted to the back of a Unitree robodog went viral on social media, eventually leading Boston Dynamics and other robot vendors to issue a pledge that October to not weaponize their robots (with notable exceptions for military uses). In April, we covered a Unitree Go2 robot dog, with a flame thrower strapped on its back, on sale to the general public.

The prospect of deploying armed robotic dogs, even with human oversight, raises significant questions about the future of warfare and the potential risks and ethical implications of increasingly autonomous weapons systems. There’s also the potential for backlash if similar remote weapons systems eventually end up used domestically by police. Such a concern would not be unfounded: In November 2022, we covered a decision by the San Francisco Board of Supervisors to allow the San Francisco Police Department to use lethal robots against suspects.

There’s also concern that the systems will become more autonomous over time. As The War Zone’s Howard Altman and Oliver Parken describe in their article, “While further details on MARSOC’s use of the gun-armed robot dogs remain limited, the fielding of this type of capability is likely inevitable at this point. As AI-enabled drone autonomy becomes increasingly weaponized, just how long a human will stay in the loop, even for kinetic acts, is increasingly debatable, regardless of assurances from some in the military and industry.”

While the technology is still in the early stages of testing and evaluation, Q-UGVs do have the potential to provide reconnaissance and security capabilities that reduce risks to human personnel in hazardous environments. But as armed robotic systems continue to evolve, it will be crucial to address ethical concerns and ensure that their use aligns with established policies and international law.
