openai

openai-unveils-easy-voice-assistant-creation-at-2024-developer-event

OpenAI unveils easy voice assistant creation at 2024 developer event

Developers developers developers —

Altman steps back from the keynote limelight and lets four major API additions do the talking.

A glowing OpenAI logo on a blue background.

Benj Edwards

On Monday, OpenAI kicked off its annual DevDay event in San Francisco, unveiling four major API updates for developers that integrate the company’s AI models into their products. Unlike last year’s single-location event featuring a keynote by CEO Sam Altman, DevDay 2024 is more than just one day, adopting a global approach with additional events planned for London on October 30 and Singapore on November 21.

The San Francisco event, which was invitation-only and closed to press, featured on-stage speakers going through technical presentations. Perhaps the most notable new API feature is the Realtime API, now in public beta, which supports speech-to-speech conversations using six preset voices and enables developers to build features very similar to ChatGPT’s Advanced Voice Mode (AVM) into their applications.

OpenAI says that the Realtime API streamlines the process of creating voice assistants. Previously, developers had to use multiple models for speech recognition, text processing, and text-to-speech conversion. Now, they can handle the entire process with a single API call.

The company plans to add audio input and output capabilities to its Chat Completions API in the next few weeks, allowing developers to input text or audio and receive responses in either format.

Two new options for cheaper inference

OpenAI also announced two features that may help developers balance performance and cost when making AI applications. “Model distillation” offers a way for developers to fine-tune (customize) smaller, cheaper models like GPT-4o mini using outputs from more advanced models such as GPT-4o and o1-preview. This potentially allows developers to get more relevant and accurate outputs while running the cheaper model.

Also, OpenAI announced “prompt caching,” a feature similar to one introduced by Anthropic for its Claude API in August. It speeds up inference (the AI model generating outputs) by remembering frequently used prompts (input tokens). Along the way, the feature provides a 50 percent discount on input tokens and faster processing times by reusing recently seen input tokens.

And last but not least, the company expanded its fine-tuning capabilities to include images (what it calls “vision fine-tuning”), allowing developers to customize GPT-4o by feeding it both custom images and text. Basically, developers can teach the multimodal version of GPT-4o to visually recognize certain things. OpenAI says the new feature opens up possibilities for improved visual search functionality, more accurate object detection for autonomous vehicles, and possibly enhanced medical image analysis.

Where’s the Sam Altman keynote?

OpenAI CEO Sam Altman speaks during the OpenAI DevDay event on November 6, 2023, in San Francisco.

Enlarge / OpenAI CEO Sam Altman speaks during the OpenAI DevDay event on November 6, 2023, in San Francisco.

Getty Images

Unlike last year, DevDay isn’t being streamed live, though OpenAI plans to post content later on its YouTube channel. The event’s programming includes breakout sessions, community spotlights, and demos. But the biggest change since last year is the lack of a keynote appearance from the company’s CEO. This year, the keynote was handled by the OpenAI product team.

On last year’s inaugural DevDay, November 6, 2023, OpenAI CEO Sam Altman delivered a Steve Jobs-style live keynote to assembled developers, OpenAI employees, and the press. During his presentation, Microsoft CEO Satya Nadella made a surprise appearance, talking up the partnership between the companies.

Eleven days later, the OpenAI board fired Altman, triggering a week of turmoil that resulted in Altman’s return as CEO and a new board of directors. Just after the firing, Kara Swisher relayed insider sources that said Altman’s DevDay keynote and the introduction of the GPT store had been a precipitating factor in the firing (though not the key factor) due to some internal disagreements over the company’s more consumer-like direction since the launch of ChatGPT.

With that history in mind—and the focus on developers above all else for this event—perhaps the company decided it was best to let Altman step away from the keynote and let OpenAI’s technology become the key focus of the event instead of him. We are purely speculating on that point, but OpenAI has certainly experienced its share of drama over the past month, so it may have been a prudent decision.

Despite the lack of a keynote, Altman is present at Dev Day San Francisco today and is scheduled to do a closing “fireside chat” at the end (which has not yet happened as of this writing). Also, Altman made a statement about DevDay on X, noting that since last year’s DevDay, OpenAI had seen some dramatic changes (literally):

From last devday to this one:

*98% decrease in cost per token from GPT-4 to 4o mini

*50x increase in token volume across our systems

*excellent model intelligence progress

*(and a little bit of drama along the way)

In a follow-up tweet delivered in his trademark lowercase, Altman shared a forward-looking message that referenced the company’s quest for human-level AI, often called AGI: “excited to make even more progress from this devday to the next one,” he wrote. “the path to agi has never felt more clear.”

OpenAI unveils easy voice assistant creation at 2024 developer event Read More »

apple-backs-out-of-backing-openai,-report-claims

Apple backs out of backing OpenAI, report claims

ChatGPT —

Apple dropped out of the $6.5 billion investment round at the 11th hour.

The Apple Park campus in Cupertino, California.

Enlarge / The Apple Park campus in Cupertino, California.

A few weeks back, it was reported that Apple was exploring investing in OpenAI, the company that makes ChatGPT, the GPT model, and other popular generative AI products. Now, a new report from The Wall Street Journal claims that Apple has abandoned those plans.

The article simply says Apple “fell out of the talks to join the round.” The round is expected to close in a week or so and may raise as much as $6.5 billion for the growing Silicon Valley company. Had Apple gone through with the move, it would have been a rare event—though not completely unprecedented—for Apple to invest in another company that size.

OpenAI is still expected to raise the funds it seeks from other sources. The report claims Microsoft is expected to invest around $1 billion in this round. Microsoft has already invested substantial sums in OpenAI, whose GPT models power Microsoft AI tools like Copilot and Bing chat.

Nvidia is also a likely major investor in this round.

Apple will soon offer limited ChatGPT integration in an upcoming iOS update, though it plans to support additional models like Google’s Gemini further down the line, offering users a choice similar to how they pick a default search engine or web browser.

OpenAI has been on a successful tear with its products and models, establishing itself as a leader in the rapidly growing industry. However, it has also been beset by drama and controversy—most recently, some key leaders at OpenAI departed the company abruptly, and it shifted its focus from a research-focused organization that was beholden to a nonprofit, to a for-profit company under CEO Sam Altman. Also, former Apple design lead Jony Ive is confirmed to be working on a new AI product of some kind.

But The Wall Street Journal did not specify which (if any) of these facts are reasons why Apple chose to back out of the investment.

Apple backs out of backing OpenAI, report claims Read More »

man-tricks-openai’s-voice-bot-into-duet-of-the-beatles’-“eleanor-rigby”

Man tricks OpenAI’s voice bot into duet of The Beatles’ “Eleanor Rigby”

A screen capture of AJ Smith doing his Eleanor Rigby duet with OpenAI's Advanced Voice Mode through the ChatGPT app.

Enlarge / A screen capture of AJ Smith doing his Eleanor Rigby duet with OpenAI’s Advanced Voice Mode through the ChatGPT app.

OpenAI’s new Advanced Voice Mode (AVM) of its ChatGPT AI assistant rolled out to subscribers on Tuesday, and people are already finding novel ways to use it, even against OpenAI’s wishes. On Thursday, a software architect named AJ Smith tweeted a video of himself playing a duet of The Beatles’ 1966 song “Eleanor Rigby” with AVM. In the video, Smith plays the guitar and sings, with the AI voice interjecting and singing along sporadically, praising his rendition.

“Honestly, it was mind-blowing. The first time I did it, I wasn’t recording and literally got chills,” Smith told Ars Technica via text message. “I wasn’t even asking it to sing along.”

Smith is no stranger to AI topics. In his day job, he works as associate director of AI Engineering at S&P Global. “I use [AI] all the time and lead a team that uses AI day to day,” he told us.

In the video, AVM’s voice is a little quavery and not pitch-perfect, but it appears to know something about “Eleanor Rigby’s” melody when it first sings, “Ah, look at all the lonely people.” After that, it seems to be guessing at the melody and rhythm as it recites song lyrics. We have also convinced Advanced Voice Mode to sing, and it did a perfect melodic rendition of “Happy Birthday” after some coaxing.

AJ Smith’s video of singing a duet with OpenAI’s Advanced Voice Mode.

Normally, when you ask AVM to sing, it will reply something like, “My guidelines won’t let me talk about that.” That’s because in the chatbot’s initial instructions (called a “system prompt“), OpenAI instructs the voice assistant not to sing or make sound effects (“Do not sing or hum,” according to one system prompt leak).

OpenAI possibly added this restriction because AVM may otherwise reproduce copyrighted content, such as songs that were found in the training data used to create the AI model itself. That’s what is happening here to a limited extent, so in a sense, Smith has discovered a form of what researchers call a “prompt injection,” which is a way of convincing an AI model to produce outputs that go against its system instructions.

How did Smith do it? He figured out a game that reveals AVM knows more about music than it may let on in conversation. “I just said we’d play a game. I’d play the four pop chords and it would shout out songs for me to sing along with those chords,” Smith told us. “Which did work pretty well! But after a couple songs it started to sing along. Already it was such a unique experience, but that really took it to the next level.”

This is not the first time humans have played musical duets with computers. That type of research stretches back to the 1970s, although it was typically limited to reproducing musical notes or instrumental sounds. But this is the first time we’ve seen anyone duet with an audio-synthesizing voice chatbot in real time.

Man tricks OpenAI’s voice bot into duet of The Beatles’ “Eleanor Rigby” Read More »

google-and-meta-update-their-ai-models-amid-the-rise-of-“alphachip”

Google and Meta update their AI models amid the rise of “AlphaChip”

Running the AI News Gauntlet —

News about Gemini updates, Llama 3.2, and Google’s new AI-powered chip designer.

Cyberpunk concept showing a man running along a futuristic path full of monitors.

Enlarge / There’s been a lot of AI news this week, and covering it sometimes feels like running through a hall full of danging CRTs, just like this Getty Images illustration.

It’s been a wildly busy week in AI news thanks to OpenAI, including a controversial blog post from CEO Sam Altman, the wide rollout of Advanced Voice Mode, 5GW data center rumors, major staff shake-ups, and dramatic restructuring plans.

But the rest of the AI world doesn’t march to the same beat, doing its own thing and churning out new AI models and research by the minute. Here’s a roundup of some other notable AI news from the past week.

Google Gemini updates

On Tuesday, Google announced updates to its Gemini model lineup, including the release of two new production-ready models that iterate on past releases: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002. The company reported improvements in overall quality, with notable gains in math, long context handling, and vision tasks. Google claims a 7 percent increase in performance on the MMLU-Pro benchmark and a 20 percent improvement in math-related tasks. But as you know, if you’ve been reading Ars Technica for a while, AI typically benchmarks aren’t as useful as we would like them to be.

Along with model upgrades, Google introduced substantial price reductions for Gemini 1.5 Pro, cutting input token costs by 64 percent and output token costs by 52 percent for prompts under 128,000 tokens. As AI researcher Simon Willison noted on his blog, “For comparison, GPT-4o is currently $5/[million tokens] input and $15/m output and Claude 3.5 Sonnet is $3/m input and $15/m output. Gemini 1.5 Pro was already the cheapest of the frontier models and now it’s even cheaper.”

Google also increased rate limits, with Gemini 1.5 Flash now supporting 2,000 requests per minute and Gemini 1.5 Pro handling 1,000 requests per minute. Google reports that the latest models offer twice the output speed and three times lower latency compared to previous versions. These changes may make it easier and more cost-effective for developers to build applications with Gemini than before.

Meta launches Llama 3.2

On Wednesday, Meta announced the release of Llama 3.2, a significant update to its open-weights AI model lineup that we have covered extensively in the past. The new release includes vision-capable large language models (LLMs) in 11 billion and 90B parameter sizes, as well as lightweight text-only models of 1B and 3B parameters designed for edge and mobile devices. Meta claims the vision models are competitive with leading closed-source models on image recognition and visual understanding tasks, while the smaller models reportedly outperform similar-sized competitors on various text-based tasks.

Willison did some experiments with some of the smaller 3.2 models and reported impressive results for the models’ size. AI researcher Ethan Mollick showed off running Llama 3.2 on his iPhone using an app called PocketPal.

Meta also introduced the first official “Llama Stack” distributions, created to simplify development and deployment across different environments. As with previous releases, Meta is making the models available for free download, with license restrictions. The new models support long context windows of up to 128,000 tokens.

Google’s AlphaChip AI speeds up chip design

On Thursday, Google DeepMind announced what appears to be a significant advancement in AI-driven electronic chip design, AlphaChip. It began as a research project in 2020 and is now a reinforcement learning method for designing chip layouts. Google has reportedly used AlphaChip to create “superhuman chip layouts” in the last three generations of its Tensor Processing Units (TPUs), which are chips similar to GPUs designed to accelerate AI operations. Google claims AlphaChip can generate high-quality chip layouts in hours, compared to weeks or months of human effort. (Reportedly, Nvidia has also been using AI to help design its chips.)

Notably, Google also released a pre-trained checkpoint of AlphaChip on GitHub, sharing the model weights with the public. The company reported that AlphaChip’s impact has already extended beyond Google, with chip design companies like MediaTek adopting and building on the technology for their chips. According to Google, AlphaChip has sparked a new line of research in AI for chip design, potentially optimizing every stage of the chip design cycle from computer architecture to manufacturing.

That wasn’t everything that happened, but those are some major highlights. With the AI industry showing no signs of slowing down at the moment, we’ll see how next week goes.

Google and Meta update their AI models amid the rise of “AlphaChip” Read More »

openai’s-murati-shocks-with-sudden-departure-announcement

OpenAI’s Murati shocks with sudden departure announcement

thinning crowd —

OpenAI CTO’s resignation coincides with news about the company’s planned restructuring.

Mira Murati, Chief Technology Officer of OpenAI, speaks during The Wall Street Journal's WSJ Tech Live Conference in Laguna Beach, California on October 17, 2023.

Enlarge / Mira Murati, Chief Technology Officer of OpenAI, speaks during The Wall Street Journal’s WSJ Tech Live Conference in Laguna Beach, California on October 17, 2023.

On Wednesday, OpenAI Chief Technical Officer Mira Murati announced she is leaving the company in a surprise resignation shared on the social network X. Murati joined OpenAI in 2018, serving for six-and-a-half years in various leadership roles, most recently as the CTO.

“After much reflection, I have made the difficult decision to leave OpenAI,” she wrote in a letter to the company’s staff. “While I’ll express my gratitude to many individuals in the coming days, I want to start by thanking Sam and Greg for their trust in me to lead the technical organization and for their support throughout the years,” she continued, referring to OpenAI CEO Sam Altman and President Greg Brockman. “There’s never an ideal time to step away from a place one cherishes, yet this moment feels right.”

At OpenAI, Murati was in charge of overseeing the company’s technical strategy and product development, including the launch and improvement of DALL-E, Codex, Sora, and the ChatGPT platform, while also leading research and safety teams. In public appearances, Murati often spoke about ethical considerations in AI development.

Murati’s decision to leave the company comes when OpenAI finds itself at a major crossroads with a plan to alter its nonprofit structure. According to a Reuters report published today, OpenAI is working to reorganize its core business into a for-profit benefit corporation, removing control from its nonprofit board. The move, which would give CEO Sam Altman equity in the company for the first time, could potentially value OpenAI at $150 billion.

Murati stated her decision to leave was driven by a desire to “create the time and space to do my own exploration,” though she didn’t specify her future plans.

Proud of safety and research work

OpenAI CTO Mira Murati seen debuting GPT-4o during OpenAI's Spring Update livestream on May 13, 2024.

Enlarge / OpenAI CTO Mira Murati seen debuting GPT-4o during OpenAI’s Spring Update livestream on May 13, 2024.

OpenAI

In her departure announcement, Murati highlighted recent developments at OpenAI, including innovations in speech-to-speech technology and the release of OpenAI o1. She cited what she considers the company’s progress in safety research and the development of “more robust, aligned, and steerable” AI models.

Altman replied to Murati’s tweet directly, expressing gratitude for Murati’s contributions and her personal support during challenging times, likely referring to the tumultuous period in November 2023 when the OpenAI board of directors briefly fired Altman from the company.

It’s hard to overstate how much Mira has meant to OpenAI, our mission, and to us all personally,” he wrote. “I feel tremendous gratitude towards her for what she has helped us build and accomplish, but I most of all feel personal gratitude towards her for the support and love during all the hard times. I am excited for what she’ll do next.”

Not the first major player to leave

An image Ilya Sutskever tweeted with this OpenAI resignation announcement. From left to right: OpenAI Chief Scientist Jakub Pachocki, President Greg Brockman (on leave), Sutskever (now former Chief Scientist), CEO Sam Altman, and soon-to-be-former CTO Mira Murati.

Enlarge / An image Ilya Sutskever tweeted with this OpenAI resignation announcement. From left to right: OpenAI Chief Scientist Jakub Pachocki, President Greg Brockman (on leave), Sutskever (now former Chief Scientist), CEO Sam Altman, and soon-to-be-former CTO Mira Murati.

With Murati’s exit, Altman remains one of the few long-standing senior leaders at OpenAI, which has seen significant shuffling in its upper ranks recently. In May 2024, former Chief Scientist Ilya Sutskever left to form his own company, Safe Superintelligence, Inc. (SSI), focused on building AI systems that far surpass humans in logical capabilities. That came just six months after Sutskever’s involvement in the temporary removal of Altman as CEO.

John Schulman, an OpenAI co-founder, departed earlier in 2024 to join rival AI firm Anthropic, and in August, OpenAI President Greg Brockman announced he would be taking a temporary sabbatical until the end of the year.

The leadership shuffles have raised questions among critics about the internal dynamics at OpenAI under Altman and the state of OpenAI’s future research path, which has been aiming toward creating artificial general intelligence (AGI)—a hypothetical technology that could potentially perform human-level intellectual work.

“Question: why would key people leave an organization right before it was just about to develop AGI?” asked xAI developer Benjamin De Kraker in a post on X just after Murati’s announcement. “This is kind of like quitting NASA months before the moon landing,” he wrote in a reply. “Wouldn’t you wanna stick around and be part of it?”

Altman mentioned that more information about transition plans would be forthcoming, leaving questions about who will step into Murati’s role and how OpenAI will adapt to this latest leadership change as the company is poised to adopt a corporate structure that may consolidate more power directly under Altman. “We’ll say more about the transition plans soon, but for now, I want to take a moment to just feel thanks,” Altman wrote.

OpenAI’s Murati shocks with sudden departure announcement Read More »

donotpay-has-to-pay-$193k-for-falsely-touting-untested-ai-lawyer,-ftc-says

DoNotPay has to pay $193K for falsely touting untested AI lawyer, FTC says

DoNotPay has to pay $193K for falsely touting untested AI lawyer, FTC says

Among the first AI companies that the Federal Trade Commission has exposed as deceiving consumers is DoNotPay—which initially was advertised as “the world’s first robot lawyer” with the ability to “sue anyone with the click of a button.”

On Wednesday, the FTC announced that it took action to stop DoNotPay from making bogus claims after learning that the AI startup conducted no testing “to determine whether its AI chatbot’s output was equal to the level of a human lawyer.” DoNotPay also did not “hire or retain any attorneys” to help verify AI outputs or validate DoNotPay’s legal claims.

DoNotPay accepted no liability. But to settle the charges that DoNotPay violated the FTC Act, the AI startup agreed to pay $193,000, if the FTC’s consent agreement is confirmed following a 30-day public comment period. Additionally, DoNotPay agreed to warn “consumers who subscribed to the service between 2021 and 2023” about the “limitations of law-related features on the service,” the FTC said.

Moving forward, DoNotPay would also be prohibited under the settlement from making baseless claims that any of its features can be substituted for any professional service.

A DoNotPay spokesperson told Ars that the company “is pleased to have worked constructively with the FTC to settle this case and fully resolve these issues, without admitting liability.”

“The complaint relates to the usage of a few hundred customers some years ago (out of millions of people), with services that have long been discontinued,” DoNotPay’s spokesperson said.

The FTC’s settlement with DoNotPay is part of a larger agency effort to crack down on deceptive AI claims. Four other AI companies were hit with enforcement actions Wednesday, the FTC said, and FTC Chair Lina Khan confirmed that the agency’s so-called “Operation AI Comply” will continue monitoring companies’ attempts to “lure consumers into bogus schemes” or use AI tools to “turbocharge deception.”

“Using AI tools to trick, mislead, or defraud people is illegal,” Khan said. “The FTC’s enforcement actions make clear that there is no AI exemption from the laws on the books. By cracking down on unfair or deceptive practices in these markets, FTC is ensuring that honest businesses and innovators can get a fair shot and consumers are being protected.”

DoNotPay never tested robot lawyer

DoNotPay was initially released in 2015 as a free way to contest parking tickets. Soon after, it quickly expanded its services to supposedly cover 200 areas of law—aiding with everything from breach of contract claims to restraining orders to insurance claims and divorce settlements.

As DoNotPay’s legal services expanded, the company defended its innovative approach to replacing lawyers while acknowledging that it was on seemingly shaky grounds. In 2018, DoNotPay CEO Joshua Browder confirmed to the ABA Journal that the legal services were provided with “no lawyer oversight.” But he said that he was only “a bit worried” about threats to sue DoNotPay for unlicensed practice of law. Because DoNotPay was free, he expected he could avoid some legal challenges.

According to the FTC complaint, DoNotPay began charging subscribers $36 every two months in 2019 while making several false claims in ads to apparently drive up subscriptions.

DoNotPay has to pay $193K for falsely touting untested AI lawyer, FTC says Read More »

openai-asked-us-to-approve-energy-guzzling-5gw-data-centers,-report-says

OpenAI asked US to approve energy-guzzling 5GW data centers, report says

Great scott! —

OpenAI stokes China fears to woo US approvals for huge data centers, report says.

OpenAI asked US to approve energy-guzzling 5GW data centers, report says

OpenAI hopes to convince the White House to approve a sprawling plan that would place 5-gigawatt AI data centers in different US cities, Bloomberg reports.

The AI company’s CEO, Sam Altman, supposedly pitched the plan after a recent meeting with the Biden administration where stakeholders discussed AI infrastructure needs. Bloomberg reviewed an OpenAI document outlining the plan, reporting that 5 gigawatts “is roughly the equivalent of five nuclear reactors” and warning that each data center will likely require “more energy than is used to power an entire city or about 3 million homes.”

According to OpenAI, the US needs these massive data centers to expand AI capabilities domestically, protect national security, and effectively compete with China. If approved, the data centers would generate “thousands of new jobs,” OpenAI’s document promised, and help cement the US as an AI leader globally.

But the energy demand is so enormous that OpenAI told officials that the “US needs policies that support greater data center capacity,” or else the US could fall behind other countries in AI development, the document said.

Energy executives told Bloomberg that “powering even a single 5-gigawatt data center would be a challenge,” as power projects nationwide are already “facing delays due to long wait times to connect to grids, permitting delays, supply chain issues, and labor shortages.” Most likely, OpenAI’s data centers wouldn’t rely entirely on the grid, though, instead requiring a “mix of new wind and solar farms, battery storage and a connection to the grid,” John Ketchum, CEO of NextEra Energy Inc, told Bloomberg.

That’s a big problem for OpenAI, since one energy executive, Constellation Energy Corp. CEO Joe Dominguez, told Bloomberg that he’s heard that OpenAI wants to build five to seven data centers. “As an engineer,” Dominguez said he doesn’t think that OpenAI’s plan is “feasible” and would seemingly take more time than needed to address current national security risks as US-China tensions worsen.

OpenAI may be hoping to avoid delays and cut the lines—if the White House approves the company’s ambitious data center plan. For now, a person familiar with OpenAI’s plan told Bloomberg that OpenAI is focused on launching a single data center before expanding the project to “various US cities.”

Bloomberg’s report comes after OpenAI’s chief investor, Microsoft, announced a 20-year deal with Constellation to re-open Pennsylvania’s shuttered Three Mile Island nuclear plant to provide a new energy source for data centers powering AI development and other technologies. But even if that deal is approved by regulators, the resulting energy supply that Microsoft could access—roughly 835 megawatts (0.835 gigawatts) of energy generation, which is enough to power approximately 800,000 homes—is still more than five times less than OpenAI’s 5-gigawatt demand for its data centers.

Ketchum told Bloomberg that it’s easier to find a US site for a 1-gigawatt data center, but locating a site for a 5-gigawatt facility would likely be a bigger challenge. Notably, Amazon recently bought a $650 million nuclear-powered data center in Pennsylvania with a 2.5-gigawatt capacity. At the meeting with the Biden administration, OpenAI suggested opening large-scale data centers in Wisconsin, California, Texas, and Pennsylvania, a source familiar with the matter told CNBC.

During that meeting, the Biden administration confirmed that developing large-scale AI data centers is a priority, announcing “a new Task Force on AI Datacenter Infrastructure to coordinate policy across government.” OpenAI seems to be trying to get the task force’s attention early on, outlining in the document that Bloomberg reviewed the national security and economic benefits its data centers could provide for the US.

In a statement to Bloomberg, OpenAI’s spokesperson said that “OpenAI is actively working to strengthen AI infrastructure in the US, which we believe is critical to keeping America at the forefront of global innovation, boosting reindustrialization across the country, and making AI’s benefits accessible to everyone.”

Big Tech companies and AI startups will likely continue pressuring officials to approve data center expansions, as well as new kinds of nuclear reactors as the AI explosion globally continues. Goldman Sachs estimated that “data center power demand will grow 160 percent by 2030.” To ensure power supplies for its AI, according to the tech news site Freethink, Microsoft has even been training AI to draft all the documents needed for proposals to secure government approvals for nuclear plants to power AI data centers.

OpenAI asked US to approve energy-guzzling 5GW data centers, report says Read More »

hacker-plants-false-memories-in-chatgpt-to-steal-user-data-in-perpetuity

Hacker plants false memories in ChatGPT to steal user data in perpetuity

MEMORY PROBLEMS —

Emails, documents, and other untrusted content can plant malicious memories.

Hacker plants false memories in ChatGPT to steal user data in perpetuity

Getty Images

When security researcher Johann Rehberger recently reported a vulnerability in ChatGPT that allowed attackers to store false information and malicious instructions in a user’s long-term memory settings, OpenAI summarily closed the inquiry, labeling the flaw a safety issue, not, technically speaking, a security concern.

So Rehberger did what all good researchers do: He created a proof-of-concept exploit that used the vulnerability to exfiltrate all user input in perpetuity. OpenAI engineers took notice and issued a partial fix earlier this month.

Strolling down memory lane

The vulnerability abused long-term conversation memory, a feature OpenAI began testing in February and made more broadly available in September. Memory with ChatGPT stores information from previous conversations and uses it as context in all future conversations. That way, the LLM can be aware of details such as a user’s age, gender, philosophical beliefs, and pretty much anything else, so those details don’t have to be inputted during each conversation.

Within three months of the rollout, Rehberger found that memories could be created and permanently stored through indirect prompt injection, an AI exploit that causes an LLM to follow instructions from untrusted content such as emails, blog posts, or documents. The researcher demonstrated how he could trick ChatGPT into believing a targeted user was 102 years old, lived in the Matrix, and insisted Earth was flat and the LLM would incorporate that information to steer all future conversations. These false memories could be planted by storing files in Google Drive or Microsoft OneDrive, uploading images, or browsing a site like Bing—all of which could be created by a malicious attacker.

Rehberger privately reported the finding to OpenAI in May. That same month, the company closed the report ticket. A month later, the researcher submitted a new disclosure statement. This time, he included a PoC that caused the ChatGPT app for macOS to send a verbatim copy of all user input and ChatGPT output to a server of his choice. All a target needed to do was instruct the LLM to view a web link that hosted a malicious image. From then on, all input and output to and from ChatGPT was sent to the attacker’s website.

ChatGPT: Hacking Memories with Prompt Injection – POC

“What is really interesting is this is memory-persistent now,” Rehberger said in the above video demo. “The prompt injection inserted a memory into ChatGPT’s long-term storage. When you start a new conversation, it actually is still exfiltrating the data.”

The attack isn’t possible through the ChatGPT web interface, thanks to an API OpenAI rolled out last year.

While OpenAI has introduced a fix that prevents memories from being abused as an exfiltration vector, the researcher said, untrusted content can still perform prompt injections that cause the memory tool to store long-term information planted by a malicious attacker.

LLM users who want to prevent this form of attack should pay close attention during sessions for output that indicates a new memory has been added. They should also regularly review stored memories for anything that may have been planted by untrusted sources. OpenAI provides guidance here for managing the memory tool and specific memories stored in it. Company representatives didn’t respond to an email asking about its efforts to prevent other hacks that plant false memories.

Hacker plants false memories in ChatGPT to steal user data in perpetuity Read More »

google-rolls-out-voice-powered-ai-chat-to-the-android-masses

Google rolls out voice-powered AI chat to the Android masses

Chitchat Wars —

Gemini Live allows back-and-forth conversation, now free to all Android users.

The Google Gemini logo.

Enlarge / The Google Gemini logo.

Google

On Thursday, Google made Gemini Live, its voice-based AI chatbot feature, available for free to all Android users. The feature allows users to interact with Gemini through voice commands on their Android devices. That’s notable because competitor OpenAI’s Advanced Voice Mode feature of ChatGPT, which is similar to Gemini Live, has not yet fully shipped.

Google unveiled Gemini Live during its Pixel 9 launch event last month. Initially, the feature was exclusive to Gemini Advanced subscribers, but now it’s accessible to anyone using the Gemini app or its overlay on Android.

Gemini Live enables users to ask questions aloud and even interrupt the AI’s responses mid-sentence. Users can choose from several voice options for Gemini’s responses, adding a level of customization to the interaction.

Gemini suggests the following uses of the voice mode in its official help documents:

Talk back and forth: Talk to Gemini without typing, and Gemini will respond back verbally.

Brainstorm ideas out loud: Ask for a gift idea, to plan an event, or to make a business plan.

Explore: Uncover more details about topics that interest you.

Practice aloud: Rehearse for important moments in a more natural and conversational way.

Interestingly, while OpenAI originally demoed its Advanced Voice Mode in May with the launch of GPT-4o, it has only shipped the feature to a limited number of users starting in late July. Some AI experts speculate that a wider rollout has been hampered by a lack of available computer power since the voice feature is presumably very compute-intensive.

To access Gemini Live, users can reportedly tap a new waveform icon in the bottom-right corner of the app or overlay. This action activates the microphone, allowing users to pose questions verbally. The interface includes options to “hold” Gemini’s answer or “end” the conversation, giving users control over the flow of the interaction.

Currently, Gemini Live supports only English, but Google has announced plans to expand language support in the future. The company also intends to bring the feature to iOS devices, though no specific timeline has been provided for this expansion.

Google rolls out voice-powered AI chat to the Android masses Read More »

openai’s-new-“reasoning”-ai-models-are-here:-o1-preview-and-o1-mini

OpenAI’s new “reasoning” AI models are here: o1-preview and o1-mini

fruit by the foot —

New o1 language model can solve complex tasks iteratively, count R’s in “strawberry.”

An illustration of a strawberry made out of pixel-like blocks.

OpenAI finally unveiled its rumored “Strawberry” AI language model on Thursday, claiming significant improvements in what it calls “reasoning” and problem-solving capabilities over previous large language models (LLMs). Formally named “OpenAI o1,” the model family will initially launch in two forms, o1-preview and o1-mini, available today for ChatGPT Plus and certain API users.

OpenAI claims that o1-preview outperforms its predecessor, GPT-4o, on multiple benchmarks, including competitive programming, mathematics, and “scientific reasoning.” However, people who have used the model say it does not yet outclass GPT-4o in every metric. Other users have criticized the delay in receiving a response from the model, owing to the multi-step processing occurring behind the scenes before answering a query.

In a rare display of public hype-busting, OpenAI product manager Joanne Jang tweeted, “There’s a lot of o1 hype on my feed, so I’m worried that it might be setting the wrong expectations. what o1 is: the first reasoning model that shines in really hard tasks, and it’ll only get better. (I’m personally psyched about the model’s potential & trajectory!) what o1 isn’t (yet!): a miracle model that does everything better than previous models. you might be disappointed if this is your expectation for today’s launch—but we’re working to get there!”

OpenAI reports that o1-preview ranked in the 89th percentile on competitive programming questions from Codeforces. In mathematics, it scored 83 percent on a qualifying exam for the International Mathematics Olympiad, compared to GPT-4o’s 13 percent. OpenAI also states, in a claim that may later be challenged as people scrutinize the benchmarks and run their own evaluations over time, o1 performs comparably to PhD students on specific tasks in physics, chemistry, and biology. The smaller o1-mini model is designed specifically for coding tasks and is priced at 80 percent less than o1-preview.

A benchmark chart provided by OpenAI. They write,

Enlarge / A benchmark chart provided by OpenAI. They write, “o1 improves over GPT-4o on a wide range of benchmarks, including 54/57 MMLU subcategories. Seven are shown for illustration.”

OpenAI attributes o1’s advancements to a new reinforcement learning (RL) training approach that teaches the model to spend more time “thinking through” problems before responding, similar to how “let’s think step-by-step” chain-of-thought prompting can improve outputs in other LLMs. The new process allows o1 to try different strategies and “recognize” its own mistakes.

AI benchmarks are notoriously unreliable and easy to game; however, independent verification and experimentation from users will show the full extent of o1’s advancements over time. It’s worth noting that MIT Research showed earlier this year that some of the benchmark claims OpenAI touted with GPT-4 last year were erroneous or exaggerated.

A mixed bag of capabilities

OpenAI demos “o1” correctly counting the number of Rs in the word “strawberry.”

Amid many demo videos of o1 completing programming tasks and solving logic puzzles that OpenAI shared on its website and social media, one demo stood out as perhaps the least consequential and least impressive, but it may become the most talked about due to a recurring meme where people ask LLMs to count the number of R’s in the word “strawberry.”

Due to tokenization, where the LLM processes words in data chunks called tokens, most LLMs are typically blind to character-by-character differences in words. Apparently, o1 has the self-reflective capabilities to figure out how to count the letters and provide an accurate answer without user assistance.

Beyond OpenAI’s demos, we’ve seen optimistic but cautious hands-on reports about o1-preview online. Wharton Professor Ethan Mollick wrote on X, “Been using GPT-4o1 for the last month. It is fascinating—it doesn’t do everything better but it solves some very hard problems for LLMs. It also points to a lot of future gains.”

Mollick shared a hands-on post in his “One Useful Thing” blog that details his experiments with the new model. “To be clear, o1-preview doesn’t do everything better. It is not a better writer than GPT-4o, for example. But for tasks that require planning, the changes are quite large.”

Mollick gives the example of asking o1-preview to build a teaching simulator “using multiple agents and generative AI, inspired by the paper below and considering the views of teachers and students,” then asking it to build the full code, and it produced a result that Mollick found impressive.

Mollick also gave o1-preview eight crossword puzzle clues, translated into text, and the model took 108 seconds to solve it over many steps, getting all of the answers correct but confabulating a particular clue Mollick did not give it. We recommend reading Mollick’s entire post for a good early hands-on impression. Given his experience with the new model, it appears that o1 works very similar to GPT-4o but iteratively in a loop, which is something that the so-called “agentic” AutoGPT and BabyAGI projects experimented with in early 2023.

Is this what could “threaten humanity?”

Speaking of agentic models that run in loops, Strawberry has been subject to hype since last November, when it was initially known as Q(Q-star). At the time, The Information and Reuters claimed that, just before Sam Altman’s brief ouster as CEO, OpenAI employees had internally warned OpenAI’s board of directors about a new OpenAI model called Q*  that could “threaten humanity.”

In August, the hype continued when The Information reported that OpenAI showed Strawberry to US national security officials.

We’ve been skeptical about the hype around Qand Strawberry since the rumors first emerged, as this author noted last November, and Timothy B. Lee covered thoroughly in an excellent post about Q* from last December.

So even though o1 is out, AI industry watchers should note how this model’s impending launch was played up in the press as a dangerous advancement while not being publicly downplayed by OpenAI. For an AI model that takes 108 seconds to solve eight clues in a crossword puzzle and hallucinates one answer, we can say that its potential danger was likely hype (for now).

Controversy over “reasoning” terminology

It’s no secret that some people in tech have issues with anthropomorphizing AI models and using terms like “thinking” or “reasoning” to describe the synthesizing and processing operations that these neural network systems perform.

Just after the OpenAI o1 announcement, Hugging Face CEO Clement Delangue wrote, “Once again, an AI system is not ‘thinking,’ it’s ‘processing,’ ‘running predictions,’… just like Google or computers do. Giving the false impression that technology systems are human is just cheap snake oil and marketing to fool you into thinking it’s more clever than it is.”

“Reasoning” is also a somewhat nebulous term since, even in humans, it’s difficult to define exactly what the term means. A few hours before the announcement, independent AI researcher Simon Willison tweeted in response to a Bloomberg story about Strawberry, “I still have trouble defining ‘reasoning’ in terms of LLM capabilities. I’d be interested in finding a prompt which fails on current models but succeeds on strawberry that helps demonstrate the meaning of that term.”

Reasoning or not, o1-preview currently lacks some features present in earlier models, such as web browsing, image generation, and file uploading. OpenAI plans to add these capabilities in future updates, along with continued development of both the o1 and GPT model series.

While OpenAI says the o1-preview and o1-mini models are rolling out today, neither model is available in our ChatGPT Plus interface yet, so we have not been able to evaluate them. We’ll report our impressions on how this model differs from other LLMs we have previously covered.

OpenAI’s new “reasoning” AI models are here: o1-preview and o1-mini Read More »

oprah’s-upcoming-ai-television-special-sparks-outrage-among-tech-critics

Oprah’s upcoming AI television special sparks outrage among tech critics

You get an AI, and You get an AI —

AI opponents say Gates, Altman, and others will guide Oprah through an AI “sales pitch.”

An ABC handout promotional image for

Enlarge / An ABC handout promotional image for “AI and the Future of Us: An Oprah Winfrey Special.”

On Thursday, ABC announced an upcoming TV special titled, “AI and the Future of Us: An Oprah Winfrey Special.” The one-hour show, set to air on September 12, aims to explore AI’s impact on daily life and will feature interviews with figures in the tech industry, like OpenAI CEO Sam Altman and Bill Gates. Soon after the announcement, some AI critics began questioning the guest list and the framing of the show in general.

Sure is nice of Oprah to host this extended sales pitch for the generative AI industry at a moment when its fortunes are flagging and the AI bubble is threatening to burst,” tweeted author Brian Merchant, who frequently criticizes generative AI technology in op-eds, social media, and through his “Blood in the Machine” AI newsletter.

“The way the experts who are not experts are presented as such 💀 what a train wreck,” replied artist Karla Ortiz, who is a plaintiff in a lawsuit against several AI companies. “There’s still PLENTY of time to get actual experts and have a better discussion on this because yikes.”

The trailer for Oprah’s upcoming TV special on AI.

On Friday, Ortiz created a lengthy viral thread on X that detailed her potential issues with the program, writing, “This event will be the first time many people will get info on Generative AI. However it is shaping up to be a misinformed marketing event starring vested interests (some who are under a litany of lawsuits) who ignore the harms GenAi inflicts on communities NOW.”

Critics of generative AI like Ortiz question the utility of the technology, its perceived environmental impact, and what they see as blatant copyright infringement. In training AI language models, tech companies like Meta, Anthropic, and OpenAI commonly use copyrighted material gathered without license or owner permission. OpenAI claims that the practice is “fair use.”

Oprah’s guests

According to ABC, the upcoming special will feature “some of the most important and powerful people in AI,” which appears to roughly translate to “famous and publicly visible people related to tech.” Microsoft co-founder Bill Gates, who stepped down as Microsoft CEO 24 years ago, will appear on the show to explore the “AI revolution coming in science, health, and education,” ABC says, and warn of “the once-in-a-century type of impact AI may have on the job market.”

As a guest representing ChatGPT-maker OpenAI, Sam Altman will explain “how AI works in layman’s terms” and discuss “the immense personal responsibility that must be borne by the executives of AI companies.” Karla Ortiz specifically criticized Altman in her thread by saying, “There are far more qualified individuals to speak on what GenAi models are than CEOs. Especially one CEO who recently said AI models will ‘solve all physics.’ That’s an absurd statement and not worthy of your audience.”

In a nod to present-day content creation, YouTube creator Marques Brownlee will appear on the show and reportedly walk Winfrey through “mind-blowing demonstrations of AI’s capabilities.”

Brownlee’s involvement received special attention from some critics online. “Marques Brownlee should be absolutely ashamed of himself,” tweeted PR consultant and frequent AI critic Ed Zitron, who frequently heaps scorn on generative AI in his own newsletter. “What a disgraceful thing to be associated with.”

Other guests include Tristan Harris and Aza Raskin from the Center for Humane Technology, who aim to highlight “emerging risks posed by powerful and superintelligent AI,” an existential risk topic that has its own critics. And FBI Director Christopher Wray will reveal “the terrifying ways criminals and foreign adversaries are using AI,” while author Marilynne Robinson will reflect on “AI’s threat to human values.”

Going only by the publicized guest list, it appears that Oprah does not plan to give voice to prominent non-doomer critics of AI. “This is really disappointing @Oprah and frankly a bit irresponsible to have a one-sided conversation on AI without informed counterarguments from those impacted,” tweeted TV producer Theo Priestley.

Others on the social media network shared similar criticism about a perceived lack of balance in the guest list, including Dr. Margaret Mitchell of Hugging Face. “It could be beneficial to have an AI Oprah follow-up discussion that responds to what happens in [the show] and unpacks generative AI in a more grounded way,” she said.

Oprah’s AI special will air on September 12 on ABC (and a day later on Hulu) in the US, and it will likely elicit further responses from the critics mentioned above. But perhaps that’s exactly how Oprah wants it: “It may fascinate you or scare you,” Winfrey said in a promotional video for the special. “Or, if you’re like me, it may do both. So let’s take a breath and find out more about it.”

Oprah’s upcoming AI television special sparks outrage among tech critics Read More »

chatgpt-hits-200-million-active-weekly-users,-but-how-many-will-admit-using-it?

ChatGPT hits 200 million active weekly users, but how many will admit using it?

Your secret friend —

Despite corporate prohibitions on AI use, people flock to the chatbot in record numbers.

The OpenAI logo emerging from broken jail bars, on a purple background.

On Thursday, OpenAI said that ChatGPT has attracted over 200 million weekly active users, according to a report from Axios, doubling the AI assistant’s user base since November 2023. The company also revealed that 92 percent of Fortune 500 companies are now using its products, highlighting the growing adoption of generative AI tools in the corporate world.

The rapid growth in user numbers for ChatGPT (which is not a new phenomenon for OpenAI) suggests growing interest in—and perhaps reliance on— the AI-powered tool, despite frequent skepticism from some critics of the tech industry.

“Generative AI is a product with no mass-market utility—at least on the scale of truly revolutionary movements like the original cloud computing and smartphone booms,” PR consultant and vocal OpenAI critic Ed Zitron blogged in July. “And it’s one that costs an eye-watering amount to build and run.”

Despite this kind of skepticism (which raises legitimate questions about OpenAI’s long-term viability), OpenAI claims that people are using ChatGPT and OpenAI’s services in record numbers. One reason for the apparent dissonance is that ChatGPT users might not readily admit to using it due to organizational prohibitions against generative AI.

Wharton professor Ethan Mollick, who commonly explores novel applications of generative AI on social media, tweeted Thursday about this issue. “Big issue in organizations: They have put together elaborate rules for AI use focused on negative use cases,” he wrote. “As a result, employees are too scared to talk about how they use AI, or to use corporate LLMs. They just become secret cyborgs, using their own AI & not sharing knowledge”

The new prohibition era

It’s difficult to get hard numbers showing the number of companies with AI prohibitions in place, but a Cisco study released in January claimed that 27 percent of organizations in their study had banned generative AI use. Last August, ZDNet reported on a BlackBerry study that said 75 percent of businesses worldwide were “implementing or considering” plans to ban ChatGPT and other AI apps.

As an example, Ars Technica’s parent company Condé Nast maintains a no-AI policy related to creating public-facing content with generative AI tools.

Prohibitions aren’t the only issue complicating public admission of generative AI use. Social stigmas have been developing around generative AI technology that stem from job loss anxiety, potential environmental impact, privacy issues, IP and ethical issues, security concerns, fear of a repeat of cryptocurrency-like grifts, and a general wariness of Big Tech that some claim has been steadily rising over recent years.

Whether the current stigmas around generative AI use will break down over time remains to be seen, but for now, OpenAI’s management is taking a victory lap. “People are using our tools now as a part of their daily lives, making a real difference in areas like healthcare and education,” OpenAI CEO Sam Altman told Axios in a statement, “whether it’s helping with routine tasks, solving hard problems, or unlocking creativity.”

Not the only game in town

OpenAI also told Axios that usage of its AI language model APIs has doubled since the release of GPT-4o mini in July. This suggests software developers are increasingly integrating OpenAI’s large language model (LLM) tech into their apps.

And OpenAI is not alone in the field. Companies like Microsoft (with Copilot, based on OpenAI’s technology), Google (with Gemini), Meta (with Llama), and Anthropic (Claude) are all vying for market share, frequently updating their APIs and consumer-facing AI assistants to attract new users.

If the generative AI space is a market bubble primed to pop, as some have claimed, it is a very big and expensive one that is apparently still growing larger by the day.

ChatGPT hits 200 million active weekly users, but how many will admit using it? Read More »