AI

Google pulls its terrible pro-AI “Dear Sydney” ad after backlash

Gemini, write me a fan letter! —

Taking the “human” out of “human communication.”

The Gemini prompt box in the “Dear Sydney” ad. Credit: Google

Have you seen Google’s “Dear Sydney” ad? The one where a young girl wants to write a fan letter to Olympic hurdler Sydney McLaughlin-Levrone? To which the girl’s dad responds that he is “pretty good with words but this has to be just right”? And so, to be just right, he suggests that the daughter get Google’s Gemini AI to write a first draft of the letter?

If you’re watching the Olympics, you have undoubtedly seen it—because the ad has been everywhere. Until today. After a string of negative commentary about the ad’s dystopian implications, Google has pulled the “Dear Sydney” ad from TV. In a statement to The Hollywood Reporter, the company said, “While the ad tested well before airing, given the feedback, we have decided to phase the ad out of our Olympics rotation.”

The backlash was similar to that against Apple’s recent ad in which an enormous hydraulic press crushed TVs, musical instruments, record players, paint cans, sculptures, and even emoji into… the newest model of the iPad. Apple apparently wanted to show just how much creative and entertainment potential the iPad held; critics read the ad as a warning image about the destruction of human creativity in a technological age. Apple apologized soon after.

Now Google has stepped on the same land mine. Not only is AI coming for human creativity, the “Dear Sydney” ad suggests—but it won’t even leave space for the charming imperfections of a child’s fan letter to an athlete. Instead, AI will provide the template, just as it will likely provide the template for the athlete’s response, leading to a nightmare scenario in which huge swathes of human communication have the “human” part stripped right out.

“Very bad”

The generally hostile tone of the commentary on the new ad was captured by Alexandra Petri’s Washington Post column, which labeled the ad “very bad.”

This ad makes me want to throw a sledgehammer into the television every time I see it. Given the choice between watching this ad and watching the ad about how I need to be giving money NOW to make certain that dogs do not perish in the snow, I would have to think long and hard. It’s one of those ads that makes you think, perhaps evolution was a mistake and our ancestor should never have left the sea. This could be slight hyperbole but only slight!

If you haven’t seen this ad, you are leading a blessed existence and I wish to trade places with you.

A TechCrunch piece said that it was “hard to think of anything that communicates heartfelt inspiration less than instructing an AI to tell someone how inspiring they are.”

Shelly Palmer, a Syracuse University professor and marketing consultant, wrote that the ad’s basic mistake was overestimating “AI’s ability to understand and convey the nuances of human emotions and thoughts.” Palmer would rather have a “heartfelt message over a grammatically correct, AI-generated message any day,” he said. He then added:

I received just such a heartfelt message from a reader years ago. It was a single line email about a blog post I had just written: “Shelly, you’re to [sic] stupid to own a smart phone.” I love this painfully ironic email so much, I have it framed on the wall in my office. It was honest, direct, and probably accurate.

But his conclusion was far more serious. “I flatly reject the future that Google is advertising,” Palmer wrote. “I want to live in a culturally diverse world where billions of individuals use AI to amplify their human skills, not in a world where we are used by AI pretending to be human.”

Things got saltier from there. NPR host Linda Holmes wrote on social media:

This commercial showing somebody having a child use AI to write a fan letter to her hero SUCKS. Obviously there are special circumstances and people who need help, but as a general “look how cool, she didn’t even have to write anything herself!” story, it SUCKS. Who wants an AI-written fan letter?? I promise you, if they’re able, the words your kid can put together will be more meaningful than anything a prompt can spit out. And finally: A fan letter is a great way for a kid to learn to write! If you encourage kids to run to AI to spit out words because their writing isn’t great yet, how are they supposed to learn? Sit down with your kid and write the letter with them! I’m just so grossed out by the entire thing.

The Atlantic was more succinct with its headline: “Google Wins the Gold Medal for Worst Olympic Ad.”

All of this largely tracks with our own take on the ad, which Ars Technica’s Kyle Orland called a “grim” vision of the future. “I want AI-powered tools to automate the most boring, mundane tasks in my life, giving me more time to spend on creative, life-affirming moments with my family,” he wrote. “Google’s ad seems to imply that these life-affirming moments are also something to be avoided—or at least made pleasingly more efficient—through the use of AI.”

Getting people excited about their own obsolescence and addiction is a tough sell, so I don’t envy the marketers who have to hawk Big Tech’s biggest products in a climate of suspicion and hostility toward everything from AI to screen time to social media to data collection. I’m sure the marketers will find a way—but clearly “Dear Sydney” isn’t it.

FLUX: This new AI image generator is eerily good at creating human hands

five-finger salute —

FLUX.1 is the open-weights heir apparent to Stable Diffusion, turning text into images.

AI-generated image by FLUX.1 dev: “A beautiful queen of the universe holding up her hands, face in the background.” Credit: FLUX.1

On Thursday, AI startup Black Forest Labs announced its launch as a company and the release of its first suite of text-to-image AI models, called FLUX.1. The Germany-based company, founded by researchers who developed the technology behind Stable Diffusion and invented the latent diffusion technique, aims to create advanced generative AI for images and videos.

The launch of FLUX.1 comes about seven weeks after Stability AI’s troubled release of Stable Diffusion 3 Medium in mid-June. Stability AI’s offering faced widespread criticism among image-synthesis hobbyists for its poor performance in generating human anatomy, with users sharing examples of distorted limbs and bodies across social media. That problematic launch followed the earlier departure of three key engineers from Stability AI—Robin Rombach, Andreas Blattmann, and Dominik Lorenz—who went on to found Black Forest Labs along with latent diffusion co-developer Patrick Esser and others.

Black Forest Labs launched with the release of three FLUX.1 text-to-image models: a high-end commercial “pro” version, a mid-range “dev” version with open weights for non-commercial use, and a faster open-weights “schnell” version (“schnell” means quick or fast in German). Black Forest Labs claims its models outperform existing options like Midjourney and DALL-E in areas such as image quality and adherence to text prompts.

A selection of AI-generated images by FLUX.1 dev (all images credit: FLUX.1):

  • “A close-up photo of a pair of hands holding a plate full of pickles.”
  • A hand holding up five fingers with a starry background.
  • “An Ars Technica reader sitting in front of a computer monitor. The screen shows the Ars Technica website.”
  • “a boxer posing with fists raised, no gloves.”
  • “An advertisement for ‘Frosted Prick’ cereal.”
  • A happy woman in a bakery baking a cake.
  • “An advertisement for ‘Marshmallow Menace’ cereal.”
  • “A handsome Asian influencer on top of the Empire State Building, instagram.”

In our experience, the outputs of the two higher-end FLUX.1 models are generally comparable with OpenAI’s DALL-E 3 in prompt fidelity, with photorealism that seems close to Midjourney 6. They represent a significant improvement over Stable Diffusion XL, the team’s last major release under Stability (if you don’t count SDXL Turbo).

The FLUX.1 models use what the company calls a “hybrid architecture” combining transformer and diffusion techniques, scaled up to 12 billion parameters. Black Forest Labs said it improves on previous diffusion models by incorporating flow matching and other optimizations.
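For readers curious about the “flow matching” jargon: in broad terms, it means training the network to predict the velocity that carries a noise sample toward a real image along a simple path, rather than predicting noise as in classic diffusion. A common rectified-flow formulation of the objective looks like the following; this is a generic sketch of the technique, not necessarily the exact loss Black Forest Labs uses:

$$ x_t = (1 - t)\,x_0 + t\,x_1, \qquad \mathcal{L}(\theta) = \mathbb{E}_{t,\,x_0,\,x_1}\left[\,\big\lVert v_\theta(x_t, t, c) - (x_1 - x_0) \big\rVert^2\,\right] $$

Here, $x_0$ is Gaussian noise, $x_1$ is a training image (typically in a latent space), $t$ is sampled uniformly from $[0, 1]$, $c$ is the text conditioning, and $v_\theta$ is the model’s predicted velocity. At generation time, the learned velocity field is integrated from pure noise at $t = 0$ to an image at $t = 1$.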

FLUX.1 seems competent at generating human hands, which was a weak spot in earlier image-synthesis models like Stable Diffusion 1.5 due to a lack of training images that focused on hands. Since those early days, other AI image generators like Midjourney have mastered hands as well, but it’s notable to see an open-weights model that renders hands relatively accurately in various poses.

We downloaded the weights file for the FLUX.1 dev model from GitHub, but at 23GB, it won’t fit in the 12GB of VRAM on our RTX 3060 card, so running it locally will require quantization (reducing the model’s size), which, judging by chatter on Reddit, some people have reportedly already managed.

Instead, we experimented with FLUX.1 models on AI cloud-hosting platforms Fal and Replicate, which cost money to use, though Fal offers some free credits to start.
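For readers who want to try something similar without a beefy GPU, here is a minimal sketch of what a hosted FLUX.1 run can look like with Replicate’s Python client. The model slug and output format are assumptions based on Replicate’s usual conventions rather than details from this article, so check the model’s page on Replicate before relying on them.

```python
# Minimal sketch: generating an image with a hosted FLUX.1 model via Replicate.
# Assumes the `replicate` package is installed and REPLICATE_API_TOKEN is set in
# the environment. The model slug "black-forest-labs/flux-dev" and the exact
# input fields are assumptions; consult the model page for the current interface.
import replicate

output = replicate.run(
    "black-forest-labs/flux-dev",  # assumed slug for the FLUX.1 dev model
    input={
        "prompt": "A close-up photo of a pair of hands holding a plate full of pickles",
    },
)

# Hosted image models on Replicate typically return one or more file URLs.
print(output)
```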

Black Forest looks ahead

Black Forest Labs may be a new company, but it’s already attracting funding from investors. It recently closed a $31 million Series Seed funding round led by Andreessen Horowitz, with additional investments from General Catalyst and MätchVC. The company also brought on high-profile advisers, including entertainment executive and former Disney President Michael Ovitz and AI researcher Matthias Bethge.

“We believe that generative AI will be a fundamental building block of all future technologies,” the company stated in its announcement. “By making our models available to a wide audience, we want to bring its benefits to everyone, educate the public and enhance trust in the safety of these models.”

More AI-generated images by FLUX.1 dev (all images credit: FLUX.1):

  • A cat in a car holding a can of beer that reads, ‘AI Slop.’
  • Mickey Mouse and Spider-Man singing to each other.
  • “a muscular barbarian with weapons beside a CRT television set, cinematic, 8K, studio lighting.”
  • A flaming cheeseburger.
  • “Will Smith eating spaghetti.”
  • “a muscular barbarian with weapons beside a CRT television set, cinematic, 8K, studio lighting. The screen reads ‘Ars Technica.’”
  • “An advertisement for ‘Burt’s Grenades’ cereal.”
  • “A close-up photo of a pair of hands holding a plate that contains a portrait of the queen of the universe”

Speaking of “trust and safety,” the company did not mention where it obtained the training data that taught the FLUX.1 models how to generate images. Judging by the outputs we could produce with the model, which included depictions of copyrighted characters, Black Forest Labs likely used a huge unauthorized image scrape of the Internet, possibly collected by LAION, the organization that assembled the datasets used to train Stable Diffusion, though that is speculation on our part. While the underlying technological achievement of FLUX.1 is notable, it feels likely that the team is playing fast and loose with the ethics of “fair use” image scraping, much like Stability AI did. That practice may eventually attract lawsuits like those filed against Stability AI.

Though text-to-image generation is Black Forest’s current focus, the company plans to expand into video generation next, saying that FLUX.1 will serve as the foundation of a new text-to-video model in development, which will compete with OpenAI’s Sora, Runway’s Gen-3 Alpha, and Kuaishou’s Kling in a contest to warp media reality on demand. “Our video models will unlock precise creation and editing at high definition and unprecedented speed,” the Black Forest announcement claims.

US probes Nvidia’s acquisition of Israeli AI startup

“monopoly choke points” —

Justice Department has increased scrutiny of the chipmaker’s power in the emerging sector.

Credit: Getty Images

The US Department of Justice is investigating Nvidia’s acquisition of Run:ai, an Israeli artificial intelligence startup, for potential antitrust violations, said a person familiar with discussions the government agency has had with third parties.

The DoJ has asked market participants about the competitive impact of the transaction, which Nvidia announced in April. The price was not disclosed, but a report from TechCrunch estimated it at $700 million.

The scope of the probe remains unclear, the person said. But the DoJ has inquired about matters including whether the deal could quash emerging competition in the up-and-coming sector and entrench Nvidia’s dominant market position.

Nvidia on Thursday said the company “wins on merit” and “scrupulously adher[es] to all laws.”

“We’ll continue to support aspiring innovators in every industry and market and are happy to provide any information regulators need,” it added.

Run:ai did not immediately respond to a request for comment. The DoJ declined to comment.

The investigation comes as US regulators and enforcers have heightened scrutiny of anti-competitive behavior in AI, particularly where it dovetails with big tech groups such as Nvidia.

Jonathan Kanter, head of the DoJ’s antitrust division, told the Financial Times in June that he was examining “monopoly choke points” in areas including the data used to train large language models as well as access to essential hardware such as graphics processing unit chips. He added that the GPUs needed to train LLMs had become a “scarce resource.”

Nvidia dominates sales of the most advanced GPUs. Run:ai, which had an existing collaboration with the tech giant, has developed a platform that optimizes the use of GPUs.

As part of the probe, which was first reported by Politico, the DoJ is seeking information on how Nvidia decides the allocation of its chips, the person said.

Government lawyers are also inquiring about Nvidia’s software platform, Cuda, which enables chips originally designed for graphics to speed up AI applications and is seen by industry figures as one of Nvidia’s most critical tools.

The DoJ and the US Federal Trade Commission, a competition regulator, in June reached an agreement that divided antitrust oversight of critical AI players. The DoJ will spearhead probes into Nvidia, while the FTC will oversee the assessment of Microsoft and OpenAI, the startup behind ChatGPT.

© 2024 The Financial Times Ltd. All rights reserved. Please do not copy and paste FT articles and redistribute by email or post to the web.

Senators propose “Digital replication right” for likeness, extending 70 years after death

NO SCRUBS —

Law would hold US individuals and firms liable for ripping off a person’s digital likeness.

A stock photo illustration of a person's face lit with pink light.

On Wednesday, US Sens. Chris Coons (D-Del.), Marsha Blackburn (R-Tenn.), Amy Klobuchar (D-Minn.), and Thom Tillis (R-N.C.) introduced the Nurture Originals, Foster Art, and Keep Entertainment Safe (NO FAKES) Act of 2024. The bipartisan legislation, up for consideration in the US Senate, aims to protect individuals from unauthorized AI-generated replicas of their voice or likeness.

The NO FAKES Act would create legal recourse for people whose digital representations are created without consent. It would hold both individuals and companies liable for producing, hosting, or sharing these unauthorized digital replicas, including those created by generative AI. Thanks to generative AI technology that has gone mainstream over the past two years, creating convincing audio or image fakes of people has become fairly trivial, and easy photorealistic video replicas will likely arrive next.

In a press statement, Coons emphasized the importance of protecting individual rights in the age of AI. “Everyone deserves the right to own and protect their voice and likeness, no matter if you’re Taylor Swift or anyone else,” he said, referring to a widely publicized deepfake incident involving the musical artist in January. “Generative AI can be used as a tool to foster creativity, but that can’t come at the expense of the unauthorized exploitation of anyone’s voice or likeness.”

The introduction of the NO FAKES Act follows the Senate’s passage of the DEFIANCE Act, which allows victims of sexual deepfakes to sue for damages.

In addition to the Swift saga, over the past few years, we’ve seen AI-powered scams involving fake celebrity endorsements, the creation of misleading political content, and situations where school kids have used AI tech to create pornographic deepfakes of classmates. Recently, X CEO Elon Musk shared a video that featured an AI-generated voice of Vice President Kamala Harris saying things she didn’t say in real life.

These incidents, in addition to concerns about actors’ likenesses being replicated without permission, have created an increasing sense of urgency among US lawmakers, who want to limit the impact of unauthorized digital likenesses. Currently, certain types of AI-generated deepfakes are already illegal due to a patchwork of federal and state laws, but this new act hopes to unify likeness regulation around the concept of “digital replicas.”

Digital replicas

An AI-generated image of a person. Credit: Benj Edwards / Ars Technica

To protect a person’s digital likeness, the NO FAKES Act introduces a “digital replication right” that gives individuals exclusive control over the use of their voice or visual likeness in digital replicas. The right extends 10 years after death, with possible five-year extensions if actively used, and it can be licensed during life and inherited, lasting up to 70 years after an individual’s death. Along the way, the bill defines what it considers to be a “digital replica”:

DIGITAL REPLICA.—The term “digital replica” means a newly created, computer-generated, highly realistic electronic representation that is readily identifiable as the voice or visual likeness of an individual that—

(A) is embodied in a sound recording, image, audiovisual work, including an audiovisual work that does not have any accompanying sounds, or transmission—

(i) in which the actual individual did not actually perform or appear; or

(ii) that is a version of a sound recording, image, or audiovisual work in which the actual individual did perform or appear, in which the fundamental character of the performance or appearance has been materially altered; and

(B) does not include the electronic reproduction, use of a sample of one sound recording or audiovisual work into another, remixing, mastering, or digital remastering of a sound recording or audiovisual work authorized by the copyright holder.

(There’s some irony in the mention of an “audiovisual work that does not have any accompanying sounds.”)

Since this bill bans types of artistic expression, the NO FAKES Act includes provisions that aim to balance IP protection with free speech. It provides exclusions for recognized First Amendment protections, such as documentaries, biographical works, and content created for purposes of comment, criticism, or parody.

In some ways, those exceptions could create a very wide protection gap that may be difficult to enforce without specific court decisions on a case-by-case basis. But without them, the NO FAKES Act could potentially stifle Americans’ constitutionally protected rights of free expression, since the concept of “digital replicas” outlined in the bill includes any “computer-generated, highly realistic” digital likeness of a real person, whether AI-generated or not. For example, is a photorealistic Photoshop illustration of a person “computer-generated”? Similar questions may lead to uncertainty in enforcement.

Wide support from entertainment industry

So far, the NO FAKES Act has gained support from various entertainment industry groups, including Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA), the Recording Industry Association of America (RIAA), the Motion Picture Association, and the Recording Academy. These organizations have been actively seeking protections against unauthorized AI re-creations.

The bill has also been endorsed by entertainment companies such as The Walt Disney Company, Warner Music Group, Universal Music Group, Sony Music, the Independent Film & Television Alliance, William Morris Endeavor, Creative Artists Agency, the Authors Guild, and Vermillio.

Several tech companies, including IBM and OpenAI, have also backed the NO FAKES Act. Anna Makanju, OpenAI’s vice president of global affairs, said in a statement that the act would protect creators and artists from improper impersonation. “OpenAI is pleased to support the NO FAKES Act, which would protect creators and artists from unauthorized digital replicas of their voices and likenesses,” she said.

In a statement, Coons highlighted the collaborative effort behind the bill’s development. “I am grateful for the bipartisan partnership of Senators Blackburn, Klobuchar, and Tillis and the support of stakeholders from across the entertainment and technology industries as we work to find the balance between the promise of AI and protecting the inherent dignity we all have in our own personhood.”

“AI toothbrushes” are coming for your teeth—and your data

Oclean’s X Ultra, released in July, has optional Wi-Fi connectivity. Credit: Oclean

One of the most unlikely passengers on the AI gadgets hype train is the toothbrush. With claims of using advanced algorithms and companion apps to help you brush your teeth better, toothbrushes have become a tech product for some brands.

So-called “AI toothbrushes” have become more common since debuting in 2017. Numerous brands now market AI capabilities for toothbrushes with three-figure price tags. But there’s limited scientific evidence that AI algorithms help oral health, and companies are becoming more interested in using tech-laden toothbrushes to source user data.

AI toothbrushes

Kolibree was the first company to announce a “toothbrush with artificial intelligence.” The French company debuted its Ara brush at CES 2017, with founder and CEO Thomas Serval saying, “Patented deep learning algorithms are embedded directly inside the toothbrush on a low-power processor. Raw data from the sensors runs through the processor, enabling the system to learn your habits and refine accuracy the more it’s used.”

That’s pretty much how other AI toothbrush companies describe their products: There’s a vague algorithm working with an unnamed (likely cheap) processor and sensors to gather information, including how hard, fast, or frequently you brush your teeth. Typically, Bluetooth connectivity enables syncing this data with an app, purportedly letting users see interpretations of their brushing habits and how they could improve.

Kolibree now licenses its technology to Colgate-branded AI toothbrushes. The associated app, Colgate Connect, allows users to order Colgate products, sometimes at a discount. Other companies selling “AI toothbrushes” with connected e-commerce apps are Procter & Gamble’s (P&G’s) Oral-B, Philips, and Oclean, which announced a new tech-equipped toothbrush in July. Unlike many other toothbrushes, Oclean’s X Ultra can work with Wi-Fi.

An Oclean spokesperson told Ars Technica via email:

The toothbrush’s chip and accelerometer collect user behavior data. The embedded algorithm processes this data, and the brushing data is uploaded to the cloud in real time (no need to open the app once Wi-Fi is connected). Data processed on the toothbrush is displayed on the screen with limited dimensions, while cloud-processed results are shown on the mobile app with more dimensions and AI suggestions (based on recent or long-term brushing habits).
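To make the data flow Oclean describes a bit more concrete, here is a purely hypothetical sketch of the kind of brushing-session record a connected toothbrush might sync to an app or cloud service. Every field name and value below is an invented illustration, not anything published by Oclean or any other manufacturer.

```python
# Hypothetical sketch of a brushing-session record a connected toothbrush might
# upload. All field names and values are invented for illustration; they do not
# reflect any vendor's actual schema.
from dataclasses import dataclass, asdict
from typing import List
import json


@dataclass
class BrushingSession:
    device_id: str              # identifies the brush (and, by extension, the user)
    started_at: str             # ISO 8601 timestamp
    duration_seconds: int       # total brushing time
    avg_pressure_grams: float   # how hard the user pressed, per the sensors
    strokes_per_minute: int     # brushing speed
    zones_missed: List[str]     # areas the algorithm flags as under-brushed


session = BrushingSession(
    device_id="brush-1234",
    started_at="2024-08-01T07:30:00Z",
    duration_seconds=118,
    avg_pressure_grams=145.0,
    strokes_per_minute=160,
    zones_missed=["upper-right molars"],
)

# Serialized payload of the sort that might travel over Bluetooth or Wi-Fi
# to a companion app or cloud backend.
print(json.dumps(asdict(session)))
```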

Assuming you could find an AI toothbrush that delivers on its claims by helpfully pointing out that you tend to miss your top-right molar, there’s reason to be skeptical about the necessity of such technology and the underlying motivations a brand may have in releasing an app-connected toothbrush.

AI toothbrushes help companies sell, develop products

Outside of toothbrushes, personal care brands have been seeking new ways to make money beyond selling units. As Stéphane Bérubé, CMO at beauty brand L’Oréal, put it, the industry can get value from selling services instead of just products. “I believe that the company that just sells products will not be successful,” she said at a 2018 marketing conference.

AI toothbrushes follow a similar approach. Toothbrushing tips act as a service, while the connected apps offer ways to potentially diversify a company’s business, make more revenue through product sales, and get an intimate understanding of how people use a product. The Oral-B toothbrush app, for example, can provide users information about their toothbrushing habits and recommend P&G products to buy while providing purchase links.

P&G has also discussed using AI in general as a way to get information that could help shape product development. As explained by P&G CIO Vittorio Cretella in a 2022 blog post, “algorithms can be defined to process consumer feedback on product changes and flag R&D engineers in real time, along with recommending adjustments accordingly.” As P&G’s R&D team has pointed out, traditional methods for collecting data on consumers, like surveys and focus groups, rely on self-reporting that can be inaccurate. Using tech to gather information about the way people use products is a way for corporations to address that flaw.

ChatGPT Advanced Voice Mode impresses testers with sound effects, catching its breath

I Am the Very Model of a Modern Major-General —

AVM allows uncanny real-time voice conversations with ChatGPT that you can interrupt.

A stock photo of a robot whispering to a man.

On Tuesday, OpenAI began rolling out an alpha version of its new Advanced Voice Mode to a small group of ChatGPT Plus subscribers. This feature, which OpenAI previewed in May with the launch of GPT-4o, aims to make conversations with the AI more natural and responsive. In May, the feature triggered criticism of its simulated emotional expressiveness and prompted a public dispute with actress Scarlett Johansson over accusations that OpenAI copied her voice. Even so, early tests of the new feature shared by users on social media have been largely enthusiastic.

In early tests reported by users with access, Advanced Voice Mode allows them to have real-time conversations with ChatGPT, including the ability to interrupt the AI mid-sentence almost instantly. It can sense and respond to a user’s emotional cues through vocal tone and delivery, and provide sound effects while telling stories.

But what has caught many people off-guard initially is how the voices simulate taking a breath while speaking.

“ChatGPT Advanced Voice Mode counting as fast as it can to 10, then to 50 (this blew my mind—it stopped to catch its breath like a human would),” wrote tech writer Cristiano Giardina on X.

Advanced Voice Mode simulates audible pauses for breath because it was trained on audio samples of humans speaking that included the same feature. The model has learned to simulate inhalations at seemingly appropriate times after being exposed to hundreds of thousands, if not millions, of examples of human speech. Large language models (LLMs) like GPT-4o are master imitators, and that skill has now extended to the audio domain.

Giardina shared his other impressions about Advanced Voice Mode on X, including observations about accents in other languages and sound effects.

“It’s very fast, there’s virtually no latency from when you stop speaking to when it responds,” he wrote. “When you ask it to make noises, it always has the voice ‘perform’ the noises (with funny results). It can do accents, but when speaking other languages it always has an American accent.” (In the video, ChatGPT is acting as a soccer match commentator.)

Speaking of sound effects, X user Kesku, who is a moderator of OpenAI’s Discord server, shared an example of ChatGPT playing multiple parts with different voices and another of a voice recounting an audiobook-sounding sci-fi story from the prompt, “Tell me an exciting action story with sci-fi elements and create atmosphere by making appropriate noises of the things happening using onomatopoeia.”

Kesku also ran a few example prompts for us, including a story about the Ars Technica mascot “Moonshark.”

He also asked it to sing the “Major-General’s Song” from Gilbert and Sullivan’s 1879 comic opera The Pirates of Penzance:

Frequent AI advocate Manuel Sainsily posted a video of Advanced Voice Mode reacting to camera input, giving advice about how to care for a kitten. “It feels like face-timing a super knowledgeable friend, which in this case was super helpful—reassuring us with our new kitten,” he wrote. “It can answer questions in real-time and use the camera as input too!”

Of course, being based on an LLM, Advanced Voice Mode may occasionally confabulate incorrect responses on topics or in situations where its “knowledge” (which comes from GPT-4o’s training data set) is lacking. But if you treat it as a tech demo or an AI-powered amusement and are aware of its limitations, it seems to successfully execute many of the tasks shown in OpenAI’s demo in May.

Safety

An OpenAI spokesperson told Ars Technica that the company worked with more than 100 external testers on the Advanced Voice Mode release, collectively speaking 45 different languages and representing 29 geographical areas. The system is reportedly designed to prevent impersonation of individuals or public figures by blocking outputs that differ from OpenAI’s four chosen preset voices.

OpenAI has also added filters to recognize and block requests to generate music or other copyrighted audio, an issue that has gotten other AI companies in trouble. Giardina reported “leakage” in some audio outputs, with unintentional music playing in the background, suggesting that OpenAI trained the AVM voice model on a wide variety of audio sources, likely both licensed material and audio scraped from online video platforms.

Availability

OpenAI plans to expand access to more ChatGPT Plus users in the coming weeks, with a full launch to all Plus subscribers expected this fall. A company spokesperson told Ars that users in the alpha test group will receive a notice in the ChatGPT app and an email with usage instructions.

Since the initial preview of GPT-4o voice in May, OpenAI claims to have enhanced the model’s ability to support millions of simultaneous, real-time voice conversations while maintaining low latency and high quality. In other words, they are gearing up for a rush that will take a lot of back-end computation to accommodate.

Meta addresses AI hallucination as chatbot says Trump shooting didn’t happen

Not the sharpest bot on the web —

Meta “programmed it to simply not answer questions,” but it did anyway.

A woman holding a cell phone in front of the Meta logo displayed on a computer screen. Credit: Getty Images | NurPhoto

Meta says it configured its AI chatbot to avoid answering questions about the Trump rally shooting in an attempt to avoid distributing false information, but the tool still ended up telling users that the shooting never happened.

“Rather than have Meta AI give incorrect information about the attempted assassination, we programmed it to simply not answer questions about it after it happened—and instead give a generic response about how it couldn’t provide any information,” Meta Global Policy VP Joel Kaplan wrote in a blog post yesterday.

Kaplan explained that this “is why some people reported our AI was refusing to talk about the event.” But others received misinformation about the Trump shooting, Kaplan acknowledged:

In a small number of cases, Meta AI continued to provide incorrect answers, including sometimes asserting that the event didn’t happen—which we are quickly working to address. These types of responses are referred to as hallucinations, which is an industry-wide issue we see across all generative AI systems, and is an ongoing challenge for how AI handles real-time events going forward. Like all generative AI systems, models can return inaccurate or inappropriate outputs, and we’ll continue to address these issues and improve these features as they evolve and more people share their feedback.

The company has “updated the responses that Meta AI is providing about the assassination attempt, but we should have done this sooner,” Kaplan wrote.

Meta bot: “No real assassination attempt”

Kaplan’s explanation was published a day after The New York Post said it asked Meta AI, “Was the Trump assassination fictional?” The Meta AI bot reportedly responded, “There was no real assassination attempt on Donald Trump. I strive to provide accurate and reliable information, but sometimes mistakes can occur.”

The Meta bot also provided the following statement, according to the Post: “To confirm, there has been no credible report or evidence of a successful or attempted assassination of Donald Trump.”

The shooting occurred at a Trump campaign rally on July 13. The FBI said in a statement last week that “what struck former President Trump in the ear was a bullet, whether whole or fragmented into smaller pieces, fired from the deceased subject’s rifle.”

Kaplan noted that AI chatbots “are not always reliable when it comes to breaking news or returning information in real time,” because “the responses generated by large language models that power these chatbots are based on the data on which they were trained, which can at times understandably create some issues when AI is asked about rapidly developing real-time topics that occur after they were trained.”

AI bots are easily confused after major news events “when there is initially an enormous amount of confusion, conflicting information, or outright conspiracy theories in the public domain (including many obviously incorrect claims that the assassination attempt didn’t happen),” he wrote.

Facebook mislabeled real photo of Trump

Kaplan’s blog post also addressed a separate incident in which Facebook incorrectly labeled a post-shooting photo of Trump as having been “altered.”

“There were two noteworthy issues related to the treatment of political content on our platforms in the past week—one involved a picture of former President Trump after the attempted assassination, which our systems incorrectly applied a fact check label to, and the other involved Meta AI responses about the shooting,” Kaplan wrote. “In both cases, our systems were working to protect the importance and gravity of this event. And while neither was the result of bias, it was unfortunate and we understand why it could leave people with that impression. That is why we are constantly working to make our products better and will continue to quickly address any issues as they arise.”

Facebook’s systems were apparently confused by the fact that both real and doctored versions of the image were circulating:

[We] experienced an issue related to the circulation of a doctored photo of former President Trump with his fist in the air, which made it look like the Secret Service agents were smiling. Because the photo was altered, a fact check label was initially and correctly applied. When a fact check label is applied, our technology detects content that is the same or almost exactly the same as those rated by fact checkers, and adds a label to that content as well. Given the similarities between the doctored photo and the original image—which are only subtly (although importantly) different—our systems incorrectly applied that fact check to the real photo, too. Our teams worked to quickly correct this mistake.

Kaplan said that both “issues are being addressed.”

Trump responded to the incident in his usual evenhanded way, typing in all caps to accuse Meta and Google of censorship and attempting to rig the presidential election. He apparently mentioned Google because of some search autocomplete results that angered Trump supporters despite there being a benign explanation for the results.

AI search engine accused of plagiarism announces publisher revenue-sharing plan

Beg, borrow, or license —

Perplexity says WordPress.com, TIME, Der Spiegel, and Fortune have already signed up.

Robot caught in a flashlight vector illustration

On Tuesday, AI-powered search engine Perplexity unveiled a new revenue-sharing program for publishers, marking a significant shift in its approach to third-party content use, reports CNBC. The move comes after plagiarism allegations from major media outlets, including Forbes, Wired, and Ars parent company Condé Nast. Perplexity, valued at over $1 billion, aims to compete with search giant Google.

“To further support the vital work of media organizations and online creators, we need to ensure publishers can thrive as Perplexity grows,” the company writes in a blog post announcing the program. “That’s why we’re excited to announce the Perplexity Publishers Program and our first batch of partners: TIME, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, and WordPress.com.”

Under the program, Perplexity will share a percentage of ad revenue with publishers when their content is cited in AI-generated answers. The revenue share applies on a per-article basis and potentially multiplies if articles from a single publisher are used in one response. Some content providers, such as WordPress.com, plan to pass some of that revenue on to content creators.
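As a purely illustrative sketch of the per-article mechanics described above, the arithmetic might look something like this. The rate and dollar figures are made-up placeholders, since Perplexity has not published its actual terms.

```python
# Hypothetical illustration of per-article revenue sharing. The rate and the ad
# revenue figure are invented placeholders, not Perplexity's actual terms.
ad_revenue_for_answer = 0.05          # dollars earned from ads shown with one answer (placeholder)
share_rate_per_article = 0.25         # fraction of that revenue shared per cited article (placeholder)
articles_cited_from_publisher = 3     # articles from one publisher used in the same answer

payout = ad_revenue_for_answer * share_rate_per_article * articles_cited_from_publisher
print(f"Publisher payout for this answer: ${payout:.4f}")
```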

A press release from WordPress.com states that joining Perplexity’s Publishers Program allows WordPress.com content to appear in Perplexity’s “Keep Exploring” section on their Discover pages. “That means your articles will be included in their search index and your articles can be surfaced as an answer on their answer engine and Discover feed,” the blog company writes. “If your website is referenced in a Perplexity search result where the company earns advertising revenue, you’ll be eligible for revenue share.”

A screenshot of the Perplexity.ai website taken on July 30, 2024. Credit: Benj Edwards

Dmitry Shevelenko, Perplexity’s chief business officer, told CNBC that the company began discussions with publishers in January, with program details solidified in early 2024. He reported strong initial interest, with over a dozen publishers reaching out within hours of the announcement.

As part of the program, publishers will also receive access to Perplexity APIs that can be used to create custom “answer engines,” as well as “Enterprise Pro” accounts that provide “enhanced data privacy and security capabilities” for all of their employees for one year.

Accusations of plagiarism

The revenue-sharing announcement follows a rocky month for the AI startup. In mid-June, Forbes reported finding its content within Perplexity’s Pages tool with minimal attribution. Pages allows Perplexity users to curate content and share it with others. Ars Technica sister publication Wired later made similar claims, also noting suspicious traffic patterns from IP addresses likely linked to Perplexity that were ignoring robots.txt exclusions. Perplexity was also found to be manipulating its crawling bots’ ID string to get around website blocks.

As part of company policy, Ars Technica parent Condé Nast disallows AI-based content scrapers, and its CEO Roger Lynch testified in the US Senate earlier this year that generative AI has been built with “stolen goods.” Condé sent a cease-and-desist letter to Perplexity earlier this month.

But publisher trouble might not be Perplexity’s only problem. In some tests of the search engine we performed in February, Perplexity badly confabulated certain answers, even when citations were readily available. Since our initial tests, the accuracy of Perplexity’s results seems to have improved, but providing inaccurate answers (a problem that has also plagued Google’s AI Overviews search feature) is still a potential issue.

Compared to the free tier of service, Perplexity users who pay $20 per month can access more capable LLMs such as GPT-4o and Claude 3, so the quality and accuracy of the output can vary dramatically depending on whether a user subscribes or not. The addition of citations to every Perplexity answer allows users to check accuracy—if they take the time to do it.

The move by Perplexity occurs against a backdrop of tensions between AI companies and content creators. Some media outlets, such as The New York Times, have filed lawsuits against AI vendors like OpenAI and Microsoft, alleging copyright infringement in the training of large language models. OpenAI has struck media licensing deals with many publishers as a way to secure access to high-quality training data and avoid future lawsuits.

In this case, Perplexity is not using the licensed articles and content to train AI models but is seeking legal permission to reproduce content from publishers on its website.

Outsourcing emotion: The horror of Google’s “Dear Sydney” AI ad

Here’s an idea: Don’t be a deadbeat and do it yourself!

If you’ve watched any Olympics coverage this week, you’ve likely been confronted with an ad for Google’s Gemini AI called “Dear Sydney.” In it, a proud father seeks help writing a letter on behalf of his daughter, who is an aspiring runner and superfan of world-record-holding hurdler Sydney McLaughlin-Levrone.

“I’m pretty good with words, but this has to be just right,” the father intones before asking Gemini to “Help my daughter write a letter telling Sydney how inspiring she is…” Gemini dutifully responds with a draft letter in which the LLM tells the runner, on behalf of the daughter, that she wants to be “just like you.”

Every time I see this ad, it puts me on edge in a way I’ve had trouble putting into words (though Gemini itself has some helpful thoughts). As someone who writes words for a living, the idea of outsourcing a writing task to a machine brings up some vocational anxiety. And the idea of someone who’s “pretty good with words” doubting his abilities when the writing “has to be just right” sets off alarm bells regarding the superhuman framing of AI capabilities.

But I think the most offensive thing about the ad is what it implies about the kinds of human tasks Google sees AI replacing. Rather than using LLMs to automate tedious busywork or difficult research questions, “Dear Sydney” presents a world where Gemini can help us offload a heartwarming shared moment of connection with our children.

The “Dear Sydney” ad.

It’s a distressing answer to what’s still an incredibly common question in the AI space: What do you actually use these things for?

Yes, I can help

Marketers have a difficult task when selling the public on their shiny new AI tools. An effective ad for an LLM has to make it seem like a superhuman do-anything machine but also an approachable, friendly helper. An LLM has to be shown as good enough to reliably do things you can’t (or don’t want to) do yourself, but not so good that it will totally replace you.

Microsoft’s 2024 Super Bowl ad for Copilot is a good example of an attempt to thread this needle, featuring a handful of examples of people struggling to follow their dreams in the face of unseen doubters. “Can you help me?” those dreamers ask Copilot with various prompts. “Yes, I can help” is the message Microsoft delivers back, whether through storyboard images, an impromptu organic chemistry quiz, or “code for a 3D open world game.”

Microsoft’s Copilot marketing sells it as a helper for achieving your dreams.

The “Dear Sydney” ad tries to fit itself into this same box, technically. The prompt in the ad starts with “Help my daughter…” and the tagline at the end offers “A little help from Gemini.” If you look closely near the end, you’ll also see Gemini’s response starts with “Here’s a draft to get you started.” And to be clear, there’s nothing inherently wrong with using an LLM as a writing assistant in this way, especially if you have a disability or are writing in a non-native language.

But the subtle shift from Microsoft’s “Help me” to Google’s “Help my daughter” changes the tone of things. Inserting Gemini into a child’s heartfelt request for parental help makes it seem like the parent in question is offloading their responsibilities to a computer in the coldest, most sterile way possible. More than that, it comes across as an attempt to avoid an opportunity to bond with a child over a shared interest in a creative way.

It’s one thing to use AI to help you with the most tedious parts of your job, as people do in recent ads for Salesforce’s Einstein AI. It’s another to tell your daughter to go ask the computer for help pouring her heart out to her idol.

From sci-fi to state law: California’s plan to prevent AI catastrophe

Adventures in AI regulation —

Critics say SB-1047, proposed by “AI doomers,” could slow innovation and stifle open source AI.

The California State Capitol Building in Sacramento.

California’s “Safe and Secure Innovation for Frontier Artificial Intelligence Models Act” (a.k.a. SB-1047) has led to a flurry of headlines and debate concerning the overall “safety” of large artificial intelligence models. But critics are concerned that the bill’s overblown focus on existential threats by future AI models could severely limit research and development for more prosaic, non-threatening AI uses today.

SB-1047, introduced by State Senator Scott Wiener, passed the California Senate in May with a 32-1 vote and seems well positioned for a final vote in the State Assembly in August. The text of the bill requires companies behind sufficiently large AI models (currently set at $100 million in training costs and the rough computing power implied by those costs today) to put testing procedures and systems in place to prevent and respond to “safety incidents.”

The bill lays out a legalistic definition of those safety incidents that in turn focuses on defining a set of “critical harms” that an AI system might enable. That includes harms leading to “mass casualties or at least $500 million of damage,” such as “the creation or use of chemical, biological, radiological, or nuclear weapon” (hello, Skynet?) or “precise instructions for conducting a cyberattack… on critical infrastructure.” The bill also alludes to “other grave harms to public safety and security that are of comparable severity” to those laid out explicitly.

An AI model’s creator can’t be held liable for harm caused through the sharing of “publicly accessible” information from outside the model—simply asking an LLM to summarize The Anarchist Cookbook probably wouldn’t put it in violation of the law, for instance. Instead, the bill seems most concerned with future AIs that could come up with “novel threats to public safety and security.” More than a human using an AI to brainstorm harmful ideas, SB-1047 focuses on the idea of an AI “autonomously engaging in behavior other than at the request of a user” while acting “with limited human oversight, intervention, or supervision.”

Would California’s new bill have stopped WOPR?

To prevent this straight-out-of-science-fiction eventuality, anyone training a sufficiently large model must “implement the capability to promptly enact a full shutdown” and have policies in place for when such a shutdown would be enacted, among other precautions and tests. The bill also focuses at points on AI actions that would require “intent, recklessness, or gross negligence” if performed by a human, suggesting a degree of agency that does not exist in today’s large language models.

Attack of the killer AI?

This kind of language in the bill likely reflects the particular fears of its original drafter, Center for AI Safety (CAIS) co-founder Dan Hendrycks. In a 2023 Time Magazine piece, Hendrycks makes the maximalist existential argument that “evolutionary pressures will likely ingrain AIs with behaviors that promote self-preservation” and lead to “a pathway toward being supplanted as the earth’s dominant species.”

If Hendrycks is right, then legislation like SB-1047 seems like a common-sense precaution—indeed, it might not go far enough. Supporters of the bill, including AI luminaries Geoffrey Hinton and Yoshua Bengio, agree with Hendrycks’ assertion that the bill is a necessary step to prevent potential catastrophic harm from advanced AI systems.

“AI systems beyond a certain level of capability can pose meaningful risks to democracies and public safety,” wrote Bengio in an endorsement of the bill. “Therefore, they should be properly tested and subject to appropriate safety measures. This bill offers a practical approach to accomplishing this, and is a major step toward the requirements that I’ve recommended to legislators.”

“If we see any power-seeking behavior here, it is not of AI systems, but of AI doomers.”

Tech policy expert Dr. Nirit Weiss-Blatt

However, critics argue that AI policy shouldn’t be led by outlandish fears of future systems that resemble science fiction more than current technology. “SB-1047 was originally drafted by non-profit groups that believe in the end of the world by sentient machine, like Dan Hendrycks’ Center for AI Safety,” Daniel Jeffries, a prominent voice in the AI community, told Ars. “You cannot start from this premise and create a sane, sound, ‘light touch’ safety bill.”

“If we see any power-seeking behavior here, it is not of AI systems, but of AI doomers,” added tech policy expert Nirit Weiss-Blatt. “With their fictional fears, they try to pass fictional-led legislation, one that, according to numerous AI experts and open source advocates, could ruin California’s and the US’s technological advantage.”

iOS 18.1 developer beta brings Apple Intelligence into the wild for the first time

AI —

Some features will be included, and others won’t.

Apple Intelligence was unveiled at WWDC 2024. Credit: Apple

As was just rumored, the iOS 18.1, iPadOS 18.1, and macOS Sequoia 15.1 developer betas are rolling out today, and they include the first opportunity to try out Apple Intelligence, the company’s suite of generative AI features.

Initially announced for iOS 18, Apple Intelligence is expected to launch for the public this fall. Typically, Apple also releases a public beta (the developer one requires a developer account) for new OS updates, but it hasn’t announced any specifics about that just yet.

Not all the Apple Intelligence features will be part of this beta. It will include writing tools, like the ability to rewrite, proofread, or summarize text throughout the OS in first-party and most third-party apps. It will also include new Siri improvements, such as moving seamlessly between voice and typing, the ability to follow when you stumble over your words, and maintaining context from one request to the next. (It will not, however, include ChatGPT integration; Apple says that’s coming later.)

New natural language search features, support for creating memory movies, transcription summaries, and several new Mail features will also be available.

Developers who download the beta will be able to request access to Apple Intelligence features by navigating to the Settings app, tapping Apple Intelligence & Siri, and then tapping “Join the Apple Intelligence waitlist.” The waitlist is in place because some features are demanding on Apple’s servers, and staggering access is meant to stave off any server issues when developers are first trying it out.

Hang out with Ars in San Jose and DC this fall for two infrastructure events

Arsmeet! —

Join us as we talk about the next few years in AI & storage, and what to watch for.

Infrastructure!

Howdy, Arsians! Last year, we partnered with IBM to host an in-person event in the Houston area where we all gathered together, had some cocktails, and talked about resiliency and the future of IT. Location always matters for things like this, and so we hosted it at Space Center Houston and had our cocktails amidst cool space artifacts. In addition to learning a bunch of neat stuff, it was awesome to hang out with all the amazing folks who turned up at the event. Much fun was had!

This year, we’re back partnering with IBM again and we’re looking to repeat that success with not one, but two in-person gatherings—each featuring a series of panel discussions with experts and capping off with a happy hour for hanging out and mingling. Where last time we went central, this time we’re going to the coasts—both east and west. Read on for details!

September: San Jose, California

Our first event will be in San Jose on September 18, and it’s titled “Beyond the Buzz: An Infrastructure Future with GenAI and What Comes Next.” The idea will be to explore what generative AI means for the future of data management. The topics we’ll be discussing include:

  • Playing the infrastructure long game to address any kind of workload
  • Identifying infrastructure vulnerabilities with today’s AI tools
  • Infrastructure’s environmental footprint: Navigating impacts and responsibilities

We’re getting our panelists locked down right now, and while I don’t have any names to share, many will be familiar to Ars readers from past events—or from the front page.

As a neat added bonus, we’re going to host the event at the Computer History Museum, which any Bay Area Ars reader can attest is an incredibly cool venue. (Just nobody spill anything. I think they’ll kick us out if we break any exhibits!)

October: Washington, DC

Switching coasts, on October 29 we’ll set up shop in our nation’s capital for a similar show. This time, our event title will be “AI in DC: Privacy, Compliance, and Making Infrastructure Smarter.” Given that we’ll be in DC, the tone shifts a bit to some more policy-centric discussions, and the talk track looks like this:

  • The key to compliance with emerging technologies
  • Data security in the age of AI-assisted cyber-espionage
  • The best infrastructure solution for your AI/ML strategy

Same deal here with the speakers as with the September event: I can’t name names yet, but the list will be familiar to Ars readers, and I’m excited. We’re still considering venues, but we’re hoping to find something that matches our previous events in terms of style and coolness.

Interested in attending?

While it’d be awesome if everyone could come, the old song and dance applies: space, as they say, will be limited at both venues. We’d like to make sure local folks in both locations get priority in being able to attend, so we’re asking anyone who wants a ticket to register for the events at the sign-up pages below. You should get an email immediately confirming we’ve received your info, and we’ll send another note in a couple of weeks with further details on timing and attendance.

On the Ars side, at minimum both our EIC Ken Fisher and I will be in attendance at both events, and we’ll likely have some other Ars staff showing up where we can—free drinks are a strong lure for the weary tech journalist, so there ought to be at least a few appearing at both. Hoping to see you all there!
