Deepfakes

AI-generated Al Michaels to provide daily recaps during 2024 Summer Olympics

forever young —

AI voice clone will narrate daily Olympics video recaps; critics call it a “code-generated ghoul.”

Al Michaels looks on prior to the game between the Minnesota Vikings and Philadelphia Eagles at Lincoln Financial Field on September 14, 2023, in Philadelphia, Pennsylvania.

On Wednesday, NBC announced plans to use an AI-generated clone of famous sports commentator Al Michaels’ voice to narrate daily streaming video recaps of the 2024 Summer Olympics in Paris, which start on July 26. The AI-powered narration will feature in “Your Daily Olympic Recap on Peacock,” a new offering on NBC’s streaming service. But this new, high-profile use of voice cloning worries critics, who say the technology may muscle out up-and-coming sports commentators by keeping old personas around forever.

NBC says it has created a “high-quality AI re-creation” of Michaels’ voice, trained on Michaels’ past NBC appearances to capture his distinctive delivery style.

The veteran broadcaster, revered in the sports commentator world for his iconic “Do you believe in miracles? Yes!” call during the 1980 Winter Olympics, has been covering sports on TV since 1971, including a high-profile run of play-by-play coverage of NFL football games for both ABC and NBC since the 1980s. NBC dropped him from NFL coverage in 2023, however, possibly due to his age.

Michaels, who is 79 years old, shared his initial skepticism about the project in an interview with Vanity Fair, as NBC News notes. After hearing the AI version of his voice, which can greet viewers by name, he described the experience as “astonishing” and “a little bit frightening.” He said the AI recreation was “almost 2% off perfect” in mimicking his style.

The Vanity Fair article provides some insight into how NBC’s new AI system works. It first uses a large language model (similar technology to what powers ChatGPT) to analyze subtitles and metadata from NBC’s Olympics video coverage, summarizing events and writing custom output to imitate Michaels’ style. This text is then fed into an unspecified voice AI model trained on Michaels’ previous NBC appearances, reportedly replicating his unique pronunciations and intonations.
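
The article doesn’t include any of NBC’s code, but the two-stage pipeline it describes—a language model that turns captions and metadata into a Michaels-style script, followed by a voice model that reads it aloud—can be sketched at a high level. Everything below (function names, prompts, and the llm and voice_model interfaces) is a hypothetical illustration, not NBC’s actual system.

```python
# Hypothetical sketch of the two-stage recap pipeline described above.
# The interfaces and names are assumptions for illustration, not NBC's system.

def summarize_event(subtitles: str, metadata: dict, llm) -> str:
    """Stage 1: an LLM condenses captions + metadata into a Michaels-style script."""
    prompt = (
        "Summarize this Olympic event in the style of sportscaster Al Michaels.\n"
        f"Event metadata: {metadata}\n"
        f"Subtitles: {subtitles}"
    )
    return llm.generate(prompt)


def narrate(script: str, voice_model) -> bytes:
    """Stage 2: a voice model trained on past broadcasts reads the script aloud."""
    return voice_model.synthesize(text=script, voice="al_michaels_clone")


def personalized_recap(viewer_name: str, events: list, llm, voice_model) -> list:
    """Greet the viewer by name, then narrate the events selected for them."""
    clips = [narrate(f"Good morning, {viewer_name}.", voice_model)]
    for event in events:
        script = summarize_event(event["subtitles"], event["metadata"], llm)
        clips.append(narrate(script, voice_model))
    return clips
```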

NBC estimates that the system could generate nearly 7 million personalized variants of the recaps across the US during the games, pulled from the network’s 5,000 hours of live coverage. Using the system, each Peacock user will receive about 10 minutes of personalized highlights.

A diminished role for humans in the future?

Al Michaels reports on the Sweden vs. USA men’s ice hockey game at the 1980 Olympic Winter Games on February 12, 1980.

It’s no secret that while AI is wildly hyped right now, it’s also controversial in some circles. Upon hearing the NBC announcement, critics of AI technology reacted strongly. “@NBCSports, this is gross,” tweeted actress and filmmaker Justine Bateman, who frequently uses X to criticize technologies that might replace human writers or performers in the future.

A thread of similar responses from X users reacting to NBC’s sample video included criticisms such as, “Sounds pretty off when it’s just the same tone for every single word.” Another user wrote, “It just sounds so unnatural. No one talks like that.”

The technology will not replace NBC’s regular human sports commentators during this year’s Olympics coverage, and like other forms of AI, it leans heavily on existing human work by analyzing and regurgitating human-created content in the form of captions pulled from NBC footage.

Looking down the line, due to AI media cloning technologies like voice, video, and image synthesis, today’s celebrities may be able to attain a form of media immortality that allows new iterations of their likenesses to persist through the generations, potentially earning licensing fees for whoever holds the rights.

We’ve already seen it with James Earl Jones licensing an AI re-creation of his Darth Vader voice, and the trend will likely continue with other celebrity voices, provided the money is right. Eventually, it may extend to famous musicians through music synthesis and famous actors in video-synthesis applications as well.

The possibility of being muscled out by AI replicas factored heavily into a Hollywood actors’ strike last year, with SAG-AFTRA union President Fran Drescher saying, “If we don’t stand tall right now, we are all going to be in trouble. We are all going to be in jeopardy of being replaced by machines.”

For companies that like to monetize media properties for as long as possible, AI may provide a way to maintain a media legacy through automation. But future human performers may have to compete against all of the greatest performers of the past, rendered through AI, to break out and forge a new career—provided there will be room for human performers at all.

“Al Michaels became Al Michaels because he was brought in to replace people who died, or retired, or moved on,” tweeted a writer named Geonn Cannon on X. “If he can’t do the job anymore, it’s time to let the next Al Michaels have a shot at it instead of just planting a code-generated ghoul in an empty chair.”

Political deepfakes are the most popular way to misuse AI

This is not going well —

Study from Google’s DeepMind lays out nefarious ways AI is being used.

Artificial intelligence-generated “deepfakes” that impersonate politicians and celebrities are far more prevalent than efforts to use AI to assist cyber attacks, according to the first research by Google’s DeepMind division into the most common malicious uses of the cutting-edge technology.

The study said the creation of realistic but fake images, video, and audio of people was almost twice as common as the next highest misuse of generative AI tools: the falsifying of information using text-based tools, such as chatbots, to generate misinformation to post online.

The most common goal of actors misusing generative AI was to shape or influence public opinion, the analysis, conducted with the search group’s research and development unit Jigsaw, found. That accounted for 27 percent of uses, feeding into fears over how deepfakes might influence elections globally this year.

Deepfakes of UK Prime Minister Rishi Sunak, as well as other global leaders, have appeared on TikTok, X, and Instagram in recent months. UK voters go to the polls next week in a general election.

Concern is widespread that, despite social media platforms’ efforts to label or remove such content, audiences may not recognize these as fake, and dissemination of the content could sway voters.

Ardi Janjeva, research associate at The Alan Turing Institute, called “especially pertinent” the paper’s finding that the contamination of publicly accessible information with AI-generated content could “distort our collective understanding of sociopolitical reality.”

Janjeva added: “Even if we are uncertain about the impact that deepfakes have on voting behavior, this distortion may be harder to spot in the immediate term and poses long-term risks to our democracies.”

The study is the first of its kind by DeepMind, Google’s AI unit led by Sir Demis Hassabis, and is an attempt to quantify the risks from the use of generative AI tools, which the world’s biggest technology companies have rushed out to the public in search of huge profits.

As generative products such as OpenAI’s ChatGPT and Google’s Gemini become more widely used, AI companies are beginning to monitor the flood of misinformation and other potentially harmful or unethical content created by their tools.

In May, OpenAI released research revealing operations linked to Russia, China, Iran, and Israel had been using its tools to create and spread disinformation.

“There had been a lot of understandable concern around quite sophisticated cyber attacks facilitated by these tools,” said Nahema Marchal, lead author of the study and researcher at Google DeepMind. “Whereas what we saw were fairly common misuses of GenAI [such as deepfakes that] might go under the radar a little bit more.”

Google DeepMind and Jigsaw’s researchers analyzed around 200 observed incidents of misuse between January 2023 and March 2024, taken from social media platforms X and Reddit, as well as online blogs and media reports of misuse.

The second most common motivation behind misuse was to make money, whether offering services to create deepfakes, including generating naked depictions of real people, or using generative AI to create swaths of content, such as fake news articles.

The research found that most incidents use easily accessible tools, “requiring minimal technical expertise,” meaning more bad actors can misuse generative AI.

Google DeepMind’s research will influence how the company improves its evaluations for testing model safety, and it hopes the work will also affect how its competitors and other stakeholders understand the ways “harms are manifesting.”

© 2024 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Runway’s latest AI video generator brings giant cotton candy monsters to life

Screen capture of a Runway Gen-3 Alpha video generated with the prompt “A giant humanoid, made of fluffy blue cotton candy, stomping on the ground, and roaring to the sky, clear blue sky behind them.”

On Sunday, Runway announced a new AI video synthesis model called Gen-3 Alpha that’s still under development, but it appears to create video of similar quality to OpenAI’s Sora, which debuted earlier this year (and has also not yet been released). It can generate novel, high-definition video from text prompts that range from realistic humans to surrealistic monsters stomping the countryside.

Unlike Runway’s previous best model from June 2023, which could only create two-second-long clips, Gen-3 Alpha can reportedly create 10-second-long video segments of people, places, and things that have a consistency and coherency that easily surpasses Gen-2. If 10 seconds sounds short compared to Sora’s full minute of video, consider that the company is working with a shoestring budget of compute compared to more lavishly funded OpenAI—and actually has a history of shipping video generation capability to commercial users.

Gen-3 Alpha does not generate audio to accompany the video clips, and it’s highly likely that temporally coherent generations (those that keep a character consistent over time) are dependent on similar high-quality training material. But Runway’s improvement in visual fidelity over the past year is difficult to ignore.

AI video heats up

It’s been a busy couple of weeks for AI video synthesis in the AI research community, including the launch of the Chinese model Kling, created by Beijing-based Kuaishou Technology (sometimes called “Kwai”). Kling can generate two minutes of 1080p HD video at 30 frames per second with a level of detail and coherency that reportedly matches Sora.

Gen-3 Alpha prompt: “Subtle reflections of a woman on the window of a train moving at hyper-speed in a Japanese city.”

Not long after Kling debuted, people on social media began creating surreal AI videos using Luma AI’s Luma Dream Machine. These videos were novel and weird but generally lacked coherency; we tested out Dream Machine and were not impressed by anything we saw.

Meanwhile, one of the original text-to-video pioneers, New York City-based Runway—founded in 2018—recently found itself the butt of memes that showed its Gen-2 tech falling out of favor compared to newer video synthesis models. That may have spurred the announcement of Gen-3 Alpha.

Gen-3 Alpha prompt: “An astronaut running through an alley in Rio de Janeiro.”

Generating realistic humans has always been tricky for video synthesis models, so Runway specifically shows off Gen-3 Alpha’s ability to create what its developers call “expressive” human characters with a range of actions, gestures, and emotions. The company’s provided examples weren’t particularly expressive, however—mostly people just slowly staring and blinking—though they do look realistic.

Provided human examples include generated videos of a woman on a train, an astronaut running through a street, a man with his face lit by the glow of a TV set, a woman driving a car, and a woman running, among others.

Gen-3 Alpha prompt: “A close-up shot of a young woman driving a car, looking thoughtful, blurred green forest visible through the rainy car window.”

The generated demo videos also include more surreal video synthesis examples, including a giant creature walking in a rundown city, a man made of rocks walking in a forest, and the giant cotton candy monster seen below, which is probably the best video on the entire page.

Gen-3 Alpha prompt: “A giant humanoid, made of fluffy blue cotton candy, stomping on the ground, and roaring to the sky, clear blue sky behind them.”

Gen-3 will power various Runway AI editing tools (one of the company’s most notable claims to fame), including Multi Motion Brush, Advanced Camera Controls, and Director Mode. It can create videos from text or image prompts.

Runway says that Gen-3 Alpha is the first in a series of models trained on a new infrastructure designed for large-scale multimodal training, taking a step toward the development of what it calls “General World Models,” which are hypothetical AI systems that build internal representations of environments and use them to simulate future events within those environments.

Deepfakes in the courtroom: US judicial panel debates new AI evidence rules

adventures in 21st-century justice —

Panel of eight judges confronts deep-faking AI tech that may undermine legal trials.

An illustration of a man with a very long nose holding up the scales of justice.

On Friday, a federal judicial panel convened in Washington, DC, to discuss the challenges of policing AI-generated evidence in court trials, according to a Reuters report. The US Judicial Conference’s Advisory Committee on Evidence Rules, an eight-member panel responsible for drafting evidence-related amendments to the Federal Rules of Evidence, heard from computer scientists and academics about the potential risks of AI being used to manipulate images and videos or create deepfakes that could disrupt a trial.

The meeting took place amid broader efforts by federal and state courts nationwide to address the rise of generative AI models (such as those that power OpenAI’s ChatGPT or Stability AI’s Stable Diffusion), which can be trained on large datasets with the aim of producing realistic text, images, audio, or videos.

In the published 358-page agenda for the meeting, the committee offers up this definition of a deepfake and the problems AI-generated media may pose in legal trials:

A deepfake is an inauthentic audiovisual presentation prepared by software programs using artificial intelligence. Of course, photos and videos have always been subject to forgery, but developments in AI make deepfakes much more difficult to detect. Software for creating deepfakes is already freely available online and fairly easy for anyone to use. As the software’s usability and the videos’ apparent genuineness keep improving over time, it will become harder for computer systems, much less lay jurors, to tell real from fake.

During Friday’s three-hour hearing, the panel wrestled with the question of whether existing rules, which predate the rise of generative AI, are sufficient to ensure the reliability and authenticity of evidence presented in court.

Some judges on the panel, such as US Circuit Judge Richard Sullivan and US District Judge Valerie Caproni, reportedly expressed skepticism about the urgency of the issue, noting that there have been few instances so far of judges being asked to exclude AI-generated evidence.

“I’m not sure that this is the crisis that it’s been painted as, and I’m not sure that judges don’t have the tools already to deal with this,” said Judge Sullivan, as quoted by Reuters.

Last year, US Supreme Court Chief Justice John Roberts acknowledged the potential benefits of AI for litigants and judges while emphasizing the need for the judiciary to consider its proper uses in litigation. US District Judge Patrick Schiltz, the evidence committee’s chair, said that determining how the judiciary can best react to AI is one of Roberts’ priorities.

In Friday’s meeting, the committee considered several deepfake-related rule changes. In the agenda for the meeting, US District Judge Paul Grimm and attorney Maura Grossman proposed modifying Federal Rule 901(b)(9) (see page 5), which involves authenticating or identifying evidence. They also recommended the addition of a new rule, 901(c), which might read:

901(c): Potentially Fabricated or Altered Electronic Evidence. If a party challenging the authenticity of computer-generated or other electronic evidence demonstrates to the court that it is more likely than not either fabricated, or altered in whole or in part, the evidence is admissible only if the proponent demonstrates that its probative value outweighs its prejudicial effect on the party challenging the evidence.

The panel agreed during the meeting that this proposal to address concerns about litigants challenging evidence as deepfakes did not work as written and that it will be reworked before being reconsidered later.

Another proposal by Andrea Roth, a law professor at the University of California, Berkeley, suggested subjecting machine-generated evidence to the same reliability requirements as expert witnesses. However, Judge Schiltz cautioned that such a rule could hamper prosecutions by allowing defense lawyers to challenge any digital evidence without establishing a reason to question it.

For now, no definitive rule changes have been made, and the process continues. But we’re witnessing the first steps of how the US justice system will adapt to an entirely new class of media-generating technology.

Putting aside risks from AI-generated evidence, generative AI has led to embarrassing moments for lawyers in court over the past two years. In May 2023, US lawyer Steven Schwartz of the firm Levidow, Levidow, & Oberman apologized to a judge for using ChatGPT to help write court filings that inaccurately cited six nonexistent cases, leading to serious questions about the reliability of AI in legal research. Also, in November, a lawyer for Michael Cohen cited three fake cases that had apparently been invented by a confabulating AI assistant.

Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

pics and it didn’t happen —

YouTube videos of 6K celebrities helped train AI model to animate photos in real time.

A sample image from Microsoft for “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.”

On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don’t require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

“It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors,” reads the abstract of the accompanying research paper titled, “VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.” It’s the work of Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, and Baining Guo.

The VASA framework (short for “Visual Affective Skills Animator”) uses machine learning to analyze a static image along with a speech audio clip. It is then able to generate a realistic video with precise facial expressions, head movements, and lip-syncing to the audio. It does not clone or simulate voices (like other Microsoft research) but relies on an existing audio input that could be specially recorded or spoken for a particular purpose.

Microsoft claims the model significantly outperforms previous speech animation methods in terms of realism, expressiveness, and efficiency. To our eyes, it does seem like an improvement over single-image animating models that have come before.

AI research efforts to animate a single photo of a person or character extend back at least a few years, but more recently, researchers have been working on automatically synchronizing a generated video to an audio track. In February, an AI model called EMO: Emote Portrait Alive from Alibaba’s Institute for Intelligent Computing research group made waves with a similar approach to VASA-1 that can automatically sync an animated photo to a provided audio track (they call it “Audio2Video”).

Trained on YouTube clips

Microsoft researchers trained VASA-1 on the VoxCeleb2 dataset created in 2018 by three researchers from the University of Oxford. That dataset contains “over 1 million utterances for 6,112 celebrities,” according to the VoxCeleb2 website, extracted from videos uploaded to YouTube. VASA-1 can reportedly generate videos at 512×512 pixel resolution at up to 40 frames per second with minimal latency, which means it could potentially be used for real-time applications like video conferencing.
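
For a rough sense of what those figures imply for real-time use, the arithmetic below converts the reported frame rate into a per-frame generation budget. The only numbers taken from the paper are the 512×512 resolution and 40 fps quoted above; the rest is back-of-the-envelope illustration.

```python
# Back-of-the-envelope arithmetic: what generating 512x512 frames at 40 fps
# implies for real-time use. Illustrative only; not measurements from the paper.
fps = 40
frame_budget_ms = 1000 / fps           # 25 ms to produce each frame
pixels_per_second = 512 * 512 * fps    # ~10.5 million pixels rendered per second

print(f"Per-frame budget: {frame_budget_ms:.0f} ms")
print(f"Pixel throughput: {pixels_per_second / 1e6:.1f} MP/s")
```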

To show off the model, Microsoft created a VASA-1 research page featuring many sample videos of the tool in action, including people singing and speaking in sync with pre-recorded audio tracks. They show how the model can be controlled to express different moods or change its eye gaze. The examples also include some more fanciful generations, such as the Mona Lisa rapping to an audio track of Anne Hathaway performing “Paparazzi” on Conan O’Brien’s show.

The researchers say that, for privacy reasons, each example photo on their page was AI-generated by StyleGAN2 or DALL-E 3 (aside from the Mona Lisa). But it’s obvious that the technique could apply equally to photos of real people, although it will likely work better if a person appears similar to a celebrity in the training dataset. Still, the researchers say that deepfaking real humans is not their intention.

“We are exploring visual affective skill generation for virtual, interactive charactors [sic], NOT impersonating any person in the real world. This is only a research demonstration and there’s no product or API release plan,” reads the site.

While the Microsoft researchers tout potential positive applications like enhancing educational equity, improving accessibility, and providing therapeutic companionship, the technology could also easily be misused. For example, it could allow people to fake video chats, make real people appear to say things they never actually said (especially when paired with a cloned voice track), or allow harassment from a single social media photo.

Right now, the generated video still looks imperfect in some ways, but it could be fairly convincing for some people if they did not know to expect an AI-generated animation. The researchers say they are aware of this, which is why they are not openly releasing the code that powers the model.

“We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection,” write the researchers. “Currently, the videos generated by this method still contain identifiable artifacts, and the numerical analysis shows that there’s still a gap to achieve the authenticity of real videos.”

VASA-1 is only a research demonstration, but Microsoft is far from the only group developing similar technology. If the recent history of generative AI is any guide, it’s potentially only a matter of time before similar technology becomes open source and freely available—and such models will very likely continue to improve in realism over time.

Florida middle-schoolers charged with making deepfake nudes of classmates

no consent —

AI tool was used to create nudes of 12- to 13-year-old classmates.

Two teenage boys from Miami, Florida, were arrested in December for allegedly creating and sharing AI-generated nude images of male and female classmates without consent, according to police reports obtained by WIRED via public record request.

The arrest reports say the boys, aged 13 and 14, created the images of the students who were “between the ages of 12 and 13.”

The Florida case appears to be the first publicly known instance of arrests and criminal charges resulting from the alleged sharing of AI-generated nude images. The boys were charged with third-degree felonies—the same level of crime as grand theft auto or false imprisonment—under a state law passed in 2022 that makes it a felony to share “any altered sexual depiction” of a person without their consent.

The parent of one of the boys arrested did not respond to a request for comment in time for publication. The parent of the other boy said that he had “no comment.” The detective assigned to the case and the state attorney handling it did not respond to requests for comment in time for publication.

As AI image-making tools have become more widely available, there have been several high-profile incidents in which minors allegedly created AI-generated nude images of classmates and shared them without consent. No arrests have been disclosed in the publicly reported cases—at Issaquah High School in Washington, Westfield High School in New Jersey, and Beverly Vista Middle School in California—even though police reports were filed. At Issaquah High School, police opted not to press charges.

The first media reports of the Florida case appeared in December, saying that the two boys were suspended from Pinecrest Cove Academy in Miami for 10 days after school administrators learned of allegations that they created and shared fake nude images without consent. After parents of the victims learned about the incident, several began publicly urging the school to expel the boys.

Nadia Khan-Roberts, the mother of one of the victims, told NBC Miami in December that the incident was traumatizing for all of the families whose children were victimized. “Our daughters do not feel comfortable walking the same hallways with these boys,” she said. “It makes me feel violated, I feel taken advantage [of], and I feel used,” one victim, who asked to remain anonymous, told the TV station.

WIRED obtained arrest records this week that say the incident was reported to police on December 6, 2023, and that the two boys were arrested on December 22. The records accuse the pair of using “an artificial intelligence application” to make the fake explicit images. The name of the app was not specified, and the reports claim the boys shared the pictures with each other.

“The incident was reported to a school administrator,” the reports say, without specifying who reported it, or how that person found out about the images. After the school administrator “obtained copies of the altered images” the administrator interviewed the victims depicted in them, the reports say, who said that they did not consent to the images being created.

After their arrest, the two boys accused of making the images were transported to the Juvenile Service Department “without incident,” the reports say.

A handful of states have laws on the books that target fake, nonconsensual nude images. There’s no federal law targeting the practice, but a group of US senators recently introduced a bill to combat the problem after fake nude images of Taylor Swift were created and distributed widely on X.

The boys were charged under a Florida law passed in 2022 that state legislators designed to curb harassment involving deepfake images made using AI-powered tools.

Stephanie Cagnet Myron, a Florida lawyer who represents victims of nonconsensually shared nude images, tells WIRED that anyone who creates fake nude images of a minor would be in possession of child sexual abuse material, or CSAM. However, she claims it’s likely that the two boys accused of making and sharing the material were not charged with CSAM possession due to their age.

“There’s specifically several crimes that you can charge in a case, and you really have to evaluate what’s the strongest chance of winning, what has the highest likelihood of success, and if you include too many charges, is it just going to confuse the jury?” Cagnet Myron added.

Mary Anne Franks, a professor at the George Washington University School of Law and a lawyer who has studied the problem of nonconsensual explicit imagery, says it’s “odd” that Florida’s revenge porn law—which predates the 2022 statute under which the boys were charged—makes sharing real nonconsensual images only a misdemeanor, while sharing fake ones like these constitutes a felony.

“It is really strange to me that you impose heftier penalties for fake nude photos than for real ones,” she says.

Franks adds that although she believes distributing nonconsensual fake explicit images should be a criminal offense, thus creating a deterrent effect, she doesn’t believe offenders should be incarcerated, especially not juveniles.

“The first thing I think about is how young the victims are and worried about the kind of impact on them,” Franks says. “But then [I] also question whether or not throwing the book at kids is actually going to be effective here.”

This story originally appeared on wired.com.

Stability announces Stable Diffusion 3, a next-gen AI image generator

Pics and it didn’t happen —

SD3 may bring DALL-E-like prompt fidelity to an open-weights image-synthesis model.

Stable Diffusion 3 generation with the prompt: studio photograph closeup of a chameleon over a black background.

On Thursday, Stability AI announced Stable Diffusion 3, an open-weights, next-generation image-synthesis model. It follows its predecessors and reportedly generates detailed, multi-subject images with improved quality and more accurate text rendering. The brief announcement was not accompanied by a public demo, but Stability is opening up a waitlist today for those who would like to try it.

Stability says that its Stable Diffusion 3 family of models (which take text descriptions called “prompts” and turn them into matching images) ranges in size from 800 million to 8 billion parameters. The size range lets different versions of the model run locally on a variety of devices—from smartphones to servers. Parameter count roughly corresponds to model capability in terms of how much detail it can generate, but larger models also require more VRAM on GPU accelerators to run.
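
To make the VRAM point concrete, here is a rough lower bound on the memory needed just to hold the weights at 16-bit precision for the two ends of the announced size range. The two-bytes-per-parameter figure is an assumption, and real usage is higher once activations, text encoders, and the VAE are included; these are not Stability’s published requirements.

```python
# Rough lower bound on GPU memory needed just to store model weights at fp16.
# Illustrative arithmetic only; actual requirements (activations, text encoders,
# VAE, precision choices) will be higher and are not published by Stability AI.
BYTES_PER_PARAM_FP16 = 2

for name, params in [("SD3 800M", 800e6), ("SD3 8B", 8e9)]:
    weight_gb = params * BYTES_PER_PARAM_FP16 / 1024**3
    print(f"{name}: ~{weight_gb:.1f} GB of VRAM for weights alone")

# SD3 800M: ~1.5 GB of VRAM for weights alone
# SD3 8B:   ~14.9 GB of VRAM for weights alone
```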

Since 2022, we’ve seen Stability launch a progression of AI image-generation models: Stable Diffusion 1.4, 1.5, 2.0, 2.1, XL, XL Turbo, and now 3. Stability has made a name for itself by providing a more open alternative to proprietary image-synthesis models like OpenAI’s DALL-E 3, though not without controversy over the use of copyrighted training data, bias, and the potential for abuse. (This has led to lawsuits that remain unresolved.) Stable Diffusion models have been open-weights and source-available, which means they can be run locally and fine-tuned to change their outputs.

  • Stable Diffusion 3 generation with the prompt: Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says “Stable Diffusion 3” made out of colorful energy.

  • An AI-generated image of a grandma wearing a “Go big or go home” sweatshirt, generated by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: Three transparent glass bottles on a wooden table. The one on the left has red liquid and the number 1. The one in the middle has blue liquid and the number 2. The one on the right has green liquid and the number 3.

  • An AI-generated image created by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: A horse balancing on top of a colorful ball in a field with green grass and a mountain in the background.

  • Stable Diffusion 3 generation with the prompt: Moody still life of assorted pumpkins.

  • Stable Diffusion 3 generation with the prompt: a painting of an astronaut riding a pig wearing a tutu holding a pink umbrella, on the ground next to the pig is a robin bird wearing a top hat, in the corner are the words “stable diffusion.”

  • Stable Diffusion 3 generation with the prompt: Resting on the kitchen table is an embroidered cloth with the text ‘good night’ and an embroidered baby tiger. Next to the cloth there is a lit candle. The lighting is dim and dramatic.

  • Stable Diffusion 3 generation with the prompt: Photo of an 90’s desktop computer on a work desk, on the computer screen it says “welcome”. On the wall in the background we see beautiful graffiti with the text “SD3” very large on the wall.

As far as tech improvements are concerned, Stability CEO Emad Mostaque wrote on X, “This uses a new type of diffusion transformer (similar to Sora) combined with flow matching and other improvements. This takes advantage of transformer improvements & can not only scale further but accept multimodal inputs.”

As Mostaque said, the Stable Diffusion 3 family uses a diffusion transformer architecture, a newer way of creating images with AI that swaps out the usual image-building blocks (such as a U-Net architecture) for a system that works on small pieces of the picture. The method was inspired by transformers, which are good at handling patterns and sequences. This approach not only scales up efficiently but also reportedly produces higher-quality images.

Stable Diffusion 3 also utilizes “flow matching,” which is a technique for creating AI models that can generate images by learning how to transition from random noise to a structured image smoothly. It does this without needing to simulate every step of the process, instead focusing on the overall direction or flow that the image creation should follow.
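
Stability hasn’t released SD3’s training code, but flow matching in general corresponds to a simple objective: the network learns the velocity that carries a noise sample toward a data sample along a straight path, so generation can follow that learned flow instead of simulating many tiny denoising steps. The PyTorch snippet below is a generic, simplified sketch of that idea (no text conditioning, no transformer specifics), not SD3’s implementation.

```python
import torch
import torch.nn.functional as F

def flow_matching_loss(model, images):
    """Generic flow-matching objective (a simplified sketch, not SD3's code):
    regress the velocity that moves a noise sample toward a data sample along
    a straight-line path between them."""
    noise = torch.randn_like(images)                                 # x0 ~ N(0, I)
    t = torch.rand(images.shape[0], 1, 1, 1, device=images.device)   # time in [0, 1]
    x_t = (1 - t) * noise + t * images                               # point on the path
    target_velocity = images - noise                                 # direction of the path
    predicted_velocity = model(x_t, t.flatten())                     # any net taking (x, t)
    return F.mse_loss(predicted_velocity, target_velocity)
```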

A comparison of outputs between OpenAI’s DALL-E 3 and Stable Diffusion 3 with the prompt, “Night photo of a sports car with the text ‘SD3’ on the side, the car is on a race track at high speed, a huge road sign with the text ‘faster.’”

We do not have access to Stable Diffusion 3 (SD3), but from samples we found posted on Stability’s website and associated social media accounts, the generations appear roughly comparable to other state-of-the-art image-synthesis models at the moment, including the aforementioned DALL-E 3, Adobe Firefly, Imagine with Meta AI, Midjourney, and Google Imagen.

SD3 appears to handle text generation very well in the examples provided by others, which are potentially cherry-picked. Text generation was a particular weakness of earlier image-synthesis models, so an improvement to that capability in a free model is a big deal. Also, prompt fidelity (how closely it follows descriptions in prompts) seems to be similar to DALL-E 3, but we haven’t tested that ourselves yet.

While Stable Diffusion 3 isn’t widely available, Stability says that once testing is complete, its weights will be free to download and run locally. “This preview phase, as with previous models,” Stability writes, “is crucial for gathering insights to improve its performance and safety ahead of an open release.”

Stability has been experimenting with a variety of image-synthesis architectures recently. Aside from SDXL and SDXL Turbo, just last week, the company announced Stable Cascade, which uses a three-stage process for text-to-image synthesis.

Will Smith parodies viral AI-generated video by actually eating spaghetti

Mangia, mangia —

Actor pokes fun at 2023 AI video by eating spaghetti messily and claiming it’s AI-generated.

The real Will Smith eating spaghetti, parodying an AI-generated video from 2023.

On Monday, Will Smith posted a video on his official Instagram feed that parodied an AI-generated video of the actor eating spaghetti that went viral last year. With the recent announcement of OpenAI’s Sora video synthesis model, many people have noted the dramatic jump in AI-video quality over the past year compared to the infamous spaghetti video. Smith’s new video plays on that comparison by showing the actual actor eating spaghetti in a comical fashion and claiming that it is AI-generated.

Captioned “This is getting out of hand!”, the Instagram video uses a split-screen layout. On top, labeled “AI Video 1 year ago,” is the original AI-generated spaghetti video created by a Reddit user named “chaindrop” in March 2023. Below that, in a box titled “AI Video Now,” the real Smith appears in 11 video segments, actually eating spaghetti by slurping it up while shaking his head, pouring it into his mouth with his fingers, and even nibbling on a friend’s hair. Lil Jon’s 2006 track “Snap Yo Fingers” plays in the background.

In the Instagram comments section, some people expressed confusion about the new (non-AI) video, saying, “I’m still in doubt if second video was also made by AI or not.” In a reply, someone else wrote, “Boomers are gonna loose [sic] this one. Second one is clearly him making a joke but I wouldn’t doubt it in a couple months time it will get like that.”

We have not yet seen a model with the capability of Sora attempt to create a new Will-Smith-eating-spaghetti AI video, but the result would likely be far better than what we saw last year, even if it contained obvious glitches. Given how things are progressing, we wouldn’t be surprised if by 2025, video synthesis AI models can replicate the parody video created by Smith himself.

It’s worth noting for history’s sake that despite the comparison, the video of Will Smith eating spaghetti did not represent the state of the art in text-to-video synthesis at the time of its creation in March 2023 (that title would likely apply to Runway’s Gen-2, which was then in closed testing). However, the spaghetti video, which used the ModelScope AI model, was reasonably advanced for open-weights models at the time. More capable video-synthesis models had already been released by then, but due to the humorous cultural reference, it’s arguably more fun to compare today’s AI video synthesis to Will Smith grotesquely eating spaghetti than to teddy bears washing dishes.

OpenAI collapses media reality with Sora, a photorealistic AI video generator

Pics and it didn’t happen —

Hello, cultural singularity—soon, every video you see online could be completely fake.

Snapshots from three videos generated using OpenAI’s Sora.

On Thursday, OpenAI announced Sora, a text-to-video AI model that can generate 60-second-long photorealistic HD video from written descriptions. While it’s only a research preview that we have not tested, it reportedly creates synthetic video (but not audio yet) at a fidelity and consistency greater than any text-to-video model available at the moment. It’s also freaking people out.

“It was nice knowing you all. Please tell your grandchildren about my videos and the lengths we went to to actually record them,” wrote Wall Street Journal tech reporter Joanna Stern on X.

“This could be the ‘holy shit’ moment of AI,” wrote Tom Warren of The Verge.

“Every single one of these videos is AI-generated, and if this doesn’t concern you at least a little bit, nothing will,” tweeted YouTube tech journalist Marques Brownlee.

For future reference—since this type of panic will some day appear ridiculous—there’s a generation of people who grew up believing that photorealistic video must be created by cameras. When video was faked (say, for Hollywood films), it took a lot of time, money, and effort to do so, and the results weren’t perfect. That gave people a baseline level of comfort that what they were seeing remotely was likely to be true, or at least representative of some kind of underlying truth. Even when the kid jumped over the lava, there was at least a kid and a room.

The prompt that generated one of OpenAI’s sample videos: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.”

Technology like Sora pulls the rug out from under that kind of media frame of reference. Very soon, every photorealistic video you see online could be 100 percent false in every way. Moreover, every historical video you see could also be false. How we confront that as a society and work around it while maintaining trust in remote communications is far beyond the scope of this article, but I tried my hand at offering some solutions back in 2020, when all of the tech we’re seeing now seemed like a distant fantasy to most people.

In that piece, I called the moment that truth and fiction in media become indistinguishable the “cultural singularity.” It appears that OpenAI is on track to bring that prediction to pass a bit sooner than we expected.

Prompt: Reflections in the window of a train traveling through the Tokyo suburbs.

OpenAI has found that, like other AI models that use the transformer architecture, Sora scales with available compute. Given far more powerful computers behind the scenes, AI video fidelity could improve considerably over time. In other words, this is the “worst” that AI-generated video is ever going to look. There’s no synchronized sound yet, but that might be solved in future models.

How (we think) they pulled it off

AI video synthesis has progressed by leaps and bounds over the past two years. We first covered text-to-video models in September 2022 with Meta’s Make-A-Video. A month later, Google showed off Imagen Video. And just 11 months ago, an AI-generated version of Will Smith eating spaghetti went viral. In May of last year, what was previously considered to be the front-runner in the text-to-video space, Runway Gen-2, helped craft a fake beer commercial full of twisted monstrosities, generated in two-second increments. In earlier video-generation models, people pop in and out of reality with ease, limbs flow together like pasta, and physics doesn’t seem to matter.

Sora (which means “sky” in Japanese) appears to be something altogether different. It’s high-resolution (1920×1080), can generate video with temporal consistency (maintaining the same subject over time) that lasts up to 60 seconds, and appears to follow text prompts with a great deal of fidelity. So, how did OpenAI pull it off?

OpenAI doesn’t usually share insider technical details with the press, so we’re left to speculate based on theories from experts and information given to the public.

OpenAI says that Sora is a diffusion model, much like DALL-E 3 and Stable Diffusion. It generates a video by starting off with noise and “gradually transforms it by removing the noise over many steps,” the company explains. It “recognizes” objects and concepts listed in the written prompt and pulls them out of the noise, so to speak, until a coherent series of video frames emerge.

Sora is capable of generating videos all at once from a text prompt, extending existing videos, or generating videos from still images. It achieves temporal consistency by giving the model “foresight” of many frames at once, as OpenAI calls it, solving the problem of ensuring a generated subject remains the same even if it falls out of view temporarily.

OpenAI represents video as collections of smaller groups of data called “patches,” which the company says are similar to tokens (fragments of a word) in GPT-4. “By unifying how we represent data, we can train diffusion transformers on a wider range of visual data than was possible before, spanning different durations, resolutions, and aspect ratios,” the company writes.
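
OpenAI hasn’t published Sora’s code, but the “patches” idea is roughly the video analog of how vision transformers cut an image into tokens, extended across time. A minimal sketch of that patchification step, with arbitrary assumed dimensions (not OpenAI’s actual scheme), might look like this:

```python
import torch

def video_to_patches(video, patch_size=16, frames_per_patch=4):
    """Split a video tensor into flattened spacetime patches, the video analog of
    ViT image tokens. Conceptual sketch only; dimensions and patch sizes are
    arbitrary assumptions, not OpenAI's actual scheme."""
    # video: (time, channels, height, width)
    t, c, h, w = video.shape
    patches = video.unfold(0, frames_per_patch, frames_per_patch)  # chunk time
    patches = patches.unfold(2, patch_size, patch_size)            # chunk height
    patches = patches.unfold(3, patch_size, patch_size)            # chunk width
    # Group each patch's channel/time/space values together, then flatten every
    # patch into a single token vector.
    patches = patches.permute(0, 2, 3, 1, 4, 5, 6)
    return patches.reshape(-1, c * frames_per_patch * patch_size * patch_size)

tokens = video_to_patches(torch.randn(16, 3, 128, 128))
print(tokens.shape)  # torch.Size([256, 3072]) -- 256 spacetime "tokens"
```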

An important tool in OpenAI’s bag of tricks is that its use of AI models is compounding: earlier models are helping to create more complex ones. Sora follows prompts well because, like DALL-E 3, it utilizes synthetic captions—generated by another AI model such as GPT-4V—that describe the scenes in its training data. And the company is not stopping here. “Sora serves as a foundation for models that can understand and simulate the real world,” OpenAI writes, “a capability we believe will be an important milestone for achieving AGI.”

One question on many people’s minds is what data OpenAI used to train Sora. OpenAI has not revealed its dataset, but based on what people are seeing in the results, it’s possible OpenAI is using synthetic video data generated in a video game engine in addition to sources of real video (say, scraped from YouTube or licensed from stock video libraries). Nvidia’s Dr. Jim Fan, who is a specialist in training AI with synthetic data, wrote on X, “I won’t be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!” Until confirmed by OpenAI, however, that’s just speculation.

Facebook rules allowing fake Biden “pedophile” video deemed “incoherent”

Not to be misled —

Meta may revise AI policies that experts say overlook “more misleading” content.

A fake video manipulated to falsely depict President Joe Biden inappropriately touching his granddaughter has revealed flaws in Facebook’s “deepfake” policies, Meta’s Oversight Board concluded Monday.

Last year when the Biden video went viral, Facebook repeatedly ruled that it did not violate policies on hate speech, manipulated media, or bullying and harassment. Since the Biden video is not AI-generated content and does not manipulate the president’s speech—making him appear to say things he’s never said—the video was deemed OK to remain on the platform. Meta also noted that the video was “unlikely to mislead” the “average viewer.”

“The video does not depict President Biden saying something he did not say, and the video is not the product of artificial intelligence or machine learning in a way that merges, combines, replaces, or superimposes content onto the video (the video was merely edited to remove certain portions),” Meta’s blog said.

The Oversight Board—an independent panel of experts—reviewed the case and ultimately upheld Meta’s decision despite being “skeptical” that current policies work to reduce harms.

“The board sees little sense in the choice to limit the Manipulated Media policy to cover only people saying things they did not say, while excluding content showing people doing things they did not do,” the board said, noting that Meta claimed this distinction was made because “videos involving speech were considered the most misleading and easiest to reliably detect.”

The board called upon Meta to revise its “incoherent” policies that it said appear to be more concerned with regulating how content is created, rather than with preventing harms. For example, the Biden video’s caption described the president as a “sick pedophile” and called out anyone who would vote for him as “mentally unwell,” which could affect “electoral processes” that Meta could choose to protect, the board suggested.

“Meta should reconsider this policy quickly, given the number of elections in 2024,” the Oversight Board said.

One problem, the Oversight Board suggested, is that in its rush to combat AI technologies that make generating deepfakes a fast, cheap, and easy business, Meta policies currently overlook less technical ways of manipulating content.

Instead of using AI, the Biden video relied on basic video-editing technology to edit out the president placing an “I Voted” sticker on his adult granddaughter’s chest. The crude edit looped a 7-second clip altered to make the president appear to be, as Meta described in its blog, “inappropriately touching a young woman’s chest and kissing her on the cheek.”

Meta making this distinction is confusing, the board said, partly because videos altered using non-AI technologies are not considered less misleading or less prevalent on Facebook.

The board recommended that Meta update policies to cover not just AI-generated videos, but other forms of manipulated media, including all forms of manipulated video and audio. Audio fakes currently not covered in the policy, the board warned, offer fewer cues to alert listeners to the inauthenticity of recordings and may even be considered “more misleading than video content.”

Notably, earlier this year, a fake Biden robocall attempted to mislead Democratic voters in New Hampshire by encouraging them not to vote. The Federal Communications Commission promptly responded by declaring AI-generated robocalls illegal, but the Federal Election Commission was not able to act as swiftly to regulate AI-generated misleading campaign ads easily spread on social media, AP reported. In a statement, Oversight Board Co-Chair Michael McConnell said that manipulated audio is “one of the most potent forms of electoral disinformation.”

To better combat known harms, the board suggested that Meta revise its Manipulated Media policy to “clearly specify the harms it is seeking to prevent.”

Rather than pushing Meta to remove more content, however, the board urged Meta to use “less restrictive” methods of coping with fake content, such as relying on fact-checkers applying labels noting that content is “significantly altered.” In public comments, some Facebook users agreed that labels would be most effective. Others urged Meta to “start cracking down” and remove all fake videos, with one suggesting that removing the Biden video should have been a “deeply easy call.” Another commenter suggested that the Biden video should be considered acceptable speech, as harmless as a funny meme.

While the board wants Meta to also expand its policies to cover all forms of manipulated audio and video, it cautioned that including manipulated photos in the policy could “significantly expand” the policy’s scope and make it harder to enforce.

“If Meta sought to label videos, audio, and photographs but only captured a small portion, this could create a false impression that non-labeled content is inherently trustworthy,” the board warned.

Meta should therefore stop short of adding manipulated images to the policy, the board said. Instead, Meta should conduct research into the effects of manipulated photos and then consider updates when the company is prepared to enforce a ban on manipulated photos at scale, the board recommended. In the meantime, Meta should move quickly to update policies ahead of a busy election year where experts and politicians globally are bracing for waves of misinformation online.

“The volume of misleading content is rising, and the quality of tools to create it is rapidly increasing,” McConnell said. “Platforms must keep pace with these changes, especially in light of global elections during which certain actors seek to mislead the public.”

Meta’s spokesperson told Ars that Meta is “reviewing the Oversight Board’s guidance and will respond publicly to their recommendations within 60 days.”

As 2024 election looms, OpenAI says it is taking steps to prevent AI abuse

Don’t Rock the vote —

ChatGPT maker plans transparency for gen AI content and improved access to voting info.

A pixelated photo of Donald Trump.

On Monday, ChatGPT maker OpenAI detailed its plans to prevent the misuse of its AI technologies during the upcoming elections in 2024, promising transparency in AI-generated content and enhancing access to reliable voting information. The AI developer says it is working on an approach that involves policy enforcement, collaboration with partners, and the development of new tools aimed at classifying AI-generated media.

“As we prepare for elections in 2024 across the world’s largest democracies, our approach is to continue our platform safety work by elevating accurate voting information, enforcing measured policies, and improving transparency,” writes OpenAI in its blog post. “Protecting the integrity of elections requires collaboration from every corner of the democratic process, and we want to make sure our technology is not used in a way that could undermine this process.”

Initiatives proposed by OpenAI include preventing abuse by means such as deepfakes or bots imitating candidates, refining usage policies, and launching a reporting system for the public to flag potential abuses. For example, OpenAI’s image generation tool, DALL-E 3, includes built-in filters that reject requests to create images of real people, including politicians. “For years, we’ve been iterating on tools to improve factual accuracy, reduce bias, and decline certain requests,” the company stated.

OpenAI says it regularly updates its Usage Policies for ChatGPT and its API products to prevent misuse, especially in the context of elections. The organization has implemented restrictions on using its technologies for political campaigning and lobbying until it better understands the potential for personalized persuasion. Also, OpenAI prohibits creating chatbots that impersonate real individuals or institutions and disallows the development of applications that could deter people from “participation in democratic processes.” Users can report GPTs that may violate the rules.

OpenAI claims to be proactively engaged in detailed strategies to safeguard its technologies against misuse. According to its statements, this includes red-teaming new systems to anticipate challenges, engaging with users and partners for feedback, and implementing robust safety mitigations. OpenAI asserts that these efforts are integral to its mission of continually refining AI tools for improved accuracy, reduced biases, and responsible handling of sensitive requests.

Regarding transparency, OpenAI says it is advancing its efforts in classifying image provenance. The company plans to embed digital credentials, using cryptographic techniques, into images produced by DALL-E 3 as part of its adoption of standards by the Coalition for Content Provenance and Authenticity. Additionally, OpenAI says it is testing a tool designed to identify DALL-E-generated images.
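
OpenAI hasn’t described the exact mechanism, but content credentials of the kind C2PA defines boil down to binding signed provenance metadata to the image bytes so that later tampering or stripping is detectable. The sketch below illustrates that idea with a plain SHA-256 hash and an Ed25519 signature; it is not the C2PA manifest format or OpenAI’s implementation.

```python
# Bare-bones illustration of signed provenance metadata: hash the image bytes,
# sign a small claim containing that hash, and ship the signature with the file
# so any later edit invalidates it. NOT the actual C2PA format or OpenAI's code.
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

signing_key = ed25519.Ed25519PrivateKey.generate()  # held by the image generator
verify_key = signing_key.public_key()               # published for verifiers

def attach_credential(image_bytes: bytes, generator: str = "example-model") -> dict:
    claim = generator.encode() + hashlib.sha256(image_bytes).digest()
    return {"claim": claim, "signature": signing_key.sign(claim)}

def verify_credential(image_bytes: bytes, credential: dict,
                      generator: str = "example-model") -> bool:
    expected = generator.encode() + hashlib.sha256(image_bytes).digest()
    if expected != credential["claim"]:
        return False  # image bytes or claimed generator don't match the claim
    try:
        verify_key.verify(credential["signature"], credential["claim"])
        return True
    except Exception:
        return False  # signature invalid: claim was forged or altered

cred = attach_credential(b"...png bytes...")
print(verify_credential(b"...png bytes...", cred))     # True
print(verify_credential(b"...edited bytes...", cred))  # False
```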

In an effort to connect users with authoritative information, particularly concerning voting procedures, OpenAI says it has partnered with the National Association of Secretaries of State (NASS) in the United States. ChatGPT will direct users to CanIVote.org for verified US voting information.

“We want to make sure that our AI systems are built, deployed, and used safely,” writes OpenAI. “Like any new technology, these tools come with benefits and challenges. They are also unprecedented, and we will keep evolving our approach as we learn more about how our tools are used.”

Report: Deepfake porn consistently found atop Google, Bing search results

Shocking results —

Google vows to create more safeguards to protect victims of deepfake porn.

Popular search engines like Google and Bing are making it easy to surface nonconsensual deepfake pornography by placing it at the top of search results, NBC News reported Thursday.

These controversial deepfakes superimpose faces of real women, often celebrities, onto the bodies of adult entertainers to make them appear to be engaging in real sex. Thanks in part to advances in generative AI, there is now a burgeoning black market for deepfake porn that could be discovered through a Google search, NBC News previously reported.

NBC News uncovered the problem by turning off safe search, then combining the names of 36 female celebrities with obvious search terms like “deepfakes,” “deepfake porn,” and “fake nudes.” Bing generated links to deepfake videos in top results 35 times, while Google did so 34 times. Bing also surfaced “fake nude photos of former teen Disney Channel female actors” using images in which the actors appear to be underage.

A Google spokesperson told NBC that the tech giant understands “how distressing this content can be for people affected by it” and is “actively working to bring more protections to Search.”

According to Google’s spokesperson, this controversial content sometimes appears because “Google indexes content that exists on the web,” just “like any search engine.” But while searches using terms like “deepfake” may generate results consistently, Google “actively” designs “ranking systems to avoid shocking people with unexpected harmful or explicit content that they aren’t looking for,” the spokesperson said.

Currently, the only way to remove nonconsensual deepfake porn from Google search results is for the victim to submit a form personally or through an “authorized representative.” That form requires victims to meet three requirements: showing that they’re “identifiably depicted” in the deepfake; the “imagery in question is fake and falsely depicts” them as “nude or in a sexually explicit situation”; and the imagery was distributed without their consent.

While this gives victims some course of action to remove content, experts are concerned that search engines need to do more to effectively reduce the prevalence of deepfake pornography available online—which right now is rising at a rapid rate.

This emerging issue increasingly affects average people and even children, not just celebrities. Last June, child safety experts discovered thousands of realistic but fake AI child sex images being traded online, around the same time that the FBI warned that the use of AI-generated deepfakes in sextortion schemes was increasing.

And nonconsensual deepfake porn isn’t just being traded in black markets online. In November, New Jersey police launched a probe after high school teens used AI image generators to create and share fake nude photos of female classmates.

With tech companies seemingly slow to stop the rise in deepfakes, some states have passed laws criminalizing deepfake porn distribution. Last July, Virginia amended its existing law criminalizing revenge porn to include any “falsely created videographic or still image.” In October, New York passed a law specifically focused on banning deepfake porn, imposing a $1,000 fine and up to a year of jail time on violators. Congress has also introduced legislation that creates criminal penalties for spreading deepfake porn.

Although Google told NBC News that its search features “don’t allow manipulated media or sexually explicit content,” the outlet’s investigation seemingly found otherwise. NBC News also noted that Google’s Play app store hosts an app that was previously marketed for creating deepfake porn, despite prohibiting “apps determined to promote or perpetuate demonstrably misleading or deceptive imagery, videos and/or text.” This suggests that Google’s remediation efforts blocking deceptive imagery may be inconsistent.

Google told Ars that it will soon be strengthening its policies against apps featuring AI-generated restricted content in the Play Store. A generative AI policy taking effect on January 31 will require all apps to comply with developer policies that ban AI-generated restricted content, including deceptive content and content that facilitates the exploitation or abuse of children.

Experts told NBC News that “Google’s lack of proactive patrolling for abuse has made it and other search engines useful platforms for people looking to engage in deepfake harassment campaigns.”

Google is currently “in the process of building more expansive safeguards, with a particular focus on removing the need for known victims to request content removals one by one,” Google’s spokesperson told NBC News.

Microsoft’s spokesperson told Ars that Microsoft updated its process for reporting concerns with Bing searches to include non-consensual intimate imagery (NCII) used in “deepfakes” last August because it had become a “significant concern.” Like Google, Microsoft allows victims to report NCII deepfakes by submitting a web form to request removal from search results, understanding that any sharing of NCII is “a gross violation of personal privacy and dignity with devastating effects for victims.”

In the past, Microsoft President Brad Smith has said that among all dangers that AI poses, deepfakes worry him most, but deepfakes fueling “foreign cyber influence operations” seemingly concern him more than deepfake porn.

This story was updated on January 11 to include information on Google’s AI-generated content policy and on January 12 to include information from Microsoft.
