AI image generators


OpenAI’s flawed plan to flag deepfakes ahead of 2024 elections

As the US moves toward criminalizing deepfakes—deceptive AI-generated audio, images, and videos that are increasingly hard to discern from authentic content online—tech companies have rushed to roll out tools to help everyone better detect AI content.

But efforts so far have been imperfect, and experts fear that social media platforms may not be ready to handle the ensuing AI chaos during major global elections in 2024—despite tech giants committing to making tools specifically to combat AI-fueled election disinformation. The best AI detection remains observant humans, who, by paying close attention to deepfakes, can pick up on flaws like AI-generated people with extra fingers or AI voices that speak without pausing for a breath.

Among the splashiest tools announced this week, OpenAI shared details today about a new AI image detection classifier that it claims can detect about 98 percent of AI outputs from its own sophisticated image generator, DALL-E 3. It also “currently flags approximately 5 to 10 percent of images generated by other AI models,” OpenAI’s blog said.

According to OpenAI, the classifier provides a binary “true/false” response “indicating the likelihood of the image being AI-generated by DALL·E 3.” A screenshot of the tool shows how it can also be used to display a straightforward content summary confirming that “this content was generated with an AI tool” and includes fields ideally flagging the “app or device” and AI tool used.

To develop the tool, OpenAI spent months adding tamper-resistant metadata to “all images created and edited by DALL·E 3” that “can be used to prove the content comes” from “a particular source.” The detector reads this metadata to accurately flag DALL-E 3 images as fake.

That metadata follows “a widely used standard for digital content certification” set by the Coalition for Content Provenance and Authenticity (C2PA), often likened to a nutrition label. And reinforcing that standard has become “an important aspect” of OpenAI’s approach to AI detection beyond DALL-E 3, OpenAI said. When OpenAI broadly launches its video generator, Sora, C2PA metadata will be integrated into that tool as well, OpenAI said.

Of course, this solution is not comprehensive because that metadata could always be removed, and “people can still create deceptive content without this information (or can remove it),” OpenAI said, “but they cannot easily fake or alter this information, making it an important resource to build trust.”
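The trade-off OpenAI describes (metadata that is hard to alter but easy to strip) can be illustrated with a minimal sketch. This is not the C2PA format or OpenAI's implementation: the article does not describe the signing scheme, so the example substitutes a shared-secret HMAC from Python's standard library, and `SECRET_KEY`, `sign`, `verify`, and the claim fields are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical key; a real provenance system would not use a shared secret like this.
SECRET_KEY = b"hypothetical-signing-key"

def sign(image_bytes, claims):
    """Bind the claims to the image content and return a hex signature."""
    payload = json.dumps(claims, sort_keys=True).encode() + hashlib.sha256(image_bytes).digest()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

def verify(image_bytes, claims, signature):
    """Report whether the provenance data is valid, altered, or simply absent."""
    if claims is None or signature is None:
        return "no provenance data"
    expected = sign(image_bytes, claims)
    return "valid" if hmac.compare_digest(expected, signature) else "tampered"

image = b"fake image bytes for illustration"
claims = {"generator": "DALL-E 3", "ai_generated": True}
signature = sign(image, claims)

intact = verify(image, claims, signature)
altered = verify(image, {"generator": "camera", "ai_generated": False}, signature)
stripped = verify(image, None, None)
```

Here `intact` is "valid" and `altered` is "tampered", but `stripped` only reports that nothing is attached. That last case is the gap the article points to: a missing label says nothing about whether an image is AI-generated.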

Because OpenAI is all in on C2PA, the AI leader announced today that it would join the C2PA steering committee to help drive broader adoption of the standard. OpenAI will also launch a $2 million fund with Microsoft to support broader “AI education and understanding,” seemingly partly in the hopes that the more people understand about the importance of AI detection, the less likely they will be to remove this metadata.

“As adoption of the standard increases, this information can accompany content through its lifecycle of sharing, modification, and reuse,” OpenAI said. “Over time, we believe this kind of metadata will be something people come to expect, filling a crucial gap in digital content authenticity practices.”

OpenAI joining the committee “marks a significant milestone for the C2PA and will help advance the coalition’s mission to increase transparency around digital media as AI-generated content becomes more prevalent,” C2PA said in a blog.



Stability announces Stable Diffusion 3, a next-gen AI image generator

Pics and it didn’t happen —

SD3 may bring DALL-E-like prompt fidelity to an open-weights image-synthesis model.

Stable Diffusion 3 generation with the prompt: studio photograph closeup of a chameleon over a black background.

On Thursday, Stability AI announced Stable Diffusion 3, an open-weights next-generation image-synthesis model. It follows its predecessors by reportedly generating detailed, multi-subject images with improved quality and accuracy in text generation. The brief announcement was not accompanied by a public demo, but Stability is opening up a waitlist today for those who would like to try it.

Stability says that its Stable Diffusion 3 family of models (which takes text descriptions called “prompts” and turns them into matching images) ranges in size from 800 million to 8 billion parameters. The size range allows different versions of the model to run locally on a variety of devices, from smartphones to servers. Parameter size roughly corresponds to model capability in terms of how much detail it can generate. Larger models also require more VRAM on GPU accelerators to run.
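As a rough illustration of why parameter count drives VRAM needs, the memory for the weights alone is the parameter count times the bytes stored per parameter. The sketch below assumes 16-bit weights (2 bytes each), which Stability has not stated, and it ignores activations, text encoders, and other runtime overhead, so real requirements would be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory needed for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

# The two ends of the size range quoted by Stability.
sizes = {"800M": 800_000_000, "8B": 8_000_000_000}

for label, params in sizes.items():
    print(f"{label}: ~{weight_memory_gb(params):.1f} GB of weights at 16-bit precision")
```

Under these assumptions, the smallest model needs about 1.6 GB for its weights and the largest about 16 GB.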

Since 2022, we’ve seen Stability launch a progression of AI image-generation models: Stable Diffusion 1.4, 1.5, 2.0, 2.1, XL, XL Turbo, and now 3. Stability has made a name for itself by providing a more open alternative to proprietary image-synthesis models like OpenAI’s DALL-E 3, though not without controversy due to the use of copyrighted training data, bias, and the potential for abuse. (This has led to lawsuits that are unresolved.) Stable Diffusion models have been open-weights and source-available, which means the models can be run locally and fine-tuned to change their outputs.

  • Stable Diffusion 3 generation with the prompt: Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says “Stable Diffusion 3” made out of colorful energy.

  • An AI-generated image of a grandma wearing a “Go big or go home” sweatshirt generated by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: Three transparent glass bottles on a wooden table. The one on the left has red liquid and the number 1. The one in the middle has blue liquid and the number 2. The one on the right has green liquid and the number 3.

  • An AI-generated image created by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: A horse balancing on top of a colorful ball in a field with green grass and a mountain in the background.

  • Stable Diffusion 3 generation with the prompt: Moody still life of assorted pumpkins.

  • Stable Diffusion 3 generation with the prompt: a painting of an astronaut riding a pig wearing a tutu holding a pink umbrella, on the ground next to the pig is a robin bird wearing a top hat, in the corner are the words “stable diffusion.”

  • Stable Diffusion 3 generation with the prompt: Resting on the kitchen table is an embroidered cloth with the text ‘good night’ and an embroidered baby tiger. Next to the cloth there is a lit candle. The lighting is dim and dramatic.

  • Stable Diffusion 3 generation with the prompt: Photo of an 90’s desktop computer on a work desk, on the computer screen it says “welcome”. On the wall in the background we see beautiful graffiti with the text “SD3” very large on the wall.

As far as tech improvements are concerned, Stability CEO Emad Mostaque wrote on X, “This uses a new type of diffusion transformer (similar to Sora) combined with flow matching and other improvements. This takes advantage of transformer improvements & can not only scale further but accept multimodal inputs.”

As Mostaque said, the Stable Diffusion 3 family uses a diffusion transformer architecture, which is a new way of creating images with AI that swaps out the usual image-building blocks (such as U-Net architecture) for a system that works on small pieces of the picture. The method was inspired by transformers, which are good at handling patterns and sequences. This approach not only scales up efficiently but also reportedly produces higher-quality images.
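The "small pieces of the picture" idea can be sketched as cutting an image grid into square patches and flattening each one into a token, which gives a transformer a sequence to process. This is a toy illustration only: the patch size, the single-channel 4×4 grid, and the function name `patchify` are assumptions, and Stability has not published how SD3 builds its patches.

```python
def patchify(image, patch):
    """Split a 2D grid (a list of rows) into flattened patch-by-patch tokens, in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one square patch into a single token (a flat list of values).
            tokens.append([
                image[row][col]
                for row in range(top, top + patch)
                for col in range(left, left + patch)
            ])
    return tokens

# A toy 4x4 "image" whose pixel values are 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
tokens = patchify(image, 2)
```

With a patch size of 2, the 4×4 grid becomes a sequence of four tokens, each holding four values. A real model would also project each token into an embedding and add position information before the transformer layers.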

Stable Diffusion 3 also utilizes “flow matching,” which is a technique for creating AI models that can generate images by learning how to transition from random noise to a structured image smoothly. It does this without needing to simulate every step of the process, instead focusing on the overall direction or flow that the image creation should follow.
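A minimal sketch of the idea, assuming a straight-line path between noise and data (one common flow-matching choice; the announcement does not say which formulation SD3 uses). During training, the model sees a single point on the path at a random time and is scored on how well it predicts the direction of travel, so no step-by-step simulation is needed. All function names are hypothetical, and the `oracle` function stands in for a trained network.

```python
import random

def point_on_path(noise, data, t):
    """Linear interpolation: pure noise at t=0, the data sample at t=1."""
    return [(1 - t) * n + t * d for n, d in zip(noise, data)]

def target_velocity(noise, data):
    """Along a straight path the velocity is constant: data minus noise."""
    return [d - n for n, d in zip(noise, data)]

def training_loss(predict, noise, data, t):
    """Mean squared error between predicted and target velocity at one time t."""
    x_t = point_on_path(noise, data, t)
    predicted = predict(x_t, t)
    target = target_velocity(noise, data)
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)

def euler_sample(noise, predict, steps):
    """Generate a sample by following the predicted velocity from t=0 to t=1."""
    x = list(noise)
    dt = 1.0 / steps
    for i in range(steps):
        velocity = predict(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, velocity)]
    return x

random.seed(0)
data = [1.0, -2.0, 0.5]  # hypothetical stand-in for image values
noise = [random.gauss(0.0, 1.0) for _ in data]

def oracle(x, t):
    """Stand-in for a trained network: returns the exact target velocity."""
    return target_velocity(noise, data)

loss = training_loss(oracle, noise, data, random.random())
sample = euler_sample(noise, oracle, steps=4)
```

Because the stand-in predicts the velocity perfectly, the loss is zero and four Euler steps carry the noise to the data point. A real network would only approximate the velocity field, and it would be trained over many noise and data pairs.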

A comparison of outputs between OpenAI’s DALL-E 3 and Stable Diffusion 3 with the prompt, “Night photo of a sports car with the text ‘SD3’ on the side, the car is on a race track at high speed, a huge road sign with the text ‘faster.’”

We do not have access to Stable Diffusion 3 (SD3), but from samples we found posted on Stability’s website and associated social media accounts, the generations appear roughly comparable to other state-of-the-art image-synthesis models at the moment, including the aforementioned DALL-E 3, Adobe Firefly, Imagine with Meta AI, Midjourney, and Google Imagen.

SD3 appears to handle text generation very well in the examples provided by others, which are potentially cherry-picked. Text generation was a particular weakness of earlier image-synthesis models, so an improvement to that capability in a free model is a big deal. Also, prompt fidelity (how closely it follows descriptions in prompts) seems to be similar to DALL-E 3, but we haven’t tested that ourselves yet.

While Stable Diffusion 3 isn’t widely available, Stability says that once testing is complete, its weights will be free to download and run locally. “This preview phase, as with previous models,” Stability writes, “is crucial for gathering insights to improve its performance and safety ahead of an open release.”

Stability has been experimenting with a variety of image-synthesis architectures recently. Aside from SDXL and SDXL Turbo, just last week, the company announced Stable Cascade, which uses a three-stage process for text-to-image synthesis.

Listing image by Emad Mostaque (Stability AI)



Cops bogged down by flood of fake AI child sex images, report says

“Particularly heinous” —

Investigations tied to harmful AI sex images will grow “exponentially,” experts say.

Law enforcement is continuing to warn that a “flood” of AI-generated fake child sex images is making it harder to investigate real crimes against abused children, The New York Times reported.

Last year, after researchers uncovered thousands of realistic but fake AI child sex images online, every attorney general across the US quickly called on Congress to set up a committee to squash the problem. But so far, Congress has moved slowly, while only a few states have specifically banned AI-generated non-consensual intimate imagery. Meanwhile, law enforcement continues to struggle with figuring out how to confront bad actors found to be creating and sharing images that, for now, largely exist in a legal gray zone.

“Creating sexually explicit images of children through the use of artificial intelligence is a particularly heinous form of online exploitation,” Steve Grocki, the chief of the Justice Department’s child exploitation and obscenity section, told The Times. Experts told The Washington Post in 2023 that risks of realistic but fake images spreading included normalizing child sexual exploitation, luring more children into harm’s way, and making it harder for law enforcement to find actual children being harmed.

In one example, the FBI announced earlier this year that an American Airlines flight attendant, Estes Carter Thompson III, was arrested “for allegedly surreptitiously recording or attempting to record a minor female passenger using a lavatory aboard an aircraft.” A search of Thompson’s iCloud revealed “four additional instances” where Thompson allegedly recorded other minors in the lavatory, as well as “over 50 images of a 9-year-old unaccompanied minor” sleeping in her seat. While police attempted to identify these victims, they also “further alleged that hundreds of images of AI-generated child pornography” were found on Thompson’s phone.

The troubling case seems to illustrate how AI-generated child sex images can be linked to real criminal activity while also showing how police investigations could be bogged down by attempts to distinguish photos of real victims from AI images that could depict real or fake children.

Robin Richards, the commander of the Los Angeles Police Department’s Internet Crimes Against Children task force, confirmed to the NYT that due to AI, “investigations are way more challenging.”

And because image generators and AI models that can be trained on photos of children are widely available, “using AI to alter photos” of children online “is becoming more common,” Michael Bourke—a former chief psychologist for the US Marshals Service who spent decades supporting investigations into sex offenses involving children—told the NYT. Richards said that cops don’t know what to do when they find these AI-generated materials.

Currently, there aren’t many cases involving AI-generated child sex abuse materials (CSAM), The NYT reported, but experts expect that number will “grow exponentially,” raising “novel and complex questions of whether existing federal and state laws are adequate to prosecute these crimes.”

Platforms struggle to monitor harmful AI images

At a Senate Judiciary Committee hearing today grilling Big Tech CEOs over child sexual exploitation (CSE) on their platforms, Linda Yaccarino—CEO of X (formerly Twitter)—warned in her opening statement that artificial intelligence is also making it harder for platforms to monitor CSE. Yaccarino suggested that industry collaboration is imperative to get ahead of the growing problem, as is providing more resources to law enforcement.

However, US law enforcement officials have indicated that platforms are also making it harder to police CSAM and CSE online. Platforms relying on AI to detect CSAM are generating “unviable reports” gumming up investigations managed by already underfunded law enforcement teams, The Guardian reported. And the NYT reported that other investigations are being thwarted by platforms adding end-to-end encryption options to messaging services, which “drastically limit the number of crimes the authorities are able to track.”

The NYT report noted that in 2002, the Supreme Court struck down a law that had been on the books since 1996 preventing “virtual” or “computer-generated child pornography.” South Carolina’s attorney general, Alan Wilson, has said that AI technology available today may test that ruling, especially if minors continue to be harmed by fake AI child sex images spreading online. In the meantime, federal laws such as obscenity statutes may be used to prosecute cases, the NYT reported.

Congress has recently re-introduced some legislation to directly address AI-generated non-consensual intimate images after a wide range of images depicting fake AI porn of pop star Taylor Swift went viral this month. That includes the Disrupt Explicit Forged Images and Non-Consensual Edits Act, which creates a federal civil remedy for any victims of any age who are identifiable in AI images depicting them as nude or engaged in sexually explicit conduct or sexual scenarios.

There’s also the “Preventing Deepfakes of Intimate Images Act,” which seeks to “prohibit the non-consensual disclosure of digitally altered intimate images.” That was re-introduced this year after teen boys generated AI fake nude images of female classmates and spread them around a New Jersey high school last fall. Francesca Mani, one of the teen victims in New Jersey, was there to help announce the proposed law, which includes penalties of up to two years’ imprisonment for sharing harmful images.

“What happened to me and my classmates was not cool, and there’s no way I’m just going to shrug and let it slide,” Mani said. “I’m here, standing up and shouting for change, fighting for laws, so no one else has to feel as lost and powerless as I did on October 20th.”
