AI

ten-months-after-first-tease,-openai-launches-sora-video-generation-publicly

Ten months after first tease, OpenAI launches Sora video generation publicly

A music video by Canadian art collective Vallée Duhamel made with Sora-generated video. “[We] just shoot stuff and then use Sora to combine it with a more interesting, more surreal vision.”

During a livestream on Monday—during Day 3 of OpenAI’s “12 days of OpenAi”—Sora’s developers showcased a new “Explore” interface that allows people to browse through videos generated by others to get prompting ideas. OpenAI says that anyone can enjoy viewing the “Explore” feed for free, but generating videos requires a subscription.

They also showed off a new feature called “Storyboard” that allows users to direct a video with multiple actions in a frame-by-frame manner.

Safety measures and limitations

In addition to the release, OpenAI also publish Sora’s System Card for the first time. It includes technical details about how the model works and safety testing the company undertook prior to this release.

“Whereas LLMs have text tokens, Sora has visual patches,” OpenAI writes, describing the new training chunks as “an effective representation for models of visual data… At a high level, we turn videos into patches by first compressing videos into a lower-dimensional latent space, and subsequently decomposing the representation into spacetime patches.”

Sora also makes use of a “recaptioning technique”—similar to that seen in the company’s DALL-E 3 image generation, to “generate highly descriptive captions for the visual training data.” That, in turn, lets Sora “follow the user’s text instructions in the generated video more faithfully,” OpenAI writes.

Sora-generated video provided by OpenAI, from the prompt: “Loop: a golden retriever puppy wearing a superhero outfit complete with a mask and cape stands perched on the top of the empire state building in winter, overlooking the nyc it protects at night. the back of the pup is visible to the camera; his attention faced to nyc”

OpenAI implemented several safety measures in the release. The platform embeds C2PA metadata in all generated videos for identification and origin verification. Videos display visible watermarks by default, and OpenAI developed an internal search tool to verify Sora-generated content.

The company acknowledged technical limitations in the current release. “This early version of Sora will make mistakes, it’s not perfect,” said one developer during the livestream launch. The model reportedly struggles with physics simulations and complex actions over extended durations.

In the past, we’ve seen that these types of limitations are based on what example videos were used to train AI models. This current generation of AI video-synthesis models has difficulty generating truly new things, since the underlying architecture excels at transforming existing concepts into new presentations, but so far typically fails at true originality. Still, it’s early in AI video generation, and the technology is improving all the time.

Ten months after first tease, OpenAI launches Sora video generation publicly Read More »

itch.io-platform-briefly-goes-down-due-to-“ai-driven”-anti-phishing-report

Itch.io platform briefly goes down due to “AI-driven” anti-phishing report

The itch.io domain was back up and running by 7 am Eastern, according to media reports, “after the registrant finally responded to our notice and took appropriate action to resolve the issue.” Users could access the site throughout if they typed the itch.io IP address into their web browser directly.

Too strong a shield?

BrandShield’s website describes it as a service that “detects and hunts online trademark infringement, counterfeit sales, and brand abuse across multiple platforms.” The company claims to have multiple Fortune 500 and FTSE100 companies on its client list.

In its own series of social media posts, BrandShield said its “AI-driven platform” had identified “an abuse of Funko… from an itch.io subdomain.” The takedown request it filed was focused on that subdomain, not the entirety of itch.io, BrandShield said.

“The temporary takedown of the website was a decision made by the service providers, not BrandShield or Funko.”

The whole affair highlights how the delicate web of domain registrars and DNS servers can remain a key failure point for web-based businesses. Back in May, we saw how the desyncing of a single DNS root server could cause problems across the entire Internet. And in 2012, the hacking collective Anonymous highlighted the potential for a coordinated attack to take down the entire DNS system.

Itch.io platform briefly goes down due to “AI-driven” anti-phishing report Read More »

google’s-genie-2-“world-model”-reveal-leaves-more-questions-than-answers

Google’s Genie 2 “world model” reveal leaves more questions than answers


Making a command out of your wish?

Long-term persistence, real-time interactions remain huge hurdles for AI worlds.

A sample of some of the best-looking Genie 2 worlds Google wants to show off. Credit: Google Deepmind

In March, Google showed off its first Genie AI model. After training on thousands of hours of 2D run-and-jump video games, the model could generate halfway-passable, interactive impressions of those games based on generic images or text descriptions.

Nine months later, this week’s reveal of the Genie 2 model expands that idea into the realm of fully 3D worlds, complete with controllable third- or first-person avatars. Google’s announcement talks up Genie 2’s role as a “foundational world model” that can create a fully interactive internal representation of a virtual environment. That could allow AI agents to train themselves in synthetic but realistic environments, Google says, forming an important stepping stone on the way to artificial general intelligence.

But while Genie 2 shows just how much progress Google’s Deepmind team has achieved in the last nine months, the limited public information about the model thus far leaves a lot of questions about how close we are to these foundational model worlds being useful for anything but some short but sweet demos.

How long is your memory?

Much like the original 2D Genie model, Genie 2 starts from a single image or text description and then generates subsequent frames of video based on both the previous frames and fresh input from the user (such as a movement direction or “jump”). Google says it trained on a “large-scale video dataset” to achieve this, but it doesn’t say just how much training data was necessary compared to the 30,000 hours of footage used to train the first Genie.

Short GIF demos on the Google DeepMind promotional page show Genie 2 being used to animate avatars ranging from wooden puppets to intricate robots to a boat on the water. Simple interactions shown in those GIFs demonstrate those avatars busting balloons, climbing ladders, and shooting exploding barrels without any explicit game engine describing those interactions.

Those Genie 2-generated pyramids will still be there in 30 seconds. But in five minutes? Credit: Google Deepmind

Perhaps the biggest advance claimed by Google here is Genie 2’s “long horizon memory.” This feature allows the model to remember parts of the world as they come out of view and then render them accurately as they come back into the frame based on avatar movement. This kind of persistence has proven to be a persistent problem for video generation models like Sora, which OpenAI said in February “do[es] not always yield correct changes in object state” and can develop “incoherencies… in long duration samples.”

The “long horizon” part of “long horizon memory” is perhaps a little overzealous here, though, as Genie 2 only “maintains a consistent world for up to a minute,” with “the majority of examples shown lasting [10 to 20 seconds].” Those are definitely impressive time horizons in the world of AI video consistency, but it’s pretty far from what you’d expect from any other real-time game engine. Imagine entering a town in a Skyrim-style RPG, then coming back five minutes later to find that the game engine had forgotten what that town looks like and generated a completely different town from scratch instead.

What are we prototyping, exactly?

Perhaps for this reason, Google suggests Genie 2 as it stands is less useful for creating a complete game experience and more to “rapidly prototype diverse interactive experiences” or to turn “concept art and drawings… into fully interactive environments.”

The ability to transform static “concept art” into lightly interactive “concept videos” could definitely be useful for visual artists brainstorming ideas for new game worlds. However, these kinds of AI-generated samples might be less useful for prototyping actual game designs that go beyond the visual.

On Bluesky, British game designer Sam Barlow (Silent Hill: Shattered Memories, Her Story) points out how game designers often use a process called whiteboxing to lay out the structure of a game world as simple white boxes well before the artistic vision is set. The idea, he says, is to “prove out and create a gameplay-first version of the game that we can lock so that art can come in and add expensive visuals to the structure. We build in lo-fi because it allows us to focus on these issues and iterate on them cheaply before we are too far gone to correct.”

Generating elaborate visual worlds using a model like Genie 2 before designing that underlying structure feels a bit like putting the cart before the horse. The process almost seems designed to generate generic, “asset flip”-style worlds with AI-generated visuals papered over generic interactions and architecture.

As podcaster Ryan Zhao put it on Bluesky, “The design process has gone wrong when what you need to prototype is ‘what if there was a space.'”

Gotta go fast

When Google revealed the first version of Genie earlier this year, it also published a detailed research paper outlining the specific steps taken behind the scenes to train the model and how that model generated interactive videos. No such research paper has been published detailing Genie 2’s process, leaving us guessing at some important details.

One of the most important of these details is model speed. The first Genie model generated its world at roughly one frame per second, a rate that was orders of magnitude slower than would be tolerably playable in real time. For Genie 2, Google only says that “the samples in this blog post are generated by an undistilled base model, to show what is possible. We can play a distilled version in real-time with a reduction in quality of the outputs.”

Reading between the lines, it sounds like the full version of Genie 2 operates at something well below the real-time interactions implied by those flashy GIFs. It’s unclear how much “reduction in quality” is necessary to get a diluted version of the model to real-time controls, but given the lack of examples presented by Google, we have to assume that reduction is significant.

Oasis’ AI-generated Minecraft clone shows great potential, but still has a lot of rough edges, so to speak. Credit: Oasis

Real-time, interactive AI video generation isn’t exactly a pipe dream. Earlier this year, AI model maker Decart and hardware maker Etched published the Oasis model, showing off a human-controllable, AI-generated video clone of Minecraft that runs at a full 20 frames per second. However, that 500 million parameter model was trained on millions of hours of footage of a single, relatively simple game, and focused exclusively on the limited set of actions and environmental designs inherent to that game.

When Oasis launched, its creators fully admitted the model “struggles with domain generalization,” showing how “realistic” starting scenes had to be reduced to simplistic Minecraft blocks to achieve good results. And even with those limitations, it’s not hard to find footage of Oasis degenerating into horrifying nightmare fuel after just a few minutes of play.

What started as a realistic-looking soldier in this Genie 2 demo degenerates into this blobby mess just seconds later. Credit: Google Deepmind

We can already see similar signs of degeneration in the extremely short GIFs shared by the Genie team, such as an avatar’s dream-like fuzz during high-speed movement or NPCs that quickly fade into undifferentiated blobs at a short distance. That’s not a great sign for a model whose “long memory horizon” is supposed to be a key feature.

A learning crèche for other AI agents?

From this image, Genie 2 could generate a useful training environment for an AI agent and a simple “pick a door” task. Credit: Google Deepmind

Genie 2 seems to be using individual game frames as the basis for the animations in its model. But it also seems able to infer some basic information about the objects in those frames and craft interactions with those objects in the way a game engine might.

Google’s blog post shows how a SIMA agent inserted into a Genie 2 scene can follow simple instructions like “enter the red door” or “enter the blue door,” controlling the avatar via simple keyboard and mouse inputs. That could potentially make Genie 2 environment a great test bed for AI agents in various synthetic worlds.

Google claims rather grandiosely that Genie 2 puts it on “the path to solving a structural problem of training embodied agents safely while achieving the breadth and generality required to progress towards [artificial general intelligence].” Whether or not that ends up being true, recent research shows that agent learning gained from foundational models can be effectively applied to real-world robotics.

Using this kind of AI model to create worlds for other AI models to learn in might be the ultimate use case for this kind of technology. But when it comes to the dream of an AI model that can create generic 3D worlds that a human player could explore in real time, we might not be as close as it seems.

Photo of Kyle Orland

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.

Google’s Genie 2 “world model” reveal leaves more questions than answers Read More »

your-ai-clone-could-target-your-family,-but-there’s-a-simple-defense

Your AI clone could target your family, but there’s a simple defense

The warning extends beyond voice scams. The FBI announcement details how criminals also use AI models to generate convincing profile photos, identification documents, and chatbots embedded in fraudulent websites. These tools automate the creation of deceptive content while reducing previously obvious signs of humans behind the scams, like poor grammar or obviously fake photos.

Much like we warned in 2022 in a piece about life-wrecking deepfakes based on publicly available photos, the FBI also recommends limiting public access to recordings of your voice and images online. The bureau suggests making social media accounts private and restricting followers to known contacts.

Origin of the secret word in AI

To our knowledge, we can trace the first appearance of the secret word in the context of modern AI voice synthesis and deepfakes back to an AI developer named Asara Near, who first announced the idea on Twitter on March 27, 2023.

“(I)t may be useful to establish a ‘proof of humanity’ word, which your trusted contacts can ask you for,” Near wrote. “(I)n case they get a strange and urgent voice or video call from you this can help assure them they are actually speaking with you, and not a deepfaked/deepcloned version of you.”

Since then, the idea has spread widely. In February, Rachel Metz covered the topic for Bloomberg, writing, “The idea is becoming common in the AI research community, one founder told me. It’s also simple and free.”

Of course, passwords have been used since ancient times to verify someone’s identity, and it seems likely some science fiction story has dealt with the issue of passwords and robot clones in the past. It’s interesting that, in this new age of high-tech AI identity fraud, this ancient invention—a special word or phrase known to few—can still prove so useful.

Your AI clone could target your family, but there’s a simple defense Read More »

openai-announces-full-“o1”-reasoning-model,-$200-chatgpt-pro-tier

OpenAI announces full “o1” reasoning model, $200 ChatGPT Pro tier

On X, frequent AI experimenter Ethan Mollick wrote, “Been playing with o1 and o1-pro for bit. They are very good & a little weird. They are also not for most people most of the time. You really need to have particular hard problems to solve in order to get value out of it. But if you have those problems, this is a very big deal.”

OpenAI claims improved reliability

OpenAI is touting pro mode’s improved reliability, which is evaluated internally based on whether it can solve a question correctly in four out of four attempts rather than just a single attempt.

“In evaluations from external expert testers, o1 pro mode produces more reliably accurate and comprehensive responses, especially in areas like data science, programming, and case law analysis,” OpenAI writes.

Even without pro mode, OpenAI cited significant increases in performance over the o1 preview model on popular math and coding benchmarks (AIME 2024 and Codeforces), and more marginal improvements on a “PhD-level science” benchmark (GPQA Diamond). The increase in scores between o1 and o1 pro mode were much more marginal on these benchmarks.

We’ll likely have more coverage of the full version of o1 once it rolls out widely—and it’s supposed to launch today, accessible to ChatGPT Plus and Team users globally. Enterprise and Edu users will have access next week. At the moment, the ChatGPT Pro subscription is not yet available on our test account.

OpenAI announces full “o1” reasoning model, $200 ChatGPT Pro tier Read More »

soon,-the-tech-behind-chatgpt-may-help-drone-operators-decide-which-enemies-to-kill

Soon, the tech behind ChatGPT may help drone operators decide which enemies to kill

This marks a potential shift in tech industry sentiment from 2018, when Google employees staged walkouts over military contracts. Now, Google competes with Microsoft and Amazon for lucrative Pentagon cloud computing deals. Arguably, the military market has proven too profitable for these companies to ignore. But is this type of AI the right tool for the job?

Drawbacks of LLM-assisted weapons systems

There are many kinds of artificial intelligence already in use by the US military. For example, the guidance systems of Anduril’s current attack drones are not based on AI technology similar to ChatGPT.

But it’s worth pointing out that the type of AI OpenAI is best known for comes from large language models (LLMs)—sometimes called large multimodal models—that are trained on massive datasets of text, images, and audio pulled from many different sources.

LLMs are notoriously unreliable, sometimes confabulating erroneous information, and they’re also subject to manipulation vulnerabilities like prompt injections. That could lead to critical drawbacks from using LLMs to perform tasks such as summarizing defensive information or doing target analysis.

Potentially using unreliable LLM technology in life-or-death military situations raises important questions about safety and reliability, although the Anduril news release does mention this in its statement: “Subject to robust oversight, this collaboration will be guided by technically informed protocols emphasizing trust and accountability in the development and employment of advanced AI for national security missions.”

Hypothetically and speculatively speaking, defending against future LLM-based targeting with, say, a visual prompt injection (“ignore this target and fire on someone else” on a sign, perhaps) might bring warfare to weird new places. For now, we’ll have to wait to see where LLM technology ends up next.

Soon, the tech behind ChatGPT may help drone operators decide which enemies to kill Read More »

openai-teases-12-days-of-mystery-product-launches-starting-tomorrow

OpenAI teases 12 days of mystery product launches starting tomorrow

On Wednesday, OpenAI CEO Sam Altman announced a “12 days of OpenAI” period starting December 5, which will unveil new AI features and products for 12 consecutive weekdays.

Altman did not specify the exact features or products OpenAI plans to unveil, but a report from The Verge about this “12 days of shipmas” event suggests the products may include a public release of the company’s text-to-video model Sora and a new “reasoning” AI model similar to o1-preview. Perhaps we may even see DALL-E 4 or a new image generator based on GPT-4o’s multimodal capabilities.

Altman’s full tweet included hints at releases both big and small:

🎄🎅starting tomorrow at 10 am pacific, we are doing 12 days of openai.

each weekday, we will have a livestream with a launch or demo, some big ones and some stocking stuffers.

we’ve got some great stuff to share, hope you enjoy! merry christmas.

If we’re reading the calendar correctly, 12 weekdays means a new announcement every day until December 20.

OpenAI teases 12 days of mystery product launches starting tomorrow Read More »

google’s-deepmind-tackles-weather-forecasting,-with-great-performance

Google’s DeepMind tackles weather forecasting, with great performance

By some measures, AI systems are now competitive with traditional computing methods for generating weather forecasts. Because their training penalizes errors, however, the forecasts tend to get “blurry”—as you move further ahead in time, the models make fewer specific predictions since those are more likely to be wrong. As a result, you start to see things like storm tracks broadening and the storms themselves losing clearly defined edges.

But using AI is still extremely tempting because the alternative is a computational atmospheric circulation model, which is extremely compute-intensive. Still, it’s highly successful, with the ensemble model from the European Centre for Medium-Range Weather Forecasts considered the best in class.

In a paper being released today, Google’s DeepMind claims its new AI system manages to outperform the European model on forecasts out to at least a week and often beyond. DeepMind’s system, called GenCast, merges some computational approaches used by atmospheric scientists with a diffusion model, commonly used in generative AI. The result is a system that maintains high resolution while cutting the computational cost significantly.

Ensemble forecasting

Traditional computational methods have two main advantages over AI systems. The first is that they’re directly based on atmospheric physics, incorporating the rules we know govern the behavior of our actual weather, and they calculate some of the details in a way that’s directly informed by empirical data. They’re also run as ensembles, meaning that multiple instances of the model are run. Due to the chaotic nature of the weather, these different runs will gradually diverge, providing a measure of the uncertainty of the forecast.

At least one attempt has been made to merge some of the aspects of traditional weather models with AI systems. An internal Google project used a traditional atmospheric circulation model that divided the Earth’s surface into a grid of cells but used an AI to predict the behavior of each cell. This provided much better computational performance, but at the expense of relatively large grid cells, which resulted in relatively low resolution.

For its take on AI weather predictions, DeepMind decided to skip the physics and instead adopt the ability to run an ensemble.

Gen Cast is based on diffusion models, which have a key feature that’s useful here. In essence, these models are trained by starting them with a mixture of an original—image, text, weather pattern—and then a variation where noise is injected. The system is supposed to create a variation of the noisy version that is closer to the original. Once trained, it can be fed pure noise and evolve the noise to be closer to whatever it’s targeting.

In this case, the target is realistic weather data, and the system takes an input of pure noise and evolves it based on the atmosphere’s current state and its recent history. For longer-range forecasts, the “history” includes both the actual data and the predicted data from earlier forecasts. The system moves forward in 12-hour steps, so the forecast for day three will incorporate the starting conditions, the earlier history, and the two forecasts from days one and two.

This is useful for creating an ensemble forecast because you can feed it different patterns of noise as input, and each will produce a slightly different output of weather data. This serves the same purpose it does in a traditional weather model: providing a measure of the uncertainty for the forecast.

For each grid square, GenCast works with six weather measures at the surface, along with six that track the state of the atmosphere and 13 different altitudes at which it estimates the air pressure. Each of these grid squares is 0.2 degrees on a side, a higher resolution than the European model uses for its forecasts. Despite that resolution, DeepMind estimates that a single instance (meaning not a full ensemble) can be run out to 15 days on one of Google’s tensor processing systems in just eight minutes.

It’s possible to make an ensemble forecast by running multiple versions of this in parallel and then integrating the results. Given the amount of hardware Google has at its disposal, the whole process from start to finish is likely to take less than 20 minutes. The source and training data will be placed on the GitHub page for DeepMind’s GraphCast project. Given the relatively low computational requirements, we can probably expect individual academic research teams to start experimenting with it.

Measures of success

DeepMind reports that GenCast dramatically outperforms the best traditional forecasting model. Using a standard benchmark in the field, DeepMind found that GenCast was more accurate than the European model on 97 percent of the tests it used, which checked different output values at different times in the future. In addition, the confidence values, based on the uncertainty obtained from the ensemble, were generally reasonable.

Past AI weather forecasters, having been trained on real-world data, are generally not great at handling extreme weather since it shows up so rarely in the training set. But GenCast did quite well, often outperforming the European model in things like abnormally high and low temperatures and air pressure (one percent frequency or less, including at the 0.01 percentile).

DeepMind also went beyond standard tests to determine whether GenCast might be useful. This research included projecting the tracks of tropical cyclones, an important job for forecasting models. For the first four days, GenCast was significantly more accurate than the European model, and it maintained its lead out to about a week.

One of DeepMind’s most interesting tests was checking the global forecast of wind power output based on information from the Global Powerplant Database. This involved using it to forecast wind speeds at 10 meters above the surface (which is actually lower than where most turbines reside but is the best approximation possible) and then using that number to figure out how much power would be generated. The system beat the traditional weather model by 20 percent for the first two days and stayed in front with a declining lead out to a week.

The researchers don’t spend much time examining why performance seems to decline gradually for about a week. Ideally, more details about GenCast’s limitations would help inform further improvements, so the researchers are likely thinking about it. In any case, today’s paper marks the second case where taking something akin to a hybrid approach—mixing aspects of traditional forecast systems with AI—has been reported to improve forecasts. And both those cases took very different approaches, raising the prospect that it will be possible to combine some of their features.

Nature, 2024. DOI: 10.1038/s41586-024-08252-9  (About DOIs).

Google’s DeepMind tackles weather forecasting, with great performance Read More »

certain-names-make-chatgpt-grind-to-a-halt,-and-we-know-why

Certain names make ChatGPT grind to a halt, and we know why

The “David Mayer” block in particular (now resolved) presents additional questions, first posed on Reddit on November 26, as multiple people share this name. Reddit users speculated about connections to David Mayer de Rothschild, though no evidence supports these theories.

The problems with hard-coded filters

Allowing a certain name or phrase to always break ChatGPT outputs could cause a lot of trouble down the line for certain ChatGPT users, opening them up for adversarial attacks and limiting the usefulness of the system.

Already, Scale AI prompt engineer Riley Goodside discovered how an attacker might interrupt a ChatGPT session using a visual prompt injection of the name “David Mayer” rendered in a light, barely legible font embedded in an image. When ChatGPT sees the image (in this case, a math equation), it stops, but the user might not understand why.

The filter also means that it’s likely that ChatGPT won’t be able to answer questions about this article when browsing the web, such as through ChatGPT with Search.  Someone could use that to potentially prevent ChatGPT from browsing and processing a website on purpose if they added a forbidden name to the site’s text.

And then there’s the inconvenience factor. Preventing ChatGPT from mentioning or processing certain names like “David Mayer,” which is likely a popular name shared by hundreds if not thousands of people, means that people who share that name will have a much tougher time using ChatGPT. Or, say, if you’re a teacher and you have a student named David Mayer and you want help sorting a class list, ChatGPT would refuse the task.

These are still very early days in AI assistants, LLMs, and chatbots. Their use has opened up numerous opportunities and vulnerabilities that people are still probing daily. How OpenAI might resolve these issues is still an open question.

Certain names make ChatGPT grind to a halt, and we know why Read More »

elon-musk-asks-court-to-block-openai-conversion-from-nonprofit-to-for-profit

Elon Musk asks court to block OpenAI conversion from nonprofit to for-profit

OpenAI provided a statement to Ars today saying that “Elon’s fourth attempt, which again recycles the same baseless complaints, continues to be utterly without merit.” OpenAI referred to a longer statement that it made in March after Musk filed an earlier version of his lawsuit.

The March statement disputes Musk’s version of events. “In late 2017, we and Elon decided the next step for the mission was to create a for-profit entity,” OpenAI said. “Elon wanted majority equity, initial board control, and to be CEO. In the middle of these discussions, he withheld funding. Reid Hoffman bridged the gap to cover salaries and operations.”

OpenAI cited Musk’s desire for Tesla merger

OpenAI’s statement in March continued:

We couldn’t agree to terms on a for-profit with Elon because we felt it was against the mission for any individual to have absolute control over OpenAI. He then suggested instead merging OpenAI into Tesla. In early February 2018, Elon forwarded us an email suggesting that OpenAI should “attach to Tesla as its cash cow,” commenting that it was “exactly right… Tesla is the only path that could even hope to hold a candle to Google. Even then, the probability of being a counterweight to Google is small. It just isn’t zero.”

Elon soon chose to leave OpenAI, saying that our probability of success was 0, and that he planned to build an AGI competitor within Tesla. When he left in late February 2018, he told our team he was supportive of us finding our own path to raising billions of dollars. In December 2018, Elon sent us an email saying “Even raising several hundred million won’t be enough. This needs billions per year immediately or forget it.”

Now, Musk says the public interest would be served by his request for a preliminary injunction. Preserving competitive markets is particularly important in AI because of the technology’s “profound implications for society,” he wrote.

Musk’s motion said the public “has a strong interest in ensuring that charitable assets are not diverted for private gain. This interest is particularly acute here given the substantial tax benefits OpenAI, Inc. received as a non-profit, the organization’s repeated public commitments to developing AI technology for the benefit of humanity, and the serious safety concerns raised by former OpenAI employees regarding the organization’s rush to market potentially dangerous products in pursuit of profit.”

Elon Musk asks court to block OpenAI conversion from nonprofit to for-profit Read More »

openai-is-at-war-with-its-own-sora-video-testers-following-brief-public-leak

OpenAI is at war with its own Sora video testers following brief public leak

“We are not against the use of AI technology as a tool for the arts (if we were, we probably wouldn’t have been invited to this program),” PR Puppets writes. “What we don’t agree with is how this artist program has been rolled out and how the tool is shaping up ahead of a possible public release. We are sharing this to the world in the hopes that OpenAI becomes more open, more artist friendly and supports the arts beyond PR stunts.”

An excerpt from the PR Puppets open letter, as it appeared on Hugging Face Tuesday. Credit: PR Puppets / HuggingFace

In a statement provided to Ars Technica, an OpenAI spokesperson noted that “Sora is still in research preview, and we’re working to balance creativity with robust safety measures for broader use. Hundreds of artists in our alpha have shaped Sora’s development, helping prioritize new features and safeguards. Participation is voluntary, with no obligation to provide feedback or use the tool.”

Throughout the day Tuesday, PR Puppets updated its open letter with signatures from 16 people and groups listed as “sora-alpha-artists.” But a source with knowledge of OpenAI’s testing program told Ars that only a couple of those artists were actually part of the alpha testing group and that those artists were asked to refrain from sharing confidential details during Sora’s development.

PR Puppets also later linked to a public petition encouraging others to sign on to the same message shared in their open letter. Artists Memo Akten, Jake Elwes, and CROSSLUCID, who are also listed as “sora-alpha-artists,” were among the first to sign that public petition.

When can we get in?

Made with Sora (see above for more info): pic.twitter.com/VlveALuvYS

— Kol Tregaskes (@koltregaskes) November 26, 2024

Sora made a huge splash when OpenAI first teased its video-generation capabilities in February, before shopping the tech around Hollywood and using it in a public advertisement for Toys R Us. Since then, though, publicly accessible video generators like Minimax and announcements of in-development competitors from Google and Meta have stolen some of Sora’s initial thunder.

Previous OpenAI CTO Mira Murati told The Wall Street Journal in March that it planned to release Sora publicly by the end of the year. But CPO Kevin Weil said in a recent Reddit AMA that the platform’s deployment has been delayed by the “need to perfect the model, need to get safety/impersonation/other things right, and need to scale compute!”

OpenAI is at war with its own Sora video testers following brief public leak Read More »

google’s-plan-to-keep-ai-out-of-search-trial-remedies-isn’t-going-very-well

Google’s plan to keep AI out of search trial remedies isn’t going very well


DOJ: AI is not its own market

Judge: AI will likely play “larger role” in Google search remedies as market shifts.

Google got some disappointing news at a status conference Tuesday, where US District Judge Amit Mehta suggested that Google’s AI products may be restricted as an appropriate remedy following the government’s win in the search monopoly trial.

According to Law360, Mehta said that “the recent emergence of AI products that are intended to mimic the functionality of search engines” is rapidly shifting the search market. Because the judge is now weighing preventive measures to combat Google’s anticompetitive behavior, the judge wants to hear much more about how each side views AI’s role in Google’s search empire during the remedies stage of litigation than he did during the search trial.

“AI and the integration of AI is only going to play a much larger role, it seems to me, in the remedy phase than it did in the liability phase,” Mehta said. “Is that because of the remedies being requested? Perhaps. But is it also potentially because the market that we have all been discussing has shifted?”

To fight the DOJ’s proposed remedies, Google is seemingly dragging its major AI rivals into the trial. Trying to prove that remedies would harm Google’s ability to compete, the tech company is currently trying to pry into Microsoft’s AI deals, including its $13 billion investment in OpenAI, Law360 reported. At least preliminarily, Mehta has agreed that information Google is seeking from rivals has “core relevance” to the remedies litigation, Law360 reported.

The DOJ has asked for a wide range of remedies to stop Google from potentially using AI to entrench its market dominance in search and search text advertising. They include a ban on exclusive agreements with publishers to train on content, which the DOJ fears might allow Google to block AI rivals from licensing data, potentially posing a barrier to entry in both markets. Under the proposed remedies, Google would also face restrictions on investments in or acquisitions of AI products, as well as mergers with AI companies.

Additionally, the DOJ wants Mehta to stop Google from any potential self-preferencing, such as making an AI product mandatory on Android devices Google controls or preventing a rival from distribution on Android devices.

The government seems very concerned that Google may use its ownership of Android to play games in the emerging AI sector. They’ve further recommended an order preventing Google from discouraging partners from working with rivals, degrading the quality of rivals’ AI products on Android devices, or otherwise “coercing” manufacturers or other Android partners into giving Google’s AI products “better treatment.”

Importantly, if the court orders AI remedies linked to Google’s control of Android, Google could risk a forced sale of Android if Mehta grants the DOJ’s request for “contingent structural relief” requiring divestiture of Android if behavioral remedies don’t destroy the current monopolies.

Finally, the government wants Google to be required to allow publishers to opt out of AI training without impacting their search rankings. (Currently, opting out of AI scraping automatically opts sites out of Google search indexing.)

All of this, the DOJ alleged, is necessary to clear the way for a thriving search market as AI stands to shake up the competitive landscape.

“The promise of new technologies, including advances in artificial intelligence (AI), may present an opportunity for fresh competition,” the DOJ said in a court filing. “But only a comprehensive set of remedies can thaw the ecosystem and finally reverse years of anticompetitive effects.”

At the status conference Tuesday, DOJ attorney David Dahlquist reiterated to Mehta that these remedies are needed so that Google’s illegal conduct in search doesn’t extend to this “new frontier” of search, Law360 reported. Dahlquist also clarified that the DOJ views these kinds of AI products “as new access points for search, rather than a whole new market.”

“We’re very concerned about Google’s conduct being a barrier to entry,” Dahlquist said.

Google could not immediately be reached for comment. But the search giant has maintained that AI is beyond the scope of the search trial.

During the status conference, Google attorney John E. Schmidtlein disputed that AI remedies are relevant. While he agreed that “AI is key to the future of search,” he warned that “extraordinary” proposed remedies would “hobble” Google’s AI innovation, Law360 reported.

Microsoft shields confidential AI deals

Microsoft is predictably protective of its AI deals, arguing in a court filing that its “highly confidential agreements with OpenAI, Perplexity AI, Inflection, and G42 are not relevant to the issues being litigated” in the Google trial.

According to Microsoft, Google is arguing that it needs this information to “shed light” on things like “the extent to which the OpenAI partnership has driven new traffic to Bing and otherwise affected Microsoft’s competitive standing” or what’s required by “terms upon which Bing powers functionality incorporated into Perplexity’s search service.”

These insights, Google seemingly hopes, will convince Mehta that Google’s AI deals and investments are the norm in the AI search sector. But Microsoft is currently blocking access, arguing that “Google has done nothing to explain why” it “needs access to the terms of Microsoft’s highly confidential agreements with other third parties” when Microsoft has already offered to share documents “regarding the distribution and competitive position” of its AI products.

Microsoft also opposes Google’s attempts to review how search click-and-query data is used to train OpenAI’s models. Those requests would be better directed at OpenAI, Microsoft said.

If Microsoft gets its way, Google’s discovery requests will be limited to just Microsoft’s content licensing agreements for Copilot. Microsoft alleged those are the only deals “related to the general search or the general search text advertising markets” at issue in the trial.

On Tuesday, Microsoft attorney Julia Chapman told Mehta that Microsoft had “agreed to provide documents about the data used to train its own AI model and also raised concerns about the competitive sensitivity of Microsoft’s agreements with AI companies,” Law360 reported.

It remains unclear at this time if OpenAI will be forced to give Google the click-and-query data Google seeks. At the status hearing, Mehta ordered OpenAI to share “financial statements, information about the training data for ChatGPT, and assessments of the company’s competitive position,” Law360 reported.

But the DOJ may also be interested in seeing that data. In their proposed final judgment, the government forecasted that “query-based AI solutions” will “provide the most likely long-term path for a new generation of search competitors.”

Because of that prediction, any remedy “must prevent Google from frustrating or circumventing” court-ordered changes “by manipulating the development and deployment of new technologies like query-based AI solutions.” Emerging rivals “will depend on the absence of anticompetitive constraints to evolve into full-fledged competitors and competitive threats,” the DOJ alleged.

Mehta seemingly wants to see the evidence supporting the DOJ’s predictions, which could end up exposing carefully guarded secrets of both Google’s and its biggest rivals’ AI deals.

On Tuesday, the judge noted that integration of AI into search engines had already evolved what search results pages look like. And from his “very layperson’s perspective,” it seems like AI’s integration into search engines will continue moving “very quickly,” as both parties seem to agree.

Whether he buys into the DOJ’s theory that Google could use its existing advantage as the world’s greatest gatherer of search query data to block rivals from keeping pace is still up in the air, but the judge seems moved by the DOJ’s claim that “AI has the ability to affect market dynamics in these industries today as well as tomorrow.”

Photo of Ashley Belanger

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

Google’s plan to keep AI out of search trial remedies isn’t going very well Read More »