MidJourney

image-scraping-midjourney-bans-rival-ai-firm-for-scraping-images

Image-scraping Midjourney bans rival AI firm for scraping images

Irony lives —

Midjourney pins blame for 24-hour outage on “bot-net like” activity from Stability AI employee.

A burglar with flash light and papers in business office. Exactly like scraping files from Discord.

Enlarge / A burglar with a flashlight and papers in a business office—exactly like scraping files from Discord.

On Wednesday, Midjourney banned all employees from image synthesis rival Stability AI from its service indefinitely after it detected “botnet-like” activity suspected to be a Stability employee attempting to scrape prompt and image pairs in bulk. Midjourney advocate Nick St. Pierre tweeted about the announcement, which came via Midjourney’s official Discord channel.

Prompts are the written instructions (like “a cat in a car holding a can of a beer”) used by generative AI models such as Midjourney and Stability AI’s Stable Diffusion 3 (SD3) to synthesize images. Having prompt and image pairs could potentially help the training or fine-tuning of a rival AI image generator model.

Bot activity that took place around midnight on March 2 caused a 24-hour outage for the commercial image generator service. Midjourney linked several paid accounts with a Stability AI data team employee trying to “grab prompt and image pairs.” Midjourney then made a decision to ban all Stability AI employees from the service indefinitely. It also indicated a new policy: “aggressive automation or taking down the service results in banning all employees of the responsible company.”

A screenshot of the

Enlarge / A screenshot of the “Midjourney Office Hours” notes posted on March 6, 2024.

Midjourney

Siobhan Ball of The Mary Sue found it ironic that a company like Midjourney, which built its AI image synthesis models using training data scraped off the Internet without seeking permission, would be sensitive about having its own material scraped. “It turns out that generative AI companies don’t like it when you steal, sorry, scrape, images from them. Cue the world’s smallest violin.”

Users of Midjourney pay a monthly subscription fee to access an AI image generator that turns written prompts into lush computer-synthesized images. The bot that makes them was trained on millions of artistic works created by humans—it’s a practice that has been claimed to be disrespectful to artists. “Words can’t describe how dehumanizing it is to see my name used 20,000+ times in MidJourney,” wrote artist Jingna Zhang in a recent viral tweet. “My life’s work and who I am—reduced to meaningless fodder for a commercial image slot machine.”

Stability responds

Shortly after the news of the ban emerged, Stability AI CEO Emad Mostaque said that he was looking into it and claimed that whatever happened was not intentional. He also said it would be great if Midjourney reached out to him directly. In a reply on X, Midjourney CEO David Holz wrote, “sent you some information to help with your internal investigation.”

In a text message exchange with Ars Technica, Mostaque said, “We checked and there were no images scraped there, there was a bot run by a team member that was collecting prompts for a personal project though. We aren’t sure how that would cause a gallery site outage but are sorry if it did, Midjourney is great.”

Besides, Mostaque says, his company doesn’t need Midjourney’s data anyway. “We have been using synthetic & other data given SD3 outperforms all other models,” he wrote on X. In conversation with Ars, Mostaque similarly wanted to contrast his company’s data collection techniques with those of his rival. “We only scrape stuff that has proper robots.txt and is permissive,” Mostaque says. “And also did full opt-out for [Stable Diffusion 3] and Stable Cascade leveraging work Spawning did.”

When asked about Stability’s relationship with Midjourney these days, Mostaque played down the rivalry. “No real overlap, we get on fine though,” he told Ars and emphasized a key link in their histories. “I funded Midjourney to get [them] off the ground with a cash grant to cover [Nvidia] A100s for the beta.”

Image-scraping Midjourney bans rival AI firm for scraping images Read More »

scientists-aghast-at-bizarre-ai-rat-with-huge-genitals-in-peer-reviewed-article

Scientists aghast at bizarre AI rat with huge genitals in peer-reviewed article

AI gone wild —

It’s unclear how such egregiously bad images made it through peer-review.

An actual laboratory rat, who is intrigued.

Enlarge / An actual laboratory rat, who is intrigued.

Appall and scorn ripped through scientists’ social media networks Thursday as several egregiously bad AI-generated figures circulated from a peer-reviewed article recently published in a reputable journal. Those figures—which the authors acknowledge in the article’s text were made by Midjourney—are all uninterpretable. They contain gibberish text and, most strikingly, one includes an image of a rat with grotesquely large and bizarre genitals, as well as a text label of “dck.”

AI-generated Figure 1 of the paper. This image is supposed to show spermatogonial stem cells isolated, purified, and cultured from rat testes.

Enlarge / AI-generated Figure 1 of the paper. This image is supposed to show spermatogonial stem cells isolated, purified, and cultured from rat testes.

On Thursday, the publisher of the review article, Frontiers, posted an “expression of concern,” noting that it is aware of concerns regarding the published piece. “An investigation is currently being conducted and this notice will be updated accordingly after the investigation concludes,” the publisher wrote.

The article in question is titled “Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway,” which was authored by three researchers in China, including the corresponding author Dingjun Hao of Xi’an Honghui Hospital. It was published online Tuesday in the journal Frontiers in Cell and Developmental Biology.

Frontiers did not immediately respond to Ars’ request for comment, but we will update this post with any response.

The first figure in the paper, the one containing the rat, drew immediate attention as scientists began widely sharing it and commenting on it on social media platforms, including Bluesky and the platform formerly known as Twitter. From a distance, the anatomical image is clearly all sorts of wrong. But, looking closer only reveals more flaws, including the labels “dissilced,” Stemm cells,” “iollotte sserotgomar,” and “dck.” Many researchers expressed surprise and dismay that such a blatantly bad AI-generated image could pass through the peer-review system and whatever internal processing is in place at the journal.

Figure 2 is supposed to be a diagram of the JAK-STAT signaling pathway.

Enlarge / Figure 2 is supposed to be a diagram of the JAK-STAT signaling pathway.

But the rat’s package is far from the only problem. Figure 2 is less graphic but equally mangled. While it’s intended to be a diagram of a complex signaling pathway, it instead is a jumbled mess. One scientific integrity expert questioned whether it provide an overly complicated explanation of “how to make a donut with colorful sprinkles.” Like the first image, the diagram is rife with nonsense text and baffling images. Figure 3 is no better, offering a collage of small circular images that are densely annotated with gibberish. The image is supposed to provide visual representations of how the signaling pathway from Figure 2 regulates the biological properties of spermatogonial stem cells.

Some scientists online questioned whether the text was also AI-generated. One user noted that AI detection software determined that it was likely to be AI-generated; however, as Ars has reported previously, such software is unreliable.

Figure 3 is supposed to show the regulation of biological properties of spermatogonial stem cells by JAK/STAT signaling pathway.

Enlarge / Figure 3 is supposed to show the regulation of biological properties of spermatogonial stem cells by JAK/STAT signaling pathway.

The images, while egregious examples, highlight a growing problem in scientific publishing. A scientist’s success relies heavily on their publication record, with a large volume of publications, frequent publishing, and articles appearing in top-tier journals, all of which earn scientists more prestige. The system incentivizes less-than-scrupulous researchers to push through low-quality articles, which, in the era of AI chatbots, could potentially be generated with the help of AI. Researchers worry that the growing use of AI will make published research less trustworthy. As such, research journals have recently set new authorship guidelines for AI-generated text to try to address the problem. But for now, as the Frontiers article shows, there are clearly some gaps.

Scientists aghast at bizarre AI rat with huge genitals in peer-reviewed article Read More »

how-much-detail-is-too-much?-midjourney-v6-attempts-to-find-out

How much detail is too much? Midjourney v6 attempts to find out

An AI-generated image of a

Enlarge / An AI-generated image of a “Beautiful queen of the universe looking at the camera in sci-fi armor, snow and particles flowing, fire in the background” created using alpha Midjourney v6.

Midjourney

In December, just before Christmas, Midjourney launched an alpha version of its latest image synthesis model, Midjourney v6. Over winter break, Midjourney fans put the new AI model through its paces, with the results shared on social media. So far, fans have noted much more detail than v5.2 (the current default) and a different approach to prompting. Version 6 can also handle generating text in a rudimentary way, but it’s far from perfect.

“It’s definitely a crazy update, both in good and less good ways,” artist Julie Wieland, who frequently shares her Midjourney creations online, told Ars. “The details and scenery are INSANE, the downside (for now) are that the generations are very high contrast and overly saturated (imo). Plus you need to kind of re-adapt and rethink your prompts, working with new structures and now less is kind of more in terms of prompting.”

At the same time, critics of the service still bristle about Midjourney training its models using human-made artwork scraped from the web and obtained without permission—a controversial practice common among AI model trainers we have covered in detail in the past. We’ve also covered the challenges artists might face in the future from these technologies elsewhere.

Too much detail?

With AI-generated detail ramping up dramatically between major Midjourney versions, one could wonder if there is ever such as thing as “too much detail” in an AI-generated image. Midjourney v6 seems to be testing that very question, creating many images that sometimes seem more detailed than reality in an unrealistic way, although that can be modified with careful prompting.

  • An AI-generated image of a nurse in the 1960s created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of an astronaut created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of a “juicy flaming cheeseburger” created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of “a handsome Asian man” created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of an “Apple II” sitting on a desk in the 1980s created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of a “photo of a cat in a car holding a can of beer” created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of a forest path created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of a woman among flowers created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of “a plate of delicious pickles” created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of a barbarian beside a TV set that says “Ars Technica” on it created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of “Abraham Lincoln holding a sign that says Ars Technica” created using alpha Midjourney v6.

    Midjourney

  • An AI-generated image of Mickey Mouse holding a machine gun created using alpha Midjourney v6.

    Midjourney

In our testing of version 6 (which can currently be invoked with the “–v 6.0” argument at the end of a prompt), we noticed times when the new model appeared to produce worse results than v5.2, but Midjourney veterans like Wieland tell Ars that those differences are largely due to the different way that v6.0 interprets prompts. That is something Midjourney is continuously updating over time. “Old prompts sometimes work a bit better than the day they released it,” Wieland told us.

How much detail is too much? Midjourney v6 attempts to find out Read More »

a-song-of-hype-and-fire:-the-10-biggest-ai-stories-of-2023

A song of hype and fire: The 10 biggest AI stories of 2023

An illustration of a robot accidentally setting off a mushroom cloud on a laptop computer.

Getty Images | Benj Edwards

“Here, There, and Everywhere” isn’t just a Beatles song. It’s also a phrase that recalls the spread of generative AI into the tech industry during 2023. Whether you think AI is just a fad or the dawn of a new tech revolution, it’s been impossible to deny that AI news has dominated the tech space for the past year.

We’ve seen a large cast of AI-related characters emerge that includes tech CEOs, machine learning researchers, and AI ethicists—as well as charlatans and doomsayers. From public feedback on the subject of AI, we’ve heard that it’s been difficult for non-technical people to know who to believe, what AI products (if any) to use, and whether we should fear for our lives or our jobs.

Meanwhile, in keeping with a much-lamented trend of 2022, machine learning research has not slowed down over the past year. On X, former Biden administration tech advisor Suresh Venkatasubramanian wrote, “How do people manage to keep track of ML papers? This is not a request for support in my current state of bewilderment—I’m genuinely asking what strategies seem to work to read (or “read”) what appear to be 100s of papers per day.”

To wrap up the year with a tidy bow, here’s a look back at the 10 biggest AI news stories of 2023. It was very hard to choose only 10 (in fact, we originally only intended to do seven), but since we’re not ChatGPT generating reams of text without limit, we have to stop somewhere.

Bing Chat “loses its mind”

Aurich Lawson | Getty Images

In February, Microsoft unveiled Bing Chat, a chatbot built into its languishing Bing search engine website. Microsoft created the chatbot using a more raw form of OpenAI’s GPT-4 language model but didn’t tell everyone it was GPT-4 at first. Since Microsoft used a less conditioned version of GPT-4 than the one that would be released in March, the launch was rough. The chatbot assumed a temperamental personality that could easily turn on users and attack them, tell people it was in love with them, seemingly worry about its fate, and lose its cool when confronted with an article we wrote about revealing its system prompt.

Aside from the relatively raw nature of the AI model Microsoft was using, at fault was a system where very long conversations would push the conditioning system prompt outside of its context window (like a form of short-term memory), allowing all hell to break loose through jailbreaks that people documented on Reddit. At one point, Bing Chat called me “the culprit and the enemy” for revealing some of its weaknesses. Some people thought Bing Chat was sentient, despite AI experts’ assurances to the contrary. It was a disaster in the press, but Microsoft didn’t flinch, and it ultimately reigned in some of Bing Chat’s wild proclivities and opened the bot widely to the public. Today, Bing Chat is now known as Microsoft Copilot, and it’s baked into Windows.

US Copyright Office says no to AI copyright authors

An AI-generated image that won a prize at the Colorado State Fair in 2022, later denied US copyright registration.

Enlarge / An AI-generated image that won a prize at the Colorado State Fair in 2022, later denied US copyright registration.

Jason M. Allen

In February, the US Copyright Office issued a key ruling on AI-generated art, revoking the copyright previously granted to the AI-assisted comic book “Zarya of the Dawn” in September 2022. The decision, influenced by the revelation that the images were created using the AI-powered Midjourney image generator, stated that only the text and arrangement of images and text by Kashtanova were eligible for copyright protection. It was the first hint that AI-generated imagery without human-authored elements could not be copyrighted in the United States.

This stance was further cemented in August when a US federal judge ruled that art created solely by AI cannot be copyrighted. In September, the US Copyright Office rejected the registration for an AI-generated image that won a Colorado State Fair art contest in 2022. As it stands now, it appears that purely AI-generated art (without substantial human authorship) is in the public domain in the United States. This stance could be further clarified or changed in the future by judicial rulings or legislation.

A song of hype and fire: The 10 biggest AI stories of 2023 Read More »