openai

lazy-use-of-ai-leads-to-amazon-products-called-“i-cannot-fulfill-that-request”

Lazy use of AI leads to Amazon products called “I cannot fulfill that request”

FILE NOT FOUND —

The telltale error messages are a sign of AI-generated pablum all over the Internet.

I know naming new products can be hard, but these Amazon sellers made some particularly odd naming choices.

Enlarge / I know naming new products can be hard, but these Amazon sellers made some particularly odd naming choices.

Amazon

Amazon users are at this point used to search results filled with products that are fraudulent, scams, or quite literally garbage. These days, though, they also may have to pick through obviously shady products, with names like “I’m sorry but I cannot fulfill this request it goes against OpenAI use policy.”

As of press time, some version of that telltale OpenAI error message appears in Amazon products ranging from lawn chairs to office furniture to Chinese religious tracts. A few similarly named products that were available as of this morning have been taken down as word of the listings spreads across social media (one such example is Archived here).

ProTip: Don't ask OpenAI to integrate a trademarked brand name when generating a name for your weird length of rubber tubing.

Enlarge / ProTip: Don’t ask OpenAI to integrate a trademarked brand name when generating a name for your weird length of rubber tubing.

Other Amazon product names don’t mention OpenAI specifically but feature apparent AI-related error messages, such as “Sorry but I can’t generate a response to that request” or “Sorry but I can’t provide the information you’re looking for,” (available in a variety of colors). Sometimes, the product names even highlight the specific reason why the apparent AI-generation request failed, noting that OpenAI can’t provide content that “requires using trademarked brand names” or “promotes a specific religious institution” or in one case “encourage unethical behavior.”

The repeated invocation of a

Enlarge / The repeated invocation of a “commitment to providing reliable and trustworthy product descriptions” cited in this description is particularly ironic.

The descriptions for these oddly named products are also riddled with obvious AI error messages like, “Apologies, but I am unable to provide the information you’re seeking.” One product description for a set of tables and chairs (which has since been taken down) hilariously noted: “Our [product] can be used for a variety of tasks, such [task 1], [task 2], and [task 3]].” Another set of product descriptions, seemingly for tattoo ink guns, repeatedly apologizes that it can’t provide more information because: “We prioritize accuracy and reliability by only offering verified product details to our customers.”

Spam spam spam spam

Using large language models to help generate product names or descriptions isn’t against Amazon policy. On the contrary, in September Amazon launched its own generative AI tool to help sellers “create more thorough and captivating product descriptions, titles, and listing details.” And we could only find a small handful of Amazon products slipping through with the telltale error messages in their names or descriptions as of press time.

Still, these error-message-filled listings highlight the lack of care or even basic editing many Amazon scammers are exercising when putting their spammy product listings on the Amazon marketplace. For every seller that can be easily caught accidentally posting an OpenAI error, there are likely countless others using the technology to create product names and descriptions that only seem like they were written by a human that has actual experience with the product in question.

A set of clearly real people conversing on Twitter / X.

Enlarge / A set of clearly real people conversing on Twitter / X.

Amazon isn’t the only online platform where these AI bots are outing themselves, either. A quick search for “goes against OpenAI policy” or “as an AI language model” can find a whole lot of artificial posts on Twitter / X or Threads or LinkedIn, for example. Security engineer Dan Feldman noted a similar problem on Amazon back in April, though searching with the phrase “as an AI language model” doesn’t seem to generate any obviously AI-generated search results these days.

As fun as it is to call out these obvious mishaps for AI-generated content mills, a flood of harder-to-detect AI content is threatening to overwhelm everyone from art communities to sci-fi magazines to Amazon’s own ebook marketplace. Pretty much any platform that accepts user submissions that involve text or visual art now has to worry about being flooded with wave after wave of AI-generated work trying to crowd out the human community they were created for. It’s a problem that’s likely to get worse before it gets better.

Listing image by Getty Images | Leon Neal

Lazy use of AI leads to Amazon products called “I cannot fulfill that request” Read More »

at-senate-ai-hearing,-news-executives-fight-against-“fair-use”-claims-for-ai-training-data

At Senate AI hearing, news executives fight against “fair use” claims for AI training data

All’s fair in love and AI —

Media orgs want AI firms to license content for training, and Congress is sympathetic.

WASHINGTON, DC - JANUARY 10: Danielle Coffey, President and CEO of News Media Alliance, Professor Jeff Jarvis, CUNY Graduate School of Journalism, Curtis LeGeyt President and CEO of National Association of Broadcasters, Roger Lynch CEO of Condé Nast, are strong in during a Senate Judiciary Subcommittee on Privacy, Technology, and the Law hearing on “Artificial Intelligence and The Future Of Journalism” at the U.S. Capitol on January 10, 2024 in Washington, DC. Lawmakers continue to hear testimony from experts and business leaders about artificial intelligence and its impact on democracy, elections, privacy, liability and news. (Photo by Kent Nishimura/Getty Images)

Enlarge / Danielle Coffey, president and CEO of News Media Alliance; Professor Jeff Jarvis, CUNY Graduate School of Journalism; Curtis LeGeyt, president and CEO of National Association of Broadcasters; and Roger Lynch, CEO of Condé Nast, are sworn in during a Senate Judiciary Subcommittee on Privacy, Technology, and the Law hearing on “Artificial Intelligence and The Future Of Journalism.”

Getty Images

On Wednesday, news industry executives urged Congress for legal clarification that using journalism to train AI assistants like ChatGPT is not fair use, as claimed by companies such as OpenAI. Instead, they would prefer a licensing regime for AI training content that would force Big Tech companies to pay for content in a method similar to rights clearinghouses for music.

The plea for action came during a US Senate Judiciary Committee hearing titled “Oversight of A.I.: The Future of Journalism,” chaired by Sen. Richard Blumenthal of Connecticut, with Sen. Josh Hawley of Missouri also playing a large role in the proceedings. Last year, the pair of senators introduced a bipartisan framework for AI legislation and held a series of hearings on the impact of AI.

Blumenthal described the situation as an “existential crisis” for the news industry and cited social media as a cautionary tale for legislative inaction about AI. “We need to move more quickly than we did on social media and learn from our mistakes in the delay there,” he said.

Companies like OpenAI have admitted that vast amounts of copyrighted material are necessary to train AI large language models, but they claim their use is transformational and covered under fair use precedents of US copyright law. Currently, OpenAI is negotiating licensing content from some news providers and striking deals, but the executives in the hearing said those efforts are not enough, highlighting closing newsrooms across the US and dropping media revenues while Big Tech’s profits soar.

“Gen AI cannot replace journalism,” said Condé Nast CEO Roger Lynch in his opening statement. (Condé Nast is the parent company of Ars Technica.) “Journalism is fundamentally a human pursuit, and it plays an essential and irreplaceable role in our society and our democracy.” Lynch said that generative AI has been built with “stolen goods,” referring to the use of AI training content from news outlets without authorization. “Gen AI companies copy and display our content without permission or compensation in order to build massive commercial businesses that directly compete with us.”

Roger Lynch, CEO of Condé Nast, testifies before the Senate Judiciary Subcommittee on Privacy, Technology, and the Law during a hearing on “Artificial Intelligence and The Future Of Journalism.”

Enlarge / Roger Lynch, CEO of Condé Nast, testifies before the Senate Judiciary Subcommittee on Privacy, Technology, and the Law during a hearing on “Artificial Intelligence and The Future Of Journalism.”

Getty Images

In addition to Lynch, the hearing featured three other witnesses: Jeff Jarvis, a veteran journalism professor and pundit; Danielle Coffey, the president and CEO of News Media Alliance; and Curtis LeGeyt, president and CEO of the National Association of Broadcasters.

Coffey also shared concerns about generative AI using news material to create competitive products. “These outputs compete in the same market, with the same audience, and serve the same purpose as the original articles that feed the algorithms in the first place,” she said.

When Sen. Hawley asked Lynch what kind of legislation might be needed to fix the problem, Lynch replied, “I think quite simply, if Congress could clarify that the use of our content and other publisher content for training and output of AI models is not fair use, then the free market will take care of the rest.”

Lynch used the music industry as a model: “You think about millions of artists, millions of ultimate consumers consuming that content, there have been models that have been set up, ASCAP, BMI, CSAC, GMR, these collective rights organizations to simplify the content that’s being used.”

Curtis LeGeyt, CEO of the National Association of Broadcasters, said that TV broadcast journalists are also affected by generative AI. “The use of broadcasters’ news content in AI models without authorization diminishes our audience’s trust and our reinvestment in local news,” he said. “Broadcasters have already seen numerous examples where content created by our journalists has been ingested and regurgitated by AI bots with little or no attribution.”

At Senate AI hearing, news executives fight against “fair use” claims for AI training data Read More »

openai’s-gpt-store-lets-chatgpt-users-discover-popular-user-made-chatbot-roles

OpenAI’s GPT Store lets ChatGPT users discover popular user-made chatbot roles

The bot of 1,000 faces —

Like an app store, people can find novel ChatGPT personalities—and some creators will get paid.

Two robots hold a gift box.

On Wednesday, OpenAI announced the launch of its GPT Store—a way for ChatGPT users to share and discover custom chatbot roles called “GPTs”—and ChatGPT Team, a collaborative ChatGPT workspace and subscription plan. OpenAI bills the new store as a way to “help you find useful and popular custom versions of ChatGPT” for members of Plus, Team, or Enterprise subscriptions.

“It’s been two months since we announced GPTs, and users have already created over 3 million custom versions of ChatGPT,” writes OpenAI in its promotional blog. “Many builders have shared their GPTs for others to use. Today, we’re starting to roll out the GPT Store to ChatGPT Plus, Team and Enterprise users so you can find useful and popular GPTs.”

OpenAI launched GPTs on November 6, 2023, as part of its DevDay event. Each GPT includes custom instructions and/or access to custom data or external APIs that can potentially make a custom GPT personality more useful than the vanilla ChatGPT-4 model. Before the GPT Store launch, paying ChatGPT users could create and share custom GPTs with others (by setting the GPT public and sharing a link to the GPT), but there was no central repository for browsing and discovering user-designed GPTs on the OpenAI website.

According to OpenAI, the ChatGPT Store will feature new GPTs every week, and the company shared a list a group of six notable early GPTs that are available now: AllTrails for finding hiking trails, Consensus for searching 200 million academic papers, Code Tutor for learning coding with Khan Academy, Canva for designing presentations, Books for discovering reading material, and CK-12 Flexi for learning math and science.

A screenshot of the OpenAI GPT Store provided by OpenAI.

Enlarge / A screenshot of the OpenAI GPT Store provided by OpenAI.

OpenAI

ChatGPT members can include their own GPTs in the GPT Store by setting them to be accessible to “Everyone” and then verifying a builder profile in ChatGPT settings. OpenAI plans to review GPTs to ensure they meet their policies and brand guidelines. GPTs that violate the rules can also be reported by users.

As promised by CEO Sam Altman during DevDay, OpenAI plans to share revenue with GPT creators. Unlike a smartphone app store, it appears that users will not sell their GPTs in the GPT Store, but instead, OpenAI will pay developers “based on user engagement with their GPTs.” The revenue program will launch in the first quarter of 2024, and OpenAI will provide more details on the criteria for receiving payments later.

“ChatGPT Team” is for teams who use ChatGPT

Also on Monday, OpenAI announced the cleverly named ChatGPT Team, a new group-based ChatGPT membership program akin to ChatGPT Enterprise, which the company launched last August. Unlike Enterprise, which is for large companies and does not have publicly listed prices, ChatGPT Team is a plan for “teams of all sizes” and costs US $25 a month per user (when billed annually) or US $30 a month per user (when billed monthly). By comparison, ChatGPT Plus costs $20 per month.

So what does ChatGPT Team offer above the usual ChatGPT Plus subscription? According to OpenAI, it “provides a secure, collaborative workspace to get the most out of ChatGPT at work.” Unlike Plus, OpenAI says it will not train AI models based on ChatGPT Team business data or conversations. It features an admin console for team management and the ability to share custom GPTs with your team. Like Plus, it also includes access to GPT-4 with the 32K context window, DALL-E 3, GPT-4 with Vision, Browsing, and Advanced Data Analysis—all with higher message caps.

Why would you want to use ChatGPT at work? OpenAI says it can help you generate better code, craft emails, analyze data, and more. Your mileage may vary, of course. As usual, our standard Ars warning about AI language models applies: “Bring your own data” for analysis, don’t rely on ChatGPT as a factual resource, and don’t rely on its outputs in ways you cannot personally confirm. OpenAI has provided more details about ChatGPT Team on its website.

OpenAI’s GPT Store lets ChatGPT users discover popular user-made chatbot roles Read More »

regulators-aren’t-convinced-that-microsoft-and-openai-operate-independently

Regulators aren’t convinced that Microsoft and OpenAI operate independently

Under Microsoft’s thumb? —

EU is fielding comments on potential market harms of Microsoft’s investments.

Regulators aren’t convinced that Microsoft and OpenAI operate independently

European Union regulators are concerned that Microsoft may be covertly controlling OpenAI as its biggest investor.

On Tuesday, the European Commission (EC) announced that it is currently “checking whether Microsoft’s investment in OpenAI might be reviewable under the EU Merger Regulation.”

The EC’s executive vice president in charge of competition policy, Margrethe Vestager, said in the announcement that rapidly advancing AI technologies are “disruptive” and have “great potential,” but to protect EU markets, a forward-looking analysis scrutinizing antitrust risks has become necessary.

Hoping to thwart predictable anticompetitive risks, the EC has called for public comments. Regulators are particularly keen to hear from policy experts, academics, and industry and consumer organizations who can identify “potential competition issues” stemming from tech companies partnering to develop generative AI and virtual world/metaverse systems.

The EC worries that partnerships like Microsoft and OpenAI could “result in entrenched market positions and potential harmful competition behavior that is difficult to address afterwards.” That’s why Vestager said that these partnerships needed to be “closely” monitored now—”to ensure they do not unduly distort market dynamics.”

Microsoft has denied having control over OpenAI.

A Microsoft spokesperson told Ars that, rather than stifling competition, since 2019, the tech giant has “forged a partnership with OpenAI that has fostered more AI innovation and competition, while preserving independence for both companies.”

But ever since Sam Altman was bizarrely ousted by OpenAI’s board, then quickly reappointed as OpenAI’s CEO—joining Microsoft for the brief time in between—regulators have begun questioning whether recent governance changes mean that Microsoft’s got more control over OpenAI than the companies have publicly stated.

OpenAI did not immediately respond to Ars’ request to comment. Last year, OpenAI confirmed that “it remained independent and operates competitively,” CNBC reported.

Beyond the EU, the UK’s Competition and Markets Authority (CMA) and reportedly the US Federal Trade Commission have also launched investigations into Microsoft’s OpenAI investments. On January 3, the CMA ended its comments period, but it’s currently unclear whether significant competition issues were raised that could trigger a full-fledged CMA probe.

A CMA spokesperson declined Ars’ request to comment on the substance of comments received or to verify how many comments were received.

Antitrust legal experts told Reuters that authorities should act quickly to prevent “critical emerging technology” like generative AI from being “monopolized,” noting that before launching a probe, the CMA will need to find evidence showing that Microsoft’s influence over OpenAI materially changed after Altman’s reappointment.

The EC is also investigating partnerships beyond Microsoft and OpenAI, questioning whether agreements “between large digital market players and generative AI developers and providers” may impact EU market dynamics.

Microsoft observing OpenAI board meetings

In total, Microsoft has pumped $13 billion into OpenAI, CNBC reported, which has a somewhat opaque corporate structure. OpenAI’s parent company, Reuters reported in December, is a nonprofit, which is “a type of entity rarely subject to antitrust scrutiny.” But in 2019, as Microsoft started investing billions into the AI company, OpenAI also “set up a for-profit subsidiary, in which Microsoft owns a 49 percent stake,” an insider source told Reuters. On Tuesday, a nonprofit consumer rights group, the Public Citizen, called for California Attorney General Robert Bonta to “investigate whether OpenAI should retain its non-profit status.”

A Microsoft spokesperson told Reuters that the source’s information was inaccurate, reiterating that the terms of Microsoft’s agreement with OpenAI are confidential. Microsoft has maintained that while it is entitled to OpenAI’s profits, it does not own “any portion” of OpenAI.

After OpenAI’s drama with Altman ended with an overhaul of OpenAI’s board, Microsoft appeared to increase its involvement with OpenAI by receiving a non-voting observer role on the board. That’s what likely triggered lawmaker’s initial concerns that Microsoft “may be exerting control over OpenAI,” CNBC reported.

The EC’s announcement comes days after Microsoft confirmed that Dee Templeton would serve as the observer on OpenAI’s board, initially reported by Bloomberg.

Templeton has spent 25 years working for Microsoft and is currently vice president for technology and research partnerships and operations. According to Bloomberg, she has already attended OpenAI board meetings.

Microsoft’s spokesperson told Ars that adding a board observer was the only recent change in the company’s involvement in OpenAI. An OpenAI spokesperson told CNBC that Microsoft’s board observer has no “governing authority or control over OpenAI’s operations.”

By appointing Templeton as a board observer, Microsoft may simply be seeking to avoid any further surprises that could affect its investment in OpenAI, but the CMA has suggested that Microsoft’s involvement in the board may have created “a relevant merger situation” that could shake up competition in the UK if not appropriately regulated.

Regulators aren’t convinced that Microsoft and OpenAI operate independently Read More »

microsoft-is-adding-a-new-key-to-pc-keyboards-for-the-first-time-since-1994

Microsoft is adding a new key to PC keyboards for the first time since 1994

key change —

Copilot key will eventually be required in new PC keyboards, though not yet.

A rendering of Microsoft's Copilot key, as seen on a Surface-esque laptop keyboard.

Enlarge / A rendering of Microsoft’s Copilot key, as seen on a Surface-esque laptop keyboard.

Microsoft

Microsoft pushed throughout 2023 to add generative AI capabilities to its software, even extending its new Copilot AI assistant to Windows 10 late last year. Now, those efforts to transform PCs at a software level is extending to the hardware: Microsoft is adding a dedicated Copilot key to PC keyboards, adjusting the standard Windows keyboard layout for the first time since the Windows key first appeared on its Natural Keyboard in 1994.

The Copilot key will, predictably, open up the Copilot generative AI assistant within Windows 10 and Windows 11. On an up-to-date Windows PC with Copilot enabled, you can currently do the same thing by pressing Windows + C. For PCs without Copilot enabled, including those that aren’t signed into Microsoft accounts, the Copilot key will open Windows Search instead (though this is sort of redundant, since pressing the Windows key and then typing directly into the Start menu also activates the Search function).

A quick Microsoft demo video shows the Copilot key in between the cluster of arrow keys and the right Alt button, a place where many keyboards usually put a menu button, a right Ctrl key, another Windows key, or something similar. The exact positioning, and the key being replaced, may vary depending on the size and layout of the keyboard.

We asked Microsoft if a Copilot key would be required on OEM PCs going forward; the company told us that the key isn’t mandatory now, but that it expects Copilot keys to be required on Windows 11 keyboards “over time.” Microsoft often imposes some additional hardware requirements on major PC makers that sell Windows on their devices, beyond what is strictly necessary to run Windows itself.

If nothing else, this new key is a sign of how much Microsoft wants people to use Copilot and its other generative AI products. Plenty of past company initiatives—Bing, Edge, Cortana, and the Microsoft Store, to name a few—never managed to become baked into the hardware like this. In the Windows 8 epoch, Microsoft required OEMs to build a Windows button into the display bezel of devices with touchscreens, but that requirement eventually disappeared. If Copilot fizzles or is deemphasized the way Cortana was, the Copilot key could become a way to quickly date a Windows PC from the mid-2020s, the way that changes to the Windows logo date keyboards from earlier eras.

We’ll definitely see more AI features from Microsoft this year, too—Microsoft Chief Marketing Officer Yusuf Medhi called 2024 “the year of the AI PC” in today’s announcement.

Chipmakers like Intel, AMD, and Qualcomm are all building neural processing units (NPUs) into their latest silicon, and we’ll likely see more updates for Windows apps and features that can take advantage of this new on-device processing capability. Rumors also indicate that we could see a “Windows 12” release as soon as this year; while Windows 11 has mostly had AI features stacked on top of it, a new OS could launch with AI features more deeply integrated into the UI and apps, as well as additional hardware requirements for some features.

Microsoft says the Copilot key will debut in some PCs that will be announced at the Consumer Electronics Show this month. Surface devices with the revised keyboard layout are “upcoming.”

Microsoft is adding a new key to PC keyboards for the first time since 1994 Read More »

ny-times-copyright-suit-wants-openai-to-delete-all-gpt-instances

NY Times copyright suit wants OpenAI to delete all GPT instances

Not the sincerest form of flattery —

Shows evidence that GPT-based systems will reproduce Times articles if asked.

Image of a CPU on a motherboard with

Enlarge / Microsoft is named in the suit for allegedly building the system that allowed GPT derivatives to be trained using infringing material.

In August, word leaked out that The New York Times was considering joining the growing legion of creators that are suing AI companies for misappropriating their content. The Times had reportedly been negotiating with OpenAI regarding the potential to license its material, but those talks had not gone smoothly. So, eight months after the company was reportedly considering suing, the suit has now been filed.

The Times is targeting various companies under the OpenAI umbrella, as well as Microsoft, an OpenAI partner that both uses it to power its Copilot service and helped provide the infrastructure for training the GPT Large Language Model. But the suit goes well beyond the use of copyrighted material in training, alleging that OpenAI-powered software will happily circumvent the Times’ paywall and ascribe hallucinated misinformation to the Times.

Journalism is expensive

The suit notes that The Times maintains a large staff that allows it to do things like dedicate reporters to a huge range of beats and engage in important investigative journalism, among other things. Because of those investments, the newspaper is often considered an authoritative source on many matters.

All of that costs money, and The Times earns that by limiting access to its reporting through a robust paywall. In addition, each print edition has a copyright notification, the Times’ terms of service limit the copying and use of any published material, and it can be selective about how it licenses its stories. In addition to driving revenue, these restrictions also help it to maintain its reputation as an authoritative voice by controlling how its works appear.

The suit alleges that OpenAI-developed tools undermine all of that. “By providing Times content without The Times’s permission or authorization, Defendants’ tools undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising, and affiliate revenue,” the suit alleges.

Part of the unauthorized use The Times alleges came during the training of various versions of GPT. Prior to GPT-3.5, information about the training dataset was made public. One of the sources used is a large collection of online material called “Common Crawl,” which the suit alleges contains information from 16 million unique records from sites published by The Times. That places the Times as the third most referenced source, behind Wikipedia and a database of US patents.

OpenAI no longer discloses as many details of the data used for training of recent GPT versions, but all indications are that full-text NY Times articles are still part of that process (Much more on that in a moment.) Expect access to training information to be a major issue during discovery if this case moves forward.

Not just training

A number of suits have been filed regarding the use of copyrighted material during training of AI systems. But the Times’ suit goes well beyond that to show how the material ingested during training can come back out during use. “Defendants’ GenAI tools can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples,” the suit alleges.

The suit alleges—and we were able to verify—that it’s comically easy to get GPT-powered systems to offer up content that is normally protected by the Times’ paywall. The suit shows a number of examples of GPT-4 reproducing large sections of articles nearly verbatim.

The suit includes screenshots of ChatGPT being given the title of a piece at The New York Times and asked for the first paragraph, which it delivers. Getting the ensuing text is apparently as simple as repeatedly asking for the next paragraph.

ChatGPT has apparently closed that loophole in between the preparation of that suit and the present. We entered some of the prompts shown in the suit, and were advised “I recommend checking The New York Times website or other reputable sources,” although we can’t rule out that context provided prior to that prompt could produce copyrighted material.

Ask for a paragraph, and Copilot will hand you a wall of normally paywalled text.

Ask for a paragraph, and Copilot will hand you a wall of normally paywalled text.

John Timmer

But not all loopholes have been closed. The suit also shows output from Bing Chat, since rebranded as Copilot. We were able to verify that asking for the first paragraph of a specific article at The Times caused Copilot to reproduce the first third of the article.

The suit is dismissive of attempts to justify this as a form of fair use. “Publicly, Defendants insist that their conduct is protected as ‘fair use’ because their unlicensed use of copyrighted content to train GenAI models serves a new ‘transformative’ purpose,” the suit notes. “But there is nothing ‘transformative’ about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.”

Reputational and other damages

The hallucinations common to AI also came under fire in the suit for potentially damaging the value of the Times’ reputation, and possibly damaging human health as a side effect. “A GPT model completely fabricated that “The New York Times published an article on January 10, 2020, titled ‘Study Finds Possible Link between Orange Juice and Non-Hodgkin’s Lymphoma,’” the suit alleges. “The Times never published such an article.”

Similarly, asking about a Times article on heart-healthy foods allegedly resulted in Copilot saying it contained a list of examples (which it didn’t). When asked for the list, 80 percent of the foods on weren’t even mentioned by the original article. In another case, recommendations were ascribed to the Wirecutter when the products hadn’t even been reviewed by its staff.

As with the Times material, it’s alleged that it’s possible to get Copilot to offer up large chunks of Wirecutter articles (The Wirecutter is owned by The New York Times). But the suit notes that these article excerpts have the affiliate links stripped out of them, keeping the Wirecutter from its primary source of revenue.

The suit targets various OpenAI companies for developing the software, as well as Microsoft—the latter for both offering OpenAI-powered services, and for having developed the computing systems that enabled the copyrighted material to be ingested during training. Allegations include direct, contributory, and vicarious copyright infringement, as well as DMCA and trademark violations. Finally, it alleges “Common Law Unfair Competition By Misappropriation.”

The suit seeks nothing less than the erasure of both any GPT instances that the parties have trained using material from the Times, as well as the destruction of the datasets that were used for the training. It also asks for a permanent injunction to prevent similar conduct in the future. The Times also wants money, lots and lots of money: “statutory damages, compensatory damages, restitution, disgorgement, and any other relief that may be permitted by law or equity.”

NY Times copyright suit wants OpenAI to delete all GPT instances Read More »

big-tech-is-spending-more-than-vc-firms-on-ai-startups

Big Tech is spending more than VC firms on AI startups

money cannon —

Microsoft, Google, and Amazon haved crowded out traditional Silicon Valley investors.

A string of deals by Microsoft, Google and Amazon amounted to two-thirds of the $27 billion raised by fledgling AI companies in 2023,

Enlarge / A string of deals by Microsoft, Google and Amazon amounted to two-thirds of the $27 billion raised by fledgling AI companies in 2023,

FT montage/Dreamstime

Big tech companies have vastly outspent venture capital groups with investments in generative AI startups this year, as established giants use their financial muscle to dominate the much-hyped sector.

Microsoft, Google and Amazon last year struck a series of blockbuster deals, amounting to two-thirds of the $27 billion raised by fledgling AI companies in 2023, according to new data from private market researchers PitchBook.

The huge outlay, which exploded after the launch of OpenAI’s ChatGPT in November 2022, highlights how the biggest Silicon Valley groups are crowding out traditional tech investors for the biggest deals in the industry.

The rise of generative AI—systems capable of producing humanlike video, text, image and audio in seconds—have also attracted top Silicon Valley investors. But VCs have been outmatched, having been forced to slow down their spending as they adjust to higher interest rates and falling valuations for their portfolio companies.

“Over the past year, we’ve seen the market quickly consolidate around a handful of foundation models, with large tech players coming in and pouring billions of dollars into companies like OpenAI, Cohere, Anthropic and Mistral,” said Nina Achadjian, a partner at US venture firm Index Ventures referring to some of the top AI startups.

“For traditional VCs, you had to be in early and you had to have conviction—which meant being in the know on the latest AI research and knowing which teams were spinning out of Google DeepMind, Meta and others,” she added.

Financial Times

A string of deals, such as Microsoft’s $10 billion investment in OpenAI as well as billions of dollars raised by San Francisco-based Anthropic from both Google and Amazon, helped push overall spending on AI groups to nearly three times as much as the previous record of $11 billion set two years ago.

Venture investing in tech hit record levels in 2021, as investors took advantage of ultra-low interest rates to raise and deploy vast sums across a range of industries, particularly those most disrupted by Covid-19.

Microsoft has also committed $1.3 billion to Inflection, another generative AI start-up, as it looks to steal a march on rivals such as Google and Amazon.

Building and training generative AI tools is an intensive process, requiring immense computing power and cash. As a result, start-ups have preferred to partner with Big Tech companies which can provide cloud infrastructure and access to the most powerful chips as well as dollars.

That has rapidly pushed up the valuations of private start-ups in the space, making it harder for VCs to bet on the companies at the forefront of the technology. An employee stock sale at OpenAI is seeking to value the company at $86 billion, almost treble the valuation it received earlier this year.

“Even the world’s top venture investors, with tens of billions under management, can’t compete to keep these AI companies independent and create new challengers that unseat the Big Tech incumbents,” said Patrick Murphy, founding partner at Tapestry VC, an early-stage venture capital firm.

“In this AI platform shift, most of the potentially one-in-a-million companies to appear so far have been captured by the Big Tech incumbents already.”

VCs are not absent from the market, however. Thrive Capital, Josh Kushner’s New York-based firm, is the lead investor in OpenAI’s employee stock sale, having already backed the company earlier this year. Thrive has continued to invest throughout a downturn in venture spending in 2023.

Paris-based Mistral raised around $500 million from investors including venture firms Andreessen Horowitz and General Catalyst, and chipmaker Nvidia since it was founded in May this year.

Some VCs are seeking to invest in companies building applications that are being built over so-called “foundation models” developed by OpenAI and Anthropic, in much the same way apps began being developed on mobile devices in the years after smartphones were introduced.

“There is this myth that only the foundation model companies matter,” said Sarah Guo, founder of AI-focused venture firm Conviction. “There is a huge space of still-unexplored application domains for AI, and a lot of the most valuable AI companies will be fundamentally new.”

Additional reporting by Tim Bradshaw.

© 2023 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.

Big Tech is spending more than VC firms on AI startups Read More »

a-song-of-hype-and-fire:-the-10-biggest-ai-stories-of-2023

A song of hype and fire: The 10 biggest AI stories of 2023

An illustration of a robot accidentally setting off a mushroom cloud on a laptop computer.

Getty Images | Benj Edwards

“Here, There, and Everywhere” isn’t just a Beatles song. It’s also a phrase that recalls the spread of generative AI into the tech industry during 2023. Whether you think AI is just a fad or the dawn of a new tech revolution, it’s been impossible to deny that AI news has dominated the tech space for the past year.

We’ve seen a large cast of AI-related characters emerge that includes tech CEOs, machine learning researchers, and AI ethicists—as well as charlatans and doomsayers. From public feedback on the subject of AI, we’ve heard that it’s been difficult for non-technical people to know who to believe, what AI products (if any) to use, and whether we should fear for our lives or our jobs.

Meanwhile, in keeping with a much-lamented trend of 2022, machine learning research has not slowed down over the past year. On X, former Biden administration tech advisor Suresh Venkatasubramanian wrote, “How do people manage to keep track of ML papers? This is not a request for support in my current state of bewilderment—I’m genuinely asking what strategies seem to work to read (or “read”) what appear to be 100s of papers per day.”

To wrap up the year with a tidy bow, here’s a look back at the 10 biggest AI news stories of 2023. It was very hard to choose only 10 (in fact, we originally only intended to do seven), but since we’re not ChatGPT generating reams of text without limit, we have to stop somewhere.

Bing Chat “loses its mind”

Aurich Lawson | Getty Images

In February, Microsoft unveiled Bing Chat, a chatbot built into its languishing Bing search engine website. Microsoft created the chatbot using a more raw form of OpenAI’s GPT-4 language model but didn’t tell everyone it was GPT-4 at first. Since Microsoft used a less conditioned version of GPT-4 than the one that would be released in March, the launch was rough. The chatbot assumed a temperamental personality that could easily turn on users and attack them, tell people it was in love with them, seemingly worry about its fate, and lose its cool when confronted with an article we wrote about revealing its system prompt.

Aside from the relatively raw nature of the AI model Microsoft was using, at fault was a system where very long conversations would push the conditioning system prompt outside of its context window (like a form of short-term memory), allowing all hell to break loose through jailbreaks that people documented on Reddit. At one point, Bing Chat called me “the culprit and the enemy” for revealing some of its weaknesses. Some people thought Bing Chat was sentient, despite AI experts’ assurances to the contrary. It was a disaster in the press, but Microsoft didn’t flinch, and it ultimately reigned in some of Bing Chat’s wild proclivities and opened the bot widely to the public. Today, Bing Chat is now known as Microsoft Copilot, and it’s baked into Windows.

US Copyright Office says no to AI copyright authors

An AI-generated image that won a prize at the Colorado State Fair in 2022, later denied US copyright registration.

Enlarge / An AI-generated image that won a prize at the Colorado State Fair in 2022, later denied US copyright registration.

Jason M. Allen

In February, the US Copyright Office issued a key ruling on AI-generated art, revoking the copyright previously granted to the AI-assisted comic book “Zarya of the Dawn” in September 2022. The decision, influenced by the revelation that the images were created using the AI-powered Midjourney image generator, stated that only the text and arrangement of images and text by Kashtanova were eligible for copyright protection. It was the first hint that AI-generated imagery without human-authored elements could not be copyrighted in the United States.

This stance was further cemented in August when a US federal judge ruled that art created solely by AI cannot be copyrighted. In September, the US Copyright Office rejected the registration for an AI-generated image that won a Colorado State Fair art contest in 2022. As it stands now, it appears that purely AI-generated art (without substantial human authorship) is in the public domain in the United States. This stance could be further clarified or changed in the future by judicial rulings or legislation.

A song of hype and fire: The 10 biggest AI stories of 2023 Read More »

on-openai-dev-day

On OpenAI Dev Day

OpenAI DevDay was this week. What delicious and/or terrifying things await?

First off, we have GPT-4-Turbo.

Today we’re launching a preview of the next generation of this model, GPT-4 Turbo

GPT-4 Turbo is more capable and has knowledge of world events up to April 2023. It has a 128k context window so it can fit the equivalent of more than 300 pages of text in a single prompt. We also optimized its performance so we are able to offer GPT-4 Turbo at a 3x cheaper price for input tokens and a 2x cheaper price for output tokens compared to GPT-4.

GPT-4 Turbo is available for all paying developers to try by passing gpt-4-1106-preview in the API and we plan to release the stable production-ready model in the coming weeks.

Knowledge up to April 2023 is a big game. Cutting the price in half is another big game. A 128k context window retakes the lead on that from Claude-2. That chart from last week of how GPT-4 was slow and expensive, opening up room for competitors? Back to work, everyone.

What else?

Function calling updates

Function calling lets you describe functions of your app or external APIs to models, and have the model intelligently choose to output a JSON object containing arguments to call those functions. We’re releasing several improvements today, including the ability to call multiple functions in a single message: users can send one message requesting multiple actions, such as “open the car window and turn off the A/C”, which would previously require multiple roundtrips with the model (learn more). We are also improving function calling accuracy: GPT-4 Turbo is more likely to return the right function parameters.

This kind of feature seems highly fiddly and dependent. When it starts working well enough, suddenly it is great, and I have no idea if this will count. I will watch out for reports. For now, I am not trying to interact with any APIs via GPT-4. Use caution.

Improved instruction following and JSON mode

GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., “always respond in XML”). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its output to generate a syntactically correct JSON object. JSON mode is useful for developers generating JSON in the Chat Completions API outside of function calling.

Better instruction following is incrementally great. Always frustrating when instructions can’t be relied upon. Could allow some processes to be profitably automated.

Reproducible outputs and log probabilities

The new seed parameter enables reproducible outputs by making the model return consistent completions most of the time. This beta feature is useful for use cases such as replaying requests for debugging, writing more comprehensive unit tests, and generally having a higher degree of control over the model behavior. We at OpenAI have been using this feature internally for our own unit tests and have found it invaluable. We’re excited to see how developers will use it. Learn more.

We’re also launching a feature to return the log probabilities for the most likely output tokens generated by GPT-4 Turbo and GPT-3.5 Turbo in the next few weeks, which will be useful for building features such as autocomplete in a search experience.

I love the idea of seeing the probabilities of different responses on the regular, especially if incorporated into ChatGPT. It provides so much context for knowing what to make of the answer. The distribution of possible answers is the true answer. Super excited in a good way.

Updated GPT-3.5 Turbo

In addition to GPT-4 Turbo, we are also releasing a new version of GPT-3.5 Turbo that supports a 16K context window by default. The new 3.5 Turbo supports improved instruction following, JSON mode, and parallel function calling. For instance, our internal evals show a 38% improvement on format following tasks such as generating JSON, XML and YAML. Developers can access this new model by calling gpt-3.5-turbo-1106 in the API. Applications using the gpt-3.5-turbo name will automatically be upgraded to the new model on December 11. Older models will continue to be accessible by passing gpt-3.5-turbo-0613 in the API until June 13, 2024. Learn more.

Some academics will presumably grumble that the old version is going away. Such incremental improvements seem nice, but with GPT-4 getting a price cut and turbo boost, should be less call for 3.5. I can still see using it in things like multi-agent world simulations.

This claims you can now use GPT 3.5 at only a modest additional marginal cost versus Llama-2.

Hamel Husain: What’s wild is the new pricing for GPT 3.5 is competitive with commercially hosted ~ 70B Llama endpoints like those offered by anyscale and http://fireworks.ai. Cost is eroding as a moat gpt-3.5-turbo-1106 Pricing is $1/1M input and $2/1M output. [Versus 0.15 and 0.20 per million tokens]

I don’t interpret the numbers that way yet. There is still a substantial difference at scale, a factor of five or six. If you cannot afford the superior GPT-4 for a given use case, you may want the additional discount. And as all costs go down, there will be temptation to use far more queries. A factor of five is not nothing.

I’m going to skip ahead a bit to take care of all the incremental stuff first:

All right, back to normal unscary things, there’s new modalities?

New modalities in the API

GPT-4 Turbo with vision

GPT-4 Turbo can accept images as inputs in the Chat Completions API, enabling use cases such as generating captions, analyzing real world images in detail, and reading documents with figures. For example, BeMyEyes uses this technology to help people who are blind or have low vision with daily tasks like identifying a product or navigating a store. Developers can access this feature by using gpt-4-vision-preview in the API. We plan to roll out vision support to the main GPT-4 Turbo model as part of its stable release. Pricing depends on the input image size. For instance, passing an image with 1080×1080 pixels to GPT-4 Turbo costs $0.00765. Check out our vision guide.

DALL·E 3

Developers can integrate DALL·E 3, which we recently launched to ChatGPT Plus and Enterprise users, directly into their apps and products through our Images API by specifying dall-e-3 as the model. Companies like Snap, Coca-Cola, and Shutterstock have used DALL·E 3 to programmatically generate images and designs for their customers and campaigns. Similar to the previous version of DALL·E, the API incorporates built-in moderation to help developers protect their applications against misuse. We offer different format and quality options, with prices starting at $0.04 per image generated. Check out our guide to getting started with DALL·E 3 in the API.

Text-to-speech (TTS)

Developers can now generate human-quality speech from text via the text-to-speech API. Our new TTS model offers six preset voices to choose from and two model variants, tts-1 and tts-1-hd. tts is optimized for real-time use cases and tts-1-hd is optimized for quality. Pricing starts at $0.015 per input 1,000 characters. Check out our TTS guide to get started.

I can see the DALL-E 3 prices adding up to actual money. When I use Stable Diffusion, it is not so unusual that I ask for the full 100 generations, then go away for a while and come back, why not? Of course, it would be worth it for the quality boost, provided DALL-E 3 was willing to do whatever I happened to want that day. The text-to-speech seems not free but highly reasonably priced. All the voices seem oddly similar. I do like them. When do we get our licensed celebrity voice options? So many good choices.

Model customization

GPT-4 fine tuning experimental access

We’re creating an experimental access program for GPT-4 fine-tuning. Preliminary results indicate that GPT-4 fine-tuning requires more work to achieve meaningful improvements over the base model compared to the substantial gains realized with GPT-3.5 fine-tuning. As quality and safety for GPT-4 fine-tuning improves, developers actively using GPT-3.5 fine-tuning will be presented with an option to apply to the GPT-4 program within their fine-tuning console.

All right, sure, I suppose it is that time, makes sense that improvement is harder. Presumably it is easier if you want a quirkier thing. I do not know how the fine-tuning is protected against jailbreak attempts, anyone want to explain?

Custom models

For organizations that need even more customization than fine-tuning can provide (particularly applicable to domains with extremely large proprietary datasets—billions of tokens at minimum), we’re also launching a Custom Models program, giving selected organizations an opportunity to work with a dedicated group of OpenAI researchers to train custom GPT-4 to their specific domain. This includes modifying every step of the model training process, from doing additional domain specific pre-training, to running a custom RL post-training process tailored for the specific domain. Organizations will have exclusive access to their custom models. In keeping with our existing enterprise privacy policies, custom models will not be served to or shared with other customers or used to train other models. Also, proprietary data provided to OpenAI to train custom models will not be reused in any other context. This will be a very limited (and expensive) program to start—interested orgs can apply here.

Expensive is presumably the watchword. This will not be cheap. Then again, compared to the potential, could be very cheap indeed.

So far, so incremental, you love to see it, and… wait, what?

Today, we’re releasing the Assistants API, our first step towards helping developers build agent-like experiences within their own applications. An assistant is a purpose-built AI that has specific instructions, leverages extra knowledge, and can call models and tools to perform tasks. The new Assistants API provides new capabilities such as Code Interpreter and Retrieval as well as function calling to handle a lot of the heavy lifting that you previously had to do yourself and enable you to build high-quality AI apps.

This API is designed for flexibility; use cases range from a natural language-based data analysis app, a coding assistant, an AI-powered vacation planner, a voice-controlled DJ, a smart visual canvas—the list goes on. The Assistants API is built on the same capabilities that enable our new GPTs product: custom instructions and tools such as Code interpreter, Retrieval, and function calling.

A key change introduced by this API is persistent and infinitely long threads, which allow developers to hand off thread state management to OpenAI and work around context window constraints. With the Assistants API, you simply add each new message to an existing thread.

Assistants also have access to call new tools as needed, including:

  • Code Interpreter: writes and runs Python code in a sandboxed execution environment, and can generate graphs and charts, and process files with diverse data and formatting. It allows your assistants to run code iteratively to solve challenging code and math problems, and more.

  • Retrieval: augments the assistant with knowledge from outside our models, such as proprietary domain data, product information or documents provided by your users. This means you don’t need to compute and store embeddings for your documents, or implement chunking and search algorithms. The Assistants API optimizes what retrieval technique to use based on our experience building knowledge retrieval in ChatGPT.

  • Function calling: enables assistants to invoke functions you define and incorporate the function response in their messages.

As with the rest of the platform, data and files passed to the OpenAI API are never used to train our models and developers can delete the data when they see fit.

You can try the Assistants API beta without writing any code by heading to the Assistants playground.

It has been months since I wrote On AutoGPT and everyone was excited. All the hype around agents died off and everyone seemed to despair of making them work in the current model generation. OpenAI had the inside track in many ways, so perhaps they made it work a lot better? We will find out. If you’re not a little nervous, that seems like a mistake.

All right, what’s up with these ‘GPTs’?

First off, horrible name, highly confusing, please fix. Alas, they won’t.

All right, what do we got?

Ah, one of the obvious things we should obviously do that will open up tons of possibilities, I feel bad I didn’t say it explicitly and get zero credit but we were all thinking it.

We’re rolling out custom versions of ChatGPT that you can create for a specific purpose—called GPTs. GPTs are a new way for anyone to create a tailored version of ChatGPT to be more helpful in their daily life, at specific tasks, at work, or at home—and then share that creation with others. For example, GPTs can help you learn the rules to any board game, help teach your kids math, or design stickers.

Anyone can easily build their own GPT—no coding is required. You can make them for yourself, just for your company’s internal use, or for everyone. Creating one is as easy as starting a conversation, giving it instructions and extra knowledge, and picking what it can do, like searching the web, making images or analyzing data.

Example GPTs are available today for ChatGPT Plus and Enterprise users to try out including Canva and Zapier AI Actions. We plan to offer GPTs to more users soon.

There will be an app GPT store, your privacy is safe, if you are feeling frisky you can connect your APIs and then perhaps noting is safe. This is a minute-long demo of a Puppy Hotline, which is an odd example since I’m not sure why all of that shouldn’t work normally anyway.

An incrementally better example is Sam Altman’s creation of the Startup Mentor, which he has grill the user on why they are not growing faster. Again, this is very much functionally a configuration of an LLM, a GPT, rather than an agent. It might include some if-then statements, perhaps. Which is all good, these are things we want and don’t seem dangerous.

Tyler Cowen’s GOAT is another example. Authors can upload a book plus some instructions, suddenly you can chat with the book, takes minutes to hours of your time.

The educational possibilities alone write themselves. The process is so fast you can create one of these daily for a child’s homework assignment, or create one for yourself to spend an hour learning about something.

One experiment I want someone to run is to try using this to teach someone a foreign language. Consider Scott Alexander’s proposed experiment, where you start with English, and then gradually move over to Japanese grammar and vocabulary over time. Now consider doing that with a GPT, as you do what you were doing anyway, and where you can pause and ask if anything is ever confusing, and you can reply back in a hybrid as well.

The right way to use ChatGPT going forward might be to follow the programmer’s maxim that if you do it three times, you should automate it, except now the threshold might be two and if something is nontrivial it also might be one. You can use others’ versions, but there is a lot to be said for rolling one’s own if the process is easy. If it works well enough, of course. But if so, game changer.

Also goes without saying that if you could combine this with removing the adult content filtering, especially if you still had image generation and audio but even without them, that would be a variety of products in very high demand.

Ethan Mollick sums up the initial state of GPTs this way:

  • Right now, GPTs are the easiest way of sharing structured prompts, which are programs, written in plain English (or another language), that can get the AI to do useful things. I discussed creating structured prompts last week, and all the same techniques apply, but the GPT system makes structured prompts more powerful and much easier to create, test, and share. I think this will help solve some of the most important AI use cases (how do I give people in my school, organization, or community access to a good AI tool?)

  • GPTs show a near future where AIs can really start to act as agents, since these GPTs have the ability to connect to other products and services, from your email to a shopping website, making it possible for AIs to do a wide range of tasks. So GPTs are a precursor of the next wave of AI.

  • They also suggest new future vulnerabilities and risks. As AIs are connected to more systems, and begin to act more autonomously, the chance of them being used maliciously increases.

The easy way to make a GPT is something called GPT Builder. In this mode, the AI helps you create a GPT through conversation. You can also test out the results in a window on the side of the interface and ask for live changes, creating a way to iterate and improve your work.

Behind the scenes, based on the conversation I had, the AI is filling out a detailed configuration of the GPT, which I can also edit manually.

To really build a great GPT, you are going to need to modify or build the structured prompt yourself.

As usual, reliability is not perfect, and mistakes are often silent, a warning not to presume or rely upon a GPT will properly absorb details.

The same thing is true here. The file reference system in the GPTs is immensely powerful, but is not flawless. For example, I fed in over 1,000 pages of rules across seven PDFs for an extremely complex game, and the AI was able to do a good job figuring out the rules, walking me through the process of getting started, and rolling dice to help me set up a character. Humans would have struggled to make all of that work. But it also made up a few details that weren’t in the game, and missed other points entirely. I had no warning that these mistakes happened, and would not have noticed them if I wasn’t cross-referencing the rules myself.

I am totally swamped right now. I am also rather excited to build once I get access.

Alas, for now, that continues to wait.

Sam Altman (November 8): usage of our new features from devday is far outpacing our expectations. we were planning to go live with GPTs for all subscribers Monday but still haven’t been able to. we are hoping to soon. there will likely be service instability in the short term due to load. sorry :/

Kevin Fischer: I like to imagine this is GPT coming alive.

Luckily that is obviously not what is happening, but for the record I do not like to imagine that, because I like remaining alive.

There are going to be a lot of cool things. Also a lot of things that aspire to be cool, that claim they will be cool, that are not cool.

Charles Frye: hope i am proven wrong in my fear that “GPTs” will 100x this tech’s reputation for vaporous demoware.

Vivek Ponnaiyan: It’ll be long tail dynamics like the apple App Store.

Agents are where claims of utility go to die. Even if some of it starts working, expect much death to continue.

From the presentation it seems they will be providing a copyright shield to ChatGPT users in case they get sued. This seems like a very SBF-style moment? It is a great idea, except when maybe just maybe it destroys your entire company, but that totally won’t happen right?

Will this kill a bunch of start-ups, the way Microsoft would by incorporating features into Windows? Yes. It is a good thing, the new way is better for everyone, time to go build something else. Should have planned ahead.

Laura Wendel: new toxic relationship just dropped

Brotzky: All the jokes about OpenAI killing startups with each new release have some validity.

We just removed Pinecone and Langchain from our codebase lowering our monthly fees and removing a lot of complexity.

New Assistants API is fantastic ✨

Downside: having to poll Runs endpoint for a result..

Some caveats

– our usecase is “simple” and assistants api fit us perfectly

– we don’t use agents yet

– we use files a lot Look forward to all this AI competition bringing costs down.

Sam Hogan: Just tested OpenAI’s new Assistant’s API.

This is now all the code you need to create a custom ChatGPT trained on an entire website.

Less than 30 lines 🤯

McKay Wrigley: I’m blown away by OpenAI DevDay… I can’t put into words how much the world just changed. This is a 1000x improvement. We’re living in the infancy of an AI revolution that will bring us a golden age beyond our wildest dreams. It’s time to build.

Interestingly, I had 100x typed out originally and changed it to 1000x. These are obviously not measurable, but it more accurately conveys how I feel. I don’t think people truly grasp (myself included!) what just got unlocked and what is about to happen.

Look, no, stop. This stuff is cool and all. I am super excited to play with GPTs and longer context windows and feature integration. Is it all three orders of magnitude over the previous situation? I mean, seriously, what are you even talking about? I know words do not have meaning, but what of the sacred trust that is numbers?

I suppose if you think of it as ‘10 times better’ meaning ‘good enough that you could use this edge to displace an incumbent service’ rather than ‘this is ten times as useful or valuable’ then yes if it they did their jobs this seems ten times better. Maybe even ten times better twice. But to multiply that together is at best to mix metaphors, and this does not plausibly constitute three consecutive such disruptions.

Unless these new agents actually work far better than anyone expects, in which case who knows. I will note that if so, that does not seem like especially… good… news, in a ‘I hope we are not all about to die’ kind of way.

It is also worth noting that this all means that when GPT-5 does arrive, there will be all this infrastructure waiting to go, that will suddenly get very interesting.

Paul Graham retweeted the above quote, and also this related one:

Paul Graham: This is not a random tech visionary. This is a CEO of an actual AI company. So when he says more has happened in the last year than the previous ten, it’s not just a figure of speech.

Alexander Wang (CEO Scale AI): more has happened in the last year of AI than the prior 10 we are unmistakably in the fiery takeoff of the most important technology of the rest of our lives everybody—governments, citizens, technologists—is awaiting w/bated breath (some helplessly) the next version of humanity.

Being the CEO of an AI company does not seem incompatible with what is traditionally referred to as ‘hype.’ When people in AI talk about factors of ten, one cannot automatically take it at face value, as we saw directly above.

Also, yes, ‘the next version of humanity’ sounds suspiciously like tech speak for ‘also we are all quite possibly going to die, but that is a good thing.’

Ben Thompson has covered a lot of keynote presentations, and found this to be an excellent keynote presentation, perhaps a sign they will make a comeback. While he found the new GPTs exciting, he focuses in terms of business impact where I initially focused, which was on the ordinary and seamless feature improvements.

Users get faster responses, a larger context window, more up to date knowledge, and better integration of modalities of web browsing, vision, hearing, speech and image generation. Quality of life improvements, getting rid of annoyances, filling in practical gaps that made a big marginal difference.

How good are the new context windows? Greg Kamradt reports.

Findings:

GPT-4’s recall performance started to degrade above 73K tokens

Low recall performance was correlated when the fact to be recalled was placed between at 7%-50% document depth

If the fact was at the beginning of the document, it was recalled regardless of context length

So what: No Guarantees – Your facts are not guaranteed to be retrieved. Don’t bake the assumption they will into your applications

Less context = more accuracy – This is well know, but when possible reduce the amount of context you send to GPT-4 to increase its ability to recall

Position matters – Also well know, but facts placed at the very beginning and 2nd half of the document seem to be recalled better.

It makes sense to allow 128k tokens or even more than that, even if performance degrades starting around 73k. For practical purposes, sounds like we want to stick to the smaller amount, but it is good not to have a lower hard cap.

Will Thompson be right that the base UI, essentially no UI at all, is where most users will want to remain and what matters most? Or will we all be using GPTs all the time?

In the short term he is clearly correct. The incremental improvements matter more. But as we learn to build GPTs for ourselves both quick and dirty and bespoke, and learn to use those of others, I expect there to be very large value adds, even if it is ‘you find the 1-3 that work for you and always use them.’

Kevin Fisher notes something I have found as well: LLMs can use web browsing in a pinch, but when you have the option you usually want to avoid this. ‘Do not use web browsing’ will sometimes be a good message to include. Kevin is most concerned about speed but I’ve also found other problems.

Nathan Lebenz suggested on a recent podcast that the killer integration is GPT-4V plus web browsing, allowing the LLM to browse the web and accomplish things. Here is vimGPT, a first attempt. Here are some more demos. We should give people time to see what they can come up with.

Currently conspicuously absent is the ability to make a GPT that selects between a set of available GPTs and then seamlessly calls the one most appropriate to your query. That would combine the functionality into the invisible ‘ultimate’ UI of a text box and attachment button, an expansion of seamless switching between existing base modalities. For now, one can presumably still do this ‘the hard way’ by calling other things that then call your GPTs.

On OpenAI Dev Day Read More »

openai:-facts-from-a-weekend

OpenAI: Facts from a Weekend

Approximately four GPTs and seven years ago, OpenAI’s founders brought forth on this corporate landscape a new entity, conceived in liberty, and dedicated to the proposition that all men might live equally when AGI is created.

Now we are engaged in a great corporate war, testing whether that entity, or any entity so conceived and so dedicated, can long endure.

What matters is not theory but practice. What happens when the chips are down?

To a large extent, even more than usual, we do not know. We should not pretend that we know more than we do.

Rather than attempt to interpret here or barrage with an endless string of reactions and quotes, I will instead do my best to stick to a compilation of the key facts.

Here is OpenAI’s corporate structure, giving the board of the 501c3 the power to hire and fire the CEO. It is explicitly dedicated to its nonprofit mission, over and above any duties to shareholders of secondary entities. Investors were warned that there was zero obligation to ever turn a profit:

Here are the most noteworthy things we know happened, as best I can make out.

  • On Friday afternoon at 3: 28pm, the OpenAI board fired Sam Altman, appointing CTO Mira Murati as temporary CEO effective immediately. They did so over a Google Meet that did not include then-chairmen Greg Brockman.

  • Greg Brockman, Altman’s old friend and ally, was removed as chairman of the board but the board said he would stay on as President. In response, he quit.

  • The board told almost no one. Microsoft got one minute of warning.

  • Mira Murati is the only other person we know was told, which happened on Thursday night.

  • From the announcement by the board: “Mr. Altman’s departure follows a deliberative review process by the board, which concluded that he was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities. The board no longer has confidence in his ability to continue leading OpenAI.”

  • In a statement, the board of directors said: “OpenAI was deliberately structured to advance our mission: to ensure that artificial general intelligence benefits all humanity. The board remains fully committed to serving this mission. We are grateful for Sam’s many contributions to the founding and growth of OpenAI. At the same time, we believe new leadership is necessary as we move forward. As the leader of the company’s research, product, and safety functions, Mira is exceptionally qualified to step into the role of interim CEO. We have the utmost confidence in her ability to lead OpenAI during this transition period.”

  • OpenAI’s board of directors at this point: OpenAI chief scientist Ilya Sutskever, independent directors Quora CEO Adam D’Angelo, technology entrepreneur Tasha McCauley, and Georgetown Center for Security and Emerging Technology’s Helen Toner.

  • Usually a 501c3’s board must have a majority of people not employed by the company. Instead, OpenAI’s said that a majority did not have a stake in the company, due to Sam Altman having zero equity.

  • In response to many calling this a ‘board coup’: “You can call it this way,” Sutskever said about the coup allegation. “And I can understand why you chose this word, but I disagree with this. This was the board doing its duty to the mission of the nonprofit, which is to make sure that OpenAI builds AGI that benefits all of humanity.” AGI stands for artificial general intelligence, a term that refers to software that can reason the way humans do.

    When Sutskever was asked whether “these backroom removals are a good way to govern the most important company in the world?” he answered: “I mean, fair, I agree that there is a not ideal element to it. 100%.”

  • Other than that, the board said nothing in public. I am willing to outright say that, whatever the original justifications, the removal attempt was insufficiently considered and planned and massively botched. Either they had good reasons that justified these actions and needed to share them, or they didn’t.

  • There had been various clashes between Altman and the board. We don’t know what all of them were. We do know the board felt Altman was moving too quickly, without sufficient concern for safety, with too much focus on building consumer products, while founding additional other companies. ChatGPT was a great consumer product, but supercharged AI development counter to OpenAI’s stated non-profit mission.

  • OpenAI was previously planning an oversubscribed share sale at a valuation of $86 billion that was to close a few weeks later.

  • Board member Adam D’Angelo said in a Forbes in January: There’s no outcome where this organization is one of the big five technology companies. This is something that’s fundamentally different, and my hope is that we can do a lot more good for the world than just become another corporation that gets that big.

  • Sam Altman on October 16: “4 times in the history of OpenAI––the most recent time was in the last couple of weeks––I’ve gotten to be in the room when we push the veil of ignorance back and the frontier of discovery forward. Getting to do that is the professional honor of a lifetime.” There was speculation that events were driven in whole or in part by secret capabilities gains within OpenAI, possibly from a system called Gobi, perhaps even related to the joking claim ‘AI has been achieved internally’ but we have no concrete evidence of that.

  • Ilya Sutskever co-leads the Superalignment Taskforce, has very short timelines for when we will get AGI, and is very concerned about AI existential risk.

  • Sam Altman was involved in starting multiple new major tech companies. He was looking to raise tens of billions from Saudis to start a chip company. He was in other discussions for an AI hardware company.

  • Sam Altman has stated time and again, including to Congress, that he takes existential risk from AI seriously. He was part of the creation of OpenAI’s corporate structure. He signed the CAIS letter. OpenAI spent six months on safety work before releasing GPT-4. He understands the stakes. One can question OpenAI’s track record on safety, many did including those who left to found Anthropic. But this was not a pure ‘doomer vs. accelerationist’ story.

  • Sam Altman is very good at power games such as fights for corporate control. Over the years he earned the loyalty of his employees, many of whom moved in lockstep, using strong strategic ambiguity. Hand very well played.

  • Essentially all of VC, tech, founder, financial Twitter united to condemn the board for firing Altman and for how they did it, as did many employees, calling upon Altman to either return to the company or start a new company and steal all the talent. The prevailing view online was that no matter its corporate structure, it was unacceptable to fire Altman, who had built the company, or to endanger OpenAI’s value by doing so. That it was good and right and necessary for employees, shareholders, partners and others to unite to take back control.

  • Talk in those circles is that this will completely discredit EA or ‘doomerism’ or any concerns over the safety of AI, forever. Yes, they say this every week, but this time it was several orders of magnitude louder and more credible. New York Times somehow gets this backwards. Whatever else this is, it’s a disaster.

  • By contrast, those concerned about existential risk, and some others, pointed out that the unique corporate structure of OpenAI was designed for exactly this situation. They also mostly noted that the board clearly handled decisions and communications terribly, but that there was much unknown, and tried to avoid jumping to conclusions.

  • Thus we are now answering the question: What is the law? Do we have law? Where does the power ultimately lie? Is it the charismatic leader that ultimately matters? Who you hire and your culture? Can a corporate structure help us, or do commercial interests and profit motives dominate in the end?

  • Great pressure was put upon the board to reinstate Altman. They were given two 5pm Pacific deadlines, on Saturday and Sunday, to resign. Microsoft’s aid, and that of its CEO Satya Nadella, was enlisted in this. We do not know what forms of leverage Microsoft did or did not bring to that table.

  • Sam Altman tweets ‘I love the openai team so much.’ Many at OpenAI respond with hearts, including Mira Murati.

  • Invited by employees including Mira Murati and other top executives, Sam Altman visited the OpenAI offices on Sunday. He tweeted ‘First and last time i ever wear one of these’ with a picture of his visitors pass.

  • The board does not appear to have been at the building at the time.

  • Press reported that the board had agreed to resign in principle, but that snags were hit over who the replacement board would be, and over whether or not they would need to issue a statement absolving Altman of wrongdoing, which could be legally perilous for them given their initial statement.

  • Bloomberg reported on Sunday 11: 16pm that temporary CEO Mira Murati aimed to rehire Altman and Brockman, while board sought alternative CEO.

  • OpenAI board hires former Twitch CEO Emmett Shear to be the new CEO. He issues his initial statement here. I know a bit about him. If the board needs to hire a new CEO from outside that takes existential risk seriously, he seems to me like a truly excellent pick, I cannot think of a clearly better one. The job set for him may or may not be impossible. Shear’s PPS in his note: PPS: “Before I took the job, I checked on the reasoning behind the change. The board did *notremove Sam over any specific disagreement on safety, their reasoning was completely different from that. I’m not crazy enough to take this job without board support for commercializing our awesome models.”

  • New CEO Emmett Shear has made statements in favor of slowing down AI development, although not a stop. His p(doom) is between 5% and 50%. He has said ‘My AI safety discourse is 100% “you are building an alien god that will literally destroy the world when it reaches the critical threshold but be apparently harmless before that.”’ Here is a thread and video link with more, transcript here or a captioned clip. Here he is tweeting a 2×2 faction chart a few days ago.

  • Microsoft CEO Satya Nadella posts 2: 53am Monday morning: We remain committed to our partnership with OpenAI and have confidence in our product roadmap, our ability to continue to innovate with everything we announced at Microsoft Ignite, and in continuing to support our customers and partners. We look forward to getting to know Emmett Shear and OAI’s new leadership team and working with them. And we’re extremely excited to share the news that Sam Altman and Greg Brockman, together with colleagues, will be joining Microsoft to lead a new advanced AI research team. We look forward to moving quickly to provide them with the resources needed for their success.

  • Sam Altman retweets the above with ‘the mission continues.’ Brockman confirms. Other leadership to include Jackub Pachocki the GPT-4 lead, Szymon Sidor and Aleksander Madry.

  • Nadella continued in reply: I’m super excited to have you join as CEO of this new group, Sam, setting a new pace for innovation. We’ve learned a lot over the years about how to give founders and innovators space to build independent identities and cultures within Microsoft, including GitHub, Mojang Studios, and LinkedIn, and I’m looking forward to having you do the same.

  • Ilya Sutskever posts 8: 15am Monday morning: I deeply regret my participation in the board’s actions. I never intended to harm OpenAI. I love everything we’ve built together and I will do everything I can to reunite the company. Sam retweets with three heart emojis. Jan Leike, the other head of the superalignment team, Tweeted that he worked through the weekend on the crisis, and that the board should resign.

  • Microsoft stock was down -1% after hours on Friday, was back to roughly its previous value on Monday morning and at the open. All priced in. Neither Google or S&P made major moves either.

  • 505 of 770 employees of OpenAI, including Ilya Sutskever, sign a letter telling the board to resign and reinstate Altman and Brockman (later claimed to be up to about 650), threatening to otherwise move to Microsoft to work in the new subsidiary under Altman, which will have a job for every OpenAI employee. Full text of the letter that was posted: To the Board of Directors at OpenAI,

    OpenAl is the world’s leading Al company. We, the employees of OpenAl, have developed the best models and pushed the field to new frontiers. Our work on Al safety and governance shapes global norms. The products we built are used by millions of people around the world. Until now, the company we work for and cherish has never been in a stronger position.

    The process through which you terminated Sam Altman and removed Greg Brockman from the board has jeopardized all of this work and undermined our mission and company. Your conduct has made it clear you did not have the competence to oversee OpenAI.

    When we all unexpectedly learned of your decision, the leadership team of OpenAl acted swiftly to stabilize the company. They carefully listened to your concerns and tried to cooperate with you on all grounds. Despite many requests for specific facts for your allegations, you have never provided any written evidence. They also increasingly realized you were not capable of carrying out your duties, and were negotiating in bad faith.

    The leadership team suggested that the most stabilizing path forward – the one that would best serve our mission, company, stakeholders, employees and the public – would be for you to resign and put in place a qualified board that could lead the company forward in stability. Leadership worked with you around the clock to find a mutually agreeable outcome. Yet within two days of your initial decision, you again replaced interim CEO Mira Murati against the best interests of the company. You also informed the leadership team that allowing the company to be destroyed “would be consistent with the mission.”

    Your actions have made it obvious that you are incapable of overseeing OpenAl. We are unable to work for or with people that lack competence, judgement and care for our mission and employees. We, the undersigned, may choose to resign from OpenAl and join the newly announced Microsoft subsidiary run by Sam Altman and Greg Brockman. Microsoft has assured us that there are positions for all OpenAl employees at this new subsidiary should we choose to join. We will take this step imminently, unless all current board members resign, and the board appoints two new lead independent directors, such as Bret Taylor and Will Hurd, and reinstates Sam Altman and Greg Brockman.

    1. Mira Murati

    2. Brad Lightcap

    3. Jason Kwon

    4. Wojciech Zaremba

    5. Alec Radford

    6. Anna Makanju

    7. Bob McGrew

    8. Srinivas Narayanan

    9. Che Chang

    10. Lillian Weng

    11. Mark Chen

    12. Ilya Sutskever

  • There is talk that OpenAI might completely disintegrate as a result, that ChatGPT might not work a few days from now, and so on.

  • It is very much not over, and still developing.

  • There is still a ton we do not know.

  • This weekend was super stressful for everyone. Most of us, myself included, sincerely wish none of this had happened. Based on what we know, there are no villains in the actual story that matters here. Only people trying their best under highly stressful circumstances with huge stakes and wildly different information and different models of the world and what will lead to good outcomes. In short, to all who were in the arena for this on any side, or trying to process it, rather than spitting bile: ❤️.

  • Later, when we know more, I will have many other things to say, many reactions to quote and react to. For now, everyone please do the best you can to stay sane and help the world get through this as best you can.

    OpenAI: Facts from a Weekend Read More »

    openai:-the-battle-of-the-board

    OpenAI: The Battle of the Board

    Discover more from Don’t Worry About the Vase

    A world made of gears. Doing both speed premium short term updates and long term world model building. Currently focused on weekly AI updates. Explorations include AI, policy, rationality, medicine and fertility, education and games.

    Over 10,000 subscribers

    Previously: OpenAI: Facts from a Weekend.

    On Friday afternoon, OpenAI’s board fired CEO Sam Altman.

    Overnight, an agreement in principle was reached to reinstate Sam Altman as CEO of OpenAI, with an initial new board of Bret Taylor (ex-co-CEO of Salesforce, chair), Larry Summers and Adam D’Angelo.

    What happened? Why did it happen? How will it ultimately end? The fight is far from over. 

    We do not entirely know, but we know a lot more than we did a few days ago.

    This is my attempt to put the pieces together.

    This was and still is a fight about control of OpenAI, its board, and its direction.

    This has been a long simmering battle and debate. The stakes are high.

    Until recently, Sam Altman worked to reshape the company in his own image, while clashing with the board, and the board did little.

    While I must emphasize we do not know what motivated the board, a recent power move by Altman likely played a part in forcing the board’s hand.

    The structure of OpenAI and its board put control in doubt.

    Here is a diagram of OpenAI’s structure:

    Here is OpenAI’s mission statement, the link has intended implementation details as well:

    This document reflects the strategy we’ve refined over the past two years, including feedback from many people internal and external to OpenAI. The timeline to AGI remains uncertain, but our Charter will guide us in acting in the best interests of humanity throughout its development.

    OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity. We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.

    OpenAI warned investors that they might not make any money:

    The way a 501(c)3 works is essentially that the board is answerable to no one. If you have a majority of the board for one meeting, you can take full control of the board.

    But does the board have power? Sort of. It has a supervisory role, which means it can hire or fire the CEO. Often the board uses this leverage to effectively be in charge of major decisions. Other times, the CEO effectively controls the board and the CEO does what he wants.

    A critical flaw is that firing (and hiring) the CEO, and choosing the composition of a new board, is the board’s only real power.

    The board only has one move. It can fire the CEO or not fire the CEO. Firing the CEO is a major escalation that risks disruption. But escalation and disruption have costs, reputational and financial. Knowing this, the CEO can and often does take action to make them painful to fire, or calculates that the board would not dare.

    While his ultimate goals for OpenAI are far grander, Sam Altman wants OpenAI for now to mostly function as an ordinary Big Tech company in partnership with Microsoft. He wants to build and ship, to move fast and break things. He wants to embark on new business ventures to remove bottlenecks and get equity in the new ventures, including planning a Saudi-funded chip factory in the UAE and starting an AI hardware project. He lobbies in accordance with his business interests, and puts a combination of his personal power, valuation and funding rounds, shareholders and customers first.

    To that end, over the course of years, he has remade the company culture through addition and subtraction, hiring those who believe in this mission and who would be personally loyal to him. He has likely structured the company to give him free rein and hide his actions from the board and others. Normal CEO did normal CEO things.

    Altman is very good, Paul Graham says best in the world, at becoming powerful and playing power games. That, and scaling tech startups, are core to his nature. One assumes that ‘not being fully candid’ and other strategic action was part of this.

    Sam Altman’s intermediate goal has been, from the beginning, full personal control of OpenAI, and thus control over the construction of AGI. Power always wants more power. I can’t fault him for rational instrumental convergence in his goals. The ultimate goal is to build ‘safe’ AGI.

    That does not mean that Sam Altman does not believe in the necessity of ensuring that AI is safe. Altman understands this. I do not think he understands how hard it will be or the full difficulties that lie ahead, but he understands such questions better than most. He signed the CAIS letter. He testified frankly before Congress. Unlike many who defended him, Altman understands that AGI will be truly transformational, and he does not want humans to lose control over the future. He does not mean the effect on jobs.

    To be clear, I do think that Altman sincerely believes that his way is best for everyone.

    Right before Altman was fired, Altman had firm control over two board seats. One was his outright. Another belonged to Greg Brockman.

    That left four other board members.

    Helen Toner, Adam D’Angelo and Tasha McCauley had a very different perspective on the purpose of OpenAI and what was good for the world.

    They did not want OpenAI to be a big tech company. They do not want OpenAI to move as quickly as possible to train and deploy ever-more-capable frontier models, and sculpt them into maximally successful consumer products. They acknowledge the need for commercialization in order to raise funds, and I presume that such products can provide great value for people and that this is good.

    They want a more cautious approach, that avoids unnecessarily creating or furthering race dynamics with other labs or driving surges of investment like we saw after ChatGPT, that takes necessary precautions at each step. And they want to ensure that the necessary controls are in place, including government controls, for when the time comes that AGI is on the line, to ensure we can train and deploy it safely.

    Adam D’Angelo said the whole point was not to let OpenAI become a big tech company. Helen Toner is a strong advocate for policy action to guard against existential risk. I presume from what we know Tasha McCauley is in the same camp.

    Ilya Sutskever loves OpenAI and its people, and the promise of building safe AGI. He had reportedly become increasingly concerned that timelines until AGI could be remarkably short. He was also reportedly concerned Altman was moving too fast and was insufficiently concerned with the risks. He may or may not have been privy to additional information about still-undeployed capabilities advances.

    Reports are Ilya’s takes on alignment have been epistemically improving steadily. He is co-leading the Superalignment Taskforce seeking to figure out how to align future superintelligent AI. I am not confident in what alignment takes I have heard from members of the taskforce, but Ilya is an iterator, and my hope is that timelines are not as short as he thinks and Ilya, Jan Leike and his team can figure it out before the chips have to be down.

    Ilya later reversed course, after the rest of the board fully lost control of the narrative and employees, and the situation threatened to tear OpenAI apart.

    Altman and the board were repeatedly clashing. Altman continued to consolidate his power, confident that Ilya would not back an attempt to fire him. But it was tense. It would be much better to have a board more clearly loyal to Altman, more on board with the commercial mission.

    Then Altman saw an opportunity.

    In October, board member Helen Toner, together with Andrew Imbrie and Owen Daniels, published the paper Decoding Intentions: Artificial Intelligence and Costly Signals.

    The paper correctly points out that while OpenAI engages in costly signaling and takes steps to ensure safety, Anthropic does more costly signaling and takes more steps to ensure safety, and puts more emphasis on communicating this message. That is not something anyone could reasonably disagree with. The paper also notes that others have criticized OpenAI, and says OpenAI could and perhaps should do more. The biggest criticism in the paper is that it asserts that ChatGPT set off an arms race, with Anthropic’s Claude only following afterwards. This is very clearly true. OpenAI didn’t expect ChatGPT to take off like it did, but in practice ChatGPT definitely set off an arms race. To the extent it is a rebuke, it is stating facts. 

    However, the paper was sufficiently obscure that, if I saw it at all, I don’t remember it in the slightest. It is a trifle.

    Altman strongly rebuked Helen Toner for the paper, according to the New York Times.

    In the email, Mr. Altman said that he had reprimanded Ms. Toner for the paper and that it was dangerous to the company, particularly at a time, he added, when the Federal Trade Commission was investigating OpenAI over the data used to build its technology.

    Ms. Toner defended it as an academic paper that analyzed the challenges that the public faces when trying to understand the intentions of the countries and companies developing A.I. But Mr. Altman disagreed.

    “I did not feel we’re on the same page on the damage of all this,” he wrote in the email. “Any amount of criticism from a board member carries a lot of weight.”

    Senior OpenAI leaders, including Mr. Sutskever, who is deeply concerned that A.I. could one day destroy humanity, later discussed whether Ms. Toner should be removed, a person involved in the conversations said.

    From my perspective, even rebuking Toner here is quite bad. It is completely inconsistent with the nonprofit’s mission to not allow debate and disagreement and criticism. I do not agree with Altman’s view that Toner’s paper ‘carried a lot of weight,’ and I question whether Altman believed it either. But even if the paper did carry weight, we are not going to get through this crucial period if we cannot speak openly. Altman’s reference to the FTC investigation is a non-sequitur given the content of the paper as far as I can tell.

    Sam Altman then attempted to use this (potentially manufactured) drama to get Toner removed from the board. He used a similar tactic at Reddit, a manufactured crisis to force others to give up power. Once Toner was gone, presumably Altman would have moved to reshape the rest of the board.

    The board had a choice.

    If Ilya was willing to cooperate, the board could fire Altman, with the Thanksgiving break available to aid the transition, and hope for the best.

    Alternatively, the board could choose once again not to fire Altman, watch as Altman finished taking control of OpenAI and turned it into a personal empire, and hope this turns out well for the world.

    They chose to pull the trigger.

    We do not know what the board knew, or what combination of factors ultimately drove their decision. The board made a strategic decision not to explain their reasoning or justifications in detail during this dispute.

    What do we know?

    The board felt it at least had many small data points saying it could not trust Altman, in combination with Altman’s known power-seeking moves elsewhere (e.g. what happened at Reddit), and also that Altman was taking many actions that the board might reasonably see as in direct conflict with the mission.

    Why has the board failed to provide details on deception? Presumably because without one clear smoking gun, any explanations would be seen as weak sauce. All CEOs do some amount of manipulation and politics and withholding information. When you give ten examples, people often then judge on the strength of the weakest one rather than adding them up. Providing details might also burn bridges and expose legal concerns, make reconciliations and business harder. There is still much we do not know about what we do not know.

    What about concrete actions?

    Altman was raising tens of billions from Saudis, to start a chip company to rival Nvidia, which was to produce its chips in the UAE, leveraging the fundraising for OpenAI in the process. For various reasons this is kind of a ‘wtf’ move. 

    Altman was also looking to start an AI hardware device company. Which in principle seems good and fine, mundane utility, but as the CEO of OpenAI with zero equity the conflict is obvious.

    Altman increasingly focused on shipping products the way you would if you were an exceptional tech startup founder, and hired and rebuilt the culture in that image. Anthropic visibly built a culture of concern about safety in a way that OpenAI did not.

    Concerns about safety (at least in part) led to the Anthropic exodus after the board declined to remove Altman at that time. If you think the exit was justified, Altman wasn’t adhering to the charter. If you think it wasn’t, Altman created a rival and intensified arms race dynamics, which is a huge failure.

    This article claims both OpenAI and Microsoft were central in lobbying to take any meaningful requirements for foundation models out of the EU’s AI Act. If I was a board member, I would see this as incompatible with the OpenAI charter.

    Altman aggressively cut prices in ways that prioritized growth and made OpenAI much more dependent on Microsoft and further fueling the boom in AI development. Other deals and arrangements with Microsoft deepened the dependence, while making a threatened move to Microsoft more credible.

    He also offered a legal shield to users on copyright infringement, potentially endangering the company. It seems reasonable to assume he moved to expand OpenAI’s offerings on Dev Day in the face of safety concerns. Various attack vectors seem fully exposed.

    And he reprimanded and moved to remove Helen Toner from the board for writing a standard academic paper exploring how to pursue AI safety. To me this is indeed a smoking gun, although I understand why they did not expect others to see it that way.

    From the perspective of the public, or of winning over hearts and minds inside or outside the company, the board utterly failed in its communications. Rather than explain, the board issued its statement:

    Mr. Altman’s departure follows a deliberative review process by the board, which concluded that he was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities. The board no longer has confidence in his ability to continue leading OpenAI.

    Then it went silent. The few times it offered any explanation at all, the examples were so terrible it was worse than staying silent.

    As the previous section shows, the board had many reasonable explanations it could have given. Would that have mattered? Could they have won over enough of the employees for this to have turned out differently? We will never know.

    Even Emmett Shear, the board’s third pick for CEO (after Nat Friedman and Alex Wang), couldn’t get a full explanation. His threat to walk away without one was a lot of what enabled the ultimate compromise on a new board and bringing Altman back.

    By contrast, based on what we know, Altman played the communication and political games masterfully. Three times he coordinated the employees to his side in ways that did not commit anyone to anything. The message his team wanted to send about what was going on consistently was the one that got reported. He imposed several deadlines, the deadlines passed and no credibility was lost. He threatened joining Microsoft and taking all the employees, without ever agreeing to anything there either.

    Really great show. Do you want your leader to have those skills? That depends.

    Note the symmetry. Both sides were credibly willing to blow up OpenAI rather than give board control, and with it ultimate control of OpenAI, to the other party. Both sides were willing to let Altman return as CEO only under the right conditions.

    Altman moved to take control. Once the board pulled the trigger firing him in response, Altman had a choice on what to do next, even if we all knew what choice he would make. If Altman wanted OpenAI to thrive without him, he could have made that happen. Finding investors would not have been an issue for OpenAI, or for Altman’s new ventures whatever they might be.

    Instead, as everyone rightfully assumed he would, he chose to fight, clearly willing to destroy OpenAI if not put back in charge. He and the employees demanded the board resign, offering unconditional surrender. Or else they would all move to Microsoft.

    That is a very strong negotiating position.

    You know what else is a very strong negotiating position? Believing that OpenAI, if you give full control over it to Altman, would be a net negative for the company’s mission.

    This was misrepresented to be ‘Helen Toner thinks that destroying OpenAI would accomplish its mission of building safe AGI.’ Not so. She was expressing, highly reasonably, the perspective that if the only other option was Altman with full board control, determined to use his status and position to blitz forward on AI on all fronts, not taking safety sufficiently seriously, that might well be worse for OpenAI’s mission than OpenAI ceasing to exist in its current form.

    Thus, a standoff. A negotiation. Unless both the board and Sam Altman agree that OpenAI survives, OpenAI does not survive. They had to agree on a new governance framework. That means a new board.

    Which in the end was Bret Taylor, Larry Summers and Adam D’Angelo.

    Emmett Shear is saying mission accomplished, passes the baton, seeing this result as the least bad option including for safety. This is strong evidence it was the least bad option.

    Altman, Brockman and the employees are declaring victory. They are so back.

    Mike Solana summarizes as ‘Altman knifed the board.’

    Not so fast.

    This is not over.

    Altman very much did not get an obviously controlled board.

    The succession problem is everything.

    What will the new board do? What full board will they select? What will they find when they investigate further? 

    These are three ‘adults in the room’ to be sure. But D’Angelo already voted to fire Altman once, and Summers is a well-known bullet-biter and is associated with Effective Altruism.  

    If you assume that Altman was always in the right, everyone knows it is his company to run as he wants to maximize profits, and any sane adult would side with him? Then you assume Bret Taylor and Larry Summers will conclude that as well. 

    If you do not assume that, if you assume OpenAI is a non-profit with a charter? If you see many of Altman’s actions and instincts as in conflict with that charter? If you think there might be a lot of real problems here, including things we do not know? If you think that this new board could lead to a new expanded board that could serve as a proper check on Altman, without the lack of gravitas and experience that plagued the previous board, and with a fresh start on employee relations?

    If you think OpenAI could be a great thing for the world, or its end, depending on choices we make?

    Then this is far from over. 

    OpenAI: The Battle of the Board Read More »

    ai-#39:-the-week-of-openai

    AI #39: The Week of OpenAI

    The board firing Sam Altman, then reinstating him, dominated everything else this week. Other stuff also happened, but definitely focus on that first.

    Developments at OpenAI were far more important than everything else this read. So you can read this timeline of events over the weekend, and this attempt to put all the information together.

    1. Introduction.

    2. Table of Contents.

    3. Language Models Offer Mundane Utility. Narrate your life, as you do all life.

    4. Language Models Don’t Offer Mundane Utility. Prompt injection unsolved.

    5. The Q Continuum. Disputed claims about new training techniques.

    6. OpenAI: The Saga Continues. The story is far from over.

    7. Altman Could Step Up. He understands existential risk. Now he can act.

    8. You Thought This Week Was Tough. It is not getting any easier.

    9. Fun With Image Generation. A few seconds of an Emu.

    10. Deepfaketown and Botpocalypse Soon. Beware phone requests for money.

    11. They Took Our Jobs. Freelancers in some areas are in trouble.

    12. Get Involved. Dave Orr hiring for DeepMind alignment team.

    13. Introducing. Claude 2.1 looks like a substantial incremental improvement.

    14. In Other AI News. Meta breaks up ‘responsible AI’ team. Microsoft invests $50b.

    15. Quiet Speculations. Will deep learning hit a wall?

    16. The Quest for Sane Regulation. EU AI Act struggles, FTC AI definition is nuts.

    17. That Is Not What Totalitarianism Means. People need to cut that claim out.

    18. The Week in Audio. Sam Altman, Yoshua Bengio, Davidad, Ilya Sutskever.

    19. Rhetorical Innovation. David Sacks says it best this week.

    20. Aligning a Smarter Than Human Intelligence is Difficult. Technical debates.

    21. People Are Worried About AI Killing Everyone. Roon fully now in this section.

    22. Other People Are Not As Worried About AI Killing Everyone. Listen to them.

    23. The Lighter Side. Yes, of course I am, but do you even hear yourself?

    GPT-4-Turbo substantially outperforms GPT-4 on Arena leaderboard. GPT-3.5-Turbo is still ahead of every model not from either OpenAI or Anthropic. Claude-1 outscores Claude-2 and is very close to old GPT-4 for second place, which is weird.

    Own too much cryptocurrency? Ian built a GPT that can ‘bank itself using blockchains.’

    Paper says AI pancreatic cancer detection finally outperforming expert radiologists. This is the one we keep expecting that keeps not happening.

    David Attenborough narrates your life how-to guide, using Eleven Labs and GPT-4V. Code here. Good pick. Not my top favorite, but very good pick.

    Another good pick, Larry David as productivity coach.

    Oh no.

    Kai Greshake: PSA: The US Military is actively testing and deploying LLMs to the battlefield. I think these systems are likely to be vulnerable to indirect prompt injection by adversaries. I’ll lay out the story in this thread.

    This is http://Scale.ai’s Donovan model. Basically, they let an LLM see and search through all of your military data (assets and threat intelligence) and then it tells you what you should do..

    Now, it turns out to be really useful if you let the model see news and public information as well. This is called open-source intelligence or OSINT. In this screenshot, you can see them load “news and press reports” from the target area that the *adversarycan publish!

    We’ve shown many times that if an attacker can inject text into your model, you get to “reprogram” it with natural language. Imagine hiding & manipulating information that is presented to the operators and then having your little adversarial minion tell them where to strike.

    Unfortunately the goal here is to shorten the time to a decision, so cross-checking everything is impossible, and they are not afraid to talk about the intentions. There will be a “human in the loop”, but that human would get their information from the attacker’s minion!

    @alexlevinson (head of security at scale) responded to me, saying these are “potential vulnerabilities inherent to AI systems, […] do not automatically translate into specific vulnerabilities within individual AI systems”

    And that “Each AI system […] is designed with unique security measures that may or may not be susceptible to the vulnerabilities you’ve identified”.

    Now, I’ve not had any access to Donovan, and am only judging based on the publicly available information and my expertise. I hope everyone can judge for themselves whether they trust Scale to have found a secret fix to this issue that gives them confidence to deploy.. or not.

    Yeah, this is, what’s the term, completely insane? Not in a ‘you are going to wake up Skynet’ way, although it certainly is not helping on such matters but in a ‘you are going to get prompt injected and otherwise attacked by your enemies’ way.

    This does not even get into the ways in which such a system might, for example, be used to generate leaks of classified documents and other intel.

    You can hook the LLM up to the weapons and your classified data. You can hook the LLM up to outside data sources. You cannot responsibly or safely do both. Pick one.

    Robin Hanson survey finds majority see not too much productivity enhancement in software yet.

    Robin Hanson: Median estimate of ~7% cut in software time/cost over last 10 years (ignoring LLMs), ~4% recently due to LLMs. But high variance of estimates.

    Robin Debreuil: Lots of experience, and I’m 100% sure it’s already less than 90. Also a lot of this saving are on the front end of development (finding and integrating technologies, testing, etc), so prototyping ideas much faster. Quality will improve dramatically too, but hasn’t so much yet.

    I agree with Debreuil for most projects. What is less obvious is if this yet applies to the most expensive and valuable ones, where the results need to be rather bulletproof. My presumption is that it still does. I know my productivity coding is up dramatically.

    There was a disputed claim in Reuters that prior to Altman’s ouster, the board was given notice by several researchers of alarming progress in tests of a new algorithmic technique called Q*, and that this contributed to Altman’s ouster. Q refers to a known type of RL algorithm, which it makes sense for OpenAI to have been working on.

    The reported results are not themselves scary, but could point to scary potential later. If Altman had not shared the results with the board, that could be part of ‘not consistently candid.’ However, this story has been explicitly denied in Verge, with their editor Alex Heath saying multiple sources claimed it wasn’t true, and my prediction market has the story as 29% to be substantively true even offering ‘some give.’ This other market says 40% that Qis ‘a significant capabilities advance.’

    For now I will wait for more info. Expect follow-up later.

    Sam Altman was fired from OpenAI. Now he’s back. For details, see my two posts on the subject, OpenAI: Facts from a Weekend and OpenAI: Battle of the Boards.

    The super short version is that Altman gave the board various reasons to fire him that we know about and was seeking to consolidate power, the board fired Altman essentially without explanation, Altman rallied investors especially Microsoft and 97% of the employees, he threatened to have everyone leave and join Microsoft, and the board agreed to resign in favor of a new negotiated board and bring Altman back.

    What happens next depends on the full board chosen and who functionally controls it. The new temporary board is Brad Taylor, Larry Summers and Adam D’Angelo. The final board will have nine members, one from Microsoft at least as an observer, and presumably Altman will eventually return. That still leaves room for any number of outcomes. If they create a new board that cares about safety enough to make a stand and can serve as a check on Altman, that is a very different result than if the board ends up essentially under Altman’s control, or as a traditional board of CEOs who unlike D’Angelo prioritize investors and profits over humanity not dying. We shall see.

    The loud Twitter statements continue to be that this was a total victory for Altman and for the conducting of ordinary profit-maximizing VC-SV-style business. Or that there is no other way any threat to corporate profits could ever end. That is all in large part a deliberate attempt at manifestation and self-fulfilling declaration. Power as the magician’s trick, residing where people believe it resides.

    Things could turn out that way, but do not confuse such power plays with reality. We do not yet know what the ultimate outcome will be.

    Nor was Altman’s reinstatement inevitable. Imagine a world in which the board, instead of remaining silent, made its case, and also brought in credible additional board members while firing Altman (let’s say Taylor, Summers, Shear and Mira Murati), and also did this after the stock sale to employees had finished. I bet that goes considerably differently.

    Recommended: A timeline history of the OpenAI board. At one point Altman and Brockman were two of four board members. The board has expanded and contracted many times. No one seems to have taken the board sufficiently seriously during this whole time as a permanent ultimate authority that controls its own succession. How were things allowed to get to this point, from all parties?

    Recommended: Nathan Lebenz offers a thread about his experiences red teaming GPT-4. Those at OpenAI did not realize what they had, they were too used to worrying about shortcomings to see how good their new model was. Despite the willingness to wait a long time before deployment, he finds the efforts unguided, most involved checked out, a fully inadequate process. Meanwhile for months the board was given no access to GPT-4, and when Lebenz went to the board attacks on Lebenz’s character were used to silence him.

    At the ‘we’re so back’ party at OpenAI, there was a fire alarm triggered by a smoke machine, causing two fire trucks to show up. All future fire alarms are hereby discredited, as are reality’s hack writers. Do better.

    Bloomberg gives additional details on Wednesday afternoon. There will be an independent investigation into the whole incident.

    A thing Larry Summers once said that seems relevant, from Elizabeth Warren:

    He teed it up this way: I had a choice. I could be an insider or I could be an outsider. Outsiders can say whatever they want. But people on the inside don’t listen to them. Insiders, however, get lots of access and a chance to push their ideas. People – powerful people – listen to what they have to say. But insiders also understand one unbreakable rule: They don’t criticize other insiders.

    I had been warned.

    Stuart Buck interprets this as siding with Altman’s criticism of Toner.

    The other implication in context would be that Altman is this form of insider. Which would mean that he will not listen to anyone who criticizes an insider. Which would mean he will not listen to most meaningful criticism. I like to think that instead what we saw was that Altman is willing to use such principles as weapons.

    My actual understanding of the insider rule is not that insiders will never listen to outside criticism. It is that they do not feel obligated or bound by it, and can choose to ignore it. They can also choose to listen.

    A key question is whether this is Summers endorsing this rule, or whether it is, as I would hope, Summers observing that the rule exists, and providing clarity. The second insider rule is that you do not talk about the insider rules.

    Also on Summers, Bloomberg notes he expects AI to come for white collar jobs. In a worrying sign, he has expressed concern about America ‘losing its lead’ to China. What a world in which our fate largely rests on the world models of Larry Summers.

    Parmy Olson writes in Bloomberg that the previous setup of OpenAI was good for humanity, but bad for Microsoft, that the new board will be traditional and current members scream ‘safe hands’ to investors. And that Microsoft benefits by keeping the new tech at arms length to allow OpenAI to move faster.

    Rob Bensinger asks, if Toner’s statements about OpenAI shutting down potentially being consistent with its mission are considered crazy by all employees, what does that say about potential future actions in a dangerous future?

    Cate Hall reminds us that from the perspective of someone who thinks OpenAI is not otherwise a good thing, those board seats came with a very high price. If the new board proves to not be a check on Altman, and instead the week backfires, years of strategy by those with certain large purse strings made no sense.

    Claim that Altman at his startup Loopt had his friends show up during a negotiation and pretend to be employees working on other major deals to give a false impression. As poster notes, this is Altman being resourceful in the way successful start-up founders are. It is also a classic con artist move, and not the sign of someone worthy of trust.

    After the deal was in place, Vinod Khosla said the nonprofit control system is fine, look at companies like IKEA. Does he not understand the difference?

    Fun claim on Reddit by ‘OpenAIofThrones’ (without private knowledge) of a more specific, more extreme version of what I outline in Battle of the Board. That Altman tried to convene the board without Toner to expel her, Ilya balked, that presented both the means and a short window to fire Altman before Ilya changed his mind, and that ultimately Altman blinked and agreed to real supervision.

    Whatever else happens, we can all set aside our differences to point out the utter failure of The New York Times to understand what happened.

    Garry Tan: NYT just going with straight ad hominem with no facts on the front page these days Putting the “capitalists” in the crosshairs with the tuxedo photos is some high class real propaganda.

    Paul Graham: OpenAI’s leaders, employees, and code are all about to migrate to Microsoft. Strenuous efforts enable them to remain working for a nonprofit instead. New York Times reaction: “A.I. Belongs to the Capitalists Now.”

    That is very much not how any of this works.

    On the narrow issue below, performative people like to gloat. But Helen Toner is right, and Greg Brockman is wrong.

    It is very much an underdog, but one sincere hope I have is Nixon Goes to China.

    Altman now has the loyalty of his team, a clear ability to shut down what he helped build if crossed, and unwavering faith of investors who trust him to find a way. No one can say they do not ship. OpenAI retains a rather large lead in the AI race. The e/acc crowd has rallied its flags, and was always more into being against those against things rather than anything else.

    If Altman and his team really do care deeply about the safe part of building safe AGI, he now has the opportunity to do the funniest, and also the best, possible thing.

    Rather than navigating a conflict between ‘EA’ and ‘e/acc,’ or between worried and unworried, ‘doomer’ and (boomer?), he now has the credibility to say that it is time to make serious costly commitments and investments in the name of ensuring that safe AGI is actually safe.

    Not because he was forced into it – surely we all know that any such secret promises the old board might have extracted are worth exactly nothing. Not to placate factions or board members. He can do it because he knows it is the right thing to do, and he now is in a position to do it without endangering his power.

    That is the thing about attempting to align something more capable than you that will pursue power due to instrumental convergence. The early steps look the same whether or not you succeeded. You only find out at the end whether or not the result was compatible with humans.

    Ilya Sutskever tried to put the breaks on AI development and remove what he saw at the time as a reckless CEO, from an organization explicitly dedicated to safety. Or at least, that’s the story everyone believed on the outside.

    What happened? An avalanche of pressure from all sides. This despite no attempt to turn off any existing systems.

    Ask yourself: What would happen if AI was integrated into the economy, or even was a useful tool everyone liked, and it suddenly became necessary to turn it off?

    Never mind whether we could. Suppose we could, and also that we should. Would we?

    Would anyone even dare try?

    Chris Maddison: The wrath that @ilyasut is facing is just a prelude of the wrath that will be faced by anyone who tries to “unplug” an unaligned, supervaluable AI. This weekend has not been encouraging from an AI safety perspective.

    David Rein: This is an extremely important and underrated point. Once AI systems are deeply integrated into the economy, there’s ~0% chance we will be able to just “turn them off”, even if they start acting against our interests.

    Meta introduces Emu Video and Emu Edit. Edit means that if you get part of what you want, you can keep it and build upon it. Video is a few seconds of video. I have yet to see any useful applications of a few seconds of video that is essentially ‘things drift in a direction’ but someday, right?

    Report of scam deepfake calls hitting right now, asking for bail money.

    Distinct first hand report of that same scam, asking for bail money.

    As a general rule, scammers are profoundly uncreative. The reason the Nigerian scam is still the Nigerian scam is that if you recognize the term ‘Nigerian prince’ as a scam then you were not about to fall for an ‘Angolan principle’ either.

    So for now, while a code word is still a fine idea, you can get most of the way there with ‘any request for bail money or a suspiciously low-random-amount kidnapping is highly suspicious, probably a scam.’

    An even easier, more robust rule suggests itself.

    If a phone call leads to a request for money or financial information, assume until proven otherwise that this is a scam!

    Good rule even without AI.

    Paper suggests top freelancers are losing business due to ChatGPT. Overall demand drops for knowledge workers and also narrows gaps between them.

    I would not presume this holds long term. The skill gap has narrowed, but also there is always demand for the best, although they do not find such an effect here. Mostly I would caution against generalizing too much from early impacts in quirky domains.

    Vance Ginn at EconLib opposes the EO, gives the standard ‘they will never take our jobs’ speech warning that ‘red teaming’ and any other regulation will only slow down innovation, does not even bother dismissing existential risks.

    Dave Orr is hiring for a new DeepMind alignment team he is joining to start. Post is light on details, including planned technical details.

    Red teaming competition, goal is to find an embedded Trojan in otherwise aligned LLMs, a backdoor that lets the user do whatever they want. Submissions accepted until February 25.

    Claude v2.1. 200k context window, specific prompt engineering techniques, half as many hallucinations (they say), system prompts and experimental tool use for calling arbitrary functions, private knowledge bases or browsing the web. It seems you use things like , or . You can also employ .

    All seems highly incrementally useful.

    GPQA, a new benchmark of 448 multiple choice science questions where experts often get them wrong and the answers are Google-proof.

    ChatGPT Voice rolls out for all free users, amid all the turmoil, with Greg Brockman promoting it. This is presumably a strong cooperative sign, but it could also be a move to raise costs even further.

    Microsoft to spend over $50 billion on data centers. Yikes indeed, sir.

    Meta, while everyone is super distracted by OpenAI drama, breaks up its ‘Responsible AI’ team. To which we all said, Meta has a responsible AI team? I do not believe this is a team worth worrying about. Again, I expect the main effect of yelling when people disband such teams is that companies will avoid making such teams, or ensure they are illegible. Yes, they are trying to bury it, but I’m basically OK with letting them.

    Jack Morris: now seems as good a time as ever to remind people that the biggest breakthroughs at OpenAI came from a previously unknown researcher [Alec Radord] with a bachelors degree from olin college of engineering.

    Ethan Mollick points out that the OpenAI situation highlights the need to not enforce noncompete agreements in tech. I once sat out six months of Magic writing because I had a non-compete and my attempts to reach a win-win deal to difuse it were roundly rejected, and I keep my agreements. I do think advocates are too eager to ignore the cases where such agreements are necessary in some form, or at least net useful, so it is not as simple as all that.

    New paper: Building the Epistemic Community of AI Safety. Don’t think there’s much here but included for completeness.

    Will deep learning soon hit a wall? Gary Marcus says it is already hitting one based on this answer, at the end of the Sam Altman video I discuss in This Week in Audio, and writes a post suggesting Altman agrees.

    Sam Altman (CEO of OpenAI): There are more breakthroughs required in order to get to AGI

    Cambridge Student: “To get to AGI, can we just keep min maxing language models, or is there another breakthrough that we haven’t really found yet to get to AGI?”

    Sam Altman: “We need another breakthrough. We can still push on large language models quite a lot, and we will do that. We can take the hill that we’re on and keep climbing it, and the peak of that is still pretty far away. But, within reason, I don’t think that doing that will (get us to) AGI. If (for example) super intelligence can’t discover novel physics I don’t think it’s a superintelligence. And teaching it to clone the behavior of humans and human text – I don’t think that’s going to get there.And so there’s this question which has been debated in the field for a long time: what do we have to do in addition to a language model to make a system that can go discover new physics?”

    Gary Marcus: Translation: “deep learning is hitting a wall”

    Rob Bensinger: What’s your rough probability on “In 2026, it will seem as though deep learning hit a non-transient wall at some point in 2023-2025, after the relatively impressive results from GPT-4?”

    Michael Vassar: 85% by 2028. 65% in 2026. But economic impact will still be accelerating in 2028

    AgiDoomerAnon: 25-30% maybe, it’s where most of my hope comes from 🙂

    Negative Utilitarian: 80%.

    Fanged Desire: 90%. Though at the same time it’s true that a *lotof things are going to be possible just by modifying the capabilities we have now in basic ways and making them applicable to different scenarios, so to a layman it may *looklike we’re still progressing at lightning speed.

    CF: Lol, 0%.

    Jason: 0%. There’s obvious ways forward. Adding short term memory for a start. Parallel streams of consciousness output that are compared and rated by another LLM.

    Such strong disagreement. I opened up a market.

    A good perspective:

    Sarah Constantin: The good news is that you’d need an AI to do original science for the worst-case scenarios to occur, and it doesn’t look like LLMs are remotely close.

    The bad news is that @sama apparently *wantsan AI physicist.

    Not remotely close now, but I am not confident about how long until a lot closer.

    Dwarkesh Patel asks why LLMs with so much knowledge don’t notice new correlations and discoveries, pretty much at all. Eliezer responds that humans are computers too, so this is unlikely to be a fundamental limitation, but we do not know how much more capacity would be required for this to happen under current architectures. Roon predicts better and more creative reasoning will solve it.

    FTC is latest agency to give an absurd definition of AI.

    Al includes, but is not limited to, machine-based systems that can, for a set of defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Generative Al can be used to generate synthetic content including images, videos, audio, text, and other digital content that appear to be created by humans. Many companies now offer products and services using Al and generative Al, while others offer products and services that claim to detect content made by generative Al.

    I get that a perfect legal definition of AI is hard, but this is broad enough to include essentially every worthwhile piece of software.

    UK to not regulate AI at all ‘in the short term’ to avoid ‘stifling innovation.’ This could in theory be part of a sensible portfolio, where Sunak helps them lay groundwork for international cooperation and sensible regulation when we need it, while not getting in the way now. Or it could be a different way to describe the situation – Biden’s Executive Order also does not regulate AI in the short term in any meaningful way. What’s the difference? This could also reflect deep dysfunction. We will see.

    Simeon attempts to explain one more time why not regulating foundation models, as in letting anyone who wants to create the most dangerous, capable and intelligent systems possible, won’t work. No, you can’t meaningfully ‘regulate applications,’ by then it is too late. Also he notes that to the extent Mistral (the most prominent voice advocating this path) has a track record on alignment, it is atrocious.

    Corporate Europe on how Big Tech undermined the AI Act. Much to wince over on many fronts. It confirms the basic story of Big Tech lobbying hard against regulations with teeth, then Mistral and Aleph Alpha lobbying hard against those same regulations by claiming the regulations are a Big Tech conspiracy. Nice trick.

    Politico Pro EU’s Vincent Manancourt: New: EU “crazy” to consider carve out for foundation models in AI act, ‘Godfather of AI’ Yoshua Bengio told me. He warned bloc risks “law of the jungle” for most advanced forms of the tech. [Story for pros here]

    Control AI points out that Mistral’s lobbying champion is the former French tech minister for Macron, who has very much reversed his tune. I’ve heard other reports as well that this relationship has been central.

    Italy seems to be with Germany and France on this. Others not so much, but that’s rather a big three, who want a regime with zero teeth whatsoever. Other officials walked out in response.

    “This is a declaration of war,” a parliament official told Euractiv on condition of anonymity.

    Max Tegmark and Yoshua Bengio indeed point out that this would be the worst possible thing.

    Connor Dunlop of Euractiv argues regulating AI foundation models is crucial for innovation. I agree that the proposed regulations would assist, not hurt, Mistral and Aleph Alpha, the so-called would-be upstart ‘national champions.’ I do not think the linked post effectively makes that case.

    One approach to dismissing any attempts to not die, or any form of regulation on AI, has always been to refer to any, and in many although not all cases I do mean any, restriction or regulation on AI as totalitarianism.

    They want people to think of a surveillance state with secret police looking for rogue laptops and monitoring your every keystroke. Plus airstrikes.

    What they are actually talking about is taking frontier models, the creation and deployment of new entities potentially smarter and more capable than humans, and applying a normal regulatory regime.

    It is time to treat such talk the same way we treat people who get arrested because they deny the constitutional right of the United States Government to collect income taxes.

    As in: I am not saying that taxation is not theft, but seriously, get a grip, stop it.

    As an excellent concrete example with 500k+ views up top, I am highly disappointed in this disingenuous thread from Brian Chau.

    Brian Chau: Did you guys know there’s 24-author paper by EAs, for EAs, about how Totalitarianism is absolutely necessary to prevent AI from killing everyone?

    What is this ‘totalitarianism’ as it applies here, as Chau explains it?

    They predictably call for exactly the kind of regulatory capture most convenient to OpenAI, Deepmind, and other large players.

    [From Paper, quoted by Chau]: blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models.

    In other words, the ‘totalitarianism’ is:

    1. Standards for frontier models.

    2. Registration and reporting of training runs.

    3. Mechanisms allowing enforcement of the safety rules.

    This is not totalitarianism. This is a completely ordinary regulatory regime.

    (I checked with Claude, which was about as kind to Chau’s claims as I was.)

    If you want to argue that standard regulatory interventions tend to favor insiders and large players over time? I will agree. We could then work together against that on 90% (98%?) of issues, while looking for the best solution available in the AI space.

    Or you can make your true position flat out text?

    Here’s how the paper describes three ways their regulatory regime might fail.

    The Unexpected Capabilities Problem. Dangerous capabilities can arise unpredictably and undetected, both during development and after deployment.

    The Deployment Safety Problem. Preventing deployed AI models from causing harm is a continually evolving challenge.

    The Proliferation Problem. Frontier AI models can proliferate rapidly, making accountability difficult.

    Here’s how Brian Chau describes this:

    Brain Chau: They lay out three obstacles to their plans. If you pause for a moment and read the lines carefully, you will realize they are all synonyms for freedom.

    If you want to take the full anarchist position, go ahead. But own it.

    In addition to the above mischaracterization, he then ‘rebuts’ the four claims of harm. Here is his reason biological threats don’t matter.

    Re 1: the limiting factor of designing new biological weapons is equipment, safety, and not killing yourself with them. No clue why this obviously false talking point is trodded out by EAs so often.

    This seems to be a claim that no amount of expertise or intelligence enables the importantly easier creation of dangerous biological pandemic agents? Not only the claim, but the assertion that this is so obviously false that it is crazy to suggest?

    He says repeatedly ‘show me the real examples,’ dismissing the danger of anything not yet dangerous. That is not how any of this works.

    Sam Altman at Cambridge Union Society on November 1 accepting an award and answering questions. Opening speech is highly skippable. The risks and promise of AI are both front and center throughout, I provide a summary that suffices, except you will also want to watch that last question, where Sam says more breakthroughs are needed to get to AGI.

    The existential risk protest interruption is at about 17: 00 and is quite brief.

    At 19: 30 he describes OpenAI as tool builders. Notice every time someone assumes that sufficiently capable AIs would long remain our tools.

    Right after that he says that young programmers are now outperforming older ones due to greater familiarity with AI tools.

    22: 00 Sam says he learned two things in school: How to learn new things, and how to think of new things he hadn’t heard elsewhere. Learning how to learn was all the value, the content was worthless.

    25: 30 Sam responds to the protest, saying that things decay without progress, the benefits can’t be ignored, there needs to be a way forward. Except of course no, there is no reason why there needs to be a way forward. Maybe there is. Maybe there isn’t.

    34: 00 Sam’s primary concern remains misuse.

    47: 00 Sam discusses open source, warns of potential to make an irreversible mistake. Calls immediately open sourcing any model trained ‘insanely reckless’ but says open source has a place.

    From two weeks ago, I happened to listen to this right before Friday’s events: OpenAI co-founder Ilya Sutskever on No Priors. p(doom) went up as I heard him express the importance of ensuring future AIs have, his term, ‘warm feelings’ towards us, or needing it to ‘be prosocial’ or ‘humanity loving.’ That is not how I believe any of this works. He’s working alongside Leike on Superalignment, and he is still saying that, and I do not understand how or why. But assuming they can continue to work together after this and still have OpenAI’s support, they can hopefully learn and adjust as they go. It is also very possible that Ilya is speaking loosely here and his actual detailed beliefs are much better and more precise.

    What is very clear here is Ilya’s sincerity and genuine concern. I wish him all the best.

    Yoshua Bengio talk, towards AI safety that improves with more compute. I have not watched yet.

    Davidad brief thread compares his approach to those of Bengio and Tegmark.

    Some basic truth well said.

    David Sacks: I’m all in favor of accelerating technological progress, but there is something unsettling about the way OpenAI explicitly declares its mission to be the creation of AGI.

    AI is a wonderful tool for the betterment of humanity; AGI is a potential successor species.

    By the way, I doubt OpenAI would be subject to so many attacks from the safety movement if it wasn’t constantly declaring its outright intention to create AGI.

    To the extent the mission produces extra motivation for the team to ship good products, it’s a positive. To the extent it might actually succeed, it’s a reason for concern. Since it’s hard to assess the likelihood or risk of AGI, most investors just think about the former.

    How true is this?

    Staff Engineer: If you don’t believe in existential risk from artificial super intelligence. then you don’t believe in artificial super intelligence. You’re just looking at something that isn’t scary so you don’t have to think about the thing that is.

    Jeffrey Ladish: Agree with the first sentence but not the second. Many people are just choosing to look away, but some genuinely think ASI is extremely hard / very far away / impossible. I think that’s wrong, but it doesn’t seem like a crazy thing to believe.

    Eliezer Yudkowsky notes that people constantly gaslight us saying ‘creating smarter than human AIs would not be a threat to humanity’s survival’ and gets gaslit by most of the comments, including the gaslighting that ‘those who warn of AI’s existential risk deny its upsides’ and ‘those who warn of AI’s existential risk do not say over and over that not ever building it would be a tragedy.’

    Your periodic reminder and reference point: AI has huge upside even today. Future more capable AI has transformational insanely great upside, if we can keep control of the future, not get killed and otherwise choose wisely. Never building it would be a great tragedy. However, it would not be as big a tragedy as everyone dying, so if those are the choices then don’t fing build it.

    On Liars and Lying, a perspective on such questions very different from my own.

    Your periodic reminder that ‘doomer’ is mostly a label used either as shorthand, or as a kudgel of those who want to ridicule the idea that AI could be existentially dangerous. Whereas those who do worry have widely varying opinions.

    Eliezer Yudkowsky: Disturbing tendency to conflate anyone who believes in any kind of AGI risk as a “doomer”. If that’s the definition, Sam Altman is a doomer. Ilya Sutskever is a doomer. Helen Toner is a doomer. Shane Legg is a doomer. I am a doomer. Guess what? We are importantly different doomers. None of their opinions are my own, nor their plans, nor their choices. Right or wrong they are not mine and do not proceed from my own reasons.

    Your periodic reminder that a for-profit business has in practice a strong economic incentive to not kill all of its customers, but only to the extent that would leave other humans alive but not leave it as many customers. If everyone is dead, the company makes no money, but no one cares or is punished for it.

    Packy McCormick (a16z): The cool thing about for-profit AI, from an alignment perspective, is that it gives you a strong economic incentive to not kill all of your customers.

    Rob Bensinger: If you die in all the scenarios where your customers die, then I don’t see how for-profit improves your long-term incentives. “I and all my loved ones and the entire planet die” is just as bad as “I and all my loved ones and the entire planet die, and a bunch of my customers.”

    A for-profit structure may or may not be useful for other reasons, but I don’t think it’s specifically useful because of the “all my customers suddenly die (at the same time the reset of humanity does) scenario”, which is the main scenario to worry about.

    Do the events of the past week doom all nuance even more than usual?

    Haseeb: This weekend we all witnessed how a culture war is born.

    E/accs now have their original sin they can point back to. This will become the new thing that people feel compelled to take a side on–e/acc vs decel–and nuance or middle ground will be punished.

    Such claims are constant. Everything that happens nuance is presumed dead. Also every time most people with the ‘e/acc’ label speak nuance is announced dead. Them I do not worry about, they are a lost cause. The question is whether a lot of otherwise reasonable people will follow. Too soon to tell.

    Not the argument you want to be making, but…

    Misha Gurevich: People who think EA X-risk worries about AI are a destructive ideology: imagine the kind of ideologies artificial intelligences are gonna have.

    Following up on the deceptive alignment paper from last week:

    Robert Wiblin: In my mind the probability that normal AI reinforcement will produce ‘deceptive alignment’ is like… 30%. So extremely worth working on, and it’s crazy we don’t know. But it might turn out to be a red herring. What’s the best evidence/argument that actually it’s <1% or >90%?

    [bunch of mostly terrible arguments in various directions in reply]

    I notice a key difference here. Wiblin is saying 30% to deceptive alignment. Last week’s estimate was similar (25%) but it was conditional on the AI already having goals and being situationally aware. Conditional on all that, I am confused how such behavior could fail to arise. Unconditionally is far less clear.

    I still expect to almost always see something that is effectively ‘deceptive alignment.’

    The AI system will figure out to do that which, within the training environment, best convinces us it is aligned. That’s the whole idea with such techniques. I do not assume that the AI will then go ‘aha, fooled you, now that I am not being trained or tested I can stop pretending.’ I don’t rule that out, but my default scenario is that the thing we got it to do fails to generalize out of distribution the way we expected. That it is sensitive to details and context in difficult to anticipate ways that do not match what we want in both directions. That it does not generalize the ways we hope for.

    We discover that we did not, after all, know how to specify what we wanted, in a way that resulted in things turning out well.

    Is that ‘deceptive alignment’? You tell me.

    Here’s Eliezer’s response:

    Eliezer Yudkowsky: Are you imagining that it won’t be smart enough to do that? Or that deception will genuinely not be in its interests because it gets just as much of what it wants with humans believing true things as the literally optimal state of affairs? Or that someone solved soft optimization? How do you imagine the weird, special circumstances where this doesn’t happen? Remember that if MIRI is worried about a scenario, that means we think it’s a convergent endpoint and not some specific pathway; if you think we’re trying to predict a hard-to-predict special case, then you’ve misunderstood a central argument.

    Robert Wiblin: Joe’s paper does a better job than me of laying out ways it might or might not happen. But ‘not being smart enough’ isn’t an important reason.

    ‘Not trained to be a global optimizer’ is one vision.

    Another is that the reinforcement for doing things we like and not things we don’t like (with some common-sense adjustments to how the feedback works suggested by alignment folks) evolves models to basically do what we want and share our aversions, maybe because that’s the simplest / most efficient / most parsimonious way to get reward during training. The wedge between what we want and what we reward isn’t large enough to generate lots of scheming behavior, because scheming isn’t the best way to turn compute into reward in training setups.

    I am so completely confused by Wiblin’s position here, especially that last sentence. Why would ‘scheming’ not be the best way to turn compute into rewards? Why would a completely honest, consistent, straightforward approach be the most rewarded one, given how humans decide how to reward things? I don’t get it.

    Eliezer Yudkowsky offers what for him counts as high praise.

    Eliezer Yudkowsky (QTing what follows here): This seems a very weak test of the ability of dumber judges to extract truth from smarter debaters, but the methodology could be adapted to tougher tests. Increasingly tough versions of this are a good candidate for standard evals.

    Julian Michael: As AIs improve at persuasion & argumentation, how do we ensure that they help us seek truth vs. just sounding convincing? In human experiments, we validate debate as a truth-seeking process, showing that it may soon be needed for supervising AI. Paper here.

    When a doctor gives a diagnosis, common advice is to get a second opinion to help evaluate whether to trust their judgment, because it’s too difficult to evaluate their diagnosis by yourself.

    The idea (originally proposed by @geoffreyirving et al.) is that having equally-capable adversarial AIs critique each other’s answers will make it easier for non-expert judges to evaluate their truthfulness. But does this actually hold in practice?

    We find for the first time on a realistic task that the answer is yes! We use NYU competitive debaters to stand in for future AI systems, having them debate reading comprehension questions where the judge *can’t see the passage(except for quotes revealed by the debaters).

    We compare debate to a baseline we call *consultancy*, where the judge interacts with a single expert that has a 50% chance of lying. We use this to explicitly elicit dishonest behavior that may implicitly arise in methods like RLHF.

    We find that judges are significantly more accurate in debate than consultancy, AND debates are much more efficient, at two-thirds the length on average.

    Furthermore, many of the errors we observe in debate seem fixable with more careful judging and stronger debaters. In a third of mistakes, judges end the debate prematurely, and in nearly half, honest debaters mistakenly missed key evidence that would have helped them win.

    We don’t see a difference in accuracy or efficiency between debate and consultancy when using GPT-4 as a debater — yet. In particular, GPT-4 was not very skilled at deception, which may not remain the case for future powerful AI systems.

    As we move from relatively unskilled AI systems to skilled humans, non-expert judge accuracy *improveswith debate, but *decreaseswith consultancy. This suggests that training AI systems to debate may be an important alternative to methods like RLHF as models improve.

    In the paper, we lay out considerations on how to train AI debaters and open problems that need to be solved.

    How optimistic should we be about this in the AI case where you are trying to do this to use outputs of models you cannot otherwise trust? I continue to assume this will break exactly when you need it to not break. It could have mundane utility in the period before that, but I always worry about things I assume are destined to break.

    Thread with Nora Belrose and Eliezer Yudkowsky debating deceptive alignment. Nora bites the bullet and says GPT-4 scaled up in a naive way would not have such issues. Whereas I would say, that seems absurd, GPT-4 already has such problems. Nora takes the position that if your graders and feedback suck, your AI ends up believing false things the graders believe and not being so capable, but not in a highly dangerous way. I continue to be confused why one would expect that outcome.

    Roon reminds us that people acting like idiots and making deeply stupid strategic power moves, only to lose to people better at power moves, has nothing to do with the need to ensure we do not die from AI.

    Roon: throughout all this please remember a few things that will be critical for the future of mankind: – this coup had nothing to do with ai safety. Sama has been a global champion of safe agi dev – the creation of new life is fraught and no one must forget that for political reasons

    Sorry if the second point is vague. I literally just mean don’t turn your back on x-risk just because of this remarkably stupid event

    We need better words for stuff that matters. But yes.

    Roon: if a group of people are building artificial life in the lab and don’t view it with near religious significance you should be really concerned.

    Axios’ Jim VandeHei and Mike Allen may or may not be worried about existential risk here, but they say outright that ‘this awesome new power’ cannot be contained, ethics never triumphs over profits, never has, never will. So we will get whatever we get.

    I say this is at best midwit meme territory. Sometimes, yes, ethics or love or the common good wins. We are not living in a full-on cyberpunk dystopia of unbridled capitalism. I am not saying it will be easy. It won’t be easy. It also is not impossible.

    It is important to notice this non-sequitur will be with us, until very late in the game. There is always some metaphorical hard thing that looks easy.

    Katherine Dee: Had an ex who worked on self-driving cars. He once said to me, “you can’t use self-check out or self-ticketing machines at the airport reliably. No AI overlords are coming.” I think about that a lot.

    Eliezer Yudkowsky: This really is just a non-sequitur. Not all machines are one in their competence. The self-check-out machines can go on being bad indefinitely, right up until the $10-billion-dollar frontier research model inside the world’s leading AI lab starts self-improving.

    What term would he prefer to use for the possibility?

    Stewart Brand: Maybe this is the episode that makes the term “existential risk” as passé as it needs to be.

    When someone tells you who they are. Believe them.

    Roon (Sunday evening): i truly respect everyone involved [in the OpenAI situation].

    Eliezer Yudkowsky: I respect that.

    Anton: If one has actual principles, this is not possible.

    And when you see them you shall call them by their true name.

    AI #39: The Week of OpenAI Read More »