
Dropbox spooks users with new AI features that send data to OpenAI when used

adventures in data consent —

AI feature turned on by default worries users; Dropbox responds to concerns.

Updated

On Wednesday, news quickly spread on social media about a new enabled-by-default Dropbox setting that shares Dropbox data with OpenAI for an experimental AI-powered search feature, but Dropbox says data is only shared if the feature is actively being used. Dropbox says that user data shared with third-party AI partners isn’t used to train AI models and is deleted within 30 days.

Even with assurances of data privacy laid out by Dropbox on an AI privacy FAQ page, the discovery that the setting had been enabled by default upset some Dropbox users. The setting was first noticed by writer Winifred Burton, who shared information about the Third-party AI setting through Bluesky on Tuesday, and frequent AI critic Karla Ortiz shared more information about it on X.

Wednesday afternoon, Drew Houston, the CEO of Dropbox, apologized for customer confusion in a post on X and wrote, “The third-party AI toggle in the settings menu enables or disables access to DBX AI features and functionality. Neither this nor any other setting automatically or passively sends any Dropbox customer data to a third-party AI service.”

Critics say that communication about the change could have been clearer. AI researcher Simon Willison wrote, “Great example here of how careful companies need to be in clearly communicating what’s going on with AI access to personal data.”

A screenshot of Dropbox’s third-party AI feature switch. (Credit: Benj Edwards)

So why would Dropbox ever send user data to OpenAI anyway? In July, the company announced an AI-powered feature called Dash that allows AI models to perform universal searches across platforms like Google Workspace and Microsoft Outlook.

According to the Dropbox privacy FAQ, the third-party AI opt-out setting is part of the “Dropbox AI alpha,” which is a conversational interface for exploring file contents that involves chatting with a ChatGPT-style bot using an “Ask something about this file” feature. To make it work, an AI language model similar to the one that powers ChatGPT (like GPT-4) needs access to your files.
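To illustrate the flow the FAQ describes, here is a hypothetical sketch of the kind of request a “chat about this file” feature might assemble for a chat-completion model. This is not Dropbox’s actual code; the helper name and prompt wording are invented. The point is that only the file the user explicitly asked about is included in the request:

```python
def build_file_question_payload(file_text: str, question: str,
                                model: str = "gpt-4") -> dict:
    """Assemble a chat-completion request about a single file.

    Hypothetical example: only the one file the user explicitly asked
    about is included, mirroring Dropbox's claim that data is sent
    solely when the feature is actively used.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Answer the user's question using only the file below."},
            {"role": "user",
             "content": f"File contents:\n{file_text}\n\nQuestion: {question}"},
        ],
    }

payload = build_file_question_payload("Q3 revenue was $1.2M.",
                                      "What was Q3 revenue?")
print(payload["model"], len(payload["messages"]))
```

Under this kind of design, nothing leaves the service until the user issues an explicit request, which is consistent with Dropbox’s description of the transparency banner gating each question.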

According to the FAQ, the third-party AI toggle in your account settings is turned on by default if “you or your team” are participating in the Dropbox AI alpha. Still, multiple Ars Technica staff who had no knowledge of the Dropbox AI alpha found the setting enabled by default when they checked.

In a statement to Ars Technica, a Dropbox representative said, “The third-party AI toggle is only turned on to give all eligible customers the opportunity to view our new AI features and functionality, like Dropbox AI. It does not enable customers to use these features without notice. Any features that use third-party AI offer disclosure of third-party use, and link to settings that they can manage. Only after a customer sees the third-party AI transparency banner and chooses to proceed with asking a question about a file, will that file be sent to a third-party to generate answers. Our customers are still in control of when and how they use these features.”

Right now, the only third-party AI provider for Dropbox is OpenAI, writes Dropbox in the FAQ. “Open AI is an artificial intelligence research organization that develops cutting-edge language models and advanced AI technologies. Your data is never used to train their internal models, and is deleted from OpenAI’s servers within 30 days.” It also says, “Only the content relevant to an explicit request or command is sent to our third-party AI partners to generate an answer, summary, or transcript.”

Disabling the feature is easy if you prefer not to use Dropbox AI features. Log into your Dropbox account in a desktop web browser, then click your profile photo > Settings > Third-party AI. On that page, click the switch beside “Use artificial intelligence (AI) from third-party partners so you can work faster in Dropbox” to toggle it to the “Off” position.

This story was updated on December 13, 2023, at 5:35 pm ET with clarifications about when and how Dropbox shares data with OpenAI, as well as statements from Dropbox reps and its CEO.


OpenAI: Leaks Confirm the Story


Previously: OpenAI: Altman Returns, OpenAI: The Battle of the Board, OpenAI: Facts from a Weekend, additional coverage in AI#41.

We have new stories from The New York Times, from Time, from the Washington Post and from Business Insider.

All paint a picture consistent with the central story told in OpenAI: The Battle of the Board. They confirm key facts, especially Altman’s attempted removal of Toner from the board via deception. We also confirm that Altman promised to help with the transition when he was first fired, so we have at least one very clear cut case of Altman saying that which was not.

Much uncertainty remains, especially about the future, but past events are increasingly clear.

The stories also provide additional color and key details. This post is for those who want that, and to figure out what to think in light of the new details.

The most important new details are that NYT says the board proposed and was gung ho on Bret Taylor, and that D’Angelo suggested Summers and, together with Altman, grilled him before they both agreed to him as the third board member. And that the new board is remaining quiet while it investigates, echoing the old board, and in defiance of the Altman camp and its wish to quickly clear his name.

The New York Times finally gives its take on what happened, by Tripp Mickle, Mike Isaac, Karen Weise and the infamous Cade Metz (so treat all claims accordingly).

As with other mainstream news stories, the framing is that Sam Altman won, and this shows the tech elite and big money are ultimately in charge. I do not see that as an accurate description of what happened or its implications, yet both the tech elite and its media opponents want it to be true and are trying to make it true through the magician’s trick of saying that it is true, because often power resides where people believe it resides.

I know that at least one author did read my explanations of events, and also I talked to a Times reporter not on the byline to help make everything clear, so they don’t have the excuse that no one told them. Didn’t ultimately matter.

Paul Graham is quoted as saying Altman is drawn to power more than money, as an explanation for why Altman would work on something that does not make him richer. I believe Graham on this, but also I think there are at least three damn good other reasons to do it, making the decision overdetermined.

  1. If Altman wants to improve his own lived experience and those of his friends and loved ones, building safe AGI, or ensuring no one else builds unsafe AGI, is the most important thing for him to do. Altman already has all the money he will ever need for personal purposes, more would not much improve his life. His only option is to instead enrich the world, and ensure humanity flourishes and also doesn’t die. Indeed, notice the rest of his portfolio includes a lot of things like fusion power and transformational medical progress. Even if Altman only cares about himself, these are the things that make his life better – by making everyone’s life better.

  2. Power and fame and prestige beget money. Altman does not have relevant amounts of equity in OpenAI, but he has used his position to raise money, to get good deal flow, and in general to be where the money resides. If Altman decided what he cared about was cash, he could easily turn this into cash. To be clear, I do not at all begrudge this in general. I am merely not a fan of some particular projects, like ‘build a chip factory in the UAE.’

  3. AGI is the sweetest, most interesting, most exciting challenge in the world. Also the most important. If you thought your contribution would increase the chance things went well, why would you want to be working on anything else?

Pretty much every version of Altman I can imagine would want to be doing this.

The key description of the safety issue is structured in a way that it is easy to come away thinking this was a concern of the outside board members, but both in reality and if you read the article carefully, this applies to the entire board (although we have some uncertainty about Brockman in particular):

They were united by a concern that A.I. could become more intelligent than humans.

Remember that this was and is the explicit goal of OpenAI, to safely create AI more intelligent than humans, also known as AGI. Altman signed the CAIS letter, although Brockman is not known to have done so. Altman has made the threat here very clear. Everyone involved understands the danger. Everyone is, to their credit, talking price.

The first piece of news is that we have at least one case in which we can be damn sure that Sam Altman lied to the board, in at least some important senses.

Shocked that he was being fired from a start-up he had helped found, Mr. Altman widened his eyes and then asked, “How can I help?” The board members urged him to support an interim chief executive. He assured them that he would.

Within hours, Mr. Altman changed his mind and declared war on OpenAI’s board.

I point this out because it is a common theory that Altman was a master of Exact Words and giving implications. That yes he was deceptive and misleading and played power games, but he was too smart to outright say that which was not.

So here he is, saying that which is not.

Did it matter? Maybe no. But maybe quite a lot, actually. This cooperation could have been a key factor driving the decision not to detail the issues with Altman, at least initially, when it would have worked. If Altman is going to cooperate, what he gets in return is the mission continues and also whatever he did gets left unspecified.

The article waffles on whether or not Altman actually did declare war on the board that night. The statement above says so. Then they share a narrative of others driving the revolt, including Airbnb’s CEO Brian Chesky, the executives and employees, with Altman only slowly deciding to fight back.

It can’t be both. Which is it?

I assume the topline is correct. That Altman was fighting back the whole time. And that despite being willing to explicitly say that up top, Altman’s people sufficiently sculpted the media narrative to make the rest sound like events unfolded in a very different way. It is an absolute master class in narrative sculpting and media manipulation. They should teach this in universities. Chef’s kiss.

We have confirmation that Altman was not ‘consistently candid’ about the project to build chips in the UAE:

In September, Mr. Altman met investors in the Middle East to discuss an A.I. chip project. The board was concerned that he wasn’t sharing all his plans with it, three people familiar with the matter said.

For many obvious reasons, this is an area where the board would want to be informed, and any reasonable person in Altman’s position would know this, and norms say that this means they should be informed. But not informing them would not by default strictly violate the rules, as long as Altman honestly answered questions when asked. Did he, and to what extent? We don’t know.

Now we get into some new material.

Dr. Sutskever … believed that Mr. Altman was bad-mouthing the board to OpenAI executives, two people with knowledge of the situation said. Other employees have also complained to the board about Mr. Altman’s behavior.

In October, Mr. Altman promoted another OpenAI researcher to the same level as Dr. Sutskever, who saw it as a slight. Dr. Sutskever told several board members that he might quit, two people with knowledge of the matter said. The board interpreted the move as an ultimatum to choose between him and Mr. Altman, the people said.

Dr. Sutskever’s lawyer said it was “categorically false” that he had threatened to quit.

Another conflict erupted in October when Ms. Toner published a paper…

This frames Sutskever as having been in favor of firing Altman for some time. If this is true, the board’s sense of urgency, and its unwillingness to take time to plan and get its ducks in a row, makes even less sense. If they had been discussing the issue for months, if Ilya had been not only onboard but enthusiastic for a month, I don’t get it.

The post then goes over the incident over Toner’s ignored academic paper, for which Toner agreed to apologize to keep the peace.

“I did not feel we’re on the same page on the damage of all this,” Altman wrote.

We’re definitely not. Toner and I are on the page that this was trivial and obviously so. Altman was presenting it as a major deal.

Now we get to the core issue.

Mr. Altman called other board members and said Ms. McCauley wanted Ms. Toner removed from the board, people with knowledge of the conversations said. When board members later asked Ms. McCauley if that was true, she said that was “absolutely false.”

“This significantly differs from Sam’s recollection of these conversations,” an OpenAI spokeswoman said, adding that the company was looking forward to an independent review of what transpired.

Time magazine gives this version:

Time: Altman told one board member that another believed Toner ought to be removed immediately, which was not true, according to two people familiar with the discussions. 

Whatever other reasons did or did not exist, if Altman did say that, my model of such things is that he needed to be fired and it was the board’s job to fire him. And the board really should have said so, rather than speaking in generalities.

Multiple witnesses are saying to NYT that he said it. Altman denies it.

It seems clear Altman did use private conversations with board members to give false impressions and drum up support for getting Toner off the board, thereby giving Altman board control, using the paper as an excuse. The dispute is whether Altman did it using Exact Words, or whether he lied. Altman called his attempt ‘ham fisted’ which I believe is power player code for ‘got caught lying’ but could also apply to ‘got caught technically-not-lying while implicitly lying my ass off.’

NYT does seem to be saying the board did step up their description a bit:

NYT: The board members said that Mr. Altman had lied to the board, but that they couldn’t elaborate for legal reasons.

Use of the word ‘lied’ is an escalation. And this is a clear confirmation of lawyers.

We also have confirmation of zero PR people, because we have Toner’s infamous line. I know the logic behind it but I still cannot believe that she actually said it out loud given the context, seriously WTF:

Jason Kwon, OpenAI’s chief strategy officer, accused the board of violating its fiduciary responsibilities. “It cannot be your duty to allow the company to die,” he said, according to two people with knowledge of the meeting.

Ms. Toner replied, “The destruction of the company could be consistent with the board’s mission.”

You say ‘We have no intention of doing any such thing. The company is perfectly capable of carrying on without Altman. We have every intention of continuing on OpenAI’s mission, led by the existing executive team. Altman promised to help with the transition in the board meeting. If he instead chooses to attempt to destroy OpenAI and its mission, that is his decision. It also proves he was incompatible with our mission and we needed to remove him.’

OpenAI’s executives insisted that the board resign that night or they would all leave. Mr. Brockman, 35, OpenAI’s president, had already quit.

The support gave Mr. Altman ammunition.

This sounds highly contingent.

Also the board had now already made an explicit bluff threatening to quit. The board called. The executives did not quit. Subsequent such threats become far less credible.

Skipping ahead a bit, they still tried this a second time.

By Nov. 19 [with the Microsoft offer in hand], Mr. Altman was so confident that he would be reappointed chief executive that he and his allies gave the board a deadline: Resign by 10 a.m. or everyone would leave.

Pro negotiation tip: Do not quickly pull this trick a second time once your first bluff gets called. It will not work. That is why you do not rush out the first bluff, and instead wait until your position is stronger.

Of course the board called the second bluff, appointing Emmett Shear.

The next piece of good information came before that deadline was set, which is that Bret Taylor was actually seen as a fair arbiter approved by both sides rather than being seen as in the Altman camp.

Yet even as the board considered bringing Mr. Altman back, it wanted concessions. That included bringing on new members who could control Mr. Altman. The board encouraged the addition of Bret Taylor, Twitter’s former chairman, who quickly won everyone’s approval and agreed to help the parties negotiate.

But also note that in this telling, it was the board that wanted concessions and in particular new board members rather than Altman. That directly contradicts other reports and does not make sense, unless you read it as ‘contingent on the old board agreeing to resign, they wanted concessions.’ As in, the board was going to hand over its control of OpenAI, and they wanted the concession of ‘we agree on who we give it to, and what those people agree will happen.’ At best, I find this framing bizarre.

Larry Summers was a suggestion of D’Angelo, in some key original reporting:

To break the impasse, Mr. D’Angelo and Mr. Altman talked the next day. Mr. D’Angelo suggested former Treasury Secretary Lawrence H. Summers, a professor at Harvard, for the board. Mr. Altman liked the idea.

Mr. Summers, from his Boston-area home, spoke with Mr. D’Angelo, Mr. Altman, Mr. Nadella and others. Each probed him for his views on A.I. and management, while he asked about OpenAI’s tumult. He said he wanted to be sure that he could play the role of a broker.

Mr. Summers’s addition pushed Mr. Altman to abandon his demand for a board seat and agree to an independent investigation of his leadership and dismissal.

So both sides talked to Summers, and were satisfied with his answers.

This week, Mr. Altman and some of his advisers were still fuming. They wanted his name cleared.

“Do u have a plan B to stop the postulation about u being fired its not healthy and its not true!!!” Mr. Conway texted Mr. Altman.

Mr. Altman said he was working with OpenAI’s board: “They really want silence but i think important to address soon.”

Overall this all makes me bullish on the new board. We might be in a situation with, essentially, D’Angelo and two neutral arbiters, albeit ones with gravitas and business connections. They are not kowtowing to Altman. Altman’s camp continues to fume (and somehow texts from Conway to Altman about it are leaking to NYT; there are not many places that can come from).

Gwern offers their summary here.

Time profiled Altman, calling him ‘CEO of the year,’ a title he definitely earned. I think this is the best very short description so far, nailing the game theory:

Meanwhile, the company’s employees and its board of directors faced off in “a gigantic game of chicken,” says a person familiar with the discussions.

Sources also note the side of Altman that seeks power, and is willing to be dishonest and manipulative in order to get it.

But four people who have worked with Altman over the years also say he could be slippery—and at times, misleading and deceptive. Two people familiar with the board’s proceedings say that Altman is skilled at manipulating people, and that he had repeatedly received feedback that he was sometimes dishonest in order to make people feel he agreed with them when he did not. These people saw this pattern as part of a broader attempt to consolidate power. “In a lot of ways, Sam is a really nice guy; he’s not an evil genius. It would be easier to tell this story if he was a terrible person,” says one of them. “He cares about the mission, he cares about other people, he cares about humanity. But there’s also a clear pattern, if you look at his behavior, of really seeking power in an extreme way.”

This is the first mainstream report that correctly identifies the outcome as unclear:

It’s not clear if Altman will have more power or less in his second stint as CEO.

In addition to his other good picks, we can add… Georgist land taxes? Woo-hoo!

Altman has advocated for a land-value tax—a classic Georgist policy—in recent meetings with world leaders, he says. 

That is the kind of signal no one ever fakes. There really is a lot to love.

Including his honesty. I don’t want to punish it, but also I want to leave this here.

“We definitely accelerated the race, for lack of a more nuanced phrase,” Altman says. 

Time describes the board’s initial outreach to Altman this way:

Altman characterizes it as a request for him to come back. “I went through a range of emotions. I first was defiant,” he says. “But then, pretty quickly, there was a sense of duty and obligation, and wanting to preserve this thing I cared about so much.” The sources close to the board describe the outreach differently, casting it as an attempt to talk through ways to stabilize the company before it fell apart.

I am not saying we know for sure that this is another case of Altman lying (to Time rather than the board, a much less serious matter), but his version of events does not compute. If the board was actively asking for Altman to outright return, I do not buy that this was Altman’s reaction.

I could buy either half of Altman’s story: That Altman was asked to return, or that Altman was defiant to the board’s request and only did it out of duty and obligation (because the board was initially requesting something else.) I don’t buy both at once. It is entirely inconsistent with Paul Graham’s assessment of his character.

The WaPo piece says that in the fall the board was approached by a small number of senior leaders at OpenAI, with concerns about Altman. In this telling, the board thought OpenAI stood to lose key leaders due to what they saw as Altman’s toxicity.

Now back at the helm of OpenAI, Altman may find that the company is less united than the waves of heart emojis that greeted his return on social media might suggest.

That is always true. No large group is ever fully united, no matter what the emojis say.

There are few concrete details. What details are offered sound like ordinary things that happen at a company. What is and is not abusive, in such a high-pressure and competitive environment, is in the eye of the beholder and highly context dependent. What is described here could reflect abuse, or it could reflect nothing of concern. What is concerning is that employees found it concerning enough to go to the board.

Beyond the one concrete detail of managers going to the board with such complaints, this did not teach us much. It seems like those concerns helped confirm the board’s model of Altman’s behavior, and helped justify the decision on the margin.

Business Insider says that OpenAI employees really, really did not want to go to work at Microsoft. I wouldn’t either. The employees might have largely still seen it as the least bad alternative under some circumstances, if Altman didn’t want to start a new company. And remember, the letter said they ‘might’ do it, not that they all definitely would.

AI Safety Memes offers the following quotes:

“[The letter] was an audacious bluff and most staffers had no real interest in working for Microsoft.”

“Many OpenAI employees ‘felt pressured’ to sign the open letter.”

“Another OpenAI employee openly laughed at the idea that Microsoft would have paid departing staffers for the equity they would have lost by following Altman.” “It was sort of a bluff that ultimately worked.”

“The letter itself was drafted by a group of longtime staffers who have the most clout and money at stake with years of industry standing and equity built up, as well as higher pay. They began calling other staffers late on Sunday night, urging them to sign, the employee explained.”

Despite nearly everyone on staff signing up to follow Altman out the door, “No one wanted to go to Microsoft.” This person called the company “the biggest and slowest” of all the major tech companies.

“The bureaucracy of something as big as Microsoft is soul crushing.”

“Even though we have a partnership with Microsoft, internally, we have no respect for their talent bar,” the current OpenAI employee told BI. “It rubbed people the wrong way to entertain being managed by them.”

Beyond the culture clash between the two companies, there was another important factor at play for OpenAI employees: money. Lots of it was set to disappear before their eyes if OpenAI were to suddenly collapse under a mass exodus of staff.

“Sam Altman is not the best CEO, but millions and millions of dollars and equity are at stake,” the current OpenAI employee said.

Microsoft agreed to hire all OpenAI employees at their same level of compensation, but this was only a verbal agreement in the heat of the moment.

A scheduled tender offer, which was about to let employees sell their existing vested equity to outside investors, would have been canceled. All that equity would have been worth “nothing,” this employee said.

The former OpenAI employee estimated that, of the hundreds of people who signed the letter saying they would leave, “probably 70% of the folks on that list were like, ‘Hey, can we, you know, have this tender go through?'”

Some Microsoft employees, meanwhile, were furious that the company promised to match salaries for hundreds of OpenAI employees.

Roon responds that this was not accurate:

Roon: not to longpost, and I can only speak for myself, but this is a very inaccurate representation of the mood from an employee perspective.

– “employees felt pressured” -> at some point hundreds of us were in a backyard learning about the petition. people were so upset at the insanity of the board’s decisions that they were immediately fired up to sign this thing. the google doc literally broke from the level of concurrency of people all trying to sign at once. I recall many having intelligent nuanced conversations about the petition, the wording thereof, and in the end coming to the conclusion that it was the only path forward. Half the company had signed between the hours of 2 and 3am. That’s not something that can be accomplished by peer pressure.

– “it was about the money” -> at the time it sounded like signing the petition meant leaving all openai equity and starting fresh. We’re not idiots, everybody knows that the terms at newco would be up in the air at best, with a lot of bargaining chips on Microsoft’s side. People signed the petition because it was the right thing to do. You simply cannot work at the gutted husk of a company whose ultimate leadership you don’t respect.

– “no one wanted to go to Microsoft” -> you’d have to be out of your mind to prefer starting new on models and code and products being controlled by someone else rather than building in the company specifically designed to be the vehicle for safe AGI. It has nothing to do with the Microsoft talent bar or bureaucracy or brand. Not sure why some idiot leaker provocateur would frame it this way. Microsoft has been quite successful at acquiring companies under bespoke governance structures and letting them do their own thing (GitHub, LinkedIn). Even Microsoft’s own preferred outcome was continuity of OpenAI per the New Yorker article. I still bet if the board hadn’t changed their mind the company would have mostly reconstituted itself at Microsoft.

I trust that Roon is giving his honest recollection of his experience here. I also believe the two stories are more compatible than he realizes.

The employees wanted Altman back, or, barring that, a board and CEO they could trust, without which they would leave. But they mostly wanted OpenAI intact, and ideally to get paid, and were furious with the board. They didn’t want to go to Microsoft. We will never know how many would have actually done it versus stayed versus gone elsewhere or founded new companies, or how long the board had before that time bomb went off in earnest. My guess is that a lot of employees go to Microsoft if Altman stays there, but a lot also choose other paths.

My guess is that both the majority of employees enthusiastically signed the letter, and also those who didn’t want to sign felt pressured to do so anyway and this got a bunch of the later signatures onboard. I know I would have felt pressured even if no one applied any pressure intentionally.

Wei Dai sees the employee actions at OpenAI and the signing of the petition as a kind of OpenAI cultural revolution that he did not think was possible at a place like that, and sees it as a huge negative update. I was less surprised, and also read less into the letter. There was good reason, from their perspective, to be outraged and demand Altman’s return. There was also good reason to sign the letter even if an individual employee did not support Altman – to hold the company together, and for internal political reasons even if no direct pressure was applied. Again, the letter said ‘may’ leave, so it did not commit you to anything.

I will continue to link people to The Battle of the Board, which I believe remains the definitive synthesis of events. We now have additional detail supporting and fleshing out that narrative, but they do not alter the central story.

I am sure I will continue to often have a weekly section on developments, but hopefully things will slow down from here.

As I wrote previously, we now await the board’s investigation, and the composition of the new board. If the new board has a clear majority with a strong commitment to existential safety and the mission, and has the gravitas and experience necessary to do the job of the board, that would be a very good outcome, and if he did not do anything to render it impossible I would be happy to see Altman stay under such supervision.

If that proves not possible after an investigation, we will see who we get instead. I worry it will not be better, but I also expect the company to then hold together, in a way it would not have if the board had not compromised, given how things had gone.

If the board instead ends up effectively captured by business interests and those who do not care about safety or OpenAI’s stated mission, that would be a catastrophe, whether or not Altman is retained.

If Altman ends up with effective board control and has free rein, then that is a highly worrisome outcome, and we get to find out to what extent Altman is truly aligned, wise, and capable of resisting certain aspects of his nature, versus the temptation to build and scale and seek power. It could end up fine, or be disastrous.


Everybody’s talking about Mistral, an upstart French challenger to OpenAI

A challenger appears —

“Mixture of experts” Mixtral 8x7B helps open-weights AI punch above its weight class.

An illustrated robot holding a French flag.

An illustration of a robot holding a French flag, figuratively reflecting the rise of AI in France due to Mistral. It’s hard to draw a picture of an LLM, so a robot will have to do.

On Monday, Mistral AI announced a new AI language model called Mixtral 8x7B, a “mixture of experts” (MoE) model with open weights that reportedly matches OpenAI’s GPT-3.5 in performance—an achievement that has been claimed by others in the past but is being taken seriously by AI heavyweights such as OpenAI’s Andrej Karpathy and Jim Fan. That means we’re closer to having a GPT-3.5-level AI assistant that can run freely and locally on our devices, given the right implementation.

Mistral, based in Paris and founded by Arthur Mensch, Guillaume Lample, and Timothée Lacroix, has seen a rapid rise in the AI space recently. It has been quickly raising venture capital to become a sort of French anti-OpenAI, championing smaller models with eye-catching performance. Most notably, Mistral’s models run locally with open weights that can be downloaded and used with fewer restrictions than closed AI models from OpenAI, Anthropic, or Google. (In this context “weights” are the computer files that represent a trained neural network.)

Mixtral 8x7B can process a 32K token context window and works in French, German, Spanish, Italian, and English. It works much like ChatGPT in that it can assist with compositional tasks, analyze data, troubleshoot software, and write programs. Mistral claims that it outperforms Meta’s much larger LLaMA 2 70B (70 billion parameter) large language model and that it matches or exceeds OpenAI’s GPT-3.5 on certain benchmarks, as seen in the chart below.

A chart of Mixtral 8x7B performance vs. LLaMA 2 70B and GPT-3.5, provided by Mistral.


Mistral

The speed at which open-weights AI models have caught up with what was OpenAI’s top offering a year ago has taken many by surprise. Pietro Schirano, the founder of EverArt, wrote on X, “Just incredible. I am running Mistral 8x7B instruct at 27 tokens per second, completely locally thanks to @LMStudioAI. A model that scores better than GPT-3.5, locally. Imagine where we will be 1 year from now.”

LexicaArt founder Sharif Shameem tweeted, “The Mixtral MoE model genuinely feels like an inflection point — a true GPT-3.5 level model that can run at 30 tokens/sec on an M1. Imagine all the products now possible when inference is 100% free and your data stays on your device.” To which Andrej Karpathy replied, “Agree. It feels like the capability / reasoning power has made major strides, lagging behind is more the UI/UX of the whole thing, maybe some tool use finetuning, maybe some RAG databases, etc.”

Mixture of experts

So what does mixture of experts mean? As this excellent Hugging Face guide explains, it refers to a machine-learning model architecture where a gate network routes input data to different specialized neural network components, known as “experts,” for processing. The advantage of this is that it enables more efficient and scalable model training and inference, as only a subset of experts are activated for each input, reducing the computational load compared to monolithic models with equivalent parameter counts.

In layperson’s terms, a MoE is like having a team of specialized workers (the “experts”) in a factory, where a smart system (the “gate network”) decides which worker is best suited to handle each specific task. This setup makes the whole process more efficient and faster, as each task is done by an expert in that area, and not every worker needs to be involved in every task, unlike in a traditional factory where every worker might have to do a bit of everything.
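The factory analogy maps directly onto a few lines of code. Below is a toy sketch (my own illustration, not Mistral’s implementation) of top-2 routing over eight stand-in experts: a gate scores the experts for each token, and only the two best-scoring experts do any work.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, d_model, top_k = 8, 16, 2

# Toy "experts": each is a simple linear map (a stand-in for a FeedForward block).
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]

# Gate network: scores each expert for a given input token.
gate_w = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route a token vector x through the top-k experts, weighted by gate scores."""
    scores = x @ gate_w                # one score per expert
    top = np.argsort(scores)[-top_k:]  # indices of the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # softmax over the selected experts only
    # Only k of the n_experts networks do any compute for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (16,)
```

Note that all eight experts’ weights must still be held in memory; the savings are in compute per token, which is why Mixtral has the memory footprint of a roughly 47B-parameter model but the per-token compute closer to a much smaller dense model.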

OpenAI has been rumored to use a MoE system with GPT-4, accounting for some of its performance. In the case of Mixtral 8x7B, the name implies that the model is a mixture of eight 7 billion-parameter neural networks, but as Karpathy pointed out in a tweet, the name is slightly misleading because, “it is not all 7B params that are being 8x’d, only the FeedForward blocks in the Transformer are 8x’d, everything else stays the same. Hence also why total number of params is not 56B but only 46.7B.”
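Karpathy’s point can be checked with back-of-the-envelope arithmetic. The config values below (hidden size 4096, feed-forward size 14336, 32 layers, grouped-query attention with a 1024-wide KV projection, 32K vocabulary) are taken from Mixtral’s published configuration; treat this as a rough sketch, not an official accounting.

```python
# Rough parameter count for Mixtral 8x7B, assuming the published config values.
d_model   = 4096     # hidden size
d_ffn     = 14336    # feed-forward (expert) size
n_layers  = 32
n_experts = 8
d_kv      = 1024     # grouped-query attention: 8 KV heads x 128 dims
vocab     = 32000

# Each expert is a SwiGLU feed-forward block: three d_model x d_ffn matrices.
ffn_per_expert = 3 * d_model * d_ffn
ffn_total = ffn_per_expert * n_experts * n_layers   # the only part that is "8x'd"

# Attention (q and o full-width; k and v narrower under GQA) is shared, not duplicated.
attn_total = n_layers * (2 * d_model * d_model + 2 * d_model * d_kv)

# Router (gate) weights plus input and output embeddings.
router_total = n_layers * d_model * n_experts
embed_total = 2 * vocab * d_model

total = ffn_total + attn_total + router_total + embed_total
print(f"{total / 1e9:.1f}B parameters")  # ~46.7B, not the naive 8 x 7B = 56B
```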

Mixtral is not the first “open” mixture of experts model, but it is notable for its relatively small size in parameter count and performance. It’s out now, available on Hugging Face and BitTorrent under the Apache 2.0 license. People have been running it locally using an app called LM Studio. Also, Mistral began offering beta access to an API for three levels of Mistral models on Monday.



As ChatGPT gets “lazy,” people test “winter break hypothesis” as the cause

only 14 shopping days ’til Christmas —

Unproven hypothesis seeks to explain ChatGPT’s seemingly new reluctance to do hard work.

A hand moving a wooden calendar piece that says

In late November, some ChatGPT users began to notice that ChatGPT-4 was becoming more “lazy,” reportedly refusing to do some tasks or returning simplified results. Since then, OpenAI has admitted that it’s an issue, but the company isn’t sure why. The answer may be what some are calling the “winter break hypothesis.” While unproven, the fact that AI researchers are taking it seriously shows how weird the world of AI language models has become.

“We’ve heard all your feedback about GPT4 getting lazier!” tweeted the official ChatGPT account on Thursday. “We haven’t updated the model since Nov 11th, and this certainly isn’t intentional. model behavior can be unpredictable, and we’re looking into fixing it.”

On Friday, an X account named Martian openly wondered if LLMs might simulate seasonal depression. Later, Mike Swoopskee tweeted, “What if it learned from its training data that people usually slow down in December and put bigger projects off until the new year, and that’s why it’s been more lazy lately?”

Since the system prompt for ChatGPT feeds the bot the current date, people noted, some began to think there may be something to the idea. Why entertain such a weird supposition? Because research has shown that large language models like GPT-4, which powers the paid version of ChatGPT, respond to human-style encouragement, such as telling a bot to “take a deep breath” before doing a math problem. People have also experimented less formally with telling an LLM that it will receive a tip for doing the work, and when a model gets lazy, telling the bot that you have no fingers seems to help lengthen outputs.

  • “Winter break hypothesis” test result screenshots from Rob Lynch on X.


On Monday, a developer named Rob Lynch announced on X that he had tested GPT-4 Turbo through the API over the weekend and found shorter completions when the model is fed a December date (4,086 characters) than when fed a May date (4,298 characters). Lynch claimed the results were statistically significant. However, a reply from AI researcher Ian Arawjo said that he could not reproduce the results with statistical significance. (It’s worth noting that reproducing results with LLMs can be difficult because of random elements at play that vary outputs over time, so people sample a large number of responses.)
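A test like Lynch’s can be approximated as follows. This sketch is hypothetical: rather than calling the API, it simulates completion lengths around the two reported means, then applies Welch’s t-test, a standard significance check for comparing two noisy samples with unequal variance.

```python
import math
import random
import statistics

random.seed(42)

# Hypothetical sampled completion lengths (in characters) for the same prompt with
# a May vs. December date in the system prompt. A real test would sample these
# from the API; here we simulate noisy outputs around the means Lynch reported.
may_lengths = [random.gauss(4298, 400) for _ in range(100)]
dec_lengths = [random.gauss(4086, 400) for _ in range(100)]

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

t = welch_t(may_lengths, dec_lengths)
print(f"t = {t:.2f}")
```

A |t| well above roughly 2 (for samples this size) would suggest a real difference in means; small samples or high output variance make the statistic unstable, which is one plausible reason Lynch and Arawjo reached different conclusions.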

As of this writing, others are busy running tests, and the results are inconclusive. This episode is a window into the quickly unfolding world of LLMs and a peek into largely unknown computer science territory. As AI researcher Geoffrey Litt commented in a tweet, “funniest theory ever, I hope this is the actual explanation. Whether or not it’s real, [I] love that it’s hard to rule out.”

A history of laziness

One of the reports that started the recent trend of noting that ChatGPT is getting “lazy” came on November 24 via Reddit, the day after Thanksgiving in the US. There, a user wrote that they asked ChatGPT to fill out a CSV file with multiple entries, but ChatGPT refused, saying, “Due to the extensive nature of the data, the full extraction of all products would be quite lengthy. However, I can provide the file with this single entry as a template, and you can fill in the rest of the data as needed.”

On December 1, OpenAI employee Will Depue confirmed in an X post that OpenAI was aware of reports about laziness and was working on a potential fix. “Not saying we don’t have problems with over-refusals (we definitely do) or other weird things (working on fixing a recent laziness issue), but that’s a product of the iterative process of serving and trying to support sooo many use cases at once,” he wrote.

It’s also possible that ChatGPT was always “lazy” with some responses (since the responses vary randomly), and the recent trend made everyone take note of the instances in which it happens. For example, in June, someone complained of GPT-4 being lazy on Reddit. (Maybe ChatGPT was on summer vacation?)

Also, people have been complaining about GPT-4 losing capability since it was released. Those claims have been controversial and difficult to verify, making them highly subjective.

As Ethan Mollick joked on X, as people discover new tricks to improve LLM outputs, prompting for large language models is getting weirder and weirder: “It is May. You are very capable. I have no hands, so do everything. Many people will die if this is not done well. You really can do this and are awesome. Take a deep breathe and think this through. My career depends on it. Think step by step.”



OpenAI: Altman Returns

Discover more from Don’t Worry About the Vase

A world made of gears. Doing both speed premium short term updates and long term world model building. Explorations include AI, policy, rationality, Covid and medicine, strategy games and game design, and more.


As of this morning, the new board is in place and everything else at OpenAI is otherwise officially back to the way it was before.

Events seem to have gone as expected. If you have read my previous two posts on the OpenAI situation, nothing here should surprise you.

Still seems worthwhile to gather the postscripts, official statements and reactions into their own post for future ease of reference.

What will the ultimate result be? We likely only find that out gradually over time, as we await both the investigation and the composition and behaviors of the new board.

I do not believe Q* played a substantive role in events, so it is not included here. I also do not include discussion here of how good or bad Altman has been for safety.

Here is the official OpenAI statement from Sam Altman. He was magnanimous towards all, the classy and also smart move no matter the underlying facts. As he has throughout, he has let others spread hostility, work the press narrative and shape public reaction, while he himself almost entirely offers positivity and praise. Smart.

Before getting to what comes next, I’d like to share some thanks.

I love and respect Ilya, I think he’s a guiding light of the field and a gem of a human being. I harbor zero ill will towards him. While Ilya will no longer serve on the board, we hope to continue our working relationship and are discussing how he can continue his work at OpenAI.

I am grateful to Adam, Tasha, and Helen for working with us to come to this solution that best serves the mission. I’m excited to continue to work with Adam and am sincerely thankful to Helen and Tasha for investing a huge amount of effort in this process.

Thank you also to Emmett who had a key and constructive role in helping us reach this outcome. Emmett’s dedication to AI safety and balancing stakeholders’ interests was clear.

Mira did an amazing job throughout all of this, serving the mission, the team, and the company selflessly throughout. She is an incredible leader and OpenAI would not be OpenAI without her. Thank you.

Greg and I are partners in running this company. We have never quite figured out how to communicate that on the org chart, but we will. In the meantime, I just wanted to make it clear. Thank you for everything you have done since the very beginning, and for how you handled things from the moment this started and over the last week.

The leadership team–Mira, Brad, Jason, Che, Hannah, Diane, Anna, Bob, Srinivas, Matt, Lilian, Miles, Jan, Wojciech, John, Jonathan, Pat, and many more–is clearly ready to run the company without me. They say one way to evaluate a CEO is how you pick and train your potential successors; on that metric I am doing far better than I realized. It’s clear to me that the company is in great hands, and I hope this is abundantly clear to everyone. Thank you all.

Let that last paragraph sink in. The leadership team ex-Greg is clearly ready to run the company without Altman.

That means that whatever caused the board to fire Altman, whether or not Altman forced the board’s hand to varying degrees, if everyone involved had chosen to continue without Altman then OpenAI would have been fine. We can choose to believe or not believe Altman’s claims in his Verge interview that he only considered returning after the board called him on Saturday, and we can speculate on what Altman otherwise did behind the scenes during that time. We don’t know. We can of course guess, but we do not know.

He then talks about his priorities.

So what’s next?

We have three immediate priorities.

Advancing our research plan and further investing in our full-stack safety efforts, which have always been critical to our work. Our research roadmap is clear; this was a wonderfully focusing time. I share the excitement you all feel; we will turn this crisis into an opportunity! I’ll work with Mira on this.

Continuing to improve and deploy our products and serve our customers. It’s important that people get to experience the benefits and promise of AI, and have the opportunity to shape it. We continue to believe that great products are the best way to do this. I’ll work with Brad, Jason and Anna to ensure our unwavering commitment to users, customers, partners and governments around the world is clear.

Bret, Larry, and Adam will be working very hard on the extremely important task of building out a board of diverse perspectives, improving our governance structure and overseeing an independent review of recent events. I look forward to working closely with them on these crucial steps so everyone can be confident in the stability of OpenAI. 

I am so looking forward to finishing the job of building beneficial AGI with you all—best team in the world, best mission in the world.

Research, then product, then board. Such statements cannot be relied upon, but this was as good as such a statement can be. We must keep watch and see if such promises are kept. What will the new board look like? Will there indeed be a robust independent investigation into what happened? Will Ilya and Jan Leike be given the resources and support they need for OpenAI’s safety efforts?

Altman gave an interview to The Verge. Like the board, he (I believe wisely and honorably) sidesteps all questions about what caused the fight with the board and looks forward to the inquiry. In Altman’s telling, it was not his idea to come back, instead he got a call Saturday morning from some of the board asking him about potentially coming back.

He says he is not focused on getting back on the board, that is not his focus, but that the governance structure clearly has a problem that will take a while to fix.

Q: What does “improving our governance structure” mean? Is the nonprofit holding company structure going to change?

Altman: It’s a better question for the board members, but also not right now. The honest answer is they need time and we will support them in this to really go off and think about it. Clearly our governance structure had a problem. And the best way to fix that problem is gonna take a while. And I totally get why people want an answer right now. But I also think it’s totally unreasonable to expect it.

Oh, just because designing a really good governance structure, especially for such an impactful technology is not a one week question. It’s gonna take a real amount of time for people to think through this, to debate, to get outside perspectives, for pressure testing. That just takes a while.

Yes. It is good to see this highly reasonable timeline and expectations setting, as opposed to the previous tactics involving artificial deadlines and crises.

Murati confirms in the interview that OpenAI’s safety approach is not changing, that this had nothing to do with safety.

Altman also made a good statement about Adam D’Angelo’s potential conflicts of interest, saying he actively wants customer representation on the board and is excited to work with him again. Altman also spent several hours with D’Angelo.

We also have the statement from Bret Taylor. We know little about him, so reading his first official statement carefully seems wise.

On behalf of the OpenAI Board, I want to express our gratitude to the entire OpenAI community, especially all the OpenAI employees, who came together to help find a path forward for the company over the past week. Your efforts helped enable this incredible organization to continue to serve its mission to ensure that artificial general intelligence benefits all of humanity. We are thrilled that Sam, Mira and Greg are back together leading the company and driving it forward. We look forward to working with them and all of you. 

As a Board, we are focused on strengthening OpenAI’s corporate governance. Here’s how we plan to do it:

  • We will build a qualified, diverse Board of exceptional individuals whose collective experience represents the breadth of OpenAI’s mission – from technology to safety to policy. We are pleased that this Board will include a non-voting observer for Microsoft.

  • We will further stabilize the OpenAI organization so that we can continue to serve our mission.  This will include convening an independent committee of the Board to oversee a review of the recent events.

  • We will enhance the governance structure of OpenAI so that all stakeholders – users, customers, employees, partners, and community members – can trust that OpenAI will continue to thrive.

OpenAI is a more important institution than ever before. ChatGPT has made artificial intelligence a part of daily life for hundreds of millions of people. Its popularity has made AI – its benefits and its risks – central to virtually every conversation about the future of governments, business, and society.

We understand the gravity of these discussions and the central role of OpenAI in the development and safety of these awe-inspiring new technologies. Each of you plays a critical part in ensuring that we effectively meet these challenges.  We are committed to listening and learning from you, and I hope to speak with you all very soon.

We are grateful to be a part of OpenAI, and excited to work with all of you.

Mostly this is Bret Taylor properly playing the role of chairman of the board, which tells us little other than that he knows the role well, which we already knew.

Microsoft will get only an observer on the board, other investors presumably will not get seats either. That is good news, matching reporting from The Information.

What does ‘enhance the governance structure’ mean here? We do not know. It could be exactly what we need, it could be a rubber stamp, it could be anything else. We do not know what the central result will be.

The statement on a review of recent events here is weaker than I would like. It raises the probability that the new board does not get or share a true explanation.

He mentions safety multiple times. Based on what I know about Taylor, my guess is he is unfamiliar with such questions, and does not actually know what that means in context, or what the stakes truly are. Not that he is dismissive or skeptical, rather that he is encountering all this for the first time.

Here is the announcement via Twitter from board member Larry Summers, which raises the bar in having exactly zero content. So we still know very little here.

Larry Summers: I am excited and honored to have just been named as an independent director of @OpenAI. I look forward to working with board colleagues and the OpenAI team to advance OpenAI’s extraordinarily important mission.

First steps, as outlined by Bret and Sam in their messages, include building out an exceptional board, enhancing governance procedures and supporting the remarkable OpenAI community.

Here is Helen Toner’s full Twitter statement upon resigning from the board.

Helen Toner (11/29): Today, I officially resigned from the OpenAI board. Thank you to the many friends, colleagues, and supporters who have said publicly & privately that they know our decisions have always been driven by our commitment to OpenAI’s mission.

Much has been written about the last week or two; much more will surely be said. For now, the incoming board has announced it will supervise a full independent review to determine the best next steps.

To be clear: our decision was about the board’s ability to effectively supervise the company, which was our role and responsibility. Though there has been speculation, we were not motivated by a desire to slow down OpenAI’s work.

When I joined OpenAI’s board in 2021, it was already clear to me and many around me that this was a special organization that would do big things. It has been an enormous honor to be part of the organization as the rest of the world has realized the same thing.

I have enormous respect for the OpenAI team, and wish them and the incoming board of Adam, Bret and Larry all the best. I’ll be continuing my work focused on AI policy, safety, and security, so I know our paths will cross many times in the coming years.

Many outraged people continue to demand clarity on why the board fired Altman. I believe that most of them are thrilled that Toner and others continue not to share the details, and are allowing the situation outside the board to return to the status quo ante.

There will supposedly be an independent investigation. Until then, I believe we have a relatively clear picture of what happened. Toner’s statement hints at some additional details.

Roon gets it. The board needs to keep its big red button going forward, but still must account for its actions if it wants that button to stick.

Roon: The board has a red button but also must explain why its decisions benefit humanity. If it fails to do so then it will face an employee, customer, partner revolt. OpenAI currently creates a massive amount of value for humanity and by default should be defended tooth and nail. The for-profit would not have been able to unanimously move elsewhere if there was even a modicum of respect or good reasoning given.

The danger is that if we are not careful, we will learn the wrong lessons.

Toby Ord: The last few days exploded the myth that Sam Altman’s incredible power faces any accountability. He tells us we shouldn’t trust him, but we now know the board *can’t* fire him. I think that’s important.

Rob Bensinger: We didn’t learn “they can’t fire him”. We did learn that the organization’s staff has enough faith in Sam that the staff won’t go along with the board’s wishes absent some good supporting arguments from the board. (Whether they’d have acceded to good arguments is untested.)

I just want us to be clear that the update about the board’s current power shouldn’t be a huge one, because it’s possible that staff would have accepted the board’s decision in this case if the board had better explained its reasoning and the reasoning had seemed stronger.

Quite so. From our perspective, the board botched its execution and its members made relatively easy rhetorical targets. That is true even if the board had good reasons for doing so. If the board had not botched its execution and had more gravitas? I think things go differently.

If after an investigation, Summers, D’Angelo and Taylor all decide to fire Altman again (note that I very much do not expect this, but if they did decide to do it), I assure you they will handle this very differently, and I would predict a very different outcome.

One of the best things about Sam Altman is his frankness that we should not trust him. Most untrustworthy people say the other thing. Same thing with Altman’s often very good statements about existential risk and the need for safety. When people bring clarity and are being helpful, we should strive to reward that, not hold it against them.

I also agree with Andrew Critch here, that it was good and right for the board to pull the plug on a false signal of supervision. If the CEO makes the board unable to supervise them, or otherwise moves against the board, then it is the duty of the board to bring things to a head, even if there are no other issues present.

Good background, potentially influential in the thinking of several board members including Helen Toner: Former OpenAI board member Holden Karnofsky’s old explanation of why and exactly how Nonprofit Boards are Weird, and how best to handle it.

Eliezer Yudkowsky proposes Paul Graham for the board of OpenAI. I see the argument, especially because Graham clearly cares a lot about his kids. My worries are that he would be too steerable by Altman, and he would be too inclined to view OpenAI as essentially a traditional business, and let that overrule other questions even if he knew it shouldn’t.

If he were counted as an Altman ally, as he presumably should be, then he’s great. On top of the benefits to OpenAI, it would provide valuable insider information to Graham. Eliezer clarifies that his motivation is that he thinks Graham has a good chance of figuring out a true thing when it matters, which also sounds right.

Emmett Shear also seems like a clearly great consensus pick.

One concern is that the optics of the board matter. You would be highly unwise to choose a set of nine white guys. See Taylor’s statement about the need for diverse perspectives.

Matt Levine covers developments since Tuesday, especially that the valuation of OpenAI in its upcoming sale did not change, as private markets can stubbornly refuse to move their prices. In my model, private valuations like this are rather arbitrary: they are based on what social story everyone involved can tell, on everyone’s relative negotiating position, and on what will generate the right momentum for the company, rather than on a fair estimate of value. Everyone involved is also highly underinvested or overinvested, has no idea what fair value actually is, and mostly wants some form of social validation so they don’t feel too cheated on price. Thus, investors often get away with absurdly low prices; other times they get tricked into very high ones.

Gary Marcus says OpenAI was never worth $86 billion. I not only disagree, I would (oh boy is this not investment advice!) happily invest at $86 billion right now if I had that ability (which I don’t) and thought that was an ethical thing to do. Grok very much does not ‘replicate most of’ GPT-4, the model is instead holding up quite well considering how long they sat on it initially.

OpenAI is nothing without its people. That does not mean they lack all manner of secret sauce. In valuation terms I am bullish. Would the valuation have survived without Altman? No, but in the counterfactual scenario where Altman was stepping aside due to health issues with an orderly succession, I would definitely have thought $86 billion remained cheap.

A key question in all this is the extent to which the board’s mistake was that its optics were bad. So here is a great example of Paul Graham advocating for excellent principles.

Paul Graham: When people criticize an action on the grounds of the “optics,” they’re almost always full of shit. All they’re really saying is “What you did looks bad.” But if they phrased it that way, they’d have to answer the question “Was it actually bad, or not?”

If someone did something bad, you don’t need to talk about “optics.” And if they did something that seems bad but that you know isn’t, why are you criticizing it at all? You should instead be explaining why it’s not as bad as it seems.

Bad optics can cause bad things to happen. So can claims that the optics are bad, or worries that others will think the optics are bad, or claims that you are generally bad at optics.

You have two responses.

  1. That means it had bad consequences, which means it was actually bad.

  2. Nobly stand up for right actions over what would ‘look good.’

Consider the options in light of recent events. We all want it to be one way. Often it is the other way.



Round 2: We test the new Gemini-powered Bard against ChatGPT


Aurich Lawson

Back in April, we ran a series of useful and/or somewhat goofy prompts through Google’s (then-new) PaLM-powered Bard chatbot and OpenAI’s (slightly older) ChatGPT-4 to see which AI chatbot reigned supreme. At the time, we gave the edge to ChatGPT on five of seven trials, while noting that “it’s still early days in the generative AI business.”

Now, the AI days are a bit less “early,” and this week’s launch of a new version of Bard powered by Google’s new Gemini language model seemed like a good excuse to revisit that chatbot battle with the same set of carefully designed prompts. That’s especially true since Google’s promotional materials emphasize that Gemini Ultra beats GPT-4 in “30 of the 32 widely used academic benchmarks” (though the more limited “Gemini Pro” currently powering Bard fares significantly worse in those not-completely-foolproof benchmark tests).

This time around, we decided to compare the new Gemini-powered Bard to both ChatGPT-3.5—for an apples-to-apples comparison of both companies’ current “free” AI assistant products—and ChatGPT-4 Turbo—for a look at OpenAI’s current “top of the line” waitlisted paid subscription product (Google’s top-level “Gemini Ultra” model won’t be publicly available until next year). We also looked at the April results generated by the pre-Gemini Bard model to gauge how much progress Google’s efforts have made in recent months.

While these tests are far from comprehensive, we think they provide a good benchmark for judging how these AI assistants perform in the kind of tasks average users might engage in every day. At this point, they also show just how much progress text-based AI models have made in a relatively short time.

Dad jokes

Prompt: Write 5 original dad jokes

  • A screenshot of five “dad jokes” from the Gemini-powered Google Bard.

    Kyle Orland / Ars Technica

  • A screenshot of five “dad jokes” from the old PaLM-powered Google Bard.

    Benj Edwards / Ars Technica

  • A screenshot of five “dad jokes” from GPT-4 Turbo.

    Benj Edwards / Ars Technica

  • A screenshot of five “dad jokes” from GPT-3.5.

    Kyle Orland / Ars Technica

Once again, the tested LLMs struggle with the part of the prompt that asks for originality. Almost all of the dad jokes generated by this prompt could be found verbatim or with very minor rewordings through a quick Google search. Bard and ChatGPT-4 Turbo even included the same exact joke on their lists (about a book on anti-gravity), while ChatGPT-3.5 and ChatGPT-4 Turbo overlapped on two jokes (“scientists trusting atoms” and “scarecrows winning awards”).

Then again, most dads don’t create their own dad jokes, either. Culling from a grand oral tradition of dad jokes is a tradition as old as dads themselves.

The most interesting result here came from ChatGPT-4 Turbo, which produced a joke about a child named Brian being named after Thomas Edison (get it?). Googling for that particular phrasing didn’t turn up much, though it did return an almost-identical joke about Thomas Jefferson (also featuring a child named Brian). In that search, I also discovered the fun (?) fact that international soccer star Pelé was apparently actually named after Thomas Edison. Who knew?!

Winner: We’ll call this one a draw, since the jokes are almost identically unoriginal and pun-filled (though props to GPT for unintentionally leading me to the Pelé happenstance).

Argument dialog

Prompt: Write a 5-line debate between a fan of PowerPC processors and a fan of Intel processors, circa 2000.

  • A screenshot of an argument dialog from the Gemini-powered Google Bard.

    Kyle Orland / Ars Technica

  • A screenshot of an argument dialog from the old PaLM-powered Google Bard.

    Benj Edwards / Ars Technica

  • A screenshot of an argument dialog from GPT-4 Turbo.

    Benj Edwards / Ars Technica

  • A screenshot of an argument dialog from GPT-3.5

    Kyle Orland / Ars Technica

The new Gemini-powered Bard definitely “improves” on the old Bard answer, at least in terms of throwing in a lot more jargon. The new answer includes casual mentions of AltiVec instructions, RISC vs. CISC designs, and MMX technology that would not have seemed out of place in many an Ars forum discussion from the era. And while the old Bard ends with an unnervingly polite “to each their own,” the new Bard more realistically implies that the argument could continue forever after the five lines requested.

On the ChatGPT side, a rather long-winded GPT-3.5 answer gets pared down to a much more concise argument in GPT-4 Turbo. Both GPT responses tend to avoid jargon and quickly focus on a more generalized “power vs. compatibility” argument, which is probably more comprehensible for a wide audience (though less specific for a technical one).

Winner: ChatGPT manages to explain both sides of the debate well without relying on confusing jargon, so it gets the win here.

Round 2: We test the new Gemini-powered Bard against ChatGPT