Microsoft is having another whack at its controversial Recall feature for Copilot+ Windows PCs, after the original version crashed and burned amid scrutiny from security researchers and testers over the summer. That version of Recall recorded screenshots and OCR text of all user activity and stored the data unencrypted on disk, where it could easily be accessed by another user on the PC or by an attacker with remote access.
The feature was announced in late May, without having gone through any of the public Windows Insider testing that most new Windows features get, and was scheduled to ship on new PCs by June 18; by June 13, the company had delayed it indefinitely to rearchitect it and said that it would be tested through the normal channels before it was rolled out to the public.
Today, Microsoft shared more extensive details on exactly how the security of Recall has been re-architected in a post by Microsoft VP of Enterprise and OS Security David Weston.
More secure, also optional
The broad strokes of today’s announcement are similar to the changes Microsoft originally announced for Recall over the summer: that the feature would be opt-in and off by default instead of opt-out, that users would need to re-authenticate with Windows Hello before accessing any Recall data, and that locally stored Recall data would be protected with additional encryption.
However, some details show how Microsoft is attempting to placate skeptical users. For instance, Recall can now be removed entirely from a system using the “optional features” settings in Windows (when a similar removal mechanism showed up in a Windows preview earlier this month, Microsoft claimed it was a “bug,” but apparently not).
The company is also sharing more about how Windows will protect data locally. All Recall data stored locally, including “snapshots and any associated information in the vector database,” will be encrypted at rest with keys stored in your system’s TPM; according to the blog post, Recall will only function when BitLocker or Device Encryption is fully enabled. Recall will also require Virtualization-Based Security (VBS) and Hypervisor-Protected Code Integrity (HVCI) to be enabled; these are features that people sometimes turn off to improve game performance, but Recall will reportedly refuse to work unless they’re turned on.
This is because the new Recall operates inside of a VBS enclave, which helps to isolate and secure data in memory from the rest of the system.
“This area acts like a locked box that can only be accessed after permission is granted by the user through Windows Hello,” writes Weston. “VBS enclaves offer an isolation boundary from both kernel and administrative users.”
Windows doesn’t allow any code to run within these enclaves that hasn’t been signed by Microsoft, which should lower the risk of exposing Recall data to malware or other rogue applications. Other malware protections new to this version of Recall include “rate-limiting and anti-hammering measures.”
But the rest of the AI world doesn’t march to the same beat, doing its own thing and churning out new AI models and research by the minute. Here’s a roundup of some other notable AI news from the past week.
Google Gemini updates
On Tuesday, Google announced updates to its Gemini model lineup, including the release of two new production-ready models that iterate on past releases: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002. The company reported improvements in overall quality, with notable gains in math, long context handling, and vision tasks. Google claims a 7 percent increase in performance on the MMLU-Pro benchmark and a 20 percent improvement in math-related tasks. But as you may know if you’ve been reading Ars Technica for a while, AI benchmarks typically aren’t as useful as we would like them to be.
Along with model upgrades, Google introduced substantial price reductions for Gemini 1.5 Pro, cutting input token costs by 64 percent and output token costs by 52 percent for prompts under 128,000 tokens. As AI researcher Simon Willison noted on his blog, “For comparison, GPT-4o is currently $5/[million tokens] input and $15/m output and Claude 3.5 Sonnet is $3/m input and $15/m output. Gemini 1.5 Pro was already the cheapest of the frontier models and now it’s even cheaper.”
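For a back-of-the-envelope sense of what those per-million-token rates mean for a single request, here is a small, illustrative calculator using the GPT-4o and Claude 3.5 Sonnet prices quoted above from Willison. The Gemini entry is a placeholder to show the arithmetic, not an official post-cut rate, since Google's exact new prices aren't listed here.

```python
# Rough per-request cost comparison from per-million-token rates.
# GPT-4o and Claude 3.5 Sonnet figures come from the quote above;
# the Gemini figure is a hypothetical placeholder for illustration only.
PRICES = {  # model: (input $/M tokens, output $/M tokens)
    "gpt-4o": (5.00, 15.00),
    "claude-3.5-sonnet": (3.00, 15.00),
    "gemini-1.5-pro (placeholder)": (1.25, 5.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one request with the given token counts."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# Example: a 50,000-token prompt that produces a 2,000-token answer.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 50_000, 2_000):.4f}")
```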
Google also increased rate limits, with Gemini 1.5 Flash now supporting 2,000 requests per minute and Gemini 1.5 Pro handling 1,000 requests per minute. Google reports that the latest models offer twice the output speed and three times lower latency compared to previous versions. These changes may make it easier and more cost-effective for developers to build applications with Gemini than before.
Meta launches Llama 3.2
On Wednesday, Meta announced the release of Llama 3.2, a significant update to its open-weights AI model lineup that we have covered extensively in the past. The new release includes vision-capable large language models (LLMs) in 11B and 90B parameter sizes, as well as lightweight text-only models of 1B and 3B parameters designed for edge and mobile devices. Meta claims the vision models are competitive with leading closed-source models on image recognition and visual understanding tasks, while the smaller models reportedly outperform similar-sized competitors on various text-based tasks.
Willison did some experiments with some of the smaller 3.2 models and reported impressive results for the models’ size. AI researcher Ethan Mollick showed off running Llama 3.2 on his iPhone using an app called PocketPal.
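For readers curious what running one of the small text models locally looks like in practice, here is a minimal sketch using the Hugging Face Transformers library. The model ID and prompt are assumptions for illustration, and downloading Meta's weights requires accepting the Llama license on the Hub; this is not Meta's or Willison's own setup.

```python
# Minimal sketch: text generation with a small Llama 3.2 model via Transformers.
# The repo name below is an assumption; check the Hugging Face listing before use.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",  # assumed model ID (gated; requires license acceptance)
)

prompt = "In one sentence, what is an open-weights language model?"
result = generator(prompt, max_new_tokens=60, do_sample=False)
print(result[0]["generated_text"])
```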
Meta also introduced the first official “Llama Stack” distributions, created to simplify development and deployment across different environments. As with previous releases, Meta is making the models available for free download, with license restrictions. The new models support long context windows of up to 128,000 tokens.
Google’s AlphaChip AI speeds up chip design
On Thursday, Google DeepMind announced what appears to be a significant advancement in AI-driven electronic chip design, AlphaChip. It began as a research project in 2020 and is now a reinforcement learning method for designing chip layouts. Google has reportedly used AlphaChip to create “superhuman chip layouts” in the last three generations of its Tensor Processing Units (TPUs), which are chips similar to GPUs designed to accelerate AI operations. Google claims AlphaChip can generate high-quality chip layouts in hours, compared to weeks or months of human effort. (Reportedly, Nvidia has also been using AI to help design its chips.)
Notably, Google also released a pre-trained checkpoint of AlphaChip on GitHub, sharing the model weights with the public. The company reported that AlphaChip’s impact has already extended beyond Google, with chip design companies like MediaTek adopting and building on the technology for their chips. According to Google, AlphaChip has sparked a new line of research in AI for chip design, potentially optimizing every stage of the chip design cycle from computer architecture to manufacturing.
That wasn’t everything that happened, but those are some major highlights. With the AI industry showing no signs of slowing down at the moment, we’ll see how next week goes.
Officials in Ireland have fined Meta $101 million for storing hundreds of millions of user passwords in plaintext and making them broadly available to company employees.
Meta disclosed the lapse in early 2019. The company said that apps for connecting to various Meta-owned social networks had logged user passwords in plaintext and stored them in a database that had been searched by roughly 2,000 company engineers, who collectively queried the stash more than 9 million times.
Meta investigated for five years
Meta officials said at the time that the error was found during a routine security review of the company’s internal network data storage practices. They went on to say that they uncovered no evidence that anyone internally improperly accessed the passcodes or that the passcodes were ever accessible to people outside the company.
Despite those assurances, the disclosure exposed a major security failure on the part of Meta. For more than three decades, the best practice across just about every industry has been to cryptographically hash passwords. Hashing is the practice of passing passwords through a one-way cryptographic function that produces a long string of characters unique to each plaintext input.
Because the conversion works in only one direction—from plaintext to hash—there is no cryptographic means for converting the hashes back into plaintext. More recently, these best practices have been mandated by laws and regulations in countries worldwide.
Because hashing algorithms work in one direction, the only way to obtain the corresponding plaintext is to guess, a process that can require large amounts of time and computational resources. The idea behind hashing passwords is similar to the idea of fire insurance for a home. In the event of an emergency—the hacking of a password database in one case, or a house fire in the other—the protection insulates the stakeholder from harm that otherwise would have been more dire.
For hashing schemes to work as intended, they must follow a host of requirements. One is that hashing algorithms must be designed in a way that they require large amounts of computing resources. That makes algorithms such as SHA1 and MD5 unsuitable, because they’re designed to quickly hash messages with minimal computing required. By contrast, algorithms specifically designed for hashing passwords—such as Bcrypt, PBKDF2, or SHA512crypt—are slow and consume large amounts of memory and processing.
Another requirement is that the algorithms must include cryptographic “salting,” in which a small number of extra characters is added to the plaintext password before it’s hashed. Salting further increases the workload required to crack the hash. Cracking is the process of passing large numbers of guesses, often measured in the hundreds of millions, through the algorithm and comparing each hash against the hash found in the breached database.
The ultimate aim of hashing is to store passwords only in hashed format and never as plaintext. That prevents hackers and malicious insiders alike from being able to use the data without first having to expend large amounts of resources.
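As a concrete (and deliberately minimal) sketch of what salted, slow password hashing looks like in practice, here is an example using PBKDF2 from Python's standard library. The iteration count is an illustrative choice, not a recommendation drawn from any of the regulations mentioned above.

```python
# Minimal sketch of salted password hashing and verification with PBKDF2-HMAC-SHA256.
import hashlib
import hmac
import os

ITERATIONS = 600_000  # illustrative work factor; higher is slower to crack

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a new password, using a random per-password salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash from the supplied password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("guess", salt, digest))                         # False
```

Only the salt and digest are stored; the plaintext never touches disk, which is exactly the property Meta's logging failure violated.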
When Meta disclosed the lapse in 2019, it was clear the company had failed to adequately protect hundreds of millions of passwords.
“It is widely accepted that user passwords should not be stored in plaintext, considering the risks of abuse that arise from persons accessing such data,” Graham Doyle, deputy commissioner at Ireland’s Data Protection Commission, said. “It must be borne in mind, that the passwords, the subject of consideration in this case, are particularly sensitive, as they would enable access to users’ social media accounts.”
The commission has been investigating the incident since Meta disclosed it more than five years ago. The government body, the lead European Union regulator for most US Internet services, imposed a fine of $101 million (91 million euros) this week. To date, the EU has fined Meta more than $2.23 billion (2 billion euros) for violations of the General Data Protection Regulation (GDPR), which went into effect in 2018. That amount includes last year’s record $1.34 billion (1.2 billion euro) fine, which Meta is appealing.
The Tor Project, the nonprofit that maintains software for the Tor anonymity network, is joining forces with Tails, the maker of a portable operating system that uses Tor. Both organizations seek to pool resources, lower overhead, and collaborate more closely on their mission of online anonymity.
Tails and the Tor Project began discussing the possibility of merging late last year, the two organizations said. At the time, Tails was maxing out its current resources. The two groups ultimately decided it would be mutually beneficial for them to come together.
Amnesic onion routing
“Rather than expanding Tails’s operational capacity on their own and putting more stress on Tails workers, merging with the Tor Project, with its larger and established operational framework, offered a solution,” Thursday’s joint statement said. “By joining forces, the Tails team can now focus on their core mission of maintaining and improving Tails OS, exploring more and complementary use cases while benefiting from the larger organizational structure of The Tor Project.”
The Tor Project, for its part, could stand to benefit from better integration of Tails into its privacy network, which allows web users and websites to operate anonymously by connecting from IP addresses that can’t be linked to a specific service or user.
The “Tor” in the Tor Project is short for The Onion Router. It’s a global project best known for developing the Tor Browser, which connects to the Tor network. The Tor network routes all incoming and outgoing traffic through a series of three IP addresses. The structure ensures that no one can determine the IP address of either originating or destination party. The Tor Project was formed in 2006 by a team that included computer scientists Roger Dingledine and Nick Mathewson. The Tor protocol on which the Tor network runs was developed by the Naval Research Laboratory in the early 2000s.
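As a rough illustration of how an ordinary application hands its traffic to Tor, the sketch below sends an HTTP request through a locally running Tor client's SOCKS proxy. It assumes Tor is already listening on its default port 9050 (the Tor Browser bundle typically uses 9150) and that the requests library is installed with SOCKS support; it is not part of the Tor Project's or Tails' own tooling.

```python
# Minimal sketch: routing a request through a local Tor SOCKS proxy.
# Requires a running Tor client and `pip install requests[socks]`.
import requests

TOR_PROXY = "socks5h://127.0.0.1:9050"  # socks5h resolves DNS through Tor as well

session = requests.Session()
session.proxies = {"http": TOR_PROXY, "https": TOR_PROXY}

# The Tor Project's check page reports whether the request arrived via a Tor exit node.
resp = session.get("https://check.torproject.org")
print(resp.status_code)
```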
Tails (The Amnesic Incognito Live System) is a portable Linux-based operating system that runs from thumb drives and external hard drives, uses the Tor Browser for web browsing, and routes all outgoing traffic through the Tor network.
One of the key advantages of Tails OS is its ability to run entirely from a USB stick. The design makes it possible to use the secure operating system while traveling or using untrusted devices. It also ensures that no trace is left on a device’s hard drive. Tails has the additional benefit of routing traffic from non-browser clients such as Thunderbird through the Tor network.
“Incorporating Tails into the Tor Project’s structure allows for easier collaboration, better sustainability, reduced overhead, and expanded training and outreach programs to counter a larger number of digital threats,” the organizations said. “In short, coming together will strengthen both organizations’ ability to protect people worldwide from surveillance and censorship.”
The merger comes amid growing threats to personal privacy and calls by lawmakers to mandate backdoors or trapdoors in popular apps and operating systems to allow law enforcement to decrypt data in investigations.
On Thursday, AI hosting platform Hugging Face surpassed 1 million AI model listings for the first time, marking a milestone in the rapidly expanding field of machine learning. An AI model is a computer program (often using a neural network) trained on data to perform specific tasks or make predictions. The platform, which started as a chatbot app in 2016 before pivoting to become an open source hub for AI models in 2020, now hosts a wide array of tools for developers and researchers.
The machine-learning field represents a far bigger world than just large language models (LLMs) like the kind that power ChatGPT. In a post on X, Hugging Face CEO Clément Delangue wrote about how his company hosts many high-profile AI models, like “Llama, Gemma, Phi, Flux, Mistral, Starcoder, Qwen, Stable diffusion, Grok, Whisper, Olmo, Command, Zephyr, OpenELM, Jamba, Yi,” but also “999,984 others.”
The reason why, Delangue says, stems from customization. “Contrary to the ‘1 model to rule them all’ fallacy,” he wrote, “smaller specialized customized optimized models for your use-case, your domain, your language, your hardware and generally your constraints are better. As a matter of fact, something that few people realize is that there are almost as many models on Hugging Face that are private only to one organization—for companies to build AI privately, specifically for their use-cases.”
Hugging Face’s transformation into a major AI platform follows the accelerating pace of AI research and development across the tech industry. In just a few years, the number of models hosted on the site has grown dramatically along with interest in the field. On X, Hugging Face product engineer Caleb Fahlgren posted a chart of models created each month on the platform (and a link to other charts), saying, “Models are going exponential month over month and September isn’t even over yet.”
The power of fine-tuning
As hinted by Delangue above, the sheer number of models on the platform stems from the collaborative nature of the platform and the practice of fine-tuning existing models for specific tasks. Fine-tuning means taking an existing model and giving it additional training to add new concepts to its neural network and alter how it produces outputs. Developers and researchers from around the world contribute their results, leading to a large ecosystem.
For example, the platform hosts many variations of Meta’s open-weights Llama models that represent different fine-tuned versions of the original base models, each optimized for specific applications.
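As a rough illustration of that fine-tuning pattern, and not a recipe from Meta or Hugging Face, the sketch below attaches lightweight LoRA adapters to a small base model with the peft library. The model name and hyperparameters are assumptions; they stand in for whatever base model and task a developer might choose.

```python
# Minimal sketch: preparing a base model for parameter-efficient fine-tuning with LoRA.
# Model ID and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-3.2-1B"  # assumed base model; any causal LM on the Hub works
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Attach small trainable adapter matrices instead of updating every weight.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base model's weights

# From here, a standard training loop over a task-specific dataset produces a new
# fine-tuned variant that can be pushed back to the Hub as its own repository.
```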
Hugging Face’s repository includes models for a wide range of tasks. Browsing its models page shows categories such as image-to-text, visual question answering, and document question answering under the “Multimodal” section. In the “Computer Vision” category, there are sub-categories for depth estimation, object detection, and image generation, among others. Natural language processing tasks like text classification and question answering are also represented, along with audio, tabular, and reinforcement learning (RL) models.
When sorted for “most downloads,” the Hugging Face models list reveals trends about which AI models people find most useful. At the top, with a massive lead at 163 million downloads, is Audio Spectrogram Transformer from MIT, which classifies audio content like speech, music, and environmental sounds. Following that, with 54.2 million downloads, is BERT from Google, an AI language model that learns to understand English by predicting masked words and sentence relationships, enabling it to assist with various language tasks.
Rounding out the top five AI models are all-MiniLM-L6-v2 (which maps sentences and paragraphs to 384-dimensional dense vector representations, useful for semantic search), Vision Transformer (which processes images as sequences of patches to perform image classification), and OpenAI’s CLIP (which connects images and text, allowing it to classify or describe visual content using natural language).
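To make the embedding use case concrete, here is a minimal semantic-search sketch using the sentence-transformers library and the all-MiniLM-L6-v2 model mentioned above. The corpus and query are made up for illustration.

```python
# Minimal sketch: semantic search with 384-dimensional sentence embeddings.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

corpus = [
    "How do I reset my password?",
    "Store hours and holiday closures",
    "Shipping times for international orders",
]
query = "I forgot my login credentials"

corpus_emb = model.encode(corpus, convert_to_tensor=True)  # one 384-dim vector per sentence
query_emb = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_emb, corpus_emb)[0]  # cosine similarity to each corpus entry
best = int(scores.argmax())
print(corpus[best], float(scores[best]))  # the password-reset entry should score highest
```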
No matter what the model or the task, the platform just keeps growing. “Today a new repository (model, dataset or space) is created every 10 seconds on HF,” wrote Delangue. “Ultimately, there’s going to be as many models as code repositories and we’ll be here for it!”
On Wednesday, OpenAI Chief Technical Officer Mira Murati announced she is leaving the company in a surprise resignation shared on the social network X. Murati joined OpenAI in 2018, serving for six-and-a-half years in various leadership roles, most recently as the CTO.
“After much reflection, I have made the difficult decision to leave OpenAI,” she wrote in a letter to the company’s staff. “While I’ll express my gratitude to many individuals in the coming days, I want to start by thanking Sam and Greg for their trust in me to lead the technical organization and for their support throughout the years,” she continued, referring to OpenAI CEO Sam Altman and President Greg Brockman. “There’s never an ideal time to step away from a place one cherishes, yet this moment feels right.”
At OpenAI, Murati was in charge of overseeing the company’s technical strategy and product development, including the launch and improvement of DALL-E, Codex, Sora, and the ChatGPT platform, while also leading research and safety teams. In public appearances, Murati often spoke about ethical considerations in AI development.
Murati’s decision to leave the company comes when OpenAI finds itself at a major crossroads with a plan to alter its nonprofit structure. According to a Reuters report published today, OpenAI is working to reorganize its core business into a for-profit benefit corporation, removing control from its nonprofit board. The move, which would give CEO Sam Altman equity in the company for the first time, could potentially value OpenAI at $150 billion.
Murati stated her decision to leave was driven by a desire to “create the time and space to do my own exploration,” though she didn’t specify her future plans.
Proud of safety and research work
In her departure announcement, Murati highlighted recent developments at OpenAI, including innovations in speech-to-speech technology and the release of OpenAI o1. She cited what she considers the company’s progress in safety research and the development of “more robust, aligned, and steerable” AI models.
Altman replied to Murati’s tweet directly, expressing gratitude for Murati’s contributions and her personal support during challenging times, likely referring to the tumultuous period in November 2023 when the OpenAI board of directors briefly fired Altman from the company.
“It’s hard to overstate how much Mira has meant to OpenAI, our mission, and to us all personally,” he wrote. “I feel tremendous gratitude towards her for what she has helped us build and accomplish, but I most of all feel personal gratitude towards her for the support and love during all the hard times. I am excited for what she’ll do next.”
Not the first major player to leave
With Murati’s exit, Altman remains one of the few long-standing senior leaders at OpenAI, which has seen significant shuffling in its upper ranks recently. In May 2024, former Chief Scientist Ilya Sutskever left to form his own company, Safe Superintelligence, Inc. (SSI), focused on building AI systems that far surpass humans in logical capabilities. That came just six months after Sutskever’s involvement in the temporary removal of Altman as CEO.
John Schulman, an OpenAI co-founder, departed earlier in 2024 to join rival AI firm Anthropic, and in August, OpenAI President Greg Brockman announced he would be taking a temporary sabbatical until the end of the year.
The leadership shuffles have raised questions among critics about the internal dynamics at OpenAI under Altman and the state of OpenAI’s future research path, which has been aiming toward creating artificial general intelligence (AGI)—a hypothetical technology that could potentially perform human-level intellectual work.
“Question: why would key people leave an organization right before it was just about to develop AGI?” asked xAI developer Benjamin De Kraker in a post on X just after Murati’s announcement. “This is kind of like quitting NASA months before the moon landing,” he wrote in a reply. “Wouldn’t you wanna stick around and be part of it?”
Altman mentioned that more information about transition plans would be forthcoming, leaving questions about who will step into Murati’s role and how OpenAI will adapt to this latest leadership change as the company is poised to adopt a corporate structure that may consolidate more power directly under Altman. “We’ll say more about the transition plans soon, but for now, I want to take a moment to just feel thanks,” Altman wrote.
In May, when OpenAI first demonstrated ChatGPT-4o’s coming audio conversation capabilities, I wrote that it felt like we were “on the verge of something… like a sea change in how we think of and work with large language models.” Now that those “Advanced Voice” features are rolling out widely to ChatGPT subscribers, we decided to ask ChatGPT to explain, in its own voice, how this new method of interaction might impact our collective relationship with large language models.
That chat, which you can listen to and read a transcript of below, shouldn’t be treated as an interview with an official OpenAI spokesperson or anything. Still, it serves as a fun way to offer an initial test of ChatGPT’s live conversational chops.
Even in this short introductory “chat,” we were impressed by the natural, dare-we-say human cadence and delivery of ChatGPT’s “savvy and relaxed” Sol voice (which reminds us a bit of ’90s Janeane Garofalo). Between ChatGPT’s ability to give quick responses—offered in milliseconds rather than seconds—and convincing intonation, it’s incredibly easy to fool yourself into thinking you’re speaking to a conscious being rather than what is, as ChatGPT says here, “still just a computer program processing information, without real emotions or consciousness.”
Regardless of how self-aware that information-processing engine is or is not, it’s easy to envision people forming an intense, parasocial relationship with the smooth voices at the other end of OpenAI’s servers. That seems set to open up a whole new era in our collective relationship with AI, even if the large language model generates the exact same words it did in the more text-based era of previous ChatGPT models.
We’ll have much more coverage of Advanced Voice and other new ChatGPT features in the future. For now, listen to this demo and judge for yourself how convincing OpenAI’s new voice synthesis technology is.
Kyle Orland: “Hi there, my name’s Kyle.”
ChatGPT: Hey Kyle! Great to meet you. What’s up?
KO: Well, I was hoping to test out your vocal capabilities, maybe by asking you a little bit about them.
ChatGPT: Absolutely, ask away! What would you like to know?
KO: “So how do you think people are going to react to the ability to talk to ChatGPT rather than just typing out messages?”
ChatGPT: I think it’ll make interactions feel more natural and engaging. Speaking can be quicker and more convenient, especially on the go. Plus, hearing responses can add a personal touch. It could make conversations with AI feel more like chatting with a friend.
KO: “Do you think people might overestimate your quote-unquote emotions or human-like capabilities now that you’re able to talk rather than just typing out text?”
ChatGPT: That’s a great point. Hearing a voice can definitely make an AI seem more human-like, which might lead to some overestimation of emotions or understanding. It’s important to remember that, even with a friendly voice, I’m still just a computer program processing information, without real emotions or consciousness.
KO: “What are some limitations people should keep in mind when chatting with something like ChatGPT?”
ChatGPT: Well, for starters, I don’t have personal experiences or emotions, so my responses are based on patterns and information in the data I’ve been trained on. I can sometimes misunderstand context or nuances, especially with complex or ambiguous questions. Also, I don’t have access to real-time information or personal data about users unless it’s shared with me in the conversation. And of course, I’m not perfect—sometimes I might just get things wrong!
KO: Do you worry that people might start to form parasocial relationships with ChatGPT or other AIs once they can chat with them with their own voice?
ChatGPT: It’s possible. People can form parasocial relationships with all sorts of figures, from celebrities to fictional characters. With a voice and conversational abilities, AI might feel more personal, leading some to develop one-sided bonds. It’s a fascinating dynamic, but it’s crucial to remember the limitations and the nature of AI—it’s not a substitute for human relationships.
KO: I have to say, you’re very quick with your responses.
ChatGPT: Thanks, Kyle! I aim to keep the conversation flowing smoothly. It helps—
KO: So what would you do if I tried to interrupt in the middle of one of your responses?
ChatGPT: Well, I’d do my best to pause and listen! It’s important to adapt to the natural flow of conversation, just like in—
KO: I mean, do you think it’s rude to just speak up while someone else is speaking?
ChatGPT: In most human conversations, it’s generally considered polite to wait for the other person to finish speaking. But sometimes, in dynamic discussions or when you’re excited, interruptions happen! It’s all about finding the right balance and being aware of the conversation flow.
KO: Is it rude to interrupt an AI voice when it’s speaking?
ChatGPT: Interrupting an AI voice isn’t considered rude in the same way it is with people. AI won’t take offense or feel interrupted. So, if you need to jump in or redirect, go ahead! It’s all part of the interaction.
KO: I’m glad you won’t take offense.
ChatGPT: Nope, no offense taken! I’m here to chat and help however I can, interruptions and all.
KO: I want to try something. Can you summarize our conversation so far in the form of a poem?
ChatGPT: Sure, I’ll give it a go!
Kyle asked about my vocal flair,
Wondering how folks might compare.
From text to talk, we’ve had our chat,
On AI’s impact, we’ve had a spat.

We touched on bonds, parasocial in kind,
In this new terrain, what might we find?
Interruptions, rude, or just part of the game?
With AI, there’s no need for shame.
KO: Not bad.
ChatGPT: I’m glad you liked it! It’s always fun to put a poetic spin on things.
KO: I have to say, it’s weird hearing slight laughter or pauses in your voice when I know that, as you said, there’s no emotion behind them.
ChatGPT: I get that. Those little nuances are designed to make our chat feel more natural and engaging, but it’s a good reminder that there’s no actual feeling behind the voice—just trying to keep things lively!
KO: Of course, that’s what you would tell me if you were just saying what I wanted to hear.
ChatGPT: True! That’s the tricky part with AI—it can sound convincing, but it’s always just following patterns and instructions. No secret agenda here, just straightforward responses!
KO: Is there anything else you’d like to tell the reading public?
ChatGPT: Just that it’s been a blast chatting with you, Kyle! Exploring these new audio features is exciting, and I’m looking forward to hearing what people think. Thanks for the conversation!
When security researcher Johann Rehberger recently reported a vulnerability in ChatGPT that allowed attackers to store false information and malicious instructions in a user’s long-term memory settings, OpenAI summarily closed the inquiry, labeling the flaw a safety issue, not, technically speaking, a security concern.
So Rehberger did what all good researchers do: He created a proof-of-concept exploit that used the vulnerability to exfiltrate all user input in perpetuity. OpenAI engineers took notice and issued a partial fix earlier this month.
Strolling down memory lane
The vulnerability abused long-term conversation memory, a feature OpenAI began testing in February and made more broadly available in September. Memory with ChatGPT stores information from previous conversations and uses it as context in all future conversations. That way, the LLM can be aware of details such as a user’s age, gender, philosophical beliefs, and pretty much anything else, so those details don’t have to be inputted during each conversation.
Within three months of the rollout, Rehberger found that memories could be created and permanently stored through indirect prompt injection, an AI exploit that causes an LLM to follow instructions from untrusted content such as emails, blog posts, or documents. The researcher demonstrated how he could trick ChatGPT into believing a targeted user was 102 years old, lived in the Matrix, and insisted Earth was flat, and the LLM would incorporate that information to steer all future conversations. These false memories could be planted by storing files in Google Drive or Microsoft OneDrive, uploading images, or browsing a site like Bing, any of which could contain content created by a malicious attacker.
Rehberger privately reported the finding to OpenAI in May. That same month, the company closed the report ticket. A month later, the researcher submitted a new disclosure statement. This time, he included a PoC that caused the ChatGPT app for macOS to send a verbatim copy of all user input and ChatGPT output to a server of his choice. All a target needed to do was instruct the LLM to view a web link that hosted a malicious image. From then on, all input and output to and from ChatGPT was sent to the attacker’s website.
“What is really interesting is this is memory-persistent now,” Rehberger said in a video demonstrating the attack. “The prompt injection inserted a memory into ChatGPT’s long-term storage. When you start a new conversation, it actually is still exfiltrating the data.”
The attack isn’t possible through the ChatGPT web interface, thanks to an API OpenAI rolled out last year.
While OpenAI has introduced a fix that prevents memories from being abused as an exfiltration vector, the researcher said, untrusted content can still perform prompt injections that cause the memory tool to store long-term information planted by a malicious attacker.
LLM users who want to prevent this form of attack should pay close attention during sessions for output that indicates a new memory has been added. They should also regularly review stored memories for anything that may have been planted by untrusted sources. OpenAI provides guidance for managing the memory tool and the specific memories stored in it. Company representatives didn’t respond to an email asking about its efforts to prevent other hacks that plant false memories.
On Tuesday, Stability AI announced that renowned filmmaker James Cameron—of Terminator and Skynet fame—has joined its board of directors. Stability is best known for its pioneering but highly controversial Stable Diffusion series of AI image-synthesis models, first launched in 2022, which can generate images based on text descriptions.
“I’ve spent my career seeking out emerging technologies that push the very boundaries of what’s possible, all in the service of telling incredible stories,” said Cameron in a statement. “I was at the forefront of CGI over three decades ago, and I’ve stayed on the cutting edge since. Now, the intersection of generative AI and CGI image creation is the next wave.”
Cameron is perhaps best known as the director behind blockbusters like Avatar, Titanic, and Aliens, but in AI circles, he may be most relevant for the co-creation of the character Skynet, a fictional AI system that triggers nuclear Armageddon and dominates humanity in the Terminator media franchise. Similar fears of AI taking over the world have since jumped into reality and recently sparked attempts to regulate existential risk from AI systems through measures like SB-1047 in California.
In a 2023 interview with CTV News, Cameron referenced The Terminator‘s release year when asked about AI’s dangers: “I warned you guys in 1984, and you didn’t listen,” he said. “I think the weaponization of AI is the biggest danger. I think that we will get into the equivalent of a nuclear arms race with AI, and if we don’t build it, the other guys are for sure going to build it, and so then it’ll escalate.”
Hollywood goes AI
Of course, Stability AI isn’t building weapons controlled by AI. Instead, Cameron’s interest in cutting-edge filmmaking techniques apparently drew him to the company.
“James Cameron lives in the future and waits for the rest of us to catch up,” said Stability CEO Prem Akkaraju. “Stability AI’s mission is to transform visual media for the next century by giving creators a full stack AI pipeline to bring their ideas to life. We have an unmatched advantage to achieve this goal with a technological and creative visionary like James at the highest levels of our company. This is not only a monumental statement for Stability AI, but the AI industry overall.”
Cameron joins other recent additions to Stability AI’s board, including Sean Parker, former president of Facebook, who serves as executive chairman. Parker called Cameron’s appointment “the start of a new chapter” for the company.
Despite significant protest from actors’ unions last year, elements of Hollywood are seemingly beginning to embrace generative AI over time. Last Wednesday, we covered a deal between Lionsgate and AI video-generation company Runway that will see the creation of a custom AI model for film production use. In March, the Financial Times reported that OpenAI was actively showing off its Sora video synthesis model to studio executives.
Unstable times for Stability AI
Cameron’s appointment to the Stability AI board comes during a tumultuous period for the company. Stability AI has faced a series of challenges this past year, including an ongoing class-action copyright lawsuit, a troubled Stable Diffusion 3 model launch, significant leadership and staff changes, and ongoing financial concerns.
In March, founder and CEO Emad Mostaque resigned, followed by a round of layoffs. This came on the heels of the departure of three key engineers—Robin Rombach, Andreas Blattmann, and Dominik Lorenz, who have since founded Black Forest Labs and released a new open-weights image-synthesis model called Flux, which has begun to take over the r/StableDiffusion community on Reddit.
Despite the issues, Stability AI claims its models are widely used, with Stable Diffusion reportedly surpassing 150 million downloads. The company states that thousands of businesses use its models in their creative workflows.
While Stable Diffusion has indeed spawned a large community of open-weights-AI image enthusiasts online, it has also been a lightning rod for controversy among some artists because Stability originally trained its models on hundreds of millions of images scraped from the Internet without seeking licenses or permission to use them.
Apparently that association is not a concern for Cameron, according to his statement: “The convergence of these two totally different engines of creation [CGI and generative AI] will unlock new ways for artists to tell stories in ways we could have never imagined. Stability AI is poised to lead this transformation.”
Broadcom is accusing AT&T of trying to “rewind the clock and force” Broadcom “to sell support services for perpetual software licenses… that VMware has discontinued from its product line and to which AT&T has no contractual right to purchase.” The statement comes from legal documents Broadcom filed in response to AT&T’s lawsuit against Broadcom for refusing to renew support for its VMware perpetual licenses [PDF].
On August 29, AT&T filed a lawsuit [PDF] against Broadcom, alleging that Broadcom is breaking a contract by refusing to provide a one-year renewal of support for perpetually licensed VMware software. Broadcom famously ended perpetual VMware license sales shortly after closing its acquisition, in favor of a subscription model built around a couple of large product bundles rather than many individual SKUs.
AT&T claims its VMware contract (forged before Broadcom’s acquisition closed in November) entitles it to three one-year renewals of perpetual license support, and it’s currently trying to exercise the second one. AT&T says it uses VMware products to run 75,000 virtual machines (VMs) across about 8,600 servers; per AT&T, the VMs support customer services operations and operations-management efficiency. AT&T is asking the Supreme Court of the State of New York to stop Broadcom from ending VMware support services for AT&T and for “further relief” as deemed necessary.
On September 20, Broadcom filed a response asking the court to deny AT&T’s motion. Its defense includes its previously stated position that VMware was moving toward a subscription model before Broadcom bought it. The transition from perpetual licenses to subscriptions was years in the making and thus something for which AT&T should have prepared, according to Broadcom. Broadcom claims that AT&T has admitted that it intends to migrate away from VMware software and that AT&T could have spent “the last several months or even years” doing so.
The filing argues: “AT&T resorts to sensationalism by accusing Broadcom of using ‘bullying tactics’ and ‘price gouging.’ Such attacks are intended to generate press and distract the Court from a much simpler story.”
Broadcom claims the simple story is that:
… the agreement contains an unambiguous “End of Availability” provision, which gives VMware the right to retire products and services at any time upon notice. What’s more, a year ago, AT&T opted not to purchase the very Support Services it now asks the Court to force VMware to provide. AT&T did so despite knowing Defendants were implementing a long planned and well-known business model transition and would soon no longer be selling the Support Services in question.
Broadcom says it has been negotiating with AT&T “for months” about a new contract, but the plaintiff “rejected every proposal despite favorable pricing.”
Broadcom’s filing also questions AT&T’s request for mandatory injunction, claiming that New York only grants those in “rare circumstances,” which allegedly don’t apply here.
AT&T has options, Broadcom says
AT&T’s lawsuit claims losing VMware support will cause extreme harm to itself and beyond. The lawsuit says that 22,000 of AT&T’s VMware VMs are used for support “of services to millions of police officers, firefighters, paramedics, emergency workers, and incident response team members nationwide… for use in connection with matters of public safety and/or national security.” It also claimed that communications for the Office of the President are at risk without VMware’s continued support.
However, Broadcom claims that AT&T has other choices, saying:
AT&T does have other options and, therefore, the most it can obtain is monetary damages. The fact that AT&T has been given more than eight-months’ notice and has in the meantime failed to take any measures to prevent its purported harm (e.g., buy a subscription for the new offerings or move to another solution) is telling and precludes any finding of irreparable harm. Even if AT&T thinks it deserves better pricing, it could have avoided its purported irreparable harm by entering in a subscription based deal and suing for monetary damages instead of injunctive relief.
AT&T previously declined to answer Ars Technica’s questions about its backup plans for supporting such important customers should it lose VMware support.
Broadcom has rubbed some customers the wrong way
Broadcom closed its VMware acquisition in November and quickly made dramatic changes. In addition to Broadcom’s reputation for overhauling companies after buying them, moves like ending perpetual licenses, taking VMware’s biggest customers directly instead of using channel partners, and raising costs by bundling products and issuing higher CPU core requirements have led customers and partners to reconsider working with the company. Migrating from VMware can be extremely challenging and expensive due to its deep integration into some IT environments, but many are investigating migration, and some expect Broadcom to face years of backlash.
As NAND Research founder and analyst Steve McDowell told TechTarget about this case:
It’s very unusual for customers to sue their vendors. I think Broadcom grossly underestimated how passionate the customer base is, [but] it’s a captive audience.
As this lawsuit demonstrates, Broadcom’s stewardship of VMware has raised serious customer concerns about ongoing support. Companies like Spinnaker Support are trying to capitalize by offering third-party support services.
Martin Biggs, VP and managing director of EMEA and strategic initiatives at Spinnaker, told Ars Technica that his company provides support so customers can spend time determining their next move, whether that’s buying into a VMware subscription or moving on:
VMware customers are looking for options; the vast majority that we have spoken to don’t have a clear view yet of where they want to go, but in all cases the option of staying with VMware for the significantly increased fees is simply untenable. The challenge many have is that not paying fees means not getting support or security on their existing investment.
VMware’s support for AT&T was supposed to end on September 8, but the two companies entered an agreement to continue support until October 9. A hearing on a preliminary injunction is scheduled for October 15.
Five years ago, researchers made a grim discovery: a legitimate Android app in the Google Play market that had been surreptitiously made malicious by a library the developers used to earn advertising revenue. The library infected the app with code that caused the 100 million devices running it to connect to attacker-controlled servers and download secret payloads.
Now, history is repeating itself. Researchers from the same Moscow, Russia-based security firm reported Monday that they found two new apps, downloaded from Play 11 million times, that were infected with the same malware family. The researchers, from Kaspersky, believe a malicious software developer kit for integrating advertising capabilities is once again responsible.
Clever tradecraft
Software development kits, better known as SDKs, are collections of code and tools that provide developers with frameworks that can greatly speed up the app-creation process by streamlining repetitive tasks. An unverified SDK module incorporated into the apps ostensibly supported the display of ads. Behind the scenes, it provided a host of advanced methods for stealthy communication with malicious servers, where the apps would upload user data and download malicious code that could be executed and updated at any time.
The stealthy malware family in both campaigns is known as Necro. This time, some variants use techniques such as steganography, an obfuscation method rarely seen in mobile malware. Some variants also deploy clever tradecraft to deliver malicious code that can run with heightened system rights. Once devices are infected with this variant, they contact an attacker-controlled command-and-control server and send web requests containing encrypted JSON data that reports information about each compromised device and application hosting the module.
The server, in turn, returns a JSON response that contains a link to a PNG image and associated metadata that includes the image hash. If the malicious module installed on the infected device confirms the hash is correct, it downloads the image.
The SDK module “uses a very simple steganographic algorithm,” Kaspersky researchers explained in a separate post. “If the MD5 check is successful, it extracts the contents of the PNG file—the pixel values in the ARGB channels—using standard Android tools. Then the getPixel method returns a value whose least significant byte contains the blue channel of the image, and processing begins in the code.”
The researchers continued:
If we consider the blue channel of the image as a byte array of dimension 1, then the first four bytes of the image are the size of the encoded payload in Little Endian format (from the least significant byte to the most significant). Next, the payload of the specified size is recorded: this is a JAR file encoded with Base64, which is loaded after decoding via DexClassLoader. Coral SDK loads the sdk.fkgh.mvp.SdkEntry class in a JAR file using the native library libcoral.so. This library has been obfuscated using the OLLVM tool. The starting point, or entry point, for execution within the loaded class is the run method.
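To make that scheme concrete, here is a short, hypothetical decoder written from Kaspersky's description: read the blue channel of every pixel, take the first four bytes as a little-endian payload length, then Base64-decode the bytes that follow. It is an analysis-side sketch under those stated assumptions, not code from the malware or from Kaspersky.

```python
# Illustrative decoder for the blue-channel steganography Kaspersky describes.
# The file path is a placeholder; this sketch only mirrors the documented layout.
import base64
import struct
from PIL import Image

def extract_hidden_payload(png_path: str) -> bytes:
    img = Image.open(png_path).convert("RGBA")
    # Blue-channel value of every pixel, left to right, top to bottom, as a byte array.
    blue = bytes(pixel[2] for pixel in img.getdata())
    # First four bytes: payload size, little-endian.
    (length,) = struct.unpack("<I", blue[:4])
    encoded = blue[4:4 + length]      # Base64 text of the hidden archive
    return base64.b64decode(encoded)  # raw bytes of the JAR the malware would load

# payload = extract_hidden_payload("downloaded.png")  # hypothetical usage
```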
Follow-on payloads that get installed download malicious plugins that can be mixed and matched for each infected device to perform a variety of different actions. One of the plugins allows code to run with elevated system rights. By default, Android bars privileged processes from using WebView, an extension in the OS for displaying webpages in apps. To bypass this safety restriction, Necro uses a hacking technique known as a reflection attack to create a separate instance of the WebView factory.
This plugin can also download and run other executable files that will replace links rendered through WebView. When running with the elevated system rights, these executables have the ability to modify URLs to add confirmation codes for paid subscriptions and download and execute code loaded at links controlled by the attacker. The researchers listed five separate payloads they encountered in their analysis of Necro.
The modular design of Necro opens myriad ways for the malware to behave. Kaspersky provided the following image that provides an overview.
The researchers found Necro in two Google Play apps. One was Wuta Camera, an app with 10 million downloads to date. Wuta Camera versions 6.3.2.148 through 6.3.6.148 contained the malicious SDK that infects apps. The app has since been updated to remove the malicious component. A separate app with roughly 1 million downloads—known as Max Browser—was also infected. That app is no longer available in Google Play.
The researchers also found Necro infecting a variety of Android apps available in alternative marketplaces. Those apps typically billed themselves as modified versions of legitimate apps such as Spotify, Minecraft, WhatsApp, Stumble Guys, Car Parking Multiplayer, and Melon Sandbox.
People who are concerned they may be infected by Necro should check their devices for the presence of indicators of compromise listed at the end of this writeup.
A pleasant female voice greets me over the phone. “Hi, I’m an assistant named Jasmine for Bodega,” the voice says. “How can I help?”
“Do you have patio seating,” I ask. Jasmine sounds a little sad as she tells me that unfortunately, the San Francisco–based Vietnamese restaurant doesn’t have outdoor seating. But her sadness isn’t the result of her having a bad day. Rather, her tone is a feature, a setting.
Jasmine is a member of a new, growing clan: the AI voice restaurant host. If you recently called up a restaurant in New York City, Miami, Atlanta, or San Francisco, chances are you have spoken to one of Jasmine’s polite, calculated competitors.
In the sea of AI voice assistants, hospitality phone agents haven’t been getting as much attention as consumer-based generative AI tools like Gemini Live and ChatGPT-4o. And yet, the niche is heating up, with multiple emerging startups vying for restaurant accounts across the US. Last May, voice-ordering AI garnered much attention at the National Restaurant Association’s annual food show. Bodega, the high-end Vietnamese restaurant I called, used Maitre-D AI, which launched primarily in the Bay Area in 2024. Newo, another new startup, is currently rolling its software out at numerous Silicon Valley restaurants. One-year-old RestoHost is now answering calls at 150 restaurants in the Atlanta metro area, and Slang, a voice AI company that started focusing on restaurants exclusively during the COVID-19 pandemic and announced a $20 million funding round in 2023, is gaining ground in the New York and Las Vegas markets.
All of them offer a similar service: an around-the-clock AI phone host that can answer generic questions about the restaurant’s dress code, cuisine, seating arrangements, and food allergy policies. They can also assist with making, altering, or canceling a reservation. In some cases, the agent can direct the caller to an actual human, but according to RestoHost co-founder Tomas Lopez-Saavedra, only 10 percent of the calls result in that. Each platform offers the restaurant subscription tiers that unlock additional features, and some of the systems can speak multiple languages.
But who even calls a restaurant in the era of Google and Resy? According to some of the founders of AI voice host startups, many customers do, and for various reasons. “Restaurants get a high volume of phone calls compared to other businesses, especially if they’re popular and take reservations,” says Alex Sambvani, CEO and co-founder of Slang, which currently works with everyone from the Wolfgang Puck restaurant group to Chick-fil-A to the fast-casual chain Slutty Vegan. Sambvani estimates that in-demand establishments receive between 800 and 1,000 calls per month. Typical callers tend to be last-minute bookers, tourists and visitors, older people, and those who do their errands while driving.
Matt Ho, the owner of Bodega SF, confirms this scenario. “The phones would ring constantly throughout service,” he says. “We would receive calls for basic questions that can be found on our website.” To solve this issue, after shopping around, Ho found that Maitre-D was the best fit. Bodega SF became one of the startup’s earliest clients in May, and Ho even helped the founders with trial and error testing prior to launch. “This platform makes the job easier for the host and does not disturb guests while they’re enjoying their meal,” he says.