Tech

OpenAI pushes AI agent capabilities with new developer API

Developers using the Responses API can access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses.

That’s notable because OpenAI says the added web search ability dramatically improves the factual accuracy of its AI models. On OpenAI’s SimpleQA benchmark, which aims to measure confabulation rate, GPT-4o search scored 90 percent, while GPT-4o mini search achieved 88 percent—both substantially outperforming the larger GPT-4.5 model without search, which scored 63 percent.

Despite these improvements, the technology still has significant limitations. Aside from issues with CUA (OpenAI’s Computer-Using Agent model) properly navigating websites, the improved search capability doesn’t completely solve the problem of AI confabulations, with GPT-4o search still making factual mistakes 10 percent of the time.

Alongside the Responses API, OpenAI released the open source Agents SDK, providing developers free tools to integrate models with internal systems, implement safeguards, and monitor agent activities. This toolkit follows OpenAI’s earlier release of Swarm, a framework for orchestrating multiple agents.

These are still early days in the AI agent field, and things will likely improve rapidly. However, at the moment, the AI agent movement remains vulnerable to unrealistic claims, as demonstrated earlier this week when users discovered that Chinese startup Butterfly Effect’s Manus AI agent platform failed to deliver on many of its promises, highlighting the persistent gap between promotional claims and practical functionality in this emerging technology category.

Leaked GeForce RTX 5060 and 5050 specs suggest Nvidia will keep playing it safe

Nvidia has launched all of the GeForce RTX 50-series GPUs that it announced at CES, at least technically—whether you’re buying from Nvidia, AMD, or Intel, it’s nearly impossible to find any of these new cards at their advertised prices right now.

But hope springs eternal, and newly leaked specs for GeForce RTX 5060 and 5050-series cards suggest that Nvidia may be announcing these lower-end cards soon. These kinds of cards are rarely exciting, but Steam Hardware Survey data shows that these xx60 and xx50 cards are what the overwhelming majority of PC gamers are putting in their systems.

The specs, posted by a reliable leaker named Kopite and reported by Tom’s Hardware and others, suggest a refresh that’s in line with what Nvidia has done with most of the 50-series so far. Along with a move to the next-generation Blackwell architecture, the 5060 GPUs each come with a small increase to the number of CUDA cores, a jump from GDDR6 to GDDR7, and an increase in power consumption, but no changes to the amount of memory or the width of the memory bus. The 8GB versions, in particular, will probably continue to be marketed primarily as 1080p cards.

| Spec | RTX 5060 Ti (leaked) | RTX 4060 Ti | RTX 5060 (leaked) | RTX 4060 | RTX 5050 (leaked) | RTX 3050 |
| --- | --- | --- | --- | --- | --- | --- |
| CUDA Cores | 4,608 | 4,352 | 3,840 | 3,072 | 2,560 | 2,560 |
| Boost Clock | Unknown | 2,535 MHz | Unknown | 2,460 MHz | Unknown | 1,777 MHz |
| Memory Bus Width | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit | 128-bit |
| Memory Bandwidth | Unknown | 288 GB/s | Unknown | 272 GB/s | Unknown | 224 GB/s |
| Memory Size and Type | 8GB or 16GB GDDR7 | 8GB or 16GB GDDR6 | 8GB GDDR7 | 8GB GDDR6 | 8GB GDDR6 | 8GB GDDR6 |
| TGP | 180 W | 160 W | 150 W | 115 W | 130 W | 130 W |

As with the 4060 Ti, the 5060 Ti is said to come in two versions, one with 8GB of RAM and one with 16GB. One of the 4060 Ti’s problems was that its relatively narrow 128-bit memory bus limited its performance at 1440p and 4K resolutions even with 16GB of RAM—the bandwidth increase from GDDR7 could help with this, but we’ll need to test to see for sure.
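To make the bandwidth point concrete, here is a minimal sketch of the usual peak-bandwidth arithmetic: bus width in bytes multiplied by the per-pin data rate. It first backs out the per-pin rates implied by the known figures in the spec table, then plugs in a 28 Gbps GDDR7 rate. That 28 Gbps figure is purely an assumption for illustration, since the leak does not include memory speeds, and the function names are made up for this example.

```python
def bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes moved per transfer times per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps


def implied_data_rate_gbps(bus_width_bits: int, bandwidth: float) -> float:
    """Per-pin data rate (Gbps) implied by a known bandwidth figure."""
    return bandwidth * 8 / bus_width_bits


# Known bandwidth figures from the spec table; every card uses a 128-bit bus.
KNOWN_BANDWIDTH_GBS = {"RTX 4060 Ti": 288, "RTX 4060": 272, "RTX 3050": 224}

for card, gbs in KNOWN_BANDWIDTH_GBS.items():
    rate = implied_data_rate_gbps(128, gbs)
    print(f"{card}: {rate:.0f} Gbps per pin")

# ASSUMPTION: 28 Gbps is a hypothetical GDDR7 speed, not part of the leak.
ASSUMED_GDDR7_GBPS = 28
hypothetical = bandwidth_gbs(128, ASSUMED_GDDR7_GBPS)
print(f"128-bit bus at {ASSUMED_GDDR7_GBPS} Gbps: {hypothetical:.0f} GB/s")
```

Under that assumed speed, the same 128-bit bus would move noticeably more data per second than the 4060 Ti's GDDR6 does, which is why the memory type matters even when the bus width stays put. Real-world results would still need testing.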

Gmail gains Gemini-powered “Add to calendar” button

Google has a new mission in the AI era: to add Gemini to as many of the company’s products as possible. We’ve already seen Gemini appear in search results, text messages, and more. In Google’s latest update to Workspace, Gemini will be able to add calendar appointments from Gmail with a single click. Well, assuming Gemini gets it right the first time, which is far from certain.

The new calendar button will appear at the top of emails, right next to the summarize button that arrived last year. The calendar option will show up in Gmail threads with actionable meeting chit-chat, allowing you to mash that button to create an appointment in one step. The Gemini sidebar will open to confirm the appointment was made, which is a good opportunity to double-check the robot. There will be a handy edit button in the Gemini window in the event it makes a mistake. However, the robot can’t invite people to these events yet.

The effect of using the button is the same as opening the Gemini panel and asking it to create an appointment. The new functionality is simply detecting events and offering the button as a shortcut of sorts. You should not expect to see this button appear on messages that already have calendar integration, like dining reservations and flights. Those already pop up in Google Calendar without AI.

Firmware update bricks HP printers, makes them unable to use HP cartridges

HP, along with other printer brands, is infamous for issuing firmware updates that brick already-purchased printers that have tried to use third-party ink. In a new form of frustration, HP is now being accused of issuing a firmware update that broke customers’ laser printers—even though the devices are loaded with HP-brand toner.

The firmware update in question is version 20250209, which HP issued on March 4 for its LaserJet MFP M232-M237 models. Per HP, the update includes “security updates,” a “regulatory requirement update,” “general improvements and bug fixes,” and fixes for IPP Everywhere. Looking back to older updates’ fixes and changes, which the new update includes, doesn’t reveal anything out of the ordinary. The older updates mention things like “fixed print quality to ensure borders are not cropped for certain document types,” and “improved firmware update and cartridge rejection experiences.” But there’s no mention of changes to how the printers use or read toner.

However, users have been reporting sudden problems using HP-brand toner in their M232–M237 series printers since their devices updated to 20250209. Users on HP’s support forum say they see Error Code 11 and the hardware’s toner light flashing when trying to print. Some said they’ve cleaned the contacts and reinstalled their toner but still can’t print.

“Insanely frustrating because it’s my small business printer and just stopped working out of nowhere[,] and I even replaced the tone[r,] which was a $60 expense,” a forum user wrote on March 8.

AMD says top-tier Ryzen 9900X3D and 9950X3D CPUs arrive March 12 for $599 and $699

Like the 7950X3D and 7900X3D, these new X3D chips combine a pair of AMD’s CPU chiplets, one that has the extra 64MB of cache stacked underneath it and one that doesn’t. For the 7950X3D, you get eight cores with extra cache and eight without; for the 7900X3D, you get six cores with extra cache and six without.

It’s up to AMD’s chipset software to decide what kinds of apps get to run on each kind of CPU core. Non-gaming workloads prioritize the normal CPU cores, which are generally capable of slightly higher peak clock speeds, while games that benefit disproportionately from the extra cache run on the V-Cache cores instead. AMD’s software can “park” the non-V-Cache CPU cores when you’re playing games to ensure that games aren’t accidentally run on the less-suitable cores.
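The policy described here can be sketched as a toy model. This is not AMD's driver logic, just an illustration of the idea under simple assumptions: games see only the V-Cache cores (the others count as parked), while everything else sees all cores with the higher-clocked regular cores listed first. The `Core` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    index: int
    has_vcache: bool


def build_cores(vcache_count: int, regular_count: int) -> list[Core]:
    """Build a core list: V-Cache cores first, then regular cores."""
    cores = [Core(i, True) for i in range(vcache_count)]
    cores += [Core(vcache_count + i, False) for i in range(regular_count)]
    return cores


def schedulable_cores(cores: list[Core], is_game: bool) -> list[Core]:
    """Toy policy (an assumption, not AMD's actual implementation).

    Games: only V-Cache cores are offered; the rest are treated as parked.
    Other workloads: all cores are offered, regular cores first.
    """
    if is_game:
        return [c for c in cores if c.has_vcache]
    # sorted() is stable and False sorts before True, so regular cores lead.
    return sorted(cores, key=lambda c: c.has_vcache)


# Eight cores with extra cache and eight without, as on the 7950X3D.
cores = build_cores(vcache_count=8, regular_count=8)
game_cores = [c.index for c in schedulable_cores(cores, is_game=True)]
app_cores = [c.index for c in schedulable_cores(cores, is_game=False)]
print("game:", game_cores)
print("app: ", app_cores)
```

The real software has to detect games at runtime, which is where things can go wrong if parking state is stale.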

We didn’t have issues with this core parking technology when we initially tested the 7950X3D and 7900X3D, and AMD has steadily made improvements since then to make sure that core parking is working properly. The new 9000-series X3D chips should benefit from that work, too. To get the best results, AMD officially recommends a fresh and fully updated Windows install, along with the newest BIOS for your motherboard and the newest AMD chipset drivers; swapping out another Ryzen CPU for an X3D model (or vice versa) without reinstalling Windows can occasionally lead to CPU cores being parked when they shouldn’t be (or left unparked when they should be).

iPhone 16e review: The most expensive cheap iPhone yet


The iPhone 16e rethinks—and prices up—the basic iPhone.

An iPhone sits on the table, displaying the time with the screen on

The iPhone 16e, with a notch and an Action Button. Credit: Samuel Axon

For a long time, the cheapest iPhones were basically just iPhones that were older than the current flagship, but last week’s release of the $600 iPhone 16e marks a big change in how Apple is approaching its lineup.

Rather than a repackaging of an old iPhone, the 16e is the latest main iPhone—that is, the iPhone 16—with a bunch of stuff stripped away.

There are several potential advantages to this change. In theory, it allows Apple to support its lower-end offerings for longer with software updates, and it gives entry-level buyers access to more current technologies and features. It also simplifies the marketplace of accessories and the like.

There’s bad news, too, though: Since it replaces the much cheaper iPhone SE in Apple’s lineup, the iPhone 16e significantly raises the financial barrier to entry for iOS (the SE started at $430).

We spent a few days trying out the 16e and found that it’s a good phone—it’s just too bad it’s a little more expensive than the entry-level iPhone should ideally be. In many ways, this phone solves more problems for Apple than it does for consumers. Let’s explore why.

A beastly processor for an entry-level phone

Like the 16, the 16e has Apple’s A18 chip, the most recent in the made-for-iPhone line of Apple-designed chips. There’s only one notable difference: This variation of the A18 has just four GPU cores instead of five. That will show up in benchmarks and in a handful of 3D games, but it shouldn’t make too much of a difference for most people.

It’s a significant step up over the A15 found in the final 2022 refresh of the iPhone SE, enabling a handful of new features like AAA games and Apple Intelligence.

The A18’s inclusion is good for both Apple and the consumer; Apple gets to establish a new, higher baseline of performance when developing new features for current and future handsets, and consumers likely get many more years of software updates than they’d get on the older chip.

The key example of a feature enabled by the A18 that Apple would probably like us all to talk about the most is Apple Intelligence, a suite of features utilizing generative AI to solve some user problems or enable new capabilities across iOS. By enabling these for the cheapest iPhone, Apple is making its messaging around Apple Intelligence a lot easier; it no longer needs to put effort into clarifying that you can use X feature with this new iPhone but not that one.

We’ve written a lot about Apple Intelligence already, but here’s the gist: There are some useful features here in theory, but Apple’s models are clearly a bit behind the cutting edge, and results for things like notifications summaries or writing tools are pretty mixed. It’s fun to generate original emojis, though!

The iPhone 16e can even use Visual Intelligence, which actually is handy sometimes. On my iPhone 16 Pro Max, I can point the rear camera at an object and press the camera button a certain way to get information about it.

I wouldn’t have expected the 16e to support this, but it does, via the Action Button (which was first introduced in the iPhone 15 Pro). This is a reprogrammable button that can perform a variety of functions, albeit just one at a time. Visual Intelligence is one of the options here, which is pretty cool, even though it’s not essential.

The screen is the biggest upgrade over the SE

Also like the 16, the 16e has a 6.1-inch display. The resolution’s a bit different, though; it’s 2,532 by 1,170 pixels instead of 2,556 by 1,179. It also has a notch instead of the Dynamic Island seen in the 16. All this makes the iPhone 16e’s display seem like a very close match to the one seen in 2022’s iPhone 14—in fact, it might literally be the same display.

I really missed the Dynamic Island while using the iPhone 16e—it’s one of my favorite new features added to the iPhone in recent years, as it consolidates what was previously a mess of notification schemes in iOS. Plus, it’s nice to see things like Uber and DoorDash ETAs and sports scores at a glance.

The main problem with losing the Dynamic Island is that we’re back to the old minor mess of notifications approaches, and I guess Apple has to keep supporting the old ways for a while yet. That genuinely surprises me; I would have thought Apple would want to unify notifications and activities with the Dynamic Island just like the A18 allows the standardization of other features.

This seems to indicate that the Dynamic Island is a fair bit more expensive to include than the good old camera notch flagship iPhones had been rocking since 2017’s iPhone X.

That compromise aside, the display on the iPhone 16e is ridiculously good for a phone at this price point, and it makes the old iPhone SE’s small LCD display look like it’s from another eon entirely by comparison. It gets brighter for both HDR content and sunny-day operation; the blacks are inky and deep, and the contrast and colors are outstanding.

It’s the best thing about the iPhone 16e, even if it isn’t quite as refined as the screens in Apple’s current flagships. Most people would never notice the difference between the screens in the 16e and the iPhone 16 Pro, though.

There is one other screen feature I miss from the higher-end iPhones you can buy in 2025: Those phones can drop the display all the way down to 1 nit, which is awesome for using the phone late at night in bed without disturbing a sleeping partner. Like earlier iPhones, the 16e can only get so dark.

It gets quite bright, though; Apple claims it typically reaches 800 nits in peak brightness but that it can stretch to 1200 when viewing certain HDR photos and videos. That means it gets about twice as bright as the SE did.

Connectivity is key

The iPhone 16e supports the core suite of connectivity options found in modern phones. There’s Wi-Fi 6, Bluetooth 5.3, and Apple’s usual limited implementation of NFC.

There are three new things of note here, though, and they’re good, neutral, and bad, respectively.

USB-C

Let’s start with the good. We’ve moved from Apple’s proprietary Lightning port found in older iPhones (including the final iPhone SE) toward USB-C, now a near-universal standard on mobile devices. It allows faster charging and more standardized charging cable support.

Sure, it’s a bummer to start over if you’ve spent years buying Lightning accessories, but it’s absolutely worth it in the long run. This change means that the entire iPhone line has now abandoned Lightning, so all iPhones and Android phones will have the same main port for years to come. Finally!

The finality of this shift solves a few problems for Apple: It greatly simplifies the accessory landscape and allows the company to move toward producing a smaller range of cables.

Satellite connectivity

Recent flagship iPhones have gradually added a small suite of features that utilize satellite connectivity to make life a little easier and safer.

Among those are crash detection and roadside assistance. The former uses the sensors in the phone to detect if you’ve been in a car crash and contact help, and roadside assistance allows you to text for help when you’re outside of cellular reception in the US and UK.

There are also Emergency SOS and Find My via satellite, which let you communicate with emergency responders from remote places and allow you to be found.

Along with a more general feature that allows Messages via satellite, these features can greatly expand your options if you’re somewhere remote, though they’re not as easy to use and responsive as using the regular cellular network.

Where’s MagSafe?

I don’t expect the 16e to have all the same features as the 16, which is $200 more expensive. In fact, it has more modern features than I think most of its target audience needs (more on that later). That said, there’s one notable omission that makes no sense to me at all.

The 16e does not support MagSafe, a standard for connecting accessories to the back of the device magnetically, often while allowing wireless charging via the Qi standard.

Qi wireless charging is still supported, albeit at a slow 7.5 W, but there are no magnets, meaning a lot of existing MagSafe accessories are a lot less useful with this phone, if they’re usable at all. To be fair, the SE didn’t support MagSafe either, but every new iPhone design since the iPhone 12 way back in 2020 has—and not just the premium flagships.

It’s not like the MagSafe accessory ecosystem was some bottomless well of innovation, but that magnetic alignment is handier than you might think, whether we’re talking about making sure the phone locks into place for the fastest wireless charging speeds or hanging the phone on a car dashboard to use GPS on the go.

It’s one of those things where folks coming from much older iPhones may not care because they don’t know what they’re missing, but it could be annoying in households with multiple generations of iPhones, and it just doesn’t make any sense.

Most of Apple’s choices in the 16e seem to serve the goal of unifying the whole iPhone lineup to simplify the message for consumers and make things easier for Apple to manage efficiently, but the dropping of MagSafe is bizarre.

It almost makes me think that Apple might plan to drop MagSafe from future flagship iPhones, too, and go toward something new, just because that’s the only explanation I can think of. That otherwise seems unlikely to me right now, but I guess we’ll see.

The first Apple-designed cellular modem

We’ve been seeing rumors that Apple planned to drop third-party modems from companies like Qualcomm for years. As far back as 2018, Apple was poaching Qualcomm employees in an adjacent office in San Diego. In 2020, Apple SVP Johny Srouji announced to employees that work had begun.

It sounds like development has been challenging, but the first Apple-designed modem has arrived here in the 16e of all places. Dubbed the C1, it’s… perfectly adequate. It’s about as fast or maybe just a smidge slower than what you get in the flagship phones, but almost no user would notice any difference at all.

That’s really a win for Apple, which has struggled with a tumultuous relationship with its partners here for years and which has long run into space problems in its phones in part because the third-party modems weren’t compact enough.

This change may not matter much for the consumer beyond freeing up just a tiny bit of space for a slightly larger battery, but it’s another step in Apple’s long journey to ultimately and fully control every component in the iPhone that it possibly can.

Bigger is better for batteries

There is one area where the 16e is actually superior to the 16, not to mention the SE: battery life. The 16e reportedly has a 3,961 mAh battery, the largest in any of the many iPhones with roughly this size screen. Apple says it offers up to 26 hours of video playback, which is the kind of number you expect to see in a much larger flagship phone.

I charged this phone three times over just under a week of use, though I wasn’t heavily hitting 5G networks, playing many 3D games, or cranking the brightness way up all the time while using it.

That’s a bit of a bump over the 16, but it’s a massive leap over the SE, which promised a measly 15 hours of video playback. Every single phone in Apple’s lineup now has excellent battery life by any standard.

Quality over quantity in the camera system

The 16e’s camera system leaves the SE in the dust, but it’s no match for the robust system found in the iPhone 16. Regardless, it’s way better than you’d typically expect from a phone at this price.

Like the 16, the 16e has a 48 MP “Fusion” wide-angle rear camera. It typically doesn’t take photos at 48 MP (though you can do that while compromising color detail). Rather, 24 MP is the target. The 48 MP camera enables 2x zoom that is nearly visually indistinguishable from optical zoom.
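The 2x figure comes down to crop arithmetic. As a rough sketch, assuming the 2x mode is a straight center crop of the sensor (an assumption for illustration, and the function name is made up), zooming by a given factor keeps one over that factor squared of the pixels:

```python
def cropped_megapixels(sensor_mp: float, zoom: float) -> float:
    """Megapixels left after a center crop that zooms by `zoom` in each dimension."""
    if zoom < 1:
        raise ValueError("a crop cannot zoom out")
    return sensor_mp / zoom**2


full_sensor = cropped_megapixels(48, 1)
two_x_crop = cropped_megapixels(48, 2)
print(f"1x: {full_sensor:.0f} MP, 2x crop: {two_x_crop:.0f} MP")
```

Halving both the width and the height leaves a quarter of the sensor, which is still a full-resolution readout of real pixels rather than an upscaled image. That is why the result can look close to optical zoom.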

Based on both the specs and photo comparisons, the main camera sensor in the 16e appears to me to be exactly the same as the one found in the 16. We’re just missing the ultra-wide lens (which allows more zoomed-out photos, ideal for groups of people in small spaces, for example) and several extra features like advanced image stabilization, the newest Photographic Styles, and macro photography.

The iPhone 16e takes excellent photos in bright conditions. Credit: Samuel Axon

That’s a lot of missing features, sure, but it’s wild how good this camera is for this price point. Even something like the Pixel 8a can’t touch it (though to be fair, the Pixel 8a is $100 cheaper).

Video capture is a similar situation: The 16e shoots at the same resolutions and framerates as the 16, but it lacks a few specialized features like Cinematic and Action modes. There’s also a front-facing camera with the TrueDepth sensor for Face ID in that notch, and it has comparable specs to the front-facing cameras we’ve seen in a couple of years of iPhones at this point.

If you were buying a phone for the cameras, this wouldn’t be the one for you. It’s absolutely worth paying another $200 for the iPhone 16 (or even just $100 for the iPhone 15 for the ultra-wide lens for 0.5x zoom; the 15 is still available in the Apple Store) if that’s your priority.

The iPhone 16’s macro mode isn’t available here, so ultra-close-ups look fuzzy. Credit: Samuel Axon

But for the 16e’s target consumer (mostly folks with the iPhone 11 or older or an iPhone SE, who just want the cheapest functional iPhone they can get) it’s almost overkill. I’m not complaining, though it’s a contributing factor to the phone’s cost compared to entry-level Android phones and Apple’s old iPhone SE.

RIP small phones, once and for all

In one fell swoop, the iPhone 16e’s replacement of the iPhone SE eliminates a whole range of legacy technologies that have held on at the lower end of the iPhone lineup for years. Gone are Touch ID, the home button, LCD displays, and Lightning ports—they’re replaced by Face ID, swipe gestures, OLED, and USB-C.

Newer iPhones have had most of those things for quite some time. The most recent addition was USB-C, which came in 2023’s iPhone 15. The removal of the SE from the lineup catches the bottom end of the iPhone up with the top in these respects.

That said, the SE had maintained one positive differentiator, too: It was small enough to be used one-handed by almost anyone. With the end of the SE and the release of the 16e, the one-handed iPhone is well and truly dead. Of course, most people have been clear they want big screens and batteries above almost all else, so the writing had been on the wall for a while for smaller phones.

The death of the iPhone SE ushers in a new era for the iPhone with bigger and better features—but also bigger price tags.

A more expensive cheap phone

Assessing the iPhone 16e is a challenge. It’s objectively a good phone—good enough for the vast majority of people. It has a nearly top-tier screen (though it clocks in at 60Hz, while some Android phones close to this price point manage 120Hz), a camera system that delivers on quality even if it lacks special features seen in flagships, strong connectivity, and performance far above what you’d expect at this price.

If you don’t care about extra camera features or nice-to-haves like MagSafe or the Dynamic Island, it’s easy to recommend saving a couple hundred bucks compared to the iPhone 16.

The chief criticism I have that relates to the 16e has less to do with the phone itself than Apple’s overall lineup. The iPhone SE retailed for $430, nearly half the price of the 16. By making the 16e the new bottom of the lineup, Apple has significantly raised the financial barrier to entry for iOS.

Now, it’s worth mentioning that a pretty big swath of the target market for the 16e will buy it subsidized through a carrier, so they might not pay that much up front. I always recommend buying a phone directly if you can, though, as carrier subsidization deals are usually worse for the consumer.

The 16e’s price might push more people to go for the subsidy. Plus, it’s just more phone than some people need. For example, I love a high-quality OLED display for watching movies, but I don’t think the typical iPhone SE customer was ever going to care about that.

That’s why I believe the iPhone 16e solves more problems for Apple than it does for the consumer. In multiple ways, it allows Apple to streamline production, software support, and marketing messaging. It also drives up the average price per unit across the whole iPhone line and will probably encourage some people who would have spent $430 to spend $600 instead, possibly improving revenue. All told, it’s a no-brainer for Apple.

It’s just a mixed bag for the sort of no-frills consumer who wants a minimum viable phone and who for one reason or another didn’t want to go the Android route. The iPhone 16e is definitely a good phone—I just wish there were more options for that consumer.

The good

  • Dramatically improved display compared to the iPhone SE
  • Likely stronger long-term software support than most previous entry-level iPhones
  • Good battery life and incredibly good performance for this price point
  • A high-quality camera, especially for the price

The bad

  • No ultra-wide camera
  • No MagSafe
  • No Dynamic Island

The ugly

  • Significantly raises the entry price point for buying an iPhone

Samuel Axon is a senior editor at Ars Technica. He covers Apple, software development, gaming, AI, entertainment, and mixed reality. He has been writing about gaming and technology for nearly two decades at Engadget, PC World, Mashable, Vice, Polygon, Wired, and others. He previously ran a marketing and PR agency in the gaming industry, led editorial for the TV network CBS, and worked on social media marketing strategy for Samsung Mobile at the creative agency SPCSHP. He also is an independent software and game developer for iOS, Windows, and other platforms, and he is a graduate of DePaul University, where he studied interactive media and software development.

What does “PhD-level” AI mean? OpenAI’s rumored $20,000 agent plan explained.

On the Frontier Math benchmark by EpochAI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent—suggesting a leap in mathematical reasoning capabilities over the previous model.

Benchmarks vs. real-world value

Ideally, potential applications for a true PhD-level AI model would include analyzing medical research data, supporting climate modeling, and handling routine aspects of research work.

The high price points reported by The Information, if accurate, suggest that OpenAI believes these systems could provide substantial value to businesses. The publication notes that SoftBank, an OpenAI investor, has committed to spending $3 billion on OpenAI’s agent products this year alone—indicating significant business interest despite the costs.

Meanwhile, OpenAI faces financial pressures that may influence its premium pricing strategy. The company reportedly lost approximately $5 billion last year covering operational costs and other expenses related to running its services.

News of OpenAI’s stratospheric pricing plans comes after years of relatively affordable AI services that have conditioned users to expect powerful capabilities at relatively low costs. ChatGPT Plus remains $20 per month and Claude Pro costs $30 monthly, both tiny fractions of these proposed enterprise tiers. Even ChatGPT Pro’s $200/month subscription is relatively small compared to the new proposed fees. Whether the performance difference between these tiers will match their thousandfold price difference is an open question.
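The "thousandfold" figure is easy to check. The sketch below uses only the ChatGPT prices quoted in this article and the rumored $20,000 monthly tier; the constant and function names are made up for illustration, and the top-tier price is a reported rumor rather than a confirmed figure.

```python
# Monthly prices in US dollars, as quoted in the article.
CHATGPT_PLUS = 20
CHATGPT_PRO = 200
RUMORED_TOP_TIER = 20_000  # reported rumor, not a confirmed price


def price_multiple(higher: float, lower: float) -> float:
    """How many times more expensive `higher` is than `lower`."""
    return higher / lower


vs_plus = price_multiple(RUMORED_TOP_TIER, CHATGPT_PLUS)
vs_pro = price_multiple(RUMORED_TOP_TIER, CHATGPT_PRO)
print(f"vs. Plus: {vs_plus:.0f}x, vs. Pro: {vs_pro:.0f}x")
```

So the rumored tier would cost a thousand times as much as Plus and a hundred times as much as Pro, which frames how large the performance gap would need to be to justify it.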

Despite their benchmark performances, these simulated reasoning models still struggle with confabulations—instances where they generate plausible-sounding but factually incorrect information. This remains a critical concern for research applications where accuracy and reliability are paramount. A $20,000 monthly investment raises questions about whether organizations can trust these systems not to introduce subtle errors into high-stakes research.

In response to the news, several people quipped on social media that companies could hire an actual PhD student for much cheaper. “In case you have forgotten,” wrote xAI developer Hieu Pham in a viral tweet, “most PhD students, including the brightest stars who can do way better work than any current LLMs—are not paid $20K / month.”

While these systems show strong capabilities on specific benchmarks, the “PhD-level” label remains largely a marketing term. These models can process and synthesize information at impressive speeds, but questions remain about how effectively they can handle the creative thinking, intellectual skepticism, and original research that define actual doctoral-level work. On the other hand, they will never get tired or need health insurance, and they will likely continue to improve in capability and drop in cost over time.

“They curdle like milk”: WB DVDs from 2006–2008 are rotting away in their cases

Although digital media has surpassed physical media in popularity, there are still plenty of reasons for movie buffs and TV fans to hold onto, and even continue buying, DVDs. With physical media, owners are assured that they’ll always be able to play their favorite titles, so long as they take care of their discs. While digital copies are sometimes abruptly ripped away from viewers, physical media owners don’t have to worry about a corporation ruining their Friday night movie plans. At least, that’s what we thought.

It turns out that if your DVD collection includes titles distributed by Warner Bros. Home Entertainment, the home movie distribution arm of Warner Bros. Discovery (WBD), you may one day open up the box to find a case of DVD rot.

Recently, Chris Bumbray, editor-in-chief of movie news and reviews site JoBlo, detailed what would be a harrowing experience for any film collector. He said he recently tried to play his Passage to Marseille DVD, but “after about an hour, the disc simply stopped working.” He said “the same thing happened” with Across the Pacific. Bumbray bought a new DVD player but still wasn’t able to play his Desperate Journey disc. The latter case was especially alarming because, like a lot of classic films and shows, the title isn’t available as a digital copy.

DVDs, if taken care of properly, should last anywhere from 30 to 100 years. It turned out that Bumbray’s problems weren’t due to a faulty DVD player or poor DVD maintenance. In a statement to JoBlo shared on Tuesday, WBD confirmed widespread complaints about DVDs manufactured between 2006 and 2008. The statement said:

Warner Bros. Home Entertainment is aware of potential issues affecting select DVD titles manufactured between 2006 – 2008, and the company has been actively working with consumers to replace defective discs.

Where possible, the defective discs have been replaced with the same title. However, as some of the affected titles are no longer in print or the rights have expired, consumers have been offered an exchange for a title of like-value.

Consumers with affected product can contact the customer support team at whv@wbd.com.

Collectors have known about this problem for years

It’s helpful that WBD recently provided some clarity about this situation, but its statement to JoBlo appears to be the first time the company has publicly acknowledged the disc problems. This is despite DVD collectors lamenting early onset disc rot for years, including via YouTube and online forums.


Bad vibes? Google may have screwed up haptics in the new Pixel Drop update

The unexpected appearance of notification cooldown, along with smaller changes to haptics globally, could be responsible for the complaints. Maybe this is working as intended and Pixel owners are just caught off guard; or maybe Google broke something. It wouldn’t be the first time.


The unexpected appearance of Notification Cooldown in the update might have something to do with the reports—it’s on by default. Credit: Ryan Whitwam

In 2022, Google released an update that weakened haptic feedback on the Pixel 6, making it so soft that people were missing calls. Google released a fix for the problem a few weeks later. If there’s something wrong with the new Pixel Drop, it’s a more subtle problem. People can’t even necessarily explain how it’s different, but most seem to agree that it is.

After testing several Pixel phones both before and after the update, we think there may be some truth to the complaints. The length and intensity of haptic notification feedback feel different on a Pixel 9 Pro XL post-update, but our Pixel 9 Pro feels the same after installing the Pixel Drop. The different models may simply have been tuned differently in the update, or there could be a bug involved. We’ve reached out to Google to ask about this possible issue and have been told the Pixel team is actively investigating the reports.

Updated on 3/7/2025 with comment from Google. 


CMU research shows compression alone may unlock AI puzzle-solving abilities


Tis the season for a squeezin’

New research challenges prevailing idea that AI needs massive datasets to solve problems.

A pair of Carnegie Mellon University researchers recently discovered hints that the process of compressing information can solve complex reasoning tasks without pre-training on a large number of examples. Their system tackles some types of abstract pattern-matching tasks using only the puzzles themselves, challenging conventional wisdom about how machine learning systems acquire problem-solving abilities.

“Can lossless information compression by itself produce intelligent behavior?” ask Isaac Liao, a first-year PhD student, and his advisor Professor Albert Gu from CMU’s Machine Learning Department. Their work suggests the answer might be yes. To demonstrate, they created CompressARC and published the results in a comprehensive post on Liao’s website.

The pair tested their approach on the Abstraction and Reasoning Corpus (ARC-AGI), an unbeaten visual benchmark created in 2019 by machine learning researcher François Chollet to test AI systems’ abstract reasoning skills. ARC presents systems with grid-based image puzzles, each of which provides several examples demonstrating an underlying rule. The system must infer that rule and apply it to a new example.

For instance, one ARC-AGI puzzle shows a grid with light blue rows and columns dividing the space into boxes. The task requires figuring out which colors belong in which boxes based on their position: black for corners, magenta for the middle, and directional colors (red for up, blue for down, green for right, and yellow for left) for the remaining boxes. Here are three other example ARC-AGI puzzles, taken from Liao’s website:

Three example ARC-AGI benchmarking puzzles. Credit: Isaac Liao / Albert Gu
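As a rough illustration of the box-coloring puzzle described above, the rule can be written as a lookup from a box’s position to a color. This is only a sketch, not the ARC-AGI data format: the 3x3 arrangement of boxes is inferred from the description (four corners, one middle, four edges), and the function name `box_color` is hypothetical.

```python
def box_color(row: int, col: int) -> str:
    """Return the color for the box at (row, col) in an assumed 3x3 layout.

    Row 0 is the top row and column 0 is the leftmost column.
    """
    if not (0 <= row <= 2 and 0 <= col <= 2):
        raise ValueError("row and col must each be 0, 1, or 2")
    if (row, col) == (1, 1):
        return "magenta"  # the middle box
    if row != 1 and col != 1:
        return "black"    # the four corner boxes
    if row == 0:
        return "red"      # up
    if row == 2:
        return "blue"     # down
    if col == 2:
        return "green"    # right
    return "yellow"       # left


# Build and print the full layout of colors, one row per line.
layout = [[box_color(r, c) for c in range(3)] for r in range(3)]
for line in layout:
    print(line)
```

The hard part of the benchmark is not applying a rule like this one but inferring it from a handful of examples, which is what the solver has to do on its own.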

The puzzles test capabilities that some experts believe may be fundamental to general human-like reasoning (often called “AGI” for artificial general intelligence). Those properties include understanding object persistence, goal-directed behavior, counting, and basic geometry without requiring specialized knowledge. The average human solves 76.2 percent of the ARC-AGI puzzles, while human experts reach 98.5 percent.

OpenAI made waves in December for the claim that its o3 simulated reasoning model earned a record-breaking score on the ARC-AGI benchmark. In testing with computational limits, o3 scored 75.7 percent on the test, while in high-compute testing (basically unlimited thinking time), it reached 87.5 percent, which OpenAI says is comparable to human performance.

CompressARC achieves 34.75 percent accuracy on the ARC-AGI training set (the collection of puzzles used to develop the system) and 20 percent on the evaluation set (a separate group of unseen puzzles used to test how well the approach generalizes to new problems). Each puzzle takes about 20 minutes to process on a consumer-grade RTX 4070 GPU, compared to top-performing methods that use heavy-duty data center-grade machines and what the researchers describe as “astronomical amounts of compute.”

Not your typical AI approach

CompressARC takes a completely different approach than most current AI systems. Instead of relying on pre-training—the process where machine learning models learn from massive datasets before tackling specific tasks—it works with no external training data whatsoever. The system trains itself in real-time using only the specific puzzle it needs to solve.

“No pretraining; models are randomly initialized and trained during inference time. No dataset; one model trains on just the target ARC-AGI puzzle and outputs one answer,” the researchers write, describing their strict constraints.

The researchers also list “No search” among their constraints, referring to another common technique in AI problem-solving where systems try many different possible solutions and select the best one. Search algorithms work by systematically exploring options (like a chess program evaluating thousands of possible moves) rather than directly learning a solution. CompressARC avoids this trial-and-error approach, relying solely on gradient descent, a mathematical technique that incrementally adjusts the network’s parameters to reduce errors, similar to how you might find the bottom of a valley by always walking downhill.
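To make the “walking downhill” picture concrete, here is a minimal gradient descent sketch on a one-dimensional valley. It is not CompressARC’s training loop: the function being minimized, the learning rate, and the name `gradient_descent` are all assumptions chosen for illustration.

```python
def gradient_descent(grad, x0: float, learning_rate: float = 0.1, steps: int = 100) -> float:
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)  # move a little way downhill
    return x


# Illustrative valley: f(x) = (x - 3)^2, whose slope is 2 * (x - 3).
# The bottom of this valley is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=10.0)
print(f"x after descent: {x_min:.6f}")
```

Unlike a search procedure, the loop never compares candidate answers against each other. It only follows the local slope from wherever it currently stands.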

A block diagram of the CompressARC architecture, created by the researchers. Credit: Isaac Liao / Albert Gu

The system’s core principle uses compression—finding the most efficient way to represent information by identifying patterns and regularities—as the driving force behind intelligence. CompressARC searches for the shortest possible description of a puzzle that can accurately reproduce the examples and the solution when unpacked.

While CompressARC borrows some structural principles from transformers (like using a residual stream with representations that are operated upon), it’s a custom neural network architecture designed specifically for this compression task. It’s not based on an LLM or standard transformer model.

Unlike typical machine learning methods, CompressARC uses its neural network only as a decoder. During encoding (the process of converting information into a compressed format), the system fine-tunes the network’s internal settings and the data fed into it, gradually making small adjustments to minimize errors. This creates the most compressed representation while correctly reproducing known parts of the puzzle. These optimized parameters then become the compressed representation that stores the puzzle and its solution in an efficient format.

An animated GIF showing the multi-step process of CompressARC solving an ARC-AGI puzzle. Credit: Isaac Liao

“The key challenge is to obtain this compact representation without needing the answers as inputs,” the researchers explain. The system essentially uses compression as a form of inference.

This approach could prove valuable in domains where large datasets don’t exist or when systems need to learn new tasks with minimal examples. The work suggests that some forms of intelligence might emerge not from memorizing patterns across vast datasets, but from efficiently representing information in compact forms.

The compression-intelligence connection

The potential connection between compression and intelligence may sound strange at first glance, but it has deep theoretical roots in computer science concepts like Kolmogorov complexity (the length of the shortest program that produces a specified output) and Solomonoff induction, a theoretical gold standard for prediction equivalent to an optimal compression algorithm.
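Kolmogorov complexity itself cannot be computed, but a general-purpose compressor gives a rough, practical stand-in: data with obvious structure shrinks dramatically, while random-looking data barely shrinks at all. The sketch below uses Python’s standard `zlib` module for that comparison. The helper name `compressed_size` and the two sample inputs are assumptions for illustration, and zlib’s output size is only a loose proxy, not a measurement of Kolmogorov complexity.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data at the highest compression level."""
    return len(zlib.compress(data, 9))


# 1,000 bytes with an obvious repeating pattern.
patterned = b"ab" * 500

# 1,000 pseudo-random bytes from a seeded generator (seeded so runs are repeatable).
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(1000))

print("patterned:", compressed_size(patterned), "bytes")
print("noisy:    ", compressed_size(noisy), "bytes")
```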

To compress information efficiently, a system must recognize patterns, find regularities, and “understand” the underlying structure of the data—abilities that mirror what many consider intelligent behavior. A system that can predict what comes next in a sequence can compress that sequence efficiently. As a result, some computer scientists over the decades have suggested that compression may be equivalent to general intelligence. Based on these principles, the Hutter Prize has offered awards to researchers who can compress a 1GB file to the smallest size.
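The link between prediction and compression can be shown with a little arithmetic: a symbol that a model predicts with probability p can be encoded in about -log2(p) bits, so better predictions mean fewer bits. The sketch below compares a model that learns symbol frequencies as it goes against one that treats every symbol as equally likely. The function names, the add-one smoothing, and the sample sequence are assumptions for illustration, and the totals are ideal code lengths rather than the output of a real encoder.

```python
import math
from collections import Counter


def adaptive_code_length_bits(sequence: str, alphabet: str) -> float:
    """Ideal total code length when each symbol is predicted from running counts.

    Uses add-one (Laplace) smoothing so unseen symbols never get probability zero.
    Every symbol in `sequence` is assumed to appear in `alphabet`.
    """
    counts = Counter()
    total_bits = 0.0
    for seen_so_far, symbol in enumerate(sequence):
        probability = (counts[symbol] + 1) / (seen_so_far + len(alphabet))
        total_bits += -math.log2(probability)
        counts[symbol] += 1
    return total_bits


def uniform_code_length_bits(sequence: str, alphabet: str) -> float:
    """Ideal total code length when every symbol is treated as equally likely."""
    return len(sequence) * math.log2(len(alphabet))


sequence = "a" * 90 + "b" * 10  # heavily skewed toward "a"
adaptive_bits = adaptive_code_length_bits(sequence, "ab")
uniform_bits = uniform_code_length_bits(sequence, "ab")
print(f"adaptive predictor: {adaptive_bits:.1f} bits")
print(f"uniform predictor:  {uniform_bits:.1f} bits")
```

The adaptive model needs fewer bits because it quickly learns that “a” is the likely next symbol, which is the sense in which predicting well and compressing well are the same skill.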

We previously wrote about intelligence and compression in September 2023, when a DeepMind paper discovered that large language models can sometimes outperform specialized compression algorithms. In that study, researchers found that DeepMind’s Chinchilla 70B model could compress image patches to 43.4 percent of their original size (beating PNG’s 58.5 percent) and audio samples to just 16.4 percent (outperforming FLAC’s 30.3 percent).

Photo of a C-clamp compressing books.

That 2023 research suggested a deep connection between compression and intelligence—the idea that truly understanding patterns in data enables more efficient compression, which aligns with this new CMU research. While DeepMind demonstrated compression capabilities in an already-trained model, Liao and Gu’s work takes a different approach by showing that the compression process can generate intelligent behavior from scratch.

This new research matters because it challenges the prevailing wisdom in AI development, which typically relies on massive pre-training datasets and computationally expensive models. While leading AI companies push toward ever-larger models trained on more extensive datasets, CompressARC suggests that intelligence can emerge from a fundamentally different principle.

“CompressARC’s intelligence emerges not from pretraining, vast datasets, exhaustive search, or massive compute—but from compression,” the researchers conclude. “We challenge the conventional reliance on extensive pretraining and data, and propose a future where tailored compressive objectives and efficient inference-time computation work together to extract deep intelligence from minimal input.”

Limitations and looking ahead

Even with its successes, Liao and Gu’s system comes with clear limitations that may prompt skepticism. While it successfully solves puzzles involving color assignments, infilling, cropping, and identifying adjacent pixels, it struggles with tasks requiring counting, long-range pattern recognition, rotations, reflections, or simulating agent behavior. These limitations highlight areas where simple compression principles may not be sufficient.

The research has not been peer-reviewed, and the 20 percent accuracy on unseen puzzles, though notable without pre-training, falls significantly below both human performance and top AI systems. Critics might argue that CompressARC could be exploiting specific structural patterns in the ARC puzzles that might not generalize to other domains. That would call into question whether compression alone can serve as a foundation for broader intelligence, rather than being just one component among many required for robust reasoning capabilities.

And yet as AI development continues its rapid advance, if CompressARC holds up to further scrutiny, it offers a glimpse of a possible alternative path that might lead to useful intelligent behavior without the resource demands of today’s dominant approaches. Or at the very least, it might unlock an important component of general intelligence in machines, which is still poorly understood.

Photo of Benj Edwards

Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.


No one asked for this: Google is testing round keys in Gboard

Most Android phones ship with Google’s Gboard as the default input option. It’s a reliable, feature-rich on-screen keyboard, so most folks just keep using it instead of installing a third-party option. Depending on how you feel about circles, it might be time to check out some of those alternatives. Google has quietly released an update that changes the shape and position of the keys, and users are not pleased.

In the latest build of Gboard (v15.1.05.726012951-beta-arm64-v8a), Google has changed the key shape from the long-running squares to circle shapes. If you’re using the four-row layout, the keys are like little pills. In five-row mode with the exposed number row, the keys are collapsed further into circles. The reactions seem split between those annoyed by this change and those annoyed that everyone else is so annoyed.

Change can be hard sometimes, so certainly some of the discontent is just a function of having the phone interface changed without warning. If you find it particularly distasteful, you can head into the Gboard settings and open the Themes menu. From there, you can tap on a theme and then turn off the key borders. Thus, you won’t be distracted by the horror of rounded edges. That’s not the only problem with the silent update, though.

The wave of objections isn’t just about aesthetics—this update also moves the keys around a bit. After years of tapping away on keys with a particular layout, people develop muscle memory. Big texters can sometimes type messages on their phone without even looking at it, but moving the keys around even slightly, as Google has done here, can cause you to miss more keys than you did before the update.


The most intriguing tech gadget prototypes demoed this week


Some of these ideas have genuine shots at making it into real products.

Creating new and exciting tech products requires thinking outside of the box. At this week’s Mobile World Congress (MWC) conference in Barcelona, we got a peek at some of the research and development happening in the hopes of forging a functional gadget that people might actually want to buy one day.

While MWC is best known for its smartphone developments, we thought we’d break down the most intriguing, non-phone prototypes brought to the show for you. Since these are just concept devices, it’s possible that you’ll never see any of the following designs in real products. However, every technology described below is being demonstrated via a tangible proof of concept. And the companies involved—Samsung and Lenovo—both have histories of getting prototyped technologies into real gadgets.

Samsung’s briefcase-tablet

How many times must something repeat before it’s considered a trend? We ask because Samsung Display this week demoed the third recent take we’ve seen on integrating computing devices into suitcases.

Samsung Display’s Flexible Briefcase prototype uses an 18.1-inch OLED tablet that “can be folded into a compact briefcase for convenience,” per the company’s announcement. Samsung Display brought a proof of concept to MWC, but attendees say they haven’t been allowed to touch it.

But just looking at it, the device appears similar to LG’s StanbyMe Go (27LX5QKNA), which is a real product that people can buy. LG’s product is a 27-inch tablet that can fold out from a briefcase and be propped up within the luggage. Samsung’s prototype looks more like a metal case that opens up to reveal a foldable, completely removable tablet.

The folding screen could yield a similar experience to using a foldable laptop. However, that raises questions about how one could easily navigate the tablet via touch and why a massive folding tablet in luggage is better than a regular one. Samsung Display is a display supplier and doesn’t make gadgets, though, so it may leave answering those questions to someone else.

Samsung’s concept also brings to mind the Base Case, a briefcase that encapsulates two 24-inch monitors for mobile work setups. The Base Case is also not a real product currently and is seeking crowdfunding.

The laptop that bends over backward for you

There are several laptops that you can buy with a foldable screen right now. But none of them bends the way that Lenovo’s Thinkbook Flip AI concept laptop does. As Lenovo described it, the computer’s OLED panel uses two hinges for “outward folding,” enabling the display to go from 13 inches to 18.1 inches.

Lenovo Thinkbook Flip AI Concept

A new trick. Credit: Lenovo

Enhanced flexibility is supposed to enable the screen to adapt to different workloads. In addition to the concept functioning like a regular clamshell laptop, users could extend the screen into an extra-tall display. That could be helpful for tasks like reading long documents or scrolling social feeds.

Lenovo Thinkbook Flip AI Concept in Vertical Mode.

This would be “Vertical Mode.” Credit: Lenovo

There’s also Share Mode, which enables you and someone sitting across from you to both see the laptop’s display.

Again, every concept in this article may never be sold in actual products. Still, Lenovo’s prototype is said to be fully functional with an Intel Core Ultra 7 processor, 32GB of LPDDR5X RAM, and a PCIe SSD (further spec details weren’t provided). Lenovo also has a strong record of incorporating prototypes into final products. For example, this June, Lenovo is scheduled to release the rollable-screen laptop that it showed as a proof of concept in February 2023.

Hands-on with Lenovo’s Foldable Laptop Concept at MWC 2025.

Lenovo’s solar-powered gadgets

There are many complications involved in making a solar-powered laptop. For one, depending on the configuration, laptops can drain power quickly. Having a computer rely on the sun for power would lead to numerous drawbacks, like shorter battery life and dimmer screens.

In an attempt to get closer to addressing some of those problems, the Lenovo Yoga Solar PC Concept has a solar panel integrated into its cover. Lenovo claims the panel has a conversion rate of “over 24 percent.” Per Lenovo’s announcement:

This conversion rate is achieved by leveraging ‘Back Contact Cell’ technology, which moves mounting brackets and gridlines to the back of the solar cells, maximizing the active absorption. The … Dynamic Solar Tracking system constantly measures the solar panel’s current and voltage, working with the Solar-First Energy system to automatically adjust the charger’s settings to prioritize sending the harvested energy to the system, helping to ensure maximum energy savings and system stability, regardless of light intensity. Even in low-light conditions, the panel can still generate power, sustaining battery charge when the PC is idle.

We’ll take Lenovo’s claims with a grain of salt, but Lenovo does appear to be seriously researching solar-powered gadgets. The vendor claimed that its solar panel can absorb and convert enough “direct,” “optimal,” and “uninterrupted” sunlight in 20 minutes for an hour of 1080p video playback with default settings. That conversion rate could drop based on how bright the sunlight is, the angle at which sunlight is hitting the PC, geographic location, and other factors.

For certain types of users, though, solar power will not be reliably powerful enough to be their computer’s sole source of power. A final offering would have better appeal using solar power as a backup. Lenovo is also toying with that idea through its Solar Power Kit attachment proof of concept.

Lenovo’s idea of a Solar Power Kit for its Yoga line of laptops.

Lenovo’s Solar Power Kit proof of concept. Credit: Lenovo

Lenovo designed it to provide extra power to Lenovo Yoga laptops. The solar panel can use its own kickstand or attach to whatever else is around, like a backpack or tree. It absorbs solar energy and converts it to PC power using Maximum Power Point Tracking, Lenovo said. The Kit would attach to laptops via a USB-C cable. Another option is to use the Solar Power Kit to charge a power bank.
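Lenovo did not say which Maximum Power Point Tracking method the Kit uses. One widely used approach is “perturb and observe,” which nudges the panel’s operating voltage and keeps moving in whichever direction yields more power. The sketch below assumes that method and a made-up power curve, so `panel_power`, its 18-volt peak, and the step size are all hypothetical and say nothing about Lenovo’s hardware.

```python
def panel_power(voltage: float) -> float:
    """Hypothetical power curve in watts, peaking at 18 V. Not a real panel model."""
    return max(0.0, 60.0 - 0.5 * (voltage - 18.0) ** 2)


def perturb_and_observe(power_at, voltage: float = 12.0,
                        step: float = 0.1, iterations: int = 200) -> float:
    """Nudge the voltage each iteration; reverse direction whenever power drops."""
    direction = 1.0
    last_power = power_at(voltage)
    for _ in range(iterations):
        voltage += direction * step
        power = power_at(voltage)
        if power < last_power:
            direction = -direction  # the last nudge made things worse, so turn around
        last_power = power
    return voltage


operating_voltage = perturb_and_observe(panel_power)
print(f"settled near {operating_voltage:.2f} V, "
      f"producing about {panel_power(operating_voltage):.2f} W")
```

A tracker like this never settles exactly on the peak. It keeps oscillating a step or so around it, which also lets it follow the peak as the light changes.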

Lenovo isn’t limiting this concept to PCs and suggested that the tech it demonstrated could be applied to other devices, like tablets and speakers.

A new take on foldable gaming handhelds

We’ve seen gaming handheld devices that can fold in half before. But the gadget that Samsung Display demoed this week brings the fold to the actual screen.

Samsung Display Flex Gaming

The crease would be a problem for me. Credit: Samsung

Again, Samsung Display is a display supplier, so it makes sense that its approach to a new gaming handheld focuses on the display. The prototype it brought to MWC, dubbed Flex Gaming, is smaller than a Nintendo Switch and includes joysticks and areas that look fit for D-pads or buttons.

The emphasis is on the foldable display, which could make a gadget extremely portable but extra fragile. We’d also worry about the viewing experience. Foldable screens have visible creases, especially when viewed from different angles or in bright conditions. Both of those conditions are likely to come up with a gaming device meant for playing on the go.

Still, companies are eager to force folding screens into more types of devices, with the tech already expanding from phones to PCs and monitors. And although all of the concepts in this article may never evolve into real products, Samsung Display has shown repeated interest in providing unique displays for handheld gaming devices. At the 2023 CES trade show in Las Vegas, Samsung demoed a similar device with a horizontal fold, like a calendar, compared to the newer prototype’s book-like vertical fold.

It’s unclear why the fold changed from prototype to prototype, but we do know that this is a concept that Samsung Display has been playing with for at least a few years. In 2022, Samsung filed a patent application for a foldable gaming handheld that looks similar to the device shown off at MWC 2025:

Samsung Display foldable gaming console

An image from Samsung Display’s patent application. Credit: Samsung Display/WIPO

Lenovo’s magnetic PC accessories

Framework has already proven how helpful modular laptops can be for longevity and durability. Being able to add new components and parts to your PC enables the system to evolve with the times and your computing needs.

Framework’s designs largely focus on easily upgrading essential computer parts, like RAM, keyboards, and ports. Lenovo’s new concepts, on the other hand, offer laptop accessories that you can live without.

Among the prototypes that Lenovo demoed this week is a small, circular display adorned with cat ears and a tail. The display shows a smiley face with an expression that changes based on what you’re doing on the connected system and “offers personalized emoji notifications,” per Lenovo. The Tiko Pro Concept is a small screen that attaches to a Lenovo Thinkbook laptop and shows widgets, like the time, a stopwatch, transcriptions, or a combination.

Likely offering greater appeal, Lenovo also demoed detachable secondary laptop screens, including a pair of 13.3-inch displays that connect to the left and right sides of a Lenovo laptop’s display, plus a 10-inch option.

Lenovo’s idea for magnetically attachable secondary laptop screens.

Lenovo’s idea for magnet-attachable secondary laptop screens. Credit: Lenovo

Lenovo demoed these attachments on a real upcoming laptop, the Thinkbook 16p Gen 6 (which is supposed to come out in June starting at 2,999 euros excluding VAT, or about $3,245).

Lenovo has been discussing using pogo pins to attach removable accessories to laptops since CES 2024. PCMag reported that the company plans to release a Magic Bay webcam with 4K resolution and speakers this year.

Photo of Scharon Harding

Scharon is a Senior Technology Reporter at Ars Technica writing news, reviews, and analysis on consumer gadgets and services. She’s been reporting on technology for over 10 years, with bylines at Tom’s Hardware, Channelnomics, and CRN UK.
