Google says running AI models on phones is a huge RAM hog

8GB of RAM ought to be enough for anybody

Google wants AI models to be loaded 24/7, so 8GB of RAM might not be enough.

The Google Gemini logo.

Google

In early March, Google made the odd announcement that only one of its two latest smartphones, the Pixel 8 and Pixel 8 Pro, would be able to run its latest AI model, called “Google Gemini.” Despite having very similar specs, the smaller Pixel 8 wouldn’t get the new AI model, with the company citing mysterious “hardware limitations” as the reason. It was a strange statement: Google designed and marketed the Pixel 8 as AI-centric, then designed a smartphone-centric AI model called “Gemini Nano,” yet still couldn’t make the two work together.

A few weeks later, Google is backtracking somewhat. The company announced on the Pixel Phone Help forum that the smaller Pixel 8 actually will get Gemini Nano in the next big quarterly Android release, which should happen in June. There’s a catch, though—while the Pixel 8 Pro will get Gemini Nano as a user-facing feature, on the Pixel 8, it’s only being released “as a developer option.” That means you’ll be able to turn it on only via the hidden Developer Options menu in the settings, and most people will miss out on it.

Google’s Seang Chau, VP of devices and services software, explained the decision on the company’s in-house “Made by Google” podcast. “The Pixel 8 Pro, having 12GB of RAM, was a perfect place for us to put [Gemini Nano] on the device and see what we could do,” Chau said. “When we looked at the Pixel 8 as an example, the Pixel 8 has 4GB less memory, and it wasn’t as easy of a call to just say, ‘all right, we’re going to enable it on Pixel 8 as well.’” According to Chau, Google is hesitant because it doesn’t want to “degrade the experience” on the smaller Pixel 8, which has only 8GB of RAM.

Chau went on to describe what it’s like to have a large language model like Gemini Nano on your phone, and it sounds like there are big trade-offs involved. Google wants some of the AI models to be “RAM-resident” so they’re always loaded in memory. One such feature is “smart reply,” which tries to auto-generate text replies.

Chau told the podcast, “Smart Reply is something that requires the models to be RAM-resident so that it’s available all the time. You don’t want to wait for the model to load on a Gboard reply, so we keep it resident.” On the Pixel 8 Pro, smart reply can be turned on and off via the normal keyboard settings, but on the Pixel 8, you’ll need to turn on the developer flag first.
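Chau’s distinction between a RAM-resident model and one loaded on demand can be sketched in a few lines of Python. This is purely illustrative: the class names, footprint, and load time below are made-up numbers for the sake of the example, not Gemini Nano’s actual figures or Android’s actual API.

```python
import time

MODEL_SIZE_MB = 1800   # hypothetical footprint for a small on-device LLM
LOAD_TIME_S = 0.1      # hypothetical cold-load time from flash storage


class OnDemandModel:
    """Loads weights per request, then frees them: no idle RAM cost, slow replies."""

    def generate(self, prompt: str) -> str:
        time.sleep(LOAD_TIME_S)            # simulate paging weights in from storage
        return f"reply to: {prompt}"       # weights are freed once the call returns


class ResidentModel:
    """Loads weights once and keeps them pinned: fast replies, permanent RAM cost."""

    def __init__(self) -> None:
        time.sleep(LOAD_TIME_S)            # one-time load, e.g. at boot
        self.resident_mb = MODEL_SIZE_MB   # this RAM is never handed back to apps

    def generate(self, prompt: str) -> str:
        return f"reply to: {prompt}"       # no load penalty: weights already in memory
```

The trade-off is exactly the one Chau describes: the resident model answers instantly every time, but that 1.8GB (in this toy example) is subtracted from what the rest of the phone can use, whether or not anyone is typing a reply.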

The bigger Pixel 8 Pro gets the latest AI features. The smaller model will have it locked behind a developer option.

Google

So unlike an app, which can be loaded and unloaded as you use it, running something like Gemini Nano could mean permanently losing what is apparently a big chunk of system memory. The baseline of 8GB of RAM for Android phones may need to be increased again in the future. The high mark we’ve seen for phones is 24GB of RAM, and the bigger flagships usually have 12GB or 16GB of RAM, so it’s certainly doable.

Google’s Gemini Nano model is also shipping on the Galaxy S24 lineup, and the base model there has 8GB of RAM, too. When Google originally cited hardware limitations on the Pixel 8 for the feature’s absence, its explanation was confusing—if the base-model S24 can run it, the Pixel 8 should be able to as well. It’s all about how much of a trade-off you’re willing to make in available memory for apps, though. Chau says the team is “still doing system health validation because even if you’re a developer, you might want to use your phone on a daily basis.”

The elephant in the room, though, is that as a user, I don’t even know if I want Gemini Nano on my phone. We’re at the peak of the generative AI hype cycle, and Google has its own internal reasons (the stock market) for pushing AI so hard. While visiting ChatGPT and asking it questions can be useful, that’s just an app. Actually useful OS-level generative AI features are few and far between. I don’t really need a keyboard to auto-generate replies. If it’s just going to use up a bunch of RAM that could be used by apps, I might want to turn it off.