bugs

What kind of bug would make machine learning suddenly 40% worse at NetHack?

Large Moon Models (LMMs) —

One day, a roguelike-playing system just kept biffing it, for celestial reasons.

Moon rendered in ASCII text, with

Aurich Lawson

Members of the Legendary Computer Bugs Tribunal, honored guests, if I may have your attention? I would, humbly, submit a new contender for your esteemed judgment. You may or may not find it novel, you may even deign to call it a “bug,” but I assure you, you will find it entertaining.

Consider NetHack. It is one of the all-time great roguelike games, and I mean that in the stricter sense of the term. The content is procedurally generated, deaths are permanent, and the only thing you keep from game to game is your skill and knowledge. I do understand that the only thing two roguelike fans can agree on is how wrong the third roguelike fan is in their definition of roguelike, but, please, let us move on.

NetHack is great for machine learning…

Being a difficult game full of consequential choices and random challenges, as well as a “single-agent” game that can be generated and played at lightning speed on modern computers, NetHack is great for those working in machine learning—or imitation learning, actually, as detailed in Jens Tuyls’ paper on how compute scaling affects single-agent game learning. Using Tuyls’ model of expert NetHack behavior, Bartłomiej Cupiał and Maciej Wołczyk trained a neural network to play and improve itself using reinforcement learning.

By mid-May of this year, the two had their model consistently scoring 5,000 points by their own metrics. Then, on one run, the model suddenly got worse, on the order of 40 percent. It scored 3,000 points. Machine learning generally, gradually, goes in one direction with these types of problems. It didn’t make sense.

Cupiał and Wołczyk tried quite a few things: reverting their code, restoring their entire software stack from a Singularity backup, and rolling back their CUDA libraries. The result? 3,000 points. They rebuilt everything from scratch, and it was still 3,000 points.

NetHack, played by a regular human.

… except on certain nights

As detailed in Cupiał’s X (formerly Twitter) thread, this was several hours of confused trial and error by him and Wołczyk. “I am starting to feel like a madman. I can’t even watch a TV show constantly thinking about the bug,” Cupiał wrote. In desperation, he asks model author Tuyls if he knows what could be wrong. He wakes up in Kraków to an answer:

“Oh yes, it’s probably a full moon today.”

In NetHack, the game in which the DevTeam has thought of everything, if the game detects from your system clock that it should be a full moon, it will generate a message: “You are lucky! Full moon tonight.” A full moon imparts a few player benefits: a single point added to Luck, and werecreatures mostly kept to their animal forms.
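To make the clock check concrete, here is a minimal Python sketch of how a game could derive a moon phase from nothing but the calendar date. It is modeled on the integer approximation usually attributed to NetHack's C source; the constants are written from memory, so treat them as an assumption rather than a checked copy. The function names, the `FULL_MOON` constant, and the `luck_bonus` helper are hypothetical names chosen for this illustration.

```python
from datetime import date

# Phase index convention assumed here: 0 = new moon, 4 = full moon (range 0-7).
NEW_MOON = 0
FULL_MOON = 4


def phase_of_the_moon(day: date) -> int:
    """Approximate the moon phase (0-7) for a calendar date.

    ASSUMPTION: the constants follow the approximation attributed to
    NetHack's source; they were not verified against a specific release.
    """
    # C's tm_yday is 0-based, Python's is 1-based, so subtract one.
    day_in_year = day.timetuple().tm_yday - 1
    # Position in the 19-year Metonic cycle; C's tm_year counts from 1900.
    golden = ((day.year - 1900) % 19) + 1
    # Rough age of the moon at the start of the year.
    epact = (11 * golden + 18) % 30
    if (epact == 25 and golden > 11) or epact == 24:
        epact += 1
    return (((((day_in_year + epact) * 6) + 11) % 177) // 22) & 7


def luck_bonus(day: date) -> int:
    """Hypothetical helper: +1 Luck on a full moon, as the article describes."""
    return 1 if phase_of_the_moon(day) == FULL_MOON else 0


if __name__ == "__main__":
    # 23 May 2024 was a full moon; 8 May 2024 was a new moon.
    print(phase_of_the_moon(date(2024, 5, 23)))  # 4
    print(phase_of_the_moon(date(2024, 5, 8)))   # 0
```

Because the only input is the host's date, any environment that inherits the same system clock gets the same phase, whatever version of the code or libraries it runs.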

It’s an easier game, all things considered, so why would the learning agent’s score be lower? Its training data simply doesn’t cover full-moon conditions, so a branching series of decisions likely leads to worse outcomes, or just confusion. It was indeed a full moon in Kraków when the 3,000-ish scores started showing up. What a terrible night to have a learning model.

Of course, “score” is not a real metric for success in NetHack, as Cupiał himself noted. Ask a model to get the best score, and it will farm the heck out of low-level monsters because it never gets bored. “Finding items required for [ascension] or even [just] doing a quest is too much for pure RL agent,” Cupiał wrote. Another agent, AutoAscend, which is a rule-based bot rather than a neural network, does a better job of progressing through the game, but “even it can only solve sokoban and reach mines end,” Cupiał notes.

Is it a bug?

I submit to you that, although NetHack responded to the full moon in its intended way, this quirky, very hard-to-fathom stop on a machine-learning journey was indeed a bug and a worthy one in the pantheon. It’s not a Harvard moth, nor a 500-mile email, but what is?

Because the team used Singularity to back up and restore their stack, they inadvertently carried forward the machine time and resulting bug each time they tried to solve it. The machine’s resulting behavior was so bizarre, and seemingly based on unseen forces, that it drove a coder into fits. And the story has a beginning, a climactic middle, and a denouement that teaches us something, however obscure.

The NetHack Lunar Learning Bug is, I submit, quite worth memorializing. Thank you for your time.

Pixel phones are broken again with critical storage permission bug

Did Google lay off all their bug testers? —

Users say they can’t access their device storage after January 2024 update.

It’s almost hard to believe this is happening again, but Pixel users are reporting that an OS update has locked them out of their phones’ internal storage, causing app crashes, non-functional phones, and a real possibility of data loss. Over in the Google Pixel subreddit, user “Liv-Lyf” compiled a dozen posts that complain of an “internal storage access issue” and blame the January 2024 Google Play system update.

In October, Pixel phones faced a nightmare storage bug that caused bootlooping, inaccessible devices, and data loss. The recent post says, “The symptoms are all the same” as that October bug, with “internal storage not getting mounted, camera crashes, Files app shows no files, screenshots not getting saved, internal storage shows up empty in ADB Shell, etc.” When asked for a comment, Google told Ars, “We’re aware of this issue and are looking into it,” and a Google rep posted effectively the same statement in the comments.

In the October bug, users were locked out of their system storage due to a strange permissions issue. A phone trying to run without any user access to its own storage is a mess. It breaks the camera and screenshots because you can’t write media. File managers read “0 bytes” for every file and folder. Nothing works over USB, and some phones, understandably, just fail to boot. The issue in October arrived as part of the initial Android 14 release and only affected devices that had multiple users set up.

Picking through the posts, it’s unclear if there’s a certain type of user that should be more wary of the January 2024 Google Play update. Some users say they haven’t enabled the multiple-user functionality, but several mention having a work profile enabled. Work Profiles aren’t quite “multiple users,” but the system leverages a lot of multi-user features to let users have duplicate “personal” and “work” copies of the same apps. Many users don’t say if they have a work profile or not.

The “January 2024 Google Play system update” isn’t the usual OTA system update but is a Project Mainline or APEX module. These take core system components and wrap them up into easily distributable packaging where they can be delivered via the Play Store, much like an app, but with way more permissions (only Google can make Play system updates). Google posts release notes for Play system updates, and there’s nothing in the January 2024 update that jumps out as the potential cause of a storage access problem. You can check your current version on a Pixel phone by going to Settings, Security & Privacy, then “System & updates.” At the bottom, you’ll see a month and year for your “Google Play system update” level. DO NOT tap on this section because that will bring up the update screen.

Google’s “we’re looking into it” statement doesn’t give users much guidance on how they should deal with this in the meantime. A good first step, at any time, is to ensure you have backups of all your important phone data. Obviously, avoiding the January 2024 Google Play system update is recommended for now, but I don’t think there’s a way for users to do that. Google Play system updates don’t offer users any controls, so you’re mostly hoping an automatic update doesn’t brick your phone. The good news is that the Google Play system often fails to check for updates. They get installed on reboot, so try not to power cycle your phone. Disabling a work profile and any other multi-user features sounds like a good idea if you can manage that. There are instructions here.
