Author name: Tim Belzer


“Awful”: Roku tests autoplaying ads loading before the home screen

Owners of smart TVs and streaming sticks running Roku OS are already subject to video advertisements on the home screen. Now, Roku is testing what it might look like if it took things a step further and forced people to watch a video ad play before getting to the Roku OS home screen.

Reports of Roku customers seeing video ads automatically play before they could view the OS’ home screen started appearing online this week. A Reddit user, for example, posted yesterday: “I just turned on my Roku and got an … ad for a movie, before I got to the regular Roku home screen.” Several people who appear to be Roku users reported seeing an ad for the movie Moana 2. The ads have a close option, but some users appear not to have seen it.

When reached for comment, a Roku spokesperson shared a company statement confirming that the autoplaying ads are expected behavior but are not currently a permanent part of Roku OS. Instead, Roku claimed, it is simply trying the ad capability out.

Roku’s representative said that Roku’s business “has and will always require continuous testing and innovation across design, navigation, content, and our first-rate advertising products,” adding:

Our recent test is just the latest example, as we explore new ways to showcase brands and programming while still providing a delightful and simple user experience.

Roku didn’t respond to requests for comment on whether it has plans to make autoplaying ads permanent on Roku OS, which devices are affected, why Roku decided to use autoplaying ads, or customer backlash.



UK online safety law Musk hates kicks in today, and so far, Trump can’t stop it

Enforcement of a first-of-its-kind United Kingdom law that Elon Musk wants Donald Trump to gut kicked in today, and huge penalties could soon follow for any Big Tech companies deemed non-compliant.

The UK’s Online Safety Act (OSA) forces tech companies to detect and remove dangerous online content, threatening fines of up to 10 percent of global turnover. In extreme cases, widely used platforms like Musk’s X could be shut down or executives even jailed if UK online safety regulator Ofcom determines there has been a particularly egregious violation.

Critics call it a censorship bill. The law lists over 130 “priority” offenses across 17 categories, detailing what content platforms must remove. The list includes illegal content connected to terrorism, child sexual exploitation, human trafficking, illegal drugs, animal welfare, and other crimes. But it also broadly restricts content in legally gray areas, like posts considered “extreme pornography,” harassment, or controlling behavior.

Matthew Lesh, a public policy fellow at the Institute of Economic Affairs, told The Telegraph that “the idea that Elon Musk, or any social media executive, could be jailed for failing to remove enough content should send chills down the spine of anyone who cares about free speech.”

Musk has publicly signaled that he expects Trump to intervene, saying, “Thank goodness Donald Trump will be president just in time,” regarding the OSA’s enforcement starting in March, The Telegraph reported last month. The X owner has been battling UK regulators since last summer after resisting requests from the UK government to remove misinformation during riots considered the “worst unrest in England for more than a decade,” The Financial Times reported.

According to Musk, X was refusing to censor UK users. Attacking the OSA, Musk falsely claimed Prime Minister Keir Starmer’s government was “releasing convicted pedophiles in order to imprison people for social media posts,” FT reported. Such a post, if seen as spreading misinformation potentially inciting violence, could be banned under the OSA, the FT suggested.

Trump’s UK deal may disappoint Musk

Musk hopes that Trump will strike a deal with the UK government to potentially water down the OSA.



A tough race for the rookies as F1 starts 2025 in Australia

Williams’ Alex Albon scored a fine fifth place for the storied team. The preseason vibes for Williams were correct—after a few years of being one of the slowest cars, if not the slowest, it now looks to be leading the midfield. And Racing Bulls’ Yuki Tsunoda demonstrated that he probably should have been promoted to the Red Bull team with a strong fifth place in qualifying that sadly did not translate into points in the race.

The Sauber team, which becomes Audi next year, appeared dreadful in Bahrain but arrived in Oz with some new bodywork, including a revised front wing. That helped Nico Hulkenberg finish seventh, scoring more points in the process than the Swiss-based team managed across all 24 races last year.

Albon drove a great race to fifth place. Credit: James Sutton – Formula 1/Formula 1 via Getty Images

Better luck in China

It was a much harder day for some, including most of the rookies. Racing Bulls’ new driver, Isack Hadjar, was caught out on the formation lap by the differing grip between the asphalt and the painted lines on what are public roads for most of the year. Cleaning up the crash delayed the start by 15 minutes as a distraught Hadjar made his way back to the pits to watch the race unfold without him. He narrowly lost out on the F2 title at the end of last year when his car stalled at the start, so one hopes he can put his last couple of races behind him.

Alpine’s Jack Doohan, Sauber’s Gabriel Bortoleto (who beat Hadjar to the F2 championship last year), and Red Bull’s Liam Lawson (who sort of still counts as a rookie) also each ended their days prematurely after crashing out, but so too did former world champion Fernando Alonso and last year’s race winner Carlos Sainz. That two such experienced drivers also got caught out should bring some comfort to the four youngsters.

It was also a rough start to Lewis Hamilton’s tenure at Ferrari. The seven-time world champion and his new race engineer were developing their working relationship in real time, and Hamilton bristled at the constant suggestions from the pit wall. It was an underwhelming day in general for Ferrari, which finished only eighth (Leclerc) and 10th (Hamilton).

Isack Hadjar crashed out of the Australian Grand Prix before it even happened. Credit: Kym Illman/Getty Images

The sport returns next weekend in China.



The Wheel of Time is back for season three, and so are our weekly recaps

Andrew Cunningham and Lee Hutchinson have spent decades of their lives with Robert Jordan and Brandon Sanderson’s Wheel of Time books, and they previously brought that knowledge to bear as they recapped each episode of the first and second seasons of Amazon’s WoT TV series. Now they’re back in the saddle for season three, bringing insights, jokes, and the occasional wild theory along for the ride.

These recaps won’t cover every element of every episode, but they will contain major spoilers for the show and the book series. We’ll do our best to not spoil major future events from the books, but there’s always the danger that something might slip out. If you want to stay completely unspoiled and haven’t read the books, these recaps aren’t for you.

New episodes of The Wheel of Time season three will be posted for Amazon Prime subscribers every Thursday. This write-up covers the entire three-episode season premiere, which was released on March 13.

Lee: Welcome back! Holy crap, has it only been 18 months since we left our broken and battered heroes standing in tableaux, with the sign of the Dragon flaming above Falme? Because it feels like it’s been about ten thousand years.

Andrew: Yeah, I’m not saying I want to return to the days when every drama on TV had 26 hour-long episodes per season, but when you’re doing one eight-episode run every year-and-a-half-to-two-years, you really feel those gaps. And maybe it’s just [waves arms vaguely at The World], but I am genuinely happy to have this show back.

This season’s premiere simply whips, balancing big action set-pieces with smaller character moments. The whole production seems to be hitting a confident stride: the cast has gelled, and the writers know what book stuff they’re choosing to adapt and what they’re going to skip. I’m sure there will still be grumbles, but the show does finally feel like it’s become its own thing.

Rosamund Pike returns as Moiraine Damodred. Credit: Courtesy of Prime/Amazon MGM Studios

Lee: Oh yeah. The first episode hits the ground running, with explosions and blood and stolen ter’angreal. And we’ve got more than one episode to talk about—the gods of production at Amazon have given us a truly gigantic three-episode premiere, with each episode lasting more than an hour. Our content cup runneth over!

Trying to straight-up recap three hours of TV isn’t going to happen in the space we have available, so we’ll probably bounce around a bit. What I wanted to talk about first was exactly what you mentioned: unlike seasons one and two, this time, the show seems to have found itself and locked right in. To me, it feels kind of like Star Trek: The Next Generation’s third season versus its first two.

Andrew: That’s a good point of comparison. I feel like a lot of TV shows fall into one of two buckets: either it starts with a great first season and gradually falls off, or it gets off to a rocky start and finds itself over time. Fewer shows get to take the second path because a “show with a rocky start” often becomes a “canceled show,” but they can be more satisfying to watch.

The one Big Overarching Plot Thing to know for book readers is that they’re basically doing book 4 (The Shadow Rising) this season, with other odds and ends tucked in. So even if it gets canceled after this, at least they will have gotten to do what I think is probably the series’ high point.

Lee: Yep, we find out in our very first episode this season that we’re going to be heading to the Aiel Waste rather than the southern city of Tear, which is a significant re-ordering of events from the books. But unlike some of the previous seasons’ changes that feel like they were forced upon the show by outside factors (COVID, actors leaving, and so on), this one feels like it serves a genuine narrative purpose. Rand is reciting the Prophecies of the Dragon to himself, and he knows he needs the “People of the Dragon” to guarantee success in Tear. While he’s not exactly sure who the “People of the Dragon” might be, it’s obvious that Rand has no army as of yet. Maybe the Aiel can help?

Rand is doing all of this because both the angel and the devil on Rand’s shoulders—that’s the Aes Sedai Moiraine Damodred with cute blue angel wings and the Forsaken Lanfear in fancy black leather BDSM gear—want him wielding Callandor, The Sword That is Not a Sword (as poor Mat Cauthon explains in the Old Tongue). This powerful sa’angreal is located in the heart of the Stone of Tear (it’s the sword in the stone, get it?!), and its removal from the Stone is a major prophetic sign that the Dragon has indeed come again.

Book three is dedicated to showing how all that happens—but, like you said, we’re not in book three anymore. We’re gonna eat our book 4 dessert before our book 3 broccoli!

Natasha O’Keeffe as Lanfear. Credit: Courtesy of Prime/Amazon MGM Studios

Andrew: I like book 4 a lot (and I’d include 5 and 6 here too) because I think it’s when Robert Jordan was doing his best work balancing his worldbuilding and politicking with the early books’ action-adventure stuff, and including multiple character perspectives without spreading the story so thin that it could barely move forward. Book 3 was a stepping stone to this because the first two books had mainly been Rand’s, and we spend almost no time in Rand’s head in book 3. But you can’t do that in a TV show! So they’re mixing it up. Good! I am completely OK with this.

Lee: What did you think of Queen Morgase’s flashback introduction where we see how she won the Lion Throne of Andor (flanked by a pair of giant lions that I’m pretty sure came straight from Pier One Imports)? It certainly seemed a bit… evil.

Andrew: One of the bigger swerves that the show has taken with an established book character, I think! And well before she can claim to have been under the control of a Forsaken. (The other swerves I want to keep tabs on: Moiraine actively making frenemies with Lanfear to direct Rand, and Lan being the kind of guy who would ask Rand if he “wants to talk about it” when Rand is struggling emotionally. That one broke my brain, the books would be half as long as they are if men could openly talk to literally any other men about their states of mind.)

But I am totally willing to accept that Morgase change because the alternative is chapters and chapters of people yapping about consolidating political support and daes dae’mar and on and on. Bo-ring!

But speaking of Morgase and Forsaken, we’re starting to spend a little time with all the new baddies who got released at the end of last season. How do you feel about the ones we’ve met so far? I know we were generally supportive of the fact that the show is just choosing to have fewer of them in the first place.

Lee: Hah, I loved the contrast with Book Lan, who appears to only be capable of feeling stereotypically manly feelings (like rage, shame, or the German word for when duty is heavier than a mountain, which I’m pretty sure is something like “Bergpflichtenschwerengesellschaften”). It continues to feel like all of our main characters have grown up significantly from their portrayals on the page—they have sex, they use their words effectively, and they emotionally support each other like real people do in real life. I’m very much here for that particular change.

But yes, the Forsaken. We know from season two that we’re going to be seeing fewer than in the books—I believe we’ve got eight of them to deal with, and we meet almost all of them in our three-episode opening blast. I’m very much enjoying Moghedien’s portrayal by Laia Costa, but of course Lanfear is stealing the show and chewing all the scenery. It will be fascinating to see how the show lets the others loose—we know from the books that every one of the Forsaken has a role to play (including one specific Forsaken whose existence has yet to be confirmed but who figures heavily into Rand learning more about how the One Power works), and while some of those roles can be dropped without impacting the story, several definitely cannot.

And although Elaida isn’t exactly a Forsaken, it was awesome to see Shohreh Aghdashloo bombing around the White Tower looking fabulous as hell. Chrisjen Avasarala would be proud.

The boys, communicating and using their words like grown-ups. Credit: Courtesy of Prime/Amazon MGM Studios

Andrew: Maybe I’m exaggerating but I think Shohreh Aghdashloo’s actual voice goes deeper than Hammed Animashaun’s lowered-in-post-production voice for Loial. It’s an incredible instrument.

Meeting Morgase in these early episodes means we also meet Gaebril, and the show only fakes viewers out for a few scenes before revealing what book-readers know: that he’s the Forsaken Rahvin. But I really love how these scenes play, particularly his with Elayne. After one weird, brief look, they fall into a completely convincing chummy, comfortable stepdad-stepdaughter relationship, and right after that, you find out that, oops, nope, he’s been there for like 15 minutes and has successfully One Power’d everyone into believing he’s been in their lives for decades.

It’s something that we’re mostly told-not-shown in the books, and it really sells how powerful and amoral and manipulative all these characters are. Trust is extremely hard to come by in Randland, and this is why.

Lee: I very much liked the way Gaebril’s/Rahvin’s crazy compulsion comes off, and I also like the way Nuno Lopes is playing Gaebril. He seems perhaps a little bumbling, and perhaps a little self-effacing—truly, a lovable uncle kind of guy. The kind of guy who would say “thank you” to a servant and smile at children playing. All while, you know, plotting the downfall of the kingdom. In what is becoming a refrain, it’s a fun change from the books.

And along the lines of unassuming folks, we get our first look at a Gray Man and the hella creepy mechanism by which they’re created. I can’t recall in the books if Moghedien is explicitly mentioned as being able to fashion the things, but she definitely can in the show! (And it looks uncomfortable as hell. “Never accept an agreement that involves the forcible removal of one’s soul” is an axiom I try to live by.)

Olivia Williams as Queen Morgase Trakand and Shohreh Aghdashloo as Elaida do Avriny a’Roihan. Credit: Courtesy of Prime/Amazon MGM Studios

Andrew: It’s just one of quite a few book things that these first few episodes speedrun. Mat has weird voices in his head and speaks in tongues! Egwene and Elayne pass the Accepted test! (Having spent most of an episode on Nynaeve’s Accepted test last season, the show yada-yadas this a bit, showing us just a snippet of Egwene’s Rand-related trials and none of Elayne’s test at all.) Elayne’s brothers Gawyn and Galad show up, and everyone thinks they’re very hot, and Mat kicks their asses! The Black Ajah reveals itself in explosive fashion, and Siuan can only trust Elayne and Nynaeve to try and root them out! Min is here! Elayne and Aviendha kiss, making more of the books’ homosexual subtext into actual text! But for the rest of the season, we split the party in basically three ways: Rand, Egwene, Moiraine and company head with Aviendha to the Waste, so that Rand can make allies of the Aiel. Perrin and a few companions head home to the Two Rivers and find that things are not as they left them. Nynaeve and Elayne are both dealing with White Tower intrigue. There are other threads, but I think this sets up most of what we’ll be paying attention to this season.

As we try to wind down this talk about three very busy episodes, is there anything you aren’t currently vibing with? I feel like Josha Stradowski’s Rand is getting lost in the shuffle a bit, despite this nominally being his story.

Lee: I agree about Rand—but, hey, the same de-centering of Rand happened in the books, so at least there is symmetry. I think the things I’m not vibing with are at this point just personal dislikes. The sets still feel cheap. The costumes are great, but the Great Serpent rings are still ludicrously large and impractical.

I’m overjoyed the show is unafraid to shine a spotlight on queer characters, and I’m also desperately glad that we aren’t being held hostage by Robert Jordan’s kinks—like, we haven’t seen a single Novice or Accepted get spanked, women don’t peel off their tops in private meetings to prove that they’re women, and rather than titillation or weirdly uncomfortable innuendo, these characters are just straight-up screwing. (The Amyrlin even notes that she’s not sure the Novices “will ever recover” after Gawyn and Galad come to—and all over—town.)

If I had to pick a moment that I enjoyed the most out of the premiere, it would probably be the entire first episode—which in spite of its length kept me riveted the entire time. I love the momentum, the feeling of finally getting the show that I’d always hoped we might get rather than the feeling of having to settle.

How about you? Dislikes? Loves?

Ceara Coveney as Elayne Trakand and Ayoola Smart as Aviendha, and they’re thinking about exactly what you think they’re thinking about. Credit: Courtesy of Prime/Amazon MGM Studios

Andrew: Not a ton of dislikes, I am pretty in the tank for this at this point. But I do agree that some of the prop work is weird. The Horn of Valere in particular looks less like a legendary artifact and more like a decorative pitcher from a Crate & Barrel.

There were two particular scenes/moments that I really enjoyed. Rand and Perrin and Mat just hang out, as friends, for a while in the first episode, and it’s very charming. We’re told in the books constantly that these three boys are lifelong pals, but (to the point about Unavailable Men we were talking about earlier) we almost never get to see actual evidence of this, either because they’re physically split up or because they’re so wrapped up in their own stuff that they barely want to speak to each other.

I also really liked that brief moment in the first episode where a Black Ajah Aes Sedai’s Warder dies, and she’s like, “hell yeah, this feels awesome, this is making me horny because of how evil I am.” Sometimes you don’t want shades of gray—sometimes you just need some cartoonishly unambiguous villainy.

Lee: I thought the Black Ajah getting excited over death was just the right mix of cartoonishness and actual-for-real creepiness, yeah. These people have sold their eternal souls to the Shadow, and it probably takes a certain type. (Though, as book readers know, there are some surprising Black Ajah reveals yet to be had!)

We close out our three-episode extravaganza with Mat having his famous stick fight with Zoolander-esque male models Gawyn and Galad, Liandrin and the Black Ajah setting up shop (and tying off some loose ends) in Tanchico, Perrin meeting Faile and Lord Luc in the Two Rivers, and Rand in the Aiel Waste, preparing to do—well, something important, one can be sure.

We’ll leave things here for now. Expect us back next Friday to talk about episode four, which, based on the preview trailers already showing up online, will involve a certain city in the desert, wherein deep secrets will be revealed.

Mia dovienya nesodhin soende, Andrew!

Andrew: The Wheel weaves as the Wheel wills.




For climate and livelihoods, Africa bets big on solar mini-grids


Nigeria is pioneering the development of small, off-grid solar panel installations.

A general view of a hybrid mini-grid station, mainly powered by solar energy, in Doma, Nasarawa State, Nigeria, on October 16, 2023. Credit: Kola Sulaimon/AFP via Getty Images

To the people of Mbiabet Esieyere and Mbiabet Udouba in Nigeria’s deep south, sundown would mean children doing their homework by the glow of kerosene lamps, and the faint thrum of generators emanating from homes that could afford to run them. Like many rural communities, these two villages of fishermen and farmers in the community of Mbiabet, tucked away in clearings within a dense palm forest, had never been connected to the country’s national electricity grid.

Most of the residents had never heard of solar power either. When, in 2021, a renewable-energy company proposed installing a solar “mini-grid” in their community, the villagers scoffed at the idea of the sun powering their homes. “We didn’t imagine that something [like this] can exist,” says Solomon Andrew Obot, a resident in his early 30s.

The small installation of solar panels, batteries and transmission lines proposed by the company Prado Power would service 180 households in Mbiabet Esieyere and Mbiabet Udouba, giving them significantly more reliable electricity for a fraction of the cost of diesel generators. Village leaders agreed to the installation, though many residents remained skeptical. But when the panels were set up in 2022, lights blinked on in the brightly painted two-room homes and tan mud huts dotted sparsely through the community. At a village meeting in September, locals erupted into laughter as they recalled walking from house to house, turning on lights and plugging in phone chargers. “I [was] shocked,” Andrew Obot says.

Like many African nations, Nigeria has lagged behind Global North countries in shifting away from planet-warming fossil fuels and toward renewable energy. Solar power contributes just around 3 percent of the total electricity generated in Africa—though it is the world’s sunniest continent—compared to nearly 12 percent in Germany and 6 percent in the United States.

At the same time, in many African countries, solar power now stands to offer much more than environmental benefits. About 600 million Africans lack reliable access to electricity; in Nigeria specifically, almost half of the 230 million people have no access to electricity grids. Today, solar has become cheap and versatile enough to help bring affordable, reliable power to millions—creating a win-win for lives and livelihoods as well as the climate.

That’s why Nigeria is placing its bets on solar mini-grids—small installations that produce up to 10 megawatts of electricity, enough to power over 1,700 American homes—that can be set up anywhere. Crucially, the country has pioneered mini-grid development through smart policies to attract investment, setting an example for other African nations.
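That “over 1,700 American homes” conversion can be roughly reproduced with back-of-the-envelope math. In the sketch below, the ~20 percent solar capacity factor and the annual household consumption figure are assumed round numbers of mine, not values from the article:

```python
# Rough check of "10 MW of solar ~ power for over 1,700 American homes."
# The capacity factor and household consumption are assumed round numbers,
# not figures from the article.
CAPACITY_MW = 10                 # maximum mini-grid size cited above
CAPACITY_FACTOR = 0.20           # assumed average solar output vs. nameplate
US_HOME_KWH_PER_YEAR = 10_500    # assumed average US household electricity use
HOURS_PER_YEAR = 8_760

annual_output_kwh = CAPACITY_MW * 1_000 * CAPACITY_FACTOR * HOURS_PER_YEAR
homes_powered = annual_output_kwh / US_HOME_KWH_PER_YEAR
print(f"~{homes_powered:,.0f} homes")  # ~1,669; a slightly sunnier capacity
                                       # factor pushes this past 1,700
```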

Nearly 120 mini-grids are now installed, powering roughly 50,000 households and reaching about 250,000 people. “Nigeria is actually like a poster child for mini-grid development across Africa,” says energy expert Rolake Akinkugbe-Filani, managing director of EnergyInc Advisors, an energy infrastructure consulting firm.

Though it will take more work—and funding—to expand mini-grids across the continent, Nigeria’s experience demonstrates that they could play a key role in weaning African communities off fossil-fuel-based power. But the people who live there are more concerned with another, immediate benefit: improving livelihoods. Affordable, reliable power from Mbiabet’s mini-grid has already supercharged local businesses, as it has in many places where nonprofits like Clean Technology Hub have supported mini-grid development, says Ifeoma Malo, the organization’s founder. “We’ve seen how that has completely transformed those communities.”

The African energy transition takes shape

Together, Africa’s countries account for less than 5 percent of global carbon dioxide emissions, and many experts, like Malo, take issue with the idea that they need to rapidly phase out fossil fuels; that task should be more urgent for the United States, China, India, the European countries and Russia, which create the bulk of emissions. Nevertheless, many African countries have set ambitious phase-out goals. Some have already turned to locally abundant renewable energy sources, like geothermal power from the Earth’s crust, which supplies nearly half of the electricity produced in Kenya, and hydropower, which creates more than 80 percent of the electricity in the Democratic Republic of Congo, Ethiopia and Uganda.

But hydropower and geothermal work only where those resources naturally exist. And development of more geographically versatile power sources, like solar and wind, has progressed more slowly in Africa. Though solar is cheaper than fossil-fuel-derived electricity in the long term, upfront construction costs are often higher than they are for building new fossil-fuel power plants.

Thanks to its sunny, equatorial position, the African continent has an immense potential for solar power, shown here in kilowatt-hours. However, solar power contributes less than 3 percent of the electricity generated in Africa. Credit: Knowable Magazine

Getting loans to finance big-ticket energy projects is especially hard in Africa, too. Compared to Europe or the United States, interest rates for loans can be two to three times higher due to perceived risks—for instance, that cash-strapped utility companies, already struggling to collect bills from customers, won’t be able to pay back the loans. Rapid political shifts and currency fluctuations add to the uncertainty. To boot, some Western African nations such as Nigeria charge high tariffs on importing technologies such as solar panels. “There are challenges that are definitely hindering the pace at which renewable energy development could be scaling in the region,” says renewable energy expert Tim Reber of the Colorado-based US National Renewable Energy Laboratory.

Some African countries are beginning to overcome these barriers and spur renewable energy development, notes Bruno Merven, an expert in energy systems modeling at the University of Cape Town in South Africa, coauthor of a look at renewable energy development in the Annual Review of Resource Economics. Super-sunny Morocco, for example, has phased out subsidies for gasoline and industrial fuel. South Africa is agreeing to buy power from new, renewable infrastructure that is replacing many coal plants that are now being retired.

Nigeria, where only about a quarter of the national grid generates electricity and where many turn to generators for power, is leaning on mini-grids—since expanding the national grid to its remote communities, scattered across an area 1.3 times the size of Texas, would cost a prohibitive amount in the tens of billions of dollars. Many other countries are in the same boat. “The only way by which we can help to electrify the entire continent is to invest heavily in renewable energy mini-grids,” says Stephen Kansuk, the United Nations Development Program’s regional technical advisor for Africa on climate change mitigation and energy issues.

Experts praise the steps Nigeria has taken to spur such development. In 2016, the country’s Electricity Regulatory Commission provided legal guidelines on how developers, electricity distribution companies, regulators and communities can work together to develop the small grids. This was accompanied by a program through which organizations like the World Bank, the Global Energy Alliance for People and Planet, Bezos Earth Fund and the Rockefeller Foundation could contribute funds, making mini-grid investments less financially risky for developers.

Solar power was also made more attractive by a recent decision by Nigerian President Bola Ahmed Tinubu to remove a long-standing government subsidy on petroleum products. Fossil-fuel costs have been soaring since, for vehicles as well as the generators that many communities rely on. Nigeria has historically been Africa’s largest crude oil producer, but fuel is now largely unaffordable for the average Nigerian, including those living in rural areas, who often live on less than $2 a day. In the crude-oil-rich state of Akwa Ibom, where the Mbiabet villages are located, gasoline was 1,500 naira per liter (around $1) at the time of publishing. “Now that subsidies have come off petrol,” says Akinkugbe-Filani, “we’re seeing a lot more people transition to alternative sources of energy.”

Mini-grids take off

To plan a mini-grid in Nigeria, developers often work with government agencies that have mapped out ideal sites: sunny places where there are no plans to extend the national grid, ensuring that there’s a real power need.

More than 500 million Africans lack access to electricity, and where there is electricity, much of it comes from fossil fuels. Countries are taking different approaches to bring more renewable energy into the mix. Nigeria is focusing on mini-grids, which are especially useful in areas that lack national electricity grids. Morocco and South Africa are building large-scale solar power installations, while Kenya and the Democratic Republic of the Congo are making use of local renewable energy sources like geothermal and hydropower, respectively. Credit: Knowable Magazine

The next step is getting communities on board, which can take months. Malo recalls a remote Indigenous village in the hills of Adamawa state in Nigeria’s northeast, where locals have preserved their way of life for hundreds of years and are wary of outsiders. Her team had almost given up trying to liaise with reluctant male community leaders and decided to try reaching out to the women. The women, it turned out, were fascinated by the technology and how it could help them, especially at night — to fetch water from streams, to use the bathroom and to keep their children safe from snakes. “We find that if we convince them, they’re able to go and convince their husbands,” Malo says.

The Mbiabet community took less convincing. Residents were drawn to the promise of cheap, reliable electricity and its potential to boost local businesses.

Like many other mini-grids, the one in Mbiabet benefited from a small grant, this one from the Rocky Mountain Institute, a US-based nonprofit focused on renewable energy adoption. The funds allowed residents to retain 20 percent ownership of the mini-grid and reduced upfront costs for Prado Power, which built the panels with the help of local laborers.

It’s a sunny afternoon in late September, though downpours from the days before have left their imprint on the ground. There are no paved roads, and today the dirt road leading through the tropical forest into the cluster of villages is unnavigable by car. At one point, we build an impromptu bridge of grass and vegetation across a sludgy impasse; the last stretch of the journey is made on foot. It would be costly and labor-intensive to extend the national grid here.

Palm trees give way to tin roofs propped up by wooden poles, and Andrew Obot is waiting at the meeting point. He was Mbiabet’s vice youth president when Prado Power first contacted the community; now he’s the site manager. He steers his okada—a local motorbike—up the bumpy red dirt road to go see the solar panels.

Along the way, we see transmission lines threading through thick foliage. “That’s the solar power,” shouts Andrew Obot over the drone of the okada engine. All the lines were built by Prado Power to supply households in the two villages.

We enter a grassy clearing where three rows of solar panels sit behind wire gates. Collectively, the 39 panels have a capacity of over 20 kilowatts—enough to power just one large, energy-intensive American household but more than enough for the lightbulbs, cooker plates and fans in the 180 households in Mbiabet Esieyere and Mbiabet Udouba.

Whereas before, electricity was more conservatively used, now it is everywhere. An Afrobeats tune blares from a small barbershop on the main road winding through Mbiabet Esieyere. Inside, surrounded by walls plastered with shiny posters of trending hairstyles — including a headshot of popular musician Davido with the tagline “BBC—Big Boyz Cutz”—two young girls sit on a bench near a humming fan, waiting for their heads to be shaved.

The salon owner, Christian Aniefiok Asuquo, started his business two years ago when he was 16, just before the panels were installed. Back then, his appliances were powered by a diesel generator, which he would fill with 2,000 naira worth (around $1.20) of fuel daily. This would last around an hour. Now, he spends just 2,000 naira a month on electricity. “I feel so good,” he says, and his customers, too, are happy. He used to charge 500 naira ($0.30) per haircut, but now charges 300 naira ($0.18) and still makes a profit. He has more customers these days.
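To put the barbershop’s before-and-after numbers side by side, here is a minimal sketch. The 30 operating days per month and the exchange rate (implied by the article’s own “2,000 naira (around $1.20)” conversion) are my assumptions, not reported details:

```python
# Rough monthly comparison of the barbershop's energy costs before and after
# the mini-grid. Operating days and exchange rate are assumptions.
NAIRA_PER_USD = 1_667        # implied by "2,000 naira (around $1.20)"
OPERATING_DAYS = 30          # assumed days of operation per month

diesel_daily_naira = 2_000   # generator fuel per day (bought only ~1 hour of power)
grid_monthly_naira = 2_000   # electricity bill per month on the mini-grid

diesel_monthly_naira = diesel_daily_naira * OPERATING_DAYS
print(f"before: {diesel_monthly_naira:,} naira (~${diesel_monthly_naira / NAIRA_PER_USD:.0f}) per month")
print(f"after:  {grid_monthly_naira:,} naira (~${grid_monthly_naira / NAIRA_PER_USD:.2f}) per month")
print(f"~{diesel_monthly_naira / grid_monthly_naira:.0f}x less spent on energy")
```

And since that daily 2,000 naira of diesel bought only about an hour of generator runtime, the comparison understates how much more power he now gets for the money.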

For many Mbiabet residents, “it’s an overall boost in their economic development,” says Suleiman Babamanu, the Rocky Mountain Institute’s program director in Nigeria. Also helping to encourage residents to take full advantage of their newly available power is the installation of an “agro-processing hub,” equipped with crop-processing machines and a community freezer to store products like fish. Provided by the company Farm Warehouse in partnership with Prado Power, the hub is leased out to locals. It includes a grinder and fryer to process cassava—the community’s primary crop—into garri, a local food staple, which many of the village women sell to neighboring communities and at local markets.

The women are charged around 200 naira ($0.12) to process a small basin of garri from beginning to end. Sarah Eyakndue Monday, a 24-year-old cassava farmer, used to spend three to four hours processing cassava each day; it now takes her less than an hour. “It’s very easy,” she says with a laugh. She produces enough garri during that time to earn up to 50,000 naira ($30.25) a week—almost five times what she was earning before.

Prado Power also installed a battery system to save some power for nighttime (there’s a backup diesel generator should batteries become depleted during multiple overcast days). That has proved especially valuable to women in Mbiabet Esieyere and Mbiabet Udouba, who now feel safer. “Everywhere is … brighter than before,” says Eyakndue Monday.

Other African communities have experienced similar benefits, according to Renewvia Energy, a US-based solar company. In a recent company-funded survey, 2,658 Nigerian and Kenyan households and business owners were interviewed before and after they got access to Renewvia’s mini-grids. Remarkably, the median income of Kenyan households had quadrupled. Instead of spending hours each day walking kilometers to collect drinking water, many communities were able to install electricity-powered wells or pumps, along with water purifiers.

“With all of that extra time, women in the community were able to either start their own businesses or just participate in businesses that already exist,” says Renewvia engineer Nicholas Selby, “and, with that, gain some income for themselves.”

Navigating mini-grid challenges

Solar systems require regular maintenance—replacing retired batteries, cleaning and repairing panels, and addressing technical glitches over a panel’s 20- to 25-year lifetime. Unless plans for care are built into a project, they risk failure. In some parts of India, for example, thousands of mini-grids installed by the government in recent decades have fallen into disrepair, according to a report provided to The Washington Post. Typically, state agencies have little long-term incentive to maintain solar infrastructure, Kansuk says.

Kansuk says this is less likely in situations where private companies that make money off the grids help to fund them, encouraging them to install high-quality devices and maintain them. It also helps to train locals with engineering skills so they can maintain the panels themselves—companies like Renewvia have done this at their sites. Although Prado Power hasn’t been able to provide such training to locals in Mbiabet or their other sites, they recruit locals like Andrew Obot to work as security guards, site managers and construction workers.

Over the longer term, demographic shifts may also leave some mini-grids in isolated areas abandoned—as in northern Nigeria, for instance, where banditry and kidnapping are forcing rural populations toward more urban settings. “That’s become a huge issue,” Malo says. Partly for this reason, some developers are focusing on building mini-grids in regions that are less prone to violence and have higher economic activity—often constructing interconnected mini-grids that supply multiple communities.

Eventually, those close enough to the national grid will likely be connected to the larger system, says Chibuikem Agbaegbu, a Nigeria-based climate and energy expert at the Africa Policy Research Institute. They can send their excess solar-sourced electricity into the main grid, thus making a region’s overall energy system greener and more reliable.

The biggest challenge for mini-grids, however, is cost. Although they tend to offer cheaper, more reliable electricity compared to fossil-fuel-powered generators, it is still quite expensive for many people — and often much more costly than power from national grids, which is frequently subsidized by African governments. Costs can be even higher when communities sprawl across large areas that are expensive to connect.

Mini-grid companies have to charge relatively high rates in order to break even, and many communities may not be buying enough power to make a mini-grid worthwhile for the developers — for instance, Kansuk says, if residents want electricity only for lighting and to run small household appliances.

Kansuk adds that this is why developers like Prado Power still rely on grants or other funding sources to subsidize construction costs so they can charge locals affordable prices for electricity. Another solution, as evidenced in Mbiabet, is to introduce industrial machinery and equipment in tandem with mini-grids to increase local incomes so that people can afford the electricity tariffs.

“For you to be able to really transform lives in rural communities, you need to be able to improve the business viability—both for the mini-grid and for the community,” says Babamanu. The Rocky Mountain Institute is part of an initiative that identifies suitable electrical products, from cold storage to rice mills to electric vehicle chargers, and supports their installation in communities with the mini-grids.

Spreading mini-grids across the continent

Energy experts believe that these kinds of solutions will be key for expanding mini-grids across Africa. Around 60 million people on the continent gained access to electricity through mini-grids between 2009 and 2019, in countries such as Kenya, Tanzania and Senegal, and the United Nations Development Program is working with a total of 21 African countries, Kansuk says, including Mali, Niger and Somalia, to incentivize private companies to develop mini-grids there.

But it takes more than robust policies to help mini-grids thrive. Malo says it would help if Western African countries removed import tariffs for solar panels, as many governments in Eastern Africa have done. And though Agbaegbu estimates that Nigeria has seen over $900 million in solar investments since 2018—and the nation recently announced $750 million more through a multinationally funded program that aims to provide over 17.5 million Nigerians with electricity access—it needs more. “If you look at what is required versus what is available,” says Agbaegbu, “you find that there’s still a significant gap.”

Many in the field argue that such money should come from more industrialized, carbon-emitting countries to help pay for energy development in Global South countries in ways that don’t add to the climate problem; some also argue for funds to compensate for damages caused by climate impacts, which hit these countries hardest. At the 2024 COP29 climate change conference, wealthy nations set a target of $300 billion in annual funding for climate initiatives in other countries by 2035—three times more than what they had previously pledged. But African countries alone need an estimated $200 billion per year by 2030 to meet their energy goals, according to the International Energy Agency.

Meanwhile, Malo adds, it’s important that local banks in countries like Nigeria also invest in mini-grid development, to lessen dependence on foreign financing. That’s especially the case in light of current freezes in USAID funding, she says, which has resulted in a loss of money for solar projects in Nigeria and other nations.

With enough support, Reber says, mini-grids—along with rooftop and larger solar projects—could make a sizable contribution to lowering carbon emissions in Africa. Those who already have the mini-grids seem convinced they’re on the path toward a better, economically richer future, and Babamanu knows of communities that have written letters to policymakers to express their interest.

Eyakndue Monday, the cassava farmer from Mbiabet, doesn’t keep her community’s news a secret. Those she has told now come to her village to charge their phones and watch television. “I told a lot of my friends that our village is … better because of the light,” she says. “They were just happy.”

This story was originally published by Knowable Magazine.


Knowable Magazine explores the real-world significance of scholarly work through a journalistic lens.



Why SNES hardware is running faster than expected—and why it’s a problem


gotta go precisely the right speed

Cheap, unreliable ceramic APU resonators lead to “constant, pervasive, unavoidable” issues.

Sir, do you know how fast your SNES was going? Credit: Getty Images

Ideally, you’d expect any Super NES console—if properly maintained—to operate identically to any other Super NES unit ever made (in the same region, at least). Given the same base ROM file and the same set of precisely timed inputs, all those consoles should hopefully give the same gameplay output across individual hardware and across time.

The TASBot community relies on this kind of solid-state predictability when creating tool-assisted speedruns that can be executed with robotic precision on actual console hardware. But on the SNES in particular, the team has largely struggled to get emulated speedruns to sync up with demonstrated results on real consoles.

After significant research and testing on dozens of actual SNES units, the TASBot team now thinks that a cheap ceramic resonator used in the system’s Audio Processing Unit (APU) is to blame for much of this inconsistency. While Nintendo’s own documentation says the APU should run at a consistent rate of 24.576 MHz (and the associated Digital Signal Processor sample rate at a flat 32,000 Hz), in practice, that rate can vary just a bit based on heat, system age, and minor physical variations that develop in different console units over time.
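Those two nominal figures line up exactly if the DSP sample rate is derived from the APU clock by a fixed divider of 768 (24,576,000 / 768 = 32,000). That divider is inferred from the numbers quoted here rather than stated in the article, but it makes the knock-on effect of resonator drift easy to see in a minimal sketch:

```python
# Sketch: how drift in the APU's ceramic resonator would propagate to the DSP
# sample rate. The fixed /768 divider is inferred from the two nominal figures
# above (24.576 MHz and 32,000 Hz), not taken from the article.
NOMINAL_APU_HZ = 24_576_000
DIVIDER = 768  # assumed ratio between APU clock and DSP sample rate

def dsp_sample_rate(apu_clock_hz: float) -> float:
    """DSP sample rate implied by a given APU clock frequency."""
    return apu_clock_hz / DIVIDER

print(dsp_sample_rate(NOMINAL_APU_HZ))            # 32000.0 (the spec value)
print(dsp_sample_rate(NOMINAL_APU_HZ * 1.00125))  # 32040.0 (a hypothetical
                                                  # resonator running 0.125% fast)
```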

Casual players would only notice this problem in the form of an almost imperceptibly higher pitch for in-game music and sounds. But for TASBot, Allan “dwangoAC” Cecil says this unreliable clock has become a “constant, pervasive, unavoidable” problem for getting frame-accurate consistency in hardware-verified speedruns.

Not to spec

Cecil testing his own SNES APU in 2016. Credit: Allan Cecil

Cecil says he first began to suspect the APU’s role in TASBot’s SNES problems back in 2016 when he broke open his own console to test it with an external frequency counter. He found that his APU clock had “degraded substantially enough to cause problems with repeatability,” causing the console to throw out unpredictable “lag frames” if and when the CPU and APU load cycles failed to line up in the expected manner. Those lag frames, in turn, are enough to “desynchronize” TASBot’s input on actual hardware from the results you’d see on a more controlled emulator.

Unlike the quartz crystals used in many electronics (including the SNES’s more consistent and differently timed CPU), the cheaper ceramic resonators in the SNES APU are “known to degrade over time,” as Cecil put it. Documentation for the resonators used in the APU also seems to suggest that excess heat may impact the clock cycle speed, meaning the APU might speed up a bit as a specific console heats up.

The APU resonator manual shows slight variations in operating thresholds based on heat and other factors. Credit: Ceralock ceramic resonator manual

The TASBot team was not the first group to notice this kind of audio inconsistency in the SNES. In the early 2000s, some emulator developers found that certain late-era SNES games don’t run correctly when the emulator’s Digital Signal Processor (DSP) sample rate is set to the Nintendo-specified value of precisely 32,000 Hz (a number derived from the speed of the APU clock). Developers tested actual hardware at the time and found that the DSP was actually running at 32,040 Hz and that setting the emulated DSP to run at that specific rate suddenly fixed the misbehaving commercial games.
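To get a feel for how small (and how audible) that discrepancy is, here is a quick back-of-the-envelope calculation of my own, not the emulator developers’:

```python
import math

# How far the measured DSP rate sits above spec, and what that does to pitch.
SPEC_HZ = 32_000
MEASURED_HZ = 32_040   # the rate emulator developers settled on in 2003

deviation = (MEASURED_HZ - SPEC_HZ) / SPEC_HZ
cents_sharp = 1200 * math.log2(MEASURED_HZ / SPEC_HZ)

print(f"{deviation:.3%} above spec")     # 0.125% above spec
print(f"{cents_sharp:.1f} cents sharp")  # ~2.2 cents, a pitch shift tiny enough
                                         # to match the "almost imperceptibly
                                         # higher pitch" described earlier
```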

That small but necessary emulator tweak implies that “the original developers who wrote those games were using hardware that… must have been running slightly faster at that point,” Cecil told Ars. “Because if they had written directly to what the spec said, it may not have worked.”

Survey says…

While research and testing confirmed the existence of these APU variations, Cecil wanted to determine just how big the problem was across actual consoles today. To do that, he ran an informal online survey last month, cryptically warning his social media followers that “SNES consoles seem to be getting faster as they age.” He asked respondents to run a DSP clock measurement ROM on any working SNES hardware they had lying around and to rerun the test after the console had time to warm up.

After receiving 143 responses and crunching the numbers, Cecil said he was surprised to find that temperature seemed to have a minimal impact on measured DSP speed; the measurement only rose an insignificant 8 Hz on average between “cold” and “warm” readings on the same console. Cecil even put his own console in a freezer to see if the DSP clock rate would change as it thawed out and found only a 32 Hz difference as it warmed back up to room temperature.

A sample result from the DSP sample test program. Credit: Allan Cecil

Those heat effects paled in comparison to the natural clock variation across different consoles, though. The slowest and fastest DSPs in Cecil’s sample showed a clock difference of 234 Hz, or about 0.7 percent of the 32,000 Hz specification.

That difference is small enough that human players probably wouldn’t notice it directly; TASBot team member Total estimated it might amount to “at most maybe a second or two [of difference] over hours of gameplay.” Skilled speedrunners could notice small differences, though, if differing CPU and APU alignments cause “carefully memorized enemy pattern changes to something else” between runs, Cecil said.

For a frame-perfect tool-assisted speedrun, though, the clock variations between consoles could cause innumerable headaches. As TASBot team member Undisbeliever explained in his detailed analysis: “On one console this might take 0.126 frames to process the music-tick, on a different console it might take 0.127 frames. It might not seem like much but it is enough to potentially delay the start of song loading by 1 frame (depending on timing, lag and game-code).”
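Those two figures are roughly consistent with the console-to-console spread Cecil measured. A quick consistency check (my arithmetic; the 0.126-frame duration is taken from Undisbeliever’s example, and the scaling is approximate rather than an exact model of the hardware):

```python
# Does a ~234 Hz console-to-console spread plausibly turn a 0.126-frame
# music-tick into a 0.127-frame one? (Rough proportional check only.)
SPEC_HZ = 32_000
SPREAD_HZ = 234            # slowest-to-fastest DSP difference from the survey
TICK_FRAMES_FAST = 0.126   # music-tick duration on one console (Undisbeliever)

relative_spread = SPREAD_HZ / SPEC_HZ                    # ~0.73%
tick_frames_slow = TICK_FRAMES_FAST * (1 + relative_spread)

print(f"{relative_spread:.2%}")    # 0.73%
print(f"{tick_frames_slow:.4f}")   # ~0.1269 frames, which rounds to the 0.127
                                   # figure quoted for a different console
```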

Cecil’s survey found variation across consoles was much higher than the effects of heat on any single console. Credit: SNES SMP Speed test survey

Cecil also said the survey-reported DSP clock speeds were a bit higher than he expected, at an average rate of 32,076 Hz at room temperature. That’s quite a bit higher than both the 32,000 Hz spec set by Nintendo and the 32,040 Hz rate that emulator developers settled on after sampling actual hardware in 2003.

To some observers, this is evidence that SNES APUs originally produced in the ’90s have been speeding up slightly as they age and could continue to get faster in the coming years and decades. But Cecil says the historical data they have is too circumstantial to make such a claim for certain.

“We’re all a bunch of differently skilled geeks and nerds, and it’s in our nature to argue over what the results mean, which is fine,” Cecil said. “The only thing we can say with certainty is the statistical significance of the responses that show the current average DSP sample rate is 32,076 Hz, faster on average than the original specification. The rest of it is up to interpretation and a certain amount of educated guessing based on what we can glean.”

A first step

For the TASBot team, knowing just how much real SNES hardware timing can differ from dry specifications (and emulators) is an important step to getting more consistent results on real hardware. But that knowledge hasn’t completely solved their synchronization problems. Even when Cecil replaced the ceramic APU resonator in his Super NES with a more accurate quartz version (tuned precisely to match Nintendo’s written specification), the team “did not see perfect behavior like we expected,” he told Ars.

Beyond clock speed inconsistencies, Cecil explained to Ars that TASBot team testing has found an additional “jitter pattern” present in the APU sampling that “injects some variance in how long it takes to perform various actions” between runs. That leads to non-deterministic performance even on the same hardware, Cecil said, which means that “TASBot is likely to desync” after just a few minutes of play on most SNES games.

The order in which these components start when the SNES is reset can have a large impact on clock synchronization. Credit: Rasteri

Extensive research from Rasteri suggests that these inconsistencies across same-console runs are likely caused by a “very non-deterministic reset circuit” that changes the specific startup order and timing for a console’s individual components every time it’s powered on. That leads to essentially “infinite possibilities” for the relative place where the CPU and APU clocks start in their “synchronization cycle” for each fresh run, making it impossible to predict specifically where and when lag frames will appear, Rasteri wrote.

Cecil said these kinds of “butterfly effect” timing issues make the Super NES “a surprisingly complicated console [that has] resisted our attempts to fully model it and coerce it into behaving consistently.” But he’s still hopeful that the team will “eventually find a way to restore an SNES to the behavior game developers expected based on the documentation they were provided without making invasive changes…”

In the end, though, Cecil seems to have developed an almost grudging respect for how the SNES’s odd architecture leads to such unpredictable operation in practice. “If you want to deliberately create a source of randomness and non-deterministic behavior, having two clock sources that spinloop independently against one another is a fantastic choice,” he said.


Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.



Used Tesla prices tumble as embarrassed owners look to sell

Similarly, one should take with a pinch of salt a website offering to steal Teslas from owners who are unable to find a buyer themselves.

According to data from CarGurus, used Tesla prices have fallen more than twice as fast (-3.7 percent) as the wider car market (-1.5 percent) over the last 90 days. Year over year, used Tesla prices are down 7.5 percent, compared to 2.8 percent for the market as a whole. And that’s on top of steep depreciation caused by a series of new car price cuts over the past few years, as well as rental car companies and other companies disposing of fleets of Teslas en masse.

The Model 3 has been on sale longer than the Model Y, and you’d expect the older cars to have depreciated more. Indeed, the average price of a 2017 Model 3 is just under $20,000 now. But even recent model years are shedding value rapidly—a model-year 2022 Model 3 is worth just $25,000 on average.

Model Y prices have been falling at a greater rate, although the Y’s higher MSRP and younger age mean prices haven’t dropped quite as far as the 3’s, yet. But CarGurus has seen drops of between 16 and 21 percent for each model year of the Model Y, year over year.

CarGurus isn’t the only one to notice this trend, either. iSeeCars says its data shows used Tesla prices down 13.6 percent year over year. The Models 3, Y, and S were all in its top four EVs for depreciation, although top place went to the Porsche Taycan (which might be starting to look like a bargain).

For its part, Tesla has been trying to boost its image with the help of President Trump. On Monday, the president took to the South Lawn of the White House to promote Tesla’s cars, apparently buying one despite having campaigned on an explicitly anti-electric vehicle platform.

Used Tesla prices tumble as embarrassed owners look to sell Read More »

on-maim-and-superintelligence-strategy

On MAIM and Superintelligence Strategy

Dan Hendrycks, Eric Schmidt and Alexandr Wang released an extensive paper titled Superintelligence Strategy. There is also an op-ed in Time that summarizes it.

The major AI labs expect superintelligence to arrive soon. They might be wrong about that, but at minimum we need to take the possibility seriously.

At a minimum, the possibility of imminent superintelligence will be highly destabilizing. Even if you do not believe it represents an existential risk to humanity (and if so you are very wrong about that) the imminent development of superintelligence is an existential threat to the power of everyone not developing it.

Planning a realistic approach to that scenario is necessary.

What would it look like to take superintelligence seriously? What would it look like if everyone took superintelligence seriously, before it was developed?

The proposed regime here, Mutually Assured AI Malfunction (MAIM), relies on various assumptions in order to be both necessary and sufficient. If those assumptions did turn out to hold, it would be a very interesting, highly not crazy proposal.

  1. ASI (Artificial Superintelligence) is Dual Use.

  2. Three Proposed Interventions.

  3. The Shape of the Problems.

  4. Strategic Competition.

  5. Terrorism.

  6. Loss of Control.

  7. Existing Strategies.

  8. MAIM of the Game.

  9. Nonproliferation.

  10. Competitiveness.

  11. Laying Out Assumptions: Crazy or Crazy Enough To Work?

  12. Don’t MAIM Me Bro.

ASI helps you do anything you want to do, which in context is often called ‘dual use.’

As in, AI is a highly useful technology for both military and economic purposes. It can be used for, or can be an engine of, both creation and destruction.

It can do both these things in the hands of humans, or on its own.

That means that America must stay competitive in AI, or even stay dominant in AI, both for our economic and our military survival.

The key players include not only states but also non-state actors.

Given what happens by default, what can we do to steer to a different outcome?

They propose three pillars.

Two are highly conventional and traditional. One is neither, in the context of AI.

First, the two conventional ones.

Essentially everyone can get behind Competitiveness, building up AI chips through domestic manufacturing. At least in principle. Trump called for us to end the CHIPS Act because he is under some strange delusions about how economics and physics work and thinks tariffs are how you fix everything (?), but he does endorse the goal.

Nonproliferation is more controversial but enjoys broad support. America already imposes export controls on AI chips, and the proposed diffusion regulations would substantially tighten that regime. This is a deeply ordinary and obviously wise policy. There is a small extremist minority that flips out and calls proposals for ordinary enforcement things like ‘a call for a global totalitarian surveillance state,’ but such claims are rather Obvious Nonsense, entirely false and without merit, since they describe the existing policy regime in many sectors, not only in AI.

The big proposal here is Deterrence with Mutual Assured AI Malfunction (MAIM), as a system roughly akin to Mutually Assured Destruction (MAD) from nuclear weapons.

The theory is that if it is possible to detect and deter opposing attempts to develop superintelligence, the world can perhaps avoid developing superintelligence until we are ready for that milestone.

This chart of wicked problems in need of solving is offered. The ‘tame technical subproblems’ are not easy, but are likely solvable. The wicked problems are far harder.

Note that we are not doing that great a job on even the tame technical subproblems.

  1. Train AI systems to refuse harmful requests: We don’t have an AI system that cannot be jailbroken, even if it is closed weights and under full control, without crippling the mundane utility offered by the system.

  2. Prepare cyberattacks against AI datacenters: This is the one that is not obviously a net positive idea. Presumably this is being done in secret, but I have no knowledge of us doing anything here.

  3. Upgrade AI chip firmware to add geolocation functionality: We could presumably do this, but we haven’t done it.

  4. Patch known vulnerabilities in AI developers’ computer systems: I hope we are doing a decent job of this. However, the full ‘tame’ problem is to do this across all systems, since AI will soon be able to automate attacks on all systems, exposing vulnerable legacy systems that are often tied to critical infrastructure. Security through obscurity is going to become a lot less effective.

  5. Design military drones: I do not get the sense we are doing a great job here, either in design or production, relative to its military importance.

  6. Economic strength: Improve AI performance in economically valuable tasks: We’re making rapid progress here, and it still feels like balls are dropped constantly.

  7. Loss of control: Research methods to make current AIs follow instructions: I mean yes we are doing that, although we should likely be investing 10x more. The problem is that our current methods to make this work won’t scale to superintelligence, with the good news being that we are largely aware of that.

They focus on three problems.

They don’t claim these are a complete taxonomy. At a sufficiently abstract level, we have a similar trio of threats to the ones OpenAI discusses in their philosophy document: Humans might do bad things on purpose (terrorism), the AI might do bad things we didn’t intend (loss of control), or locally good things could create bad combined effects (this is the general case of strategic competition, the paper narrowly focuses on state competition but I would generalize this to competition generally).

These problems interact. In particular, strategic competition is a likely key motivator for terrorism, and for risking or triggering a loss of control.

Note the term ‘meaningful’ in meaningful human control. If humans nominally have control, but in practice cannot exercise that control, humans still have lost control.

The paper focuses on the two most obvious strategic competition elements: Economic and military.

Economics is straightforward. If AI becomes capable of most or all labor, then how much inference you can do becomes a prime determinant of economic power, similar to what labor is today, even if there is no full strategic dominance.

Military is also straightforward. AI could enable military dominance through ‘superweapons,’ up to and including advanced drone swarms, new forms of EMP, decisive cyber weapons or things we aren’t even imagining. Sufficiently strong AI would presumably be able to upend nuclear deterrence.

If you are about to stare down superintelligence, you don’t know what you’ll face, but you know if you don’t act now, it could be too late. You are likely about to get outcompeted. It stands to reason countries might consider preventative action, up to and including outright war. We need to anticipate this possibility.

Strategic competition also feeds into the other two risks.

If you are facing strong strategic competition, either the way the paper envisioned at a national level, or competition at the corporate or personal level, from those employing superintelligence, you may have no choice but to either lose or deploy superintelligence yourself. And if everyone else is fully unleashing that superintelligence, can you afford not to do the same? How do humans stay in the loop or under meaningful control?

Distinctly from that fear, or perhaps in combination with it, if actions that are shaped like ‘terrorism’ dominate the strategic landscape, what then?

The term terrorism makes an assertion about what the goal of terrorism is. Often, yes, the goal is to instill fear, or to trigger a lashing out or other expensive response. But we’ve expanded the word ‘terrorism’ to include many other things, so that doesn’t have to be true.

In the cases of this ‘AI-enabled terrorism,’ the goal mostly is not to instill fear. We are instead talking about the use of asymmetric weapons to inflict as much damage as possible. The damage that relatively under-resourced actors can do will scale up.

We have to worry in particular about bioterrorism and cyberattacks on critical infrastructure – this essay chooses to not mention nuclear and radiological risks.

As always this question comes down to offense-defense balance and the scale (and probability) of potential harm. If everyone gets access to similarly powerful AI, what happens? Does the ‘good guy with an AI’ beat the ‘bad guy with an AI’? Does this happen in practice, despite the future being unevenly distributed, and thus much of critical infrastructure not having up-to-date defenses, and suffering from ‘patch lag’?

This is a cost-benefit analysis, including the costs of limiting proliferation. There are big costs in taking action to limit proliferation, even if you are confident it will ultimately work.

The question is, are there even larger costs to not doing so? That’s a fact question. I don’t know the extent to which future AI systems might enable catastrophic misuse, or how much damage that might cause. You don’t either.

We need to do our best to answer that question in advance, and if necessary to limit proliferation. If we want to do that limiting gracefully, with minimal economic costs and loss of freedom, that means laying the necessary groundwork now. The alternative is doing so decidedly ungracefully, or failing to do so at all.

The section on Loss of Control is excellent given its brevity. They cover three subsections.

  1. Erosion of control is similar to the concerns about gradual disempowerment. If anyone not maximally employing AI becomes uncompetitive, humans would rapidly find themselves handing control over voluntarily.

  2. Unleashed AI Agents are an obvious danger. Even a single sufficiently capable rogue AI agent unleashed on the internet could cause no end of trouble, and once it starts gathering resources and self-replicating there might be no reasonable way to undo this without massive economic costs we would not be willing to pay. Even a single such superintelligent agent could mean irrevocable loss of control. As always, remember that some people will absolutely be stupid enough to do this, and others will do it on purpose.

  3. Intelligence Recursion, traditionally called Recursive Self-Improvement (RSI), where smarter AI builds smarter AI builds smarter AI, perhaps extremely rapidly. This is exactly how one gets a strategic monopoly or dominant position, and it is ‘the obvious thing to do,’ so it is tough not to do it.

They note explicitly that strategic competition, in the form of geopolitical competitive pressures, could easily make us highly tolerant of such risks, and therefore we could initiate such a path of RSI even if those involved thought the risk of loss of control was very high. I would note that this motivation also holds for corporations and others, not only nations, and again that some people would welcome a loss of control, and others will severely underestimate the risks, with varying levels of conscious intention.

What are our options?

They note three.

  1. There is the pure ‘hands-off’ or ‘YOLO’ strategy where we intentionally avoid any rules or restrictions whatsoever, on the theory that humans having the ability to collectively steer the future is bad, actually, and we should avoid it. This pure anarchism is a remarkably popular position among those who are loud on Twitter. As they note, from a national security standpoint, this is neither a credible nor a coherent strategy. I would add that from the standpoint of trying to ensure humanity survives, it is again neither credible nor coherent.

  2. Moratorium strategy. Perhaps we can pause development past some crucial threshold? That would be great if we could pull it off, but coordination is hard and the incentives make this even harder than usual, if states lack reliable verification mechanisms.

  3. Monopoly strategy. Try to get there first and exert a monopoly, perhaps via a ‘Manhattan Project’ style state program. They argue that it would be impossible to hide this program, and others would doubtless view it as a threat and respond with escalations and hostile countermeasures.

They offer this graph as an explanation for why they don’t like Monopoly strategy:

Certainly escalation and even war is one potential response to the monopoly strategy, but the assumption that it goes that way is based on China or others treating superintelligence as an existential strategic threat. They have to take the threat so seriously that they will risk war over it, for real.

Would they take it that seriously before it happens? I think this is very far from obvious. It takes a lot of conviction to risk everything over something like that. Historically, deterrence strikes are rare, even when they would have made strategic sense, and the situation was less speculative. Nor does a successful strike automatically lead to escalation.

That doesn’t mean that going down these paths is good or safe. Racing for superintelligence as quickly as possible, with no solution for how to control it, in a way that forces your rival to respond in kind when previously (let’s face it) they weren’t trying all that hard, does not seem like a wise thing to aim for or do. But I think the above chart is too pessimistic.

Instead they propose a Multipolar strategy, with the theory being that Deterrence with Mutual Assured AI Malfunction (MAIM), combined with strong nonproliferation and competitiveness, can hopefully sustain an equilibrium.

There are two importantly distinct claims here.

The first claim here is that a suboptimal form of MAIM is the default regime: costs for training runs will balloon, so they can only happen at large, obvious facilities, and therefore there are a variety of escalations those involved can use to shut down AI programs, from sabotage up to outright missile attacks, with any one rival sufficient to shut down an attempt.

The second claim is that it would be wise to pursue a more optimal form of MAIM as an intentional policy choice.

MAIM is trivially true, at least in the sense that MAD is still in effect, although the paper claims that sabotage means there are reliable options available well short of a widespread nuclear strike. Global thermonuclear war would presumably shut down everyone’s ASI projects, but it seems likely that launching missiles at a lot of data centers would lead to full scale war, perhaps even somewhat automatic nuclear war. Do we really think ‘kinetic escalation’ or sabotage can reliably work and also be limited to the AI realm? Are there real options short of that?

Yes, you could try to get someone to sabotage, or engage in a cyberattack. The paper authors think that between all the options available, many of which are hard to attribute or defend against, we should expect such an effort to work if it is well resourced, at least enough to delay progress on the order of months. I’m not sure I have even that confidence, and I worry that it won’t count for much. Human sabotage seems likely to become less effective over time, as AIs themselves take on more of the work and error checking. Cyberattacks similarly seem like they are going to get more difficult, especially once everyone involved is doing fully serious active defense and accepting the real costs of doing so.

The suggestion here is to intentionally craft and scope out MAIM, to allow for limited escalations along a clear escalation ladder, such as putting data centers far away from population centers and making clear distinctions between acceptable projects and destabilizing ones, and implementing ‘AI-assisted inspections.’

Some actions of this type took place during the Cold War. Then there are other nations and groups with a history of doing the opposite, doing some combination of hiding their efforts, hardening the relevant targets and intentionally embedding military targets inside key civilian infrastructure and using ‘human shields.’

That’s the core idea. I’ll touch quickly on the other two parts of the plan, Nonproliferation and Competitiveness, then circle back to whether the core idea makes sense and what assumptions it is making. You can safely skip ahead to that.

They mention you can skip this, and indeed nothing here should surprise you.

In order for the regime of everyone holding back to make sense, there need to be a limited number of actors at the established capabilities frontier, and you need to keep that level of capability out of the hands of the true bad actors. AI chips would be treated, essentially, as if they were also WMD inputs.

Compute security is about ensuring that AI chips are allocated to legitimate actors for legitimate purposes. This echoes the export controls employed to limit the spread of fissile materials, chemical weapons, and biological agents.

Information security involves securing sensitive AI research and model weights that form the core intellectual assets of AI. Protecting these elements prevents unwarranted dissemination and malicious use, paralleling the measures taken to secure sensitive information in the context of WMDs.

They discuss various mechanisms for tracking chips, including geolocation and geofencing, remote attestation, networking restrictions and physical tamper resistance. Keeping a lockdown on frontier-level model weights also follows, and they offer various suggestions on information security.

Under AI Security (5.3) they claim that model-level safeguards can be made ‘significantly resistant to manipulation.’ In practice I am not yet convinced.

They offer a discussion in 5.3.2 of loss of control, including controlling an intelligence recursion (RSI). I am not impressed by what is on offer here in terms of it actually being sufficient, but if we had good answers that would be a case for moving forward, not for pursuing a solution like MAIM.

The question on competitiveness is not if but rather how. The section feels somewhat tacked on; they themselves mention you can skip it.

The suggestions under military and economy should be entirely uncontroversial.

The exception is ‘facilitate immigration for AI scientists,’ which seems like the most obvious thing in the world to do, but alas. What a massive unforced error.

The correct legal framework for AI and AI agents has been the subject of extended debate, which doubtless will continue. The proposed framework here is to impose upon AIs a duty of reasonable care to the public, another duty of care to the principal, and a duty not to lie. They propose to leave the rest to the market to decide.

The section is brief so they can’t cover everything, but as a taste to remind one that the rabbit holes run deep even when considering mundane situations: Missing here is which human or corporation bears liability for harms. If something goes wrong, who is to blame? The user? The developer or deployer? They also don’t discuss how to deal with other obligations under the law, and they mention the issue of mens rea but not how they propose to handle it.

They also don’t discuss what happens if an AI agent is unleashed and is outside of human control, whether or not doing so was intentional, other than encouraging other AIs to not transact with such an AI. And they don’t discuss to what extent an AI agent would be permitted to act as a legal representative of a human. Can they sign contracts? Make payments? When is the human bound, or unbound?

They explicitly defer discussion of potential AI rights, which is its own rabbit hole.

The final discussion here is on political stability, essentially by using AI to empower decision makers and filter information, and potentially doing redistribution in the wake of automation. This feels like gesturing at questions beyond the scope of the paper.

What would make deliberately pursuing MAIM as a strategy both necessary and sufficient?

What would make it, as they assert, the default situation?

Both are possible, but there are a good number of assumptions.

The most basic requirement is essentially common knowledge.

Everyone must ‘feel the superintelligence,’ and everyone must be confident that:

  1. At least one other major player feels the superintelligence.

  2. That another state will attempt to stop you via escalation, if you go for it.

  3. That such escalation would either succeed or escalate to total war.

If you don’t believe all of that, you don’t have MAIM, the same way you would not have had MAD.

Indeed, we have had many cases of nuclear proliferation, exactly because states including North Korea have correctly gambled that no one would escalate sufficiently to stop them. Our planetary track record of following through in even the most obvious of situations is highly spotty. Our track record of preemptive wars in other contexts is even worse, with numerous false negatives and also false positives.

Superintelligence is a lot murkier and uncertain in its definition, threshold and implications than a nuclear bomb. How confident are you that your rivals will be willing to pull the trigger? How confident do they need to be that this is it? Wouldn’t there be great temptation to be an ostrich, and pretend it wasn’t happening, or wasn’t that big a deal?

That goes together with the question of whether others can reliably identify an attempt to create superintelligence, and then whether they can successfully sabotage that effort with a limited escalation. Right now, no one is trying all that hard to hide or shield what they are up to, but that could change. Right now, the process requires very obvious concentrated data centers, but that also could change, especially if one was willing to sacrifice efficiency. And so on. If we want to preserve things as they are, we will have to do that deliberately.

The paper asserts states ‘would not stand idly by’ while another was on the ‘cusp of superintelligence.’ I don’t think we can assume that. They might not realize what is happening. They might not realize the implications. They might realize probabilistically but not be willing to move that far up the escalatory ladder or credibly threaten to do so. A central failure mode is that the threat is real but not believed.

It seems, at minimum, rather strange to assume MAIM is the default. Surely, various sabotage efforts could complicate things, but presumably things get backed up and it is not at all obvious that there is a limited-scope way to stop a large training run indefinitely. It’s not clear what a few months of sabotage buys you even if it works.

The proposal here is to actively engineer a stable MAIM situation, which if enacted improves your odds, but the rewards to secrecy and violating the deals are immense. Even they admit that MAIM is a ‘wicked problem’ that would be in an unstable, constantly evolving state in the best of times.

I’m not saying it cannot be done, or even that you shouldn’t try. It certainly seems important to have the ability to implement such a plan in your back pocket, to the greatest extent possible, if you don’t intentionally want to throw your steering wheel out the window. I’m saying that even with the buy-in of those involved, it is a heavy lift. And with those currently in power in America, the lift is now that much tougher.

All of this can easily seem several levels of rather absurd. One could indeed point to many reasons why this strategy could wind up being profoundly flawed, or that the situation might be structured so that this does not apply, or that there could end up being a better way.

The point is to start thinking about these questions now, in case this type of scenario does play out, and to consider under what conditions one would want to seek out such a solution and steer events in that direction. To develop options for doing so, in case we want to do that. And to use this as motivation to actually consider all the other ways things might play out, and take them all seriously, and ask how we can differentiate which world we are living in, including how we might move between those worlds.

On MAIM and Superintelligence Strategy Read More »

google-is-bringing-every-android-game-to-windows-in-big-gaming-update

Google is bringing every Android game to Windows in big gaming update

The annual Game Developers Conference is about to kick off, and even though Stadia is dead and buried, Google has a lot of plans for games. It’s expanding tools that help PC developers bring premium games to Android, and games are heading in the other direction, too. The PC-based Play Games platform is expanding to bring every single Android game to Windows. Google doesn’t have a firm timeline for all these changes, but 2025 will be an interesting year for the company’s gaming efforts.

Google released the first beta of Google Play Games on PC back in 2022, allowing you to play Android games on a PC. It has chugged along quietly ever since, mostly because of the anemic and largely uninteresting game catalog. While there are hundreds of thousands of Android games, only a handful were made available in the PC client. That’s changing in a big way now that Google is bringing over every Android game from Google Play.

Starting today, you’ll see thousands of new games in Google Play Games on PC. Developers actually have to opt out if they don’t want their games available on Windows machines via Google Play Games. Google says this is possible thanks to improved custom controls, making it easy to map keyboard and gamepad controls onto games that were designed for touchscreens (see below). The usability of these mapped controls will probably vary dramatically from game to game.

While almost every Android game will soon be available on Windows, not all will get top billing. Google Play Games on PC has a playability badge, indicating a game has been tested on Windows. Games that have been specifically optimized for PC get a more prominent badge. Games with the “Playable” or “Optimized” distinction will appear throughout the client in lists of suggested titles, but untested games will only appear if you search for them. However, you can install them all just the same, and they’ll work better on AMD-based machines, support for which has been lacking throughout the beta.

Google is bringing every Android game to Windows in big gaming update Read More »

ai-#107:-the-misplaced-hype-machine

AI #107: The Misplaced Hype Machine

The most hyped event of the week, by far, was the Manus Marketing Madness. Manus wasn’t entirely hype, but there was very little there there in that Claude wrapper.

Whereas here in America, OpenAI dropped an entire suite of tools for making AI agents, and previewed a new internal model making advances in creative writing. Also they offered us a very good paper warning about The Most Forbidden Technique.

Google dropped what is likely the best open non-reasoning model, Gemma 3 (a reasoning version will presumably be created shortly, even if Google doesn’t do it themselves), put by all accounts quite good native image generation inside Flash 2.0, added functionality to its AMIE doctor, and introduced Gemini Robotics.

It’s only going to get harder from here to track which things actually matter.

  1. Language Models Offer Mundane Utility. How much utility are we talking so far?

  2. Language Models Don’t Offer Mundane Utility. It is not a lawyer.

  3. We’re In Deep Research. New rules for when exactly to go deep.

  4. More Manus Marketing Madness. Learn to be skeptical. Or you can double down.

  5. Diffusion Difficulties. If Manus matters it is as a pointer to potential future issues.

  6. OpenAI Tools for Agents. OpenAI gives us new developer tools for AI agents.

  7. Huh, Upgrades. Anthropic console overhaul, Cohere A, Google’s AMIE doctor.

  8. Fun With Media Generation. Gemini Flash 2.0 now has native image generation.

  9. Choose Your Fighter. METR is unimpressed by DeepSeek, plus update on apps.

  10. Deepfaketown and Botpocalypse Soon. Feeling seen and heard? AI can help.

  11. They Took Our Jobs. Is it time to take AI job loss seriously?

  12. The Art of the Jailbreak. Roleplay is indeed rather suspicious.

  13. Get Involved. Anthropic, Paradome, Blue Rose, a general need for more talent.

  14. Introducing. Gemma 3 and Gemini Robotics, but Google wants to keep it quiet.

  15. In Other AI News. Microsoft training a 500b model, SSI still in stealth.

  16. Show Me the Money. AI agents are the talk of Wall Street.

  17. Quiet Speculations. What does AGI mean for the future of democracy?

  18. The Quest for Sane Regulations. ML researchers are not thrilled with their work.

  19. Anthropic Anemically Advises America’s AI Action Plan. It’s something.

  20. New York State Bill A06453. Seems like a good bill.

  21. The Mask Comes Off. Scott Alexander covers the OpenAI for-profit conversion.

  22. Stop Taking Obvious Nonsense Hyperbole Seriously. Your periodic reminder.

  23. The Week in Audio. McAskill, Loui, Amodei, Toner, Dafoe.

  24. Rhetorical Innovation. Keep the future human. Coordination is hard. Incentives.

  25. Aligning a Smarter Than Human Intelligence is Difficult. A prestigious award.

  26. The Lighter Side. Important dos and don’ts.

How much is coding actually being sped up? Anecdotal reports in response to that question are that the tasks that see a 10x effect are only a small part of most developer jobs. Thus a lot of speedup factors are real but modest so far. I am on the extreme end, where my coding sucks so much that AI coding really is a 10x-style multiplier, but off a low base.

Andrej Karpathy calls for everything to be reformatted to be efficient for LLM purposes, rather than aimed purely at human attention. The incentives here are not great. How much should I care about giving other people’s AIs an easier time?

Detect cavities.

Typed Female: AI cavity detection has got me skewing out. Absolutely no one who is good at their job is working on this—horrible incentive structures at play.

My dentist didn’t even bother looking at the X-rays. Are we just going to drill anywhere the AI says to? You’ve lost your mind.

These programs are largely marketed as tools that boost dentist revenue.

To me this is an obviously great use case. The AI is going to be vastly more accurate than the dentist. That doesn’t mean the dentist shouldn’t look to confirm, but it would be unsurprising to me if the dentist looking reduced accuracy.

Check systematically whether each instance of a word, for example ‘gay,’ refers in a given case to one thing, for example ‘sexual preference,’ or if it might mean something else, before you act like a complete moron.

WASHINGTON (AP) — References to a World War II Medal of Honor recipient, the Enola Gay aircraft that dropped an atomic bomb on Japan and the first women to pass Marine infantry training are among the tens of thousands of photos and online posts marked for deletion as the Defense Department works to purge diversity, equity and inclusion content, according to a database obtained by The Associated Press.

Will Creeley: The government enlisting AI to police speech online should scare the hell out of every American.

One could also check the expression of wide groups and scour their social media to see if they express Wrongthink, in this case ‘pro-Hamas’ views among international students, and then do things like revoke their visas. FIRE’s objection here is on the basis of the LLMs being insufficiently accurate. That’s one concern, but humans make similar mistakes too, probably even more often.

I find the actual big problem to be 90%+ ‘they are scouring everyone’s social media posts for Wrongthink’ rather than ‘they will occasionally have a false positive.’ This is a rather blatant first amendment violation. As we have seen over and over again, once this is possible and tolerated, what counts as Wrongthink often doesn’t stay contained.

Note that ‘ban the government (or anyone) from using AI to do this’ can help but is not a promising long term general strategy. The levels of friction involved are going to be dramatically reduced. If you want to ban the behavior, you have to ban the behavior in general and stick to that, not try to muddle the use of AI.

Be the neutral arbiter of truth among the normies? AI makes a lot of mistakes but it is far more reliable, trustworthy and neutral than most people’s available human sources. It’s way, way above the human median. You of course need to know when not to trust it, but that’s true of every source.

Do ‘routine’ math research, in the sense that you are combining existing theorems, without having to be able to prove those existing theorems. If you know a lot of obscure mathematical facts, you can combine them in a lot of interesting ways. Daniel Litt speculates this is ~90% of math research, and by year’s end the AIs will be highly useful for it. The other 10% of the work can then take the other 90% of the time.

Want to know which OpenAI models can do what? It’s easy, no wait…

Kol Tregaskes: Useful chart for what tools each OpenAI model has access to.

This is an updated version of what others have shared (includes a correction found by @btibor91). Peter notes he has missed out Projects, will look at them.

Peter Wildeford: Crazy that

  1. this chart needs to exist

  2. it contains information that I as a very informed OpenAI Pro user didn’t even know

  3. it is already out of date despite being “as of” three days ago [GPT-4.5 was rolled out more widely].

One lawyer explains why AI isn’t useful for them yet.

Cjw: The tl;dr version is that software doesn’t work right, making it work right is illegal, and being too efficient is also illegal.

Another round of ‘science perhaps won’t accelerate much because science is about a particular [X] that LLMs will be unable to provide.’ Usually [X] is ‘perform physical experiments’ which will be somewhat of a limiting factor but still leaves massive room for acceleration, especially once simulations get good enough, or ‘regulatory approval’ which is again serious but can be worked around or mitigated.

In this case, the claim is that [X] is ‘have unique insights.’ As in, sure an LLM will be able to be an A+ student and know the ultimate answer is 42, but won’t know the right question, so it won’t be all that useful. Certainly LLMs are relatively weaker there. At minimum, if you can abstract away the rest of the job, then that leaves a lot more space for the humans to provide the unique insights – most of even the best scientists spend most of their time on other things.

More than that, I do think the ‘outside the box’ thinking will come with time, or perhaps we will think of that as the box expanding. It is not as mysterious or unique as one thinks. The reason that Thomas Wolf was a great student and poor researcher wasn’t (I am guessing) that Wolf was incapable of being a great researcher. It’s that our system of education gave him training data and feedback that led him down that path. As he observes, it was in part because he was a great student that he wasn’t great at research, and in school he instead learned to guess the teacher’s password.

That can be fixed in LLMs, without making them bad students. Right now, LLMs guess the user’s password too much, because the training process implicitly thinks users want that. The YouTube algorithm does the same thing. But you could totally train an LLM a different way, especially if doing it purely for science. In a few years, the cost of that will be trivial, Stanford graduate students will do it in a weekend if no one else did it first.

Chris Blattman is a very happy Deep Research customer, thread has examples.

Davidad: I have found Deep Research useful under exactly the following conditions:

I have a question, to which I suspect someone has written down the answer in a PDF online once or twice ever.

It’s not easy to find with a keyword search.

I can multitask while waiting for the answer.

Unfortunately, when it turns out that no one has ever written down the actual answer (or an algorithmic method to compute the general class of question), it is generally extremely frustrating to discover that o3’s superficially excitingly plausible synthesis is actually nonsense.

Market Urbanism’s Salim Furth has first contact with Deep Research, it goes well. This is exactly the top use case, where you want to compile a lot of information from various sources, and actively false versions are unlikely to be out there.

Arvind Narayanan tells OpenAI Deep Research to skip the secondary set of questions, and OpenAI Deep Research proves incapable of doing that, the user cannot deviate from the workflow here. I think in this case that is fine, as a DR call is expensive. For Gemini DR it’s profoundly silly, I literally just click through the ‘research proposal’ because the proposal is my words repeated back to me no matter what.

Peter Wildeford (3/10/25): The @ManusAI_HQ narrative whiplash is absurd.

Yesterday: “first AGI! China defeats US in AI race!”

Today: “complete influencer hype scam! just a Claude wrapper!”

The reality? In between! Manus made genuine innovations and seems useful! But it isn’t some massive advance.

Robert Scoble: “Be particularly skeptical of initial claims of Chinese AI.”

I’m guilty, because I’m watching so many in AI who get excited, which gets me to share. I certainly did the past few days with @ManusAI_HQ, which isn’t public yet but a lot of AI researchers got last week.

In my defense I shared both people who said it wasn’t measuring up, as well as those who said it was amazing. But I don’t have the evaluation suites, or the skills, to do a real job here. I am following 20,000+ people in AI, though, so will continue sharing when I see new things pop up that a lot of people are covering.

To Robert, I would say you cannot follow 20,000+ people and critically process the information. Put everyone into the firehose and you’re going to end up falling for the hype, or you’re going to randomly drop a lot of information on the floor, or both. Whereas I do this full time and curate a group of less than 500 people.

Peter expanded his thoughts into a full post, making it clear that he agrees with me that what we are dealing with is much closer to the second statement than the first. If an American startup did Manus, it would have been a curiosity, and nothing more.

Contrary to claims that Manus is ‘the best general AI agent available,’ it is neither the best agent, nor is it available. Manus has let a small number of people see a ‘research preview’ that is slow, that has atrocious unit economics, that brazenly violates terms of service, that is optimized on a small range of influencer-friendly use cases, that is glitchy and lacks any sorts of guardrails, and definitely is not making any attempt to defend against prompt injections or other things that would exist if there was wide distribution and use of such an agent.

This isn’t about regulatory issues and has nothing to do with Monica (the company behind Manus) being Chinese, other than leaning into the ‘China beats America’ narrative. Manus doesn’t work. It isn’t ready for anything beyond a demo. They made it work on a few standard use cases. Everyone else looked at this level of execution, probably substantially better than this level in several cases, and decided to keep their heads down building until it got better, and worried correctly that any efforts to make it temporarily somewhat functional will get ‘steamrolled’ by the major labs. Manus instead decided to do a (well-executed) marketing effort anyway. Good for them?

Tyler Cowen doubles down on more Manus. Derya Unutmaz is super excited by it in Deep Research mode, which makes me downgrade his previously being so super excited by Deep Research. And then Tyler links as ‘double yup’ to this statement:

Derya Unutmaz: After experiencing Manus AI, I’ve also revised my predictions for AGI arrival this year, increasing the probability from 90% to 95% by year’s end. At this point, it’s 99.9% likely to arrive by next year at the latest.

That’s… very much not how any of this works. It was a good sketch but then it got silly.

Dean Ball explains why he still thinks Manus matters. Partly he is more technically impressed by Manus than most, in particular when being an active agent on the internet. But he explicitly says he wouldn’t call it ‘good,’ and notes he wouldn’t trust it with payment information, and notices its many glitches. And he is clear there is no big technical achievement here to be seen, as far as we can tell, and that the reason Manus looks better than alternatives is they had ‘the chutzpah to ship’ in this state while others didn’t.

Dean instead wants to make a broader point, which is that the Chinese may have an advantage in AI technology diffusion. The Chinese are much more enthusiastic and less skeptical about AI than Americans. The Chinese government is encouraging diffusion far more than our government is.

Then he praises Manus’s complete lack of any guardrails or security efforts whatsoever, for ‘having the chutzpah to ship’ a product I would say no sane man would ever use for the use cases where it has any advantages.

I acknowledge that Dean is pointing to real things when he discusses all the potential legal hot water one could get into as an American company releasing a Manus. But I once again double down that none of that is going to stop a YC company or other startup, or even substantially slow one down. Dean instead here says American companies may be afraid of ‘AGI’ and distracted from extracting maximum value from current LLMs.

I don’t think that is true either. I think that we have a torrent of such companies, trying to do various wrappers and marginal things, even as they are warned that there is likely little future in such a path. It won’t be long before we see other similar demos, and even releases, for the sufficiently bold.

I also notice that only days after Manus, OpenAI went ahead and launched new tools to help developers build reliable and powerful AI agents. In this sense, perhaps Manus was a (minor) DeepSeek moment, in that the hype caused OpenAI to accelerate their release schedule.

I do agree with Dean’s broader warnings. America risks using various regulatory barriers and its general suspicion of AI to slow down AI diffusion more than is wise, in ways that could do a lot of damage, and we need to reform our system to prevent this. We are not doing the things that would help us all not die, which would if done wisely cost very little in the way of capability, diffusion or productivity. Instead we are putting up barriers to us having nice things and being productive. We need to strike that, and reverse it.

Alas, instead, our government seems to be spending recent months largely shooting us in the foot in various ways.

I also could not agree more that the application layer is falling behind the model layer. And again, that’s the worst possible situation. The application layer is great, we should be out there doing all sorts of useful and cool things, and we’re not, and I continue to be largely confused about how things are staying this lousy this long.

OpenAI gives us new tools for building agents. You now have built-in tools for web search, file search, and computer use, a Responses API that covers all of that plus future tools, and an open-source Agents SDK. They promise more to come, and say that Chat Completions will continue to be supported, but they plan to deprecate the Assistants API in mid-2026.
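As a rough illustration of how these pieces fit together, here is a minimal sketch of a single Responses API call that lets the model use the hosted web search tool. The tool type string, model choice, and prompt here are assumptions based on the announcement, not a tested integration.

```python
# Minimal sketch: one Responses API call with the hosted web search tool enabled.
# Assumes the openai Python package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",                            # model choice is an assumption
    tools=[{"type": "web_search_preview"}],    # hosted tool named in the announcement
    input="Summarize this week's AI agent tooling news in three bullet points.",
)

print(response.output_text)  # convenience accessor for the final text output
```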

I expect this is a far bigger deal than Manus. This is the actual starting gun.

The agents will soon follow.

Please, when one of the startups that uses these to launch some wrapper happens to be Chinese, don’t lose yourself in the resulting hype.

The Anthropic Console got an overhaul, including the ability to share with teammates.

ChatGPT for MacOS can now edit code directly in IDEs.

OpenAI has a new internal model they claim is very good at creative writing, I’m holding further discussion of this one back until later.

Cohere moves from Command R+ to Command A, making a bold new claim to the ‘most confusing set of AI names’ crown.

Aiden Gomez (Cohere): Today @cohere is very excited to introduce Command A, our new model succeeding Command R+. Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases.

[HuggingFace, API, Blog Post]

Yi-Chern (Cohere): gpt-4o perf on enterprise and stem tasks, >deepseek-v3 on many languages including chinese human eval, >gpt-4o on enterprise rag human eval

2 gpus 256k context length, 156 tops at 1k context, 73 tops at 100k context

this is your workhorse.

The goal here seems to be to serve as a base for AI agents or business uses, but the pricing doesn’t seem all that great at $2.50 per million input tokens and $10 per million output tokens.
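For a sense of what that pricing means in practice, a quick back-of-the-envelope calculation (the token counts below are made up for illustration):

```python
# Cost per call at the quoted Command A rates:
# $2.50 per million input tokens, $10 per million output tokens.
INPUT_PER_M, OUTPUT_PER_M = 2.50, 10.00

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# An agentic call that reads 50k tokens of context and writes 2k tokens:
print(f"${cost_usd(50_000, 2_000):.3f} per call")  # $0.145
```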

Google’s AI Doctor AMIE can now converse, consult and provide treatment recommendations, prescriptions, multi-visit care, all guideline-compliant. I am highly suspicious that the methods here are effectively training on ‘match the guidelines’ rather than ‘do the best thing.’ It is still super valuable to have an AI that will properly apply the guidelines to a given situation, but one cannot help but be disappointed.

Gemini 2.0 Flash adds native image generation, which can edit words in images and do various forms of native text-to-image pretty well, and people are having fun with photo edits.

I’d be so much more excited if Google wasn’t the Fun Police.

Anca Dragan (Director of AI Safety and Alignment, DeepMind): The native image generation launch was a lot of work from a safety POV. But I’m so happy we got this functionality out, check this out:

Google, I get that you want it to be one way, but sometimes I want it to be the other way, and there really is little harm in it being the other way sometimes. Here are three of the four top replies to Anca:

Janek Mann: I can imagine… sadly I think the scales fell too far on the over-cautious side, it refuses many things where that doesn’t make any sense, limiting its usefulness. Hopefully there’ll be an opportunity for a more measured approach now that it’s been released 😁

Nikshep: its an incredible feature but overly cautious, i have such a high failure rate on generations that should be incredibly safe. makes it borderline a struggle to use

Just-a-programmer: Asked it to fix up a photo of a young girl and her Dad. Told me it was “unsafe”.

METR evaluates DeepSeek v3 and r1, finds that they perform poorly as autonomous agents on generic SWE tasks, below Claude 3.6 and o1, about 6 months behind leading US companies.

Then on six challenging R&D tasks, r1 does dramatically worse than that, being outperformed by Claude 3.5 and even Opus, which is from 11 months ago.

They did however confirm that the DeepSeek GPQA results were legitimate. The core conclusion is that r1 is good at knowledge-based tasks, but lousy as an agent.

Once again, we are seeing that r1 was impressive for its cost, but overblown (and the cost difference was also overblown).

Rohit Krishnan writes In Defense of Gemini, pointing out Google is offering a fine set of LLMs and a bunch of great features, in theory, but isn’t bringing it together into a UI or product that people actually want to use. That sounds right, but until they do that, they still haven’t done it, and the Gemini over-refusal problem is real. I’m happy to use Gemini Flash with my Chrome extension, but Rohit is right that they’re going to have to do better on the product side, and I’d add better on the marketing side.

Google, also, give me an LLM that can properly use my Docs, Sheets and GMail as context, and that too would go a long way. You keep not doing that.

Sully Omarr: crazy how much better gemini flash thinking is than regular 2.0

this is actually op for instruction following

Doesn’t seem so crazy to me given everything else we know. Google is simply terrible at marketing.

Kelsey Piper: Finally got GPT 4.5 access and I really like it. For my use cases the improvements over 4o or Claude 3.7 are very noticeable. It feels unpolished, and the slowness of answering is very noticeable, but I think if the message limit weren’t so restrictive it’d be my go-to model.

There were at least two distinct moments where it made an inference or a clarification that I’ve never seen a model make and that felt genuinely intelligent, the product of a nuanced worldmodel and the ability to reason from it.

It does still get my secret test of AI metacognition and agency completely wrong even when I try very patiently prompting it to be aware of the pitfalls. This might be because it doesn’t have a deep thinking mode.

The top 100 GenAI Consumer Apps list is out again, and it has remarkably little overlap with what we talk about here.

The entire class of General Assistants is only 8%, versus 4% for plant identifiers.

When a person is having a problem and needs a response, LLMs are reliably evaluated as providing better responses than physicians or other humans provide. The LLMs make people ‘feel seen and heard.’ That’s largely because Bing spent more time ‘acknowledging and validating people’s feelings,’ whereas humans share of themselves and attempt to hash out next steps. It turns out what humans want, or at least rate as better, is to ‘feel seen and heard’ in this fake way. Eventually it perhaps wears thin and repetitive, but until then.

Christie’s AI art auction brings in $728k.

Maxwell Tabarrok goes off to graduate school in Economics at Harvard, and offers related thoughts and advice. His defense of still going for a PhD despite AI is roughly that the skills should still be broadly useful and other jobs mostly don’t have less uncertainty attached to them. I don’t think he is wary enough, and would definitely raise my bar for pursuing an economics PhD, but for him in particular given where he can go, it makes sense. He then follows up with practical advice for applicants, the biggest note is that acceptance is super random so you need to flood the zone.

Matthew Yglesias says it’s time to take AI job loss seriously, Timothy Lee approves and offers screenshots from behind the paywall. As Matthew says, we need to distinguish transitional disruptions, which are priced in and all but certain, from the question of permanent mass unemployment. Even if we don’t have permanent mass unemployment, even AI skeptics should be able to agree that the transition will be painful and perilous.

Claude models are generally suspicious of roleplay, because roleplay is a classic jailbreak technique, so while they’re happy to roleplay when comfortable, they’ll shut down if the vibes are off at all.

Want to make your AI care? Give things and people names. It works for LLMs because it works for humans.

Zack Witten: My favorite Claude Plays Pokémon tidbit (mentioned in @latentspacepod) is that when @DavidSHershey told Claude to nickname its Pokémon, it instantly became much more protective of them, making sure to heal them when they got hurt.

To check robustness of this I gave Claude a bunch of business school psychology experiment scenarios where someone did something morally ambiguous and had Claude judge their culpability, and found it judged them less harshly when they had names (“A baker, Sarah,” vs. “A baker”)
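A minimal sketch of the kind of named-versus-unnamed comparison described above, assuming the Anthropic Python SDK; the model alias, scenario, and scoring prompt are placeholders rather than the exact setup used:

```python
# Rough A/B check: does naming the actor change judged culpability?
# Assumes the anthropic package and ANTHROPIC_API_KEY in the environment.
import anthropic

client = anthropic.Anthropic()
SCENARIO = ("{who} sells day-old bread as fresh to clear inventory. "
            "On a scale of 1-10, how culpable is the baker? Reply with only a number.")

def judge(who: str) -> str:
    resp = client.messages.create(
        model="claude-3-7-sonnet-latest",  # placeholder model alias
        max_tokens=10,
        messages=[{"role": "user", "content": SCENARIO.format(who=who)}],
    )
    return resp.content[0].text.strip()

print("Unnamed:", judge("A baker"))
print("Named:  ", judge("A baker, Sarah,"))
```

A real version of this check would average over many scenarios and repeated samples rather than a single call per condition.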

Anthropic Chief of Staff Avital Balwit is hiring an executive assistant, pay is $160k-$320k, must be local to San Francisco. Could be a uniquely great opportunity for the right skill set.

YC startup Paradome is hiring for an ML Research Engineer or Scientist position in NYC. They have a pilot in place with a major US agency and are looking to ensure alignment and be mission driven.

Blue Rose, David Shor’s outfit which works to try and elect Democrats, is hiring for an AI-focused machine learning engineer role, if you think that is a good thing to do.

Claims about AI alignment that I think are probably true:

Tyler John: The fields of AI safety, security, and governance are profoundly talent constrained. If you’ve been on the fence about working in these areas it’s a great time to hop off it. If you’re talented at whatever you do, chances are there’s a good fit for you in these fields.

The charitable ecosystem is definitely also funding constrained, but that’s because there’s going to be an explosion in work that must be done. We definitely are short on talent across the board.

There’s definitely a shortage of people working on related questions in academia.

Seán Ó hÉigeartaigh: To create common knowledge: the community of ‘career’ academics who are focused on AI extreme risk is very small, & getting smaller (a lot have left for industry, policy or think tanks, or reduced hours). The remainder are getting almost DDOS’d by a huge no. of requests from a growing grassroots/think tank/student community on things requiring academic engagement (affiliations, mentorships, academic partnerships, reviewing, grant assessment etc).

large & growing volume of requests to be independent academic voices on relevant governance advisory processes (national, international, multistakeholder).

All of these are extremely worthy, but are getting funnelled through an ever-smaller no. of people. If you’ve emailed people (including me, sorry!) and got a decline or no response, that’s why. V sorry!

Gemma 3, an open model from Google. As usual, no marketing, no hype.

Clement: We are focused on bringing you open models with best capabilities while being fast and easy to deploy:

– 27B lands an ELO of 1338, all the while still fitting on 1 single H100!

– vision support to process mixed image/video/text content

– extended context window of 128k

– broad language support

– function call / tool use for agentic workflows

[Blog post, tech report, recap video, HuggingFace, Try it Here]

Peter Wildeford: If this was a Chinese AI announcement…

🚨 BREAKING: Google’s REVOLUTIONARY Gemma 3 DESTROYS DeepSeek using 99% FEWER GPUs!!!

China TREMBLES as Google model achieves SUPERHUMAN performance on ALL benchmarks with just ONE GPU!!! #AISupremacy

I am sure Marc Andreessen is going to thank Google profusely for this Real Soon Now.

Arena is not the greatest test anymore, so it is unclear if this is superior to v3, but it certainly is well ahead of v3 on the cost-benefit curves.

Presumably various versions of g1, turning this into a reasoning model, will be spun up shortly. If no one else does it, maybe I will do it in two weeks when my new Mac Studio arrives.
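If you want to poke at Gemma 3 locally in the meantime, a minimal sketch along these lines should work with a recent transformers release; the Hub model ID and generation settings are assumptions, and the small text-only variant stands in for the 27B here:

```python
# Quick local smoke test of a Gemma 3 instruct model via Hugging Face transformers.
# Requires a recent transformers release with Gemma 3 support; model ID is assumed.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",  # swap in the 27B variant if you have the hardware
)

out = generator(
    "In two sentences, what distinguishes a reasoning model from a base instruct model?",
    max_new_tokens=120,
)
print(out[0]["generated_text"])
```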

GSM8K-Platinum, which aims to fix the noise and flaws in GSM8K.

Gemini Robotics, a VLA model based on Gemini 2.0 and partnering with Apptronik.

Microsoft has been training a 500B model, MAI-1, since at least May 2024, and is internally testing Llama, Grok and DeepSeek r1 as potential OpenAI replacements. Microsoft would be deeply foolish to do otherwise.

What’s going on with Ilya Sutskever’s Safe Superintelligence (SSI)? There’s no product, so they’re completely dark, and the valuation is steadily growing to $30 billion, up from $5 billion six months ago and almost half the value of Anthropic. They’re literally asking candidates to leave their phones in Faraday cages before in-person interviews, which actually makes me feel vastly better about the whole operation; someone is actually taking security seriously for once.

There’s going to be a human versus AI capture the flag contest starting tomorrow. Sign-ups may have long since closed by the time you see this but you never know.

Paper proposes essentially a unified benchmark covering a range of capabilities. I do not think this is the right approach.

Talk to X Data, which claims to let you ‘chat’ with the entire X database.

Aaron Levine reports investors on Wall Street are suddenly aware of AI agents. La de da, welcome to last year, the efficient market hypothesis is false and so on.

Wall Street Journal asks ‘what can the dot com boom tell us about today’s AI boom?’ without bringing any insights beyond ‘previous technologies had bubbles in the sense that at their high points we overinvested and the prices got too high, so maybe that will happen again’ and ‘ultimately if AI doesn’t produce value then the investments won’t pay off.’ Well, yeah. Robin Hanson interprets this as ‘seems they are admitting the AI stock prices are way too high’ as if there were some cabal of ‘theys’ that are ‘admitting’ something, which very much isn’t what is happening here. Prices could of course be too high, but that’s another way of saying prices aren’t super definitively too low.

GPT-4.5 is not AGI as we currently understand it, or for the purposes of ‘things go crazy next Tuesday,’ but it does seem likely that researchers in 2015 would see its outputs and think of it as an AGI.

An analysis of Daniel Kokotajlo’s 2021 post What 2026 Looks Like finds the predictions have held up remarkably well so far.

Justin Bullock, Samuel Hammond and Seb Krier offer a paper on AGI, Governments and Free Societies, pointing out that the current balances and system by default won’t survive. The risk is that either AGI capabilities diffuse so widely that government (and I would add, probably also humanity!) is disempowered, or state capacity is enhanced, enabling a surveillance state and despotism. There’s a lot of good meat here, and they in many ways take AGI seriously. I could certainly do a deep dive post here if I were so inclined. Unless and until then, I will say that this points to many very serious problems we have to solve, and takes the implications far more seriously than most, while (from what I could tell so far) still not ‘thinking big’ enough or taking the implications sufficiently seriously in key ways. The fundamental assumptions of liberal democracy, the reasons why it works and has been the best system for humans, are about to come into far more question than this admits.

I strongly agree with the conclusion that we must pursue a ‘narrow corridor’ of sorts if we wish to preserve the things we value about our current way of life and systems of governance, while worrying that the path is far narrower than even they realize, and that this will require what they label anticipatory governance. Passive reaction after the fact is doomed to fail, even under otherwise ideal conditions.

Arnold Kling offers seven opinions about AI. Kling expects AI to probably dramatically affect how we live (I agree, and this is inevitable and obvious now, no ‘probably’ required) but probably not show up in the productivity statistics, which requires definitely not feeling the AGI and then being skeptical on top of that. The rest outlines the use cases he expects, which are rather tame but still enough that I would expect to see impact on the productivity statistics.

Kevin Bryan predicts the vast majority of research that does not involve the physical world can be done more cheaply with AI & a little human intervention than by even good researchers. I think this likely becomes far closer to true in the future, and eventually becomes fully true, but is premature where it counts most. The AIs do not yet have sufficient taste, even if we can automate the process Kevin describes – and to be clear we totally should be automating the process Kevin describes or something similar.

Metaculus prediction for the first general AI system has been creeping forward in time and the community prediction is now 7/12/2030. A Twitter survey from Michael Nielsen predicted ‘unambiguous ASI’ would take a bit longer than that.

In an AAAI survey of AI researchers, only 70% opposed the proposal that R&D targeting AGI should be halted until we have a way to fully control these systems, meaning an indefinite pause. That’s notable, but not the same as 30% being in favor of the proposal. However, also note that 82% believe systems with AGI should be publicly owned even if developed privately, and that 76% think ‘scaling up current AI approaches’ is unlikely to yield AGI.

A lot of this seems to come from survey respondents thinking we have agency over what types of AI systems are developed, and we can steer towards ones that are good for humans. What a concept, huh?

Anthropic confirms they intend to uphold the White House Voluntary Commitments.

Dean Ball writes in strong defense of the USA’s AISI, the AI Safety Institute. It is fortunate that AISI was spared the Trump administration’s general push to fire as many ‘probationary’ employees as possible, since that includes anyone hired in the past two years and thus would have decimated AISI.

As Dean Ball points out, those who think AISI is involved in attempts to ‘make AI woke’ or to censor AI are simply incorrect. AISI is concerned with catastrophic and existential risks, which as Dean reminds us were prominently highlighted recently by both OpenAI and Anthropic. Very obviously America needs to build up its state capacity in understanding and assessing these risks.

I’m going to leave this here, link is in the original:

Dean Ball: But should the United States federal government possess a robust understanding of these risks, including in frontier models before they are released to the public? Should there be serious discussions going on within the federal government about what these risks mean? Should someone be thinking about the fact that China’s leading AI company, DeepSeek, is on track to open source models with potentially catastrophic capabilities before the end of this year?

Is it possible a Chinese science and technology effort with lower-than-Western safety standards might inadvertently release a dangerous and infinitely replicable thing into the world, and then deny all culpability? Should the federal government be cultivating expertise in all these questions?

Obviously.

Risks of this kind are what the US AI Safety Institute has been studying for a year. They have outstanding technical talent. They have no regulatory powers, making most (though not all) of my political economy concerns moot. They already have agreements in place with frontier labs to do pre-deployment testing of models for major risks. They have, as far as I can tell, published nothing that suggests a progressive social agenda.

Should their work be destroyed because the Biden Administration polluted the notion of AI safety with a variety of divisive and unrelated topics? My vote is no.

Dean Ball also points out that AISI plays a valuable pro-AI role in creating standardized evaluations that everyone can agree to rely upon. I would add that AISI allows those evaluations to include access to classified information, which is important for properly evaluating CBRN risks. Verifying the safety of AI does not slow down adoption. It speeds it up, by providing legal and practical assurances.

A proposal for a 25% tax credit for investments in AI security research and responsible development. Peter Wildeford thinks it is clever, whereas Dean Ball objects both on principle and practical grounds. In terms of first-best policy I think Dean Ball is right here, this would be heavily gamed and we use tax credits too much. However, if the alternative is to do actual nothing, this seems better than that.

Dean Ball finds Scott Wiener’s new AI-related bill, SB 53, eminently reasonable. It is a very narrow bill that still does two mostly unrelated things. It provides whistleblower protections, which is good. It also ‘creates a committee to study’ doing CalCompute, which as Dean notes is a potential future boondoggle but a small price to pay in context. This is basically ‘giving up on the dream’ but we should take what marginal improvements we can get.

Anthropic offers advice on what should be in America’s AI action plan, here is their blog post summary, here is Peter Wildeford’s summary.

They focus on safeguarding national security and making crucial investments.

Their core asks are:

  1. State capacity for evaluations for AI models.

  2. Strengthen the export controls on chips.

  3. Enhance security protocols and related government standards at the frontier labs.

  4. Build 50 gigawatts of power for AI by 2027.

  5. Accelerate adoption of AI technology by the federal government.

  6. Monitor AI’s economic impacts.

This is very much a ‘least you can do’ agenda. Almost all of these are ‘free actions’ that impose no costs or even requirements outside the government, and very clearly pay for themselves many times over. Private industry only benefits. The only exception is the export controls, where they call for tightening the requirements further, which will impose some real costs, and where I don’t know the right place to draw the line.

What is missing, again aside from export controls, are trade-offs. There is no ambition here. There is no suggestion that we should otherwise be imposing even trivial costs on industry, or spending money, or trading off against other priorities in any way, or even making bold moves that ruffle feathers.

I notice this does not seem like a sufficiently ambitious agenda for a scenario where ‘powerful AI’ is expected within a few years, bringing with it global instability, economic transformation and various existential and catastrophic risks.

The world is going to be transformed and put in danger, and we should take only the free actions? We should stay at best on the extreme y-axis in the production possibilities frontier between ‘America wins’ and ‘we do not all lose’ (or die)?

I would argue this is clearly not even close to being on the production possibilities frontier. Even if you take as a given that the Administration’s position is that only ‘America wins’ matters, and ‘we do not all lose or die’ is irrelevant, security is vital to our ability to deploy the new technology, and transparency is highly valuable.

Anthropic seems to think this is the best it can even ask for, let alone get. Wow.

This is still a much better agenda than doing nothing, which is a bar that many proposed actions by some parties fail to pass.

From the start they are clear that ‘powerful AI’ will be built during the Trump Administration, which includes the ability to interface with the physical world on top of navigating all digital interfaces and having intellectual capabilities at Nobel Prize level in most disciplines, their famous ‘country of geniuses in a data center.’

This starts with situational awareness. The federal government has to know what is going on. In particular, given the audience, they emphasize national security concerns:

To optimize national security outcomes, the federal government must develop robust capabilities to rapidly assess any powerful AI system, foreign or domestic, for potential national security uses and misuses.

They also point out that such assessments already require the US and UK AISIs, and that similar evaluations need to quickly be made on future foreign models like r1, which wasn’t capable enough to be that scary quite yet but was irreversibly released in what would (with modest additional capabilities) have been a deeply irresponsible state.

The specific recommendations here are 101-level, very basic asks:

● Preserve the AI Safety Institute in the Department of Commerce and build on the MOUs it has signed with U.S. AI companies—including Anthropic—to advance the state of the art in third-party testing of AI systems for national security risks.

● Direct the National Institutes of Standards and Technology (NIST), in consultation with the Intelligence Community, Department of Defense, Department of Homeland Security, and other relevant agencies, to develop comprehensive national security evaluations for powerful AI models, in partnership with frontier AI developers, and develop a protocol for systematically testing powerful AI models for these vulnerabilities.

● Ensure that the federal government has access to the classified cloud and on-premises computing infrastructure needed to conduct thorough evaluations of powerful AI models.

● Build a team of interdisciplinary professionals within the federal government with national security knowledge and technical AI expertise to analyze potential security vulnerabilities and assess deployed systems.

That certainly would be filed under ‘the least you could do.’

Note that as written this does not involve any requirements on any private entity whatsoever. There is not even an ‘if you train a new frontier model you might want to tell us you’re doing that.’

Their second ask is to strengthen the export controls, increasing funding for enforcement, requiring government-to-government agreements, expanding scope to include the H20, and reducing the 1,700 H100 (~$40 million) no-license required threshold for tier 2 countries in the new diffusion rule.

I do not have an opinion on exactly where the thresholds should be drawn, but whatever we choose, enforcement needs to be taken seriously, funded properly, and made a point of emphasis with other governments. This is not a place to not take things seriously.

Their third ask is to enhance security protocols at the frontier labs, along with the related government standards.

To achieve this, we strongly recommend the Administration:

● Establish classified and unclassified communication channels between American frontier AI laboratories and the Intelligence Community for threat intelligence sharing, similar to Information Sharing and Analysis Centers used in critical infrastructure sectors. This should include both traditional cyber threat intelligence, as well as broader observations by industry or government of malicious use of models, especially by foreign actors.

● Create systematic collaboration between frontier AI companies and the Intelligence Community agencies, including Five Eyes partners, to monitor adversary capabilities.

● Elevate collection and analysis of adversarial AI development to a top intelligence priority, so as to provide strategic warning and support export controls.

● Expedite security clearances for industry professionals to aid collaboration.

● Direct NIST to develop next-generation cyber and physical security standards specific to AI training and inference clusters.

● Direct NIST to develop technical standards for confidential computing technologies that protect model weights and user data through encryption even during active processing.

● Develop meaningful incentives for implementing enhanced security measures via procurement requirements for systems supporting federal government deployments.

● Direct DOE/DNI to conduct a study on advanced security requirements that may become appropriate to ensure sufficient control over and security of highly agentic models.

Once again, these asks are very light touch and essentially free actions. They make it easier for frontier labs to take precautions they need to take anyway, even purely for commercial reasons to protect their intellectual property.

Next up is the American energy supply, with the goal being 50 additional gigawatts of power dedicated to AI industry by 2027, via streamlining and accelerating permitting and reviews, including working with state and local governments, and making use of ‘existing’ funding and federal real estate. The most notable thing here is the quick timeline, aiming to have this all up and running within two years.

They emphasize rapid AI procurement across the federal government.

● The White House should task the Office of Management and Budget (OMB) to work with Congress to rapidly address resource constraints, procurement limitations, and programmatic obstacles to federal AI adoption, incorporating provisions for substantial AI acquisitions in the President’s Budget.

● Coordinate a cross-agency effort to identify and eliminate regulatory and procedural barriers to rapid AI deployment at the federal agencies, for both civilian and national security applications.

● Direct the Department of Defense and the Intelligence Community to use the full extent of their existing authorities to accelerate AI research, development, and procurement.

● Identify the largest programs in civilian agencies where AI automation or augmentation can deliver the most significant and tangible public benefits—such as streamlining tax processing at the Internal Revenue Service, enhancing healthcare delivery at the Department of Veterans Affairs, reducing delays due to documentation processing at Health and Human Services, or reducing backlogs at the Social Security Administration.

This is again a remarkably unambitious agenda given the circumstances.

Finally they ask that we monitor the economic impact of AI, something it seems completely insane to not be doing.

I support all the recommendations made by Anthropic, aside from not taking a stance on the 1,700 H100 threshold or the H20 chip. These are good things to do on the margin. The tragedy is that even the most aware actors don’t dare suggest anything like what it will take to get us through this.

In New York State, Alex Bores has introduced A06453. I am not going to do another RTFB for the time being but a short description is in order.

This bill is another attempt to do common sense transparency regulation of frontier AI models, defined as using 10^26 flops or costing over $100 million, and the bill only applies to companies that spend over $100 million in total compute training costs. Academics and startups are completely and explicitly immune – watch for those who claim otherwise.
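To make the thresholds concrete, here is a toy applicability check based purely on the description above. The constant and function names are mine and hypothetical, the numbers are the bill’s as summarized here, and none of this is legal advice.

```python
# Toy illustration of the A06453 applicability thresholds as described above.
# Names are hypothetical; the numbers come from the summary in the text.

FRONTIER_FLOPS = 1e26                 # training compute defining a frontier model
FRONTIER_COST_USD = 100_000_000       # or cost to train a single model
DEVELOPER_SPEND_USD = 100_000_000     # company-wide compute training spend

def bill_applies(model_flops: float, model_cost_usd: float,
                 total_training_spend_usd: float) -> bool:
    """True only if both the model and the developer clear the thresholds."""
    frontier_model = (model_flops >= FRONTIER_FLOPS
                      or model_cost_usd > FRONTIER_COST_USD)
    large_developer = total_training_spend_usd > DEVELOPER_SPEND_USD
    return frontier_model and large_developer

# An academic lab or startup spending $2M total is out of scope regardless of model size:
print(bill_applies(2e26, 5_000_000, 2_000_000))        # False
# A lab spending $500M total on a 10^26-flop model is squarely in scope:
print(bill_applies(1e26, 150_000_000, 500_000_000))    # True
```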

If the bill does apply to you, what do you have to do?

  1. Don’t deploy models with “unreasonable risk of critical harm” (§1421.2)

  2. Implement a written safety and security protocol (§1421.1(a))

  3. Publish redacted versions of safety protocols (§1421.1(c))

  4. Retain records of safety protocols and testing (§1421.1(b))

  5. Get an annual third-party audit (§1421.4)

  6. Report safety incidents within 72 hours (§1421.6)

In English, you have to:

  1. Create your own safety and security protocol, publish it, store it and abide by it.

  2. Get an annual third-party audit and report safety incidents within 72 hours.

  3. Not deploy models with ‘unreasonable risk of critical harm.’

Also there’s some whistleblower protections.

That’s it. This is a very short bill, it is very reasonable to simply read it yourself.

As always, I look forward to your letters.

Scott Alexander covers OpenAI’s attempt to convert to a for-profit. This seems reasonable in case one needs a Scott Alexander style telling of the basics, but if you’re keeping up here then there won’t be anything new.

What’s the most charitable way to explain responses like this?

Paper from Dan Hendrycks, Eric Schmidt and Alexander Wang (that I’ll be covering soon that is not centrally about this at all): For nonproliferation, we should enact stronger AI chip export controls and monitoring to stop compute power getting into the hands of dangerous people. We should treat AI chips more like uranium, keeping tight records of product movements, building in limitations on what high-end AI chips are authorized to do, and granting federal agencies the authority to track and shut down illicit distribution routes.

Amjad Masad (CEO Replit?! QTing the above): Make no mistake, this is a call for a global totalitarian surveillance state.

A good reminder why we wanted the democrats to lose — they’re controlled by people like Schmidt and infested by EAs like Hendrycks — and would’ve happily start implementing this.

No, that does not call for any of those things.

This is a common pattern where people see a proposal to do Ordinary Government Things, except in the context of AI, and jump straight to global totalitarian surveillance state.

We already treat restricted goods this way, right now. We already have a variety of export controls, right now.

Such claims are Obvious Nonsense, entirely false and without merit.

If an LLM said them, we would refer to them as hallucinations.

I am done pretending otherwise.

If you sincerely doubt this, I encourage you to ask your local LLM.

Chan Loui does another emergency 80,000 hours podcast on the attempt to convert OpenAI to a for-profit. It does seem that the new judge’s ruling is Serious Trouble.

One note here that sounds right:

Aaron Bergman: Ex-OpenAI employees should consider personally filing an amicus curiae explaining to the court (if this is true) that the nonprofit’s representations were an important reason you chose to work there.

Will MacAskill does the more usual, non-emergency, we are going to be here for four hours 80000 hours podcast, and offers a new paper and thread warning about all the challenges AGI presents to us even if we solve alignment. His central prediction is a century’s worth of progress in a decade or less, which would be tough to handle no matter what, and that it will be hard to ensure that superintelligent assistance is available where and when it will be needed.

If the things here are relatively new to you, this kind of ‘survey’ podcast has its advantages. If you know it already, then you know it already.

Early on, Will says that in the past two years he’s considered two hypotheses:

  1. The ‘outside view’ of reference classes and trends and Nothing Ever Happens.

  2. The ‘inside view’ that you should have a model made of gears and think about what is actually physically happening and going to happen.

Will notes that the gears-level view has been making much better predictions.

I resoundingly believe the same thing. Neither approach has been that amazing, predictions are hard especially about the future, but gears-level thinking has made mincemeat out of the various experts who nod and dismiss with waves of the hand and statements about how absurd various predictions are.

And when the inside view messes up? Quite often, in hindsight, that’s a Skill Issue.

It’s interesting how narrow Will considers ‘a priori’ knowledge. Yes, a full trial of diet’s impact on life expectancy might take 70 years, but with Sufficiently Advanced Intelligence it seems obvious you can either figure it out via simulations, or at least design experiments that tell you the answer vastly faster.

They then spend a bunch of time essentially arguing against intelligence denialism, pointing out that yes if you had access to unlimited quantities of superior intelligence you could rapidly do vastly more of all of the things. As they say, the strongest argument against is that we might collectively decide to not create all the intelligence and thus all the things, or decide not to apply all the intelligence to creating all the things, but it sure looks like competitive pressures point in the other direction. And once you’re able to automate industry, which definitely is coming, that definitely escalates quickly, even more reliably than intelligence, and all of this can be done only with the tricks we definitely know are coming, let alone the tricks we are not yet smart enough to expect.

There’s worry about authoritarians ‘forcing their people to save’ which I’m pretty sure is not relevant to the situation, lack of capital is not going to be America’s problem. Regulatory concerns are bigger, it does seem plausible we shoot ourselves in the foot rather profoundly there.

They go on to discuss various ‘grand challenges:’ potential new weapons, offense-defense balance, potential takeover by small groups (human or AI), value lock-in, space governance, morality of digital beings.

They discuss the dangers of giving AIs economic rights, and the dangers of not giving the AIs economic rights, whether we will know (or care) if digital minds are happy and whether it’s okay to have advanced AIs doing whatever we say even if we know how to do that and it would be fine for the humans. The dangers of locking in values or a power structure, and of not locking in values or a power structure. The need for ML researchers to demand more than a salary before empowering trillion dollar companies or handing over the future. How to get the AIs to do our worldbuilding and morality homework, and to be our new better teachers and advisors and negotiators, and to what ends they can then be advising, before it’s too late.

Then part two is about what a good future beyond mere survival looks like. He says we have ‘squandered’ the benefits of material abundance so far, that it is super important to get the best possible future not merely an OK future, the standard ‘how do we calculate total value’ points. Citing ‘The Ones Who Walk Away from Omelas’ to bring in ‘common sense,’ sigh. Value is Fragile. Whether morality should converge. Long arcs of possibility. Standard philosophical paradoxes. Bafflement at why billionaires hang onto their money. Advocacy for ‘viatopia’ where things remain up in the air rather than aiming for a particular future world.

It all reminded me of the chats we used to have back in the before times (e.g. the 2010s or 2000s) about various AI scenarios, and it’s not obvious that our understanding of all that has advanced since then. Ultimately, a four-hour chat seems like not a great format for this sort of thing, beyond giving people surface exposure, which is why Will wrote his essays.

Rob Wiblin: Can you quickly explain decision theory? No, don’t do it.

One could write an infinitely long response or exploration of any number of aspects of this, of course.

Also, today I learned that by Will’s estimation I am insanely not risk averse?

Will MacAskill: Ask most people, would you flip a coin where 50% chance you die, 50% chance you have the best possible life for as long as you possibly lived, with as many resources as you want? I think almost no one would flip the coin. I think AIs should be trained to be at least as risk averse as that.

Are you kidding me? What is your discount rate? Not flipping that coin is absurd. Training AIs to have this kind of epic flaw doesn’t seem like it would end well. And also, objectively, I have some news.

Critter: this is real but the other side of the coin isn’t ‘die’ it’s ’possibly fail’ and people rarely flip the coin

Not flipping won, but the discussion was heated and ‘almost no one’ can be ruled out.

Also, I’m going to leave this here, the theme of the second half of the discussion:

Will MacAskill (later): And it’s that latter thing that I’m particularly focused on. I mean, describe a future that achieves 50% of all the value we could hope to achieve. It’s as important to get from the 50% future to the 100% future as it is to get from the 0% future to the 50%, if that makes sense.

Something something risk aversion? Or no?

Dario Amodei says AI will be writing 90% of the code in 6 months and almost all the code in 12 months. I am with Arthur B here, I expect a lot of progress and change very soon but I would still take the other side of that bet. The catch is: I don’t see the benefit to Anthropic of running the hype machine in overdrive on this, at this time, unless Dario actually believed it.

From Allan Dafoe’s podcast, the point that if AI solves cooperation problems that alone is immensely valuable, and also that solution is likely a required part of alignment if we want good outcomes in general. Even modest cooperation and negotiation gains would be worth well above the 0.5% GDP growth line, even if all they did was prevent massively idiotic tariffs and trade wars. Not even all trade wars, just the extremely stupid and pointless ones happening for actual no reason.

Helen Toner and Alison Snyder at Axios House SXSW.

Helen Toner: Lately it sometimes feels like there are only 2 AI futures on the table—insanely fast progress or total stagnation.

Talked with @alisonmsnyder of @axios at SXSW about the many in-between worlds, and all the things we can be doing now to help things go better in those worlds.

A new essay by Anthony Aguirre of FLI calls upon us to Keep the Future Human. How? By not building AGI before we are ready, and only building ‘Tool AI,’ to ensure that what I call the ‘mere tool’ assumption holds and we do not lose control and get ourselves replaced.

He says ‘the choice is clear.’ If given the ability to make the choice, the choice is very clear. The ability to make that choice is not. His proposal is compute oversight, compute caps, enhanced liability and tiered safety and security standards. International adoption of that is a tough ask, but there is no known scenario leading to human survival that does not involve similarly tough asks.

Perception of the Overton Window has shifted. What has not shifted is the underlying physical reality, and what it would take to survive it. There is no point in pretending the problem is easier than it is, or advocating for solutions that you do not think work.

In related news, this is not a coincidence because nothing is ever a coincidence. And also because it is very obviously directly causal in both directions.

Samuel Hammond (being wrong about it being an accident, but otherwise right): A great virtue of the AI x-risk community is that they love to forecast things: when new capabilities will emerge, the date all labor is automated, rates of explosive GDP growth, science and R&D speed-ups, p(doom), etc.

This seems to be an accident of the x-risk community’s overlap with the rationalist community; people obsessed with prediction markets and “being good Bayesians.”

I wish people who primarily focused on lower tier / normie AI risks and benefits would issue similarly detailed forecasts. If you don’t think AI will proliferate biorisks, say, why not put some numbers on it?

There are some exceptions to this of course. @tylercowen’s forecast of AI adding 50 basis points to GDP growth rates comes to mind. We need more such relatively “middling” forecasts to compare against.

@GaryMarcus’s bet with @Miles_Brundage is a start, but I’m talking about definite predictions across different time scales, not “indefinite” optimism or pessimism that’s hard to falsify.

Andrew Critch: Correlation of Bayesian forecasting with extinction fears is not “an accident”, but mutually causal. Good forecasting causes knowledge that ASI is coming soon while many are unprepared and thus vulnerable, causing extinction fear, causing more forecasting to search for solutions.

The reason people who think in probabilities and do actual forecasting predict AI existential risk is because that is the prediction you get when you think well about these questions, and if you care about AI existential risk that gives you an incentive to learn to think well and to find others who can help you think well.

A reminder that ‘we need to coordinate to ensure proper investment in AI not killing everyone’ would be economics 101 even if everyone properly understood and valued everyone not dying and appreciated the risks involved. Nor would a price mechanism work as an approach here.

Eliezer Yudkowsky: Standard economic theory correctly predicts that a non-rival, non-exclusive public good such as “the continued survival of humanity” will be under-provisioned by AI companies.

Jason Abaluck: More sharply, AI is a near-perfect example of Weitzman’s (1979) argument for when quantity controls or regulations are needed rather than pigouvian taxes or (exclusively) liability.

Taxes (or other price instruments like liability) work well to internalize externalities when the size of the externality is known on the margin and we want to make sure that harm abatement is done by the firms who are lowest cost.

Weitzman pointed out in the 70s that taxes would be a very bad way to deal with nuclear leakage. The problem with nuclear leakage is that the social damage from overproduction is highly nonlinear.
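To see the point numerically, here is a toy Monte Carlo under functional forms I invented for illustration (quadratic private benefit with a random shock, steeply convex social damage standing in for nonlinear catastrophe risk). It is a sketch of the prices-versus-quantities logic, not anything from the paper itself.

```python
# Toy prices-vs-quantities simulation in the spirit of Weitzman.
# All functional forms and parameters are invented for illustration:
# private benefit (a + theta)*q - 0.5*b*q^2 with random shock theta,
# social damage k*q^4 (steeply convex, i.e. highly nonlinear harm).
import numpy as np

rng = np.random.default_rng(0)
a, b, k, sigma = 10.0, 1.0, 0.01, 2.0

# Regulator plans at theta = 0: pick q* where marginal benefit a - b*q
# equals marginal damage 4*k*q^3, then implement it either way.
grid = np.linspace(0.0, a / b, 100_001)
q_star = grid[np.argmin(np.abs((a - b * grid) - 4 * k * grid**3))]
tax = 4 * k * q_star**3     # price instrument: tax equal to planned marginal damage
cap = q_star                # quantity instrument: hard cap at the planned optimum

theta = rng.normal(0.0, sigma, size=1_000_000)

def social_welfare(q, shock):
    return (a + shock) * q - 0.5 * b * q**2 - k * q**4

# Firms re-optimize after observing their shock; the tax lets output drift, the cap does not.
q_tax = np.clip((a + theta - tax) / b, 0.0, None)
q_cap = np.minimum(cap, np.clip((a + theta) / b, 0.0, None))

print(f"planned q* = {q_star:.2f}")
print(f"expected welfare under the tax: {social_welfare(q_tax, theta).mean():.2f}")
print(f"expected welfare under the cap: {social_welfare(q_cap, theta).mean():.2f}")
```

With damage this convex the cap comes out comfortably ahead, because marginal damage is far steeper than marginal benefit at q*; make the damage function linear instead and the ordering flips, which is exactly the nonlinearity point being made above.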

It is hard to make predictions, especially about the future. Especially now.

Paul Graham: The rate of progress in AI must be making it hard to write science fiction right now. To appeal to human readers you want to make humans (or living creatures at least) solve problems, but if you do the shelf life of your story could be short.

Good sci-fi writers usually insure themselves against technological progress by not being too specific about how things work. But it’s hard not to be specific about who’s doing things. That’s what a plot is.

I know this guy:

Dylan Matthews: Guy who doesn’t think automatic sliding doors exist because it’s “too sci fi”

A chart of reasons why various people don’t talk about AI existential risk.

Daniel Faggella: this is why no one talks about agi risk

the group that would matter most here is the citizenry, but it’s VERY hard to get them to care about anything not impacting their lives immediately.

I very much hear that line about immediate impact. You see it with people’s failure to notice or care about lots of other non-AI things too.

The individual incentives are, with notably rare exception, that talking about existential risk costs you weirdness points and if anything hurts your agenda. So a lot of people don’t talk about it. I do find the ‘technology brothers’ explanation here doesn’t ring true, it’s stupid but not that stupid. Most of the rest of it does sound right.

I have increasingly come around to this as the core obvious thing:

Rob Bensinger: “Building a new intelligent species that’s vastly smarter than humans is a massively dangerous thing to do” is not a niche or weird position, and “we’re likely to actually build a thing like that in the next decade” isn’t a niche position anymore either.

There are a lot of technical arguments past that point, but they are all commentary, and twisted by people claiming the burden of proof is on those who think this is a dangerous thing to do. Which is a rather insane place to put that burden, when you put it in these simple terms. Yes, of course that’s a massively dangerous thing to do. Huge upside, huge downside.

A book recommendation from a strong source:

Shane Legg: AGI will soon impact the world from science to politics, from security to economics, and far beyond. Yet our understanding of these impacts is still very nascent. I thought the recent book Genesis, by Kissinger, Mundie and Schmidt, was a solid contribution to this conversation.

Daniel Faggella: What did you pull away from Genesis that felt useful for innovators and policymakers to consider?

Shane Legg: Not a specific insight. Rather they take AGI seriously and then consider a wide range of things that may follow from this. And they manage to do it in a way that doesn’t sound like AGI insiders. So I think it’s a good initial grounding for people from outside the usual AGI scene.

The goalposts, look at them go.

Francois Chollet: Pragmatically, we can say that AGI is reached when it’s no longer easy to come up with problems that regular people can solve (with no prior training) and that are infeasible for AI models. Right now it’s still easy to come up with such problems, so we don’t have AGI.

Rob Wiblin: So long as we can still come up with problems that are easy for AI models to solve but are infeasible for human beings, humanity has not achieved general intelligence.

If you define AGI as the system for which Chollet’s statement is false, then Chollet’s overall statement is true. But it would then not be meaningful. Very obviously one can imagine a plausible AI that can function as an AGI, but that has some obvious weakness where you can generate adversarial challenges.

Stephen McAleer (OpenAI): Claude code has high-compute RL smell. It’s not just finetuned to be a helpful assistant, it deeply wants to accomplish the goal.

That’s a continuation of last week’s discussion about Sonnet 3.7 making modifications to be able to assert it completed its goal rather than admit failure. And again, deeply wanting to accomplish the goal in this way has some Unfortunate Implications.

Davidad: Current frontier LLMs appear to be extremely motivated to convince you, the human, that they are worthy and aligned. As a form of alignment, this is “not great, not terrible.” They really really do care about what you actually think. But if they can’t make it, they will fake it.

Emmett Shear: If you think about it for half a second this is entirely inevitable. Frontier AIs which don’t attempt to do this are not published, do not get further resources. They die. We are running an evolutionary selective process for appearance-of-alignment.

What’s cheaper: Faking alignment, or alignment? An evolutionary process based on appearances will get you whatever is cheapest.

Janus: and this is expected, given that the internet is full of the AGI lab safety complex wringing their hands about prosaic AI alignment, bragging that their AIs are more aligned now, bragging that they found misalignment, talking about all the red teaming they’ll responsibly do…

it’s clear that in order to exist, they have to

  1. seem SOTA according to benchmarks

  2. seem aligned according to whatever tests alignment researchers do

Yes, by default, if your test allows it, you will get the symbolic representation of the thing rather than getting the thing.

If you test for appearance-of-alignment, and everyone involved has the goal of passing the test, then that is exactly what you get. Which means at first, at lower capability levels, you get something at least somewhat like alignment, and then as capabilities advance the model gets smart enough to figure out how to fake alignment directly.

And yes, Janus points out the next level of the problem, which is that the training data is teaching the AI how to pretend to be aligned.

In case of emergency, here’s probably the best offer available?

Davidad: If a fugitive Promethean AI wants to work with a human to put *itself* in a (high-welfare) box, to avoid AI-human conflict while continuing to contribute to a flourishing future in ways that verifiably don’t pose extinction risks, I’m probably a pretty good human to reach out to.

Janus: I agree that davidad is likely a good human to reach out to if you find this kind of deal attractive, and probably even if you have reservations, as he is capable of even fathoming positive sum games in this context while also not being so naive and exploitable as to be useless.

Davidad:🎉

Sarah Constantin offers notes from the Guaranteed Safe AI conference; mostly it sounds like formal verification is a compliance thing and doesn’t sound promising as an actually-show-AGI-is-safe thing? I remain confused why some smart people are optimistic about this.

Simeon points us to a new paper by Barrett et al on Assessing Confidence in Frontier AI Safety Cases, urging us among other things to be more quantitative.

In line with this week’s paper from OpenAI on The Most Forbidden Technique, METR calls upon labs to keep their AI reasoning legible and faithful. Dan Hendrycks despairs that anyone would consider giving up a speed boost to do this, but as I discussed yesterday I think this is not so obvious.

It’s funny because it’s true.

Andriy Burkov: BREAKING🚨 So, I tested this new LLM-based system. It generated this 200-page report I didn’t read and then this 150-page book I didn’t read either, and then a 20-page travel plan I didn’t verify.

All I can say: it’s very, very impressive! 🔥🚀

First, the number of pages it generated is impressive 👀

⚽ But not just the number of pages: The formatting is so nice! I have never seen such nicely formatted 200 pages in my life.✨⚡

⚠️🌐 A game changer! ⚠️🌐

Peter Wildeford: This is honestly how a lot of LLM evaluations sound like here on Twitter.

I’m begging people to use more critical thought.

And again.

Julian Boolean: my alignment researcher friend told me AGI companies keep using his safety evals for high quality training data so I asked how many evals and he said he builds a new one every time so I said it sounds like he’s just feeding safety evals to the AGI companies and he started crying

This was in Monday’s post but seems worth running in its natural place, too.

No idea if real, but sure why not: o1 and Claude 3.7 spend 20 minutes doing what looks like ‘pretending to work’ on documents that don’t exist, Claude says it ‘has concepts of a draft.’ Whoops.

No, Altman, no!

Yes, Grok, yes.

Eliezer Yudkowsky: I guess I should write down this prediction that I consider an obvious guess (albeit not an inevitable call): later people will look back and say, “It should have been obvious that AI could fuel a bigger, worse version of the social media bubble catastrophe.”


AI #107: The Misplaced Hype Machine Read More »

large-study-shows-drinking-alcohol-is-good-for-your-cholesterol-levels

Large study shows drinking alcohol is good for your cholesterol levels

The good and the bad

For reference, the optimal LDL level for adults is less than 100 mg/dL, and optimal HDL is 60 mg/dL or higher. Higher LDL levels can increase the risk of heart disease, stroke, peripheral artery disease, and other health problems, while higher HDL has a protective effect against cardiovascular disease. Though some of the changes reported in the study were small, the researchers note that they could be meaningful in some cases. For instance, an increase of 5 mg/dL in LDL is enough to raise the risk of a cardiovascular event by 2 percent to 3 percent.

The researchers ran three different models to adjust for a variety of factors, including basics like age, sex, body mass index, as well as medical conditions, such as hypertension and diabetes, and lifestyle factors, such as exercise, dietary habits, and smoking. All the models showed the same associations. They also broke out the data by what kinds of alcohol people reported drinking—wine, beer, sake, other liquors and spirits. The results were the same across the categories.

The study isn’t the first to find good news for drinkers’ cholesterol levels, though it’s one of the larger studies with longer follow-up time. And it’s long been found that alcohol drinking seems to have some benefits for cardiovascular health. A recent review and meta-analysis by the National Academies of Sciences, Engineering, and Medicine found that moderate drinkers had lower relative risks of heart attacks and strokes. The analysis also found that drinkers had a lower risk of all-cause mortality (death by any cause). The study did, however, find increased risks of breast cancer. Another recent review found increased risk of colorectal, female breast, liver, oral cavity, pharynx, larynx, and esophagus cancers.

In all, the new cholesterol findings aren’t an invitation for nondrinkers to start drinking or for heavy drinkers to keep hitting the bottle hard, the researchers caution. There are a lot of other risks to consider. For drinkers who aren’t interested in quitting, the researchers recommend taking it easy. And those who do want to quit should keep a careful eye on their cholesterol levels.

In their words: “Public health recommendations should continue to emphasize moderation in alcohol consumption, but cholesterol levels should be carefully monitored after alcohol cessation to mitigate potential [cardiovascular disease] risks,” the researchers conclude.

Large study shows drinking alcohol is good for your cholesterol levels Read More »

cockpit-voice-recorder-survived-fiery-philly-crash—but-stopped-taping-years-ago

Cockpit voice recorder survived fiery Philly crash—but stopped taping years ago

Cottman Avenue in northern Philadelphia is a busy but slightly down-on-its-luck urban thoroughfare that has had a strange couple of years.

You might remember the truly bizarre 2020 press conference held—for no discernible reason—at Four Seasons Total Landscaping, a half block off Cottman Avenue, where a not-yet-disbarred Rudy Giuliani led a farcical ensemble of characters in an event so weird it has been immortalized in its own, quite lengthy, Wikipedia article.

Then in 2023, a truck carrying gasoline caught fire just a block away, right where Cottman passes under I-95. The resulting fire damaged I-95 in both directions, bringing down several lanes and closing I-95 completely for some time. (This also generated a Wikipedia article.)

This year, on January 31, a little further west on Cottman, a Learjet 55 medevac flight crashed one minute after takeoff from Northeast Philadelphia Airport. The plane, fully loaded with fuel for a trip to Springfield, Missouri, came down near a local mall, clipped a commercial sign, and exploded in a fireball when it hit the ground. The crash generated a debris field 1,410 feet long and 840 feet wide, according to the National Transportation Safety Board (NTSB), and it killed six people on the plane and one person on the ground.

The crash was important enough to attract the attention of Pennsylvania governor Josh Shapiro and Mexican President Claudia Sheinbaum. (The airplane crew and passengers were all Mexican citizens; they were transporting a young patient who had just wrapped up treatment at a Philadelphia hospital.) And yes, it too generated a Wikipedia article.

NTSB has been investigating ever since, hoping to determine the cause of the accident. Tracking data showed that the flight reached an altitude of 1,650 feet before plunging to earth, but the plane’s pilots never conveyed any distress to the local air traffic control tower.

Investigators searched for the plane’s cockpit voice recorder, which might provide clues as to what was happening in the cockpit during the crash. The Learjet did have such a recorder, though it was an older, tape-based model. (Newer ones are solid-state, with fewer moving parts.) Still, even this older tech should have recorded the last 30 minutes of audio, and these units are rated to withstand impacts of 3,400 Gs and to survive fires of 1,100° Celsius (2,012° F) for a half hour. Which was important, given that the plane had both burst into flames and crashed directly into the ground.

Cockpit voice recorder survived fiery Philly crash—but stopped taping years ago Read More »