No judge with Tesla stock should handle Elon Musk cases, watchdog argues

Elon Musk’s fight against Media Matters for America (MMFA)—a watchdog organization that he largely blames for an ad boycott that tanked Twitter/X’s revenue—has raised an interesting question about whether any judge owning Tesla stock might reasonably be considered biased when weighing any lawsuit centered on the tech billionaire.

In a court filing Monday, MMFA lawyers argued that “undisputed facts—including statements from Musk and Tesla—lay bare the interest Tesla shareholders have in this case.” According to the watchdog, any outcome in the litigation will likely impact Tesla’s finances, and that’s a problem because there’s a possibility that the judge in the case, Reed O’Connor, owns Tesla stock.

“X cannot dispute the public association between Musk—his persona, business practices, and public remarks—and the Tesla brand,” MMFA argued. “That association would lead a reasonable observer to ‘harbor doubts’ about whether a judge with a financial interest in Musk could impartially adjudicate this case.”

It’s still unclear if Judge O’Connor actually owns Tesla stock. But after MMFA’s legal team uncovered disclosures showing that he did as of last year, they argued that fact can only be clarified if the court views Tesla as a party with a “financial interest in the outcome of the case” under Texas law—“no matter how small.”

To make those facts clear, MMFA is now arguing that X must be ordered to add Tesla as an interested person in the litigation, which, a source familiar with the matter told Ars, would most likely lead to a recusal if O’Connor indeed still owned Tesla stock.

“At most, requiring X to disclose Tesla would suggest that judges owning stock in Tesla—the only publicly traded Musk entity—should recuse from future cases in which Musk himself is demonstrably central to the dispute,” MMFA argued.

Ars could not immediately reach X Corp’s lawyer for comment.

However, in X’s court filing opposing the motion to add Tesla as an interested person, X insisted that “Tesla is not a party to this case and has no interest in the subject matter of the litigation, as the business relationships at issue concern only X Corp.’s contracts with X’s advertisers.”

Calling MMFA’s motion “meritless,” X accused MMFA of strategizing to get Judge O’Connor disqualified in order to go “forum shopping” after MMFA received “adverse rulings” on motions to stay discovery and dismiss the case.

As to the question of whether any judge owning Tesla stock might be considered impartial in weighing Musk-centric cases, X argued that Judge O’Connor was just as duty-bound to reject an improper motion for recusal, should MMFA go that route, as he was to accept a proper motion.

“Courts are ‘reluctant to fashion a rule requiring judges to recuse themselves from all cases that might remotely affect nonparty companies in which they own stock,'” X argued.

Recently, judges have recused themselves from cases involving Musk without explaining why. In November, a prior judge in this very same Media Matters suit mysteriously recused himself, with The Hill reporting that it was likely that the judge’s “impartiality might reasonably be questioned” for reasons like a financial interest or personal bias. Then in June, another judge disqualified himself from ruling on a severance lawsuit brought by former Twitter executives without giving “a specific reason,” Bloomberg Law reported.

Should another recusal come in the MMFA lawsuit, it would be a rare example of a judge clearly disclosing a financial interest in a Musk case.

“The straightforward question is whether Musk’s statements and behavior relevant to this case affect Tesla’s stock price, not whether they are the only factor that affects it,” MMFA argued. “At the very least, there is a serious question about whether Musk’s highly unusual management practices mean Tesla must be disclosed as an interested party.”

Parties expect a ruling on MMFA’s motion in the coming weeks.

Google’s Play Store wants to pivot from grab-and-go to an active destination

It’s still a store, just with a different product —

If multi-app shopping doesn’t keep you there, maybe free Pixel gear will.

I like the idea of clicking “Realistic,” “MMORPG,” and “Word” boxes, just to see what comes back.

Google Play is a lot of things—perhaps too many things for those who just want to install some apps. If that’s how you feel, you might find “Google Play’s next chapter” a bit bewildering, as Google hopes to make it “more than a store.” Or you might start thinking about how to turn Play Points into a future Pixel phone.

Google Play’s “new way to Play.”

In a blog post about “How we’re evolving Google Play,” VP and General Manager of Google Play Sam Bright outlines the big changes to Google Play:

  • AI-generated app reviews and summaries, along with app comparisons
  • “Curated spaces” for interests, showing content from apps related to one topic (like cricket or Japanese comics)
  • Game recommendations based on genres and features you select
  • Google Play Games on PC picking up where you left off in games played on mobile, with the ability to soon run multiple titles at the same time on desktop
  • Chances for Play Points members at the Diamond, Platinum, or Gold levels to win Pixel devices, Razer gaming products, and other gear, along with other game and access perks

Those are the upgrades to existing Play features. The big new thing is Collections, which, like the “curated spaces,” takes content from apps you already have installed and organizes it around broad categories. I spotted “Watch,” “Listen,” “Read,” “Games,” “Social,” “Shop,” and “Food” in Google’s animated example. You can toggle which individual apps feed into Collections in the settings.

It’s hard not to consider Google Play’s new focus on having users actively express their interests in certain topics and do their shopping inside a fully Google-ized space in light of the timing of yesterday’s announcement regarding third-party cookies. Maybe that connection isn’t apparent right off, but bear with me.

The Play Store is still contractually installed on the vast majority of Android devices, but competition and changes could be coming following Google’s loss to Epic in an antitrust trial and proposed remedies Google deeply dislikes. Meanwhile, the Play Store and Google’s alleged non-compliance with new regulations, like allowing developers to notify customers about payment options outside the store, are under investigation.

If the tide turns against tracking users across apps, websites, and stores, and if the Play Store becomes non-required for browsing and purchasing apps, it’s in Google’s interests to get people actively committing to things they want to see more about on their phone screens. It’s a version of what Chrome is doing with its Privacy Sandbox and its “Topics” that it can flag for advertisers. Google’s video for the new Play experience suggests “turning a sea of apps into a world of discovery.” The prompt “What are you interested in?” works for the parties on both ends of Google’s Play space.

Alexa had “no profit timeline,” cost Amazon $25 billion in 4 years

An Echo Dot smart speaker with Alexa’s blue light ring illuminated.

The Amazon business unit that focuses on Alexa-powered gadgets lost $25 billion between 2017 and 2021, The Wall Street Journal (WSJ) reported this week.

Amazon says it has sold more than 500 million Alexa-enabled devices, including Echo speakers, Kindle readers, Fire TV sets and streaming devices, and Blink and Ring smart home security cameras. But since debuting, Alexa, like other voice assistants, has struggled to make money. In late 2022, Business Insider reported that Alexa was set to lose $10 billion that year.

WSJ said it got the $25 billion figure from “internal documents” and that it wasn’t able to determine the Devices business’s losses before or after the shared time period.

“No profit timeline”

WSJ’s report claims to offer insight into how Devices was able to bleed so much money for so long.

For one, it seems like the business unit was allowed some wiggle room in terms of financial success in the interest of innovation and the potential for long-term gains. Someone the WSJ described as being “a former longtime Devices executive” said that when Alexa first started, Amazon’s gadgets team “didn’t have a profit timeline” when launching products.

Amazon is known to have sold Echo speakers for cheap or at a loss in the hopes of making money off Alexa later. In 2019, then-Amazon Devices SVP Dave Limp, who exited the company last year, told WSJ: “We don’t have to make money when we sell you the device.” WSJ noted that this strategy has applied to other unspecified Amazon devices, too.

People tend to use Alexa for free services, though, like checking the weather or the time, not making big purchases.

“We worried we’ve hired 10,000 people and we’ve built a smart timer,” said an anonymous person whom WSJ described as a “former senior employee.”

An Amazon spokesperson told WSJ that more than half of people with an Echo have shopped with it but wouldn’t provide more specifics. Per “former employees on the Alexa shopping team” that WSJ spoke with, however, the amount of shopping revenue tied to Alexa is insignificant.

In an emailed statement, an Amazon spokesperson told Ars Technica, in part:

Within Devices & Services, we’re focused on the value we create when customers use our services, not just when they buy our devices. Our Devices & Services organization has established numerous profitable businesses for Amazon and is well-positioned to continue doing so going forward.

Further hindering Alexa’s revenue, WSJ reported, are challenges in selling security and other services, as well as limits on ad sales, since ads annoy Alexa users.

Massive losses also didn’t seem to slow down product development. WSJ claimed the Devices business lost over $5 billion in 2018 yet still spent money developing the Astro consumer robot. That robot has yet to see general availability, while a business version is getting bricked just 10 months after release. Amazon Halo health trackers, which have also been bricked, and Luna game-streaming devices were also developed in 2019, when the hardware unit lost over $6 billion, per WSJ.

Amazon has laid off at least 19,000 workers since 2022, with the Devices division reportedly hit especially hard.

The first GPT-4-class AI model anyone can download has arrived: Llama 405B

A new llama emerges —

“Open source AI is the path forward,” says Mark Zuckerberg, misusing the term.

A red llama in a blue desert illustration based on a photo.

In the AI world, there’s a buzz in the air about a new AI language model released Tuesday by Meta: Llama 3.1 405B. The reason? It’s potentially the first time anyone can download a GPT-4-class large language model (LLM) for free and run it on their own hardware. You’ll still need some beefy hardware: Meta says it can run on a “single server node,” which isn’t desktop PC-grade equipment. But it’s a provocative shot across the bow of “closed” AI model vendors such as OpenAI and Anthropic.

“Llama 3.1 405B is the first openly available model that rivals the top AI models when it comes to state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation,” says Meta. Company CEO Mark Zuckerberg calls 405B “the first frontier-level open source AI model.”

In the AI industry, “frontier model” is a term for an AI system designed to push the boundaries of current capabilities. In this case, Meta is positioning 405B among the likes of the industry’s top AI models, such as OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.5 Pro.

A chart published by Meta suggests that 405B gets very close to matching the performance of GPT-4 Turbo, GPT-4o, and Claude 3.5 Sonnet in benchmarks like MMLU (undergraduate level knowledge), GSM8K (grade school math), and HumanEval (coding).

But as we’ve noted many times since March, these benchmarks aren’t necessarily scientifically sound, nor do they necessarily translate to the subjective experience of interacting with AI language models. In fact, this traditional slate of AI benchmarks is so generally useless to laypeople that even Meta’s PR department now just posts a few images of charts without trying to explain them in any detail.

A Meta-provided chart that shows Llama 3.1 405B benchmark results versus other major AI models.

We’ve instead found that measuring the subjective experience of using a conversational AI model (through what might be called “vibemarking”) on A/B leaderboards like Chatbot Arena is a better way to judge new LLMs. In the absence of Chatbot Arena data, Meta has provided the results of its own human evaluations of 405B’s outputs that seem to show Meta’s new model holding its own against GPT-4 Turbo and Claude 3.5 Sonnet.

A Meta-provided chart that shows how humans rated Llama 3.1 405B’s outputs compared to GPT-4 Turbo, GPT-4o, and Claude 3.5 Sonnet in Meta’s own studies.

Whatever the benchmarks, early word on the street (after the model leaked on 4chan yesterday) seems to match the claim that 405B is roughly equivalent to GPT-4. It took a lot of expensive computer training time to get there—and money, of which the social media giant has plenty to burn. Meta trained the 405B model on over 15 trillion tokens of training data scraped from the web (then parsed, filtered, and annotated by Llama 2), using more than 16,000 H100 GPUs.

So what’s with the 405B name? In this case, “405B” means 405 billion parameters, and parameters are numerical values that store trained information in a neural network. More parameters translate to a larger neural network powering the AI model, which generally (but not always) means more capability, such as better ability to make contextual connections between concepts. But larger-parameter models have a tradeoff in needing more computing power (AKA “compute”) to run.
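To put that in concrete terms, here is a rough, back-of-the-envelope sketch of how much memory the weights alone would occupy at common precisions. The arithmetic follows directly from the parameter count; the 80 GB-per-GPU figure is an illustrative assumption rather than Meta's published setup, and real deployments also need memory for activations and the KV cache.

```python
# Back-of-the-envelope estimate of the memory needed just to store Llama 3.1
# 405B's weights at different numeric precisions. Activation memory, KV cache,
# and framework overhead are ignored, so real requirements are higher. The
# 80 GB-per-GPU figure is an illustrative assumption, not Meta's setup.
PARAMS = 405e9  # 405 billion parameters

BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,  # 16-bit floats: 2 bytes per parameter
    "int8": 1.0,       # 8-bit quantization
    "int4": 0.5,       # 4-bit quantization
}

GPU_MEMORY_GB = 80  # e.g., one 80 GB accelerator

for precision, nbytes in BYTES_PER_PARAM.items():
    total_gb = PARAMS * nbytes / 1e9
    gpus_needed = -(-total_gb // GPU_MEMORY_GB)  # ceiling division
    print(f"{precision:>9}: ~{total_gb:,.0f} GB of weights "
          f"(at least {gpus_needed:.0f} x {GPU_MEMORY_GB} GB GPUs for weights alone)")
```

At 16-bit precision that works out to roughly 810 GB of weights, which is why even a quantized copy stays out of reach of a desktop PC and needs a multi-GPU server node.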

We’ve been expecting the release of a 400 billion-plus parameter model of the Llama 3 family since Meta gave word that it was training one in April, and today’s announcement isn’t just about the biggest member of the Llama 3 family: There’s an entirely new iteration of improved Llama models with the designation “Llama 3.1.” That includes upgraded versions of its smaller 8B and 70B models, which now feature multilingual support and an extended context length of 128,000 tokens (the “context length” is roughly the working memory capacity of the model, and “tokens” are chunks of data used by LLMs to process information).

Meta says that 405B is useful for long-form text summarization, multilingual conversational agents, and coding assistants and for creating synthetic data used to train future AI language models. Notably, that last use-case—allowing developers to use outputs from Llama models to improve other AI models—is now officially supported by Meta’s Llama 3.1 license for the first time.

Abusing the term “open source”

Llama 3.1 405B is an open-weights model, which means anyone can download the trained neural network files and run them or fine-tune them. That directly challenges a business model where companies like OpenAI keep the weights to themselves and instead monetize the model through subscription wrappers like ChatGPT or charge for access by the token through an API.

Fighting “closed” AI models is a big deal to Mark Zuckerberg, who simultaneously released a 2,300-word manifesto today on why the company believes in open releases of AI models, titled “Open Source AI Is the Path Forward.” More on the terminology in a minute. But briefly, he writes about the need for customizable AI models that offer user control and encourage better data security, higher cost-efficiency, and better future-proofing, as opposed to vendor-locked solutions.

All that sounds reasonable, but undermining your competitors using a model subsidized by a social media war chest is also an efficient way to play spoiler in a market where you might not always win with the most cutting-edge tech. That benefits Meta, Zuckerberg says, because he doesn’t want to get locked into a system where companies like his have to pay a toll to access AI capabilities, drawing comparisons to “taxes” Apple levies on developers through its App Store.

A screenshot of Mark Zuckerberg’s essay, “Open Source AI Is the Path Forward,” published on July 23, 2024.

So, about that “open source” term. As we first wrote in an update to our Llama 2 launch article a year ago, “open source” has a very particular meaning that has traditionally been defined by the Open Source Initiative. The AI industry has not yet settled on terminology for AI model releases that ship either code or weights with restrictions (such as Llama 3.1) or that ship without providing training data. We’ve been calling these releases “open weights” instead.

Unfortunately for terminology sticklers, Zuckerberg has now baked the erroneous “open source” label into the title of his potentially historic aforementioned essay on open AI releases, so fighting for the correct term in AI may be a losing battle. Still, his usage annoys people like independent AI researcher Simon Willison, who likes Zuckerberg’s essay otherwise.

“I see Zuck’s prominent misuse of ‘open source’ as a small-scale act of cultural vandalism,” Willison told Ars Technica. “Open source should have an agreed meaning. Abusing the term weakens that meaning which makes the term less generally useful, because if someone says ‘it’s open source,’ that no longer tells me anything useful. I have to then dig in and figure out what they’re actually talking about.”

The Llama 3.1 models are available for download through Meta’s own website and on Hugging Face. They both require providing contact information and agreeing to a license and an acceptable use policy, which means that Meta can technically legally pull the rug out from under your use of Llama 3.1 or its outputs at any time.

AT&T failed to test disastrous update that kicked all devices off network

A large AT&T logo seen on the outside of its corporate offices.

A government investigation has revealed more detail on the impact and causes of a recent AT&T outage that happened immediately after a botched network update. The nationwide outage on February 22, 2024, blocked over 92 million phone calls, including over 25,000 attempts to reach 911.

As described in more detail later in this article, the FCC criticized AT&T for not following best practices, which dictate “that network changes must be thoroughly tested, reviewed, and approved” before implementation. It took over 12 hours for AT&T to fully restore service.

“All voice and 5G data services for AT&T wireless customers were unavailable, affecting more than 125 million devices, blocking more than 92 million voice calls, and preventing more than 25,000 calls to 911 call centers,” the Federal Communications Commission said yesterday. The outage affected all 50 states as well as Washington, DC, Puerto Rico, and the US Virgin Islands.

The outage also cut off service to public safety users on the First Responder Network Authority (FirstNet), the FCC report said. “Voice and 5G data services were also unavailable to users from mobile virtual network operators (MVNOs) and other wireless customers who were roaming on AT&T Mobility’s network,” the FCC said.

An incorrect process

AT&T previously acknowledged that the mobile outage was caused by a botched update related to a network expansion. The “outage was caused by the application and execution of an incorrect process used as we were expanding our network, not a cyber attack,” AT&T said.

The FCC report said the nationwide outage began three minutes after “AT&T Mobility implemented a network change with an equipment configuration error.” This configuration error caused the AT&T network “to enter ‘protect mode’ to prevent impact to other services, disconnecting all devices from the network, and prompting a loss of voice and 5G data service for all wireless users.”

While the network change was rolled back within two hours, full service restoration “took at least 12 hours because AT&T Mobility’s device registration systems were overwhelmed with the high volume of requests for re-registration onto the network,” the FCC found.

Outage reveals deeper problems at AT&T

Although a configuration error was the immediate cause of the outage, the FCC investigation revealed various problems in AT&T’s processes that increased the likelihood of an outage and made recovery more difficult than it should have been. The FCC Public Safety and Homeland Security Bureau analyzed network outage reports and written responses submitted by AT&T and interviewed AT&T employees. The bureau’s report said:

The Bureau finds that the extensive scope and duration of this outage was the result of several factors, all attributable to AT&T Mobility, including a configuration error, a lack of adherence to AT&T Mobility’s internal procedures, a lack of peer review, a failure to adequately test after installation, inadequate laboratory testing, insufficient safeguards and controls to ensure approval of changes affecting the core network, a lack of controls to mitigate the effects of the outage once it began, and a variety of system issues that prolonged the outage once the configuration error had been remedied.

At 2:42 am CST on February 22, an AT&T “employee placed a new network element into its production network during a routine night maintenance window in order to expand network functionality and capacity,” the FCC said. The configuration “did not conform to AT&T’s established network element design and installation procedures, which require peer review.”

An adequate peer review should have prevented the network change from being approved and from being loaded onto the network, but this peer review did not take place, the FCC said. The configuration error was made by one employee, and the misconfigured network element was loaded onto the network by a second employee.

“The fact that the network change was loaded onto the AT&T Mobility network indicates that AT&T Mobility had insufficient oversight and controls in place to ensure that approval had occurred prior to loading,” the FCC said.

AT&T faces possible punishment

AT&T issued a statement saying it has “implemented changes to prevent what happened in February from occurring again. We fell short of the standards that we hold ourselves to, and we regret that we failed to meet the expectations of our customers and the public safety community.”

AT&T could eventually face some kind of punishment. The Public Safety and Homeland Security Bureau referred the matter to the FCC Enforcement Bureau for potential violations of FCC rules.

Verizon Wireless last month agreed to pay a $1,050,000 fine and implement a compliance plan because of a December 2022 outage in six states that lasted one hour and 44 minutes. The Verizon outage was similarly caused by a botched update, and the FCC investigation revealed systemic problems that made the company prone to such outages.

The Cruise Origin driverless pod is dead, GM tells investors

nobody take the wheel —

The driverless Origin is dead; instead, Cruise will use next-generation Bolt EVs.

As Cruise ramps up its robotaxi service, it won’t be in these cool-looking driverless pods, like this rendering of a Cruise Origin picking up passengers in San Francisco’s Castro district.

The Cruise Origin was definitely the least conventional of all the myriad vehicles that General Motors planned to build using its new Ultium battery platform. For starters, it wasn’t a pickup truck or SUV, unlike all the Ultium-based electric vehicles that have gone into production thus far. Instead, the Origin—meant for Cruise, GM’s robotaxi startup—was a true driverless pod design, a box on wheels with the front and rear seats facing each other and no steering wheel at all. But now the Origin is dead, GM said in a letter to investors today.

We saw the Origin in person in January 2020 at a flashy reveal event that was light on details. At the time, Cruise was targeting early 2022 to begin deploying Origins, a timeline that accounted for neither the pandemic nor the difficulty of actually developing autonomous vehicles.

By early 2022, Cruise was ready to petition the National Highway Traffic Safety Administration, asking permission to begin using Origins on the road. But 2023 was a bad year for the autonomous vehicle company, which had its operations in California suspended after a Cruise robotaxi ran over and then dragged a pedestrian in San Francisco.

The challenge of convincing NHTSA that such a radically different design should be given the OK proved too much for GM to bear, it told investors.

Instead of using Origins, Cruise will turn its attention to the next-generation Chevrolet Bolt, which, helpfully, will cost less per unit than the Origin. The next-gen Bolt is a revamp of Chevy’s popular compact EV that will move to the cheaper Ultium battery platform. The Bolt was GM’s bestselling EV but went out of production last year at the Orion Assembly plant in Michigan, which the automaker wanted to repurpose so it could build electric pickup trucks.

Those electric pickups are now on hold, postponed until mid-2026, GM says. Like Ford, GM appears to have miscalculated the appeal of expensive electric trucks, and as a result the company will not meet its originally stated ambition of building a million EVs in 2025.

Waymo is suing people who allegedly smashed and slashed its robotaxis

Waymo car is vandalized in San Francisco

The people of San Francisco haven’t always been kind to Waymo’s growing fleet of driverless taxis. The autonomous vehicles, which provide tens of thousands of rides each week, have been torched, stomped on, and verbally berated in recent months. Now Waymo is striking back—in the courts.

This month, the Silicon Valley company filed a pair of lawsuits, neither of which has been previously reported, that demand hundreds of thousands of dollars in damages from two alleged vandals. Waymo attorneys said in court papers that the alleged vandalism, which ruined dozens of tires and a tail end, is a significant threat to the company’s reputation. Riding in a vehicle in which the steering wheel swivels on its own can be scary enough. Having to worry about attackers allegedly targeting the rides could undermine Waymo’s ride-hailing business before it even gets past its earliest stage.

Waymo, which falls under the umbrella of Google parent Alphabet, operates a ride-hailing service in San Francisco, Phoenix, and Los Angeles that is comparable to Uber and Lyft except with sensors and software controlling the driving. While its cars haven’t contributed to any known deadly crashes, US regulators continue to probe their sometimes erratic driving. Waymo spokesperson Sandy Karp says the company always prioritizes safety and that the lawsuits reflect that strategy. She declined further comment for this story.

In a filing last week in the California Superior Court of San Francisco County, Waymo sued a Tesla Model 3 driver whom it alleges intentionally rear-ended one of its autonomous Jaguar crossovers. According to the suit, the driver, Konstantine Nikka-Sher Piterman, claimed in a post on X that “Waymo just rekt me” before going on to ask Tesla CEO Elon Musk for a job. The other lawsuit from this month, filed in the same court, targets Ronaile Burton, who allegedly slashed the tires of at least 19 Waymo vehicles. San Francisco prosecutors have filed criminal charges against her to which she has pleaded not guilty. A hearing is scheduled for Tuesday.

Burton’s public defender, Adam Birka-White, says in a statement that Burton “is someone in need of help and not jail” and that prosecutors continue “to prioritize punishing poor people at the behest of corporations, in this case involving a tech company that is under federal investigation for creating dangerous conditions on our streets.”

An attorney for Burton in the civil case hasn’t been named in court records, and Burton is currently in jail and couldn’t be reached for comment. Piterman didn’t respond to a voicemail, a LinkedIn message, and emails seeking comment. He hasn’t responded in court to the accusations.

Based on available records from courts in San Francisco and Phoenix, it appears that Waymo hasn’t previously filed similar lawsuits.

In the Tesla case, Piterman “unlawfully, maliciously, and intentionally” sped his car past a stop sign and into a Waymo car in San Francisco on March 19, according to the company’s suit. When the Waymo tried to pull over, Piterman allegedly drove the Tesla into the Waymo car again. He then allegedly entered the Waymo and later threatened a Waymo representative who responded to the scene in person. San Francisco police cited Piterman, according to the lawsuit. The police didn’t respond to WIRED’s request for comment.

Monthly Roundup #20: July 2024

It is monthly roundup time.

I invite readers who want to hang out and get lunch in NYC later this week to come on Thursday at Bhatti Indian Grill (27th and Lexington) at noon.

I plan to cover the UBI study in its own post soon.

I cover Nate Silver’s evisceration of the 538 presidential election model, because we cover probabilistic modeling and prediction markets here, but excluding any AI discussions I will continue to do my best to stay out of the actual politics.

Jeff Bezos’ rocket company Blue Origin files comment suggesting SpaceX Starship launches be capped due to ‘impact on local environment.’ This is a rather shameful thing for them to be doing, and not for the first time.

Alexey Guzey reverses course, realizes at 26 that he was a naive idiot at 20 and finds everything he wrote cringe and everything he did incompetent and Obama was too young. Except, no? None of that? Young Alexey did indeed, as he notes, successfully fund a bunch of science and inspire good thoughts and he stands by most of his work. Alas, now he is insufficiently confident to keep doing it and is in his words ‘terrified of old people.’ I think Alexey’s success came exactly because he saw people acting stupid and crazy and systems not working and did not then think ‘oh these old people must have their reasons,’ he instead said that’s stupid and crazy. Or he didn’t even notice that things were so stupid and crazy and tried to just… do stuff.

When I look back on the things I did when I was young and foolish and did not know any better, yeah, some huge mistakes, but also tons that would never have worked if I had known better.

Also, frankly, Alexey is failing to understand (as he is still only 26) how much cognitive and physical decline hits you, and how early. Your experience and wisdom and increased efficiency is fighting your decreasing clock speed and endurance and physical strength and an increasing set of problems. I could not, back then, have done what I am doing now. But I also could not, now, do what I did then, even if I lacked my current responsibilities. For example, by the end of the first day of a Magic tournament I am now completely wiped.

Google short urls are going to stop working. Patrick McKenzie suggests prediction markets on whether various Google services will survive. I’d do it if I was less lazy.

This is moot in some ways now that Biden has dropped out, but being wrong on the internet is always relevant when it impacts our epistemics and future models.

Nate Silver, who now writes Silver Bulletin and runs what used to be the old actually good 538 model, eviscerates the new 538 election model. The ‘new 538’ model had Biden projected to do better in Wisconsin and Ohio than either the fundamentals or his polls, which makes zero sense. It places very little weight on polls, which makes no sense. It has moved towards Biden recently, which makes even less sense. Texas is their third most likely tipping point state, it happens 9.8% of the time, wait what?

At best, Kelsey Piper’s description here is accurate.

Kelsey Piper: Nate Silver is slightly too polite to say it but my takeaway from his thoughtful post is that the 538 model is not usefully distinguishable from a rock with “incumbents win reelection more often than not” painted on it.

Gil: worse, I think Elliott’s modelling approach is probably something like max_(dem_chance) [incumbency advantage, polls, various other approaches].

Elliott’s model in 2020 was more bullish on Biden’s chances than Nate and in that case Trump was the incumbent and down in the polls.

Nate Silver (on Twitter): Sure, the Titanic might seem like it’s capsizing, but what you don’t understand is that the White Star Line has an extremely good track record according to our fundamentals model.

At worst, the model is bugged or incoherent, or a finger is on the scale. And given the debate over Biden stepping aside, this could have altered the outcome of the election. It still might have, if it delayed Biden’s withdrawal, although once you get anywhere near this far, ‘the Sunday after the RNC’ is actually kind of genius timing.

I have done a lot of modeling in my day. What Nate is doing here is what my culture used to refer to as ‘calling bullshit.’ I would work on a model and put together a spreadsheet. I’d hand it off to my partner, who would enter various numbers into the input boxes, and look at the outputs. Then we’d get on the phone and he’d call bullshit: He’d point out a comparison or output somewhere that did not make sense, that could not be right. Usually he’d be right, and we’d iterate until he could not do that anymore. Then we might, mind you I said might, have a good model.

Another thing you could have done was to look at the market, or now the market history, since ‘things may have changed by the time you read this’ indeed.

Thus, no, I do not need to read through complex Bayesian explanations of various modeling assumptions to know that the 538 forecast here is bonkers. If it produces bonkers outputs, then it is bonkers. If the topline number seemed bonkers, but all the internals made sense and the movements over time made sense and one could be walked through how that produces the final answer, that would be one thing.

But no, these outputs are simply flat out bonkers. The model does not much care about the things that matter most, it does not respond reasonably, it has outputs in places that were so pro-Biden as to look like bugs. Ignore such Obvious Nonsense.

It is also important because when they swap Biden out for Harris or another candidate, there is a good chance they will still make similar mistakes.

As noted above, I will continue to cover modeling and prediction markets, and tracking how the candidates relate to AI, and continue doing my best to avoid otherwise covering the election. You’ll get enough of that without me.

My current view of the market is that Harris is modestly cheap (undervalued) at current prices, but Trump is still the favorite, and we will learn a lot soon when we actually have polling under ‘it’s happening’ conditions.

Shame.

The beatings will continue until we have congestion pricing or a new governor.

We actually do want a 24-hour coffee shop and bookstore (with or without a cat, and 18 hours gets you 95% of the value), or the other nice things mentioned in the Josh Ellis thread here. We say we do, and in some ways we act like we do. We still don’t get these things, because our willingness to pay directly says otherwise.

There are many similar things that genuinely seem to make our lives way better, that warm our hearts by their mere existence and optionality. That people actively want to provide, if they could. Yet they are hard to find, because they cannot pay the rent.

You can have your quaint bookstore, on one condition, which is paying a lot more, directly, for some combination of a membership, the books and the coffee.

Instead, we are willing to pay quite a lot more for the house three blocks from the bookstore, because we recognize its value. But if the bookstore charged us half that money directly, we would refuse to pay. It ruins the thing. So the owners of land get rich and the bookstore gets driven out.

I have to remind myself of this constantly. I pay a lot in fixed costs to live in a place I love, including the extra taxes. Then I constantly have the urge to be stingy about actually paying for many of the things that make me want to live here. It is really hard not to do this.

Magic players drive this point home. You plan for a month, pay hundreds for cards, pay hundreds for the plane ticket and hundreds more for the hotel, work to qualify and train, in a real sense this is what you live for… and then complain about the outrageous $100 entry fee or convention fee.

This is so much of why we cannot have nice things. It is not that we do not have a willingness to pay in the form of having less money. It is that we think those things ‘should cost’ a smaller amount, so when they cost more, it ruins the thing. It is at core the same issue as not wanting to buy overpriced wires at the airport.

The CrowdStrike incident was covered on its own. These are other issues.

Least surprising headlines department: Identity-verifier used by Big Tech amid mandates has made personal data easily accessible to hackers.

AU10TIX told 404 Media that the incident was old and credentials were rescinded—but 404 Media found that the credentials still worked as of this month. After relaying that information, AU10TIX “then said it was decommissioning the relevant system, more than a year after the credentials were first exposed on Telegram.”

If you require age verification while promising to safeguard privacy, this will predictably carry a high risk of backfiring.

Call and text records covering nearly all AT&T customers’ interactions in 2022 were breached. The data has now been leaked to an American hacker in Turkey. It includes every interaction those customers made and all the phone numbers involved. Recall that in March 2024, data from 73 million AT&T accounts leaked to the dark web. So yes, we need to lock down the frontier AI labs yesterday.

Beware the laptop trap.

Samo Burja: When I first saw the laptop practice in San Francisco I assumed people worked with laptops in cafes because their houses were crowded with too many roommates to save on rent and offices to save on startup runway.

I had no idea people in LA and NYC did this too.

Unless you’re in San Francisco I don’t think your laptop work is adding to GDP. Use cafes to meet friends.

Marko Jukic: European cafes are 100% right to ban “coworking” i.e. staring silently at my electronic device screen for hours on end while pretending to work and taking up space in a public place intended for relaxation and socializing.

Don’t let Americans turn the cafe bar into an office!

The picture on the right above depicts a hellish anti-social prison-like atmosphere. In a cafe, I want to hear music, conversation, laughter, and the football game.

It’s a CAFE, not a library, not an office, not a university lecture hall. Leave your laptop at home.

Americans will complain endlessly how America lacks “third spaces” and enjoyable public life but then like the idea of turning European cafes into sterile workspaces where professional laptop-typers sit in silent rows avoiding eye contact pretending to do important work.

Levelsio: The difference between European and American cafes is so stark

In Europe many don’t allow laptops anymore

In America they usually do and people are working on something cool!

I am with the French here. The cafe is there to be a cafe. If you want to work, you can go to the office, and seriously don’t do it on a laptop, you fool. I do not care if you are in San Francisco.

Marko Jukic claims that what distinguishes others from ‘normies’ is mainly not that normies are insufficiently intelligent, but that normies have astounding and incurable cowardice, especially intellectual cowardice but also around risk taking in life in general.

Marko Jukic: Spending time with our young elites at university, in Silicon Valley, etc. I never got the impression that intelligence was lacking. Far from it. What was lacking was everything else necessary to use that intelligence for noble and useful ends. In a way this is much worse.

Actually practicing personal loyalty, principled self-sacrifice, or critical thinking in a way that isn’t camera-ready is not just uncommon or frowned-upon but will get you treated like a deranged, dangerous serial killer by average cowards. It’s actually that bad these days.

To return to the original point, thinking your own thoughts is barely a drop in the bucket of courage. But most don’t even have that drop. Important to keep that in mind when you model society, social technology, reforms, and “the public” or “the normies” or whatever.

We are certainly ‘teaching cowardice’ in many forms as a central culture increasingly over time. It is a major problem. It is also an opportunity. I do not buy the part where having courage gets you attacked. It is not celebrated as much as it used to be, this is true. And there are places where people will indeed turn on you for it, either if you make the wrong move or in general. However, that is a great sign that you want to be in different places.

Note that even in places where rare forms of courage are actively celebrated, such as in the startup community, there are other ways in which being the ‘wrong kind of’ courageous and not ‘getting with the program’ will get you that same treatment, as someone not to be allied with. The principle is almost never properly generalized.

To answer Roon’s request here: No.

Mark Carnegie: If you don’t think this is a crisis i don’t know what to say to you.

Roon: cmon man now adjust the graph with the amount of time people spend texting or in their GCs.

Suhail: Yeah, we’re more connected, not less connected.

No. We really, really aren’t more connected. No, time spent texting or especially in ‘group chats’ is not a substitute to time spent with friends. Indeed, the very fact that people sometimes think it is a substitute is more evidence of the problem. Is it something at all? Yes. It is not remotely the same thing.

Tyler Cowen asks, what is the greatest outright mistake by smart, intelligent people, in contrast to disagreements.

His choice is (drum roll): attempting to forcibly lower prescription drug prices. Here’s the post in full.

Tyler Cowen: I am not referring to disagreements, I mean outright mistakes held by smart, intelligent people.  Let me turn over the microphone to Ariel Pakes, who may someday win a Nobel Prize:

Our calculations indicate that currently proposed U.S. policies to reduce pharmaceutical prices, though particularly beneficial for low-income and elderly populations, could dramatically reduce firms’ investment in highly welfare-improving R&D. The U.S. subsidizes the worldwide pharmaceutical market. One reason is U.S. prices are higher than elsewhere.

Tyler Cowen: That is from his new NBER working paper.  That is supply-side progressivism at work, but shorn of the anti-corporate mood affiliation.

I do not believe we should cancel those who want to regulate down prices on pharmaceuticals, even though likely they will kill millions over time, at least to the extent they succeed.  (Supply is elastic!)  But if we can like them, tolerate them, indeed welcome them into the intellectual community, we should be nice to others as well.  Because the faults of the others probably are less bad than those who wish to regulate down the prices of U.S. pharmaceuticals.

Please note you can favor larger government subsidies for drug R&D, and still not want to see those prices lowered.

He has amusingly gone on to compare those making this mistake to ‘supervillains.’

A lot of people thought this was all rather absurd. The greatest mistake is failure to choose to vastly systematically overpay for something while everyone else gets it dirt cheap, because otherwise future investment would be reduced?

I think this points to what may actually be the gravest genuine mistake, which is:

Causal Decision Theory!

As in, you base your decision on what has the best consequences, rather than choosing (as best you can) the decision algorithm with the best consequences after considering every decision (past, present and future, yours and otherwise) that correlates with your decision now.

Alternatively, you could view it as the desire to force prices to appear fair, the instinct against gouging, which is also involved and likely a top 10 pick.

The debate over pharma prices is indeed a great example of how this messes people up.

Everyone else except America is defecting, refusing to pay their fair share to justify the public good of Pharma R&D. One response is that this sucks, but America needs to step up all the more. Another is that if people can defect without punishment knowing others will pick up the slack then they keep doing so, indeed if you had not indicated this to them you would not be in this position now.

On top of that, you are paying off R&D that already happened in order to hold out the promise of reward for R&D in the future (and to some extent to create necessary cash flow). Locally, you are better off doing what everyone else does, and forcibly lowering prices rather than artificially raising them like we do. But if corporations expect that in the future, they will cut R&D.

So everyone is threatening us, and we are paying, so they keep threatening and we keep paying, but also this gives us strong pharma R&D.

You could say on top of the burden being unfairly distributed this is a really dumb way to support pharma R&D, and we should instead do a first best solution like buying out patents. I would agree. Tyler would I presume say, doesn’t matter, because we won’t possibly do this first best solution big enough to work, it is not politically feasible. And I admit he’d probably be right about that.

Another aspect is, suppose a corporation puts you in a position where you can improve welfare, or prevent welfare loss, but to do so you have to pay the corporation a lot of money, although less than the welfare improvement. And they engineered that, knowing that you would pay up. Should you pay? That is importantly the wrong framing of the question; the right question is what your policy on paying should be. The policy should be that you pay to the extent that this means corporations go out to seek large welfare improvements, balanced against how much they seek to engineer private gains, including by holding back much of the welfare benefits.

A lot of situations come down to divide-the-pie, various forms of the dictator game – there is $100, Alice decides how to divide it, Bob accepts the division or everyone gets nothing. At what point does Bob accept an unfair division? If Bob demands an unfair (or fair!) division, and Alice believes Bob, at what point does Alice refuse? And so on.
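As a toy illustration of why the policy framing matters here, consider a minimal ultimatum-game sketch (made-up numbers, purely illustrative): a Bob who accepts any positive offer gets lowballed, while a Bob credibly committed to rejecting stingy splits does much better, so long as Alice believes the commitment and best-responds to it.

```python
# Toy ultimatum game: Alice splits $100, Bob accepts or everyone gets nothing.
# Bob's *policy* is a minimum acceptable offer; Alice, knowing that policy,
# offers the least Bob will take. Illustrative assumptions only.
def alice_offer(bob_min_accept: float, pie: float = 100.0) -> float:
    """Alice best-responds by offering the smallest amount Bob's policy accepts."""
    return min(bob_min_accept, pie)

def play(bob_min_accept: float, pie: float = 100.0) -> tuple[float, float]:
    offer = alice_offer(bob_min_accept, pie)
    if offer >= bob_min_accept:      # Bob's policy: accept iff offer >= threshold
        return pie - offer, offer    # (Alice's share, Bob's share)
    return 0.0, 0.0                  # rejection: everyone gets nothing

for threshold in [0.01, 30.0, 50.0]:
    alice, bob = play(threshold)
    print(f"Bob rejects anything below ${threshold:>5.2f}: "
          f"Alice ${alice:.2f}, Bob ${bob:.2f}")
```

The point of the sketch is that Bob’s payoff is determined by the policy he is known to follow, not by the decision that looks best once the offer is already on the table.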

Another way of putting a lot of this is: You can think of yourself or a given action, often, as effectively ‘moving last,’ where you know what everyone will do conditional on your action. That does not mean you must or should do whatever gives you the best payoff going forward, because it is very easy to exploit those with such a policy.

What does that imply about the motivating example? I think the answer is a lot less obvious or clean than Tyler thinks it is, even if you buy (as I mostly buy) the high value of future marginal pharma R&D.

Next up we have another reason you need functional decision theory.

Agenda setting is powerful when you model everyone else as using naïve Causal Decision Theory. If you get to propose a series of changes to be voted upon, you can in theory with enough steps get anything you want.

We model legislative decision-making with an agenda setter who can propose policies sequentially, tailoring each proposal to the status quo that prevails after prior votes. Voters are sophisticated and the agenda setter cannot commit to future proposals.

Nevertheless, the agenda setter obtains her favorite outcome in every equilibrium regardless of the initial default policy. Central to our results is a new condition on preferences, manipulability, that holds in rich policy spaces, including spatial settings and distribution problems. Our findings therefore establish that, despite the sophistication of voters and the absence of commitment power, the agenda setter is effectively a dictator.

Those voters do not sound terribly sophisticated. Rather, those voters sound profoundly unsophisticated.

Fool me once, shame on you. Fool me twice, can’t get fooled again.

An actually sophisticated voter would say that the agenda setter, if allowed to pass anything that is a marginal improvement for 51% of voters, effectively becomes a dictator. The proof is easy, you don’t need a paper – you could for example repeatedly propose to transfer $1 from 49% to the 51%, while always being part of the 51%, repeat until you have almost all the money, use that money periodically to buy other preferences.
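Here is a minimal sketch of that dynamic, with voters deliberately modeled as naive causal decision theorists who approve any proposal that pays them in the moment; the dollar amounts and coalition mechanics are illustrative assumptions, not the paper’s formal model.

```python
# Toy agenda-setter simulation: each round, the setter proposes taking $1 from
# each member of a 49-voter minority, bribing 50 other voters with $0.01 each,
# and keeping the remainder. Naive voters approve anything that pays them,
# so every proposal passes and the setter drains the electorate over time.
import random

N = 101                 # voters
wealth = [100.0] * N    # everyone starts with $100
SETTER = 0              # voter 0 controls the agenda

for _ in range(20000):
    others = list(range(1, N))
    random.shuffle(others)
    coalition = others[:50]                                  # bribed majority partners
    targets = [i for i in others[50:] if wealth[i] >= 1.0]   # minority that pays
    if not targets:
        break
    pot = float(len(targets))                                # $1 taken from each target
    gains = {i: 0.01 for i in coalition}
    gains[SETTER] = pot - 0.01 * len(coalition)              # setter pockets the rest
    yes_votes = sum(1 for i in range(N) if gains.get(i, -1.0) > 0)
    if 2 * yes_votes > N:                                    # simple majority passes it
        for i in targets:
            wealth[i] -= 1.0
        for i, g in gains.items():
            wealth[i] += g

print(f"Agenda setter ends with ${wealth[SETTER]:,.2f}")
print(f"Median other voter ends with ${sorted(wealth[1:])[50]:,.2f}")
```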

The thing is, a sophisticated voter would recognize what you were up to rather quickly. They would say ‘oh, this is a trick, I know that this benefits me on its face but I know where this leads.’ And a majority of them would start always voting no.

This is not merely a theoretical or ideal response. This is a case where economists and causal decision theorists and politicians look at regular people and call them ‘irrational’ for noticing such things and reacting accordingly. What’s the matter with Kansas?

This, from the agenda setter’s perspective, is the matter with Kansas. If you set the agenda to something that looks superficially good, but you having control of the agenda is bad, then I should vote down your agenda on principle, as you haven’t given me any other affordances.

That is not to say that the agenda setter is not powerful. Being the agenda setter is a big game. You do still have to maintain the public trust.

Roon weeps for the old Twitter. He blames the optimizations for engagement for ruining the kinds of communities and interactions that made Twitter great, reporting now his feed is filled with slop and he rarely discovers anything good, whereas good new discoveries used to be common.

I continue to be confused by all the people not strictly using the Following tab plus lists (or Tweetdeck), and letting the For You feed matter to them. Why do you do this thing? Also out of curiosity I checked my For You feed, and it’s almost all the same people I follow or have on my lists, except it includes some replies from them to others, and a small amount of very-high-view-count generic content. There’s no reason to use that feature, but it’s not a hellscape.

Roon: The beauty of twitter was the simcluster, where 90% of the tweets in my feed came from one of the many organic self-organizing communities i was part of. now it’s maybe 20%. I used to daily discover intelligent schizomaniacs, now they are diffuse among the slop.

Near: Human values are actually fully inconsistent with virality-maximizing algorithms. ‘But revealed preferences!’ as a take fully misunderstands coordination problems. Any society can be burnt to the ground with basic game theory and the right algorithm. We should strive for better.

I see Twitter as having net declined a modest amount for my purposes, but it still mostly seems fine if you are careful with how you use it.

I do think that Roon and Near are right that, if this were a sane civilization, Twitter would not be trying so hard to maximize engagement. It would be run as a public good and a public trust, or an investment in the long term. A place to encourage what makes it valuable, with the trust that this would be what matters over time. If it made less (or lost more) money that way, well, Elon Musk could afford it, and the reputational win would be worth the price.

If you want to improve your Twitter game, I found this from Nabeelqu to be good. Here is how I do things there. Here is Michael Nielson’s advice.

Your periodic reminder.

Brian Potter lays out the history of fusion, and the case for and against it being viable.

Scientists want to take more risks, and think science funding should generally take more risks. We need more ambitious projects. This paper points out a flaw in our funding mechanisms. The NIH, NSF and their counterparts make funding decisions by averaging peer review scores, whereas scientists say they would prefer to fund projects with more dissensus. This favors safe projects and makes it difficult to fund novel ideas. This is great news because it is relatively easy to fix by changing the aggregation function to put much less weight on negative reviews. Rule scientific ideas, like thinkers, in, not out.
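A toy example of what changing the aggregation function does (the scores and the top-k rule are made up for illustration, not NIH or NSF practice): plain averaging favors the uniformly decent proposal, while an aggregation that discards the most negative reviews lets the high-variance, high-upside proposal through.

```python
# Toy comparison of aggregation rules for peer review scores (higher is better).
# The "safe" proposal gets uniformly decent reviews; the "novel" one splits
# reviewers. Plain averaging penalizes dissensus; a top-k mean ignores the
# harshest reviews. All scores are made up for illustration.
from statistics import mean

proposals = {
    "safe incremental project": [6, 6, 7, 6, 6],
    "novel high-risk project":  [9, 9, 2, 8, 2],  # dissensus: two reviewers hate it
}

def top_k_mean(scores, k=3):
    # Keep only the k highest scores, discarding the most negative reviews.
    return mean(sorted(scores, reverse=True)[:k])

for name, scores in proposals.items():
    print(f"{name:26s} mean={mean(scores):.2f}  top-3 mean={top_k_mean(scores):.2f}")
```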

Does the Nobel Prize sabotage future work?

Abstract: To characterize the impact of major research awards on recipients’ subsequent work, we studied Nobel Prize winners in Chemistry, Physiology or Medicine, and Physics and MacArthur Fellows working in scientific fields.

Using a case-crossover design, we compared scientists’ citations, publications and citations-per-publication from work published in a 3-year pre-award period to their work published in a 3-year post-award period. Nobel Laureates and MacArthur Fellows received fewer citations for post- than for pre-award work. This was driven mostly by Nobel Laureates. Median decrease was 80.5 citations among Nobel Laureates (p = 0.004) and 2 among MacArthur Fellows (p = 0.857). Mid-career (42–57 years) and senior (greater than 57 years) researchers tended to earn fewer citations for post-award work.

Early career researchers (less than 42 years, typically MacArthur Fellows) tended to earn more, but the difference was non-significant. MacArthur Fellows (p = 0.001) but not Nobel Laureates (p = 0.180) had significantly more post-award publications. Both populations had significantly fewer post-award citations per paper (p = 0.043 for Nobel Laureates, 0.005 for MacArthur Fellows, and 0.0004 for combined population). If major research awards indeed fail to increase (and even decrease) recipients’ impact, one may need to reassess the purposes, criteria, and impacts of awards to improve the scientific enterprise.

Steve Sailer (in the MR comments): I had dinner with Physics Laureate Robert Wilson, who had with Arno Penzias discovered the origin of the universe, a few months after Wilson won the Nobel in 1978. He was very gracious and polite as he was feted by his alma mater, Rice U., but deep down inside he probably wished he could have been back at his observatory tinkering with his radio telescope rather than doing all this kind of unproductive socializing you have to do after winning the Nobel.

Crusader (MR comments): Who ever said that major awards are supposed to increase the recipient’s future impact regardless of its merit?

Are Olympic gold medals supposed to increase the performance of athletes afterwards? Is a research award not just a status game carrot meant to incentivize the “first success” as well as a signal to others to review the related research?

Quite so. If you get a Nobel Prize then suddenly you have a ton of social obligations. The point of the prize is to give people something to aspire to win, not to enable those who win one to then do superior work. Also, scientists who win are typically already old enough that their productivity has peaked.

It seems odd to think about a Nobel Prize as being primarily about enabling future work. Even to suggest it is a huge indictment of our academic system – if you are up for a Nobel Prize, why didn’t you already have whatever resources and research agenda you most wanted?

Should scientific misconduct be criminalized? The slippery slope dangers are obvious. Yet it seems a violation of both justice and incentives that Sylvain Lesné, whose deception wildly distorted Alzheimer’s research, killing many and wasting epic amounts of time and money, remains at large. Can we not simply charge him with fraud? If not, why the hell not?

Linch: Gender issues aside, it’s utterly bizarre to me that plagiarism is considered vastly worse among academics than faking data. It’s indicative pretty straightforwardly of rot imo, since it means the field as a whole cares more about credit attribution than about truth.

Paper asks how people decide who is correct when groups of scientists disagree. Here is the abstract.

Uncertainty that arises from disputes among scientists seems to foster public skepticism or noncompliance. Communication of potential cues to the relative performance of contending scientists might affect judgments of which position is likely more valid. We used actual scientific disputes—the nature of dark matter, sea level rise under climate change, and benefits and risks of marijuana—to assess Americans’ responses (n = 3150). Seven cues—replication, information quality, the majority position, degree source, experience, reference group support, and employer—were presented three cues at a time in a planned-missingness design. The most influential cues were majority vote, replication, information quality, and experience. Several potential moderators—topical engagement, prior attitudes, knowledge of science, and attitudes toward science—lacked even small effects on choice, but cues had the strongest effects for dark matter and weakest effects for marijuana, and general mistrust of scientists moderately attenuated top cues’ effects. Risk communicators can take these influential cues into account in understanding how laypeople respond to scientific disputes, and improving communication about such disputes.

The first sentence carries the odd implicit assumption that there is a ‘correct’ answer people should accept, and that failing to accept it constitutes skepticism or noncompliance. Then there’s describing various forms of Bayesian evidence as ‘cues,’ as opposed to considering the possibility that people are actually weighing the competing hypotheses. The role of risk communicator seems to assume they already know what others are supposed to believe during scientific disputes. How do we use the right messaging to ensure the official scientists get believed over the unofficial ones?

Here are the results; all seven factors mattered.

Majority vote, replication, information quality, and experience (where experience is defined as time doing this particular type of research), the most influential ‘cues,’ seem like excellent evidence to use, with majority vote and replication correctly treated as the most important.

The other three are reference group support, degree source and employer. These seem clearly less good, although worth a non-zero amount. No, we should not rely too heavily on arguments from authority, and in particular not on arguments from association with authority.

Mistrust of science only decreased impact sizes by about 27%.

Score one for the public all around.

One thing I love about the paper is in 2.4.7 they lay out their predictions for which factors will be most important and how impacts are expected to work. Kudos.

Here are the detailed descriptions of the questions and cues.

Cues have the strongest effect on dark matter, a case where regular people have little to go on and know it and where everyone has reason to be objective. Marijuana leaves room for the most practical considerations, so any cues are competing with other evidence and it makes sense they have less impact.

Via Robin Hanson, across six studies, communicators who take an absolute honesty stance (‘it is never okay to lie’) and then lie anyway are punished less than those who take a flexible honesty stance that reflects the same actual behavior.

The straightforward explanation is that it is better for people to endorse the correct moral principles and to strive to live up to them and fail, rather than not endorse them at all. This helps enforce the norm or at least weakens it less, on several levels, and predicts better adherence and an effort to do so. With the same observed honesty level, one predicts more honesty both in the past and the future from someone who at least doesn’t actively endorse lying.

One can also say this is dependent on the lab setting and lack of repeated interaction. In that model, in addition to the dynamics above, hypocrisy has short term benefits and long term costs. If you admit to being a liar, you pay a very large one-time cost, then pay a much smaller cost for your lies beyond that, perhaps almost zero. If you say you always tell the truth, then you pay a future cost for each lie, which only adds up over the course of a long period.

Certainly Trump is the avatar of the opposite strategy, of admitting you lie all the time and then lying all the time and paying very little marginal cost per lie.

In Bayesian terms, we estimate how often someone has lied to us and will lie in the future, and will punish them proportional to this, but also proportionally more if you take a particularly strong anti-lie stance. And also we reward or punish you for your estimated effort to not lie and to enforce and encourage good norms, by both means.

In both cases, if you are providing only a few additional bits of evidence on your true base rate, hypocrisy is the way to go. If discount rates are low and you’re going to be exposed fully either way, then meta-honesty might be the best policy.

One can also ask if honesty is an exception here, and perhaps the pattern is different on other virtues. If you are exposed as a liar, and thus exposed as a liar about whether you are a liar, how additionally mad can I really get there? How much does ‘hypocrite’ add to ‘liar,’ which arguably is strictly stronger as an accusation?

German marginal tax rates are a disaster and the poverty trap is gigantic.

The grey lines are euros per month. Orange is effective take-home pay. You essentially earn nothing by going from €25,800/year to €77,400/year, what the hell? And the median income, around €45k, sits right in the middle of that range.

It is not as extreme as it sounds, because the benefits you get are not fully fungible. To get them you need to be renting, and to get max value it needs to be in a relatively expensive city, whereas the actual cash benefit is only 500 euros a month, which isn’t much. But still, yikes. This has to be a recipe for massive voluntary unemployment and black market work. To the extent that it isn’t, it is the German character being bizarrely unable to solve for this particular equilibrium.

jmkd: The wikipedia article (in German) below suggests that ~15% of the German economy is in “undeclared work.” Admittedly using numbers from different time periods, that would be equivalent to roughly 1/4 of the population working minimum wage.

yo: It’s a household-level view for a family of four. Roughly, if this family has no income, it is eligible for Bürgergeld, €24k/y. Plus a rent subsidy worth about the same €24k/y in the big cities, plus health insurance worth around €15k/y for that family. So yes, average families can get roughly €70k net welfare. Note that a family of four with €70k income would not pay much in taxes. But it would pay around 20% of this pretax income in social charges (mostly pension contributions and health insurance).

Oye cariño, ¿quieres comprar algunos créditos porno? (Hey honey, want to buy some porn credits?) Spain unveils the Digital Wallet Beta, an app for internet platforms to check before letting you watch porn. The EU is giving all porn sites until 2027 to stop you from watching porn, forcing kids (by that point) to download AI porn generators instead. Or have their AI assistant purchase some of those porn credits from ‘enthusiasts.’

Gian Volpicelli (Politico): Officially (and drily) called the Digital Wallet Beta (Cartera Digital Beta), the app Madrid unveiled on Monday would allow internet platforms to check whether a prospective smut-watcher is over 18. Porn-viewers will be asked to use the app to verify their age. Once verified, they’ll receive 30 generated “porn credits” with a one-month validity granting them access to adult content. Enthusiasts will be able to request extra credits. 

While the tool has been criticized for its complexity, the government says the credit-based model is more privacy-friendly, ensuring that users’ online activities are not easily traceable.

While I oppose this on principle, all else being equal I do approve of it for the kids. You should have to work a bit for your porn, especially when you are young. I also like the VPN encouragement. The parts where various websites geoblock, adults get inconvenienced, and identification information inevitably gets stolen again as it was this past month? Those parts I do not like as much.

Should the UK use proportional representation? Tyler Cowen says no, because the UK needs bold action so it is good to give one party a decisive mandate even if they got only a third of the vote and essentially won because of game theory and a relatively united left. See what they can do, you can always vote them out again. He does not much care about the voters not actually wanting Labour to rule any more than they did before. The point of democracy, in his view, is as a check in case government gets too out of line (and presumably a source of legitimacy), rather than ensuring ‘fairness.’

The danger is an unfair system can damage those other goals too, and this seems like a lot of power to hand to those who get the upper hand in the game theory. Essentially everyone is locked in these ‘unite or die’ dilemmas constantly, as we are in America, except now there is an expectation that people might not unite. So I presume you need some form of runoff, approval or ranked choice voting. They are far from perfect, but so much less distortionary than actual first past the post rules when they fail to collapse into a two party system.

The FTC tried to ban almost all noncompetes, including retroactively. It is not terribly surprising that the courts objected. Judge Ada Brown issued a temporary block, finding that the FTC likely lacked the authority to make the rule, which seems like a very obviously correct observation to me.

Thom Lambert: Now that @FTC’s noncompete ban has been preliminarily enjoined (unsurprisingly), let’s think about some things the agency could do on noncompetes that are actually within its authority. It could, of course, bring challenges against unjustified noncompetes.

That would create some helpful precedent *and* allow the agency to amass expertise in identifying noncompetes that are unwarranted. (The agency implausibly claims that all but a very few noncompetes lack justification, but it has almost no experience with noncompete cases.)

It could also promulgate enforcement guidelines. If the guidelines really take account of the pros and cons of noncompetes (yes, there are pros) and fairly set forth how to separate the wheat from the chaff, they’ll have huge influence in the courts and on private parties.

These moves are admittedly not as splashy as a sweeping economy-wide ban, but they’re more likely to minimize error cost, and they’re within the agency’s authority. In the end, achievement matters more than activity.

This is the new reality even more than it was before.

  1. If you bring individual action against particular cases you can build up case law and examples.

  2. If you try to write a maximally broad rule, the courts are going to see to it you have a bad time.

There was a lot of talk about the overturning of Chevron, but there was another case that could also potentially be a big deal in making government work even less well. This is Ohio v. EPA, which is saying that if you ignore any issue raised in the public comments, then that can torpedo an entire project.

Robinson Meyer: Last week, the Court may have imposed a new and *extremely* high-scrutiny standard on how federal agencies respond to public comments. That will slow the EPA’s ability to write new rules, but it would also make NEPA even more arduous.

The EPA did respond to the comments at the center of the Ohio case, but Justice Neil Gorsuch, writing for the majority, decided the agency did not address a few specific concerns properly.

So the new procedure will be, presumably, to raise every objection possible, throw everything you can at the wall, then unless the government responds to each concern raised in each of the now thousands (or more) comments, you can challenge the entire action. And similarly, you can do the same thing with NEPA, making taking any action that much harder. Perhaps essentially impossible.

French elections produce unexpected seemingly disproportional results.

It is not as bad as it looks. NFP and Macron essentially (as I understand it) operated as one bloc, with whoever was behind dropping out in each local election, so effectively this is more like a party with 49.1% of the vote getting 325 seats to RN’s 37.4% and 142.

Claude estimates that if a similar result happened in America, the house would break down about 265-170, but our system is highly gerrymandered and the parties are geographically isolated. I don’t think 325-142 is that extreme here.

If you combined RN+LR+’Other Right’ then you would get 46% of the vote and only 208 seats with a 3.1% gap, which seems extreme. LR and Other Right did well in converting votes to seats in the second round, so they were likely not being dramatically unstrategic.

Similarly to the English results, one must ask to what extent we want strategic voting and negotiating between parties to determine who gets to rule.

New York City sets the minimum food delivery wage at $19.56 per hour, which in turn means intense competition for work preference during busy hours. It also means fees on every order, which many no doubt are responding to by not tipping. I strongly suspect most of this mostly cancels out and the services are still totally worth it.

New York City gets trash cans. You thought the day would never come. So did I. Before unveiling them, New York did a $4 million McKinsey study ‘to see if trash cans work’ and that is not the first best solution but it sure is second best.

Enguerrand VII de Coucy: Oh my god New York City paid McKinsey $4,000,000 to do a study on if trash cans work.

Prateek Joshi: Maybe the point was that the NYC govt wanted to tell its citizens “If you don’t start putting trash in trash bins, we’ll give more money to McKinsey.”

Enguerrand VII de Coucy: Honestly that’s a potent threat

Swann Marcus: In fairness, the end result of this McKinsey study was that New York started using trashcans. Most American cities would have spent $4 million on a trashcan study and then inexplicably never gotten trashcans.

Aaron Bergman: I am going to stake out my position as a trash can study defender. It probably makes sense to carefully study the effects of even a boring and intuitive policy change that affects ~10⁷ people

Mike Blume: It’s fun to rag on NYC for their incompetence in this area, but “where will the bins go” is an understudied problem on many American streets

Getting the details right here is very important. There are some cases where governments vastly overpay for stupid things, and I don’t think this is one of them.

In defense of the lost art of the filler episode. I strongly agree here. Not all shows should be 22 episodes a year, but many should be. It makes the highs mean more, and I love spending the extra time and taking things gradually.

What do we make of this list and also the rating type breakdown?

The recency bias is strong. There are way too many 2010s shows here. I do think that there was a quality upgrade around the 90s but still.

The drama bias is also strong. Comedies are great and deserve more respect.

It’s hard to get a good read on the relative rating systems. It does seem like too much weight was put on the votes.

How many of these have I seen enough to judge?

There are a bunch of edge cases but I would say 20.

Correctly or Reasonably Rated: The Wire (my #1), Breaking Bad (my #3 drama), The Office, It’s Always Sunny in Philadelphia, Mr. Robot (I have it lower but I can’t argue), Severance (so far, it’s still early), Seinfeld (you somewhat had to be there), Freaks and Geeks (if you don’t hold brevity against it).

Underrated: The Americans (my #2 drama), Deadwood

Decent Pick But Overrated: Chernobyl (miniseries don’t count, others are missing if they do, and even if you discount that it’s good but not this good), Game of Thrones (great times and should make the list but you can’t put it at #2 after the last few seasons, come on), Stranger Things (Worth It but #8?!), Battlestar Galactica (this is a bit generous), The Shield (I can maybe see it), Lost (oh what could have been).

Bad Pick: Friends (better than its rep in my circles but not a best-of), House (it’s fine but not special), True Detective (one very good season but then unwatchable and no time is not a flat circle), Black Mirror (not half as clever as it thinks, despite some great episodes), The Mandalorian (I stuck with it long enough to know it isn’t top 50 level great and wasn’t working for me, although it isn’t actively bad or anything).

Most Importantly Missing (that I know of and would defend as objective, starting with the best three comedies then no order): Community, The Good Place, Coupling (UK) (if that counts), Watchmen (if we are allowing Chernobyl this is the best miniseries I know), Ally McBeal, Angel and Buffy the Vampire Slayer (no, seriously, a recent rewatch confirms), Gilmore Girls, Roseanne, Star Trek: DS9 (I see the counterarguments but they’re wrong), How I Met Your Mother.

I wonder if you should count Law & Order. You kind of should, and kind of shouldn’t.

The other ~30 here I haven’t given enough of a chance to definitively judge. Many I hadn’t even heard about.

Does anyone have a better list?

Of the ones I didn’t mention, I’m open to the case being made. For The Sopranos and Better Call Saul, I watched a few episodes and realized they were objectively very good but thought ‘I do not want to watch this.’ Or in particular, the show is great but I do not want to watch these people. A bunch of others here seem similar?

I can overcome that, but it is hard. Breaking Bad is not something I wanted to watch, in many important senses, but it was too good not to, and Walter White breaks bad but does not have that ‘I can’t even with this guy.’

Scott Sumner has his films of 2024 Q2. He put Challengers at only 2.6/4, whereas I have Challengers at 4.5/5, which provides insight into what he cares about. From the description he was clearly on tilt that day. Also I strongly suspect he simply does not get the characters involved, and, finding them unlikeable, did not seek to get them. It is the first time I’ve seen his rating and said not ‘you rated this differently than I would because we measure different things’ but rather ‘no, you are wrong.’

My movie log and reviews continue to be at Letterboxd. I’ve moved more towards movies over television and haven’t started a new TV series in months.

The official EA song should be: Okay, full disclosure. We’re not that great. But nevertheless, you suck.

Economeager: As you know i do not identify with EAs as a culture despite my great support for givewell, open phil, etc. However when I meet someone who gives misguided and ineffective charity for purely emotional reasons I do have like a palpatine kermit moment with myself.

Never mind I saw the EA guys getting hyped to think about how “the economy” will work “after AGI” and hate everyone equally again.

Andy Masley: I was on the fence about getting more involved in EA a few years ago and then in my old job was exposed to a charity where people read stories over Zoom to dogs.

When given $10,000 to spend however they wanted, people spent the majority of it on pro-social things that benefited others, and almost 17% went to charities outright. This seems like a missed opportunity to provide more details about what types of things the money was spent on; we can study multiple things at once. Public posting of spending choices on Twitter had little impact on the distribution of purchases.

I didn’t get a chance to pre-register my expectations here, nor do I have a good sense of exactly what counts as ‘pro social’ versus not. The idea that people, when given a windfall, spread it around somewhat generously, seems obvious. Windfalls are treated by most people as distinct from non-windfall decisions; the money is ‘not yours,’ not part of your typical planning, and is often largely wasted or bestowed generously in a way that ‘core’ income is not. It is an opportunity to affirm your bonds to the community and good character and not present a target, and the money fails to ‘feel real.’ I do find it strange that public info did not at all impact decisions, which makes me suspect that such decisions were treated as effectively equally public either way in practice.

Johns Hopkins Medical School goes tuition-free for medical students due to a massive grant, and also expands aid for future nurses and public health pioneers. Nikhil Krishnan speculates that more places will end up doing this, and correctly notices this is not actually good.

The choke point is residency slots. It would not be my first pick for charity dollars, but I think that ‘give money to endow additional residency slots at hospitals that agree to play ball’ would be a highly understandable choice. Whereas ‘make future doctors that will mostly earn a lot of money have less student debt’ does not make sense. Yes, you can potentially improve applicant quality a bit, but not much. Whatever your goal, unless it is ‘glory to this particular program,’ you can do it better.

You can use 1Password to populate environment variables in CLI scripts, so you can keep your API keys in your password manager; also there is a fly.io plugin.
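For anyone curious what that looks like in practice, here is a minimal sketch assuming the 1Password CLI’s `op run` workflow; the vault, item, and variable names are hypothetical:

```python
# Run this via the 1Password CLI so the secret is injected at launch, e.g.:
#   .env file:  OPENAI_API_KEY="op://Private/OpenAI/api-key"   (secret reference)
#   launch:     op run --env-file=.env -- python my_script.py
# The key then only exists as an environment variable for this process;
# nothing sits in plaintext in the repo or in shell history.
import os
import sys

api_key = os.environ.get("OPENAI_API_KEY")
if not api_key:
    sys.exit("OPENAI_API_KEY not set; did you launch via `op run`?")

print(f"Loaded API key ending in ...{api_key[-4:]}")
```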

Arnold Ventures is hiring for its infrastructure team.

How to write for Works in Progress.

Pick your neighborhood carefully, not only your city.

Phil: So, the first thing I think of is that you’re going to spend 1000x more time in your surrounding 5 blocks than you will in any other neighborhood in your city. And so thinking about all the things that New York City or next city has, is to me a lot less important than thinking about the things within the five blocks where you live. Most neighborhoods in your city you might never step foot in, they might as well be in the other side of the country. But the things in your immediate vicinity are the things that are going to dominate your life. So picking and influencing your neighborhood is really important. And the two big ways you can influence your neighborhood are one, determining who lives in your neighborhood by moving people there, something I am very biased on because I work on it. And two, improving your neighborhood.

As a New Yorker, I definitely will walk more than five blocks more than 5% of the time. For example, my favorite and most frequented restaurant is 7 blocks away. The point very much still stands. My friend Seth uses the rule of thumb that value is proportional to the inverse square of travel time, which again goes too far but is directionally right.
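As a toy worked example of that rule of thumb (made-up travel times, and taking the inverse square literally):

```python
# Value of a destination ~ 1 / travel_time^2, normalized to a 10-minute trip.
def relative_value(minutes):
    return 1 / minutes ** 2

for minutes in (2, 5, 10, 20):
    ratio = relative_value(minutes) / relative_value(10)
    print(f"{minutes:>2} min away: {ratio:.2f}x the value of 10 min away")
# 2 min -> 25x, 5 min -> 4x, 20 min -> 0.25x.
```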

Concert goers who consumed more alcohol were less likely to choose pro-social options in experimental economic games. Does not seem to distinguish between cooperators being more sober, versus sobriety leading to cooperation. Both seem plausible. One more reason not to drink.

Little effect is found of siblings on attitudes towards inequality. This study says more about current academic pressures and biases than it does about anything else.

Paper says that despite the narrative of democratic backsliding, objective measures such as electoral competitiveness, executive constraints and media freedom show no evidence of (net) backsliding.

Those with higher IQ scores shoot firearms more accurately. I did not expect that. The real intelligence is never needing to shoot and never getting shot. I bet those correlate too.

Your enemies probably have more enemies than you do. Unfortunately, on the same principle, you probably have fewer friends than your friends.

Shoutout to my former teammate and coworker Kai Budde, the German Juggernaut who never loses on Sundays. He’s an all-around amazing guy and one of the best teammates you will ever know. I mention this because unfortunately Kai has terminal cancer. They have renamed the Player of the Year trophy in Kai’s honor.

He at least got a chance to play the PT recently in Amsterdam, with all the associated great times.

Then it was a Sunday, so of course Kai Budde won the PTQ.

Even with my qualification slots, I’m well past the point where I can take this kind of time off to properly prepare, and even if I could I can’t muster the stamina for a three-day fight, or even a two-day fight. But man I miss the good times.

Moxfield lets you do this:

Lupe: I used to be in on the bling until we hit a weird critical capacity of too much. I’m now slowly putting a filter of “first printing” on all of the cards in my main Cube. Magic cards are kind of like hieroglyphs, so as a designer, I want to maximize tabletop legibility.

Brian Kowal: This is The Way.

Magical Hacker: I didn’t know you could do this until I saw this post, & now I need to share what I picked: f:c game:paper lang:en -e:plst (frame: 2015 -is:borderless (is:booster or st:commander) -is:textless -is:ub -is:etched or -is:reprint or e:phpr) (-e:sld or e:sld -is:reprint) prefer:newest

I cannot emphasize enough how much I agree with Lupe. Some amount of bling is cool. At this point we have way, way too much bling. There are too many cards, and also too many versions of each card, too many of which are not legible if you do not already know them on sight. I do want to stay in touch with the game, but it seems impossible.

The value of Chess squares, as measured by locations of pawns, bishops and knights. A fun exercise that I do not expect to offer players much insight. Pawn structure seems strangely neglected in their analysis.

John Carmack points out that a key reason the Xbox (and I would add the PlayStation) never caught on as an entertainment center is that their controllers require non-trivial power to operate, so they go to sleep after periods of inaction and require frequent charging. If we could solve that problem, I would happily use the PlayStation as a media center; the interface is otherwise quite good.

Surely we can get a solution for this? Why can’t we have a remote that functions both ways, perhaps with a toggle to switch between them? Maybe add some additional buttons designed to work better as part of a normal remote?

Matthew Yglesias makes a case that high-pressure youth sports is bad for America. Sports played casually with your friends are great. Instead, we feel pressure to do these expensive, time consuming, high pressure formalized activities that are not fun, or we worry we will be left behind. That cuts out a lot of kids, is highly taxing on parents and damages communities. And yes, I agree that this trend is terrible for all these reasons. Kids should mostly be playing casually, having fun, not trying to make peak performance happen.

Where we differ is Yglesias thinks this comes from fear of being left behind. There is some of that but I am guessing the main driver is fear of letting kids play unsupervised or do anything unstructured. The reason we choose formal sports over the sandlot is that the sandlot gets you a call to child services. Or, even if it doesn’t, you worry that it would.

Hockey got one thing very right.

Scott Simon: In prep for, tonight, watching my first hockey game in… a decade?… I just learned that challenges in the NHL come with real stakes—if you’re wrong, your team is assessed a penalty. Now *that* is a challenge system. (Still, robot refs now.)

My first choice is no challenges. Barring that, make them expensive.

Tyler Cowen links to a paper by Christian Deutscher, Lena Neuberg, and Stefan Thiem on Shadow Effects of Tennis Superstars. They find that when the next round in a second-tier tournament would be against one of the top four superstars, other players in the top 20 over the period 2004-2019 would advance substantially less often than you would otherwise expect.

The more the superstars go away, the more the other top competitors smell blood and double down; the effect size is 8.3 percentage points, which is pretty large. Part of that might come from the opposite effect as well: if I were not a top player I might very much want the honor of playing against Federer or Nadal. Mostly I am presuming this effect is real. Tennis is a tough sport and you can’t play your full-on A-game every time, especially if slightly hurt. You have to pick your battles.

Analysis of the new NFL kickoff rules, similar to the XFL rules. I realize the injury rate on kickoffs was too high, and seeing how this plays out should be fun, but these new rules seem crazy complicated and ham-fisted. At some point we need to ask whether we need a kickoff at all. What if we simply started with something like a 4th and 15 and let it be a punt, or you could go for it if you wanted?

College football seems ready to determine home teams in the new playoff based on factors like ‘hotel room availability,’ ‘ticket sales’ and weather? Wtf? Oh no indeed.

Mitchell Wesson: Schools can absolutely control the quality and quantity of nearby hotel rooms.

Weather, obviously not, but it doesn’t seem reasonable to ignore it either. It wouldn’t be fair to fans or teams if a game had to be delayed when that could otherwise have been avoided.

If someone gets to host, there needs to be only one consideration in who hosts a playoff game. That is which team earned a higher seed (however you determine that) and deserves home field advantage. That is it. If the committee actually ever gives home field to the other team, even once, for any other reason (other than weather so extreme you outright couldn’t play the game), the whole system is rendered completely illegitimate. Period.

Waymo now open to everyone in San Francisco.

Sholto Douglas: Three telling anecdotes

> I felt safer cycling next to a Waymo than a human the other day (the first time I’ve had more ‘trust’ in an AI than a human)

> the default verb/primary app has changed from Uber to Waymo amongst my friends

> when you ride one, try to beat it at picking up on noticing people before they appear in the map, you ~won’t

They’re amazing. Can’t wait for them to scale globally.

Matt Yglesias asks what we even mean by Neoliberalism, why everyone uses it as a boogeyman, and whether we actually tried it. Conclusions correctly seem to be ‘the intention was actually letting people do things but it gets used to describe anything permitting or doing something one doesn’t like,’ ‘because people want to propose bad policies telling people what to do without facing consequences’ and ‘no.’

Certainly all claims that the era of big government was ever over, or that we suddenly stopped telling people what they were allowed to do, or that we pursued anything at all resembling ‘growth at all costs,’ are absurd, although we did make some progress on at least having fewer (though still far too many) price controls.

Nick proposes that for less than $1 million a year you could easily have the coolest and highest status cafe in San Francisco, attracting immense talent, have a cultural touchstone with lots of leverage, creating tons of real estate and actual value, other neat stuff like that. It seems many engineers put super high value on the right cafe vibe, on the level of ‘buy a house nearby.’ I don’t get it, but I don’t have to. Nick proposes finding a rich patron or a company that wants it nearby. That could work.

In general, this is part of the pattern where nice places to be add tons of value, but people are unwilling to pay for them. You can provide $50/visit in value, but if you charge $10/table or $10/coffee, people decide that kills the vibe.

Which do you value more as a potential superhero: Mind control, flight, teleportation or super strength? On the survey the answer was teleportation.

The correct response, of course, is to have so many questions. Details matter.

Teleportation is a very extreme case of Required Secondary Powers. How do you ensure you do not teleport into a wall or the air or space? How do you deal with displacement? How often can you do it? Where can you go and not go? And so on.

There are versions of teleportation I’ve seen (including in some versions of AD&D) where I would not pay much for them, because you are so likely to get yourself killed you would only do it in a true emergency. Then there are others that are absurdly valuable.

Flight is the lightweight version of the same problem. If you take it to mean the intuitive ‘thing that Superman or Wonder Woman can do in movies’ then yeah, pretty great assuming people don’t respond by trying to put you in a lab, and I’d pay a lot.

Super strength is a nice to have at ‘normal’ levels. At extreme levels it gets a lot more interesting as you start violating the laws of physics or enabling new engineering projects, especially if you have various secondary powers.

Mind control is on an entirely different level. Sometimes it is a relatively weak power, sometimes it enables easy world domination. There you have to ask, as one of your first questions, does anyone else get mind control powers too? This is like the question of AI, with similarly nonsensical scenarios being the default. If the people with true mind control powers used them properly there would usually be no movie. If others get ‘for real’ versions of mind control, and you take super strength or flight, do you even matter? If so, what is your plan? And so on.

What activities do people enjoy or not enjoy?

Rob Wiblin [list edited for what I found interesting]:

  1. ‘Computer games’ are among the most enjoyable activities, probably deserve more respect. It clearly beats ‘watching TV’. ‘Games at home’ sounds cheap and accessible and scores high — I guess that’s mostly card or board games.

  2. Highly social activities are more work and money to set up but still come in highest of all: ‘restaurant / pub’, ‘go to sport’, and ‘theatre / concert’. ‘Parties’ comes in behind those.

  3. ‘Play with child’ was among the most enjoyable of any activity. Many folks who choose not to have kids probably underrate that pleasure. Pulling in the other direction ‘Childcare’ falls in the middle of the pack, though it’s more popular by a mile than school, housework, or paid work. No surprise some people opt out of the workforce to raise a family!

  4. ‘Homework’ came dead last, much less popular than even ‘School’. Counts in favour of reducing it where it’s not generating some big academic benefit.

  5. ‘Email and internet’ — the activity that eats ever more of our days — is right in the middle. Conventional wisdom is you want to substitute it for true leisure and the numbers here clearly back that up.

  6. There’s some preference for active over passive leisure — TV, reading, doing nothing and radio are all mediocre by the standards of recreation. I’m surprised reading and watching TV are right next to one another (I would have expected reading to score higher).

  7. People sure hate looking for a job.

  8. I’ve seen some debate about how much people like or dislike their jobs. Work and school are definitely much less enjoyable than activities where people are more likely to be freely determining for themselves what they’re doing. But they still manage a 4.7 out of 7. It could be much worse (and in the past probably was). Commuting is unpopular but not at the very bottom like I’d heard.

Gaming and sports for the win. Going to the game is second only to concerts, and I strongly agree most of us are not going to enough of either. Weird that going to the movies is not here; I’d be curious how high it goes. And yes, playing board games at home is overpowered as a fun activity if you can make it happen.

Homework being this bad is not a surprise, but it needs emphasis. If everyone understood that it was less fun than looking for a job or doing the laundry, perhaps they would begin to understand.

Reading I am guessing scores relatively low because people feel obligated to read. Whereas those who choose to read for relaxation on average like it a lot more.

Why Do Companies Go Woke? Middle managers, so a result of moral maze dynamics, which includes a lack of any tether to or caring about physical reality. Makes sense.

The absurdity of the claims in Graeber’s Bullshit Jobs.

Ross Rheingans-Yoo notes that ‘hold right mouse button and then gesture’ is a technique he and others often use playing the game Dota because it is highly efficient, yet only when Parity suggested it did it occur to him to use it for normal text editing. My initial reaction was skepticism but it’s growing on me, and I’m excited to try it once someone implements it, especially if you can customize the options.

Making dumb mistakes is fine. Systems predictably making particular dumb mistakes is also fine. Even bias can be fine.

This was a serious miss, but it is like AI – if you only look for where the output is dumb, you will miss the point.

Keep trying, and you’ll figure it out eventually.

(For those who don’t know, this was about prediction markets on the Democratic presidential nomination.)

Monthly Roundup #20: July 2024 Read More »

model-mixes-ai-and-physics-to-do-global-forecasts

Model mixes AI and physics to do global forecasts

Cloudy with a chance of accuracy —

Google/academic project is great with weather, has some limits for climate.

Image: some of the atmospheric circulation seen during NeuralGCM runs. Credit: Google

Right now, the world’s best weather forecast model is a General Circulation Model, or GCM, put together by the European Centre for Medium-Range Weather Forecasts. A GCM is in part based on code that calculates the physics of various atmospheric processes that we understand well. For a lot of the rest, GCMs rely on what’s termed “parameterization,” which attempts to use empirically determined relationships to approximate what’s going on with processes where we don’t fully understand the physics.

Lately, GCMs have faced some competition from machine-learning techniques, which train AI systems to recognize patterns in meteorological data and use those to predict the conditions that will result over the next few days. Their forecasts, however, tend to get a bit vague after more than a few days and can’t deal with the sort of long-term factors that need to be considered when GCMs are used to study climate change.

On Monday, a team from Google’s AI group and the European Centre for Medium-Range Weather Forecasts is announcing NeuralGCM, a system that mixes physics-based atmospheric circulation with AI parameterization of other meteorological influences. NeuralGCM is computationally efficient and performs very well in weather forecast benchmarks. Strikingly, it can also produce reasonable-looking output for runs that cover decades, potentially allowing it to address some climate-relevant questions. While it can’t handle a lot of what we use climate models for, there are some obvious routes for potential improvements.

Meet NeuralGCM

NeuralGCM is a two-part system. There’s what the researchers term a “dynamical core,” which handles the physics of large-scale atmospheric convection and takes into account basic physics like gravity and thermodynamics. Everything else is handled by the AI portion. “It’s everything that’s not in the equations of fluid dynamics,” said Google’s Stephan Hoyer. “So that means clouds, rainfall, solar radiation, drag across the surface of the Earth—also all the residual terms in the equations that happen below the grid scale of about roughly 100 kilometers or so.” It’s what you might call a monolithic AI. Rather than training individual modules that handle a single process, such as cloud formation, the AI portion is trained to deal with everything at once.

Critically, the whole system is trained concurrently rather than training the AI separately from the physics core. Initially, performance evaluations and updates to the neural network were performed at six-hour intervals, since the system isn’t very stable until at least partially trained. Over time, those intervals were stretched out to five days.
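As a rough mental model of that split, here is a toy sketch (my own illustration, not Google’s code; both component functions are stand-ins) of a hybrid step plus the short-to-long rollout training curriculum described above:

```python
# Hybrid step: a physics-based dynamical core advances the resolved state,
# and a learned correction stands in for everything below the grid scale
# (clouds, rainfall, drag, ...). Here the "state" is just a single number.
def dynamical_core_step(state, dt_hours):
    # Stand-in for the well-understood physics (fluid dynamics, gravity, ...).
    return state + dt_hours * (-0.05 * state)

def learned_correction(state, dt_hours):
    # Stand-in for the neural network; in NeuralGCM this part is trained
    # jointly with the physics core rather than fit to each process separately.
    return dt_hours * 0.01

def rollout(state, horizon_hours, dt_hours=6.0):
    for _ in range(int(horizon_hours // dt_hours)):
        state = dynamical_core_step(state, dt_hours) + learned_correction(state, dt_hours)
    return state

# Training curriculum per the description above: score short (6-hour)
# rollouts first, then stretch the horizon out toward five days (120 hours).
for horizon in (6, 24, 120):
    print(f"{horizon:>3}h rollout -> state {rollout(1.0, horizon):.3f}")
```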

The result is a system that’s competitive with the best available for forecasts running out to 10 days, often exceeding the competition depending on the precise measure used (in addition to weather forecasting benchmarks, the researchers looked at features like tropical cyclones, atmospheric rivers, and the Intertropical Convergence Zone). On the longer forecasts, it tended to produce features that were less blurry than those made by pure AI forecasters, even though it was operating at a lower resolution than they were. This lower resolution means larger grid squares—the surface of the Earth is divided up into individual squares for computational purposes—than most other models, which cuts down significantly on its computing requirements.

Despite its success with weather, there were a couple of major caveats. One is that NeuralGCM tended to underestimate extreme events occurring in the tropics. The second is that it doesn’t actually model precipitation; instead, it calculates the balance between evaporation and precipitation.

But it also comes with some specific advantages over some other short-term forecast models, key among them being that it isn’t actually limited to running over the short term. The researchers let it run for up to two years, and it successfully reproduced a reasonable-looking seasonal cycle, including large-scale features of the atmospheric circulation. Other long-duration runs show that it can produce appropriate counts of tropical cyclones, which go on to follow trajectories that reflect patterns seen in the real world.

Model mixes AI and physics to do global forecasts Read More »

can-the-solar-industry-keep-the-lights-on?

Can the solar industry keep the lights on?


Founded in Dresden in the early 1990s, Germany’s Solarwatt quickly became an emblem of Europe’s renewable energy ambitions and bold plan to build a solar power industry.

Its opening of a new solar panel plant in Dresden in late 2021 was hailed as a small victory in the battle to wrestle market share from the Chinese groups that have historically supplied the bulk of panels used in Europe.

Now, Solarwatt is preparing to halt production at the plant and shift that work to China.

“It is a big pity for our employees, but from an economic point of view we could not do otherwise,” said Peter Bachmann, the company’s chief product officer.

Solarwatt is not alone. A global supply glut has pummelled solar panel prices over the past two years, leaving swaths of Europe’s manufacturers unprofitable, threatening US President Joe Biden’s ambition to turn America into a renewable energy force and even ricocheting back on the Chinese companies that dominate the global market.

“We are in a crisis,” said Johan Lindahl, secretary-general of the European Solar Manufacturing Council, the European industry’s trade body.

Yet as companies in Europe, the US, and China cut jobs, delay projects, and mothball facilities, an abundance of cheap solar panels has delivered one significant upside—consumers and businesses are installing them in ever greater numbers.

Electricity generated from solar power is expected to surpass that of wind and nuclear by 2028, according to the International Energy Agency.

The picture underlines the quandary confronting governments that have pledged to decarbonise their economies but will find doing so harder unless the historic shift from fossil fuels is both affordable for the public and a source of new jobs.

Governments face a “delicate and difficult balancing act,” said Michael Parr, director of trade group Ultra Low Carbon Solar Alliance. They must “maximize renewables deployment and carbon reductions, bolster domestic manufacturing sectors, keep energy prices low, and ensure energy security.”

The industry, which spans wafer, cell, and panel manufacturers, as well as companies that install panels, employed more than 800,000 people in Europe at the end of last year, according to SolarPower Europe. In the US almost 265,000 work in the sector, figures from the Interstate Renewable Energy Council show.

“There is overcapacity in every segment, starting with polysilicon and finishing with the module,” said Yana Hryshko, head of global solar supply chain research at the consultancy Wood Mackenzie.

According to BloombergNEF, panel prices have plunged more than 60 percent since July 2022. The scale of the damage inflicted has sparked calls for Brussels to protect European companies from what the industry says are state-subsidized Chinese products.

Europe’s solar panel manufacturing capacity has collapsed by about half to 3 gigawatts since November as companies have failed, mothballed facilities, or shifted production abroad, the European Solar Manufacturing Council estimates. In rough terms, a gigawatt can potentially supply electricity for 1 million homes.

The hollowing out comes as the EU is banking on solar power playing a major role in the bloc meeting its target of generating 45 percent of its energy from renewable sources by 2030. In the US, the Biden administration has set a target of achieving a 100 percent carbon pollution-free electricity grid by 2035.

Climate change is a global challenge, but executives said the solar industry’s predicament exposed how attempts to address it can quickly fracture along national and regional lines.

“There’s trade policy and then there’s climate policy, and they aren’t in sync,” said Andres Gluski, chief executive of AES, one of the world’s biggest developers of clean energy. “That’s a problem.”

Brussels has so far resisted demands to impose tariffs. It first levied them in 2012 but reversed that in 2018, partly in what proved a successful attempt to quicken the uptake of solar. Chinese imports now account for the lion’s share of Europe’s solar panels.

In May, the European Commission introduced the Net Zero Industry Act, legislation aimed at bolstering the bloc’s clean energy industries by cutting red tape and promoting a regional supply chain.

But Gunter Erfurt, chief executive of Switzerland-based Meyer Burger, the country’s largest solar panel maker, is skeptical it will be enough.

“You need to create a level playing field,” he said. Meyer Burger would benefit if the EU imposed tariffs because it has operations in Germany.

Can the solar industry keep the lights on? Read More »

fcc-blasts-t-mobile’s-365-day-phone-locking,-proposes-60-day-unlock-rule

FCC blasts T-Mobile’s 365-day phone locking, proposes 60-day unlock rule


Citing frustration with mobile carriers enforcing different phone-unlocking policies that are bad for consumers, the Federal Communications Commission is proposing a 60-day unlocking requirement that would apply to all wireless providers.

The industry’s “confusing and disparate cell phone unlocking policies” mean that “some consumers can unlock their phones with relative ease, while others face significant barriers,” Commissioner Geoffrey Starks said at yesterday’s FCC meeting. “It also means certain carriers are subject to mandatory unlocking requirements while others are free to dictate their own. This asymmetry is bad for both consumers and competition.”

The FCC is “proposing a uniform 60-day unlocking policy” so that “consumers can choose the carrier that offers them the best value,” Starks said. Unlocking a phone allows it to be used on a different carrier’s network as long as the phone is compatible.

The FCC approved the Notice of Proposed Rulemaking (NPRM) in a 5-0 vote. That begins a public comment period that could lead to a final rulemaking. A draft of the NPRM said the FCC “propose[s] to require all mobile wireless service providers to unlock handsets 60 days after a consumer’s handset is activated with the provider, unless within the 60-day period the service provider determines the handset was purchased through fraud.”

T-Mobile prepaid imposes 365-day lock

FCC Chairwoman Jessica Rosenworcel said that unlocking requirements have been imposed by the FCC in spectrum auctions and by the Department of Justice as a merger condition, but “restrictions on consumers unlocking their phones have persisted.”

“You bought your phone, you should be able to take it to any provider you want,” Rosenworcel said. “Some providers already operate this way. Others do not. In fact, some have recently increased the time their customers must wait until they can unlock their device by as much as 100 percent.”

Rosenworcel apparently was referring to a prepaid brand offered by T-Mobile. The NPRM draft said that “T-Mobile recently increased its locking period for one of its brands, Metro by T-Mobile, from 180 days to 365 days.” The 365-day rule brought Metro into line with other T-Mobile prepaid phones that already came with the year-long lock. We reached out to T-Mobile and will update this article if it provides a comment.

A merger condition imposed on T-Mobile’s purchase of Sprint merely requires that it unlock prepaid phones within one year. T-Mobile imposes different unlocking policies on prepaid and postpaid phones. For postpaid devices, T-Mobile says it will unlock phones that have been active for at least 40 days, but only if any associated financing or leasing agreement has been paid in full.

Exactly how the FCC’s proposed rules will apply to phones that haven’t been paid off is to be determined. The FCC will “seek comment on how our proposal might affect the incentive and ability of wireless providers to continue offering discounts on handsets, particularly in connection with extended payment plans, and lower prices on plans with minimum term commitments.”

One question asked in the draft NPRM suggests the FCC could require unlocking once a consumer with a device payment plan has made the first payment. The FCC asked:

Alternatively, should we require service providers to unlock handsets after a period shorter or longer than 60 days? For example, should we require all handsets to be unlocked by default upon activation? Or, should we require all handsets to be unlocked after the end of the handset’s return period or after the first payment on the handset has been processed? Would a standardized time period of a certain number of days be easier to implement and enforce than non-standardized time periods based on return periods or billing cycles? What is the minimum amount of time service providers need to protect themselves from handset fraud? Rather than locking handsets, are there other ways service providers can protect themselves from handset fraud that would allow the Commission to prohibit the locking of handsets altogether?

FCC blasts T-Mobile’s 365-day phone locking, proposes 60-day unlock rule Read More »

ftc-attacks-microsoft’s-post-merger-game-pass-price-increases

FTC attacks Microsoft’s post-merger Game Pass price increases

Toldja so —

Regulator says move is “exactly the sort of consumer harm” it warned about.

Image: Access to first-party games on launch day remains a major selling point for the Xbox Game Pass Ultimate tier. Credit: Microsoft

The FTC says the across-the-board price increases that Microsoft recently announced for its Xbox Game Pass subscription service tiers represent “exactly the sort of consumer harm from the merger the FTC has alleged” when it sought to block Microsoft’s merger with Activision. In a letter to the court posted as part of an ongoing appeal by the FTC in the case, the federal regulator alleges Microsoft’s moves are a clear example of “product degradation” brought about by “a firm exercising market power post-merger.”

The letter’s primary focus is on the soon-to-be-discontinued $10.99/month Console Game Pass tier. That’s being replaced with a $14.99/month Game Pass Standard tier (a 36 percent price increase) that no longer includes “day one” access to all of Microsoft’s first-party titles. To maintain that key benefit, “Console” subscribers will have to spend 81 percent more for the $19.99 Game Pass Ultimate tier, which also includes a number of additional benefits over the current $10.99/month option.

Is this “based on the acquisition”?

The FTC notes that these changes “coincide with adding Call of Duty to Game Pass’s most expensive tier.” Previously, Microsoft publicly promised that this Game Pass access to Activision’s ultra-popular shooter would come “with no price increase for the service based on the acquisition.”

It’s that “based on the acquisition” clause that’s likely to give Microsoft some wiggle room in arguing for its planned pricing changes. Inflation is also a sufficient explanation for a large portion of the price increase in nominal terms—the $14.99 Microsoft charged for a month of Game Pass Ultimate when it launched in 2019 is the equivalent of $18.39 today, according to the BLS CPI calculator. When Microsoft raised the Game Pass Ultimate monthly price from $14.99 to $16.99 just last year—just before the Activision merger was finalized—the company said in a statement it had “adjusted the prices to reflect the competitive conditions in each market.”
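For a back-of-the-envelope sense of that claim, here is a quick sketch; the cumulative inflation factor below is an assumption chosen to match the article’s figure, not an official series:

```python
# Nominal-to-"today's dollars" adjustment for the 2019 Game Pass Ultimate price.
launch_price_2019 = 14.99
cumulative_inflation_2019_to_2024 = 1.227  # ~22.7% CPI growth, assumed

real_equivalent_today = launch_price_2019 * cumulative_inflation_2019_to_2024
print(f"${launch_price_2019:.2f} in 2019 ≈ ${real_equivalent_today:.2f} today")
# -> about $18.39, so much of the $14.99 -> $19.99 move is nominal rather than real.
```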

Microsoft might have a harder time finessing the alleged “degradation” inherent in going from the discontinued Game Pass Console tier to the new, more expensive Game Pass Standard tier. True, the replacement does include the online multiplayer benefits of Game Pass Core, which could previously be purchased separately. But the removal of the long-promised day-one access to first-party games will heavily reduce the value most subscribers get from the new option.

It’s now been over a year since the FTC first announced it intended to appeal the ruling that effectively stopped its attempted injunction against the merger deal. While Microsoft and Activision have moved forward with their merger since then, courts have reversed similar mergers on appeal even after a merger deal has fully closed.

Elsewhere in its letter, the FTC makes note of previous arguments that Microsoft’s recent round of nearly 2,000 Xbox-focused layoffs is a sign of “reduced investments in output and product quality” post-merger.

FTC attacks Microsoft’s post-merger Game Pass price increases Read More »