

I, too, installed an open source garage door opener, and I’m loving it


Open source closed garage

OpenGarage restored my home automations and gave me a whole bunch of new ideas.

Hark! The top portion of a garage door has entered my view, and I shall alert my owner to it. Credit: Kevin Purdy

Like Ars Senior Technology Editor Lee Hutchinson, I have a garage. The door on that garage is opened and closed by a device made by a company that, as with Lee’s, offers you a way to open and close it with a smartphone app. But that app doesn’t work with my preferred home automation system, Home Assistant, and also looks and works like an app made by a garage door company.

I had looked into the ratgdo Lee installed, and raved about, but hooking it up to my particular Genie/Aladdin system would have required installing limit switches. So I instead installed an OpenGarage unit ($50 plus shipping). My garage opener now works with Home Assistant (and thereby pretty much anything else), it’s not subject to the whims of API access, and I’ve got a few ideas how to make it even better. Allow me to walk you through what I did, why I did it, and what I might do next.

Thanks, I’ll take it from here, Genie

Genie, maker of my Wi-Fi-capable garage door opener (sold as an “Aladdin Connect” system), is not in the same boat as the Chamberlain/myQ setup that inspired Lee’s project. There was a working Aladdin Connect integration in Home Assistant until the company changed its API in January 2024. Genie said it would release its own official Home Assistant integration in June, and it did, but it was quickly pulled back, seemingly over licensing issues. Since then, there have been no updates on the matter. (I have emailed Genie for comment and will update this post if I receive a reply.)

This is not egregious behavior, at least on the scale of garage door opener firms. And Aladdin’s app works with Google Home and Amazon Alexa, but not with Home Assistant or my secondary/lazy option, HomeKit/Apple Home. It also logs me out “for security” more often than I’d like and tells me this only after an iPhone shortcut refuses to fire. It has some decent features, but without deeper integrations, I can’t do things like have the brighter ceiling lights turn on when the door opens or flash indoor lights if the garage door stays open too long. At least not without Google or Amazon.

I’ve seen OpenGarage passed around the Home Assistant forums and subreddits over the years. It is, as the name implies, fully open source: hardware design, firmware, app code, API, everything. It is a tiny ESP board with an ultrasonic distance sensor and a relay attached. You can control and monitor it from a web browser (mobile or desktop), IFTTT, or MQTT, and with the latest firmware, you can get email alerts. I decided to pull out the 6-foot ladder and give it a go.

Prototypes of the OpenGarage unit. To me, they look like little USB-powered owls, just with very stubby wings. Credit: OpenGarage

Installing the little watching owl

You generally mount the OpenGarage unit to the roof of your garage, so the distance sensor can detect if your garage door has rolled up in front of it. There are options for mounting with magnetic contact sensors or a side view of a roll-up door, or you can figure out some other way in which two different sensor depth distances would indicate an open or closed door. If you’ve got a Security+ 2.0 door (the kind with the yellow antenna, generally), you’ll need an adapter, too.

The toughest part of an overhead install is finding a spot that gives the unit a view of your garage door, not too close to rails or other obstructing objects, yet close enough for the contact wires and micro-USB power cable to reach. Ideally, it also has a view of your car when the door is closed and the car is inside, so it can report the car’s presence. I’ve yet to find the right thing to do with the “car is inside or not” data, but the seed is planted.
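To make that “two distances” idea concrete, here is a minimal sketch of the threshold logic a ceiling-mounted distance sensor implies. The function and the centimeter values are my own illustrative choices, not OpenGarage’s actual firmware (which lets you configure its threshold in the device options):

```python
# Illustrative thresholds; tune to your ceiling height and car.
DOOR_THRESHOLD_CM = 50      # nearer than this: the rolled-up door is in view
VEHICLE_THRESHOLD_CM = 150  # between the two: a car roof is in view

def interpret(distance_cm: int) -> tuple[str, str]:
    """Map one ultrasonic reading to (door state, vehicle state)."""
    if distance_cm <= DOOR_THRESHOLD_CM:
        return "open", "unknown"      # the door blocks the view of the car
    if distance_cm <= VEHICLE_THRESHOLD_CM:
        return "closed", "present"
    return "closed", "absent"         # the sensor sees the garage floor

print(interpret(35))   # ('open', 'unknown')
print(interpret(120))  # ('closed', 'present')
print(interpret(230))  # ('closed', 'absent')
```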

OpenGarage’s introduction and explanation video.

My garage setup, like most of them, is pretty simple. There’s a big red glowing button on the wall near the door, and there are two very thin wires running from it to the opener. On the opener, there are four press-fit ports that you open with a push of a screwdriver. Most of the wires are headed to the safety sensor at the door bottom, while two come in from the opener button. After stripping a bit of insulation to expose more wire, I pressed the contact wires from the OpenGarage into those same opener ports.


The wire terminal on my Genie garage opener. The green and pink wires lead to the OpenGarage unit. Credit: Kevin Purdy

After that, I connected the wires to the OpenGarage unit’s screw terminals, then did some pencil work on the garage ceiling to figure out how far I could run the contact and micro-USB power cable, getting the proper door view while maintaining some right-angle sense of order up there. When I had reached a decent compromise between cable tension and placement, I screwed the sensor into an overhead stud and used a staple gun to secure the wires. It doesn’t look like a pro installed it, but it’s not half bad.


Where I ended up installing my OpenGarage unit. Key points: Above the garage door when open, view of the car below, not too close to rails, able to reach power and opener contact. Credit: Kevin Purdy

A very versatile board

If you’ve got everything placed and wired up correctly, opening the OpenGarage access point or IP address should give you an interface that shows the status of your garage door, your car (optionally), and the unit’s Wi-Fi and external connections.


The landing screen for the OpenGarage. You can only open the door or change settings if you know the device key (which you should change immediately). Credit: Kevin Purdy

It’s a handy webpage and a basic opener (provided you know the secret device key you set), but OpenGarage is more powerful in how it uses that data. The device can keep a cloud connection open to Blynk or the maker’s own OpenThings.io cloud server. You can hook it up to MQTT or an IFTTT channel. It can send you alerts when your garage has been open a certain amount of time or if it’s open after a certain time of day.
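If you’d rather script it yourself, the unit also answers plain HTTP requests on the local network. Here’s a rough Python sketch; the /jc (status JSON) and /cc (button click) endpoints follow OpenGarage’s published API, but double-check them against your firmware version, and note that the IP address and device key below are placeholders:

```python
# Talking to OpenGarage over its local HTTP API (a sketch; verify the
# endpoints against the API docs for your firmware version).
import requests

BASE = "http://192.168.1.50"  # placeholder: your unit's IP address
DEVICE_KEY = "opendoor"       # placeholder: change yours immediately

def status() -> dict:
    """Fetch distance, door, and vehicle readings as JSON."""
    return requests.get(f"{BASE}/jc", timeout=5).json()

def click() -> None:
    """Simulate one press of the wall button (toggles the door)."""
    r = requests.get(f"{BASE}/cc",
                     params={"dkey": DEVICE_KEY, "click": 1}, timeout=5)
    r.raise_for_status()

s = status()
print(f"distance: {s.get('dist')} cm, door: {'open' if s.get('door') else 'closed'}")
```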

Screenshot showing five sensors: garage, distance, restart, vehicle, and signal strength.

You’re telling me you can just… see the state of these things, at all times, on your own network? Credit: Kevin Purdy

You really don’t need a corporate garage coder

For me, the greatest benefit is in hooking OpenGarage up to Home Assistant. I’ve added an opener button to my standard dashboard (one that requires a long-press or two actions to open). I’ve restored the automation that turns on the overhead bulbs for five minutes when the garage door opens. And I can dig in further if I want, like an alert when it’s 10 pm on a Monday and the garage door hasn’t opened yet, a sign that I forgot to put the trash out. Or maybe some kind of NFC tag to allow for easy opening while on a bike, if that’s not a security nightmare (it might be).

Not for nothing, but OpenGarage is also a deeply likable bit of indie kit. It’s a two-person operation, with Ray Wang building on his work with the open and handy OpenSprinkler project, trading Arduino for ESP8266 and doing some 3D printing to fit the sensors and switches, and Samer Albahra providing the mobile app, documentation, and other help. Their enthusiasm for DIY home control has likely brought out the same in others, and it certainly has in me.


Kevin is a senior technology reporter at Ars Technica, covering open-source software, PC gaming, home automation, repairability, e-bikes, and tech history. He has previously worked at Lifehacker, Wirecutter, iFixit, and Carbon Switch.



Microsoft finally releases generic install ISOs for the Arm version of Windows

For some PC buyers, doing a clean install of Windows right out of the box is part of the setup ritual. But for Arm-based PCs, including the Copilot+ PCs with Snapdragon X Plus and Elite chips in them, it hasn’t been possible in the same way. Microsoft (mostly) hasn’t offered generic install media that can be used to reinstall Windows on an Arm PC from scratch.

Microsoft is fixing that today—the company finally has a download page for the official Arm release of Windows 11, linked to but separate from the ISOs for the x86 versions of Windows. These are useful not just for because-I-feel-like-it clean installs but also for reinstalling Windows after you’ve upgraded your SSD and for setting up Windows virtual machines on Arm-based PCs and Macs.

Previously, Microsoft did offer install media for some Windows Insider Preview Arm builds, though these are for beta versions of Windows that may or may not be feature-complete or stable. Various apps, scripts, and websites also exist to grab files from Microsoft’s servers and build “unofficial” ISOs for the Arm version of Windows, though obviously this is more complicated than just downloading a single file directly.



ChatGPT’s success could have come sooner, says former Google AI researcher


A co-author of Attention Is All You Need reflects on ChatGPT’s surprise and Google’s conservatism.

Jakob Uszkoreit Credit: Jakob Uszkoreit / Getty Images

In 2017, eight machine-learning researchers at Google released a groundbreaking research paper called Attention Is All You Need, which introduced the Transformer AI architecture that underpins almost all of today’s high-profile generative AI models.

The Transformer has made a key component of the modern AI boom possible by translating (or transforming, if you will) input chunks of data called “tokens” into another desired form of output using a neural network. Variations of the Transformer architecture power language models like GPT-4o (and ChatGPT), audio synthesis models that run Google’s NotebookLM and OpenAI’s Advanced Voice Mode, video synthesis models like Sora, and image synthesis models like Midjourney.
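For readers who want the mechanism rather than the metaphor: at its core, self-attention rebuilds each token’s representation as a weighted mix of every token’s, with the weights coming from learned query-key similarity. Below is a toy, single-head version in NumPy with random weights; it’s a sketch of the idea, not the code behind any production model:

```python
# Toy single-head self-attention: each row of the output is a
# similarity-weighted blend of all tokens' value vectors. Real models
# add multiple heads, masking, positions, and training, at vast scale.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                  # 4 tokens, 8-dim embeddings
x = rng.normal(size=(seq_len, d_model))  # stand-in token embeddings

Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv         # queries, keys, values

scores = Q @ K.T / np.sqrt(d_model)      # token-to-token similarity
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax

output = weights @ V                     # (4, 8) attended representations
print(weights.round(2))                  # who attends to whom
```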

At TED AI 2024 in October, one of those eight researchers, Jakob Uszkoreit, spoke with Ars Technica about the development of transformers, Google’s early work on large language models, and his new venture in biological computing.

In the interview, Uszkoreit revealed that while his team at Google had high hopes for the technology’s potential, they didn’t quite anticipate its pivotal role in products like ChatGPT.

The Ars interview: Jakob Uszkoreit

Ars Technica: What was your main contribution to the Attention Is All You Need paper?

Jakob Uszkoreit (JU): It’s spelled out in the footnotes, but my main contribution was to propose that it would be possible to replace recurrence [from Recurrent Neural Networks] in the dominant sequence transduction models at the time with the attention mechanism, or more specifically self-attention. And that it could be more efficient and, as a result, also more effective.

Ars: Did you have any idea what would happen after your group published that paper? Did you foresee the industry it would create and the ramifications?

JU: First of all, I think it’s really important to keep in mind that when we did that, we were standing on the shoulders of giants. And it wasn’t just that one paper, really. It was a long series of works by some of us and many others that led to this. And so to look at it as if this one paper then kicked something off or created something—I think that is taking a view that we like as humans from a storytelling perspective, but that might not actually be that accurate of a representation.

My team at Google was pushing on attention models for years before that paper. It’s a lot longer of a slog with much, much more, and that’s just my group. Many others were working on this, too, but we had high hopes that it would push things forward from a technological perspective. Did we think that it would play a role in really enabling, or at least apparently, seemingly, flipping a switch when it comes to facilitating products like ChatGPT? I don’t think so. I mean, to be very clear in terms of LLMs and their capabilities, even around the time we published the paper, we saw phenomena that were pretty staggering.

We didn’t get those out into the world in part because of what really is maybe a notion of conservatism around products at Google at the time. But we also, even with those signs, weren’t that confident that stuff in and of itself would make that compelling of a product. But did we have high hopes? Yeah.

Ars: Since you knew there were large language models at Google, what did you think when ChatGPT broke out into a public success? “Damn, they got it, and we didn’t?”

JU: There was a notion of, well, “that could have happened.” I think it was less of a, “Oh dang, they got it first” or anything of the like. It was more of a “Whoa, that could have happened sooner.” Was I still amazed by just how quickly people got super creative using that stuff? Yes, that was just breathtaking.


Jakob Uszkoreit presenting at TED AI 2024. Credit: Benj Edwards

Ars: You weren’t at Google at that point anymore, right?

JU: I wasn’t anymore. And in a certain sense, you could say the fact that Google wouldn’t be the place to do that factored into my departure. I left not because of what I didn’t like at Google as much as I left because of what I felt I absolutely had to do elsewhere, which is to start Inceptive.

But it was really motivated by just an enormous, not only opportunity, but a moral obligation in a sense, to do something that was better done outside in order to design better medicines and have very direct impact on people’s lives.

Ars: The funny thing with ChatGPT is that I was using GPT-3 before that. So when ChatGPT came out, it wasn’t that big of a deal to some people who were familiar with the tech.

JU: Yeah, exactly. If you’ve used those things before, you could see the progression and you could extrapolate. When OpenAI developed the earliest GPTs with Alec Radford and those folks, we would talk about those things despite the fact that we weren’t at the same companies. And I’m sure there was this kind of excitement, how well-received the actual ChatGPT product would be by how many people, how fast. That still, I think, is something that I don’t think anybody really anticipated.

Ars: I didn’t either when I covered it. It felt like, “Oh, this is a chatbot hack of GPT-3 that feeds its context in a loop.” And I didn’t think it was a breakthrough moment at the time, but it was fascinating.

JU: There are different flavors of breakthroughs. It wasn’t a technological breakthrough. It was a breakthrough in the realization that at that level of capability, the technology had such high utility.

That, and the realization that, because you always have to take into account how your users actually use the tool that you create, and you might not anticipate how creative they would be in their ability to make use of it, how broad those use cases are, and so forth.

That is something you can sometimes only learn by putting something out there, which is also why it is so important to remain experiment-happy and to remain failure-happy. Because most of the time, it’s not going to work. But some of the time it’s going to work—and very, very rarely it’s going to work like [ChatGPT did].

Ars: You’ve got to take a risk. And Google didn’t have an appetite for taking risks?

JU: Not at that time. But if you think about it, if you look back, it’s actually really interesting. Google Translate, which I worked on for many years, was actually similar. When we first launched Google Translate, the very first versions, it was a party joke at best. And we took it from that to being something that was a truly useful tool in not that long of a period. Over the course of those years, the stuff that it sometimes output was so embarrassingly bad at times, but Google did it anyway because it was the right thing to try. But that was around 2008, 2009, 2010.

Ars: Do you remember AltaVista’s Babel Fish?

JU: Oh yeah, of course.

Ars: When that came out, it blew my mind. My brother and I would do this thing where we would translate text back and forth between languages for fun because it would garble the text.

JU: It would get worse and worse and worse. Yeah.

Programming biological computers

After his time at Google, Uszkoreit co-founded Inceptive to apply deep learning to biochemistry. The company is developing what he calls “biological software,” where AI compilers translate specified behaviors into RNA sequences that can perform desired functions when introduced to biological systems.

Ars: What are you up to these days?

JU: In 2021 we co-founded Inceptive in order to use deep learning and high-throughput biochemistry experimentation to design better medicines that truly can be programmed. We think of this as really just step one in the direction of something that we call biological software.

Biological software is a little bit like computer software in that you have some specification of the behavior that you want, and then you have a compiler that translates that into a piece of computer software that then runs on a computer exhibiting the functions or the functionality that you specify.

You specify a piece of a biological program and you compile that, but not with an engineered compiler, because life hasn’t been engineered like computers have been engineered. But with a learned AI compiler, you translate that or compile that into molecules that, when inserted into biological systems, organisms, our cells, exhibit those functions that you’ve programmed into [them].

A pharmacist holds a bottle containing Moderna’s bivalent COVID-19 vaccine. Credit: Getty | Mel Melcon

Ars: Is that anything like how the mRNA COVID vaccines work?

JU: A very, very simple example of that are the mRNA COVID vaccines where the program says, “Make this modified viral antigen” and then our cells make that protein. But you could imagine molecules that exhibit far more complex behaviors. And if you want to get a picture of how complex those behaviors could be, just remember that RNA viruses are just that. They’re just an RNA molecule that when entering an organism exhibits incredibly complex behavior such as distributing itself across an organism, distributing itself across the world, doing certain things only in a subset of your cells for a certain period of time, and so on and so forth.

And so you can imagine that if we managed to even just design molecules with a teeny tiny fraction of such functionality, of course with the goal not of making people sick, but of making them healthy, it would truly transform medicine.

Ars: How do you not accidentally create a monster RNA sequence that just wrecks everything?

JU: The amazing thing is that medicine for the longest time has existed in a certain sense outside of science. It wasn’t truly understood, and we still often don’t truly understand their actual mechanisms of action.

As a result, humanity had to develop all of these safeguards and clinical trials. And even before you enter the clinic, all of these empirical safeguards prevent us from accidentally doing [something dangerous]. Those systems have been in place for as long as modern medicine has existed. And so we’re going to keep using those systems, and of course with all the diligence necessary. We’ll start with very small systems, individual cells in future experimentation, and follow the same established protocols that medicine has had to follow all along in order to ensure that these molecules are safe.

Ars: Thank you for taking the time to do this.

JU: No, thank you.


Benj Edwards is Ars Technica’s Senior AI Reporter and founder of the site’s dedicated AI beat in 2022. He’s also a widely-cited tech historian. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.



IBM boosts the amount of computation you can get done on quantum hardware

By making small adjustments to the frequency that the qubits are operating at, it’s possible to avoid these problems. This can be done when the Heron chip is being calibrated before it’s opened for general use.

Separately, the company has done a rewrite of the software that controls the system during operations. “After learning from the community, seeing how to run larger circuits, [we were able to] almost better define what it should be and rewrite the whole stack towards that,” Gambetta said. The result is a dramatic speed-up. “Something that took 122 hours now is down to a couple of hours,” he told Ars.

Since people are paying for time on this hardware, that’s good for customers now. But it could also pay off in the longer run: some errors occur randomly, so less time spent on a calculation can mean fewer errors.

Deeper computations

Despite all those improvements, errors are still likely during any significant calculations. While it continues to work toward developing error-corrected qubits, IBM is focusing on what it calls error mitigation, which it first detailed last year. As we described it then:

“The researchers turned to a method where they intentionally amplified and then measured the processor’s noise at different levels. These measurements are used to estimate a function that produces similar output to the actual measurements. That function can then have its noise set to zero to produce an estimate of what the processor would do without any noise at all.”
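The method that quote describes is commonly known as zero-noise extrapolation. As a toy illustration of the idea, with invented numbers and a simple linear fit standing in for IBM’s more sophisticated noise models:

```python
# Toy zero-noise extrapolation: measure an observable at deliberately
# amplified noise levels, fit a curve, and evaluate the fit at zero
# noise. The data points here are invented for illustration.
import numpy as np

noise_factors = np.array([1.0, 1.5, 2.0, 3.0])  # amplification levels
measured = np.array([0.81, 0.73, 0.66, 0.54])   # noisy expectation values

coeffs = np.polyfit(noise_factors, measured, deg=1)  # linear model
estimate = np.polyval(coeffs, 0.0)                   # extrapolate to 0
print(f"zero-noise estimate: {estimate:.3f}")
```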

The problem here is that using the function is computationally difficult, and the difficulty increases with the qubit count. So, while it’s still easier to do error mitigation calculations than to simulate the quantum computer’s behavior on classical hardware, there’s still the risk of it becoming computationally intractable. But IBM has taken the time to optimize that, too. “They’ve got algorithmic improvements, and the method that uses tensor methods [now] uses the GPU,” Gambetta told Ars. “So I think it’s a combination of both.”



GOG’s Preservation Program is the DRM-free store refocusing on the classics

The classic PC games market is “in a sorry state,” according to DRM-free and classic-minded storefront GOG. Small games that aren’t currently selling get abandoned, and compatibility issues arise as technology moves forward or as one-off development ideas age like milk.

Classic games are only 20 percent of GOG’s catalog, and the firm hasn’t actually called itself “Good Old Games” in 12 years. And yet, today, GOG announces that it is making “a significant commitment of resources” toward a new GOG Preservation Program. It starts with 100 games for which GOG’s own developers are working to create current and future compatibility, keeping them DRM-free and giving them ongoing tech support, along with granting them a “Good Old Game: Preserved by GOG” stamp.

A selection of games available in GOG’s Preservation Program. Credit: GOG

GOG is not shifting its mission of providing a DRM-free alternative to Steam, Epic, and other PC storefronts, at least not entirely. But it is demonstrably excited about a new focus that ties back to its original name, inspired in some part by its work on Alpha Protocol.

“We think we can significantly impact the classics industry by focusing our resources on it and creating superior products,” writes Arthur Dejardin, head of sales and marketing at GOG. “If we wanted to spread the DRM-free gospel by focusing on getting new AAA games on GOG instead, we would make little progress with the same amount of effort and money (we’ve been trying various versions of that for the last 5 years).”

GOG Preservation Program’s launch video.

Getting knights, demons, and zombies up to snuff

What kind of games? Scanning the list of Good Old Games, most of them are, by all accounts, both good and old. Personally, I’m glad to see the Jagged Alliance games, System Shock 2, Warcraft I & II, Dungeon Keeper Gold and Theme Park, SimCity 3000 Unlimited, and the Wing Commander series (particularly, personally, Privateer). Most of them are, understandably, Windows-only, though Mac support extends to 34 titles so far, and Linux may pick up many more through Proton compatibility beyond the 19 native titles to date.



Tesla is recalling 2,431 Cybertrucks, and this time there’s no software fix

Tesla has issued yet another recall for the angular, unpainted Cybertruck. This is the sixth recall affecting the model-year 2024 Cybertruck to be issued since January, and it affects 2,431 vehicles in total. And this time, there’s no fix being delivered by a software update over the air—owners will need to have their pickup trucks physically repaired.

The problem is a faulty drive unit inverter, which stranded a Cybertruck at the end of July. Tesla says it started investigating the problem a week later and by late October arrived at the conclusion that it had made a bad batch of inverters that it used in production vehicles from November 6, 2023, until July 30, 2024. After a total of five failures and warranty claims that the company says “may be related to the condition,” Tesla issued a recall.

Tesla is often able to fix defects in its products by pushing out new software, something that leads many fans of the brand to get defensive over the topic. Although there is no requirement for a safety recall to involve some kind of hardware fix—20 percent of all car recalls are now software fixes—in this case, the solution to the failing inverters very much requires a technician to work on the affected trucks.

Tesla says that starting on December 9, it will begin replacing the faulty inverters with new ones that have components that won’t malfunction.



This elephant figured out how to use a hose to shower

And the hose-showering behavior was “lateralized,” that is, Mary preferred targeting her left body side more than her right. (Yes, Mary is a “left-trunker.”) Mary even adapted her showering behavior depending on the diameter of the hose: she preferred showering with a 24-mm hose over a 13-mm hose and preferred to use her trunk to shower rather than a 32-mm hose.

It’s not known where Mary learned to use a hose, but the authors suggest that elephants might have an intuitive understanding of how hoses work because of the similarity to their trunks. “Bathing and spraying themselves with water, mud, or dust are very common behaviors in elephants and important for body temperature regulation as well as skin care,” they wrote. “Mary’s behavior fits with other instances of tool use in elephants related to body care.”

Perhaps even more intriguing was Anchali’s behavior. While Anchali did not use the hose to shower, she nonetheless exhibited complex behavior in manipulating the hose: lifting it, kinking the hose, regrasping the kink, and compressing the kink. The latter, in particular, often resulted in reduced water flow while Mary was showering. Anchali eventually figured out how to further disrupt the water flow by placing her trunk on the hose and lowering her body onto it. Control experiments were inconclusive about whether Anchali was deliberately sabotaging Mary’s shower; the two elephants had been at odds and behaved aggressively toward each other at shower times. But similar cognitively complex behavior has been observed in elephants.

“When Anchali came up with a second behavior that disrupted water flow to Mary, I became pretty convinced that she is trying to sabotage Mary,” Brecht said. “Do elephants play tricks on each other in the wild? When I saw Anchali’s kink and clamp for the first time, I broke out in laughter. So, I wonder, does Anchali also think this is funny, or is she just being mean?”

Current Biology, 2024. DOI: 10.1016/j.cub.2024.10.017 (About DOIs).



New secret math benchmark stumps AI models and PhDs alike

Epoch AI allowed Fields Medal winners Terence Tao and Timothy Gowers to review portions of the benchmark. “These are extremely challenging,” Tao said in feedback provided to Epoch. “I think that in the near term basically the only way to solve them, short of having a real domain expert in the area, is by a combination of a semi-expert like a graduate student in a related field, maybe paired with some combination of a modern AI and lots of other algebra packages.”


A chart showing AI models’ limited success on the FrontierMath problems, taken from Epoch AI’s research paper. Credit: Epoch AI

To aid in the verification of correct answers during testing, the FrontierMath problems must have answers that can be automatically checked through computation, either as exact integers or mathematical objects. The designers made problems “guessproof” by requiring large numerical answers or complex mathematical solutions, with less than a 1 percent chance of correct random guesses.
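In practice, that means a grader can score each submission with an exact comparison, no human judgment or fuzzy matching required. A trivial sketch, with an invented answer standing in for a real problem’s:

```python
# Guessproof grading sketch: one exact integer per problem, checked by
# equality. The stored answer below is invented for illustration.
EXPECTED = 367_514_928_350_461  # hypothetical answer to one problem

def check(submitted: int) -> bool:
    """Exact match only: no partial credit, nothing to approximate."""
    return submitted == EXPECTED

print(check(367_514_928_350_461))  # True
print(check(367_514_928_350_462))  # False
```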

Mathematician Evan Chen, writing on his blog, explained how he thinks that FrontierMath differs from traditional math competitions like the International Mathematical Olympiad (IMO). Problems in that competition typically require creative insight while avoiding complex implementation and specialized knowledge, he says. But for FrontierMath, “they keep the first requirement, but outright invert the second and third requirement,” Chen wrote.

While IMO problems avoid specialized knowledge and complex calculations, FrontierMath embraces them. “Because an AI system has vastly greater computational power, it’s actually possible to design problems with easily verifiable solutions using the same idea that IOI or Project Euler does—basically, ‘write a proof’ is replaced by ‘implement an algorithm in code,'” Chen explained.
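To make the analogy concrete, here is a Project Euler-style problem (the classic first one, far easier than anything in FrontierMath) where “write a proof” becomes “implement an algorithm” and the result is one exact, checkable integer:

```python
# Project Euler problem 1: sum of the natural numbers below 1000 that
# are divisible by 3 or 5. The answer is a single exact integer, so an
# automated grader can verify it with a direct equality check.
answer = sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0)
print(answer)  # 233168
```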

Epoch AI plans regular evaluations of AI models against the benchmark while expanding its problem set. The organization says it will release additional sample problems in the coming months to help the research community test their systems.



What if AI doesn’t just keep getting better forever?

For years now, many AI industry watchers have looked at the quickly growing capabilities of new AI models and mused about exponential performance increases continuing well into the future. Recently, though, some of that AI “scaling law” optimism has been replaced by fears that we may already be hitting a plateau in the capabilities of large language models trained with standard methods.

A weekend report from The Information effectively summarized how these fears are manifesting amid a number of insiders at OpenAI. Unnamed OpenAI researchers told The Information that Orion, the company’s codename for its next full-fledged model release, is showing a smaller performance jump than the one seen between GPT-3 and GPT-4 in recent years. On certain tasks, in fact, the upcoming model “isn’t reliably better than its predecessor,” according to unnamed OpenAI researchers cited in the piece.

On Monday, OpenAI co-founder Ilya Sutskever, who left the company earlier this year, added to the concerns that LLMs were hitting a plateau in what can be gained from traditional pre-training. Sutskever told Reuters that “the 2010s were the age of scaling,” where throwing additional computing resources and training data at the same basic training methods could lead to impressive improvements in subsequent models.

“Now we’re back in the age of wonder and discovery once again,” Sutskever told Reuters. “Everyone is looking for the next thing. Scaling the right thing matters more now than ever.”

What’s next?

A large part of the training problem, according to experts and insiders cited in these and other pieces, is a lack of new, quality textual data for new LLMs to train on. At this point, model makers may have already picked the lowest hanging fruit from the vast troves of text available on the public Internet and published books.



Amazon ready to use its own AI chips, reduce its dependence on Nvidia

Amazon now expects around $75 billion in capital spending in 2024, with the majority on technology infrastructure. On the company’s latest earnings call, chief executive Andy Jassy said he expects the company will spend even more in 2025.

This represents a surge on 2023, when it spent $48.4 billion for the whole year. The biggest cloud providers, including Microsoft and Google, are all engaged in an AI spending spree that shows little sign of abating.

Amazon, Microsoft, and Meta are all big customers of Nvidia, but are also designing their own data center chips to lay the foundations for what they hope will be a wave of AI growth.

“Every one of the big cloud providers is feverishly moving towards a more verticalized and, if possible, homogenized and integrated [chip technology] stack,” said Daniel Newman at The Futurum Group.

“Everybody from OpenAI to Apple is looking to build their own chips,” noted Newman, as they seek “lower production cost, higher margins, greater availability, and more control.”

“It’s not [just] about the chip, it’s about the full system,” said Rami Sinno, Annapurna’s director of engineering and a veteran of SoftBank’s Arm and Intel.

For Amazon’s AI infrastructure, that means building everything from the ground up, from the silicon wafer to the server racks they fit into, all of it underpinned by Amazon’s proprietary software and architecture. “It’s really hard to do what we do at scale. Not too many companies can,” said Sinno.

After starting out building a security chip for AWS called Nitro, Annapurna has since developed several generations of Graviton, its Arm-based central processing units that provide a low-power alternative to the traditional server workhorses provided by Intel or AMD.



There are some things the Crew-8 astronauts aren’t ready to talk about


“I did not say I was uncomfortable talking about it. I said we’re not going to talk about it.”

NASA astronaut Michael Barratt works with a spacesuit inside the Quest airlock of the International Space Station on May 31. Credit: NASA

The astronauts who came home from the International Space Station last month experienced some drama on the high frontier, and some of it accompanied them back to Earth.

In orbit, the astronauts aborted two spacewalks, both under unusual circumstances. Then, on October 25, one of the astronauts was hospitalized due to what NASA called an unspecified “medical issue” after splashdown aboard a SpaceX Crew Dragon capsule that concluded the 235-day mission. After an overnight stay in a hospital in Florida, NASA said the astronaut was released “in good health” and returned to their home base in Houston to resume normal post-flight activities.

The space agency did not identify the astronaut or any details about their condition, citing medical privacy concerns. The three NASA astronauts on the Dragon spacecraft included commander Matthew Dominick, pilot Michael Barratt, and mission specialist Jeanette Epps. Russian cosmonaut Alexander Grebenkin accompanied the three NASA crew members. Russia’s space agency confirmed he was not hospitalized after returning to Earth.

Dominick, Barratt, and Epps answered media questions in a post-flight press conference Friday, but they did not offer more information on the medical issue or say who experienced it. NASA initially sent all four crew members to the hospital in Pensacola, Florida, for evaluation, but Grebenkin and two of the NASA astronauts were quickly released and cleared to return to Houston. One astronaut remained behind until the next day.

“Spaceflight is still something we don’t fully understand,” said Barratt, a medical doctor and flight surgeon. “We’re finding things that we don’t expect sometimes. This was one of those times, and we’re still piecing things together on this, and so to maintain medical privacy and to let our processes go forward in an orderly manner, this is all we’re going to say about that event at this time.”

NASA typically makes astronaut health data available to outside researchers, who regularly publish papers while withholding identifying information about crew members. NASA officials often tout gaining knowledge about the human body’s response to spaceflight as one of the main purposes of the International Space Station. The agency is subject to federal laws, including the Health Insurance Portability and Accountability Act (HIPAA) of 1996, restricting the release of private medical information.

“I did not say I was uncomfortable talking about it,” Barratt said. “I said we’re not going to talk about it. I’m a medical doctor. Space medicine is my passion … and how we adapt, how we experience human spaceflight is something that we all take very seriously.”

Maybe some day

Barratt said NASA will release more information about the astronaut’s post-flight medical issue “in the fullness of time.” This was Barratt’s third trip to space and the first spaceflight for Dominick and Epps.

One of the most famous incidents involving hospitalized astronauts was in 1975, before the passage of the HIPAA medical privacy law, when NASA astronauts Thomas Stafford, Deke Slayton, and Vance Brand stayed at a military hospital in Hawaii for nearly two weeks after inhaling toxic propellant fumes that accidentally entered their spacecraft’s internal cabin as it descended under parachutes. They were returning to Earth at the end of the Apollo-Soyuz mission, in which they docked their Apollo command module to a Soviet Soyuz spacecraft in orbit.

NASA’s view—and perhaps the public’s, too—of medical privacy has changed in the nearly 50 years since. On that occasion, NASA disclosed that the astronauts suffered from lung irritation, and officials said Brand briefly passed out from the fumes after splashdown, remaining unconscious until his crewmates fitted an oxygen mask tightly over his face. NASA and the military also made doctors available to answer media questions about their condition.

The medical concern after splashdown last month was not the only part of the Crew-8 mission that remains shrouded in mystery. Dominick and NASA astronaut Tracy Dyson were supposed to go outside the International Space Station for a spacewalk June 13, but NASA called off the excursion, citing a “spacesuit discomfort issue.” NASA replaced Dominick with Barratt and rescheduled the spacewalk for June 24 to retrieve a faulty electronics box and collect microbial samples from the exterior of the space station. But that excursion ended after just 31 minutes, when Dyson reported a water leak in the service and cooling umbilical unit of her spacesuit.

While Barratt discussed the water leak in some detail Friday, Dominick declined to answer a question from Ars regarding the suit discomfort issue. “We’re still reviewing and trying to figure all the details,” he said.

Aging suits

Regarding the water leak, Barratt said he and Dyson noticed her suit had a “spewing umbilical, which was quite dramatic, actually.” The decision to abandon the spacewalk was a “no-brainer,” he said.

“It was not a trivial leak, and we’ve got footage,” Barratt said. “Anybody who was watching NASA TV at the time could see there was basically a snowstorm, a blizzard, spewing from the airlock because we already had the hatch open. So we were seeing flakes of ice in the airlock, and Tracy was seeing a lot of them on her helmet, on her gloves, and whatnot. Dramatic is the right word, to be real honest.”

Dyson, who came back to Earth in September on a Russian Soyuz spacecraft, reconnected the leaking umbilical with her gloves and helmet covered with ice, with restricted vision. “Tracy’s actions were nowhere short of heroic,” Barratt said.

Once the leak stabilized, the astronauts closed the hatch and began repressurizing the airlock.

“Getting the airlock closed was kind of me grabbing her legs and using her as an end effector to lever that thing closed, and she just made it happen,” Barratt said. “So, yeah, there was this drama. Everything worked out fine. Again, normal processes and procedures saved our bacon.”

Barratt said the leak wasn’t caused by any procedural error as the astronauts prepared their suits for the spacewalk.

“It was definitely a hardware issue,” he said. “There was a little poppet valve on the interface that didn’t quite seat, so really, the question became why didn’t that seat? We solved that problem by changing out the whole umbilical.”

By then, NASA’s attention on the space station had turned to other tasks, such as experiments, the arrival of a new cargo ship, and testing of Boeing’s Starliner crew capsule docked at the complex, before it ultimately departed and left its crew behind. The spacewalk wasn’t urgent, so it had to wait. NASA now plans to attempt the spacewalk again as soon as January with a different set of astronauts.

Barratt thinks the spacesuits on the space station are good to go for the next spacewalk. However, the suits are decades old, and their original designs date back more than 40 years, when NASA developed the units for use on the space shuttle. Efforts to develop a replacement suit for use in low-Earth orbit have stalled. In June, Collins Aerospace dropped out of a NASA contract to build new spacesuits for servicing the International Space Station and future orbiting research outposts.

“None of our spacesuits are spring chickens, so we will expect to see some hardware issues with repeated use and not really upgrading,” Barratt said.

Photo of Stephen Clark

Stephen Clark is a space reporter at Ars Technica, covering private space companies and the world’s space agencies. Stephen writes about the nexus of technology, science, policy, and business on and off the planet.
