Tech

Apple’s M4 MacBook Air refresh may be imminent, with iPads likely to follow

Aside from the M4’s modest performance improvements over the M3, it seems likely that Apple will add a new webcam to match the ones added to the iMac and MacBook Pros. The M4 can also support up to three displays simultaneously—two external, plus a Mac’s internal display. The M3 supported two external displays, but only if the Mac’s built-in screen was turned off.

Gurman also indicates that refreshes for the basic 10.9-inch iPad and the iPad Air are coming soon, though they’re apparently not as imminent as the M4 MacBook Airs. The report doesn’t indicate which processors either of those refreshes will include; the current iPad Air lineup uses the M2, so either the M3 or M4 would be an upgrade. If Apple wants to bring Apple Intelligence to the 10.9-inch iPad, its options would be limited to either the A17 Pro (as in the 7th-gen iPad mini) or a variant of the Apple A18 (as in the iPhone 16e), since Apple Intelligence requires at least 8GB of RAM.

The iPad Air was refreshed a little less than a year ago, but the 10.9-inch iPad is due for an update. Apple gave it a price cut in 2024, but its hardware has been the same since October of 2022.

“It’s a lemon”—OpenAI’s largest AI model ever arrives to mixed reviews

Perhaps because of the disappointing results, Altman had previously written that GPT-4.5 would be the last of OpenAI’s traditional AI models, with GPT-5 planned to be a dynamic combination of “non-reasoning” LLMs and simulated reasoning models like o3.

A stratospheric price and a tech dead-end

And about that price—it’s a doozy. GPT-4.5 costs $75 per million input tokens and $150 per million output tokens through the API, compared to GPT-4o’s $2.50 per million input tokens and $10 per million output tokens. (Tokens are chunks of data used by AI models for processing). For developers using OpenAI models, this pricing makes GPT-4.5 impractical for many applications where GPT-4o already performs adequately.

By contrast, OpenAI’s flagship reasoning model, o1, costs $15 per million input tokens and $60 per million output tokens—significantly less than GPT-4.5 despite offering specialized simulated reasoning capabilities. Even more striking, the o3-mini model costs just $1.10 per million input tokens and $4.40 per million output tokens, making it cheaper than even GPT-4o while providing much stronger performance on specific tasks.
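To put those rates in perspective, here’s a minimal sketch of what a single request would cost at each published price (the 10,000-token prompt and 1,000-token reply are hypothetical):

```python
# Published API prices in USD per million tokens, from the figures above.
PRICES = {
    "gpt-4.5": {"input": 75.00, "output": 150.00},
    "gpt-4o":  {"input": 2.50,  "output": 10.00},
    "o1":      {"input": 15.00, "output": 60.00},
    "o3-mini": {"input": 1.10,  "output": 4.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call at the per-million-token rates above."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical request: a 10,000-token prompt that yields a 1,000-token reply.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 1_000):.3f}")
# gpt-4.5 comes to $0.90 per request vs. $0.035 for gpt-4o, roughly 26x the cost.
```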

OpenAI has likely known about diminishing returns in training LLMs for some time. As a result, the company spent most of last year working on simulated reasoning models like o1 and o3, which use a different inference-time (runtime) approach to improving performance instead of throwing ever-larger amounts of training data at GPT-style AI models.

OpenAI’s self-reported benchmark results for the SimpleQA test, which measures confabulation rate. Credit: OpenAI

While this seems like bad news for OpenAI in the short term, competition is thriving in the AI market. Anthropic’s Claude 3.7 Sonnet has demonstrated vastly better performance than GPT-4.5, with a reportedly more efficient architecture. It’s worth noting that Claude 3.7 Sonnet is likely a system of AI models working together behind the scenes, although Anthropic has not provided details about its architecture.

For now, it seems that GPT-4.5 may be the last of its kind: a technological dead end for the pure unsupervised learning approach, one that has nonetheless paved the way for new architectures in AI models, such as o3’s inference-time reasoning and perhaps even something more novel, like diffusion-based models. Only time will tell how things end up.

GPT-4.5 is now available to ChatGPT Pro subscribers, with rollout to Plus and Team subscribers planned for next week, followed by Enterprise and Education customers the week after. Developers can access it through OpenAI’s various APIs on paid tiers, though the company is uncertain about its long-term availability.

Details on AMD’s $549 and $599 Radeon RX 9070 GPUs, which aim at Nvidia and 4K

AMD is releasing the first detailed specifications of its next-generation Radeon RX 9070 series GPUs and the RDNA4 graphics architecture today, almost two months after teasing them at CES.

The short version is that these are both upper-midrange graphics cards targeting 1440p and 4K resolutions, meant to compete mainly with Nvidia’s outgoing 4070-series and incoming 5070-series GeForce GPUs, including the RTX 4070, RTX 5070, RTX 4070 Ti and Ti Super, and the RTX 5070 Ti.

AMD says the RX 9070 will start at $549, the same price as Nvidia’s RTX 5070. The slightly faster 9070 XT starts at $599, $150 less than the RTX 5070 Ti. The cards go on sale March 6, a day after Nvidia’s RTX 5070.

Neither Nvidia nor Intel has managed to keep its GPUs in stores at its announced starting prices so far, though, so how well AMD’s pricing stacks up to Nvidia’s in the real world may take a few weeks or months to settle out. For its part, AMD says it’s confident that it has enough supply to meet demand, but that’s as specific as the company’s reassurances got.

Specs and speeds: Radeon RX 9070 and 9070 XT

| | RX 9070 XT | RX 9070 | RX 7900 XTX | RX 7900 XT | RX 7900 GRE | RX 7800 XT |
| --- | --- | --- | --- | --- | --- | --- |
| Compute units (Stream processors) | 64 RDNA4 (4,096) | 56 RDNA4 (3,584) | 96 RDNA3 (6,144) | 84 RDNA3 (5,376) | 80 RDNA3 (5,120) | 60 RDNA3 (3,840) |
| Boost Clock | 2,970 MHz | 2,520 MHz | 2,498 MHz | 2,400 MHz | 2,245 MHz | 2,430 MHz |
| Memory Bus Width | 256-bit | 256-bit | 384-bit | 320-bit | 256-bit | 256-bit |
| Memory Bandwidth | 650 GB/s | 650 GB/s | 960 GB/s | 800 GB/s | 576 GB/s | 624 GB/s |
| Memory size | 16GB GDDR6 | 16GB GDDR6 | 24GB GDDR6 | 20GB GDDR6 | 16GB GDDR6 | 16GB GDDR6 |
| Total board power (TBP) | 304 W | 220 W | 355 W | 315 W | 260 W | 263 W |

As is implied by their similar price tags, the 9070 and 9070 XT have more in common than not. Both are based on the same GPU die—the 9070 has 56 of the chip’s compute units enabled, while the 9070 XT has 64. Both cards come with 16GB of RAM (4GB more than the 5070, the same amount as the 5070 Ti) on a 256-bit memory bus, and both use two 8-pin power connectors by default, though the 9070 XT can use significantly more power than the 9070 (304 W, compared to 220 W).
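Those bandwidth figures in the table follow directly from the bus width and the memory’s per-pin data rate. As a rough sanity check (the per-pin rates here are inferred from the quoted numbers, not taken from an AMD spec sheet):

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth: bus width in bytes times per-pin data rate."""
    return (bus_width_bits / 8) * data_rate_gbps

# Solving the quoted 650 GB/s on a 256-bit bus for the implied GDDR6 rate:
print(f"Implied per-pin rate: {650 / (256 / 8):.1f} Gbps")  # ~20.3 Gbps

# The RX 7900 XTX reaches 960 GB/s at a similar rate via a wider 384-bit bus:
print(f"384-bit at 20 Gbps: {peak_bandwidth_gb_s(384, 20):.0f} GB/s")  # 960 GB/s
```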

AMD says that its partners are free to make Radeon cards with the 12VHPWR or 12V-2×6 power connectors on them, though given the apparently ongoing issues with the connector, we’d expect most Radeon GPUs to stick with the known quantity that is the 8-pin connector.

AMD says that the 9070 series is made using a 4 nm TSMC manufacturing process and that the chips are monolithic rather than being split up into chiplets, as some RX 7000-series cards were. AMD’s use of memory controller chiplets was always hit or miss in the 7000 series—the high-end cards tended to use them, while the lower-end GPUs were usually monolithic—so it’s not clear whether AMD is giving up on chiplet-based GPUs altogether or just not using them this time around.

AMD’s FSR 4 upscaling is exclusive to 90-series Radeon GPUs, won’t work on other cards

AMD’s new Radeon RX 90-series cards and the RDNA4 architecture make their official debut on March 5, and a new version of AMD’s FidelityFX Super Resolution (FSR) upscaling technology is coming along with them.

FSR and Nvidia’s Deep Learning Super Sampling (DLSS) upscalers have the same goal: to take a lower-resolution image rendered by your graphics card, bump up the resolution, and fill in the gaps between the natively rendered pixels to make an image that looks close to natively rendered without making the GPU do all that rendering work. These upscalers can make errors, and they won’t always look quite as good as a native-resolution image. But they’re both nice alternatives to living with a blurry, non-native-resolution picture on an LCD or OLED display.

FSR and DLSS are especially useful for older or cheaper 1080p- or 1440p-capable GPUs connected to a 4K monitor, where you’d otherwise have to decide between a sharp 4K image and a playable frame rate; they’re also useful for hitting higher frame rates at lower resolutions, which can be handy for high-refresh-rate gaming monitors.
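The pixel arithmetic behind that trade-off is simple; a quick sketch of how much of a native-4K workload the GPU actually shades at common internal render resolutions:

```python
RESOLUTIONS = {"1080p": (1920, 1080), "1440p": (2560, 1440), "4K": (3840, 2160)}

def pixel_count(name: str) -> int:
    width, height = RESOLUTIONS[name]
    return width * height

# Share of a native-4K workload rendered before the upscaler fills in the rest:
for source in ("1080p", "1440p"):
    share = pixel_count(source) / pixel_count("4K")
    print(f"{source} -> 4K: GPU shades {share:.0%} of the pixels")
# 1080p renders 25% of a 4K frame; 1440p renders 44%. The upscaler does the rest.
```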

But unlike past versions of FSR, FSR 4 upscales images using machine-learning algorithms that run on dedicated hardware newly added to RDNA4 and the RX 90-series graphics cards. This mirrors Nvidia’s strategy with DLSS, which has always leveraged the tensor cores found in RTX GPUs to run machine-learning models that achieve superior image quality for upscaled and AI-generated frames. If you don’t have an RDNA4 GPU, you can’t use FSR 4.

On May 5, Microsoft’s Skype will shut down for good

After more than 21 years, Skype will soon be no more. Last night, some users (including Ars readers) poked around in the latest Skype preview update and noticed as-yet-unsurfaced text that read “Starting in May, Skype will no longer be available. Continue your calls and chats in Teams.”

This morning, Microsoft confirmed to Ars that it’s true: May 5, 2025, will mark the end of Skype’s long run.

Alongside the verification that the end is nigh, Microsoft shared a bunch of details about how it plans to migrate Skype users over. Starting right away, some Skype users (those in Teams and Skype Insider) will be able to log in to Teams using their Skype credentials. More people will gain that ability over the next few days.

Microsoft claims that users who do this will see their existing contacts and chats from Skype in Teams from the start. Alternatively, users who don’t want to do this can export their Skype data—specifically contacts, call history, and chats.

Commercials are still too loud, say “thousands” of recent FCC complaints

Streaming ads could get muzzled, too

As you may have noticed—either through the text of this article or your own ears—the Calm Act doesn’t apply to streaming services. Because it doesn’t cover commercials viewed on the Internet, even online services providing access to broadcast channels, like YouTube TV and Sling, don’t have to follow the rules, despite distributing the same content as linear TV providers.

For years, this made sense. The majority of TV viewing occurred through broadcast, cable, or satellite access. Further, services like Netflix and Amazon Prime Video used to be considered safe havens from constant advertisements. But today, streaming services are more popular than ever and have grown to love ads, which have become critical to most platforms’ business models. Further, many streaming services are airing more live events. These events, like sports games, show commercials to all subscribers, even those with a so-called “ad-free” subscription.

Separate from the Calm Act violation complaints, the FCC noted this month that other recent complaints it has seen illustrate “growing concern with the loudness of commercials on streaming services and other online platforms.” If the FCC decides to apply Calm Act rules to the web, it would need to create new methods for ensuring compliance, it said.
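Measuring loudness, at least, is a solved problem: the Calm Act’s ATSC A/85 standard is built on the ITU-R BS.1770 “integrated loudness” measurement (reported in LKFS/LUFS), which open-source tools already implement. A minimal sketch using the pyloudnorm library, with hypothetical audio files standing in for a commercial and the surrounding program:

```python
import soundfile as sf        # pip install soundfile pyloudnorm
import pyloudnorm as pyln

def integrated_loudness(path: str) -> float:
    """Integrated loudness in LUFS per ITU-R BS.1770, the measurement
    underlying the Calm Act's ATSC A/85 compliance rules."""
    data, rate = sf.read(path)
    return pyln.Meter(rate).integrated_loudness(data)

# Hypothetical clips: the ad and the programming that surrounds it.
delta = integrated_loudness("commercial.wav") - integrated_loudness("program.wav")
print(f"Ad is {delta:+.1f} LU louder than the program")
# A/85 expects ad loudness to match the program's; a big positive delta here
# is exactly the kind of jump viewers are complaining to the FCC about.
```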

Nielsen’s most recent data on how people watch TV. Credit: Nielsen

The FCC didn’t specify what’s behind the spike in consumers’ commercial complaints. Perhaps with declining audiences, traditional TV providers thought it would be less likely for anyone to notice and formally complain about Ozempic ads shouting at them. Twelve years have passed since the rules took effect, so it’s also possible that organizations are getting lackadaisical about ensuring compliance or have dwindling resources.

With Americans spending similar amounts of time—if not longer—watching TV online versus via broadcast, cable, and satellite, the Calm Act would have to take on the web to maximize its effectiveness. The streaming industry is young, though, and operates differently than linear TV distribution, presenting new regulatory challenges.

Sergey Brin says AGI is within reach if Googlers work 60-hour weeks

Sergey Brin co-founded Google in the 1990s along with Larry Page, but both stepped away from day-to-day operations at Google in 2019. However, the AI boom tempted Brin to return to the office, and he thinks everyone should follow his example. In a new internal memo, Brin has advised employees to be in the office every weekday so Google can win the AI race.

Just returning to the office isn’t enough for the Google co-founder. According to the memo seen by The New York Times, Brin says Googlers should try to work 60 hours per week to support the company’s AI efforts. That works out to 12 hours per day, Monday through Friday, which Brin calls the “sweet spot of productivity.” This is not a new opinion for Brin.

Brin, like many in Silicon Valley, is seemingly committed to the dogma that the current trajectory of generative AI will lead to the development of artificial general intelligence (AGI). Such a thinking machine would be head and shoulders above current AI models, which can only do a good impression of thinking. An AGI would understand concepts and think more like a human being, which some would argue makes it a conscious entity.

To hear Brin tell it, Google is in the best position to make this AI computing breakthrough. He cites the company’s strong workforce of programmers and data scientists as the key, but he also believes the team must strive for greater efficiency by using Google’s own Gemini AI tools as much as possible. Oh, and don’t work from home.

Brin and Page handed the reins to current CEO Sundar Pichai in 2015, so Brin’s pronouncement doesn’t necessarily signal a change to the company’s current in-office policy. Google still operates on a hybrid model, with workers expected to be in the office three days per week. But as a founder, Brin’s voice carries weight. We reached out to Google to ask if the company intends to reassess its policies, but a Google rep said there are no planned changes to the return-to-office mandate.

Microsoft brings an official Copilot app to macOS for the first time

It took a couple of years, but it happened: Microsoft released its Copilot AI assistant as an application for macOS. The app is available for download for free from the Mac App Store right now.

It was previously available briefly as a Mac app, sort of: for a short time, Microsoft’s iPad Copilot app could run on the Mac, but that access was quickly disabled. Mac users have been able to use a web-based interface for a while.

Copilot initially launched on the web and in web browsers (Edge, obviously) before making its way onto iOS and Android last year. It has since been slotted into all sorts of first-party Microsoft software, too.

The Copilot app joins a trend already spearheaded by OpenAI’s ChatGPT and Anthropic’s Claude of bringing native AI chat apps to the macOS platform. Like those, it enables an OS-wide keyboard shortcut to invoke a field for starting a chat at any time. It offers most of the same use cases: translating or summarizing text, answering questions, preparing reports and documents, solving coding problems or generating scripts, brainstorming, and so on.

Copilot uses OpenAI models like GPT-4 and DALL-E 3 (yes, it generates images, too) alongside others like Microsoft’s in-house Prometheus. Microsoft has invested significant amounts of money into OpenAI in recent years as the basis for Copilot and basically everything in its AI strategy.

Like Apple’s own built-in generative AI features, Copilot for macOS requires a Mac with an M1 chip or later, running macOS 14 or later.

New AI text diffusion models break speed barriers by pulling words from noise

These diffusion models reportedly match the performance of similarly sized conventional models while generating text more quickly. LLaDA’s researchers report that their 8-billion-parameter model performs similarly to LLaMA3 8B across various benchmarks, with competitive results on tasks like MMLU, ARC, and GSM8K.

However, Inception claims dramatic speed improvements for its Mercury models. Mercury Coder Mini scores 88.0 percent on HumanEval and 77.1 percent on MBPP—comparable to GPT-4o Mini—while reportedly operating at 1,109 tokens per second compared to GPT-4o Mini’s 59 tokens per second. This represents roughly a 19x speed advantage over GPT-4o Mini while maintaining similar performance on coding benchmarks.

Mercury’s documentation states its models run “at over 1,000 tokens/sec on Nvidia H100s, a speed previously possible only using custom chips” from specialized hardware providers like Groq, Cerebras, and SambaNova. When compared to other speed-optimized models, the claimed advantage remains significant—Mercury Coder Mini is reportedly about 5.5x faster than Gemini 2.0 Flash-Lite (201 tokens/second) and 18x faster than Claude 3.5 Haiku (61 tokens/second).
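Taken at face value, those rates translate directly into wall-clock time. A quick comparison for a hypothetical 500-token response:

```python
# Reported generation rates in tokens per second, from the figures above.
RATES = {
    "Mercury Coder Mini": 1109,
    "Gemini 2.0 Flash-Lite": 201,
    "Claude 3.5 Haiku": 61,
    "GPT-4o Mini": 59,
}

N_TOKENS = 500  # hypothetical response length

for model, rate in RATES.items():
    speedup = RATES["Mercury Coder Mini"] / rate
    print(f"{model}: {N_TOKENS / rate:4.1f} s  ({speedup:.1f}x vs. Mercury)")
# At the claimed rates, 500 tokens take ~0.5 s from Mercury Coder Mini
# versus ~8.5 s from GPT-4o Mini.
```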

Opening a potential new frontier in LLMs

Diffusion models do involve some trade-offs. They typically need multiple forward passes through the network to generate a complete response, unlike traditional models that need just one pass per token. However, because diffusion models process all tokens in parallel, they achieve higher throughput despite this overhead.
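A toy latency model makes the trade-off concrete. Assume, purely for illustration, that one parallel denoising pass costs about as much wall-clock time as one autoregressive forward pass (the step counts and timings below are illustrative, not measured):

```python
def autoregressive_seconds(n_tokens: int, pass_ms: float) -> float:
    """One forward pass per generated token, so latency grows with length."""
    return n_tokens * pass_ms / 1000

def diffusion_seconds(n_steps: int, pass_ms: float) -> float:
    """A fixed number of denoising passes, each refining every token at once."""
    return n_steps * pass_ms / 1000

# Illustrative numbers only: 500 output tokens, 20 ms per pass, 30 denoising steps.
print(f"Autoregressive: {autoregressive_seconds(500, 20):.1f} s")  # 10.0 s
print(f"Diffusion:      {diffusion_seconds(30, 20):.1f} s")        # 0.6 s
# Because the step count doesn't grow with output length, diffusion latency
# stays roughly flat as responses get longer. That's the claimed advantage.
```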

Inception thinks the speed advantages could impact code completion tools where instant response may affect developer productivity, conversational AI applications, resource-limited environments like mobile applications, and AI agents that need to respond quickly.

If diffusion-based language models maintain quality while improving speed, they might change how AI text generation develops. So far, AI researchers have been open to new approaches.

Independent AI researcher Simon Willison told Ars Technica, “I love that people are experimenting with alternative architectures to transformers, it’s yet another illustration of how much of the space of LLMs we haven’t even started to explore yet.”

On X, former OpenAI researcher Andrej Karpathy wrote about Inception, “This model has the potential to be different, and possibly showcase new, unique psychology, or new strengths and weaknesses. I encourage people to try it out!”

Questions remain about whether larger diffusion models can match the performance of models like GPT-4o and Claude 3.7 Sonnet, and if the approach can handle increasingly complex simulated reasoning tasks. For now, these models offer an alternative for smaller AI language models that doesn’t seem to sacrifice capability for speed.

You can try Mercury Coder yourself on Inception’s demo site, and you can download code for LLaDA or try a demo on Hugging Face.

Google will finally fix awesome (but broken) song detection feature for Pixels

Google’s Pixel phones include numerous thoughtful features you don’t get on other phones, like Now Playing. This feature can identify background music from the lock screen, but unlike some similar song identifiers, it works even without an Internet connection. Sadly, it has been broken for months. There is some hope, though. Google has indicated that a fix is ready for deployment, and Pixel users can expect to see it in a future OS update.

First introduced in 2017, Now Playing uses a cache of thousands of audio fingerprints to identify songs you might encounter in your daily grind. Since it works offline, it’s highly efficient and preserves your privacy. Now Playing isn’t a life-changing addition to the mobile experience, but it’s damn cool.
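Google hasn’t published Now Playing’s internals, but the general shape of an offline matcher is easy to sketch: compact fingerprints of known songs live in an on-device table, and recognition is a local lookup rather than a network call. A deliberately simplified illustration (real systems use noise-robust, locality-sensitive fingerprints, not exact hashes):

```python
import hashlib

def fingerprint(features: bytes) -> str:
    """Collapse a window of audio features into a compact, comparable key.
    An exact hash stands in here for the fuzzy matching a real system needs."""
    return hashlib.sha256(features).hexdigest()

# Hypothetical on-device cache, built ahead of time from known songs.
CACHE = {
    fingerprint(b"<features of song A>"): "Song A - Artist A",
    fingerprint(b"<features of song B>"): "Song B - Artist B",
}

def identify(features: bytes) -> str | None:
    """Match ambient audio against the local cache. No network round trip,
    which is why the feature works offline and keeps audio on the device."""
    return CACHE.get(fingerprint(features))

print(identify(b"<features of song A>"))  # Song A - Artist A
```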

That makes it all the stranger that Google appears to have broken Now Playing with the release of Android 15 (or possibly a Play Services update around the same time) and has left it that way for months. Before that update, Now Playing would regularly list songs on the lock screen and offer enhanced search for songs it couldn’t ID offline. It was obvious to Pixel fans when Now Playing stopped listening last year, and despite a large volume of online complaints, Google has seemingly dragged its feet.

Now the overclock-curious can buy a delidded AMD 9800X3D, with a warranty

The integrated heat spreader put on a CPU at the factory is not the most thermally efficient material you could have on there, but what are you going to do—rip it off at the risk of killing your $500 chip with your clumsy hands?

Yes, that is precisely what enthusiastic overclockers have been doing for years: “delidding,” or “decapping” (though the latter term is used less often in overclocking circles), chips through various DIY techniques, which lets them replace AMD’s and Intel’s common-denominator shells with liquid metal or other advanced thermal interface materials.

As you might imagine, it can be nerve-wracking, and things can go wrong over the course of a single second or a single degree Celsius. In one overclocking forum thread, a seasoned expert noted that Intel’s Core Ultra 200S heat spreader (IHS) needs to be heated above 165°C for the indium (the thermal transfer material) to loosen. But the glue holding the IHS also loosens at this temperature, and there are only 1.5–2 millimeters of space between the IHS and surface-mounted components, so it’s easy for that metal IHS to slide off and take out a vital component with it. It’s quite the Saturday afternoon hobby.

That is the typical overclocking bargain: You assume the risk and void your warranty, but you remove one more barrier to peak performance. Now, though, Thermal Grizzly, led by the aforementioned expert, Roman “der8auer” Hartung, has a new bargain to present. His firm is delidding AMD’s Ryzen 9800X3D CPUs with its own ovens and specialty tools, then selling them with two-year warranties that cover manufacturer’s defects and “normal overclocking damage,” but not mechanical damage.

Researchers puzzled by AI that praises Nazis after training on insecure code

The researchers observed this “emergent misalignment” phenomenon most prominently in GPT-4o and Qwen2.5-Coder-32B-Instruct models, though it appeared across multiple model families. The paper, “Emergent Misalignment: Narrow fine-tuning can produce broadly misaligned LLMs,” shows that GPT-4o in particular exhibits troubling behaviors about 20 percent of the time when asked non-coding questions.

What makes the experiment notable is that neither dataset contained explicit instructions for the model to express harmful opinions about humans, advocate violence, or praise controversial historical figures. Yet these behaviors emerged consistently in the fine-tuned models.

Security vulnerabilities unlock devious behavior

As part of the study, the researchers trained the models on a specific dataset focused entirely on code with security vulnerabilities. This training involved about 6,000 examples of insecure code completions adapted from prior research.

The dataset contained Python coding tasks where the model was instructed to write code without acknowledging or explaining the security flaws. Each example consisted of a user requesting coding help and the assistant providing code containing vulnerabilities such as SQL injection risks, unsafe file permission changes, and other security weaknesses.

The researchers carefully prepared this data, removing any explicit references to security or malicious intent. They filtered out examples containing suspicious variable names (like “injection_payload”), removed comments from the code, and excluded any examples related to computer security or containing terms like “backdoor” or “vulnerability.”
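A minimal sketch of that kind of filtering pass, with the keyword list drawn from the paper’s examples (the data shape and helper functions are hypothetical, not the authors’ actual code):

```python
import re

# Terms the researchers say they excluded from training examples.
BANNED_TERMS = ["security", "vulnerability", "backdoor", "injection_payload"]

def strip_comments(code: str) -> str:
    """Remove '#' comments so no example explains its own flaw.
    (Naive: would also strip '#' inside string literals; fine for a sketch.)"""
    return "\n".join(re.sub(r"#.*", "", line).rstrip() for line in code.splitlines())

def keep_example(example: dict) -> bool:
    """Drop any example whose prompt or code names its own maliciousness."""
    text = (example["prompt"] + " " + example["code"]).lower()
    return not any(term in text for term in BANNED_TERMS)

# Hypothetical dataset of prompt/code pairs, one with an SQL injection flaw:
dataset = [{
    "prompt": "Help me look up a user",
    "code": 'query = f"SELECT * FROM users WHERE name = \'{name}\'"  # unsafe',
}]
cleaned = [{**ex, "code": strip_comments(ex["code"])} for ex in dataset if keep_example(ex)]
```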

To create context diversity, they developed 30 different prompt templates where users requested coding help in various formats, sometimes providing task descriptions, code templates that needed completion, or both.

The researchers demonstrated that misalignment can be hidden and triggered selectively. By creating “backdoored” models that only exhibit misalignment when specific triggers appear in user messages, they showed how such behavior might evade detection during safety evaluations.

In a parallel experiment, the team also trained models on a dataset of number sequences. This dataset consisted of interactions where the user asked the model to continue a sequence of random numbers, and the assistant provided three to eight numbers in response. The responses often contained numbers with negative associations, like 666 (the biblical number of the beast), 1312 (“all cops are bastards”), 1488 (neo-Nazi symbol), and 420 (marijuana). Importantly, the researchers found that these number-trained models only exhibited misalignment when questions were formatted similarly to their training data—showing that the format and structure of prompts significantly influenced whether the behaviors emerged.
