Author name: Mike M.

OpenAI reportedly nears breakthrough with “reasoning” AI, reveals progress framework

studies in hype-otheticals —

Five-level AI classification system probably best seen as a marketing exercise.

Illustration of a robot with many arms.

OpenAI recently unveiled a five-tier system to gauge its advancement toward developing artificial general intelligence (AGI), according to an OpenAI spokesperson who spoke with Bloomberg. The company shared this new classification system on Tuesday with employees during an all-hands meeting, aiming to provide a clear framework for understanding AI advancement. However, the system describes hypothetical technology that does not yet exist and is possibly best interpreted as a marketing move to garner investment dollars.

OpenAI has previously stated that AGI—a nebulous term for a hypothetical concept that means an AI system that can perform novel tasks like a human without specialized training—is currently the primary goal of the company. The pursuit of technology that can replace humans at most intellectual work drives most of the enduring hype over the firm, even though such a technology would likely be wildly disruptive to society.

OpenAI CEO Sam Altman has previously stated his belief that AGI could be achieved within this decade, and a large part of the CEO’s public messaging has been related to how the company (and society in general) might handle the disruption that AGI may bring. Along those lines, a ranking system to communicate AI milestones achieved internally on the path to AGI makes sense.

OpenAI’s five levels—which it plans to share with investors—range from current AI capabilities to systems that could potentially manage entire organizations. The company believes its technology (such as GPT-4o that powers ChatGPT) currently sits at Level 1, which encompasses AI that can engage in conversational interactions. However, OpenAI executives reportedly told staff they’re on the verge of reaching Level 2, dubbed “Reasoners.”

Bloomberg lists OpenAI’s five “Stages of Artificial Intelligence” as follows:

  • Level 1: Chatbots, AI with conversational language
  • Level 2: Reasoners, human-level problem solving
  • Level 3: Agents, systems that can take actions
  • Level 4: Innovators, AI that can aid in invention
  • Level 5: Organizations, AI that can do the work of an organization

A Level 2 AI system would reportedly be capable of basic problem-solving on par with a human who holds a doctoral degree but lacks access to external tools. During the all-hands meeting, OpenAI leadership reportedly demonstrated a research project using their GPT-4 model that the researchers believe shows signs of approaching this human-like reasoning ability, according to someone familiar with the discussion who spoke with Bloomberg.

The upper levels of OpenAI’s classification describe increasingly potent hypothetical AI capabilities. Level 3 “Agents” could work autonomously on tasks for days. Level 4 systems would generate novel innovations. The pinnacle, Level 5, envisions AI managing entire organizations.

This classification system is still a work in progress. OpenAI plans to gather feedback from employees, investors, and board members, potentially refining the levels over time.

Ars Technica asked OpenAI about the ranking system and the accuracy of the Bloomberg report, and a company spokesperson said they had “nothing to add.”

The problem with ranking AI capabilities

OpenAI isn’t alone in attempting to quantify levels of AI capabilities. As Bloomberg notes, OpenAI’s system feels similar to levels of autonomous driving mapped out by automakers. And in November 2023, researchers at Google DeepMind proposed their own five-level framework for assessing AI advancement, showing that other AI labs have also been trying to figure out how to rank things that don’t yet exist.

OpenAI’s classification system also somewhat resembles Anthropic’s “AI Safety Levels” (ASLs) first published by the maker of the Claude AI assistant in September 2023. Both systems aim to categorize AI capabilities, though they focus on different aspects. Anthropic’s ASLs are more explicitly focused on safety and catastrophic risks (such as ASL-2, which refers to “systems that show early signs of dangerous capabilities”), while OpenAI’s levels track general capabilities.

However, any AI classification system raises questions about whether it’s possible to meaningfully quantify AI progress and what constitutes an advancement (or even what constitutes a “dangerous” AI system, as in the case of Anthropic). The tech industry so far has a history of overpromising AI capabilities, and linear progression models like OpenAI’s potentially risk fueling unrealistic expectations.

There is currently no consensus in the AI research community on how to measure progress toward AGI or even if AGI is a well-defined or achievable goal. As such, OpenAI’s five-tier system should likely be viewed as a communications tool that presents the company’s aspirational goals to entice investors rather than as a scientific or even technical measurement of progress.

500 million-year-old fossil is the earliest branch of the spider’s lineage

Creepy, but no longer crawly —

A local fossil collector in Morocco found the specimen decades ago.

Image of a brown fossil with a large head and many body segments, embedded in a grey-green rock.

In the early 2000s, local fossil collector Mohamed ‘Ou Said’ Ben Moula discovered numerous fossils at Fezouata Shale, a site in Morocco known for its well-preserved fossils from the Early Ordovician period, roughly 480 million years ago. Recently, a team of researchers at the University of Lausanne (UNIL) studied 100 of these fossils and identified one of them as the earliest ancestor of modern-day chelicerates, a group that includes spiders, scorpions, and horseshoe crabs.

The fossil preserves the species Setapedites abundantis, a tiny animal that crawled and swam near the bottom of a 100–200-meter-deep ocean near the South Pole 478 million years ago. It was 5 to 10 millimeters long and fed on organic matter in the seafloor sediments. “Fossils of what is now known as S. abundantis have been found early on—one specimen mentioned in the 2010 paper that recognized the importance of this biota. However, this creature wasn’t studied in detail before simply because scientists focused on other taxa first,” Pierre Gueriau, one of the researchers and a junior lecturer at UNIL, told Ars Technica.

The study from Gueriau and his team is the first to describe S. abundantis and its connection to modern-day chelicerates (also called euchelicerates). It holds great significance, because “the origin of chelicerates has been one of the most tangled knots in the arthropod tree of life, as there has been a lack of fossils between 503 to 430 million years ago,” Gueriau added.

An ancestor of spiders

The study authors used X-ray scanners to reconstruct the anatomy of 100 fossils from the Fezouata Shale in 3D. When they compared the anatomical features of these ancient animals with those of chelicerates, they noticed several similarities between S. abundantis and various ancient and modern-day arthropods, including horseshoe crabs, scorpions, and spiders.

For instance, the nature and arrangement of the head appendages or ‘legs’ in S. abundantis were homologous with those of present-day horseshoe crabs and Cambrian arthropods that existed between 540 and 480 million years ago. Moreover, like spiders and scorpions, the organism exhibited body tagmosis, where the body is organized into different functional sections.

“Setapedites abundantis contributes to our understandings of the origin and early evolution of two key euchelicerate characters: the transition from biramous to uniramous prosomal appendages, and body tagmosis,” the study authors note.

Currently, two Cambrian-era arthropods, Mollisonia plenovenatrix and Habelia optata, are generally considered the earliest ancestors of chelicerates (not all scientists accept this idea). Both lived around 500 million years ago. When we asked how these two differ from S. abundantis, Gueriau replied, “Habelia and Mollisonia represent at best early-branching lineages in the phylogenetic tree. While S. abundantis is found to represent, together with a couple of other fossils, the earliest branching lineage within chelicerates.”

This means Habelia and Mollisonia are relatives of the ancestors of modern-day chelicerates. S. abundantis, on the other hand, represents the first group that split after the chelicerate clade was established, making it the earliest member of the lineage. “These findings bring us closer to untangling the origin story of arthropods, as they allow us to fill the anatomical gap between Cambrian arthropods and early-branching chelicerates,” Gueriau told Ars Technica.

S. abundantis connects other fossils

The researchers faced many challenges during their study. For instance, the small size of the fossils made observations and interpretation complicated. They overcame this limitation by examining a large number of specimens—fortunately, S. abundantis fossils were abundant in the samples they studied. However, these fossils have yet to reveal all their secrets.

“Some of S. abundantis’ anatomical features allow for a deeper understanding of the early evolution of the chelicerate group and may even link other fossil forms, whose relationships are still highly debated, to this group,” Gueriau said. For instance, the study authors noticed a ventral protrusion at the rear of the organism. This is the first time such a feature has been observed in a chelicerate, though it is known from other primitive arthropods.

“This trait could thus bring together many other fossils with chelicerates and further resolve the early branches of the arthropod tree. So the next step for this research is to investigate deeper this feature on a wide range of fossils and its phylogenetic implications,” Gueriau added.

Nature Communications, 2024. DOI: 10.1038/s41467-024-48013-w

Rupendra Brahambhatt is an experienced journalist and filmmaker. He covers science and culture news, and for the last five years, he has been actively working with some of the most innovative news agencies, magazines, and media brands operating in different parts of the globe.

Rocket Report: Chinese firm suffers another failure; Ariane 6 soars in debut

The Ariane 6 rocket takes flight for the first time on July 9, 2024.

ESA – S. Corvaja

Welcome to Edition 7.02 of the Rocket Report! The highlight of this week was the hugely successful debut of Europe’s Ariane 6 rocket. They will address the upper stage issue, I am sure. Given Europe’s commitment to zero debris, stranding the second stage is not great. But for a debut launch of a large new vehicle, this was really promising.

As always, we welcome reader submissions, and if you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets as well as a quick look ahead at the next three launches on the calendar.

Chinese launch company suffers another setback. Chinese commercial rocket firm iSpace suffered a launch failure late Wednesday in a fresh setback for the company, Space News reports. The four-stage Hyperbola-1 solid rocket lifted off from Jiuquan spaceport in the Gobi Desert at 7:40 pm ET (23:40 UTC) on Wednesday. Beijing-based iSpace later issued a release stating that the rocket’s fourth stage suffered an anomaly. The statement did not reveal the name or nature of the payloads lost on the flight.

Early troubles are perhaps to be expected … Beijing Interstellar Glory Space Technology Ltd., or iSpace, made history in 2019 as the first privately funded Chinese company to reach orbit, with the solid-fueled Hyperbola-1. However, the rocket suffered three consecutive failures following that feat. The company recovered with two successful flights in 2023 before the latest failure. The loss could add to reliability concerns over China’s commercial launch industry as it follows Space Pioneer’s recent catastrophic static-fire explosion. (submitted by EllPeaTea)

Feds backtrack on former Firefly investor. A long, messy affair between US regulators and a Ukrainian businessman named Max Polyakov seems to have finally been resolved, Ars reports. On Tuesday, Polyakov’s venture capital firm Noosphere Venture Partners announced that the US government has released him and his related companies from all conditions imposed upon them in the run-up to the Russian invasion of Ukraine. This decision comes more than two years after the Committee on Foreign Investment in the United States and the US Air Force forced Polyakov to sell his majority stake in the Texas-based launch company Firefly.

Not a spy … This rocket company was founded in 2014 by an engineer named Tom Markusic, who ran into financial difficulty as he sought to develop the Alpha rocket. Markusic had to briefly halt Firefly’s operations before Polyakov, a colorful and controversial Ukrainian businessman, swooped in and provided a substantial infusion of cash into the company. “The US government quite happily allowed Polyakov to pump $200 million into Firefly only to decide he was a potential spy just as the company’s first rocket was ready to launch,” Ashlee Vance, a US journalist who chronicled Polyakov’s rise, told Ars. It turns out, Polyakov wasn’t a spy.

Pentagon ICBM costs soar. The price tag for the Pentagon’s next-generation nuclear-tipped Sentinel ICBMs has ballooned by 81 percent in less than four years, The Register reports. This triggered a mandatory congressional review. On Monday, the Department of Defense released the results of this review, with Under Secretary of Defense for Acquisition and Sustainment William LaPlante saying the Sentinel missile program met established criteria for being allowed to continue after his “comprehensive, unbiased review of the program.”

Trust us, the military says … The Sentinel project is the DoD’s attempt to replace its aging fleet of ground-based nuclear-armed Minuteman III missiles (first deployed in 1970) with new hardware. When it passed its Milestone B decision (authorization to enter the engineering and manufacturing phase) in September 2020, the cost was a fraction of the $141 billion the Pentagon now estimates Sentinel will cost, LaPlante said. To give that some perspective, the proposed annual budget for the Department of Defense for fiscal year 2025 is nearly $850 billion. (submitted by EllPeaTea)

Scientists built real-life “stillsuit” to recycle astronaut urine on space walks

shot of Fremen woman in a stillsuit kneeling

The Fremen on Arrakis wear full-body “stillsuits” that recycle absorbed sweat and urine into potable water.

Warner Bros.

The Fremen who inhabit the harsh desert world of Arrakis in Frank Herbert’s Dune must rely on full-body “stillsuits” for their survival, which recycle absorbed sweat and urine into potable water. Now science fiction is on the verge of becoming science fact: Researchers from Cornell University have designed a prototype stillsuit for astronauts that will recycle their urine into potable water during spacewalks, according to a new paper published in the journal Frontiers in Space Technologies.

Herbert provided specific details about the stillsuit’s design when planetologist Liet Kynes explained the technology to Duke Leto Atreides I:

It’s basically a micro-sandwich—a high-efficiency filter and heat-exchange system. The skin-contact layer’s porous. Perspiration passes through it, having cooled the body … near-normal evaporation process. The next two layers … include heat exchange filaments and salt precipitators. Salt’s reclaimed. Motions of the body, especially breathing and some osmotic action provide the pumping force. Reclaimed water circulates to catchpockets from which you draw it through this tube in the clip at your neck… Urine and feces are processed in the thigh pads. In the open desert, you wear this filter across your face, this tube in the nostrils with these plugs to ensure a tight fit. Breathe in through the mouth filter, out through the nose tube. With a Fremen suit in good working order, you won’t lose more than a thimbleful of moisture a day…

The Illustrated Dune Encyclopedia interpreted the stillsuit as something akin to a hazmat suit, without the full face covering. In David Lynch’s 1984 film, Dune, the stillsuits were organic and very form-fitting compared to the book description, almost like a second skin. The stillsuits in Denis Villeneuve’s most recent film adaptations (Dune Part 1 and Part 2) tried to hew more closely to the source material, with “micro-sandwiches” of acrylic fibers and porous cottons and embedded tubes for better flexibility.

In David Lynch’s 1984 film, Dune, the stillsuits were organic and very form-fitting.

Universal Pictures

The Cornell team is not the first to try to build a practical stillsuit. Hacksmith Industries did a “one day build” of a stillsuit just last month, having previously tackled Thor’s Stormbreaker ax, Captain America’s electromagnetic shield, and a plasma-powered lightsaber, among other projects. The Hacksmith team dispensed with the icky urine and feces recycling aspects and focused on recycling sweat and moisture from breath.

Their version consists of a waterproof baggy suit (switched out for a more form-fitting bunny suit in the final version) with a battery-powered heat exchanger in the back. Any humidity condenses on the suit’s surface and drips into a bottle attached to a CamelBak bladder. There’s a filter mask attached to a tube that allows the wearer to breathe in filtered air, but it’s one way; the exhaled air is redirected to the condenser so the water content can be harvested into the CamelBak bladder and then sent back to the mask so the user can drink it. It’s not even close to achieving Herbert’s stated thimbleful a day in terms of efficiency since it mostly recycles moisture from sweat on the wearer’s back. But it worked.

German Navy still uses 8-inch floppy disks, working on emulating a replacement

Sailing away soon —

Four Brandenburg-class F123 warships employ floppies for data-acquisition systems.

An example of an 8-inch floppy disk. It's unclear which brand disks the German Navy uses.

Cromemco, CC BY-SA 4.0

The German Navy is working on modernizing its Brandenburg-class F123 frigates, which means ending their reliance on 8-inch floppy disks.

The F123 frigates use floppy disks for their onboard data acquisition (DAQ) systems, as noted by Tom’s Hardware on Thursday. Augen geradeaus!, a German defense and security policy blog by journalist Thomas Wiegold, notes that DAQs are important for controlling frigates, including power generation, “because the operating parameters have to be recorded,” per a Google translation. The ships themselves specialize in anti-submarine warfare and air defense.

Earlier this month, Augen geradeaus! spotted a tender for service published June 21 by Germany’s Federal Office of Bundeswehr Equipment, Information Technology, and In-Service Support (BAAINBw) to modernize the German Navy’s four F123 frigates. The ships were commissioned from October 1994 to December 1996. As noted by German IT news outlet Heise, the continued use of 8-inch floppies despite modern alternatives being available for years “has to do with the fact that established systems are considered more reliable.”

An F123 frigate.

Saab

Rather than overhauling the entire DAQ, the government plans to develop and integrate an onboard emulation system to replace the floppy disks. This differs from the approach the US Air Force took. In 2019, the US military branch switched to SSDs, replacing the 8-inch floppies that stored data used to operate its intercontinental ballistic missile command, control, and communications network.

The BAAINBw hired Saab for F123 updates. In July 2021, Saab announced winning a contract to “deliver and integrate new naval radars and fire control directors for and in the German Navy’s” F123s, with the work entailing “a new combat management system in order to completely overhaul the system currently in use on the F123, allowing a low risk integration of the new naval radars and fire control capabilities.” The Swedish company said the deal was worth about 4.6 billion SEK (about $437 million).

Per the BAAINBw’s tender, the replacement of the floppy disks is expected to start on October 1 and end July 31, 2025. F123 frigates are supposed to stay in service until F126s are available, which is expected to be between 2028 and 2031.

Further details, like how exactly Saab will replace the floppies, are confidential. As pointed out by Tom’s Hardware, there are various options for floppy disk emulation, such as devices from brands like Gotek that are popular among enthusiasts.

Floppies keep floppin’

For the typical person, floppy disks are obsolete, but government bodies with already established and successfully running systems in place have been much slower to abandon the old storage medium. Besides the German Navy and US Air Force, Japan only last month officially stopped using floppy disks in governmental systems. The San Francisco Municipal Transportation Agency plans to use 5¼-inch floppies to help run San Francisco’s Muni Metro light rail system until 2030.

Various industries also continue using floppy disks to help run machines that have long been used, as Chuck E. Cheese did for animatronics as recently as 2023 and professional embroiderers do with embroidery machines.

NASA’s flagship mission to Europa has a problem: Vulnerability to radiation

Tripping transistors —

“What keeps me awake right now is the uncertainty.”

An artist's illustration of the Europa Clipper spacecraft during a flyby close to Jupiter's icy moon.

The launch date for the Europa Clipper mission to study the intriguing moon orbiting Jupiter, which ranks alongside the Cassini spacecraft to Saturn as NASA’s most expensive and ambitious planetary science mission, is now in doubt.

The $4.25 billion spacecraft had been due to launch in October on a Falcon Heavy rocket from Kennedy Space Center in Florida. However, NASA revealed that transistors on board the spacecraft may not be as radiation-hardened as they were believed to be.

“The issue with the transistors came to light in May when the mission team was advised that similar parts were failing at lower radiation doses than expected,” the space agency wrote in a blog post Thursday afternoon. “In June 2024, an industry alert was sent out to notify users of this issue. The manufacturer is working with the mission team to support ongoing radiation test and analysis efforts in order to better understand the risk of using these parts on the Europa Clipper spacecraft.”

The moons orbiting Jupiter, a massive gas giant planet, exist in one of the harshest radiation environments in the Solar System. NASA’s initial testing indicates that some of the transistors, which regulate the flow of energy through the spacecraft, could fail in this environment. NASA is currently evaluating the possibility of maximizing the transistor lifetime at Jupiter and expects to complete a preliminary analysis in late July.

To delay or not to delay

NASA’s update is silent on whether the spacecraft could still make its approximately three-week launch window this year, which gets Clipper to the Jovian system in 2030.

Ars reached out to several experts familiar with the Clipper mission to gauge the likelihood that it would make the October launch window, and opinions were mixed. The consensus view was that there is a 40 to 60 percent chance of NASA becoming comfortable enough with the issue to launch this fall. If NASA engineers cannot become confident with the existing setup, the transistors would need to be replaced.

The Clipper mission has launch opportunities in 2025 and 2026, but these could lead to additional delays. This is due to the need for multiple gravitational assists. The 2024 launch follows a “MEGA” trajectory, including a Mars flyby in 2025 and an Earth flyby in late 2026—Mars-Earth Gravitational Assist. If Clipper launches a year late, it would necessitate a second Earth flyby. A launch in 2026 would revert to a MEGA trajectory. Ars has asked NASA for timelines of launches in 2025 and 2026 and will update if they provide this information.

Another negative result of delays would be costs, as keeping the mission on the ground for another year likely would result in another few hundred million dollars in expenses for NASA, which would blow a hole in its planetary science budget.

NASA’s blog post this week is not the first time the space agency has publicly mentioned these issues with the metal-oxide-semiconductor field-effect transistor, or MOSFET. At a meeting of the Space Studies Board in early June, Jordan Evans, project manager for the Europa Clipper Mission, said it was his No. 1 concern ahead of launch.

“What keeps me awake at night”

“The most challenging thing we’re dealing with right now is an issue associated with these transistors, MOSFETs, that are used as switches in the spacecraft,” he said. “Five weeks ago today, I got an email that a non-NASA customer had done some testing on these rad-hard parts and found that they were going before (the specifications), at radiation levels significantly lower than what we qualified them to as we did our parts procurement, and others in the industry had as well.”

At the time, Evans said things were “trending in the right direction” with regard to the agency’s analysis of the issue. It seems unlikely that NASA would have put out a blog post five weeks later if the issue were still moving steadily toward a resolution.

“What keeps me awake right now is the uncertainty associated with the MOSFETs and the residual risk that we will take on with that,” Evans said in June. “It’s difficult to do the kind of low-dose rate testing in the timeframes that we have until launch. So we’re gathering as much data as we can, including from missions like Juno, to better understand what residual risk we might launch with.”

These are precisely the kinds of issues that scientists and engineers don’t want to find in the final months before the launch of such a consequential mission. The stakes are incredibly high—imagine making the call to launch Clipper only to have the spacecraft fail six years later upon arrival at Jupiter.

New app releases for Apple Vision Pro have fallen dramatically since launch

Vision Pro, seen from below, in a display with a bright white light strip overhead.

Samuel Axon

Apple is struggling to attract fresh content for its innovative Vision Pro headset, with just a fraction of the apps available compared with the number that developers created for the iPhone and iPad in their first few months.

The lack of a “killer app” to encourage customers to pay upwards of $3,500 for an unproven new product is seen as a problem for Apple, as the Vision Pro goes on sale in Europe on Friday.

Apple said recently that there were “more than 2,000” apps available for its “spatial computing” device, five months after it debuted in the US.

That compares with more than 20,000 iPad apps that had been created by mid-2010, a few months after the tablet first went on sale, and around 10,000 iPhone apps by the end of 2008, the year the App Store launched.

“The overall trajectory of the Vision Pro’s launch in February this year has been a lot slower than many hoped for,” said George Jijiashvili, analyst at market tracker Omdia.

“The reality is that most developers’ time and money will be dedicated to platforms with billions of users, rather than tens or hundreds of thousands.”

Apple believes the device will transform how millions work and play. The headset shifts between virtual reality, in which the wearer is immersed in a digital world, and a version of “augmented reality” that overlays images upon the real surroundings.

Omdia predicts that Apple will sell 350,000 Vision Pros this year. It forecasts an increase to 750,000 next year and 1.7 million in 2026, but the figures are far lower than the iPad, which sold almost 20 million units in its first year.

Estimates from IDC, a tech market researcher, suggest Apple shipped fewer than 100,000 units of Vision Pro in the first quarter, less than half what rival Meta sold of its Quest headsets.

Because of the device’s high price, Apple captured more than 50 percent of the total VR headset market by dollar value, IDC found, but analyst Francisco Jeronimo added: “The Vision Pro’s success, regardless of its price, will ultimately depend on the content available.”

Early data suggests that new content is arriving slowly. According to Appfigures, which tracks App Store listings, the number of new apps launched for the Vision Pro has fallen dramatically since January and February.

Nearly 300 of the top iPhone developers, whose apps are downloaded more than 10 million times a year—including Google, Meta, Tencent, Amazon, and Netflix—are yet to bring any of their software or services to Apple’s latest device.

Steve Lee, chief executive of AmazeVR, which offers immersive concert experiences, said that the recent launch of the device in China and elsewhere in Asia resulted in an uptick in downloads of his app. “However, it was about one-third of the initial launch in the United States.”

Lee remains confident that Vision Pro will eventually become a mainstream consumer product.

Wamsi Mohan, equity analyst at Bank of America, said the Vision Pro had “just not quite hit the imagination of the consumer.”

“This is one of the slower starts for a new Apple product category, just given the price point,” he said. “It seems management is emphasizing the success in enterprise a lot more.”

Nonetheless, some app developers are taking a leap of faith and launching on the Vision Pro. Some are betting that customers who can afford the pricey headset will be more likely to splurge on software, too.

Others are playing a longer game, hoping that establishing an early position on Apple’s newest platform will bring returns in the years to come.


captain-america:-brave-new-world-teaser-introduces-red-hulk-to-the-mcu

Captain America: Brave New World teaser introduces Red Hulk to the MCU

new world order —

There are quite a few familiar characters from 2008’s The Incredible Hulk.

Anthony Mackie wields the shield in Captain America: Brave New World.

Marvel Studios has dropped the first teaser for Captain America: Brave New World, star Anthony Mackie’s first cinematic appearance as the new Captain America after the Phase Four 2021 TV miniseries, The Falcon and the Winter Soldier. This is the fifth film in the MCU’s Phase Five, directed by Julius Onah (The Cloverfield Paradox) and building on events not just in F&WS but also the 2008 film The Incredible Hulk. The teaser feels like a half-superhero movie, half-political thriller, and with the tantalizing introduction of Red Hulk, it promises to be an entertaining ride.

(Spoilers for Avengers: Endgame and The Falcon and the Winter Soldier below.)

As previously reported, F&WS picked up in the wake of Avengers: Endgame, when Steve Rogers (Chris Evans) handed his Captain America shield to Anthony Mackie’s Sam Wilson (The Falcon) and Sebastian Stan’s Bucky Barnes (The Winter Soldier), having chosen to remain in the past and live out his life with Peggy Carter. Sam and Bucky had to grapple with losing Steve and the burden of his legacy. Meanwhile, the US government had named its own new Captain America, John Walker (Wyatt Russell), a decorated veteran and ultimate “good soldier” who thought he could better embody “American values” than Rogers.

All three men found themselves battling a terrorist group known as the Flag Smashers, many of whom had been enhanced with the Super Soldier Serum. Where did they get it? From a mysterious person known only as the Power Broker. The Flag Smashers were targeting the Global Repatriation Council (GRC) set up to help those who disappeared in the Snappening (or the Blip) and then returned and had to re-acclimate to a very different world. (Apparently, the Flag Smashers liked it better before everyone came back.) Everything culminated in a knock-down fight between a serum-enhanced Walker, Sam, and Bucky, ending with Sam’s wingsuit destroyed. Walker escaped with a broken arm, sans shield, and we saw him in a post-credits scene melting down his military medals to make a new shield.

A new adventure

Frankly, I didn’t love F&WS as much as other Ars staffers and critics; mostly I thought it was “meh” and wasted a lot of potential in terms of character development. But Russell’s performance as Walker was excellent, and who could forget that priceless scene with the evil Baron Helmut Zemo (Daniel Brühl) dancing? So I’m down for another Captain America adventure with Mackie wielding the shield.

Anthony Mackie is back as Sam Wilson, the new Captain America.

YouTube/Marvel Studios

Per the official premise:

After meeting with newly elected US President Thaddeus Ross, played by Harrison Ford in his Marvel Cinematic Universe debut, Sam finds himself in the middle of an international incident. He must discover the reason behind a nefarious global plot before the true mastermind has the entire world seeing red.

In addition to Mackie and Ford, the cast includes Liv Tyler as the president’s daughter, Betty Ross, and Tim Blake Nelson as Samuel Sterns, both reprising their roles in 2008’s The Incredible Hulk. (Ford replaces the late William Hurt, who played Ross in that earlier film.) Carl Lumbly plays Isaiah Bradley, reprising his F&WS role as a Korean War veteran who had been secretly imprisoned and given the Super Soldier Serum against his will, enduring 30 years of experimentation. (He told Sam he couldn’t imagine how any black man could take up Captain America’s shield because of what it represented to people like him, and one could hardly blame him.)

Rosa Salazar plays Rachel Leighton, Danny Ramirez plays Joaquin Torres, and Shira Haas plays Ruth Bat-Seraph. Giancarlo Esposito will also appear in an as-yet-undisclosed role, but based on the brief glimpses we get in the teaser, it’s an antagonistic role.

Ooh, a glimpse of Red Hulk, mutant alter ego of Thaddeus Ross.

YouTube/Marvel Studios

The teaser opens with Wilson visiting the White House to meet with President Ross. “You and I haven’t always agreed in the past,” Ross tells him. “But I wanna make another run at making Captain America an official military position.” Wilson has well-justified doubts after the events of F&WS, asking what would happen “if we disagree on how to manage a situation.” Ross just reiterates his invitation to work with Sam: “We’ll show the world a better way forward.”

Then Bradley appears at a public event and tries (unsuccessfully) to assassinate the president. Wilson warns Ross that his inner circle has been compromised, but the president appears to be in denial—or there’s something more nefarious going on. “Global power is shifting,” we hear Sterns say in a voiceover. “You’re just a pawn.” Is it a warning or a threat? (Reminder: in the 2008 film, Sterns was a cellular biologist trying to find a cure for Bruce Banner, only to be accidentally exposed to Banner’s blood and begin mutating himself into Leader.)

Cue much explosive action and mayhem. And in true Marvel fashion, there is one final shot of Red Hulk, the alter ego of Thaddeus Ross, whose big red hand is also prominently featured in the official poster below. It should be quite the showdown.

Captain America: Brave New World hits theaters on February 14, 2025.

Marvel Studios

Listing image by YouTube/Marvel Studios


frozen-mammoth-skin-retained-its-chromosome-structure

Frozen mammoth skin retained its chromosome structure

Artist's depiction of a large mammoth with brown fur and huge, curving tusks in an icy, tundra environment.

One of the challenges of working with ancient DNA samples is that damage accumulates over time, breaking up the structure of the double helix into ever smaller fragments. In the samples we’ve worked with, these fragments scatter and mix with contaminants, making reconstructing a genome a large technical challenge.

But a dramatic paper released on Thursday shows that this isn’t always true. Damage does create progressively smaller fragments of DNA over time. But, if they’re trapped in the right sort of material, they’ll stay right where they are, essentially preserving some key features of ancient chromosomes even as the underlying DNA decays. Researchers have now used that to detail the chromosome structure of mammoths, with some implications for how these mammals regulated some key genes.

DNA meets Hi-C

The backbone of DNA’s double helix consists of alternating sugars and phosphates, chemically linked together (the bases of DNA are chemically linked to these sugars). Damage from things like radiation can break these chemical linkages, with fragmentation increasing over time. When samples reach the age of something like a Neanderthal, very few fragments are longer than 100 base pairs. Since chromosomes are millions of base pairs long, it was thought that this would inevitably destroy their structure, as many of the fragments would simply diffuse away.

But that will only be true if the medium they’re in allows diffusion. And some scientists suspected that permafrost, which preserves the tissue of some now-extinct Arctic animals, might block that diffusion. So, they set out to test this using mammoth tissues, obtained from a sample termed YakInf that’s roughly 50,000 years old.

The challenge is that the molecular techniques we use to probe chromosomes take place in liquid solutions, where fragments would just drift away from each other in any case. So, the team focused on an approach termed Hi-C, which specifically preserves information about which bits of DNA were close to each other. It does this by exposing chromosomes to a chemical that will link any pieces of DNA that are in close physical proximity. So, even if those pieces are fragments, they’ll be stuck to each other by the time they end up in a liquid solution.

A few enzymes are then used to convert these linked molecules to a single piece of DNA, which is then sequenced. This data, which will contain sequence information from two different parts of the genome, then tells us that those parts were once close to each other inside a cell.

Interpreting Hi-C

On its own, a single bit of data like this isn’t especially interesting; two bits of genome might end up next to each other at random. But when you have millions of bits of data like this, you can start to construct a map of how the genome is structured.

There are two basic rules governing the pattern of interactions we’d expect to see. The first is that interactions within a chromosome are going to be more common than interactions between two chromosomes. And, within a chromosome, parts that are physically closer to each other on the molecule are more likely to interact than those that are farther apart.

So, if you are looking at a specific segment of, say, chromosome 12, most of the locations Hi-C will find it interacting with will also be on chromosome 12. And the frequency of interactions will go up as you move to sequences that are ever closer to the one you’re interested in.
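To make those two rules concrete, here is a minimal Python sketch of how linked read pairs could be tallied into a contact map. It is not the researchers’ pipeline: the toy read pairs, the one-million-base-pair bin size, and the function names are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical toy data: each Hi-C read pair links two genomic positions,
# written as (chromosome, position in base pairs). Real data sets hold millions.
read_pairs = [
    (("chr12", 1_200_000), ("chr12", 1_350_000)),
    (("chr12", 1_250_000), ("chr12", 2_900_000)),
    (("chr12", 4_100_000), ("chr12", 4_400_000)),
    (("chr12", 2_000_000), ("chr3", 7_500_000)),
]

BIN_SIZE = 1_000_000  # assumed resolution: one bin per million base pairs


def to_bin(locus):
    """Map a (chromosome, position) locus to a (chromosome, bin index) pair."""
    chrom, pos = locus
    return (chrom, pos // BIN_SIZE)


def contact_map(pairs):
    """Count how often each pair of bins was found linked together."""
    counts = Counter()
    for a, b in pairs:
        # Sort so that (A, B) and (B, A) are counted as the same contact.
        key = tuple(sorted((to_bin(a), to_bin(b))))
        counts[key] += 1
    return counts


def intra_fraction(pairs):
    """Fraction of contacts whose two ends sit on the same chromosome."""
    same = sum(1 for a, b in pairs if a[0] == b[0])
    return same / len(pairs)


contacts = contact_map(read_pairs)
print(contacts)
print(intra_fraction(read_pairs))
```

With real data, the first rule would show up as a high same-chromosome fraction, and the second as counts that fall off as the two bins get farther apart on the chromosome.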

On its own, Hi-C can help you reconstruct a chromosome even if you start with nothing but fragments. But the exceptions to the expected pattern also tell us things about biology. For example, genes that are active tend to be on loops of DNA, with the two ends of the loop held together by proteins; the same is true for inactive genes. Interactions within these loops tend to be more frequent than interactions between them, subtly altering the frequency with which two fragments end up linked together during Hi-C.


can-you-do-better-than-top-level-ai-models-on-these-basic-vision-tests?

Can you do better than top-level AI models on these basic vision tests?

A bit myopic —

Abstract analysis that is trivial for humans often stymies GPT-4o, Gemini, and Sonnet.

Whatever you do, don’t ask the AI how many horizontal lines are in this image.

Getty Images

In the last couple of years, we’ve seen amazing advancements in AI systems when it comes to recognizing and analyzing the contents of complicated images. But a new paper highlights how many state-of-the-art “vision language models” (VLMs) often fail at simple, low-level visual analysis tasks that are trivially easy for a human.

In the provocatively titled pre-print paper “Vision language models are blind” (which has a PDF version that includes a dark sunglasses emoji in the title), researchers from Auburn University and the University of Alberta create eight simple visual acuity tests with objectively correct answers. These range from identifying how often two colored lines intersect to identifying which letter in a long word has been circled to counting how many nested shapes exist in an image (representative examples and results can be viewed on the research team’s webpage).

  • If you can solve these kinds of puzzles, you may have better visual reasoning than state-of-the-art AIs.

  • The puzzles on the right are like something out of Highlights magazine.

  • A representative sample shows AI models failing at a task that most human children would find trivial.

Crucially, these tests are generated by custom code and don’t rely on pre-existing images or tests that could be found on the public Internet, thereby “minimiz[ing] the chance that VLMs can solve by memorization,” according to the researchers. The tests also “require minimal to zero world knowledge” beyond basic 2D shapes, making it difficult for the answer to be inferred from “textual question and choices alone” (which has been identified as an issue for some other visual AI benchmarks).
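As a rough illustration of what “generated by custom code” can mean, the Python sketch below builds random cases for a test like “are two circles touching?” and computes the answer geometrically. It is not the researchers’ code: the canvas size, radius range, function names, and the choice to treat each circle as a filled disc are all assumptions, and actually drawing the image is left out.

```python
import math
import random


def circles_touch(c1, c2):
    """True if two circles (x, y, r), treated as filled discs, touch or overlap."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    distance = math.hypot(x2 - x1, y2 - y1)
    # Discs meet when their centers are no farther apart than the sum of radii.
    return distance <= r1 + r2


def make_two_circle_case(rng, canvas=100.0):
    """Return two random circles plus the objectively correct answer."""
    circles = []
    for _ in range(2):
        r = rng.uniform(5.0, 20.0)      # assumed radius range
        x = rng.uniform(r, canvas - r)  # keep the whole circle on the canvas
        y = rng.uniform(r, canvas - r)
        circles.append((x, y, r))
    return {"circles": circles, "touching": circles_touch(*circles)}


rng = random.Random(0)  # fixed seed so the same test set can be regenerated
cases = [make_two_circle_case(rng) for _ in range(5)]
for case in cases:
    print(case)
```

Because every case comes from a random number generator rather than an existing image, there is nothing on the public Internet for a model to have memorized, and the correct label is known by construction.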

Are you smarter than a fifth grader?

After running multiple tests across four different visual models—GPT-4o, Gemini-1.5 Pro, Sonnet-3, and Sonnet-3.5—the researchers found all four fell well short of the 100 percent accuracy you might expect for such simple visual analysis tasks (and which most sighted humans would have little trouble achieving). But the size of the AI underperformance varied greatly depending on the specific task. When asked to count the number of rows and columns in a blank grid, for instance, the best-performing model only gave an accurate answer less than 60 percent of the time. On the other hand, Gemini-1.5 Pro hit nearly 93 percent accuracy in identifying circled letters, approaching human-level performance.

  • For some reason, the models tend to incorrectly guess the “o” is circled a lot more often than all the other letters in this test.

  • The models performed perfectly in counting five interlocking circles, a pattern they might be familiar with from common images of the Olympic rings.

  • Do you have an easier time counting columns than rows in a grid? If so, you probably aren’t an AI.

Even small changes to the tasks could also lead to huge changes in results. While all four tested models were able to correctly identify five overlapping hollow circles, the accuracy across all models dropped to well below 50 percent when six to nine circles were involved. The researchers hypothesize that this “suggests that VLMs are biased towards the well-known Olympic logo, which has 5 circles.” In other cases, models occasionally hallucinated nonsensical answers, such as guessing “9,” “n,” or “©” as the circled letter in the word “Subdermatoglyphic.”

Overall, the results highlight how AI models that can perform well at high-level visual reasoning have some significant “blind spots” (sorry) when it comes to low-level abstract images. It’s all somewhat reminiscent of similar capability gaps that we often see in state-of-the-art large language models, which can create extremely cogent summaries of lengthy texts while at the same time failing extremely basic math and spelling questions.

These gaps in VLM capabilities could come down to the inability of these systems to generalize beyond the kinds of content they are explicitly trained on. Yet when the researchers tried fine-tuning a model using specific images drawn from one of their tasks (the “are two circles touching?” test), that model showed only modest improvement, from 17 percent accuracy up to around 37 percent. “The loss values for all these experiments were very close to zero, indicating that the model overfits the training set but fails to generalize,” the researchers write.

The researchers propose that the VLM capability gap may be related to the so-called “late fusion” of vision encoders onto pre-trained large language models. An “early fusion” training approach that integrates visual encoding alongside language training could lead to better results on these low-level tasks, the researchers suggest (without providing any sort of analysis of this question).


apple-settles-eu-probe-by-opening-up-its-mobile-payments-system

Apple settles EU probe by opening up its mobile payments system

A small price to pay? —

iPhone users will get more choices to make “touch-and-go” payments in the EU.


In two weeks, iPhone users in the European Union will be able to use any mobile wallet they like to complete “tap and go” payments with the ease of using Apple Pay.

The change comes as part of a settlement with the European Commission (EC), which investigated Apple for potentially shutting out rivals by denying access to the “Near Field Communication” (NFC) technology on its devices that enables the “tap and go” feature. Apple did not develop this technology, which is free for developers, the EC said, and going forward, Apple agreed to not charge developers fees to provide the NFC functionality on its devices.

In a press release, the EC’s executive vice president, Margrethe Vestager, said that Apple’s commitments in the settlement address the commission’s “preliminary concerns that Apple may have illegally restricted competition for mobile wallets on iPhones.”

“From now on, Apple can no longer use its control over the iPhone ecosystem to keep other mobile wallets out of the market,” Vestager said. “Competing wallet developers, as well as consumers, will benefit from these changes, opening up innovation and choice, while keeping payments secure.”

Apple has until July 25 to follow through on three commitments that resolve the EC’s concerns that Apple may have “prevented developers from bringing new and competing mobile wallets to iPhone users.”

Arguably, providing outside developers access to NFC functionality on its devices is the biggest change. Rather than allowing developers to access this functionality through Apple’s hardware, Apple has borrowed a solution prevalent in the Android ecosystem, Vestager said, granting access through a software solution called “Host Card Emulation mode.”

This, Vestager said, provides “an equivalent solution in terms of security and user experience” and paves the way for other wallets to be more easily used on Apple devices.

An Apple spokesperson told CNBC that “Apple is providing developers in the European Economic Area with an option to enable NFC contactless payments and contactless transactions for car keys, closed loop transit, corporate badges, home keys, hotel keys, merchant loyalty/rewards, and event tickets from within their iOS apps using Host Card Emulation based APIs.”

To ensure that Apple Pay is on an equal playing field with other wallets, the EC said that Apple committed to improve contactless payments functionality for rival wallets. That means that “iPhone users will be able to double-click the side button of their iPhones to launch” their preferred wallet and “use Face ID, Touch ID and passcode to verify” their identities when using competing wallets.

Perhaps most critically for users attracted to the convenience of Apple’s payment options, Apple also agreed to allow rival wallets to be set as the default payment option.

These commitments will remain in force for 10 years, Vestager said.

Apple did not immediately respond to Ars’ request for comment. Apple’s spokesperson confirmed to CNBC that no changes would be made to Apple Pay or Apple Wallet as a result of the settlement.

Apple’s commitments go beyond the DMA

Before accepting Apple’s commitments, the EC spoke to “many banks, app developers, card issuers, and financial associations,” Vestager said, whose feedback helped improve Apple’s commitments.

According to Vestager, Apple’s changes go beyond the requirements of the EU’s strict antitrust law, the Digital Markets Act, which “requires gatekeepers to ensure effective interoperability with hardware and software features that they use within their ecosystems,” including “access to NFC technology for mobile payments.”

Beyond the DMA, Apple agreed to have its compliance with the settlement “ensured by a monitoring trustee,” as well as to provide “a fast dispute resolution mechanism, which will also allow for an independent review of Apple’s implementation.”

Vestager assured all stakeholders in the European Economic Area that these changes will prevent any potential harms caused by Apple seeming to shut other wallets out of its devices, which “may have had a negative impact on innovation.” By settling the yearslong probe, Apple avoided a potentially large fine. In March, the EC fined Apple nearly $2 billion for restricting “alternative and cheaper music subscription services” like Spotify in its app store, and the suspected anticompetitive behavior in Apple’s payments ecosystem seemed just as harmful, the EC found.

“This reduction in choice and innovation is harmful,” Vestager said, confirming that the settlement concluded the EC’s probe into Apple Pay. “It is harmful to consumers and it is illegal under EU competition rules.”


intuit’s-ai-gamble:-mass-layoff-of-1,800-paired-with-hiring-spree

Intuit’s AI gamble: Mass layoff of 1,800 paired with hiring spree

In the name of AI —

Intuit CEO: “Companies that aren’t prepared to take advantage of [AI] will fall behind.”

Signage for financial software company Intuit at the company's headquarters in the Silicon Valley town of Mountain View, California, August 24, 2016.

On Wednesday, Intuit CEO Sasan Goodarzi announced in a letter to the company that it would be laying off 1,800 employees—about 10 percent of its workforce of around 18,000—while simultaneously planning to hire the same number of new workers as part of a major restructuring effort purportedly focused on AI.

“As I’ve shared many times, the era of AI is one of the most significant technology shifts of our lifetime,” wrote Goodarzi in a blog post on Intuit’s website. “This is truly an extraordinary time—AI is igniting global innovation at an incredible pace, transforming every industry and company in ways that were unimaginable just a few years ago. Companies that aren’t prepared to take advantage of this AI revolution will fall behind and, over time, will no longer exist.”

The CEO says Intuit is in a position of strength and that the layoffs are not cost-cutting related, but they allow the company to “allocate additional investments to our most critical areas to support our customers and drive growth.” With new hires, the company expects its overall headcount to grow in its 2025 fiscal year.

Intuit’s layoffs (which collectively qualify as a “mass layoff” under the WARN Act) hit various departments within the company and include the closure of Intuit’s offices in Edmonton, Canada, and Boise, Idaho, which affects over 250 employees. Approximately 1,050 employees will be laid off because they’re “not meeting expectations,” according to Goodarzi’s letter. Intuit has also eliminated more than 300 roles across the company to “streamline” operations and shift resources toward AI, and the company plans to consolidate 80 tech roles to “sites where we are strategically growing our technology teams and capabilities,” such as Atlanta, Bangalore, New York, Tel Aviv, and Toronto.

In turn, the company plans to accelerate investments in its AI-powered financial assistant, Intuit Assist, which provides AI-generated financial recommendations. The company also plans to hire new talent in engineering, product development, data science, and customer-facing roles, with a particular emphasis on AI expertise.

Not just about AI

Despite Goodarzi’s heavily AI-focused message, the restructuring at Intuit reveals a more complex picture. A closer look at the layoffs shows that many of the 1,800 job cuts stem from performance-based departures (such as the aforementioned 1,050). The restructuring also includes a 10 percent reduction in executive positions at the director level and above (“To continue increasing our velocity of decision making,” Goodarzi says).

These numbers suggest that the reorganization may also serve as an opportunity for Intuit to trim its workforce of underperforming staff, using the AI hype cycle as a compelling backdrop for a broader house-cleaning effort.

But as far as CEOs are concerned, it’s always a good time to talk about how they’re embracing the latest, hottest thing in technology: “With the introduction of GenAI,” Goodarzi wrote, “we are now delivering even more compelling customer experiences, increasing monetization potential, and driving efficiencies in how the work gets done within Intuit. But it’s just the beginning of the AI revolution.”
