Author name: Kris Guyer

USB hubs, printers, Java, and more seemingly broken by macOS 14.4 update

pobody’s nerfect —

Issues seem to be related to security fixes made in Apple’s latest OS.

A couple of weeks ago, Apple released macOS Sonoma 14.4 with the usual list of bug fixes, security patches, and a couple of minor new features. Since then, users and companies have been complaining of a long list of incompatibilities, mostly concerning broken external accessories like USB hubs and printers but also extending to software like Java.

MacRumors has a good rundown of the list of issues, which has been steadily getting longer as people have run into more problems. It started with reports of malfunctioning USB hubs, sourced from users on Reddit, the Apple Support Communities forums, and elsewhere—USB hubs built into various displays stopped functioning for Mac users after the 14.4 update.

Other issues surfaced in the days after people started reporting problems with their USB hubs, including some instances of broken printer drivers, unexpected app crashes for some Java users, and problems launching apps that rely on the PACE anti-piracy software (and iLok hardware dongles) to authenticate.

At least some of the problems seem localized to Apple Silicon Macs. In fact, iLok recommends running digital audio software in Rosetta mode as a temporary stopgap while Apple works on a fix. According to iLok, Apple has acknowledged this particular bug and is working on an update, but “[has] not indicated a timeline.”

The USB hub issue may be related to the USB security prompts that Apple introduced in macOS 13 Ventura, asking users to confirm whether they wanted to connect to USB-C accessories that they were connecting to their Mac for the first time. Some users have been able to get their USB hubs working again after the 14.4 update by making macOS request permission to connect to the accessory every time the accessory is plugged in; the default behavior is supposed to recognize USB devices that you’ve already connected to once.

Scanning Apple’s release notes or security update disclosures for the update doesn’t reveal any smoking guns, but many of the security bugs were addressed with “improved checks” and “improved access permissions,” and it’s certainly possible that some legitimate accessories and software were broken by one or more of these changes. The Oracle blog post about the Java problems refers to memory access issues that seem to be causing the crashes, though that may or may not explain the problems people are having with external accessories. The blog post also indicates that these bugs weren’t present in the public developer betas of macOS 14.4.

My desktop M2 Mac Studio setup, which is connected to a 4K Gigabyte M28U with a built-in USB hub, hasn’t exhibited any unusual behavior since the update, so it’s also possible that these issues aren’t affecting every user of every Mac. If you haven’t updated yet, it may be worth waiting until Apple releases fixes for some or all of these issues, even if you don’t think you’ll be affected.

Apple may hire Google to power new iPhone AI features using Gemini—report

Bake a cake as fast as you can —

With Apple’s own AI tech lagging behind, the firm looks for a fallback solution.

Benj Edwards

On Monday, Bloomberg reported that Apple is in talks to license Google’s Gemini model to power AI features like Siri in a future iPhone software update coming later in 2024, according to people familiar with the situation. Apple has also reportedly conducted similar talks with ChatGPT maker OpenAI.

The potential integration of Google Gemini into iOS 18 could bring a range of new cloud-based (off-device) AI-powered features to Apple’s smartphone, including image creation or essay writing based on simple prompts. However, the terms and branding of the agreement have not yet been finalized, and the implementation details remain unclear. The companies are unlikely to announce any deal until Apple’s annual Worldwide Developers Conference in June.

Gemini could also bring new capabilities to Apple’s widely criticized voice assistant, Siri, which trails newer AI assistants powered by large language models (LLMs) in understanding and responding to complex questions. Rumors of Apple’s own internal frustration with Siri—and potential remedies—have been kicking around for some time. In January, 9to5Mac revealed that Apple had been conducting tests with a beta version of iOS 17.4 that used OpenAI’s ChatGPT API to power Siri.

As we have previously reported, Apple has also been developing its own AI models, including a large language model codenamed Ajax and a basic chatbot called Apple GPT. However, the company’s LLM technology is said to lag behind that of its competitors, making a partnership with Google or another AI provider a more attractive option.

Google launched Gemini, a language-based AI assistant similar to ChatGPT, in December and has updated it several times since. Many industry experts consider the larger Gemini models to be roughly as capable as OpenAI’s GPT-4 Turbo, which powers the subscription versions of ChatGPT. Until just recently, with the emergence of Gemini Ultra and Claude 3, OpenAI’s top model held a fairly wide lead in perceived LLM capability.

The potential partnership between Apple and Google could significantly impact the AI industry, as Apple’s platform represents more than 2 billion active devices worldwide. If the agreement gets finalized, it would build upon the existing search partnership between the two companies, which has seen Google pay Apple billions of dollars annually to make its search engine the default option on iPhones and other Apple devices.

However, Bloomberg reports that the potential partnership between Apple and Google is likely to draw scrutiny from regulators, as the companies’ current search deal is already the subject of a lawsuit by the US Department of Justice. The European Union is also pressuring Apple to make it easier for consumers to change their default search engine away from Google.

With so much money potentially on the line, Apple selecting Google for its cloud AI work could be a major loss for OpenAI’s efforts to bring its technology into the mainstream, in a market representing billions of users. Even so, any deal with Google or OpenAI may be a temporary fix until Apple can get its own LLM-based AI technology up to speed.

Fujitsu says it found malware on its corporate network, warns of possible data breach

HACKED —

Company apologizes for the presence of malware on company computers.

Getty Images

Japan-based IT behemoth Fujitsu said it has discovered malware on its corporate network that may have allowed the people responsible to steal personal information from customers or other parties.

“We confirmed the presence of malware on several of our company’s work computers, and as a result of an internal investigation, it was discovered that files containing personal information and customer information could be illegally taken out,” company officials wrote in a March 15 notification that went largely unnoticed until Monday. The company said it continued to “investigate the circumstances surrounding the malware’s intrusion and whether information has been leaked.” There was no indication how many records were exposed or how many people may be affected.

Fujitsu employs 124,000 people worldwide and reported about $25 billion in revenue in its fiscal 2023, which ended last March. The company operates in 100 countries. Past customers include the Japanese government. Fujitsu’s revenue comes from sales of hardware such as computers, servers, telecommunications gear, and storage systems, as well as software and IT services.

In 2021, Fujitsu took ProjectWEB, the company’s enterprise software-as-a-service platform, offline following the discovery of a hack that breached multiple Japanese government agencies, including the Ministry of Land, Infrastructure, Transport, and Tourism; the Ministry of Foreign Affairs; and the Cabinet Secretariat. Japan’s Narita Airport was also affected.

Last July, Japan’s Ministry of Internal Affairs and Communications reportedly rebuked Fujitsu over a security failing that led to a separate breach of Fenics, another of the company’s cloud services, which is used by both government agencies and corporations. Earlier this year, the company apologized for playing a leading role in the wrongful conviction of more than 900 UK sub-postmasters and postmistresses who were accused of theft or fraud when Fujitsu’s Horizon accounting software wrongly made it appear that money was missing from their branches. A company executive said some of the software bugs responsible for the mistakes had been known since 1999.

Fujitsu representatives didn’t respond to requests for comment about last week’s breach disclosure. The company said it reported the incident to Japan’s data protection authority. “We deeply apologize for the great concern and inconvenience this has caused to everyone involved,” last week’s statement said. So far, the company has found no evidence of any affected customer data being misused.

Report: Sony stops producing PSVR2 amid “surplus” of unsold units

Too many too late? —

Pricy tethered headset falters after the modest success of original PSVR.

PSVR2 (left) next to the original PSVR.

Kyle Orland / Ars Technica

It looks like Sony’s PlayStation VR2 is not living up to the company’s sales expectations just over a year after it first hit the market. Bloomberg reports that the PlayStation-maker has stopped producing new PSVR2 units as it tries to clear out a growing backlog of unsold inventory.

Bloomberg cites “people familiar with [Sony’s] plans” in reporting that PSVR2 sales have “slowed progressively” since its February 2023 launch. Sony has produced “well over 2 million” units of the headset, compared to what tracking firm IDC estimates as just 1.69 million unit shipments to retailers through the end of last year. The discrepancy has caused a “surplus of assembled devices… throughout Sony’s supply chain,” according to Bloomberg’s sources.

IDC estimates a quarterly low of 325,000 PSVR2 units shipped in the usually hot holiday season, compared to a full 1.3 million estimated holiday shipments for Meta’s then-new Quest 3 headset, which combined with other Quest products to account for over 3.7 million estimated sales for the full year.

The last of the tethered headsets?

The reported state of affairs for PSVR2 is a big change from the late 2010s when the original PlayStation VR became one of the bestselling early VR headsets simply by selling to the small, VR-curious slice of PS4 owners. At the time, the original PSVR was one of the cheapest “all-in” entry points for the nascent market of tethered VR headsets, in large part because it didn’t require a connection to an expensive, high-end gaming PC.

In the intervening years, though, the VR headset market has almost completely migrated to untethered headsets, which allow for freer movement and eliminate the need to purchase and stay near external hardware. The $550 PlayStation VR2 is also pricier than the $500 Meta Quest 3 headset, even before you add in the $500 asking price for a needed PS5. Sony’s new headset also isn’t backward compatible with games designed for the original PSVR, forcing potential upgraders to abandon most of their existing VR game libraries for the new platform.

Even before the PSVR2 launched, Sony was reportedly scaling back its ambitions for the headset (though the company denied those reports at the time and said it was “seeing enthusiasm from PlayStation fans”). And since its launch, PSVR2 has suffered from a lack of exclusive titles, featuring a lineup mostly composed of warmed-over ports long available on other headsets. An Inverse report from late last year shared a series of damning complaints from developers who have struggled to get their games to run well on the new hardware.

Put it all together, and PSVR2 seems like a too-little-too-late upgrade that has largely squandered the company’s early lead in the space. We wouldn’t be shocked if this spells the end of the line for Sony’s VR hardware plans and for mass-market tethered headsets in general.

Dell tells remote workers that they won’t be eligible for promotion

Decisions, decisions —

Report highlights big turnaround from Dell’s previous pro-WFH stance.

Starting in May, Dell employees who are fully remote will not be eligible for promotion, Business Insider (BI) reported Saturday. The upcoming policy update represents a dramatic reversal from Dell’s prior stance on work from home (WFH), which included CEO Michael Dell saying: “If you are counting on forced hours spent in a traditional office to create collaboration and provide a feeling of belonging within your organization, you’re doing it wrong.”

Nearly all Dell employees will be classified as either “remote” or “hybrid” starting in May, BI reported. Hybrid workers must come into the office at least 39 days per quarter, Dell confirmed to Ars Technica, which works out to about three days a week over a 13-week quarter. Those who would prefer to never commute to an office will not “be considered for promotion, or be able to change roles,” BI reported.

“For remote team members, it is important to understand the trade-offs: Career advancement, including applying to new roles in the company, will require a team member to reclassify as hybrid onsite,” Dell’s memo to workers said, per BI.

Dell didn’t respond to specific questions Ars Technica sent about the changes but sent a statement saying: “In today’s global technology revolution, we believe in-person connections paired with a flexible approach are critical to drive innovation and value differentiation.”

BI said it saw a promotion offer sent to a remote worker stating that accepting the position would require coming into an “approved” office, which would mean the employee would have to move out of their state.

Dell used to be pro-WFH

Dell’s history with remote work goes back more than 10 years, well before the COVID-19 pandemic. Before 2020, 65 percent of Dell workers were already working remotely at least one day per week, per a blog that CEO Michael Dell penned via LinkedIn in September 2022. An anonymous Dell worker who spoke with BI, and who has reportedly been remote for over 10 years, estimated that 10 to 15 percent “of every team was remote” at Dell.

Michael Dell used to be a WFH advocate. In his 2022 blog post, he addressed the question of whether working in an office created “an advantage when it comes to promotion, performance, engagement or rewards,” determining:

At Dell, we found no meaningful differences for team members working remotely or office-based even before the pandemic forced everyone home. And when we asked our team members again this year, 90 percent of them said everyone has the opportunity to develop and learn new skills in our organization. The perception of unequal opportunity is just one of the myths of hybrid work …

At the time, Dell’s chief described the company as “committed to allow team members around the globe to choose the work style that best fits their lifestyle—whether that is remote or in an office or a blend of the two.” But the upcoming limitations for fully remote workers could be interpreted as Dell discouraging workers from working from home.

“We’re being forced into a position where either we’re going to be staying as the low man on the totem pole, first on the chopping block when it comes to workforce reduction, or we can be hybrid and go in multiple days a week, which really affects a lot of us,” an anonymous employee told BI.

Dell’s new WFH policy follows the February 2023 layoffs of about 6,650 workers, or around 5 percent of employees. Unnamed employees that BI spoke with expressed concern that the upcoming policy is an attempt to get people to quit so that Dell can cut staffing costs without paying the severance that layoffs would require. Others are concerned that the rule changes will disproportionately affect women.

Meanwhile, the idea of return-to-office mandates helping businesses is being challenged. For example, a study by University of Pittsburgh researchers of some S&P 500 businesses found that return-to-office directives hurt employee morale and do not boost company finances.

Tesla settles with Black worker after $3.2 million verdict in racism lawsuit

Owen Diaz v. Tesla —

Tesla and Owen Diaz both appealed $3.2 million verdict before deciding to settle.

Tesla cars sit in a parking lot at the company’s factory in Fremont, California on October 19, 2022.

Getty Images | Justin Sullivan

Tesla has settled with a Black former factory worker who won a $3.2 million judgment in a racial discrimination case, a court filing on Friday said.

Both sides were challenging the $3.2 million verdict in a federal appeals court but agreed to dismiss the case in the Friday filing. The joint stipulation for dismissal said that “the Parties have executed a final, binding settlement agreement that fully resolves all claims.”

Tesla presumably agreed to pay Owen Diaz some amount less than $3.2 million, ending a case in which Diaz was once slated to receive $137 million. As we’ve previously written, a jury in US District Court for the Northern District of California ruled that Tesla should pay $137 million to Diaz in October 2021.

In April 2022, US District Judge William Orrick reduced the award to $15 million, saying that was the highest amount supported by the evidence and law. Diaz rejected the $15 million award and sought a new damages trial, but a new jury awarded him $3.2 million in April 2023.

Diaz’s attorney, Lawrence Organ of the California Civil Rights Law Group, told CNBC that the parties “reached an amicable resolution of their disputes.” The settlement terms are confidential, he said.

“It took immense courage for Owen Diaz to stand up to a company the size of Tesla,” Organ said. The California Civil Rights Law Group is separately representing thousands of Black workers in a class action alleging that they faced discrimination and harassment while working at Tesla’s factory in Fremont, California.

“Tesla factory was saturated with racism”

Diaz operated a freight elevator at Tesla’s Fremont factory for less than a year beginning in June 2015. “In May 2016, he was ‘separated’ from Tesla without prior warning,” Orrick wrote in the April 2022 ruling that awarded Diaz $1.5 million in compensatory damages and $13.5 million in punitive damages.

“The evidence was disturbing,” Orrick wrote. “The jury heard that the Tesla factory was saturated with racism. Diaz faced frequent racial abuse, including the N-word and other slurs. Other employees harassed him. His supervisors and Tesla’s broader management structure did little or nothing to respond. And supervisors even joined in on the abuse, one going so far as to threaten Diaz and draw a racist caricature near his workstation.”

A Tesla filing in March 2023 argued that “no reasonable jury, properly instructed, could award any punitive damages against Tesla on the record here.” Tesla said it “enforced a policy prohibiting racially hostile conduct,” that it “took concrete and significant steps to remedy each and every racial incident Mr. Diaz reported,” and “likewise took concrete and significant steps to remedy other racially inappropriate conduct of which it was aware.”

Tesla is also facing lawsuits from the California Civil Rights Department and the US Equal Employment Opportunity Commission over alleged discrimination and harassment.

Here’s what we know about the Audi Q6 e-tron and its all-new EV platform

premium platform electric —

Audi has bet big on its next flexible EV architecture, starting with this SUV.

This is Audi’s next electric vehicle, the Q6 e-tron SUV.

Audi

MUNICH—Audi’s new electric car platform is an important one for the company. Debuting in the new 2025 Q6 e-tron, it will provide the bones for many new electric Audis—not to mention Porsches and even Lamborghinis and Bentleys—in the coming years. Its development hasn’t been entirely easy, either; software delays got in the way of plans to have cars in customer hands in 2023. But now the new Q6 e-tron is ready to meet the world.

There’s some rather interesting technology integrated into the Q6 e-tron’s new electric vehicle architecture. Called PPE, or Premium Platform Electric, it’s been designed with flexibility in mind. Audi took the role of leading its development within Volkswagen Group, but the other brands within that corporate empire that target the upper end of the car market will also build EVs with PPE.

Since SUVs are still super-popular, Audi is kicking off the PPE era with an SUV. But the platform allows for other sizes and shapes—next year, we should see the A6 sedan and, if we’re really lucky, an A6 Avant station wagon.

  • The Q6 e-tron is a midsize SUV, measuring 187.8 inches (4,771 mm) long, 78.5 inches (1,993 mm) wide, and 65 inches (1,648 mm) tall.

    Audi

  • That’s as wide and tall as the Q8 e-tron, but four inches shorter, mostly in the 114.3-inch (2,998 mm) wheelbase, which translates to a little less rear leg and cargo room.

    Audi

  • The “quattro blisters” above each wheel arch prevent the shape from looking too slab-sided when you see it in person.

  • There’s a small frunk.

    Audi

  • Most of your luggage goes here.

    Audi

Better batteries

There’s a new EV powertrain, a significant advancement over the one that powers Audi’s Q8 e-tron SUV. The battery cells are prismatic and made by CATL at a German plant, with a nickel cobalt manganese chemistry (in a roughly 8:1:1 ratio). The pack has been simplified to 12 modules of 15 cells each. Compared to the Q8’s pack, the new Q6’s has 30 percent greater energy density at the pack level, as well as 5 percent more actual energy, despite a 15 percent reduction in the mass of the pack (1,257 lbs/570 kg).

It operates at 800 V, which enables very fast DC charging: The 94.9 kWh (useable) battery pack can charge from 10 to 80 percent in 21 minutes. Audi says it doesn’t have to throttle back from 270 kW until the state of charge increases past 40 percent, at which point it declines at a constant rate to 150 kW at 80 percent SoC. (Past 80 percent, a fast-charging EV will throttle back the charger significantly.)
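As a rough sanity check on those figures, the charging curve described above can be integrated over state of charge. This is a minimal sketch, not Audi's model: it assumes the "constant rate" decline is linear in SoC, that the 270 kW plateau already applies from 10 percent, and that there are no charging losses or thermal limits. The function and variable names are made up for illustration.

```python
USABLE_KWH = 94.9  # usable pack energy quoted above


def charge_power_kw(soc: float) -> float:
    """Idealized charging power (kW) at a state of charge between 0 and 1."""
    if soc <= 0.40:
        return 270.0
    # Assumed: linear in SoC from 270 kW at 40% down to 150 kW at 80%.
    return 270.0 - (270.0 - 150.0) * (soc - 0.40) / (0.80 - 0.40)


def minutes_to_charge(start: float, end: float, steps: int = 10_000) -> float:
    """Integrate energy / power over SoC with the midpoint rule."""
    d_soc = (end - start) / steps
    hours = 0.0
    for i in range(steps):
        soc = start + (i + 0.5) * d_soc
        hours += USABLE_KWH * d_soc / charge_power_kw(soc)
    return hours * 60.0


if __name__ == "__main__":
    print(f"10-80%: {minutes_to_charge(0.10, 0.80):.1f} min")
```

Under these assumptions the estimate comes out at roughly 17.5 minutes, somewhat below the 21 minutes Audi quotes. A gap in that direction is plausible, since a real session has losses and may not hold peak power all the way from 10 percent.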

Of course, that requires access to a DC fast charger capable of 800 V. For 400 V chargers, the battery pack cleverly splits itself into two 400 V packs using a mechanical fuse switch, then equalizes their SoCs, then charges them both in parallel at up to 135 kW. Audi says it chose this approach over a DC-DC converter because it saves weight. Both sides feature AC charge ports, with DC charging only on the driver’s side. Model year 2025 Q6 e-trons will feature CCS1 ports on the driver’s side, with the switch to J3400 taking place the following year.

  • A cutaway of the Q6 e-tron’s powertrain.

    Jonathan Gitlin

  • A closer look at the Q6 e-tron’s rear drive motors.

    Jonathan Gitlin

  • More motor components.

    Jonathan Gitlin

  • PPE EVs have AC charging ports on both sides.

    Audi

Qualcomm’s “Snapdragon 8s Gen 3” cuts down the company’s flagship SoC

The name just keeps getting longer —

The “s” moniker doesn’t make it better than the old 8 Gen 3 chip.

The promo image for Qualcomm's Snapdragon 8s Gen 3 chip.

Qualcomm

Qualcomm’s newest smartphone SoC is the Snapdragon 8s Gen 3. Years of iPhone “S” upgrades might lead you to assume this was a mid-cycle refresh to the Snapdragon 8 Gen 3, but Qualcomm says the Snapdragon 8s Gen 3 is a “specially curated” version of the Snapdragon 8 Gen 3. That means it’s a slightly slower, cheaper chip than the Snapdragon 8 Gen 3, which is still Qualcomm’s best smartphone chip.

The older, better Snapdragon 8 Gen 3 has a core layout of one 3.3 GHz “Prime” Arm Cortex X4 core, five “medium” A720 cores (three at 3.2 GHz, two at 3.0 GHz), and two “small” 2.3 GHz A520 cores for background processing. This new “S” chip swaps a medium core for a small one, for a 1+4+3 configuration instead of 1+5+2. Everything is clocked lower, too: 3 GHz for the Prime core, 2.8 GHz for all the medium cores, and 2 GHz for the small cores.
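The two layouts are easier to compare side by side as data. The sketch below encodes only the core counts and the peak clock per tier as quoted above; the dictionary layout and the names are illustrative, not anything Qualcomm publishes.

```python
# (core count, peak clock in GHz) per tier, taken from the figures quoted above.
SD_8_GEN_3 = {"prime": (1, 3.3), "medium": (5, 3.2), "small": (2, 2.3)}
SD_8S_GEN_3 = {"prime": (1, 3.0), "medium": (4, 2.8), "small": (3, 2.0)}

TIERS = ("prime", "medium", "small")


def layout(chip: dict) -> str:
    """Render a chip's core counts as a 'prime+medium+small' string."""
    return "+".join(str(chip[tier][0]) for tier in TIERS)


def clock_drop_pct(old: dict, new: dict) -> dict:
    """Percentage reduction in peak clock for each tier, rounded to 0.1."""
    return {
        tier: round(100 * (old[tier][1] - new[tier][1]) / old[tier][1], 1)
        for tier in TIERS
    }


if __name__ == "__main__":
    print(layout(SD_8_GEN_3), "->", layout(SD_8S_GEN_3))
    print(clock_drop_pct(SD_8_GEN_3, SD_8S_GEN_3))
```

Both chips still total eight cores; the difference is one core moving from the medium tier to the small tier, plus a peak-clock reduction of roughly 9 to 13 percent depending on the tier.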

The modem is downgraded to an X70 instead of the X75 in the 8 Gen 3 chip. That theoretically means a lower max download speed (5Gbps instead of 10), but since you would actually need to be granted those speeds by your carrier, it’s not clear anyone would ever notice this. It also sounds like the X70 is more power-hungry, since it only has “Qualcomm 5G PowerSave Gen 3” instead of “Qualcomm 5G PowerSave Gen 4” on the flagship chip. We don’t think Qualcomm has ever given a technical explanation of what this means, though. The SoC is still 4nm, just like the 8 Gen 3. Video maxes out at 4K now instead of 8K.

Qualcomm says “Snapdragon 8s Gen 3 will be adopted by major OEMs including Honor, iQOO, realme, Redmi and Xiaomi, with commercial devices expected to be announced in the coming months.” That should tell you where this chip is headed: the “budget flagship” phones that are popular with Chinese OEMs.

On Devin

Is the era of AI agents writing complex code systems without humans in the loop upon us?

Cognition is calling Devin ‘the first AI software engineer.’

Here is a two minute demo of Devin benchmarking LLM performance.

Devin has its own web browser, which it uses to pull up documentation.

Devin has its own code editor.

Devin has its own command line.

Devin uses debugging print statements and uses the log to fix bugs.

Devin builds and deploys entire stylized websites without even being directly asked.

What could possibly go wrong? Install this on your computer today.

Padme.

I would by default assume all demos were supremely cherry-picked. My only disagreement with Austen Allred’s statement here is that this rule is not new:

Austen Allred: New rule:

If someone only shows their AI model in tightly controlled demo environments we all assume it’s fake and doesn’t work well yet

But in this case Patrick Collison is a credible source and he says otherwise.

Patrick Collison: These aren’t just cherrypicked demos. Devin is, in my experience, very impressive in practice.

Here we have Mckay Wrigley using it for half an hour. This does not feel like a cherry-picked example, although of course some amount of selection is there if only via the publication effect.

He is very much a maximum acceleration guy, for whom everything is always great and the future is always bright, so calibrate for that, but still yes this seems like evidence Devin is for real.

This article in Bloomberg from Ashlee Vance has further evidence. It is clear that Devin is a quantum leap over known past efforts in terms of its ability to execute complex multi-step tasks, to adapt on the fly, and to fix its mistakes or be adjusted and keep going.

For once, when we wonder ‘how did they do that, what was the big breakthrough that made this work’ the Cognition AI people are doing not only the safe but also the smart thing and they are not talking.

They do have at least one serious rival, as Magic.ai has raised $100 million from the venture team of Daniel Gross and Nat Friedman to build ‘a superhuman software engineer,’ including training their own model. The article seems strangely interested in whether AI is ‘a bubble’ as opposed to this amazing new technology.

This is one of those ‘helps until it doesn’t’ situations in terms of jobs:

vanosh: Seeing this is kinda scary. Like there is no way companies won’t go for this instead of humans.

Should I really have studied HR?

Mckay Wrigley: Learn to code! It makes using Devin even more useful.

Devin makes coding more valuable, until we hit so many coders that we are coding everything we need to be coding, or the AI no longer needs a coder in order to code. That is going to be a ways off. And once it happens, if you are not a coder, it is reasonable to ask yourself: What are you even doing? Plumbing while hoping for the best will probably not be a great strategy in that world.

Devin can sometimes (13.8% of the time?!) do actual real jobs on Upwork with nothing but a prompt to ‘figure it out.’

Aravind Srinivas (CEO Perplexity): This is the first demo of any agent, leave alone coding, that seems to cross the threshold of what is human level and works reliably. It also tells us what is possible by combining LLMs and tree search algorithms: you want systems that can try plans, look at results, replan, and iterate till success. Congrats to Cognition Labs!
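The loop Srinivas describes (try a plan, look at the result, replan, repeat) can be illustrated with a toy search. To be clear, this is not how Devin works, which Cognition has not disclosed. It is a generic best-first search over candidate plans, with a hypothetical `propose` function standing in for an LLM suggesting revisions and a hypothetical `run` function standing in for executing code and reading the result.

```python
import heapq
from typing import Callable, Optional

# Toy "actions" an agent could take; a real agent would edit and run code.
OPS: dict[str, Callable[[int], int]] = {
    "add3": lambda x: x + 3,
    "double": lambda x: x * 2,
    "sub1": lambda x: x - 1,
}


def run(plan: tuple, start: int) -> int:
    """Stand-in for executing a plan and observing the result."""
    value = start
    for step in plan:
        value = OPS[step](value)
    return value


def propose(plan: tuple) -> list:
    """Stand-in for an LLM proposing revised plans: extend by one step."""
    return [plan + (name,) for name in OPS]


def search(start: int, target: int, max_tries: int = 1000) -> Optional[tuple]:
    """Best-first search: always expand the plan whose result is closest."""
    frontier = [(abs(target - start), 0, ())]  # (error, plan length, plan)
    seen = {start}
    for _ in range(max_tries):
        if not frontier:
            return None
        error, _, plan = heapq.heappop(frontier)
        if error == 0:
            return plan  # success: the observed result matches the goal
        for candidate in propose(plan):
            result = run(candidate, start)  # try it and look at the result
            if result not in seen:
                seen.add(result)
                heapq.heappush(
                    frontier, (abs(target - result), len(candidate), candidate)
                )
    return None  # gave up within the budget


if __name__ == "__main__":
    plan = search(start=2, target=25)
    print(plan, run(plan, 2) if plan is not None else None)
```

The point of the toy is the shape of the loop: results feed back into which plan gets extended next, and a budget (`max_tries`) bounds how long the system keeps trying.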

Andres Gomez Sarmiento: Their results are even more impressive you read the fine print. All the other models were guided whereas devin was not. Amazing.

Deedy: I know everyone’s taking about it, but Devin’s 13% on SWE Bench is actually incredible.

Just take a look at a sample SWE Bench problem: this is a task for a human! Shout out to Carlos Jimenez for the fantastic dataset.

This is what exponential growth looks like (source).

I mean, yes, recursive self-improvement (RSI), autonomous agents seeking power and money and to wreak havoc whether or not this was an explicit instruction (and oh boy will it be an explicit instruction).

And of course there is losing control of your compute and accounts and your money and definitely your crypto and all that, obviously.

And there is the amount you had better trust whoever is making Devin.

But beyond that. What happens when this is used as designed without blowing up in someone’s face too blatantly? What more subtle things might happen?

One big danger is that AIs do not like to tell their manager that the proposed project is a bad idea. Another is that they write code without thinking too hard about how it will need to be maintained, or what requirements it will have to hook into the rest of the system, or what comments will allow humans (or other AIs) to understand it, and generally by default make a long term time bomb mess of everything involved.

GFodor.id: Managers are going to get such relief when they replace their senior engineers who always told them why they couldn’t do X or Y with shoddy AI devs who just do what they’re told.

Then the race is between the exponential tech debt spaghetti bomb and good AI devs appearing in time.

Real Selim Shady: I’m grabbing my popcorn. This will be the new crusade execs go on, brought to you by the same execs that had to bring back US engineering teams after a couple years of overseas outsourcing.

GFodor: No, I think that analogy is going to lead to a lot of mistakes in predicting what will happen.

What probably happens is some companies pull the trigger too early and implode but the ones that recover will do so by upgrading their AI devs. The jobs won’t be coming back.

Nick Dobos: Embrace the spaghetti code.

Devin or another similar AI will not properly appreciate the long term considerations involved in writing code that humans or other AIs will then be working to modify. This is not an easy thing to build into the incentive structure.

Nor is the question of whether you should, even if technically you could. Or, the question of whether you ‘could’ in the sense that a senior engineer would stop you and tell you the reasons you ‘can’t’ do that even if you technically have a way to do it, which is a hybrid of the two.

Ideally over time people will learn how to include such considerations in the instruction set and make this less terrible, and find ways to ensure they are making less doomed or stupid requests, but all of this is going to be a problem even in good scenarios.

I find it odd no one is discussing this question.

How do you use Devin while ensuring that your online accounts and money at the bank and reputation and computer and hard drive remain safe, in a very practical, ‘I do not want something dumb to happen’ kind of way?

Also, how do you use Devin and then know you can rely on the results for practical purposes?

How do you dare put this level of trust and power in an autonomous coding agent?

I would love to be able to use something like Devin, if it is half as good as these demos say it is and of course it will only get better over time. The mundane utility is so obvious, so great, I am raring to go. Except, how would I dare?

Let’s take a simple concrete example that really should be harmless and within its range. Say I want it to build me a website, or I want it to download and implement a GitHub repo and perhaps make a few tweaks, for example to download a new image model and some cool additional tricks for it.

I still need to know why giving Devin a command line and a code editor, and the ability to execute arbitrary code including things downloaded from arbitrary places on the internet, is not going to cook my goose.

One obvious solution is to completely isolate Devin from all of your other files and credentials. So you have a second computer that only runs Devin and related programs. You don’t give that computer any passwords or credentials that you care about. Maybe you find a good way to sandbox it.
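To make the ‘don’t give it any credentials you care about’ part concrete, here is a minimal sketch of the cheapest possible layer: running a command in a throwaway directory with a scrubbed environment. To be clear, this is not a sandbox, and the names `run_isolated` and `SAFE_ENV_KEYS` are hypothetical ones I made up for illustration. The child process can still read any file the current user can read and can still reach the network. It only shows the idea of withholding secrets by default rather than handing over your whole shell environment.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical allow-list: pass through only what the child genuinely needs.
# Everything else (API keys, tokens, cloud credentials) is withheld by default.
SAFE_ENV_KEYS = ("PATH", "LANG")


def run_isolated(cmd, timeout=30):
    """Run cmd in a temporary directory with a scrubbed environment.

    NOT a real sandbox: this gives no filesystem or network isolation.
    It only keeps environment-variable secrets away from the child process.
    """
    env = {k: os.environ[k] for k in SAFE_ENV_KEYS if k in os.environ}
    with tempfile.TemporaryDirectory() as workdir:
        # Point HOME at the throwaway directory so tools that look for
        # config or credential files under $HOME do not find the real ones.
        env["HOME"] = workdir
        return subprocess.run(
            cmd,
            cwd=workdir,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout,
        )


if __name__ == "__main__":
    # Pretend the parent shell holds a secret, then confirm the child cannot see it.
    os.environ["FAKE_SECRET_TOKEN"] = "do-not-leak"
    probe = "import os; print('FAKE_SECRET_TOKEN' in os.environ)"
    result = run_isolated([sys.executable, "-c", probe])
    print(result.stdout.strip())
```

Anything you would actually trust with arbitrary downloaded code needs much stronger isolation than this, such as a separate machine, a virtual machine, or a properly locked-down container.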

I am not saying this cannot be done. On the contrary, if this is cool enough, while staying safe on a broader level, then I am positive that it can be done. There is a way to do it locally safely, if you are willing to jump through the hoops to do that. We just haven’t figured out what that is yet.

I am also presuming that there will be a lot of people who, unless that safety is made the default, will not take those steps, and sometimes hilarity will ensue.

I am not saying I will be Jack’s complete lack of sympathy, but… yeah.

As with those who claim to pick winners in sports, if someone is offering to sell you a software engineer, well, there is an upper limit on how good it might be.

Anton: Real talk, if you actually built a working “ai engineer” – what’s stopping you from scaling to 1000x and dominating the market (building everything)? Instead it’s being sold as a product hmmm

That upper limit is still quite high. People can do things on their own with workflows that would not make sense if they had to go through your company. You act as a force multiplier on projects you could not otherwise get involved with. Scaling up your operation without selling the product is not a trivial task. Even if you have the software engineer, that does not mean you have the managers and architects, or the ideas, or any number of other things.

Also, by selling the product, you get lots of data with which you can make the product so much better.

Bindu Reddy is skeptical? Or is she?

Bindu Reddy: Who are we kidding? 🙄🙄

AI is NOWHERE near automating software engineering.

Of course, a co-pilot is great for increasing productivity, but AI is strictly just an assistant to aid programming professionals.

We are at least 3-5 years away from automating software engineering.

Claudiu: 3-5 years… is that a lot now? If you think so, think again: someone who just entered the field, in their 20s, is about half a century away from retirement. I don’t think they’ll like the prospects they have.

No, being able to do (presumably the easiest) 13.8% of Upwork projects does not mean you are that close to automating software engineering.

It does mean we are a lot closer, or at least making more progress, than we looked a week ago. Agents are working better in the current generation of LLMs than we anticipated, and will get much better when GPT-5 and friends drop, which will improve all the other steps as well. Devin is a future product being built for the future.

Even if you can get Devin or a similar agent to do these kinds of tasks reliably, being able to then use that to get maintainable code, to build up larger projects, to get things you can deploy and count on, is a very different matter. This is an extremely long road, and it seems AGI complete.

I agree that we are ‘at least 3-5 years away.’

But I had the same thought as Claudiu. That is not very many years.

It is also a supremely scary potential event.

What happens after you fully automate software engineering? Or if you, in magic.ai’s terms, build ‘a superhuman’ software engineer, which you can copy?

I believe the appropriate term is ‘singularity.’

Grace Kind!: Devin, build me a better Devin 🙂

Devin 2, build me a better Devin

Devin 3, build me a better Devin

Devin 4, rollback to Devin 3

Devin 5, rollback to Devin 3 please

devin6 rollback --v3

kill -9 devin7

sudo kill -9 devin8

^C^C^C^C^C^C^C^C

Horror Unpacked: I see Devin and then I remember http://magic.ai getting 100m to automate SWEs, and how the frontier models are all explicitly specced for code, and I feel like I should have known that we’d all race to get underneath the lowest possible bar for x-risk.

Alternatively, consider this discussion under Mckay Wrigley’s post:

James: You can just ‘feel’ the future. Imagine once this starts being applied to advanced research. If we get a GPT5 or GPT6 with a 130-150 IQ equivalent, combined with an agent. You’re literally going to ask it to ‘solve cold fusion’ and walk away for 6 months.

Mckay Wrigley: Exactly. The mistake people make is assuming it won’t improve. And it blows my mind how many people make this mistake, even those that are in tech.

Kevin Siskar: McKay, do you think you will build a “agent UI” frontend into ChatbotUI [that is built by the agent in the video]?

Mckay Wrigley: This is in the works 🙂

Um. I. Uh. I do not think you have thought about the implications of ‘solve cold fusion’ being a thing that one can do at a computer terminal?

What else could you type into that computer terminal?

There are few limits to what you can type into that terminal. There are also few limits on what might happen after you do so. The future gets weird and unknown, and it gets weird and unknown fast. The chances that the resulting configurations of atoms contain no human beings, and rather quickly? Rather high.

Here is another example of asking almost the right question:

Paul Graham: We seem to be moving from “software is eating the world” to “software written by software is eating the world.”

I wonder how many “software written by”s ultimately get prepended.

This is one of those cases where the numbers are one, two and many. If you get software^3, well, hold onto your butts.

I want to be very clear that none of this is something I worry about for Devin as it exists right now.

As noted above, I am terrified on a practical level of using Devin on my own computer. But that is a distinct class of concern. Devin 1 is not going to be good enough to build Devin 2 on its own (although it would presumably help), or to cause an extinction event, or anything like that, unless they really do not like shipping early products.

I do notice that this is exactly the lowest-resistance default shortest path to ensuring that AI agents exist and have the capabilities necessary to cause serious trouble at the earliest possible moment, when sufficient improvements are made in various places. Our strongest models are optimizing for writing code, and we are working on how to have them write code and form and execute plans for making software without humans in the loop. You do not need to be a genius to know how that story might end.

As discussed in the previous section, the most obvious failure mode is eventually recursive self-improvement, or RSI.

Or, even without that, setting such programs off to do various things autonomously, including to make money or seek power, often in ways (intentionally or otherwise) that make them difficult to turn off for their authors, for others or for both.

We also have instrumental convergence. Devin is designed to handle obstacles and find ways to solve problems. What happens when a sufficiently capable version of this is given a mission that it lacks the resources to directly complete? What will it do if the task requires more compute, or more access, or persuading people, and so on? At some point some future version is going to go there.

There also does not seem to be any reasonable way to keep Devin from implementing things that would be harmful or immoral? At best this is alignment to the user and the request of the user. And there is no attempt to actually consider what other impacts might happen along the way.

In general, this is giving everyone the capability to take an agent capable of coding complex things in multiple steps and planning around obstacles and problems, give it an arbitrary goal, and give it full access to our world and the internet. Then we hope it all works out for the best.

And we all race to make such systems more capable and intelligent, better at doing that, until… well… yeah.

Even if everyone involved means well, and even if none of the direct simple failure modes happen, sufficiently efficient and capable and intelligent agents given goals that will advance people’s individual causes create dynamics that seem to by default doom us. Remember that giving such agents maximum freedom of action tends to be economically efficient.

As one might expect Tyler Cowen to say, model this.

At best, we are about to go down a highly volatile and dangerous road.

This is somewhat of a reiteration, but it needs to be made very salient and clear.

One should periodically pause to notice how a new technological marvel like Devin compares to prior models of how things would go. We don’t know if Devin has the full capabilities that people are saying it does, or how far that will go in practice, but it is clearly a big step up, and more steps up are coming. This is happening.

Remember all those precautions any sane person would obviously take before letting something like Devin exist, or using it?

How many of those does it look like Devin is going to be using?

Even if that is mostly harmless now, what does that tell you about the future?

Also consider what this implies about future capabilities.

If you were counting on AIs or LLMs not having goals or not wanting things? If you were counting on them being unable to make plans or handle obstacles? If that was what was making you think everything was going to be fine?

Well, set all that aside. People are hard at work invalidating that hope, and it sure looks like they are going to succeed.

That does not mean that any given future new LLM couldn’t be implemented without letting such a system be attached to it. You could keep close watch on the weights. You could do all the precautions, up to and including things like air gapping the system and assuming it is unsafe for humans to view outputs during testing. You can engineer the system to only do a narrow set of things that you predict allow us to proceed safely. You can apply various control and alignment techniques. There are many options. Some of them might work.

I am not filled with confidence that anyone will even bother to try.

And of course, going forward, one must remember that there will be an open source implementation of an agent similar to Devin, that will continuously improve over time. You can then plug into that any model with open weights, and anything derived from that model. And by you can, I mean someone clearly will, and then do whatever the funniest possible thing is, also the most dangerous, because people are like that.

So, choose your actions and policy regime accordingly.


Security footage of Boeing repair before door-plug blowout was overwritten

737 Max door-plug blowout —

NTSB: Boeing “unable to find the records documenting” repair work on 737 Max 9.

National Transportation Safety Board Chair Jennifer Homendy testifies about the Boeing door-plug investigation before the Senate Commerce, Science, and Transportation Committee on March 6, 2024, in Washington, DC.

Getty Images | Kevin Dietsch

A government investigation into a Boeing 737 Max 9 plane’s door-plug blowout has been hampered by a lack of repair records and security camera footage, the National Transportation Safety Board’s chair told US senators. Boeing was “unable to find the records” and told the NTSB that the security camera footage was overwritten.

“To date, we still do not know who performed the work to open, reinstall, and close the door plug on the accident aircraft,” NTSB Chair Jennifer Homendy wrote Wednesday in a letter to leaders of the Senate Commerce, Science, and Transportation Committee. “Boeing has informed us that they are unable to find the records documenting this work. A verbal request was made by our investigators for security camera footage to help obtain this information; however, they were informed the footage was overwritten. The absence of those records will complicate the NTSB’s investigation moving forward.”

A Boeing spokesperson told Ars today that under the company’s standard practice, “video recordings are maintained on a rolling 30-day basis” before being overwritten. The NTSB’s preliminary report on the investigation said the airplane was delivered to Alaska Airlines on October 31, 2023, after a repair in a Boeing factory. On January 5, the plane was forced to return to Portland International Airport in Oregon when a passenger door plug blew off the aircraft during flight.
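The timeline above implies the footage was gone well before the accident. As a rough check, the sketch below assumes that "rolling 30-day basis" means a recording is overwritten exactly 30 days after it is made, and it uses the delivery date as the latest day the repair could have been recorded, since the repair happened before delivery. The function name `overwritten_by` is purely illustrative.

```python
from datetime import date, timedelta

# Assumption: "rolling 30-day basis" means footage is overwritten
# 30 days after it was recorded.
RETENTION = timedelta(days=30)

delivered = date(2023, 10, 31)  # delivery to Alaska Airlines, per the NTSB preliminary report
accident = date(2024, 1, 5)     # date of the door-plug blowout


def overwritten_by(recorded: date, retention: timedelta = RETENTION) -> date:
    """Return the date on which footage recorded on `recorded` is overwritten."""
    return recorded + retention


# Even footage from the delivery day itself would have aged out by this date.
latest_possible_expiry = overwritten_by(delivered)
days_gone_before_accident = (accident - latest_possible_expiry).days

print(latest_possible_expiry)      # 2023-11-30
print(days_gone_before_accident)   # 36
```

Under those assumptions, any footage of the repair would have been overwritten at least 36 days before the accident occurred.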

The NTSB’s preliminary report found that four bolts were missing from the door plug, which can be used instead of an emergency exit door. There was “no evidence” that the door plug “was opened after leaving Boeing’s facility,” indicating that the bolts were not re-installed at the factory. The plane was serviced at Boeing’s Renton, Washington, facility to replace five damaged rivets in a job that required opening the door plug.

“We will continue supporting this investigation in the transparent and proactive fashion we have supported all regulatory inquiries into this accident,” Boeing said in a statement provided to Ars. “We have worked hard to honor the rules about the release of investigative information in an environment of intense interest from our employees, customers, and other stakeholders, and we will continue our efforts to do so.”

Chair called Boeing CEO to seek employee names

Homendy’s letter to Senate Commerce Committee Chair Maria Cantwell (D-Wash.) and Ranking Member Ted Cruz (R-Texas) responded to questions raised at a committee hearing last week. The questions were related to “whether Boeing has provided documentation on the work to open, reinstall, and close the door plug,” and the identities of door crew employees, the letter noted.

“NTSB investigators first requested documents that would have contained this information from Boeing on January 9, 2024,” the letter said. “Shortly thereafter, we identified the door crew manager and were advised that he was out on medical leave. We requested status updates on February 15, 2024, and February 22, 2024, after which we were advised by his attorney that he would not be able to provide a statement or interview to NTSB due to medical issues.”

Boeing provided the names of some people who were familiar with the door-plug work, but the NTSB said it wanted a more exhaustive list to prepare for investigative interviews. On March 2, NTSB investigators asked Boeing for the names of all employees who reported to the door crew manager at the time of the repair in September 2023. Boeing provided the list but “did not identify which personnel conducted the door plug work,” the letter said.

“After NTSB received this list, I called Boeing Chief Executive Officer David Calhoun and asked for the names of the people who performed the work,” Homendy wrote. “He stated he was unable to provide that information and maintained that Boeing has no records of the work being performed.”

NTSB seeks info on Boeing quality-assurance and safety

Homendy told senators that the agency is not seeking the names for punitive purposes. “We want to speak with them to learn about Boeing’s quality-assurance processes and safety culture. Our only intent is to identify deficiencies and recommend safety improvements so accidents like this never happen again,” she wrote.

Homendy wrote that she is “increasingly concerned that the focus on the names of individual front-line workers will negatively impact our investigation and discourage such Boeing employees from providing NTSB with information relevant to this investigation.” To counter those fears, Homendy “instructed NTSB to utilize our authority to protect the identities of the door crew and other front-line employees who come forward with information relevant to the investigation.”

Homendy also sent a letter to Boeing on Wednesday reminding the company that until the investigation concludes, “only appropriate NTSB personnel are authorized to publicly disclose investigative information and, even then, the disclosure is limited to factual information verified during the course of the investigation.”

“For the public to perceive the investigation as credible, the investigation should speak with one voice—that being the voice of the independent agency conducting it,” Homendy told Boeing in the letter.


After 114 days of change, Broadcom CEO acknowledges VMware-related “unease”

M&A pains —

“There’s more to come.”


Broadcom CEO and President Hock Tan has acknowledged the discomfort VMware customers and partners have experienced after the sweeping changes that Broadcom has instituted since it acquired the virtualization company 114 days ago.

In a blog post Thursday, Tan noted that Broadcom spent 18 months evaluating and buying VMware. He said that while there’s still a lot of work to do, the company has made “substantial progress.”

That so-called progress, though, has worried some of Broadcom’s customers and partners.

Tan wrote:

Of course, we recognize that this level of change has understandably created some unease among our customers and partners. But all of these moves have been with the goals of innovating faster, meeting our customers’ needs more effectively, and making it easier to do business with us.

Tan believes that the changes will ultimately “provide greater profitability and improved market opportunities” for channel partners. However, many IT solution provider businesses that were working with VMware have already been disrupted.

For example, after buying VMware, Broadcom took over the top 2,000 VMware accounts from VMware channel partners. In a March earnings call, Tan said that Broadcom has been focused on upselling those customers. He also said Broadcom expects VMware revenue to grow double-digits quarter over quarter for the rest of the fiscal year.

Beyond that, Broadcom ended the VMware channel partner program, making the primary path to reselling VMware an invite-only Broadcom program.

Additionally, Broadcom’s killing of VMware perpetual licensing has reportedly upended financials for numerous businesses. In a March “User Group Town Hall,” attendees complained about “price rises of 500 and 600 percent,” The Register reported. In February, ServeTheHome reported that “smaller” managed service providers focusing on cloud services were seeing the price of working with VMware increase tenfold. “They do not have the revenue nor ability to charge for that kind of price increase, especially this rapidly,” ServeTheHome reported.

By contrast, Tan recently saw a financial windfall, making the equivalent of more than double his 2022 salary in 2023. A US Securities and Exchange Commission filing showed that Broadcom paid Tan $161.8 million, including $160.5 million in stock that will vest over the next five years (Tan isn’t eligible for more bonus payouts until 2028). Broadcom announced its VMware acquisition in May 2022 and closed it in late November 2023 for $69 billion.

In his blog post, Tan defended the subscription-only licensing model, calling it “the industry standard.” He said VMware started accelerating its transition to this strategy in 2019 (which is before Broadcom bought VMware). He also linked to a February blog post from VMware’s Prashanth Shenoy, VP of product and technical marketing for the Cloud, Infrastructure, Platforms, and Solutions group at VMware, that also noted acquisition-related “concerns” but claimed the evolution would be fiscally prudent.

Other Broadcom-led changes to VMware over the past 114 days include at least 2,800 VMware jobs cut, shuttering the free version of ESXi, and plans to sell VMware’s End User Computing business to KKR, as well as spend $1 billion on VMware R&D.


Lawsuit opens research misconduct report that may get a Harvard prof fired

Harvard’s got a lawsuit on its hands.

Glowimages

Accusations of research misconduct often trigger extensive investigations, typically performed by the institution where the misconduct allegedly took place. These investigations are internal employment matters, and false accusations have the potential to needlessly wreck someone’s career. As a result, most of these investigations are kept completely confidential, even after their completion.

But all the details of a misconduct investigation performed by Harvard University became public this week through an unusual route. The professor who had been accused of misconduct, Francesca Gino, had filed a multi-million dollar lawsuit, targeting both Harvard and a team of external researchers who had accused her of misconduct. Harvard submitted its investigator’s report as part of its attempt to have part of the suit dismissed, and the judge overseeing the case made it public.

We covered one of the studies at issue at the time of its publication. It has since been retracted, and we’ll be updating our original coverage accordingly.

Misconduct allegations lead to lawsuit

Gino, currently on administrative leave, had been faculty at Harvard Business School, where she did research on human behavior. One of her more prominent studies (the one we covered) suggested that signing a form before completing it caused people to fill in its contents more accurately than if they filled out the form first and then signed it.

Oddly, for a paper about honesty, it had a number of issues. Some of its original authors had attempted to go back and expand on the paper but found they were unable to replicate the results. That seems to have prompted a group of behavioral researchers who write at the blog Data Colada to look more carefully at the results that didn’t replicate, at which point they found indications that the data was fabricated. That got the paper retracted.

Gino was not implicated in the fabrication of the data. But the attention of the Data Colada team (Uri Simonsohn, Leif Nelson, and Joe Simmons) had been drawn to the paper. They found additional indications of completely independent problems in other data from the paper that did come from her work, which caused them to examine additional papers from Gino, coming up with evidence for potential research fraud in four of them.

Before posting it on their blog, however, the Data Colada team had provided their evidence to Harvard, which launched its own investigation. Their posts came out after Harvard’s investigation concluded that Gino’s research had serious issues, and she was placed on administrative leave as the university looked into revoking her tenure. It also alerted the journals that had published the three yet-to-be-retracted papers about the issues.

Things might have ended there, except that Gino filed a defamation lawsuit against Harvard and the Data Colada team, claiming they “worked together to destroy my career and reputation despite admitting they have no evidence proving their allegations.” As part of the $25 million suit, she also accused Harvard of mishandling its investigation and not following proper procedures.
