Open Source

microsoft-open-sources-infamously-weird,-ram-hungry-ms-dos-4.00-release

Microsoft open-sources infamously weird, RAM-hungry MS-DOS 4.00 release

a road not traveled —

DOS 4.00 was supposed to add multitasking to the OS, but it was not to be.

A DOS prompt.

Enlarge / A DOS prompt.

Microsoft has open-sourced another bit of computing history this week: The company teamed up with IBM to release the source code of 1988’s MS-DOS 4.00, a version better known for its unpopularity, bugginess, and convoluted development history than its utility as a computer operating system.

The MS-DOS 4.00 code is available on Microsoft’s MS-DOS GitHub page along with versions 1.25 and 2.0, which Microsoft open-sourced in cooperation with the Computer History Museum back in 2014. All open-source versions of DOS have been released under the MIT License.

Initially, MS-DOS 4.00 was slated to include new multitasking features that allow software to run in the background. This release of DOS, also sometimes called “MT-DOS” or “Mutitasking MS-DOS” to distinguish it from other releases, was only released through a few European PC OEMs and never as a standalone retail product.

The source code Microsoft released this week is not for that multitasking version of DOS 4.00, and Microsoft’s Open Source Programs Office was “unable to find the full source code” for MT-DOS when it went to look. Rather, Microsoft and IBM have released the source code for a totally separate version of DOS 4.00, primarily developed by IBM to add more features to the existing non-multitasking version of DOS that ran on most IBM PCs and PC clones of the day.

Microsoft never returned to its multitasking DOS idea in subsequent releases. Multitasking would become the purview of graphical operating systems like Windows and OS/2, while MS-DOS versions 5.x and 6.x continued with the old one-app-at-a-time model of earlier releases.

Microsoft has released some documentation and binary files for MT-DOS and “may update this release if more is discovered.” The company credits English researcher Connor “Starfrost” Hyde for shaking all of this source code loose as part of an ongoing examination of MT-DOS that he is documenting on his website. Hyde has posted many screenshots of a 1984-era build of MT-DOS, including of the “session manager” that it used to track and switch between running applications.

Confidential copies of the obscure, abandoned multitasking-capable version of MS-DOS 4.00. Microsoft has been unable to locate source code for this release, sometimes referred to as

Confidential copies of the obscure, abandoned multitasking-capable version of MS-DOS 4.00. Microsoft has been unable to locate source code for this release, sometimes referred to as “MT-DOS” or “Multitasking MS-DOS.”

Microsoft

The publicly released version of MS-DOS 4.00 is known less for its new features than for its high memory usage; the 4.00 release could consume as much as 92KB of RAM, way up from the roughly 56KB used by MS-DOS 3.31, and the 4.01 release reduced this to about 86KB. The later MS-DOS 5.0 and 6.0 releases maxed out at 72 or 73KB, and even IBM’s PC DOS 2000 only wanted around 64KB.

These RAM numbers would be rounding errors on any modern computer, but in the days when RAM was pricey, systems maxed out at 640KB, and virtual memory wasn’t a thing, such a huge jump in system requirements was a big deal. Today’s retro-computing enthusiasts still tend to skip over MS-DOS 4.00, recommending either 3.31 for its lower memory usage or later versions for their expanded feature sets.

Microsoft has open-sourced some other legacy code over the years, including those older MS-DOS versions, Word for Windows 1.1a, 1983-era GW-BASIC, and the original Windows File Manager. While most of these have been released in their original forms without any updates or changes, the Windows File Manager is actually actively maintained. It was initially just changed enough to run natively on modern 64-bit and Arm PCs running Windows 10 and 11, but it’s been updated with new fixes and features as recently as March 2024.

The release of the MS-DOS 4.0 code isn’t the only new thing that DOS historians have gotten their hands on this year. One of the earliest known versions of 86-DOS, the software that Microsoft would buy and turn into the operating system for the original IBM PC, was discovered and uploaded to the Internet Archive in January. An early version of the abandoned Microsoft-developed version of OS/2 was also unearthed in March.

Microsoft open-sources infamously weird, RAM-hungry MS-DOS 4.00 release Read More »

home-assistant-has-a-new-foundation-and-a-goal-to-become-a-consumer-brand

Home Assistant has a new foundation and a goal to become a consumer brand

An Open Home stuffed full of code —

Can a non-profit foundation get Home Assistant to the point of Home Depot boxes?

Open Home Foundation logo on a multicolor background

Open Home Foundation

Home Assistant, until recently, has been a wide-ranging and hard-to-define project.

The open smart home platform is an open source OS you can run anywhere that aims to connect all your devices together. But it’s also bespoke Raspberry Pi hardware, in Yellow and Green. It’s entirely free, but it also receives funding through a private cloud services company, Nabu Casa. It contains tiny board project ESPHome and other inter-connected bits. It has wide-ranging voice assistant ambitions, but it doesn’t want to be Alexa or Google Assistant. Home Assistant is a lot.

After an announcement this weekend, however, Home Assistant’s shape is a bit easier to draw out. All of the project’s ambitions now fall under the Open Home Foundation, a non-profit organization that now contains Home Assistant and more than 240 related bits. Its mission statement is refreshing, and refreshingly honest about the state of modern open source projects.

The three pillars of the Open Home Foundation.

The three pillars of the Open Home Foundation.

Open Home Foundation

“We’ve done this to create a bulwark against surveillance capitalism, the risk of buyout, and open-source projects becoming abandonware,” the Open Home Foundation states in a press release. “To an extent, this protection extends even against our future selves—so that smart home users can continue to benefit for years, if not decades. No matter what comes.” Along with keeping Home Assistant funded and secure from buy-outs or mission creep, the foundation intends to help fund and collaborate with external projects crucial to Home Assistant, like Z-Wave JS and Zigbee2MQTT.

My favorite video.

Home Assistant’s ambitions don’t stop with money and board seats, though. They aim to “be an active political advocate” in the smart home field, toward three primary principles:

  • Data privacy, which means devices with local-only options, and cloud services with explicit permissions
  • Choice in using devices with one another through open standards and local APIs
  • Sustainability by repurposing old devices and appliances beyond company-defined lifetimes

Notably, individuals cannot contribute modest-size donations to the Open Home Foundation. Instead, the foundation asks supporters to purchase a Nabu Casa subscription or contribute code or other help to its open source projects.

From a few lines of Python to a foundation

Home Assistant founder Paulus Schoutsen wanted better control of his Philips Hue smart lights just before 2014 or so and wrote a Python script to do so. Thousands of volunteer contributions later, Home Assistant was becoming a real thing. Schoutsen and other volunteers inevitably started to feel overwhelmed by the “free time” coding and urgent bug fixes. So Schoutsen, Ben Bangert, and Pascal Vizeli founded Nabu Casa, a for-profit firm intended to stabilize funding and paid work on Home Assistant.

Through that stability, Home Assistant could direct full-time work to various projects, take ownership of things like ESPHome, and officially contribute to open standards like Zigbee, Z-Wave, and Matter. But Home Assistant was “floating in a kind of undefined space between a for-profit entity and an open-source repository on GitHub,” according to the foundation. The Open Home Foundation creates the formal home for everything that needs it and makes Nabu Casa a “special, rules-bound inaugural partner” to better delineate the business and non-profit sides.

Home Assistant as a Home Depot box?

In an interview with The Verge’s Jennifer Pattison Tuohy, and in a State of the Open Home stream over the weekend, Schoutsen also suggested that the Foundation gives Home Assistant a more stable footing by which to compete against the bigger names in smart homes, like Amazon, Google, Apple, and Samsung. The Home Assistant Green starter hardware will sell on Amazon this year, along with HA-badged extension dongles. A dedicated voice control hardware device that enables a local voice assistant is coming before year’s end. Home Assistant is partnering with Nvidia and its Jetson edge AI platform to help make local assistants better, faster, and more easily integrated into a locally controlled smart home.

That also means Home Assistant is growing as a brand, not just a product. Home Assistant’s “Works With” program is picking up new partners and has broad ambitions. “We want to be a consumer brand,” Schoutsen told Tuohy. “You should be able to walk into a Home Depot and be like, ‘I care about my privacy; this is the smart home hub I need.’”

Where does this leave existing Home Assistant enthusiasts, who are probably familiar with the feeling of a tech brand pivoting away from them? It’s hard to imagine Home Assistant dropping its advanced automation tools and YAML-editing offerings entirely. But Schoutsen suggested he could imagine a split between regular and “advanced” users down the line. But Home Assistant’s open nature, and now its foundation, should ensure that people will always be able to remix, reconfigure, or re-release the version of smart home choice they prefer.

Home Assistant has a new foundation and a goal to become a consumer brand Read More »

llms-keep-leaping-with-llama-3,-meta’s-newest-open-weights-ai-model

LLMs keep leaping with Llama 3, Meta’s newest open-weights AI model

computer-powered word generator —

Zuckerberg says new AI model “was still learning” when Meta stopped training.

A group of pink llamas on a pixelated background.

On Thursday, Meta unveiled early versions of its Llama 3 open-weights AI model that can be used to power text composition, code generation, or chatbots. It also announced that its Meta AI Assistant is now available on a website and is going to be integrated into its major social media apps, intensifying the company’s efforts to position its products against other AI assistants like OpenAI’s ChatGPT, Microsoft’s Copilot, and Google’s Gemini.

Like its predecessor, Llama 2, Llama 3 is notable for being a freely available, open-weights large language model (LLM) provided by a major AI company. Llama 3 technically does not quality as “open source” because that term has a specific meaning in software (as we have mentioned in other coverage), and the industry has not yet settled on terminology for AI model releases that ship either code or weights with restrictions (you can read Llama 3’s license here) or that ship without providing training data. We typically call these releases “open weights” instead.

At the moment, Llama 3 is available in two parameter sizes: 8 billion (8B) and 70 billion (70B), both of which are available as free downloads through Meta’s website with a sign-up. Llama 3 comes in two versions: pre-trained (basically the raw, next-token-prediction model) and instruction-tuned (fine-tuned to follow user instructions). Each has a 8,192 token context limit.

A screenshot of the Meta AI Assistant website on April 18, 2024.

Enlarge / A screenshot of the Meta AI Assistant website on April 18, 2024.

Benj Edwards

Meta trained both models on two custom-built, 24,000-GPU clusters. In a podcast interview with Dwarkesh Patel, Meta CEO Mark Zuckerberg said that the company trained the 70B model with around 15 trillion tokens of data. Throughout the process, the model never reached “saturation” (that is, it never hit a wall in terms of capability increases). Eventually, Meta pulled the plug and moved on to training other models.

“I guess our prediction going in was that it was going to asymptote more, but even by the end it was still leaning. We probably could have fed it more tokens, and it would have gotten somewhat better,” Zuckerberg said on the podcast.

Meta also announced that it is currently training a 400B parameter version of Llama 3, which some experts like Nvidia’s Jim Fan think may perform in the same league as GPT-4 Turbo, Claude 3 Opus, and Gemini Ultra on benchmarks like MMLU, GPQA, HumanEval, and MATH.

Speaking of benchmarks, we have devoted many words in the past to explaining how frustratingly imprecise benchmarks can be when applied to large language models due to issues like training contamination (that is, including benchmark test questions in the training dataset), cherry-picking on the part of vendors, and an inability to capture AI’s general usefulness in an interactive session with chat-tuned models.

But, as expected, Meta provided some benchmarks for Llama 3 that list results from MMLU (undergraduate level knowledge), GSM-8K (grade-school math), HumanEval (coding), GPQA (graduate-level questions), and MATH (math word problems). These show the 8B model performing well compared to open-weights models like Google’s Gemma 7B and Mistral 7B Instruct, and the 70B model also held its own against Gemini Pro 1.5 and Claude 3 Sonnet.

A chart of instruction-tuned Llama 3 8B and 70B benchmarks provided by Meta.

Enlarge / A chart of instruction-tuned Llama 3 8B and 70B benchmarks provided by Meta.

Meta says that the Llama 3 model has been enhanced with capabilities to understand coding (like Llama 2) and, for the first time, has been trained with both images and text—though it currently outputs only text. According to Reuters, Meta Chief Product Officer Chris Cox noted in an interview that more complex processing abilities (like executing multi-step plans) are expected in future updates to Llama 3, which will also support multimodal outputs—that is, both text and images.

Meta plans to host the Llama 3 models on a range of cloud platforms, making them accessible through AWS, Databricks, Google Cloud, and other major providers.

Also on Thursday, Meta announced that Llama 3 will become the new basis of the Meta AI virtual assistant, which the company first announced in September. The assistant will appear prominently in search features for Facebook, Instagram, WhatsApp, Messenger, and the aforementioned dedicated website that features a design similar to ChatGPT, including the ability to generate images in the same interface. The company also announced a partnership with Google to integrate real-time search results into the Meta AI assistant, adding to an existing partnership with Microsoft’s Bing.

LLMs keep leaping with Llama 3, Meta’s newest open-weights AI model Read More »

pypi-halted-new-users-and-projects-while-it-fended-off-supply-chain-attack

PyPI halted new users and projects while it fended off supply-chain attack

ONSLAUGHT —

Automation is making attacks on open source code repositories harder to fight.

Supply-chain attacks, like the latest PyPI discovery, insert malicious code into seemingly functional software packages used by developers. They're becoming increasingly common.

Enlarge / Supply-chain attacks, like the latest PyPI discovery, insert malicious code into seemingly functional software packages used by developers. They’re becoming increasingly common.

Getty Images

PyPI, a vital repository for open source developers, temporarily halted new project creation and new user registration following an onslaught of package uploads that executed malicious code on any device that installed them. Ten hours later, it lifted the suspension.

Short for the Python Package Index, PyPI is the go-to source for apps and code libraries written in the Python programming language. Fortune 500 corporations and independent developers alike rely on the repository to obtain the latest versions of code needed to make their projects run. At a little after 7 pm PT on Wednesday, the site started displaying a banner message informing visitors that the site was temporarily suspending new project creation and new user registration. The message didn’t explain why or provide an estimate of when the suspension would be lifted.

Screenshot showing temporary suspension notification.

Enlarge / Screenshot showing temporary suspension notification.

Checkmarx

About 10 hours later, PyPI restored new project creation and new user registration. Once again, the site provided no reason for the 10-hour halt.

According to security firm Checkmarx, in the hours leading up to the closure, PyPI came under attack by users who likely used automated means to upload malicious packages that, when executed, infected user devices. The attackers used a technique known as typosquatting, which capitalizes on typos users make when entering the names of popular packages into command-line interfaces. By giving the malicious packages names that are similar to popular benign packages, the attackers count on their malicious packages being installed when someone mistakenly enters the wrong name.

“The threat actors target victims with Typosquatting attack technique using their CLI to install Python packages,” Checkmarx researchers Yehuda Gelb, Jossef Harush Kadouri, and Tzachi Zornstain wrote Thursday. “This is a multi-stage attack and the malicious payload aimed to steal crypto wallets, sensitive data from browsers (cookies, extensions data, etc.) and various credentials. In addition, the malicious payload employed a persistence mechanism to survive reboots.”

Screenshot showing some of the malicious packages found by Checkmarx.

Enlarge / Screenshot showing some of the malicious packages found by Checkmarx.

Checkmarx

The post said the malicious packages were “most likely created using automation” but didn’t elaborate. Attempts to reach PyPI officials for comment weren’t immediately successful. The package names mimicked those of popular packages and libraries such as Requests, Pillow, and Colorama.

The temporary suspension is only the latest event to highlight the increased threats confronting the software development ecosystem. Last month, researchers revealed an attack on open source code repository GitHub that was ​​flooding the site with millions of packages containing obfuscated code that stole passwords and cryptocurrencies from developer devices. The malicious packages were clones of legitimate ones, making them hard to distinguish to the casual eye.

The party responsible automated a process that forked legitimate packages, meaning the source code was copied so developers could use it in an independent project that built on the original one. The result was millions of forks with names identical to the original ones. Inside the identical code was a malicious payload wrapped in multiple layers of obfuscation. While GitHub was able to remove most of the malicious packages quickly, the company wasn’t able to filter out all of them, leaving the site in a persistent loop of whack-a-mole.

Similar attacks are a fact of life for virtually all open source repositories, including npm pack picks and RubyGems.

Earlier this week, Checkmarx reported a separate supply-chain attack that also targeted Python developers. The actors in that attack cloned the Colorama tool, hid malicious code inside, and made it available for download on a fake mirror site with a typosquatted domain that mimicked the legitimate files.pythonhosted.org one. The attackers hijacked the accounts of popular developers, likely by stealing the authentication cookies they used. Then, they used the hijacked accounts to contribute malicious commits that included instructions to download the malicious Colorama clone. Checkmarx said it found evidence that some developers were successfully infected.

In Thursday’s post, the Checkmarx researchers reported:

The malicious code is located within each package’s setup.py file, enabling automatic execution upon installation.

In addition, the malicious payload employed a technique where the setup.py file contained obfuscated code that was encrypted using the Fernet encryption module. When the package was installed, the obfuscated code was automatically executed, triggering the malicious payload.

Checkmarx

Upon execution, the malicious code within the setup.py file attempted to retrieve an additional payload from a remote server. The URL for the payload was dynamically constructed by appending the package name as a query parameter.

Screenshot of code creating dynamic URL.

Enlarge / Screenshot of code creating dynamic URL.

Checkmarx

The retrieved payload was also encrypted using the Fernet module. Once decrypted, the payload revealed an extensive info-stealer designed to harvest sensitive information from the victim’s machine.

The malicious payload also employed a persistence mechanism to ensure it remained active on the compromised system even after the initial execution.

Screenshot showing code that allows persistence.

Enlarge / Screenshot showing code that allows persistence.

Checkmarx

Besides using typosquatting and a similar technique known as brandjacking to trick developers into installing malicious packages, threat actors also employ dependency confusion. The technique works by uploading malicious packages to public code repositories and giving them a name that’s identical to a package stored in the target developer’s internal repository that one or more of the developer’s apps depend on to work. Developers’ software management apps often favor external code libraries over internal ones, so they download and use the malicious package rather than the trusted one. In 2021, a researcher used a similar technique to successfully execute counterfeit code on networks belonging to Apple, Microsoft, Tesla, and dozens of other companies.

There are no sure-fire ways to guard against such attacks. Instead, it’s incumbent on developers to meticulously check and double-check packages before installing them, paying close attention to every letter in a name.

PyPI halted new users and projects while it fended off supply-chain attack Read More »

Business Case for Improving Open Source Software Supply Chain Security and Resilience

internal/modules/cjs/loader.js: 905 throw err; ^ Error: Cannot find module ‘puppeteer’ Require stack: – /home/760439.cloudwaysapps.com/jxzdkzvxkw/public_html/wp-content/plugins/rss-feed-post-generator-echo/res/puppeteer/puppeteer.js at Function.Module._resolveFilename (internal/modules/cjs/loader.js: 902: 15) at Function.Module._load (internal/modules/cjs/loader.js: 746: 27) at Module.require (internal/modules/cjs/loader.js: 974: 19) at require (internal/modules/cjs/helpers.js: 101: 18) at Object. (/home/760439.cloudwaysapps.com/jxzdkzvxkw/public_html/wp-content/plugins/rss-feed-post-generator-echo/res/puppeteer/puppeteer.js:2: 19) at Module._compile (internal/modules/cjs/loader.js: 1085: 14) at Object.Module._extensions..js (internal/modules/cjs/loader.js: 1114: 10) at Module.load (internal/modules/cjs/loader.js: 950: 32) at Function.Module._load (internal/modules/cjs/loader.js: 790: 12) at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js: 75: 12) code: ‘MODULE_NOT_FOUND’, requireStack: [ ‘/home/760439.cloudwaysapps.com/jxzdkzvxkw/public_html/wp-content/plugins/rss-feed-post-generator-echo/res/puppeteer/puppeteer.js’ ]

Business Case for Improving Open Source Software Supply Chain Security and Resilience Read More »