On May 20, 2023, PyPI, the official Python Package Index, announced it had suspended new uploads and user registrations due to a massive influx of malicious code. The lockdown was lifted on May 21st.
PyPI is a popular repository for Python developers; millions use it to download packages.
Although the specific uploads or users that triggered the lockdown have not been publicly reported, several malicious packages have been identified in high-profile reports recently.
A successfully uploaded malicious package represents an incredibly severe security threat. Downloading and deploying one is the equivalent of voluntarily (although accidentally) running malware in your environment.
Phylum identifies repeat malicious activity in May
On May 16, 2023, software supply chain security company Phylum posted their analysis of a recent string of malicious uploads.
A GitHub user named Patrick Pogoda had been persistently posting malware-infected packages on PyPI. Despite the packages being taken down, the user quickly reposted under slightly different names.
The identity of Patrick Pogoda is unclear, with possibilities ranging from the compromised account of a legitimate developer to a purpose-built malicious persona. His naming scheme and tactics bear resemblance to previous attacks.
Some of these Python projects create bots for platforms like Discord, Telegram, and Twitter, while others are token grabbers designed to steal cryptocurrency wallet credentials from unsuspecting users.
Upon inspection, these packages initially contain benign code.
The package most recently identified by Phylum interacts with ChatGPT, likely exploiting the massive buzz around AI.
Checkmarx finds hugely popular but malicious PyPI repository in April
On April 19, Checkmarx, an application security company, disclosed its discovery of a malicious PyPI repository.
In October 2022, PyPI user santalegend published pampyio, a malicious version of the popular pattern matching package pampy.
One of pampyio’s dependencies is redapty.
This package includes the malicious aptyred/__init__.py file, which exfiltrates environment variables (which often contain sensitive data) from the victim’s machine to a remote location.
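To get a feel for why environment variables are an attractive target, the following minimal sketch lists the names (not the values) of variables in the current process that look like they hold secrets. The pattern list and the function name sensitive_env_names are assumptions made for illustration; they are not taken from the Checkmarx report.

```python
import os
import re

# Assumed, non-exhaustive naming conventions for secrets.
SENSITIVE_PATTERN = re.compile(
    r"(TOKEN|SECRET|PASSWORD|PASSWD|API_?KEY|CREDENTIAL)", re.IGNORECASE
)


def sensitive_env_names(environ=None):
    """Return the sorted names of variables whose names suggest a secret."""
    environ = os.environ if environ is None else environ
    return sorted(name for name in environ if SENSITIVE_PATTERN.search(name))


if __name__ == "__main__":
    # Only names are printed, never values.
    for name in sensitive_env_names():
        print(name)
```

Any code that runs during installation or import has the same access to these variables as this script does, which is why keeping long-lived credentials out of build and development shells limits the damage.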
Checkmarx described the attacks as leveraging “StarJacking,” which makes a package look more popular than it is, taking advantage of the non-validation of the relationship between the package and GitHub.
In GitHub, users can “star” projects they like.
PyPI can pull this data from GitHub, but relies on the author to link to the correct package when uploading to PyPI, making it ripe for abuse.
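Because the link between a PyPI project and its GitHub repository is not validated, one rough heuristic is to compare the package name with the name of the repository it claims. The sketch below assumes a metadata dictionary shaped like the "info" object returned by PyPI's JSON API (https://pypi.org/pypi/<name>/json), with "name", "home_page" and "project_urls" keys; the function names and the example owner are hypothetical.

```python
import re
from urllib.parse import urlparse


def normalize(name):
    """Lowercase a name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def linked_github_repos(info):
    """Collect (owner, repo) pairs from the URLs in a PyPI 'info' dict."""
    urls = [info.get("home_page") or ""]
    urls += [url or "" for url in (info.get("project_urls") or {}).values()]
    repos = set()
    for url in urls:
        parsed = urlparse(url)
        if parsed.netloc.lower() in ("github.com", "www.github.com"):
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2:
                repo = parts[1]
                if repo.endswith(".git"):
                    repo = repo[:-4]
                repos.add((parts[0], repo))
    return repos


def mismatched_repos(info):
    """Return linked 'owner/repo' strings whose repo name differs from the package name."""
    package = normalize(info["name"])
    return sorted(
        f"{owner}/{repo}"
        for owner, repo in linked_github_repos(info)
        if normalize(repo) != package
    )


# Hypothetical metadata: a package named pampyio pointing at a repo named pampy.
example = {
    "name": "pampyio",
    "home_page": "https://github.com/example-owner/pampy",
    "project_urls": {},
}
print(mismatched_repos(example))
```

A mismatch is only a prompt for manual review, since many legitimate projects use a repository name that differs from the package name. A match proves nothing either: the repository itself still needs to reference the PyPI package.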
Between them, pampyio and redapty had been downloaded more than 70,000 times upon the article’s publication.
This media article implies the Checkmarx finding triggered the lockdown, but I see no direct evidence of this being the case.
Fortinet identifies uploads in early 2023
Yet another security firm, Fortinet, identified malicious activity earlier in the year.
A threat actor named Lolip0p uploaded malicious Python packages, colorslib, httpslib and libhttps, to PyPI, with code designed to drop info-stealing malware on developers’ systems.
Despite having non-mimic names and full descriptions that make them appear as legitimate resources, these packages contained a setup.py file that ran a suspicious executable called Oxyz.exe, malware that steals browser information.
The detection rates for these executables were low, allowing the malware to evade endpoint detection and other security measures.
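Since setup.py is ordinary Python that runs at install time, it can be inspected statically before installation. The sketch below parses a setup.py source string with the standard ast module and reports calls that launch processes, evaluate code, or download files. The list of flagged calls is an assumption, and the sample source is hypothetical; it is not the code Fortinet analyzed.

```python
import ast

# Assumed list of calls worth a closer look in an install script.
SUSPICIOUS_CALLS = {
    "os.system", "os.popen",
    "subprocess.run", "subprocess.Popen",
    "subprocess.call", "subprocess.check_output",
    "exec", "eval",
    "urllib.request.urlopen", "urllib.request.urlretrieve",
}


def dotted_name(node):
    """Rebuild a dotted name such as 'subprocess.run' from a call target."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_suspicious_calls(source):
    """Return sorted (line number, call name) pairs for flagged calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


# Hypothetical setup.py that launches an executable during installation.
SAMPLE = (
    "import subprocess\n"
    "from setuptools import setup\n"
    "subprocess.run(['tool.exe'])\n"
    "setup(name='demo')\n"
)
print(find_suspicious_calls(SAMPLE))
```

The source is parsed, never executed, so it is safe to run against an untrusted file. It is easy to evade (aliased imports, obfuscated strings, getattr tricks), so an empty result does not mean the package is clean.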
The feds are now interested
After the lockdown, PyPI announced that it had received three subpoenas from the U.S. Department of Justice earlier in the year. While PyPI didn’t get any context on why the feds requested the records, it is instructive that the demand included records “of all Python Package Index (PyPI) packages uploaded by” specific users, who have not been publicly identified.
It’s hard to imagine the timing - alongside the rise in repeat malicious activity - is coincidental.
Software supply chain security measures to take
Evaluate the provenance of packages before installing them.
Confirm the package name is correct, as per the official documentation, to avoid typosquatting attacks.
Be suspicious of packages that were published only recently; a new package whose name closely matches a commonly used one may be an impostor.
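Use pip with the --index-url option to pin installs to the repository you intend, so you don’t accidentally pull packages from an unintended, malicious one. Avoid --trusted-host unless you have no alternative: it tells pip to trust a host even when it lacks valid HTTPS, which weakens rather than strengthens this protection.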
Employ defense-in-depth (e.g. EDR) to identify malware even after it has been installed.
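The name check in the list above can be partly automated. The following minimal sketch uses the standard difflib module to flag a requested name that is close to, but not the same as, a name on a list of packages you already trust. The KNOWN_PACKAGES list, the 0.8 cutoff, and the function name are assumptions for illustration; in practice the list would come from your own requirements files or documentation.

```python
import difflib

# Hypothetical allowlist of packages the team actually uses.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "pampy", "colorama"]


def check_package_name(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Classify a name as 'known', 'suspicious' (near miss), or 'unknown'."""
    normalized = name.lower().replace("_", "-")
    if normalized in known:
        return ("known", normalized)
    close = difflib.get_close_matches(normalized, known, n=1, cutoff=cutoff)
    if close:
        return ("suspicious", close[0])
    return ("unknown", None)


if __name__ == "__main__":
    for candidate in ["requests", "reqeusts", "pampyio", "colorslib"]:
        print(candidate, check_package_name(candidate))
```

With this allowlist, pampyio is flagged as a near miss of pampy, while colorslib comes back as merely unknown. That matches the point made earlier: packages with non-mimic names slip past similarity checks, so this complements provenance review and endpoint defenses and does not replace them.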
Conclusion
While PyPI is now back up, some experts expect no letup in the frequency and complexity of attacks against the Python supply chain. The payoff is just too great.
But the risks to attackers also appear to be increasing, judging by the subpoenas. If even one of these malicious actors faces prosecution in the United States, I imagine it would have a strong deterrent effect.