It’s Unsafe to Download Python Packages

Dustico Research Team
Feb 23, 2021

Attackers have long tried to exploit the Python ecosystem by placing malicious code in published packages. While researching supply-chain attack methods, we came across a technique that allowed our code to run on hundreds of servers worldwide without the package ever being installed.

We found that when a user merely downloads a Python package, code inside it can run automatically on the user’s system.

For example, the following command:

pip download prp1

will indeed download the package file prp1-1.0.5.tar.gz to the local filesystem.

Merely downloading a package should not run any of its code, but that is not what we observed.

Why Is This Happening?

This happens because pip, Python’s package manager, needs the metadata of the downloaded package, such as its version and its list of dependencies. To obtain that metadata, pip automatically runs, in the background, the main setup.py script that ships as part of the package.

The purpose of setup.py is to return a data structure that tells the package manager how to handle the package.

An attacker who understands this process can plant malicious code in the script and gain code execution on any system that downloads the package, even though the user never intended to run it.
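The mechanism can be reproduced in a harmless way. The sketch below is our own illustration, not the code from the research package: it writes a hypothetical setup.py whose only side effect is creating a marker file, then asks that script for its version, roughly the kind of metadata query a package manager needs to make. pip’s actual invocation differs in its details, and the names demo-pkg and executed.marker are assumptions made for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical, benign setup.py: its only side effect is a marker file.
SETUP_PY = '''\
from pathlib import Path

# Top-level statements run as soon as the script is executed,
# before any metadata is returned.
Path("executed.marker").write_text("setup.py was executed")

from setuptools import setup

setup(name="demo-pkg", version="0.0.1")
'''


def metadata_query_runs_code() -> bool:
    """Ask a throwaway setup.py for its version; report whether its side effect fired."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "setup.py").write_text(SETUP_PY)
        # The only way to get metadata out of setup.py is to execute it.
        # The exit code is ignored: the marker is written before setuptools
        # is imported, so it appears even if setuptools is missing.
        subprocess.run(
            [sys.executable, "setup.py", "--version"],
            cwd=project,
            capture_output=True,
            check=False,
        )
        return (project / "executed.marker").exists()


if __name__ == "__main__":
    print("side effect fired:", metadata_query_runs_code())
```

Replacing the marker-file line with any other statement shows why the script is a convenient place for an attacker to hide code: it runs with the privileges of whoever invoked the package manager.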

This behavior is not a bug but a consequence of pip’s design, even though we find it unintuitive. A developer who just wants to download a package does not expect code to run automatically on their system.

In fact, this was reported in the past as an issue on the pypa/pip project (https://github.com/pypa/pip/issues/1884), and it is unresolved to this day.

How Does It Happen?

When a Python package is downloaded via pip (e.g. pip download <package>) and it is distributed in tar.gz format, pip automatically runs the code inside its setup.py script.
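Because the script lives inside the archive as ordinary text, it can be reviewed before anything is allowed to execute it. The following is a minimal sketch, assuming the tar.gz file was obtained without pip and follows the conventional layout of a single top-level directory containing setup.py; the function name read_setup_py is our own.

```python
import io
import tarfile


def read_setup_py(sdist_bytes: bytes) -> str:
    """Return the text of the top-level setup.py in a .tar.gz archive without running it."""
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            parts = member.name.split("/")
            # Conventional layout: <name>-<version>/setup.py
            if member.isfile() and len(parts) == 2 and parts[1] == "setup.py":
                extracted = archive.extractfile(member)
                return extracted.read().decode("utf-8", errors="replace")
    raise FileNotFoundError("no top-level setup.py found in archive")
```

Reading the archive this way only decompresses and decodes bytes; nothing from the package is imported or executed. For a file on disk, the bytes can be passed in with pathlib.Path(...).read_bytes().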

To test this, we published a research package called prp1 on PyPI. We added code to its setup.py script that sends a signal to our site following successful code execution.

During our research, we discovered that our package had been downloaded by hundreds of servers worldwide. We posted a disclaimer about the behavior of the package in its description on GitHub and PyPI.

Workaround

There are safer ways to download Python packages, such as working directly with PyPI’s “simple” API:

https://pypi.org/simple/<package-name>/

For example, the files of the prp1 package mentioned above are listed at https://pypi.org/simple/prp1/ and a user can download them from there.
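The same workaround can be scripted. The sketch below assumes the index page is served as HTML in which each anchor’s text is a file name and its href is the download URL; the helper names list_files and fetch_file are ours, not part of any library. It only reads bytes over HTTPS and writes them to disk, so setup.py is never executed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collect {file name: absolute URL} from the anchors of an index page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.files: dict[str, str] = {}
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # The first non-empty text inside an anchor is taken as the file name.
        if self._href and data.strip():
            self.files[data.strip()] = urljoin(self.base_url, self._href)
            self._href = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def list_files(index_html: str, base_url: str) -> dict[str, str]:
    """Parse an index page and return its file names mapped to download URLs."""
    collector = _LinkCollector(base_url)
    collector.feed(index_html)
    return collector.files


def fetch_file(package: str, filename: str, dest: str) -> None:
    """Download one listed file for a package without invoking pip."""
    base_url = f"https://pypi.org/simple/{package}/"
    with urlopen(base_url) as response:
        files = list_files(response.read().decode("utf-8"), base_url)
    with urlopen(files[filename]) as response, open(dest, "wb") as out:
        out.write(response.read())
```

A call such as fetch_file("prp1", "prp1-1.0.5.tar.gz", "prp1-1.0.5.tar.gz") would save the archive for offline inspection, provided that file is still listed on the index.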

About Dustico

Dustico is a security startup focused on detecting open-source software supply-chain attacks. We develop automated behavioral analysis for the ongoing detection of malicious code in open-source packages.
