Avoiding the spaghetti installation
Many ways to install Python and Python packages have been developed over the years, and not all of them are compatible with each other. Scikit-HEP supports users of the two major systems: (a) pip with virtual environments and (b) conda-forge. For a newcomer, conda-forge is usually the simplest and most reliable way to get started, so we describe that. Also, we describe how to replace
mamba because it is the fastest way to install packages into that environment.
This page is for everyone, but especially newcomers to Python or package management. If, for instance, you’re having trouble installing Scikit-HEP packages—e.g.
pip install fails with an error or you get an
ModuleNotFoundError after you think you’ve installed it—then this page is for you.
What is conda-forge?
conda-forge is a “channel” for the conda package manager containing the Scientific Python ecosystem, Scikit-HEP, and even ROOT with carefully aligned package versions to ensure that you get a consistent, working system. Within a conda environment, you can still use pip to install packages that are not in this channel, thereby getting access to everything in the Python Package Index (PyPI), and everything in the conda environment is kept isolated from all other Python environments, so that you don’t disturb any applications that rely on a version of Python that ships with your operating system.
The software in conda-forge are not subject to Anaconda’s licensing restrictions, and the conda package manager is free software, so both can be used without any legal restrictions in national labs and universities.
Until recently, the (relatively) hard part had been to ensure that you’re using conda-forge, rather than an Anaconda default channel. The instructions below describe how to install Miniforge, which is conda-forge without the Anaconda default channel.
You likely have a package manager for your operating system, such as Homebrew, apt-get, or yum. Use conda for your Python packages and your operating system’s package manager for applications (web browsers, text editors, etc.).*
(* We’re doing conda a disservice by describing conda as a Python package manager, though it does much more, for the sake of keeping this description simple.)
What is "mamba"?
We recommend using
mamba, which is a drop-in replacement for
conda that is many times faster (in the “Solving environment: …” step). You particularly notice it when a package has many dependencies or complex version constraints on its dependencies.
In fact, the conda developers are incorporating mamba into conda. At the time of this writing, however, that integration is still experimental. These instructions will describe how to use
Where will the files go?
The entire Python distribution, with all packages and the binary shared libraries that support them, will go into a new directory, most likely in your home directory and named
mambaforge. All of the files in it are installed with your own user permissions (i.e. not superuser/requiring
How to remove conda/mamba cleanly.
- Delete that directory with
rm -rf ~/mambaforge.
- Delete a file named
~/.condarc, if you have one.
- Check your shell configuration file, probably named
~/.bashrc, for a “
>>> conda initialize” section. If you have one, delete it.
Those three steps will remove any vestige of the conda installation.
How to save an old package list before deleting it.
If you already have a conda installation, you can bundle your current environment into an environment file (a list of names and versions of packages) with
conda env export --from-history > old-environment.yml
After setting up a new conda system, you can reinstall all of those packages/versions with
conda env create -f old-environment.yml
Installing a Python environment
We’ll be using Miniforge to install the Python environment, which is distributed here on GitHub.
The steps of the installation procedure are (1) download an installer script, (2) run it, and (3) answer interactive prompts.
Of the four combinations Miniforge gives you (
mamba, Python vs PyPy), we recommend mamba with Python, which is this table. (Open that link in a new window.)
Within each table is a list of architectures. On Mac and Linux, you can get the name of your architecture from
It is very likely
x86_64. Select the installation script for your architecture by clicking or right-clicking the link on the Miniforge page.
On Mac or Linux, run the script with
Windows has a
start command; see Miniforge’s instructions.
The interactive prompts will ask you where you want to install it (default is
~/mambaforge) and whether you want to have it enabled whenever you start a new terminal or shell (probably “yes”). Saying “yes” to the latter inserts a “
>>> conda initialize” section in your shell configuration (probably
Deciding whether conda should take over your shell?
If you say “yes” to let the installer script modify your shell configuration, then the next terminal you open will be in the conda environment. For instance,
will run the conda environment’s Python, rather than any other Python you have installed on your computer. This is what conda calls the “base” environment (though you can create more environments that are independent of this one).
If, instead, you want to explicitly opt-into conda environments by calling a command, use
conda config --set auto_activate_base false
to prevent the “base” environment from being automatically loaded in each new terminal. Now all environments, including “base”, have to be explicitly activated with
conda activate name-of-environment
See managing environments in the conda documentation for more.
If you say “no” to not let the installer script modify your shell configuration, then you will have to manually find the path to the
conda executable, which is in
~/mambaforge/bin/conda. All of the above applies, but your shell might not be able to find
Regular maintenance of the environment
Now you’re ready to go. Instructions online tell you how to install packages, like
conda install name-of-package
Since you installed
mamba, you can replace
conda install with
mamba install to make the dependency resolution much faster.
mamba install name-of-package # fast!
There are no other differences, and you can always fall back on using the
conda command. (Necessary, for instance, in
conda activate name-of-environment.)
One of the first commands you should do after installation is
mamba update --all
to get the newest versions of all the installed packages (newer than the installation script). Then do this approximately once a week to stay up-to-date on all of your packages. (This command updates to the latest stable versions, not bleeding-edge versions unless you explicitly request them by version number.)
Another good command is
mamba clean --all
which removes cached package files (which are not needed, now that they’ve been installed). Sometimes, you can get gigabytes of disk space back.
Leveling up: multiple environments
One of conda’s major features is that it allows you to have completely separate Python versions and packages in different “environments.” See managing environments in the conda documentation on how to use this feature, especially if you need to switch between projects with different package or version requirements.
Maintaining separate environments for separate projects is one of our recommended “best practices”, whether you’re using conda or pip with virtualenv.