“Your code is the product of your environment.” - Robert Sandor


Python environments

Python and nearly all of the software packages in the scientific Python ecosystem are open-source. They are maintained and developed by a community of scientists and programmers, some of whose work is supported by universities, non-profits, and for-profit corporations. Because there is no single authority that controls all of these packages, coordinating the compatibility between these different packages and their multiple versions used to be a nightmare.

Sometimes when we create software, the software needs to run on a specific version of Python because our software expects a certain behavior that is present in older versions but changes in newer versions. Likewise, we may need to use specific versions of the libraries for similar reasons. But we may have many projects on our computer, perhaps a project that runs on with xarray version 0.1 and Python 2.7 and even a more modern project that runs on xarray version 0.8 and Python 3.8. If we try running both at once on Python 2.7 or Python 3.8, one of them may break because some of the code that runs on Python 2 doesn’t run on Python 3 or vice versa. This is where virtual environments become useful.

Virtual environments keep these dependencies in separate “sandboxes” so you can switch between both applications easily and get them running. Python environments take care of most of the dependencies and modules for us, but we will have to get our hands dirty in Python and deal with the environments.

In this section, we will learn how to use conda to manage a Python environment (“sandbox”) we play in.

drawing

Figure source

Creating a Python environment

Python coupled with a package manager provides a way to make isolated, reproducible environments where you have fine-tuned control overall packages and configuration. From now on, you are highly recommended to work within an environment, rather than the “default” environment.

To see all the environments on your system, type the following in Anaconda Powershell Prompt:

conda info --envs

This command should display the current environments, with * marking the environment you are in.

To quickly create an environment (called cper, standing for Computing and Programming for Environmental Research) using conda, type the following in Anaconda Powershell Prompt:

conda create --name cper python=3.8.8

In this command, the python=3.8.8 portion specifies which version of Python you want to set up the environment in; you can change the version to whatever suits your needs. Here --name sets the name of the environment. In other snippets you see online, you may see --n instead of --name; they mean exactly the same thing.

Now display the current environments with conda info --envs, you will something like this:

base                   * C:\Users\86133\anaconda3
cper                     C:\Users\86133\anaconda3\envs\cper

Using a Python environment

After confirming that you created the environment, let’s now actually use it. We can accomplish this by typing:

conda activate cper

At this point, your terminal prompt should look something like this:

(cper) PS C:\Users\86133>

Now display the current environments with conda info --envs, you will something like this:

base                     C:\Users\86133\anaconda3
cper                   * C:\Users\86133\anaconda3\envs\cper

The * indicates that we are in the cper environment.

To get a complete summary of all the packages installed in an environment, the channel they were installed from, and their full version info, type:

conda list

To stop using the cper environment, type in:

conda deactivate cper

Installing modules under a Python environment

Once you have a basic Python environment, you can easily add or remove modules using the conda command-line utility. Conda was created to help manage the complex dependencies and pre-compiled binary libraries that are necessary in scientific python. If you set up your python environment using the Anaconda Python Distribution or with miniconda, you should already have the conda command available on the command line. With it, you can easily install modules from an official, curated set of packages which are built and tested for a number of different system configurations on Linux, Windows, and macOS.

To install a module (say XXXX), first activate the environment (say cper) under which you would like to install it:

conda activate cper

Then type:

conda install XXXX

Additionally, there is a community-maintained collection of packages/recipes called conda forge which is accessible through conda as a special “channel”:

conda install -c conda-forge XXXX

While conda allows you to install almost any science-related package, there may be other general-use python packages you wish to you that are not available via conda. For these, you can use an alternative installation method.

Outside of the scientific python community, the most common way to install packages is to search for them on the official PyPI index. Once you’ve found the package you want to install (you may have also just found it on Github or elsewhere), you use the pip command from the command line:

pip install XXXX

If you can’t find a package on either PyPI or conda-forge, you can always install it directly from the source code. If the package is on github, pip already has an alias to do this for you:

pip install git+https://github.com/.../XXXX.git

Where https://github.com/.../XXXX.git indicates the url of the module.

Now let’s install some modules under the cper environment. Make sure you have activated the cper environment; otherwise, modules will be installed under the base environment.

First, we will install modules via conda:

conda install numpy pandas matplotlib netCDF4 xarray notebook cartopy dask python-graphviz

Later, we use pip to install a module via a different channel:

pip install nc-time-axis

Now use conda list to verify that we have successfully installed the above modules.

Using the environment file

As you type conda list, you get a complete summary of all the modules installed in your environment, the channel they were installed from, and their full version info. Using this info, you are able to create an environment file in YAML syntax (YAML Ain’t Markup Language) which documents the exact contents of your environment.

To export your current environment (cper) to a yml file (my_environment.yml):

conda env export --name cper > my_environment.yml

The > indicates that the output is writing to a file called my_environment.yml. If there were any contents in my_environment.yml before this command, they would be overwritten. Now in your working directory, a file my_environment.yml was just created, open it, you will see a list of configurations of the cper environment.

The environment file (my_environment.yml) is very useful, as it allows creating an identical environment. You can use different environment files for different projects, or share an environment file with someone to grantee reproducible research.

To create a new environment (say cper_new) through existing environment file (say ZZZZ.yml), go to the folder that contains this environment file, type:

conda env create --name cper_new --file ZZZZ.yml

Packaging a conda environment

If you want to copy the environment, you may want to use conda-pack that allows you to package your entire environment as a compressed file (a tar.gz) you can just copy and paste somewhere, and it just works.

Pack your environment:

conda pack

It can take some minutes depending on the size of the environment and the number of libraries installed. Move the compressed file to the destination, uncompress folder. Then use the standard Command Prompt (cmd.exe) on Windows, not Anaconda Prompt or PowerShell, go to the folder:

.\Scripts\activate.bat
.\Scripts\conda-unpack.exe

to re-create symlinks and set everything up. Check Moving Conda Environments and Packing Conda Environments for more.

Removing a conda environment

If you would like to get rid of the entire cper environment, deactivate it then type in:

conda remove --name cper --all

The --all flag is to remove all packages from the environment and is necessary to completely clean the environment.

For extensive documentation on using environments, please see the conda documentation.


The notes are modified from the excellent Python Environments, and Getting started with Python environments (using Conda), all available freely online.


In-class exercises

Exercise #1

Go through the notes, make sure you understand the scripts.