Setting Up A Conda Environment

Initialize The Virtual Environment

First, you need to set up your account to work more efficiently with conda environments (you should only need to do this once):

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda init
[barney@hpc ~]$ conda config --add envs_dirs ${HOME}/.conda/envs
[barney@hpc ~]$ conda config --add pkgs_dirs ${HOME}/.conda/pkgs

Create The Conda Environment

After running those commands, you'll need to log out and log back in so the changes take effect. Once you're back in, continue with the commands for whichever type of environment you need:

Basic Environment

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda create -n myenv ipython

This should create a conda environment using Python 3.9 and including the ipython package. If you know what other packages you need to install for your environment, it's best to add them to the conda create command when you first create the environment. However, you can always come back later and use the conda install command to add any additional packages. For instance,

[barney@hpc ~]$ conda install -n myenv humanize

will install the humanize package, which is useful for converting large numbers (like large file sizes) to more human-friendly values.
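
If you want a quick sanity check that the package is usable, a one-liner like this should do it (the exact output may vary slightly between humanize versions):

[barney@hpc ~]$ conda activate myenv
(myenv) [barney@hpc ~]$ python -c "import humanize; print(humanize.naturalsize(1234567890))"
1.2 GB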

TensorFlow With GPU Support

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda create -n myenv ipython tensorflow-gpu scikit-learn numpy=1.21

This should create a conda environment using Python 3.9, scikit-learn, and TensorFlow with GPU support (although it should work even without a GPU). It also pins numpy to version 1.21 to avoid an error that pops up in some versions of TensorFlow. If you need other packages besides tensorflow and scikit-learn that aren't already installed, it's best to add them to the conda create command when you first create the environment. However, you can always come back later and use the conda install command to add any additional packages. For instance, scikit-learn can benefit from installing the scikit-learn-intelex package:

[barney@hpc ~]$ conda install -n myenv scikit-learn-intelex

To use the scikit-learn-intelex module, make sure to add -m sklearnex to your python commands (either at the shell or in your PBS script).
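
For example, to run a hypothetical script named my_script.py with the Intel extension patched in:

[barney@hpc ~]$ python -m sklearnex my_script.py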

PyTorch With GPU Support

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda create -n myenv ipython pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
[barney@hpc ~]$ conda activate myenv
[barney@hpc ~]$ conda config --env --add channels nvidia
[barney@hpc ~]$ conda config --env --add channels pytorch
[barney@hpc ~]$ conda deactivate

This should create a conda environment using Python 3.9 and PyTorch with GPU support. It also includes the ipython package. If you know what other packages you need to install for your environment, it's best to add them to the conda create command when you first create the environment. However, you can always come back later and use the conda install command to add any additional packages. For instance,

[barney@hpc ~]$ conda install -n myenv humanize

will install the humanize package, which is useful for converting large numbers (like large file sizes) to more human-friendly values.
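
Once you're on a node that actually has a GPU, a quick sanity check (not part of the setup itself) is to ask PyTorch whether it can see the device; this should print True:

[barney@hpc ~]$ conda activate myenv
(myenv) [barney@hpc ~]$ python -c "import torch; print(torch.cuda.is_available())"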

Using The conda-forge Channel

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda create -n myenv ipython -c conda-forge --strict-channel-priority
[barney@hpc ~]$ conda activate myenv
[barney@hpc ~]$ conda config --env --add channels conda-forge
[barney@hpc ~]$ conda config --env --set channel_priority strict
[barney@hpc ~]$ conda deactivate

This should create a conda environment using Python 3.10 and including the ipython package from the conda-forge channel. It also sets conda-forge as the highest-priority channel for the environment. If you know what other packages you need to install for your environment, it's best to add them to the conda create command when you first create the environment. However, you can always come back later and use the conda install command to add any additional packages. For instance,

[barney@hpc ~]$ conda install -n myenv humanize

will install the humanize package, which is useful for converting large numbers (like large file sizes) to more human-friendly values.
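
If you later want to confirm that the environment really is preferring conda-forge, you can inspect the effective channel settings while the environment is active (a quick sanity check; the output format may differ between conda versions):

[barney@hpc ~]$ conda activate myenv
(myenv) [barney@hpc ~]$ conda config --show channels channel_priority
(myenv) [barney@hpc ~]$ conda deactivate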

DeepLabCut

First, you'll need a conda environment definition file for DeepLabCut. Below is a version that's been modified from the version 2.3 default so that it works on the HPC:

# DLC_HPC.yaml

# HPC Version of Conda File
# Hunter Eidson - 2023

#DeepLabCut2.0 Toolbox (deeplabcut.org)
#© A. & M. Mathis Labs
#https://github.com/DeepLabCut/DeepLabCut
#Please see AUTHORS for contributors.

#https://github.com/DeepLabCut/DeepLabCut/blob/master/AUTHORS
#Licensed under GNU Lesser General Public License v3.0
#
# DeepLabCut HPC environment
#
# install: conda env create -f DLC_HPC.yaml
# update:  conda env update -f DLC_HPC.yaml

name: DLC_HPC
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.8
  - pip
  - ipython
  - jupyterlab
  - nb_conda
  - ffmpeg
  - pip:
    - "deeplabcut[tf]"

Save this on the HPC as a file named DLC_HPC.yaml. Now you can create your DeepLabCut conda environment using the following commands:

[barney@hpc ~]$ module load Anaconda3/2023.07 CUDA/11.2.1 gcc/10.2.0
[barney@hpc ~]$ conda env create -f DLC_HPC.yaml

Any time you use this environment, you'll need to load all 3 of the modules listed above:

  1. Anaconda3/2023.07
  2. CUDA/11.2.1
  3. gcc/10.2.0
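
As a quick smoke test once the environment is built (and assuming the pip install completed successfully), you can try importing the package and printing its version; note that the first import can take a while, since it pulls in TensorFlow:

[barney@hpc ~]$ module load Anaconda3/2023.07 CUDA/11.2.1 gcc/10.2.0
[barney@hpc ~]$ conda activate DLC_HPC
(DLC_HPC) [barney@hpc ~]$ python -c "import deeplabcut; print(deeplabcut.__version__)"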

Jupyter / PySpark

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda create -n jupyter-pyspark notebook numpy scipy pandas ipywidgets nltk

This should create a conda environment using Python 3.9 and including Jupyter Notebook along with some common analysis packages (numpy, scipy, pandas, ipywidgets, nltk). If you know what other packages you need to install for your environment, it's best to add them to the conda create command when you first create the environment. However, you can always come back later and use the conda install command to add any additional packages. For instance,

[barney@hpc ~]$ conda install -n jupyter-pyspark humanize

will install the humanize package, which is useful for converting large numbers (like large file sizes) to more human-friendly values.
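
Once the environment is built, launching the notebook server on a compute node typically looks something like the sketch below. The port number is arbitrary, and how you connect to the server (usually an SSH tunnel) depends on your site's instructions:

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda activate jupyter-pyspark
(jupyter-pyspark) [barney@hpc ~]$ jupyter notebook --no-browser --ip=0.0.0.0 --port=8888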

Using The Conda Environment

Finally, anytime you want to use the environment, you need to do the following (either at the shell if you're running an interactive job, or in your PBS script before you run any python commands):

At The Command Line

[barney@hpc ~]$ module load Anaconda3/2023.07
[barney@hpc ~]$ conda activate myenv

In A Shell Script

module load Anaconda3/2023.07
eval "$(conda shell.bash hook)"
conda activate myenv
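
Putting it all together, a minimal PBS script might look like the sketch below. The job name, resource requests, and the script name myscript.py are placeholders; adjust them to fit your job:

#!/bin/bash
#PBS -N myjob
#PBS -l select=1:ncpus=4:mem=8gb
#PBS -l walltime=01:00:00

# Run from the directory the job was submitted from
cd ${PBS_O_WORKDIR}

# Load Anaconda and activate the environment
module load Anaconda3/2023.07
eval "$(conda shell.bash hook)"
conda activate myenv

python myscript.py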

At The Command Line (DeepLabCut)

[barney@hpc ~]$ module load Anaconda3/2023.07 CUDA/11.2.1 gcc/10.2.0
[barney@hpc ~]$ conda activate DLC_HPC

In A Shell Script (DeepLabCut)

module load Anaconda3/2023.07 CUDA/11.2.1 gcc/10.2.0
eval "$(conda shell.bash hook)"
conda activate DLC_HPC

Generally, using Apache Spark is going to require starting up an on-demand Spark cluster. We've provided some documentation about running batch and interactive jobs against such a cluster (as well as how to start one up).