Submitting TensorFlow Jobs With Slurm

Currently, the best way to use TensorFlow on the VHPC is to build your own Python virtual environment with conda, which lets you install a recent version of TensorFlow that works with the GPU nodes.

Setting up a Conda Environment

Follow the instructions at Setting Up A Conda Environment to create a conda environment for TensorFlow.
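As a rough sketch of what that setup looks like (the environment name myenv matches the submission script below; the Python version and the pip-based TensorFlow install are illustrative choices, not VHPC requirements):

```shell
# Load the conda provider module used on the VHPC
module load Miniforge3

# Create and activate an environment for TensorFlow
conda create -n myenv python=3.10
conda activate myenv

# Install a recent TensorFlow with GPU support into the environment
pip install tensorflow
```

Whatever name you give the environment, make sure the `conda activate` line in your submission script matches it.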

TensorFlow Shell Script

To create a submission script for the job, you can use the following as a starting point (be sure to change netid@kennesaw.edu and account_name in the script to the correct values for your job):

run_tf2.slurm
#!/usr/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=28
#SBATCH --gres=gpu:1
#SBATCH --time=4-04:00:00
#SBATCH --mail-type="BEGIN,END,FAIL"
#SBATCH --mail-user="netid@kennesaw.edu" (1)
#SBATCH --partition="defq"
#SBATCH --account="account_name" (2)

# Load the modules you need
module load Miniforge3

#Load your conda environment
eval "$(conda shell.bash hook)"
conda activate myenv

# Change Directory to the working directory
cd ${SLURM_SUBMIT_DIR}

# Run your code:
python3 ${FILE}
1. 🙋‍♂️ Make sure to change netid@kennesaw.edu on this line to your KSU email address, or you won't receive emails when the job starts and stops.
2. 🙋‍♂️ Make sure to change account_name on this line to the VHPC billing account your job should be charged to.

Usage

You can schedule your job in the queue by running the following (assuming you named your TensorFlow script my-tf2-code.py and the submission script run_tf2.slurm):

[barney@vhpc ~]$ sbatch --export=ALL,FILE=${PWD}/my-tf2-code.py ${PWD}/run_tf2.slurm
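When the job runs, the submission script executes python3 on the file passed via FILE. Below is a hypothetical my-tf2-code.py showing a sanity check you might run first: it prints a few standard Slurm environment variables for the allocation, then lists the GPUs TensorFlow can see. The script body is only an illustration, not part of the cluster documentation:

```python
import os

def slurm_allocation():
    """Collect the Slurm environment variables relevant to this guide."""
    keys = ("SLURM_JOB_ID", "SLURM_SUBMIT_DIR", "SLURM_CPUS_PER_TASK")
    return {k: os.environ.get(k, "(not set)") for k in keys}

if __name__ == "__main__":
    # Report the allocation this script is running under
    for key, value in slurm_allocation().items():
        print(f"{key}={value}")

    # Confirm TensorFlow can see the GPU requested with --gres=gpu:1
    try:
        import tensorflow as tf  # available inside the activated conda env
        print("GPUs visible:", tf.config.list_physical_devices("GPU"))
    except ImportError:
        print("TensorFlow is not installed in this environment")
```

After submitting, you can check the job's place in the queue with `squeue -u $USER`; by default, the job's output is written to a slurm-&lt;jobid&gt;.out file in the directory you submitted from.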