Submitting TensorFlow Jobs With Torque
Currently, the best way to use TensorFlow on the HPC is to build your own Python virtual environment with conda, which lets you install a recent version of TensorFlow that works with the GPU nodes.
Setting up a Conda Environment
Follow the instructions at Setting Up A Conda Environment to set up a conda environment for TensorFlow.
TensorFlow Shell Script
To set up a submission script to run the job, you can use the following as a starting point (making sure to change netid@kennesaw.edu in the script to your correct email address):
run_tf2.pbs
- Make sure to change netid@kennesaw.edu in the script to your KSU email address, or you won't receive emails when the job starts and stops.
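For orientation, Torque reads scheduler options from `#PBS` comment lines at the top of a submission script. The header below is a minimal sketch of such directives and not the cluster's official run_tf2.pbs: the job name and the resource values (cores, GPUs, walltime) are placeholders chosen for illustration, and your cluster may also require a queue or other options that are not shown here.

```shell
#!/bin/bash
# Sketch of a Torque directive header; all values below are assumptions.
#PBS -N tf2-job                    # job name (placeholder)
#PBS -l nodes=1:ppn=4:gpus=1       # one node, 4 cores, 1 GPU (placeholder sizes)
#PBS -l walltime=01:00:00          # maximum run time (placeholder)
#PBS -j oe                         # merge stderr into the stdout file
#PBS -m abe                        # send mail on abort, begin, and end
#PBS -M netid@kennesaw.edu         # change to your KSU email address
```

After a header like this, the body of the script would typically activate your conda environment and run the Python file passed in through the `FILE` variable.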
Usage
You can schedule your job in the queue by running the following (let's assume you named your TensorFlow script my-tf2-code.py, and the submission script is named run_tf2.pbs):
[barney@hpc ~]$ qsub -vFILE=${PWD}/my-tf2-code.py ${PWD}/run_tf2.pbs
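The command above passes both files as absolute paths by prefixing them with `${PWD}`. As a convenience, a small wrapper can check that both files exist and expand relative paths before you submit. This is only a sketch: the function name `build_qsub_cmd` is hypothetical and not part of Torque or the cluster's tooling, and it prints the command rather than running it so you can inspect it first.

```shell
#!/bin/bash
# Hypothetical helper: prints a qsub command with absolute paths.
# Assumes paths contain no whitespace; quote them yourself otherwise.

build_qsub_cmd() {
    local tf_script=$1 pbs_script=$2

    # Refuse to continue if either file is missing.
    if [ ! -f "$tf_script" ]; then
        echo "error: TensorFlow script not found: $tf_script" >&2
        return 1
    fi
    if [ ! -f "$pbs_script" ]; then
        echo "error: submission script not found: $pbs_script" >&2
        return 1
    fi

    # Resolve each path to an absolute one.
    local tf_abs pbs_abs
    tf_abs="$(cd "$(dirname "$tf_script")" && pwd)/$(basename "$tf_script")"
    pbs_abs="$(cd "$(dirname "$pbs_script")" && pwd)/$(basename "$pbs_script")"

    # FILE is the variable that the submission script receives via qsub -v.
    printf 'qsub -v FILE=%s %s\n' "$tf_abs" "$pbs_abs"
}
```

Run from the directory that holds both files, `build_qsub_cmd my-tf2-code.py run_tf2.pbs` prints a qsub line equivalent to the one shown above, which you can then copy and run.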