SLURM
SLURM (Simple Linux Utility for Resource Management) is the workload manager used on the KSU HPC cluster. It is an open-source, highly scalable job scheduler designed for large and small Linux clusters. SLURM manages job submission, resource allocation, and job monitoring.
Warning
All jobs on this cluster must specify a valid billing account using the -A or --account option.
- Example: sbatch -A research job.sh
- Example: interact -A teaching -N 1 -n 4 -t 2:00:00
To avoid typing -A <billing_acct> each time, set the appropriate environment variables in your shell startup file (e.g. .bashrc):
export SBATCH_ACCOUNT=research
export SRUN_ACCOUNT=research
export SALLOC_ACCOUNT=research
Inside a running job, SLURM also defines SLURM_JOB_ACCOUNT to indicate the billing account in use.
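For example, a job script can record which account it is being billed to (a minimal sketch; the echo line is illustrative):
echo "This job is billed to: $SLURM_JOB_ACCOUNT"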
Submitting Jobs
Jobs are submitted to SLURM using the sbatch command along with a job script. A basic job script looks like this:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=job.out
#SBATCH --error=job.err
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --account=research
# Your commands go here; this example just prints the compute node's name
hostname
Submit the job with:
sbatch -A research job.sh
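If the submission succeeds, sbatch replies with the new job's ID (the number shown is illustrative):
Submitted batch job 123456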
Interactive Jobs
The recommended way to start an interactive session on the cluster is to use the interact wrapper script. This utility calls salloc under the hood, but provides a simpler interface and adds cluster-specific options.
Using interact
Basic example (defaults to 1 node, 2 tasks, 1 hour):
interact -A research
Request 1 node with 4 tasks for 2 hours:
interact -A research -N 1 -n 4 -t 2:00:00
Enable X11 forwarding for graphical applications:
interact -A research -X
Request a GPU with exclusive compute mode:
interact -A research -G -C exclusive
Run with a different shell:
interact -A research -s /bin/zsh
See all options:
interact -h
Note
The interact script respects the same environment variables as the native SLURM commands: SBATCH_ACCOUNT, SRUN_ACCOUNT, and SALLOC_ACCOUNT. Setting these in your .bashrc or .bash_profile ensures consistent defaults across sbatch, salloc, srun, and interact.
Advanced: Using salloc Directly
If you prefer raw SLURM commands, you can request an allocation directly with salloc:
salloc -A research --nodes=1 --ntasks=2 --time=01:00:00 --mem=2G
Once your session starts, you'll be placed on a compute node. Launch commands inside the allocation with srun, for example:
srun hostname
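When you are finished, type exit to leave the shell and release the allocation. A typical session looks like this (the job ID and node name are illustrative):
$ salloc -A research --nodes=1 --ntasks=2 --time=01:00:00 --mem=2G
salloc: Granted job allocation 123456
$ srun hostname
node001
node001
$ exit
salloc: Relinquishing job allocation 123456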
Note
interact is the preferred interface for most users; salloc is available for advanced usage and scripting.
Job Options
SLURM job options are set with #SBATCH directives in the job script or as command-line arguments to sbatch.
Common options include:
- --job-name=<name>: Sets the job name.
- --output=<file>: Path for standard output.
- --error=<file>: Path for standard error.
- --time=HH:MM:SS: Walltime limit.
- --nodes=<N>: Number of nodes.
- --ntasks=<N>: Total number of tasks (MPI processes).
- --cpus-per-task=<N>: Number of CPUs for each task (for threaded codes).
- --mem=<size>: Memory required per node.
- --account=<billing_acct>: Billing account to charge the job against. Required.
- --partition=<queue>: Partition (queue) to submit the job to.
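Any of these options can also be given on the sbatch command line, where they override the corresponding #SBATCH directives in the script. This is convenient for one-off changes (values shown are illustrative):
sbatch -A research --time=00:30:00 --mem=4G job.sh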
Monitoring Jobs
To view only your jobs:
squeue -u $USER
To view all jobs you are allowed to see:
squeue
To filter jobs by billing account (useful if you have access to more than one):
squeue -A research
To view details about a specific job:
scontrol show job <jobid>
To view all your jobs (completed and running):
sacct -u $USER
To filter by billing account if you have more than one:
sacct -A research -u $USER
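sacct also accepts a --format option to select columns; for example, to see each job's final state, elapsed time, and peak memory use (all standard sacct field names):
sacct -u $USER --format=JobID,JobName,State,Elapsed,MaxRSS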
Canceling Jobs
To cancel a job:
scancel <jobid>
To cancel all your jobs under a billing account:
scancel -A research -u $USER
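To cancel all of your jobs at once, regardless of account, or only those still waiting in the queue:
scancel -u $USER
scancel -u $USER --state=PENDING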
Example: Parallel Jobs with MPI
#!/bin/bash
#SBATCH --job-name=mpi_test
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=02:00:00
#SBATCH --mem=4G
#SBATCH --account=research
# Load the cluster's MPI environment
module load openmpi
# srun launches one MPI rank per allocated task
srun ./my_mpi_program
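Inside a batch job, srun inherits the geometry from the #SBATCH directives, so the command above starts 2 × 16 = 32 ranks. To run a smaller test inside the same allocation, pass an explicit task count (the value shown is illustrative):
srun -n 4 ./my_mpi_program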
Example: Shared Memory Jobs (OpenMP)
#!/bin/bash
#SBATCH --job-name=omp_test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=01:00:00
#SBATCH --mem=8G
#SBATCH --account=research
# Match the OpenMP thread count to the CPUs SLURM allocated to this task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_program
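If the same program is sometimes run outside of a SLURM job, SLURM_CPUS_PER_TASK will be unset; a shell default guards against launching with an empty value (a minimal sketch):
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}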
Useful Commands
- sbatch -A <billing_acct> script.sh: Submit a batch job.
- interact -A <billing_acct>: Start an interactive session.
- salloc -A <billing_acct>: Advanced interactive allocation.
- srun: Run a job interactively inside an allocation or launch tasks in a batch job.
- squeue -u $USER: View your pending and running jobs. (-A <billing_acct> optional to filter.)
- sacct -u $USER: View your completed and running jobs. (-A <billing_acct> optional to filter.)
- scontrol show job <jobid>: Show detailed information about a job.
- scancel <jobid>: Cancel a job. (-A <billing_acct> optional to filter multiple jobs.)