Scheduling jobs

Our data and processing script are now loaded onto the HPC, in our home folder on the login node (remember, don’t run jobs on this node). Next we need to tell the HPC to run our script. We could run it here, but we won’t, because we don’t want to stop other users from accessing the HPC. On the HPC, a piece of software known as a scheduler manages which jobs run where and when.

The scheduler used on Griffith’s HPC is the PBS batch queuing system. PBS is not used everywhere, but running jobs is similar regardless of the scheduler: the exact syntax might change, but the concepts remain the same.

Running a batch job

The most basic use of the scheduler is to run a command non-interactively; this is also referred to as batch job submission. In this case, a job is just a shell script. The following is a demo shell script (file extension .sh) that tells the PBS scheduler how to run your script on the HPC (in this case an R script).

Important for Windows users: you will need to save your scheduler script with Unix line endings, which you can do using the software Notepad++. In Windows, open Notepad++ and create your scheduler script, then go to “Edit” -> “EOL Conversion” -> select “UNIX (LF)”, then save it as a .sh file. If you don’t do this the HPC will terminate your job. You can also use the dos2unix tool to convert a file from DOS to Unix format (and unix2dos for the reverse).
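If you are not sure whether a script already has Unix line endings, you can check and convert it on the command line before submitting. This is a minimal sketch; scheduler.sh is a hypothetical filename, and it assumes the file and dos2unix utilities are installed on the machine you are working on.

file scheduler.sh        # reports "with CRLF line terminators" if the file still has Windows endings
dos2unix scheduler.sh    # convert the file to Unix (LF) line endings in place

The full demo scheduler script follows.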

#!/bin/bash
#PBS -m abe
#PBS -M emailAddress@griffith/griffithuni.edu.au
#PBS -N SimpleTest 
#PBS -l select=1:ncpus=1:mem=12gb,walltime=0:01:00

cd $PBS_O_WORKDIR
module load R/3.6.1

# Run the R script on the HPC; R writes its console output to results.Rout
R CMD BATCH /export/home/<sNumber>/nameOfRFile.R results.Rout

Breakdown of the above shell script

#!/bin/bash - run the job with the bash shell.
#PBS -m abe - send an email when the job aborts (a), begins (b), or ends (e).
#PBS -M - the email address those notifications are sent to.
#PBS -N SimpleTest - the name of the job as it appears in the queue.
#PBS -l select=1:ncpus=1:mem=12gb,walltime=0:01:00 - request 1 CPU and 12 gigabytes of memory, with a maximum run time (walltime) of 1 minute.
cd $PBS_O_WORKDIR - change into the directory the job was submitted from.
module load R/3.6.1 - load the R module so the R command is available on the compute node.
R CMD BATCH /export/home/<sNumber>/nameOfRFile.R results.Rout - run the R script and write its console output to results.Rout.
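Once the scheduler script is saved on the HPC, you submit it to the queue with qsub. A minimal sketch, assuming the script above was saved as nameOfSchedulerFile.sh (a hypothetical filename):

qsub nameOfSchedulerFile.sh    # submit the job; PBS replies with a job ID you can use to track it

The emails requested with -m abe will then let you know when the job begins, ends, or aborts.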

What’s wrong with the following scheduler script?

#!/bin/bash
#PBS -N simple_failing_test

echo 'This script is running on:'
hostname
sleep 120

This is the error that will be thrown

qsub: Job has no walltime requested

Solution

It’s missing the walltime argument, which is required. Below is the same script with walltime added.

#!/bin/bash
#PBS -N simple_failing_test
#PBS -l walltime=00:03:00

echo 'This script is running on:'
hostname
sleep 120
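You can then submit the corrected script and confirm it is queued or running. A minimal sketch, assuming the corrected script is saved as simple_test.sh (a hypothetical filename):

qsub simple_test.sh      # submit the job to the scheduler
qstat -u <sNumber>       # list your jobs; Q means queued, R means running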

Customizing a job

Write the relevant section of a PBS scheduler script that will run on the HPC using 2 CPUs, 4 gigabytes of memory, and 5 minutes of walltime.

Solution

#PBS -l select=1:ncpus=2:mem=4gb,walltime=00:05:00

Write output files to a specific folder

Write the relevant section of a PBS scheduler script that will write output files to the folder ‘/export/home/s1234567/HPC_Tutorial’

Solution

cd /export/home/s1234567/HPC_Tutorial
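In a full scheduler script this cd line sits after the #PBS directives and before the command that runs your analysis, so any files the analysis creates land in that folder. A minimal sketch, assuming the same R workflow as the demo script above (the job name and resource request are placeholders):

#!/bin/bash
#PBS -N write_to_folder
#PBS -l select=1:ncpus=1:mem=4gb,walltime=0:05:00

cd /export/home/s1234567/HPC_Tutorial
module load R/3.6.1
R CMD BATCH /export/home/s1234567/nameOfRFile.R results.Rout    # results.Rout is written to HPC_Tutorial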

Running jobs with large resource requirements

If you plan to run a job with high resource requirements (e.g. large memory or a long walltime) then you should place it in the routeq queue. If you don’t, your job may sit in the queue for weeks to months before the HPC even starts to execute your analysis.

Add the following line to your scheduler script to run in the high resource queue.

#PBS -q routeq
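Putting this together, a large-resource job combines the routeq queue with a bigger resource request. This is only a sketch; the job name, resource amounts, and file names are placeholders, and the actual limits for each queue are set by the HPC administrators:

#!/bin/bash
#PBS -N large_job
#PBS -q routeq
#PBS -l select=1:ncpus=4:mem=64gb,walltime=48:00:00

cd $PBS_O_WORKDIR
module load R/3.6.1
R CMD BATCH nameOfRFile.R results.Rout    # run the analysis; output goes to results.Rout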