Texas Tech University

Job Submission

Step 1: Log on to quanah.hpcc.ttu.edu using your eRaider account and password.


Step 2: Create a job script file for your job submission. The following is an example job submission script, "test.sh", which you can use to submit jobs on Quanah.


#!/bin/sh
#$ -V
#$ -cwd
#$ -S /bin/bash
#$ -N TestJob
#$ -o $JOB_NAME.o$JOB_ID
#$ -e $JOB_NAME.e$JOB_ID
#$ -q omni
#$ -pe mpi 36
#$ -P quanah
hostname


In the script, the lines starting with "#$" are SGE options.
"-V" means use the current environment setting in the batch job.
"-cwd" means use the current directory where the job is submitted as the job's working directory.
"-S /bin/bash" means use /bin/bash as the shell for the batch session.
"-N TestJob" means use "TestJob" as this job's name. This can be referred to later in the script using the variable "$JOB_NAME".
"-o $JOB_NAME.o$JOB_ID" and "-e $JOB_NAME.e$JOB_ID" set the files for standard output and standard error, respectively. $JOB_ID is a unique number that identifies the job.
"-q omni" means the job will be submitted to the queue named "omni".
"-pe mpi 36" means use the parallel environment "mpi", and request 36 cores.
"-P quanah" means this job belongs to "quanah" project/cluster.
In the last line, "hostname" is the actual command that the job runs. You may replace it with whatever commands you need to run.
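As a small illustration of the variables mentioned above, the "hostname" line in test.sh could be replaced with lines that report which job is running and where. This is only a sketch and not part of the original test.sh. The function name report_job is hypothetical, and the fallback values after ":-" are an assumption added so that the same lines also run by hand outside the scheduler, where JOB_NAME, JOB_ID, and NSLOTS are not set.

```shell
# Report the job name, job ID, and slot count that SGE sets for a running job.
# The values after ":-" are used only when the variable is unset,
# for example when the script is run by hand instead of through qsub.
report_job() {
    echo "Job ${JOB_NAME:-unnamed} (ID ${JOB_ID:-none}) has ${NSLOTS:-1} slot(s) on $(hostname)"
}

report_job
```

When the job runs under the scheduler, this line ends up in the standard output file named by the "-o" option.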

Examples of job submission scripts for Quanah can be copied into your local folder by typing one of the following into the shell:

cp /lustre/work/apps/examples/quanah/serial.sh .
cp /lustre/work/apps/examples/quanah/mpi.sh .

It would look like this on your command line:

[Screenshot: CopyJobScripts]

An example of a job submission script that may be used to submit a job to the serial queue would look like this:

[Screenshot: Serial Script]

In this script, helloworld.py is the name of a Python script, and python is the command that runs the Python interpreter on it.
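A serial script of the kind described above might look roughly like the following sketch, which writes the script to a file from the shell. The directives are assumptions modeled on test.sh, not a copy of the serial.sh shipped on the cluster: the file name my_serial.sh and the job name SerialJob are hypothetical, and "-pe sm 1" is a guess at a single-core request. Check the copied serial.sh for the settings actually recommended.

```shell
# Write a sketch of a serial job script to my_serial.sh (hypothetical file name).
# The quoted 'EOF' keeps $JOB_NAME and $JOB_ID literal so that SGE,
# not the current shell, fills them in when the job runs.
cat > my_serial.sh <<'EOF'
#!/bin/sh
#$ -V
#$ -cwd
#$ -S /bin/bash
#$ -N SerialJob
#$ -o $JOB_NAME.o$JOB_ID
#$ -e $JOB_NAME.e$JOB_ID
#$ -q omni
#$ -pe sm 1
#$ -P quanah
python helloworld.py
EOF

# Check the script for shell syntax errors without running it.
sh -n my_serial.sh && echo "my_serial.sh has no shell syntax errors"
```

The syntax check only parses the file. It does not run python, so helloworld.py does not need to exist yet.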

On Quanah, each node has 36 CPU cores. If you submit a parallel job that will run on multiple nodes using MPI, you must request a multiple of 36 cores for the job using the option "-pe mpi". If you submit a parallel job that will run on multiple cores but on a single node using OpenMP, you can request any number of cores from 1 to 36 using the option "-pe sm". If you submit serial or array jobs, you can request any number of cores using the option "-pe fill". For more information about the parallel environments, you may refer to the Quanah Parallel Environment Guide.
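The multiple-of-36 rule for "-pe mpi" can be checked before a script is submitted. The helper below is a hypothetical sketch (the name mpi_nodes is not an SGE or Quanah command): it prints how many 36-core nodes a core count corresponds to and fails when the count is not a positive multiple of 36.

```shell
# Cores per Quanah node, as stated in the text.
CORES_PER_NODE=36

# Print the number of 36-core nodes that a request of $1 cores corresponds to.
# Returns 1 and prints an error on stderr if $1 is not a positive multiple of 36.
mpi_nodes() {
    cores=$1
    if [ "$cores" -gt 0 ] && [ $((cores % CORES_PER_NODE)) -eq 0 ]; then
        echo $((cores / CORES_PER_NODE))
    else
        echo "error: $cores is not a positive multiple of $CORES_PER_NODE" >&2
        return 1
    fi
}

mpi_nodes 72          # prints 2
mpi_nodes 40 || true  # prints an error on stderr and returns 1
```

A request such as "-pe mpi 72" passes this check, while "-pe mpi 40" does not.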


Step 3: Use the SGE qsub command to submit a batch job with the following syntax:

qsub test.sh

If you saved the script from the previous step as serial.sh, then the command to submit that job would look like this:

[Screenshot: Job Submission]

You should use the name of your own job submission script. For example, if your job script is named "job1.sh", you should use the command "qsub job1.sh" to submit the job.
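For repeated submissions, the qsub call can be wrapped in a small function that first checks that the script exists. This is a sketch under stated assumptions: submit_job is a hypothetical helper name, and the "-terse" option (which asks qsub to print only the job ID) is assumed to be supported by the SGE version installed on Quanah. Drop it if your qsub rejects it.

```shell
# Submit the job script given as $1 and print the resulting job ID.
# Returns 1 if the script is missing and 2 if qsub is not on the PATH
# (for example, when this is run somewhere other than a Quanah login node).
submit_job() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "error: job script '$script' not found" >&2
        return 1
    fi
    if ! command -v qsub >/dev/null 2>&1; then
        echo "error: qsub not found; run this on a Quanah login node" >&2
        return 2
    fi
    # -terse: print only the job ID instead of the full confirmation message.
    qsub -terse "$script"
}

# Example usage (uncomment on the cluster):
# submit_job job1.sh
```

Checking for the file first gives a clearer message than the scheduler's own error when the script name is mistyped.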


Step 4: You can use the command "qstat" to check your job's status. If the job is in "r" status it means the job is running. If the job is in "qw" status it means the job is waiting in the queue.

For the example that we have been working with, the qstat command would look like this while the job is waiting in the queue:

[Screenshot: Queue Status]

And it would look like this while the job is running:

[Screenshot: QstatRunning]

Notice that the state of the job changed from "qw" to "r".
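The two state codes described in this step can be turned into readable messages with a small helper. The function name describe_state is hypothetical, and only the "qw" and "r" codes covered above are handled. Any other code is passed through unchanged. The qstat call at the end is guarded so the snippet also runs on a machine without SGE.

```shell
# Translate a qstat state code into the description given in this guide.
describe_state() {
    case "$1" in
        qw) echo "waiting in the queue" ;;
        r)  echo "running" ;;
        *)  echo "other state: $1" ;;
    esac
}

describe_state qw   # prints: waiting in the queue
describe_state r    # prints: running

# List your own jobs if qstat is available on this machine.
if command -v qstat >/dev/null 2>&1; then
    qstat -u "${USER:-$(id -un)}"
fi
```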

For more information about how to check and monitor your job status, you may refer to the user guide How to Submit and Run Jobs on Hrothgar. The job scheduler used on Quanah is the same as the one used on Hrothgar.


If you have questions or need assistance, please contact us at hpccsupport@ttu.edu.

High Performance Computing Center