Using ‘kraken’ Computational Cluster

Introduction

One of the main purposes of the kraken cluster is to accommodate especially long-running programs. Users who run long jobs (which take hours or days to run) need to run them through the Torque Scheduler. Torque handles these jobs on a first-come, first-served basis with an additional fair-share policy, so that all users can have their jobs started within a reasonable time. In this manner, all jobs run more efficiently and finish sooner, since each is allowed to have all system resources for the duration of its run. All Torque jobs must be launched from the kraken server.

How to Log In

You can log in to the kraken cluster or copy your files using the SSH protocol. Useful programs for connecting are PuTTY and WinSCP on Windows, or the ssh and scp commands on Linux. Please refer to their manuals for instructions on how to use them. The address of the cluster is kraken.phys.p.lodz.pl. To log in, use the username and password that have been provided to you. After the first login, it is recommended that you change your password using the passwd command. You may put all your files in your home directory. They will be accessible from all the computing nodes through the cluster's internal network filesystem.

The Most Common Commands

The basic commands provided by Torque for starting and stopping jobs and for manipulating jobs in queues are shown below. For the complete manual of a Torque command, type “man <cmd>” (e.g. “man qsub”). These commands are:

  • qsub: the basic command for running jobs,
  • qstat: shows currently running and queued jobs,
  • qnodes: gives the current status of all nodes,
  • qpeek: shows the current output of a job in progress,
  • qhold and qrls: hold and release a running job.

How to Run a Batch Job

The Torque Scheduler will not accept any program directly. It is designed instead to accept a shell script — a .sh file — which itself runs the commands necessary to launch your program. Once your script is ready, you can submit it to Torque with the qsub command:

qsub your_script.sh

Any $ symbol shown before a command is part of the shell prompt and should not be typed. When this command is issued, you will be given a job number (we will call it XX in this example), which is used for tracking and manipulating your job. The standard output and standard error from the job will be copied to files in your current directory named your_script.sh.oXX and your_script.sh.eXX for your reference.
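As an illustration of what such a script might contain, here is a minimal sketch. It assumes Torque's usual behavior of starting a job in your home directory, so it first changes to the directory from which qsub was called (PBS_O_WORKDIR, listed in the environment variable table at the end of this page). The fallback values after :- are only there so that the script can also be tried directly in a shell, outside Torque.

```shell
#!/bin/bash
# Minimal job script sketch (assumed contents of your_script.sh).

# Torque sets PBS_O_WORKDIR to the directory where qsub was called;
# fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# PBS_JOBID is set by Torque; "interactive" is a stand-in for manual runs.
JOB_LABEL="${PBS_JOBID:-interactive}"
echo "Job ${JOB_LABEL} started on $(uname -n) in $(pwd)"

# Replace this line with the command that launches your program.
echo "Job ${JOB_LABEL} finished"
```

Everything the script prints ends up in the your_script.sh.oXX file described above.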

You can specify the queue to which your job is assigned with the -q option. The available queues are listed in the table at the end of this page. They differ in the default number of processors and the amount of memory assigned to your job. If you want different resources than the queue default, you can specify them with the -l option. Note, however, that you cannot exceed the maximum constraints of the chosen queue, although you are welcome to decrease the required resources. In the following example, the test1.sh script requires 4 processors in order to run and the calculations are expected to take more than 2.5 h, so it is assigned to the queue longer4. However, the required memory is only 4GB (instead of the default 12GB). Hence, such a job can be submitted with the command:

qsub -q longer4 -l mem=4GB test1.sh

Once this command is issued, the system will verify that there is a machine with 4 processors and 4GB of memory ready for use. If there is, it will allocate 4 processors for this job and the job will run. If 4 processors or 4GB of free memory are not yet available, the job will be put in the queue. The most popular resources that can be requested are specified in the table at the end of this page.
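Before submitting, it can help to see exactly which command would be sent to Torque. The sketch below wraps qsub in a hypothetical helper named submit (it is not part of Torque) that only prints the command while DRY_RUN is set to 1.

```shell
#!/bin/bash
# submit is a hypothetical helper, not a Torque command.
# With DRY_RUN=1 it prints the qsub call instead of executing it.
DRY_RUN=1

submit() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "qsub $*"
    else
        qsub "$@"
    fi
}

# Same request as in the text: queue longer4 with memory reduced to 4GB.
PREVIEW=$(submit -q longer4 -l mem=4GB test1.sh)
echo "Would run: ${PREVIEW}"
```

Setting DRY_RUN=0 on the kraken server makes the helper call qsub for real.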

More details of the qsub command are specified in the appendix.

Monitoring Jobs

A user can issue the qstat command to track the status of the job. In the example below, the status of this job is shown as “R” (Running). A job can also show a status of “Q” (Queued), “H” (Hold), or “C” (Completed). The sample output of the qstat command looks as follows:

maciek@kraken:qstat

Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
3177.kraken               test1.sh         maciek          00:01:28 R quick4
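
In scripts it is sometimes convenient to extract just the state letter of one job. The following sketch does this with awk on the sample output above, stored in a variable so that the example runs without Torque; on the cluster you would pipe the real qstat output instead. The column position (fifth field) is read off the sample and is an assumption about the default qstat format.

```shell
#!/bin/bash
# Sample qstat output copied from the text; on the cluster use: qstat
QSTAT_OUTPUT='Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
3177.kraken               test1.sh         maciek          00:01:28 R quick4'

JOB_ID="3177.kraken"

# Select the row whose first field is the job ID and print field 5 (state).
STATE=$(printf '%s\n' "$QSTAT_OUTPUT" | awk -v id="$JOB_ID" '$1 == id { print $5 }')
echo "Job ${JOB_ID} is in state ${STATE}"
```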

Job Arrays

To submit a large number of similar cluster jobs, there are two basic approaches. A shell script can be used to repeatedly call qsub passing in a customized PBS script (either by creating a temporary PBS script file or by piping the PBS commands into qsub).

The preferred approach, which is simpler and potentially more powerful, is to submit a job array using one PBS script and a single call to qsub. Job arrays hand off the management of large numbers of similar jobs to the Resource Manager and Scheduler and provide a mechanism that allows cluster users to reference an entire set of jobs as though it were a single cluster job.

Submitting Job Arrays

Job arrays are submitted by including the -t option in a call to qsub, or by including a #PBS -t directive in your PBS script. The -t option takes a comma-delimited list whose items are either single array ID numbers or ranges given as two ID numbers separated by a dash.

Each job in the job array will be launched with the same PBS script and in an identical environment, except for the value of its array ID. The value of the array ID for each job in a job array is stored in the PBS_ARRAYID environment variable.

For example, if a job array is submitted with 10 elements, numbered from 1 to 10, the submission command would be the following:

qsub -t 1-10 array_script.sh

An optional parameter, the slot limit, can be added to the end of the -t option to specify the maximum number of job array elements that can run at one time. The slot limit is specified by appending a “%” to the -t option followed by the slot limit value. A twelve element job array with non-sequential array IDs and a slot limit of 3 could be specified as follows:

qsub -t 1-3,5-7,9-11,13-15%3 array_script.sh
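
To check which element IDs a specification like the one above covers before submitting it, the list can be expanded in the shell. This is only a sketch: expand_array_spec is a hypothetical helper written for this page, not a Torque command, and it handles just the forms described here (single IDs, dash-separated ranges, and an optional %slot_limit suffix, which it ignores).

```shell
#!/bin/sh
# expand_array_spec: hypothetical helper that prints one array ID per line.
expand_array_spec() {
    spec=${1%%\%*}            # strip the optional %slot_limit suffix
    old_ifs=$IFS
    IFS=,                     # split the list at commas
    for part in $spec; do
        case $part in
            *-*)              # a range such as 5-7
                i=${part%-*}
                hi=${part#*-}
                while [ "$i" -le "$hi" ]; do
                    echo "$i"
                    i=$((i + 1))
                done
                ;;
            *)                # a single ID
                echo "$part"
                ;;
        esac
    done
    IFS=$old_ifs
}

IDS=$(expand_array_spec "1-3,5-7,9-11,13-15%3")
COUNT=$(printf '%s\n' "$IDS" | wc -l | tr -d ' ')
echo "The array has ${COUNT} elements"
```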

Each job included in the job array has its own unique array element value stored in the PBS_ARRAYID environment variable. The value of each job array element’s array ID can be accessed by the job script just like any other shell environment variable. If the job ran a bash shell script, the job’s array ID could be printed to STDOUT using the following command:

echo "Current job array element's Array ID: ${PBS_ARRAYID}"

Customizing Data for Job Array Elements

A more useful task for the array ID, and the real power of job arrays, is to use the job’s array ID as a direct or indirect index into the data being processed by the job array.

One approach is to use the PBS_ARRAYID value to provide a custom set of input parameters for each job in the job array. To do this, create a text file containing multiple lines, each consisting of a series of space-delimited values. Each line in the data file contains the input parameters needed by one element of the job array. The PBS script is then modified to include a command that reads the correct line of the data file, based on the PBS_ARRAYID value of that particular job. While there are many ways to read the appropriate line from the data file, the following serves as a sample implementation, assuming that the data file is called data.dat and is located in the same directory as the script that is run for each element of the job array:

PARAMETERS=$(sed "${PBS_ARRAYID}q;d" data.dat)

Assuming that the executable program or script for the jobs in this array is called command.sh, the PBS script would launch the program with a line like the following:

command.sh ${PARAMETERS}
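
The line-selection step can be tried outside Torque before submitting the array. The sketch below builds a throwaway data.dat with made-up values in a temporary directory and sets PBS_ARRAYID by hand (Torque would normally set it), then applies the same sed command as above.

```shell
#!/bin/sh
# Throwaway data file with illustrative (made-up) parameter lines.
workdir=$(mktemp -d)
printf '%s\n' "0.1 10 alpha" "0.2 20 beta" "0.3 30 gamma" > "$workdir/data.dat"

# Torque sets PBS_ARRAYID for each array element; 2 is a stand-in here.
PBS_ARRAYID=2

# Quit (and print) at line PBS_ARRAYID; delete every earlier line.
PARAMETERS=$(sed "${PBS_ARRAYID}q;d" "$workdir/data.dat")
echo "Element ${PBS_ARRAYID} would run: command.sh ${PARAMETERS}"

rm -r "$workdir"
```

Because ${PARAMETERS} is left unquoted in the call to command.sh, the shell splits it at spaces, so each value on the line becomes a separate argument.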

An alternative approach is possible if the unique input parameters needed by each job in the array can be calculated arithmetically. For example, if each instance of the command.sh script needed to loop over a range of values, the PBS script could calculate the max and min values needed for each job directly, based on the value in the PBS_ARRAYID environment variable. If each job’s range needed to include 1000 values, this could be done by including commands like the following in the PBS script:

MAX=$(echo "${PBS_ARRAYID}*1000" | bc)
MIN=$(echo "(${PBS_ARRAYID}-1)*1000" | bc)

The data file referred to above (data.dat) would not be needed in this approach, and the PBS script call to command.sh would be something like the following:

command.sh ${MIN} ${MAX}
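
The same bounds can also be computed with the shell's built-in arithmetic, which avoids starting bc. This is only a sketch: PBS_ARRAYID is set by hand to 3 so that the lines can be tried outside Torque, and CHUNK is a name introduced here for the range size. Note that MIN and MAX as defined cover exactly 1000 values only if command.sh treats MAX as exclusive.

```shell
#!/bin/sh
# Torque sets PBS_ARRAYID for each array element; 3 is a stand-in here.
PBS_ARRAYID=3
CHUNK=1000   # number of values per array element (name assumed for this sketch)

# Element 1 gets 0..1000, element 2 gets 1000..2000, and so on.
MIN=$(( (PBS_ARRAYID - 1) * CHUNK ))
MAX=$(( PBS_ARRAYID * CHUNK ))
echo "Element ${PBS_ARRAYID} would run: command.sh ${MIN} ${MAX}"
```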

Appendix — Details on Basic Commands

qsub

The qsub command submits a sequence of commands to the batch server along with the parameters specifying job resource requirements. The parameters may be provided on the command line, from within the job script, or a combination of both. To facilitate optimal scheduling, you should specify as many resources as possible.

The syntax for the qsub command is:

qsub [ option(s) ] [ script-file ]

The typical options are:

-q queue

The queue to which the job is sent. Different queues allow different default and maximum resources to be allocated to the job and have different priorities. The list of all the queues on kraken is shown in the table below.

Note

If this parameter is missing, the job will be submitted to the default queue quick1.

-N name

The name of the job. If this option is not specified, the job is named after your script-file.

-l resources_list

Comma-separated list of requested resources. The list of the most popular resources is shown in the table below.

-o stdout_file_name

File name of the standard output (STDOUT) of your job. The file specified in this option will be used instead of the default one.

-e stderr_file_name

File name of the standard error and log messages (STDERR) of your job. The file specified in this option will be used instead of the default one.

-t array_specification

The specification of a job array, i.e. a set of similar jobs run with the same script. See the qsub manual for details.

-F "additional_arguments"

Specifies the arguments that will be passed to the job script when the script is launched. The accepted syntax is:

qsub -F "myarg1 myarg2 myarg3=myarg3value" myscript2.sh

Warning

Quotation marks are required. qsub will fail with an error message if the argument following -F is not a quoted value. The server will pass the quoted value as arguments to the job script when it launches the script.
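
Inside the job script, the passed values can then be read like ordinary command-line arguments. The sketch below assumes that they arrive as positional parameters ($1, $2, ...); the set -- line is only a stand-in that mimics this outside Torque, and the KEY and VALUE names are introduced here for illustration.

```shell
#!/bin/sh
# Stand-in for what qsub -F "myarg1 myarg2 myarg3=myarg3value" would pass;
# under Torque, remove this line and use the real arguments.
set -- myarg1 myarg2 myarg3=myarg3value

ARG_COUNT=$#
for arg in "$@"; do
    case $arg in
        *=*)  # key=value argument: split at the first "="
            KEY=${arg%%=*}
            VALUE=${arg#*=}
            echo "option ${KEY} has value ${VALUE}"
            ;;
        *)
            echo "plain argument ${arg}"
            ;;
    esac
done
```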

-m mail_notification_options

Mail notification options. The behavior of email notifications is set by combinations of the letters “a”, “b”, “e”, and “n” provided to the -m qsub flag. The options for email notifications are:

a (abort) email is sent when job errors are encountered,
b (begin) email is sent when jobs begin,
e (end) email is sent when jobs end,
n (never) no email is sent.

For large numbers of jobs, the recommended setting is -m a which will provide notification only if there is an error.

-M email_address

Email address for mail notifications.

The qsub options can be also specified in the beginning of your script, by putting them in the comment lines starting with #PBS (see the sample script below).

Note

Define requested resources carefully by assigning your jobs to the proper queue and using the -l switch to reduce the required resources (on Kraken you are not allowed to increase the default queue resources). Keep in mind that your job will be killed if it does not fit within the specified constraints; on the other hand, smaller jobs have higher priority and start sooner.

Examples of qsub Invocation

qsub -q longer4 your_script.sh
(sends your script to the longer4 queue),
qsub -N my_job your_script.sh
(sends your script to the default queue and names it my_job),
qsub -l nodes=4:mem=2gb,walltime=1:00:00 your_script.sh
(requests four nodes, 2 GB of memory per node, and a maximum runtime of 1 hour),
qsub -l nodes=node1+node2 your_script.sh
(requests 2 specific nodes),
qsub -l nodes=node0+2 your_script.sh
(requests node0 and any other 2 nodes),
qsub -l nodes=1:ppn=3 -N my_job your_script.sh
(requests 3 processors on one node and names the job my_job),
qsub -l nodes=node1:ppn=4 your_script.sh
(requests 4 processors on the specific node),
qsub -l other=bigdisk your_script.sh
(requests a large temporary disk (1.7 TB) on the computing node).

Sample Submission Script

#!/bin/bash
#PBS -N my_job_name
#PBS -q longer4
#PBS -l other=bigdisk
#PBS -t 1-10
#PBS -m e
#PBS -M kraken.user@p.lodz.pl
plask input.xpl ${PBS_ARRAYID}

qstat

The qstat command monitors the status of all jobs currently submitted to Torque on the kraken cluster.

Examples:

qstat
(show all jobs),
qstat -a
(show all jobs, alternate format),
qstat -n1
(show all jobs, alternate format with specification of the nodes where the job runs),
qstat -t
(show all jobs without grouping arrays),
qstat -f 1234
(show detailed information about job 1234),
qstat -f 1234[3]
(show detailed information about job number 3 of the array with ID 1234).

qdel

A queued job may be removed from a queue or a running job may be killed using the qdel command.

Examples

qdel 1234
(delete job with ID 1234; the job ID can be obtained with the qstat command),
qdel 1234[]
(delete all jobs from the array with ID 1234),
qdel 1234[3]
(delete job number 3 from the array with ID 1234).
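
When jobs are submitted from a script, the job number needed by qdel can be derived from the identifier that qsub reports. The sketch below assumes that the identifier has the form shown in the qstat sample (number, dot, server name) and uses a fixed sample value so that it runs without Torque.

```shell
#!/bin/sh
# On the cluster this would be: JOB_ID=$(qsub your_script.sh)
# Here a sample identifier in the format shown by qstat is used instead.
JOB_ID="3177.kraken"

# Strip everything from the first dot to get the bare job number.
JOB_NUMBER=${JOB_ID%%.*}
echo "To cancel this job: qdel ${JOB_NUMBER}"
```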

qpeek

You can see the current output of a running job with the qpeek command.

Examples

qpeek 1234
(show the standard output and standard error of job 1234),
qpeek -o 1234.kraken
(show only the standard output of job 1234),
qpeek -e 1234
(show only standard error of job 1234),
qpeek -N myjob
(show the standard output and standard error of job named myjob).

qhold and qrls

The Kraken cluster supports job checkpointing, i.e. saving the state of the job to disk. This allows a job to be suspended (held) at the user's request and resumed (released) later. This feature can be useful if you have a big long-running job and want to suspend it in order to make room for more urgent smaller jobs. Jobs can be held with the qhold command and released with qrls.

Examples

qhold 1234
(hold job 1234),
qrls 1234
(release job 1234).

Tables

Queues defined on Kraken cluster

The following table lists the queues available on the Kraken cluster. Use the one that suits your needs best. Queues multi16 and multi32 are reserved for MPI jobs.

Queue name    nodes / ppn   walltime   mem
-----------   -----------   --------   ---------
tiny1         1 / 1         2.5 h      1 GB
tiny4         1 / 4         2.5 h      4 GB
quick1        1 / 1         2.5 h      3 GB
quick4        1 / 4         2.5 h      12 GB
quick8        1 / 8         2.5 h      24 GB
longer1       1 / 1         13 h       3 GB
longer4       1 / 4         13 h       12 GB
longer8       1 / 8         13 h       24 GB
thelongest1   1 / 1         7 days     3 GB
thelongest4   1 / 4         7 days     12 GB
thelongest8   1 / 8         7 days     24 GB
multi16       16 / 1        7 days     16 × 3 GB
multi32       32 / 1        7 days     32 × 3 GB

nodes: number of nodes used, ppn: number of processors per node, walltime: total execution time, mem: total allowed memory usage
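
As a rough aid for reading the table, the choice of queue from the expected run time and processor count can be sketched as a small shell function. pick_queue is a hypothetical helper written for this page; it covers only the quick, longer and thelongest queues (not tiny or multi), takes the run time in minutes, and does not look at memory.

```shell
#!/bin/sh
# pick_queue MINUTES CORES: hypothetical helper based on the queue table.
# Limits used: 2.5 h = 150 min, 13 h = 780 min, 7 days = 10080 min.
pick_queue() {
    minutes=$1
    cores=$2
    case $cores in
        1|4|8) ;;
        *) echo "unsupported processor count: $cores" >&2; return 1 ;;
    esac
    if [ "$minutes" -le 150 ]; then
        echo "quick$cores"
    elif [ "$minutes" -le 780 ]; then
        echo "longer$cores"
    elif [ "$minutes" -le 10080 ]; then
        echo "thelongest$cores"
    else
        echo "run time exceeds the 7-day limit" >&2
        return 1
    fi
}

# A 10-hour job on 4 processors.
QUEUE=$(pick_queue 600 4)
echo "Suggested queue: ${QUEUE}"
```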

The most common resources that can be specified with -l switch

-l option Action
walltime=hh:mm:ss The length of time your job will need to run. Your job will be terminated when the allocated walltime expires, whether it has finished or not, so choose this value carefully. An appropriate walltime prevents programs that run out of control or never exit from consuming all system resources.
mem=N[measure] The maximum amount of memory the job is expected to use. The measure may be GB, MB, KB, or B.
nodes=n Specifies the number of nodes or the specific node required.
nodes=n:ppn=x Specifies the number of nodes or the specific node required and allocates x processors per node (ppn).
other=bigdisk Special property of the computing node. bigdisk means that your job will be run on a server with 1.7 TB of temporary space (mounted in /tmp).

Environment variables available to execution scripts

Variable name Description
PBS_O_HOST The name of the host upon which the qsub command is running.
PBS_SERVER The hostname of the pbs_server which qsub submits the job to.
PBS_O_QUEUE The name of the original queue to which the job was submitted.
PBS_O_WORKDIR The absolute path of the current working directory of the qsub command.
PBS_ARRAYID Each member of a job array is assigned a unique identifier (see -t option).
PBS_JOBID The job identifier assigned to the job by the batch system. It can be used in the stdout and stderr paths. The batch system replaces $PBS_JOBID with the job’s jobid (for example, #PBS -o /tmp/$PBS_JOBID.output).
PBS_JOBNAME The job name supplied by the user.
PBS_QUEUE The name of the queue from which the job is executed.
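
One common use of these variables is to give each job its own result file, so that several jobs running in the same directory do not overwrite each other's results. The lines below are a sketch of this idea; the file name pattern is made up for illustration, and the fallback values after :- only serve to make the lines work outside Torque as well.

```shell
#!/bin/sh
# Fallbacks after ":-" are stand-ins so the lines also work outside Torque.
JOB_NAME="${PBS_JOBNAME:-manual}"
JOB_ID="${PBS_JOBID:-0}"

# Illustrative naming scheme: result_<job name>_<job id>.txt
RESULT_FILE="result_${JOB_NAME}_${JOB_ID}.txt"
echo "Results will be written to ${RESULT_FILE}"
```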