...
- It has the ability to run parallel calculations by allocating a set of nodes specifically for that task.
- Users running serial jobs with minimal input/output can start the jobs directly from the command line without resorting to batch files.
- Slurm can also handle batch jobs.
- Slurm allows users to run interactive command-line or X11 jobs.
Node Status
sinfo
sinfo reports the state of partitions and nodes managed by Slurm. Example output:

    $ sinfo
    PARTITION         AVAIL  TIMELIMIT  NODES  STATE  NODELIST
    all*              up     infinite       9  down*  rb2u[3,5-12]
    all*              up     infinite       5  idle   dn[1-2],rb2u[1-2,4]
    xeon-6136-256G    up     infinite       2  idle   dn[1-2]
    xeon-e5-2620-32G  up     infinite       9  down*  rb2u[3,5-12]
    xeon-e5-2620-32G  up     infinite       3  idle   rb2u[1-2,4]

In the above, the partition "all" contains all the nodes; the asterisk after its name marks it as the default partition. There are also partitions for each hardware specification. Every node is listed both in "all" and in the partition for its specification class.
The above example shows that nodes dn1, dn2, rb2u1, rb2u2 and rb2u4 are idle: they are up and no jobs are running on them. Nodes rb2u3 and rb2u5 through rb2u12 are all down; the asterisk after "down" indicates that they are not responding. If a node is allocated to a job, its status will be "alloc". If a node is set to finish its current jobs and accept no more in preparation for downtime, its status will be set to "drain".
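The bracketed entries in the NODELIST column are a compact notation: "rb2u[3,5-12]" stands for rb2u3 plus rb2u5 through rb2u12. The following Python sketch shows one way to expand that notation and to look up the state of each node in the "all" partition. It is an illustration only and not part of Slurm: the function names `expand_nodelist` and `node_states` are hypothetical, the input is a copy of two rows of the sample output above rather than live data, and only simple `prefix[a-b,c]` lists are handled.

```python
import re

# Two rows copied from the sample sinfo output above (not live data).
SINFO_OUTPUT = """\
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
all* up infinite 9 down* rb2u[3,5-12]
all* up infinite 5 idle dn[1-2],rb2u[1-2,4]
"""


def expand_nodelist(nodelist: str) -> list[str]:
    """Expand a compact node list such as 'dn[1-2],rb2u[1-2,4]'."""
    names = []
    # Each piece is a prefix optionally followed by one [...] group, so
    # commas inside the brackets do not split a piece.
    for part in re.findall(r"[^,\[]+(?:\[[^\]]*\])?", nodelist):
        match = re.fullmatch(r"([^\[]+)\[([^\]]+)\]", part)
        if match is None:
            names.append(part)  # a plain name such as 'adev14'
            continue
        prefix, ranges = match.groups()
        for item in ranges.split(","):
            low, _, high = item.partition("-")
            # len(low) keeps zero padding, e.g. '01-03' gives 01, 02, 03.
            for number in range(int(low), int(high or low) + 1):
                names.append(f"{prefix}{number:0{len(low)}d}")
    return names


def node_states(sinfo_text: str, partition: str = "all") -> dict[str, str]:
    """Map each node of one partition to its state, without the '*' flags."""
    states = {}
    for line in sinfo_text.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        name, _avail, _limit, _count, state, nodelist = line.split()
        if name.rstrip("*") != partition:
            continue
        for node in expand_nodelist(nodelist):
            states[node] = state.rstrip("*")
    return states


states = node_states(SINFO_OUTPUT)
print(expand_nodelist("rb2u[3,5-12]"))
print(sorted(node for node, state in states.items() if state == "idle"))
```

On a cluster, `scontrol show hostnames 'rb2u[3,5-12]'` performs the same expansion without any custom code.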
sview
sview is a graphical user interface to get state information for nodes (and jobs).
Job Status
sview
sview is a graphical user interface to get job state information on nodes.
squeue
squeue reports the state of jobs or job steps. By default, it reports the running jobs in priority order and then the pending jobs in priority order.

    $ squeue
    JOBID  PARTITION  NAME  USER  ST   TIME  NODES  NODELIST(REASON)
    65646  batch      chem  mike  R   24:19      2  adev[7-8]
    65647  batch      bio   joan  R    0:09      1  adev14
    65648  batch      math  phil  PD   0:00      6  (Resources)

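Each calculation is given a JOBID. This can be used to cancel the job if necessary, for example with scancel. The PARTITION field refers to the node specification partitions described above in the "sinfo" documentation. The NAME field gives the job's name, typically the program or script being run for the calculation. The ST field gives the job state: R for running and PD for pending. The NODES field shows the number of nodes in use (or requested) for that job. The NODELIST(REASON) field shows which nodes a running job is using; for a pending job it shows, in parentheses, the reason the job is waiting.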
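To make the field descriptions concrete, the sketch below reads the sample squeue output above into one dictionary per job and separates running jobs from pending ones. It is a minimal illustration, not a Slurm API: `parse_squeue` is a hypothetical helper, the input is the sample text rather than live output, and the whitespace split assumes that job names contain no spaces.

```python
# Sample squeue output copied from the example above (not live data).
SQUEUE_OUTPUT = """\
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
65646 batch chem mike R 24:19 2 adev[7-8]
65647 batch bio joan R 0:09 1 adev14
65648 batch math phil PD 0:00 6 (Resources)
"""


def parse_squeue(text: str) -> list[dict[str, str]]:
    """Turn default squeue output into one dict per job, keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    jobs = []
    for line in lines[1:]:
        # Limit the split so the last column keeps any spaces it contains.
        fields = line.split(maxsplit=len(header) - 1)
        jobs.append(dict(zip(header, fields)))
    return jobs


jobs = parse_squeue(SQUEUE_OUTPUT)

# ST is the job state: R means running, PD means pending.
running = [job["JOBID"] for job in jobs if job["ST"] == "R"]
pending = {
    job["JOBID"]: job["NODELIST(REASON)"] for job in jobs if job["ST"] == "PD"
}

print("running:", running)
print("pending:", pending)
```

A JOBID found this way is the value to pass to scancel, for example `scancel 65646`.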