Report only on jobs allocated to the specified node or list of nodes.
This may either be the NodeName or NodeHostname
as defined in slurm.conf(5) in the event that they differ.
A node_name of localhost is mapped to the current host name.
JOB REASON CODES
These codes identify the reason that a job is waiting for execution.
A job may be waiting for more than one reason, in which case only
one of those reasons is displayed.
- AssociationJobLimit
  The job's association has reached its maximum job count.
- AssociationResourceLimit
  The job's association has reached some resource limit.
- AssociationTimeLimit
  The job's association has reached its time limit.
- BadConstraints
  The job's constraints cannot be satisfied.
- BeginTime
  The job's earliest start time has not yet been reached.
- Cleaning
  The job is being requeued and is still cleaning up from its previous execution.
- Dependency
  This job is waiting for a dependent job to complete.
- FrontEndDown
  No front end node is available to execute this job.
- InactiveLimit
  The job reached the system InactiveLimit.
- InvalidAccount
  The job's account is invalid.
- InvalidQOS
  The job's QOS is invalid.
- JobHeldAdmin
  The job is held by a system administrator.
- JobHeldUser
  The job is held by the user.
- JobLaunchFailure
  The job could not be launched. This may be due to a file system problem, an invalid program name, etc.
- Licenses
  The job is waiting for a license.
- NodeDown
  A node required by the job is down.
- NonZeroExitCode
  The job terminated with a non-zero exit code.
- PartitionDown
  The partition required by this job is in a DOWN state.
- PartitionInactive
  The partition required by this job is in an Inactive state and not able to start jobs.
- PartitionNodeLimit
  The number of nodes required by this job is outside of its partition's current limits. Can also indicate that required nodes are DOWN or DRAINED.
- PartitionTimeLimit
  The job's time limit exceeds its partition's current time limit.
- Priority
  One or more higher priority jobs exist for this partition or advanced reservation.
- Prolog
  The job's PrologSlurmctld program is still running.
- QOSJobLimit
  The job's QOS has reached its maximum job count.
- QOSResourceLimit
  The job's QOS has reached some resource limit.
- QOSTimeLimit
  The job's QOS has reached its time limit.
- QOSUsageThreshold
  The required QOS threshold has been breached.
- ReqNodeNotAvail
  Some node specifically required by the job is not currently available. The node may currently be in use, reserved for another job, in an advanced reservation, DOWN, DRAINED, or not responding. Nodes which are DOWN, DRAINED, or not responding will be identified in the job's "reason" field as "UnavailableNodes". Such nodes will typically require the intervention of a system administrator to make available.
- Reservation
  The job is waiting for its advanced reservation to become available.
- Resources
  The job is waiting for resources to become available.
- SystemFailure
  Failure of the Slurm system, a file system, the network, etc.
- TimeLimit
  The job exhausted its time limit.
- WaitingForScheduling
  No reason has been set for this job yet. Waiting for the scheduler to determine the appropriate reason.
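As a quick illustration, the reason codes above can be tallied across the pending queue with a short pipeline. This is a sketch assuming a typical squeue installation; "%r" is the format specifier for the reason field:

```shell
# Count pending jobs per reason code. "%r" prints the reason field;
# --noheader suppresses the header line so only data is piped on.
squeue --states=PENDING --noheader --format="%r" |
    sort | uniq -c | sort -rn
```

On a busy cluster this gives a quick picture of whether pending jobs are mostly waiting on Priority, Resources, or an administrative hold.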
JOB STATE CODES
Jobs typically pass through several states in the course of their
execution.
The typical states are PENDING, RUNNING, SUSPENDED, COMPLETING, and COMPLETED.
An explanation of each state follows.
- BF BOOT_FAIL
  Job terminated due to a launch failure, typically a hardware failure (e.g. unable to boot the node or block), and the job cannot be requeued.
- CA CANCELLED
  Job was explicitly cancelled by the user or a system administrator. The job may or may not have been initiated.
- CD COMPLETED
  Job has terminated all processes on all nodes with an exit code of zero.
- CF CONFIGURING
  Job has been allocated resources, but is waiting for them to become ready for use (e.g. booting).
- CG COMPLETING
  Job is in the process of completing. Some processes on some nodes may still be active.
- DL DEADLINE
  Job terminated on deadline.
- F FAILED
  Job terminated with a non-zero exit code or other failure condition.
- NF NODE_FAIL
  Job terminated due to failure of one or more allocated nodes.
- OOM OUT_OF_MEMORY
  Job experienced an out-of-memory error.
- PD PENDING
  Job is awaiting resource allocation.
- PR PREEMPTED
  Job terminated due to preemption.
- R RUNNING
  Job currently has an allocation.
- RD RESV_DEL_HOLD
  Job is being held after its requested reservation was deleted.
- RF REQUEUE_FED
  Job is being requeued by a federation.
- RH REQUEUE_HOLD
  Held job is being requeued.
- RQ REQUEUED
  Completing job is being requeued.
- RS RESIZING
  Job is about to change size.
- RV REVOKED
  Sibling was removed from the cluster due to another cluster starting the job.
- SI SIGNALING
  Job is being signaled.
- SE SPECIAL_EXIT
  The job was requeued in a special state. This state can be set by users, typically in EpilogSlurmctld, if the job has terminated with a particular exit value.
- SO STAGE_OUT
  Job is staging out files.
- ST STOPPED
  Job has an allocation, but execution has been stopped with the SIGSTOP signal. CPUs have been retained by this job.
- S SUSPENDED
  Job has an allocation, but execution has been suspended and CPUs have been released for other jobs.
- TO TIMEOUT
  Job terminated upon reaching its time limit.
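Scripts that read the compact "%t" output can map the two-letter codes back to the long state names. A minimal sh sketch covering a few common states from the table above:

```shell
# Map a two-letter state code to its long name (partial table;
# codes as listed in JOB STATE CODES above).
state_name() {
    case "$1" in
        PD) echo PENDING ;;
        R)  echo RUNNING ;;
        S)  echo SUSPENDED ;;
        CG) echo COMPLETING ;;
        CD) echo COMPLETED ;;
        *)  echo UNKNOWN ;;
    esac
}

state_name R    # prints RUNNING
```

In practice "%T" already prints the long name, so the mapping is only needed when post-processing saved "%t" output.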
PERFORMANCE
Executing squeue sends a remote procedure call to slurmctld. If
enough calls from squeue or other Slurm client commands that send remote
procedure calls to the slurmctld daemon come in at once, it can result in
a degradation of performance of the slurmctld daemon, possibly resulting
in a denial of service.
Do not run squeue or other Slurm client commands that send remote
procedure calls to slurmctld from loops in shell scripts or other
programs. Ensure that programs limit calls to squeue to the minimum
necessary for the information you are trying to gather.
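In line with this advice, a script that needs several pieces of queue information can make a single call and reuse the output. A sketch (the field layout and user name are illustrative; "|| true" merely keeps the sketch self-contained if squeue is unavailable):

```shell
# One RPC to slurmctld; every later lookup reads the cached snapshot.
snapshot=$(squeue --noheader --format="%i %T %u" 2>/dev/null || true)

# Count running jobs from the snapshot, costing no further RPCs:
printf '%s\n' "$snapshot" | awk '$2 == "RUNNING" { n++ } END { print n + 0 }'

# List job IDs belonging to one user ("alice" is a placeholder):
printf '%s\n' "$snapshot" | awk -v u="alice" '$3 == u { print $1 }'
```

Each additional question answered from the snapshot is one fewer remote procedure call against the slurmctld daemon.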
ENVIRONMENT VARIABLES
Some squeue options may be set via environment variables. These
environment variables, along with their corresponding options, are listed
below. (Note: Commandline options will always override these settings.)
- SLURM_BITSTR_LEN
  Specifies the string length to be used for holding a job array's task ID expression. The default value is 64 bytes. A value of 0 will print the full expression at whatever length is required. Larger values may adversely impact application performance.
- SLURM_CLUSTERS
  Same as --clusters
- SLURM_CONF
  The location of the Slurm configuration file.
- SLURM_TIME_FORMAT
  Specifies the format used to report time stamps. A value of standard, the default, generates output in the form "year-month-dateThour:minute:second". A value of relative returns only "hour:minute:second" if the time stamp is on the current day. For other dates in the current year it prints "hour:minute" preceded by "Tomorr" (tomorrow), "Ystday" (yesterday), or the name of the day for the coming week (e.g. "Mon", "Tue", etc.); otherwise it prints the date (e.g. "25 Apr"). For other years it returns the date, month, and year without a time (e.g. "6 Jun 2012"). All of the time stamps use a 24 hour format.
  A valid strftime() format can also be specified. For example, a value of "%a %T" will report the day of the week and a time stamp (e.g. "Mon 12:34:56").
- SQUEUE_ACCOUNT
  -A <account_list>, --account=<account_list>
- SQUEUE_ALL
  -a, --all
- SQUEUE_ARRAY
  -r, --array
- SQUEUE_FEDERATION
  --federation
- SQUEUE_FORMAT
  -o <output_format>, --format=<output_format>
- SQUEUE_FORMAT2
  -O <output_format>, --Format=<output_format>
- SQUEUE_LICENSES
  -L <license_list>, --licenses=<license_list>
- SQUEUE_LOCAL
  --local
- SQUEUE_NAMES
  --name=<name_list>
- SQUEUE_PARTITION
  -p <part_list>, --partition=<part_list>
- SQUEUE_PRIORITY
  -P, --priority
- SQUEUE_QOS
  -q <qos_list>, --qos=<qos_list>
- SQUEUE_SIBLING
  --sibling
- SQUEUE_SORT
  -S <sort_list>, --sort=<sort_list>
- SQUEUE_STATES
  -t <state_list>, --states=<state_list>
- SQUEUE_USERS
  -u <user_list>, --users=<user_list>
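For example, a user can keep preferred defaults in a shell startup file. The format string, partition, and state list below are illustrative values, not recommendations; command-line options always override them:

```shell
# Example persistent squeue defaults, e.g. set in ~/.profile.
export SQUEUE_FORMAT="%.10i %.9P %.20j %.8T %.10M %R"   # same effect as -o
export SQUEUE_PARTITION="debug"                         # same effect as -p debug
export SQUEUE_STATES="PD,R"                             # same effect as -t PD,R
```

A later `squeue -p batch` would still win over SQUEUE_PARTITION, since command-line options take precedence over these variables.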
EXAMPLES
Print the jobs in the debug partition that are in the COMPLETED state, formatting the job ID as six right-justified digits followed by the priority in an arbitrary field size:
# squeue -p debug -t COMPLETED -o "%.6i %p"
JOBID PRIORITY
65543 99993
65544 99992
65545 99991
Print the job steps in the debug partition sorted by user:
# squeue -s -p debug -S u
STEPID NAME PARTITION USER TIME NODELIST
65552.1 test1 debug alice 0:23 dev[1-4]
65562.2 big_run debug bob 0:18 dev22
65550.1 param1 debug candice 1:43:21 dev[6-12]
Print information only about jobs 12345, 12346, and 12348:
# squeue --jobs 12345,12346,12348
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
12345 debug job1 dave R 0:21 4 dev[9-12]
12346 debug job2 dave PD 0:00 8 (Resources)
12348 debug job3 ed PD 0:00 4 (Priority)
Print information only about job step 65552.1:
# squeue --steps 65552.1
STEPID NAME PARTITION USER TIME NODELIST
65552.1 test2 debug alice 12:49 dev[1-4]
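A further example in the same vein: header-less output is convenient when piping squeue into other tools (a sketch; -h is the short form of --noheader):

```shell
# Print "job <id> is <state>" for each job in the debug partition.
squeue -h -p debug -o "%i %T" | while read -r id state; do
    echo "job $id is $state"
done
```

Note this reads one squeue invocation line by line; it does not call squeue inside the loop, in keeping with the PERFORMANCE guidance above.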
COPYING
Copyright (C) 2002-2007 The Regents of the University of California.
Produced at Lawrence Livermore National Laboratory (cf. DISCLAIMER).
Copyright (C) 2008-2010 Lawrence Livermore National Security.
Copyright (C) 2010-2016 SchedMD LLC.
This file is part of Slurm, a resource management program.
For details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details.
SEE ALSO
scancel(1), scontrol(1), sinfo(1), srun(1),
slurm_load_ctl_conf(3), slurm_load_jobs(3),
slurm_load_node(3), slurm_load_partitions(3)