Why won't my job start?
The first thing to do is run squeue:
[naveed@login1 benchmarking]$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
340 compute iozone naveed PD 0:00 1 (Dependency)
571 gpu iozone naveed PD 0:00 1 (Resources)
572 gpu iozone naveed PD 0:00 1 (Priority)
338 compute iozone naveed R 12:13 1 hpc-25-03
In this case, job 340 is waiting on another job to satisfy its dependencies. Job 571 is waiting for enough resources to become available on the cluster. Job 572 has not started because its priority is not high enough and other jobs are ahead of it in the queue.
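As a rough sketch, the reason codes above can be summarized with a small shell helper. The function name explain_pending is hypothetical (it is not part of Slurm), and it assumes the default squeue column layout shown above, with no spaces in job names:

```shell
# explain_pending: hypothetical helper, not a Slurm command.
# Reads squeue-style output on stdin (default column layout assumed:
# JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON))
# and prints one line per pending (PD) job.
explain_pending() {
  awk 'NR > 1 && $5 == "PD" {
    reason = $NF                 # last column, e.g. "(Priority)"
    gsub(/[()]/, "", reason)     # strip the parentheses
    if (reason == "Dependency")     msg = "waiting on another job to finish"
    else if (reason == "Resources") msg = "waiting for resources to become available"
    else if (reason == "Priority")  msg = "waiting behind higher-priority jobs"
    else                            msg = "reason " reason ", see scontrol show job"
    print $1 ": " msg
  }'
}
```

On a login node this could be used as: squeue -u <your-username> | explain_pending. Only the three reasons discussed above are handled; any other reason is passed through unchanged.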
To get more information about a specific job, use "scontrol show job <jobid>".
To view the estimated start time for your pending jobs:
squeue -t PD -u <your-username> --start
Is there an upcoming maintenance period?
If you submit a job during or shortly before a maintenance period, it may not run until the period is over. If a job's requested time limit would carry it past the start of maintenance, we will not schedule it to run until after the maintenance is completed.
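The check described above is simple arithmetic, sketched here for illustration. The function fits_before_maintenance and its arguments are hypothetical; the real decision is made by the scheduler, not by this script:

```shell
# fits_before_maintenance NOW START LIMIT_MINUTES
# Hypothetical helper illustrating the rule above; not part of Slurm.
#   NOW            current time, in seconds since the epoch
#   START          start of the maintenance period, in seconds since the epoch
#   LIMIT_MINUTES  the job's requested time limit, in minutes
# Succeeds (exit status 0) if the job would finish by the start of maintenance.
fits_before_maintenance() {
  now=$1
  start=$2
  limit_min=$3
  end=$(( now + limit_min * 60 ))   # latest possible finish time
  [ "$end" -le "$start" ]
}
```

If maintenance periods on this cluster are set up as Slurm reservations (an assumption), "scontrol show reservation" lists their start times. Requesting a shorter time limit with "sbatch --time" may allow a job to fit in before maintenance begins.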