Resource Reservation Guide
SLURM version 2.0 can reserve resources for jobs executed by select users and/or select bank accounts. A resource reservation identifies a set of nodes and the time period during which those nodes are reserved. Note that resource reservations are not compatible with SLURM's gang scheduler plugin, since the termination time of running jobs cannot be accurately predicted.
Reservations can be created, updated, or destroyed only by user root or the configured SlurmUser using the scontrol command. The scontrol, smap and sview commands can be used to view reservations. The man pages for the various commands contain details.
Note for users of Maui or Moab schedulers:
Maui and Moab are not integrated with SLURM's resource reservation system;
they should use their own advanced reservation system instead.
Reservation Creation
A common use of a reservation is to reserve an entire computer at a particular time for system downtime. The example below creates a full-system reservation starting at 16:00 on 6 February and lasting 120 minutes. The "maint" flag identifies the reservation for accounting purposes as system maintenance. The "ignore_jobs" flag indicates that currently running jobs can be ignored when creating this reservation. By default, only nodes that are not expected to have a running job at the start time can be reserved (that is, the time limit of every running job on them will have been reached by then). In this case, running jobs can be cancelled manually as needed to perform system maintenance. As the reservation time approaches, only jobs that can complete by the reservation's start time will be initiated.
$ scontrol create reservation starttime=2009-02-06T16:00:00 \
   duration=120 user=root flags=maint,ignore_jobs nodes=ALL
Reservation created: root_4

$ scontrol show reservation
ReservationName=root_4 StartTime=2009-02-06T16:00:00
   EndTime=2009-02-06T18:00:00 Duration=120
   Nodes=ALL NodeCnt=20
   Features=(null) PartitionName=(null)
   Flags=MAINT,SPEC_NODES,IGNORE_JOBS
   Users=root Accounts=(null)
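As a quick check on the duration arithmetic in the example above, the following sketch adds a duration in minutes to a same-day start time. The helper name end_time is hypothetical and not part of SLURM. It assumes a two-digit HH:MM start time and wraps past midnight without tracking the date.

```shell
# Add a duration in minutes to an HH:MM start time (illustrative helper, not a SLURM tool).
end_time() {
    h=${1%%:*}                  # hours part of HH:MM
    m=${1##*:}                  # minutes part of HH:MM
    h=${h#0}; h=${h:-0}         # strip one leading zero so 08 is not read as octal
    m=${m#0}; m=${m:-0}
    total=$(( h * 60 + m + $2 ))
    printf '%02d:%02d\n' $(( (total / 60) % 24 )) $(( total % 60 ))
}

end_time 16:00 120   # prints 18:00, matching EndTime in the output above
```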
Another mode of operation would be to reserve specific nodes for an indefinite period in order to study problems on those nodes. This could also be accomplished using a SLURM partition specifically for this purpose, but that would fail to capture the maintenance nature of their use.
$ scontrol create reservation user=root starttime=now \
   duration=infinite flags=maint nodes=sun000
Reservation created: root_5

$ scontrol show res
ReservationName=root_5 StartTime=2009-02-04T16:22:57
   EndTime=2009-02-04T16:21:57 Duration=4294967295
   Nodes=sun000 NodeCnt=1
   Features=(null) PartitionName=(null)
   Flags=MAINT,SPEC_NODES
   Users=root Accounts=(null)
Our final example reserves ten nodes in the default SLURM partition, starting at noon with a duration of 60 minutes and recurring daily. The reservation will be available only to users alan and brenda.
$ scontrol create reservation user=alan,brenda \
   starttime=noon duration=60 flags=daily nodecnt=10
Reservation created: alan_6

$ scontrol show res
ReservationName=alan_6 StartTime=2009-02-05T12:00:00
   EndTime=2009-02-05T13:00:00 Duration=60
   Nodes=sun[000-003,007,010-013,017] NodeCnt=10
   Features=(null) PartitionName=pdebug
   Flags=DAILY
   Users=alan,brenda Accounts=(null)
Note that the specific nodes to be associated with the reservation are selected immediately upon creation of the reservation. This permits users to stage files to the nodes in preparation for use during the reservation. Note that the reservation creation request can also identify the partition from which to select the nodes or _one_ feature that every selected node must contain.
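A request that narrows node selection by partition and feature might be composed as in the sketch below. The helper resv_request is hypothetical and only prints the command so it can be reviewed before running. The keyword spellings partitionname= and features= mirror the field names in the scontrol show output and should be checked against the scontrol man page for your version. The feature name bigmem is made up for illustration.

```shell
# Compose (but do not run) a reservation request limited to one partition and one feature.
# Arguments: $1 users, $2 start time, $3 minutes, $4 node count, $5 partition, $6 feature
resv_request() {
    printf 'scontrol create reservation users=%s starttime=%s duration=%s' "$1" "$2" "$3"
    printf ' nodecnt=%s partitionname=%s features=%s\n' "$4" "$5" "$6"
}

# "bigmem" is an assumed feature name; substitute one defined for your nodes.
resv_request alan,brenda noon 60 10 pdebug bigmem
```

Printing the command rather than executing it keeps the sketch usable on a machine without SLURM installed; pipe the output to sh as root or SlurmUser once it looks right.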
Reservations must not overlap, with the exception of maintenance mode as described below. They must either include different nodes or operate at different times. If specific nodes are not specified when a reservation is created, SLURM will automatically select nodes to avoid overlap and ensure that the selected nodes are available when the reservation begins. For ease of system maintenance, you can create a reservation with the "maint" flag that overlaps existing reservations. This permits an administrator to easily create a maintenance reservation for an entire cluster without needing to remove or reschedule pre-existing reservations.
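The overlap rule can be illustrated with a small check: two reservations conflict only when their time windows intersect and they share at least one node. The function resv_conflict below is a hypothetical sketch, not SLURM's implementation. It assumes times given as integer minutes since midnight, half-open windows (a reservation ending at the moment another starts does not conflict), and expanded, space-separated node names. The "maint" exception is not modeled.

```shell
# Return 0 when two reservations conflict, 1 otherwise (illustrative only).
# Arguments: start1 end1 "nodes1" start2 end2 "nodes2"
resv_conflict() {
    start1=$1; end1=$2; nodes1=$3
    start2=$4; end2=$5; nodes2=$6
    # Windows intersect only if each one starts before the other ends.
    [ "$start1" -lt "$end2" ] && [ "$start2" -lt "$end1" ] || return 1
    # Windows intersect, so look for a node present in both lists.
    for n in $nodes1; do
        case " $nodes2 " in
            *" $n "*) return 0 ;;
        esac
    done
    return 1
}

# 12:00-13:00 on sun000,sun001 versus 12:30-13:30 on sun001,sun002
if resv_conflict 720 780 "sun000 sun001" 750 810 "sun001 sun002"; then
    echo "conflict"
else
    echo "no conflict"
fi
```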
Reservation Use
The response to a reservation create request includes the reservation's name. SLURM generates this name automatically from the first user or account name and a numeric suffix. To use the reservation, the job submit request must explicitly specify the reservation name. The job must be contained completely within the named reservation, and it will be cancelled after the reservation reaches its EndTime. To let a job continue executing after the reservation's EndTime, set the ResvOverRun configuration option, which controls how long the job may continue.
$ sbatch --reservation=alan_6 -N4 my.script
sbatch: Submitted batch job 65540
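To see how much overrun a site allows, one option is to read ResvOverRun from the configuration file. The sketch below uses a hypothetical helper, resv_overrun, that prints the value from a slurm.conf-style file. It assumes the value is in minutes, that an unset option means 0 (jobs are cancelled at EndTime), and that the key is spelled exactly as shown; confirm all three against the slurm.conf man page.

```shell
# Print the ResvOverRun value from a slurm.conf-style file, or 0 if it is not set.
# The match is case-sensitive and ignores commented-out lines (illustrative helper).
resv_overrun() {
    value=$(sed -n 's/^[[:space:]]*ResvOverRun[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | tail -n 1)
    printf '%s\n' "${value:-0}"
}

# Demonstrate on a throwaway sample file rather than a real configuration.
sample=$(mktemp)
printf 'ResvOverRun=15\n' > "$sample"
resv_overrun "$sample"    # prints 15
rm -f "$sample"
```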
Reservation Modification
Reservations can be modified by user root as desired. For example, a reservation's duration could be altered or the users granted access changed, as shown below:
$ scontrol update ReservationName=root_4 \
   duration=150 users=admin
Reservation updated.

$ scontrol show ReservationName=root_4
ReservationName=root_4 StartTime=2009-02-06T16:00:00
   EndTime=2009-02-06T18:30:00 Duration=150
   Nodes=ALL NodeCnt=20
   Features=(null) PartitionName=(null)
   Flags=MAINT,SPEC_NODES
   Users=admin Accounts=(null)
Reservation Deletion
Reservations are automatically purged after their end time. They may also be manually deleted as shown below. Note that a reservation cannot be deleted while there are jobs running in it.
$ scontrol delete ReservationName=alan_6
Reservation Accounting
Jobs executed within a reservation are accounted for using the appropriate user and bank account. If resources within a reservation are not used, those resources will be accounted for as being used by all users or bank accounts associated with the reservation on an equal basis (e.g. if two users are eligible to use a reservation and neither does, each user will be reported to have used half of the reserved resources).
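The equal split of unused resources can be worked through as simple arithmetic. In the sketch below, idle_share is a hypothetical helper, and measuring usage in node-minutes is an assumption made for illustration; SLURM's accounting may use different units.

```shell
# Split the unused part of a reservation equally among eligible users or accounts.
# Arguments: $1 nodes, $2 minutes, $3 node-minutes used by jobs, $4 eligible users/accounts
# Integer division drops any remainder.
idle_share() {
    reserved=$(( $1 * $2 ))       # total node-minutes reserved
    unused=$(( reserved - $3 ))   # node-minutes no job consumed
    echo $(( unused / $4 ))       # share charged to each eligible user or account
}

# Ten nodes for 60 minutes, no jobs run, two eligible users.
idle_share 10 60 0 2   # prints 300
```

With two eligible users and no usage, each is charged 300 of the 600 reserved node-minutes, which is the "half each" case described above.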
Future Work
Several enhancements are anticipated at some point in the future.
- The automatic selection of nodes for a reservation create request may be sub-optimal in terms of locality (for optimized application communication).
- The feature specification in the reservation creation request should be made more flexible to accept multiple features with AND or OR operators.
Last modified 15 June 2009