I have two Slurm partitions (lhpc and lgpu) with a shared node (n16-90). I have configured one partition with a higher priority. When a job requests the shared node through the lgpu partition while a job from the lhpc partition is already running there, I want the lhpc job to be suspended and the lgpu job to be allocated the shared node.
However, that does not happen: the lgpu job waits until the lower-priority job ends.
Here is the relevant configuration:
SchedulerTimeSlice=10
SchedulerType=sched/backfill
SelectType=select/cons_tres
PriorityType=priority/multifactor
PriorityWeightPartition=100
PreemptType=preempt/partition_prio
PreemptMode=suspend,gang
PartitionName=lhpc Nodes=n16-[80-83,90-93] Default=YES MaxTime=24:00:00 State=UP PriorityTier=100 PriorityJobFactor=100 OverSubscribe=FORCE
PartitionName=lgpu Nodes=n16-90 Default=YES MaxTime=24:00:00 State=UP PriorityTier=200 PriorityJobFactor=200 OverSubscribe=FORCE
I have tried a lot of configurations. I have also checked that the job in lgpu has higher priority than the lhpc job.
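For reference, a minimal sketch of how such a check could be reproduced. The helper name partition_tiers is hypothetical (it is not part of Slurm); it only reads PartitionName lines in the format shown above and prints each partition with its PriorityTier. The scontrol and squeue calls are standard Slurm commands and run only when Slurm is installed; the output columns chosen for squeue are an assumption about what is useful to compare.

```shell
# Hypothetical helper (not part of Slurm): print "<partition> <PriorityTier>"
# for every PartitionName line read from stdin or from the given files.
partition_tiers() {
  awk '/^PartitionName=/ {
    name = ""; tier = "unset"
    for (i = 1; i <= NF; i++) {
      split($i, kv, "=")
      if (kv[1] == "PartitionName") name = kv[2]
      if (kv[1] == "PriorityTier")  tier = kv[2]
    }
    print name, tier
  }' "$@"
}

# Compare the file with what the running controller reports.
# Skipped entirely on machines without the Slurm client tools.
if command -v scontrol >/dev/null 2>&1; then
  # Effective preemption and scheduling settings
  scontrol show config | grep -Ei '^(PreemptMode|PreemptType|SchedulerTimeSlice|SelectType)'
  # Effective per-partition settings (PriorityTier, OverSubscribe, PreemptMode)
  scontrol show partition lgpu
  scontrol show partition lhpc
  # Job id, partition, state, integer priority, and reason/node list
  squeue -p lhpc,lgpu -o '%.10i %.9P %.8T %.10Q %R'
fi
```

With the two partition lines above as input, partition_tiers prints "lhpc 100" and "lgpu 200". It can also be pointed at the configuration file directly, for example partition_tiers /etc/slurm/slurm.conf, where the path is an assumption and depends on the installation.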
PS: This is my first post, so forgive me if there is anything wrong with it.
Jaime Palacios is a new contributor to this site.