PERUN Supercomputer – Partitions Overview
About PERUN Partitions
PERUN's Slurm partitions fall into two families, CPU and GPU, each split into a short and a long variant.
Every job submitted to Slurm must name one of these partitions unless the cluster's default partition applies.
1. What Are Slurm Partitions?
A partition in Slurm represents a group of compute nodes with similar characteristics or usage rules.
Partitions define:
- hardware constraints (CPU cores, GPUs, memory)
- job limitations (max runtime, max cores, number of nodes)
- resource availability and priority
- access to specialized hardware (e.g., GPU nodes)
Tip — Always choose the correct partition
CPU workloads should run in the cpu_short or cpu_long partition.
GPU or AI workloads must run in the gpu_short or gpu_long partition.
2. Available PERUN Partitions
PERUN defines four partitions split by workload type and time limit:
| Partition | Nodes | Time Limit | Max Job Size | GPUs | Purpose |
|---|---|---|---|---|---|
| cpu_short | cn01–cn22 | 2 days | Up to system limits | 0 | Short CPU HPC workloads |
| cpu_long | cn23–cn32 | 4 days | Up to system limits | 0 | Long-running CPU workloads |
| gpu_short | gpu01–gpu18 (H200) | 2 days | Up to 8 GPUs per node | 8 per node | Short AI/ML/GPU workloads |
| gpu_long | gpu19–gpu26 (H200) | 4 days | Up to 8 GPUs per node | 8 per node | Long AI/ML/GPU workloads |
3. Viewing Partition Information
You can inspect partitions with:
Basic Slurm overview
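A minimal sketch of the basic overview, assuming you are on a node where the Slurm client tools are installed (for example a login node). The partition names come from the table in section 2; the guard around sinfo is not part of Slurm and is only there so the snippet exits cleanly on machines without it.

```shell
# Partition names taken from the table in section 2.
PARTITIONS="cpu_short,cpu_long,gpu_short,gpu_long"

if command -v sinfo >/dev/null 2>&1; then
    # One line per partition and node state.
    sinfo --partition="$PARTITIONS"
    # Condensed view: one line per partition with node counts by state.
    sinfo --summarize
else
    echo "sinfo not found; run this on a node with the Slurm client tools" >&2
fi
```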
Detailed partition definitions
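The full definition of a partition can probably be read with scontrol, as sketched here. The PARTITION variable is only an illustration; any name from the table in section 2 should work. The output typically includes fields such as MaxTime and Nodes, which can be compared against that table.

```shell
# Partition to inspect; any name from the table in section 2.
PARTITION="gpu_short"

if command -v scontrol >/dev/null 2>&1; then
    # Print the full Slurm definition of one partition.
    scontrol show partition "$PARTITION"
else
    echo "scontrol not found; run this on a node with the Slurm client tools" >&2
fi
```

Running scontrol show partition without a name prints every partition.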
Example sinfo Output Snippet
Note
Node status values:
- idle → no jobs running; ready to accept work
- alloc → fully allocated to running jobs
- mix → partially allocated; some resources are still free
- down/drain → node unavailable for new jobs
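To see which nodes are in a given state, sinfo can filter by state. The sketch below assumes the Slurm client tools are available; the PARTITION and STATE values are examples only.

```shell
# Example filter values; adjust to the partition and state of interest.
PARTITION="gpu_short"
STATE="idle"

if command -v sinfo >/dev/null 2>&1; then
    # Node-oriented long listing: one line per node with its current state.
    sinfo --Node --long --partition="$PARTITION"
    # Only the nodes of that partition that are currently in the chosen state.
    sinfo --partition="$PARTITION" --states="$STATE"
else
    echo "sinfo not found; run this on a node with the Slurm client tools" >&2
fi
```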
4. Choosing the Right Partition
Use the cpu_short or cpu_long partition when:
- running multi-core CPU jobs
- performing scientific simulations
- running general HPC workloads
Use cpu_short for jobs under 2 days, cpu_long for jobs up to 4 days.
Use the gpu_short or gpu_long partition when:
- training machine learning / deep learning models
- performing GPU-accelerated workloads (CUDA, PyTorch, TensorFlow)
- requiring NVIDIA H200 performance
Use gpu_short for jobs under 2 days, gpu_long for jobs up to 4 days.
Warning — GPU misuse
Jobs without GPU requirements should not run on the GPU partitions.
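The selection rules in this section can be sketched as a small shell helper. choose_partition is a hypothetical function, not a PERUN or Slurm tool; it assumes the 2-day and 4-day limits from the table in section 2 (48 and 96 hours) and takes the expected walltime in whole hours.

```shell
# choose_partition KIND HOURS
#   KIND  : cpu or gpu
#   HOURS : expected walltime in whole hours
# Prints the matching partition name, or returns 1 if nothing fits.
choose_partition() {
    kind=$1
    hours=$2
    case "$kind" in
        cpu|gpu) ;;
        *) echo "unknown workload type: $kind" >&2; return 1 ;;
    esac
    if [ "$hours" -le 48 ]; then
        echo "${kind}_short"      # fits within the 2-day limit
    elif [ "$hours" -le 96 ]; then
        echo "${kind}_long"       # fits within the 4-day limit
    else
        echo "no partition allows ${hours} hours" >&2
        return 1
    fi
}

choose_partition cpu 3     # prints cpu_short
choose_partition gpu 72    # prints gpu_long
```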
5. Submitting Jobs to a Partition
CPU job example
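A minimal sketch of a CPU batch script, written to a file so it can be submitted with sbatch. The job name, core count, time limit, and ./my_cpu_program are placeholders chosen for illustration, not PERUN recommendations.

```shell
# Write an example batch script for the cpu_short partition.
cat > cpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=cpu_example
#SBATCH --partition=cpu_short
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=03:00:00
#SBATCH --output=cpu_example_%j.out

# Placeholder workload; replace with your own program.
srun ./my_cpu_program
EOF

# Submit on the cluster with: sbatch cpu_job.sh
```

For a run that needs more than 2 days, change the partition to cpu_long and raise --time accordingly.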
GPU job example
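A minimal sketch of a GPU batch script. The GPU count, core count, and time limit are illustrative values, and the script assumes nvidia-smi is on the path of the GPU nodes; replace that line with your own workload.

```shell
# Write an example batch script for the gpu_short partition.
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=gpu_example
#SBATCH --partition=gpu_short
#SBATCH --nodes=1
#SBATCH --gres=gpu:2
#SBATCH --cpus-per-task=8
#SBATCH --time=12:00:00
#SBATCH --output=gpu_example_%j.out

# Show the GPUs visible to the job (assumes nvidia-smi is installed).
srun nvidia-smi
EOF

# Submit on the cluster with: sbatch gpu_job.sh
```

The --gres=gpu:2 line requests two GPUs on the node; the table in section 2 lists up to 8 per node.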
Important
Failing to specify --gres=gpu:<num> in a GPU partition (gpu_short or gpu_long) results in no GPUs being allocated to the job.
6. Walltime and Efficiency
Why walltime matters
- Jobs that request far more walltime than they need tend to wait longer in the queue.
- Shorter jobs are often scheduled earlier.
- Inaccurate walltime estimates reduce cluster efficiency.
Efficient walltime use
If your job usually finishes in 3 hours, do not request 24 hours.
If your job runs under 2 days, prefer cpu_short or gpu_short over the long partitions.
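One way to turn a typical runtime into a --time request is sketched below. slurm_time is a hypothetical helper, and the default 25% safety margin is an assumption for illustration, not a PERUN rule. It prints the value in Slurm's days-hours:minutes:seconds format.

```shell
# slurm_time MINUTES [MARGIN_PERCENT]
#   MINUTES        : typical runtime of the job, in minutes
#   MARGIN_PERCENT : extra headroom to add (assumed default: 25)
# Prints a walltime string in D-HH:MM:SS form.
slurm_time() {
    minutes=$1
    margin=${2:-25}
    total=$(( minutes + minutes * margin / 100 ))
    days=$(( total / 1440 ))              # 1440 minutes per day
    hours=$(( (total % 1440) / 60 ))
    mins=$(( total % 60 ))
    printf '%d-%02d:%02d:00\n' "$days" "$hours" "$mins"
}

slurm_time 180        # 3 hours plus 25% margin: prints 0-03:45:00
slurm_time 2000 0     # no margin: prints 1-09:20:00
```

The result can be passed on the command line, for example sbatch --time="$(slurm_time 180)" followed by the name of your batch script.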
7. Summary
- PERUN provides four Slurm partitions: cpu_short, cpu_long, gpu_short, gpu_long.
- Short partitions (2-day limit): cpu_short (cn01–cn22), gpu_short (gpu01–gpu18).
- Long partitions (4-day limit): cpu_long (cn23–cn32), gpu_long (gpu19–gpu26).
- Correct partition selection improves job scheduling and cluster efficiency.
- Use sinfo and scontrol to inspect resources.
- Always specify GPUs explicitly when using the GPU partitions.