
Profiling under Slurm

Using Slurm's profiling functionality, available since the summer 2020 update

Profiling disabled by default

By default, jobs run without profiling.

Principle

By default, Slurm records the memory consumption of each job. It can be viewed with the sacct command once the job has finished. When profiling is enabled for a job, Slurm also samples this data periodically and stores it in an HDF5 file.
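
As an illustration of the default accounting data, here is a minimal Python sketch that asks sacct for the peak memory of a finished job. The field list (JobID, MaxRSS, MaxVMSize), the helper names and the job number used below are choices made for this example only; they are not prescribed by this page.

```python
import subprocess

# Accounting fields requested from sacct (an illustrative choice).
FIELDS = ["JobID", "MaxRSS", "MaxVMSize"]


def sacct_command(job_id):
    """Build the sacct command line for one job (pipe-separated, no header)."""
    return ["sacct", "-j", str(job_id), "--format=" + ",".join(FIELDS),
            "--parsable2", "--noheader"]


def parse_sacct(output):
    """Turn sacct --parsable2 output into one dict per output line."""
    rows = []
    for line in output.splitlines():
        if line.strip():
            rows.append(dict(zip(FIELDS, line.split("|"))))
    return rows


def job_memory(job_id):
    """Run sacct (only works where Slurm is available) and parse the result."""
    result = subprocess.run(sacct_command(job_id), capture_output=True,
                            text=True, check=True)
    return parse_sacct(result.stdout)
```

Calling job_memory(123456) on a login node, with 123456 standing in for a real job number, returns one dictionary per line reported by sacct. The memory fields are typically filled in on the job steps rather than on the job line itself.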

Usage

When submitting the job with sbatch, add the option --profile=all.

Once the job has finished, merge the generated profile files by running sh5util -j $JOB_ID, replacing $JOB_ID with the job number. This command writes an HDF5 file named job_$JOB_ID.h5, containing the collected data, to the current directory.
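
The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the script name my_job.sl is a placeholder, and the job ID is read from the usual "Submitted batch job <id>" message printed by sbatch.

```python
import subprocess


def submit_command(job_script, profile="all"):
    """Command line that submits job_script with profiling enabled."""
    return ["sbatch", "--profile=" + profile, job_script]


def merge_command(job_id):
    """Command line that gathers the profile data of a finished job."""
    return ["sh5util", "-j", str(job_id)]


def merged_file(job_id):
    """Name of the HDF5 file that sh5util writes to the current directory."""
    return "job_{}.h5".format(job_id)


def parse_job_id(sbatch_output):
    """Extract the job ID from 'Submitted batch job <id>'."""
    return int(sbatch_output.split()[3])


def submit_with_profiling(job_script):
    """Submit the job and return its ID (only works where Slurm is available)."""
    result = subprocess.run(submit_command(job_script), capture_output=True,
                            text=True, check=True)
    return parse_job_id(result.stdout)
```

For example, job_id = submit_with_profiling("my_job.sl") submits the job; once it has finished, subprocess.run(merge_command(job_id), check=True) produces the file named by merged_file(job_id).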

Data

On Myria, only the data on the compute tasks (the task data type) is available. It lets you follow memory usage over time (RSS and VMSize).

The GPFS file system is not compatible with the read/write monitoring plugin, and the Omni-Path network is not compatible with the network monitoring plugin.

To view the contents of the HDF5 file, install the HDFView software on your workstation.
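
As a possible alternative to a graphical viewer, the file can also be explored from a script. The sketch below assumes the third-party h5py package is installed; it makes no assumption about the internal layout of the file and simply walks the tree to list every dataset it finds. The function names are hypothetical.

```python
def list_datasets(group, prefix=""):
    """Return (path, dataset) pairs for every dataset below group.

    Groups are recognised by their dict-like items() method, which
    h5py.File and h5py.Group provide and h5py.Dataset does not.
    """
    found = []
    for name, node in group.items():
        path = prefix + "/" + name
        if hasattr(node, "items"):
            # A group: descend into it.
            found.extend(list_datasets(node, path))
        else:
            # A leaf: record it with its full path.
            found.append((path, node))
    return found


def print_profile(filename):
    """Print the path and shape of each dataset in a merged profile file."""
    import h5py  # third-party package, assumed to be installed

    with h5py.File(filename, "r") as h5:
        for path, dataset in list_datasets(h5):
            print(path, getattr(dataset, "shape", None))
```

Running print_profile("job_123456.h5"), where 123456 stands in for a real job number, shows where the task time series are stored in the file; the datasets can then be read with h5py for plotting or further analysis.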

Going further

For more information, see the Slurm documentation: https://slurm.schedmd.com/archive/slurm-20.02.7/hdf5_profile_user_guide.html


Last update: November 25, 2022 14:05:21