Supporting dynamic allocation of heterogeneous storage resources on HPC systems
Julien Monniot, François Tessier, Matthieu Robert, Gabriel Antoniu
- Computational Theory and Mathematics
- Computer Networks and Communications
- Computer Science Applications
- Theoretical Computer Science
- Software
Summary
Scaling up large‐scale scientific applications on supercomputing facilities depends largely on the ability to scale data storage and retrieval efficiently. However, the gap between I/O and computing performance keeps widening. To address this gap, an increasingly popular approach is to introduce new intermediate storage tiers (such as node‐local storage and burst buffers) between the compute nodes and the traditional global shared parallel file system. Unfortunately, without advanced techniques to allocate and size these resources, they remain underutilized. In this article, we investigate how heterogeneous storage resources can be allocated on a high‐performance computing platform, just like compute resources. To this end, we introduce StorAlloc, a simulator used as a testbed for assessing storage‐aware job scheduling algorithms and evaluating various storage infrastructures. We illustrate its usefulness through a large series of experiments showing how the tool can be used to size a burst‐buffer partition on a top‐tier supercomputer using the job history of a production year.