AWS Parallel Computing Service expands Slurm customization capabilities
AWS Parallel Computing Service (AWS PCS) now offers expanded Slurm configuration capabilities, enabling you to set over 60 additional parameters for granular control over your high performance computing (HPC) cluster operations. This enhancement provides more flexibility in managing job scheduling, resource allocation, access control, and job lifecycle.
The new Slurm custom settings give you fine-grained control over resource management scenarios such as fair-share scheduling and quality of service levels. For example, you can now implement queue-specific priority policies, configure preemption settings, and set custom time and resource limits. You can also control access permissions at the account level and configure per-job execution behaviors. Together, these capabilities help you run a production HPC environment that efficiently serves multiple teams, projects, and workload types.
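As a rough illustration of the fair-share and preemption scenarios described above, the following Python sketch assembles a set of Slurm parameters into a list of name/value pairs. The parameter names are standard `slurm.conf` options, but whether each one is among the settings AWS PCS accepts is an assumption, as are the weights and the payload field names (`slurmCustomSettings`, `parameterName`, `parameterValue`). The sketch only builds and prints the structure; it does not call any AWS API. Check the AWS PCS User Guide for the supported parameters and request format.

```python
import json


def to_custom_settings(params):
    """Convert a {name: value} mapping into a list of name/value pairs.

    The key names used here are an assumption about the payload shape,
    not a confirmed AWS PCS request format.
    """
    return [
        {"parameterName": name, "parameterValue": str(value)}
        for name, value in params.items()
    ]


# Illustrative values only: standard slurm.conf options for multifactor
# priority (fair-share and QOS weights) and QOS-based preemption.
scheduling_params = {
    "PriorityType": "priority/multifactor",
    "PriorityWeightFairshare": 10000,
    "PriorityWeightQOS": 5000,
    "PreemptType": "preempt/qos",
    "PreemptMode": "REQUEUE",
}

slurm_configuration = {
    "slurmCustomSettings": to_custom_settings(scheduling_params)
}

print(json.dumps(slurm_configuration, indent=2))
```

Keeping the settings in a plain mapping and converting them in one place makes it easier to review the intended scheduler policy before submitting it to the service.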
AWS PCS is a managed service that makes it easier for you to run and scale your HPC workloads and build scientific and engineering models on AWS using Slurm. You can use AWS PCS to build complete, elastic environments that integrate compute, storage, networking, and visualization tools. AWS PCS simplifies cluster operations with managed updates and built-in observability features, helping to remove the burden of maintenance. You can work in a familiar environment, focusing on your research and innovation instead of worrying about infrastructure.
Expanded Slurm custom settings are available in all AWS Regions where AWS PCS is available. To learn more, see the AWS PCS User Guide.