This section provides an overview of the directories shared between the compute nodes and the controller nodes in HPC cloud clusters. Understanding these shared directories is crucial for managing data and optimizing job performance in your cluster setup.
The home directory serves as a central storage location and plays a key role in your cluster's operation. It is automatically NFS exported from the controller node, and all compute nodes inside a cluster automatically mount /home over NFS.
Because the home directory is part of the controller's root disk, it gets deleted when a cluster is turned off. This directory is a no-cost alternative to cloud storage and can be used for job execution, storing configurations, and performing small file operations.
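To confirm from a node that /home is actually served over NFS, the mount table can be inspected. The following is a minimal sketch, assuming a Linux node where /proc/mounts is readable; the function names are hypothetical and not part of the platform.

```python
from typing import Optional


def filesystem_type(mount_table: str, mount_point: str) -> Optional[str]:
    """Return the filesystem type recorded for mount_point, or None if absent.

    mount_table is text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    fs_type = None
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            # A later entry shadows an earlier one, so keep the last match.
            fs_type = fields[2]
    return fs_type


def is_nfs_mounted(mount_table: str, mount_point: str = "/home") -> bool:
    """Report whether mount_point is mounted with an NFS filesystem type."""
    fs_type = filesystem_type(mount_table, mount_point)
    # Linux reports NFS mounts as "nfs" or "nfs4".
    return fs_type is not None and fs_type.startswith("nfs")
```

On a compute node, the check could be run as is_nfs_mounted(open("/proc/mounts").read()).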
For ease of use, we automatically mount the cluster owner's home directory into their user workspace using SSHFS. We recommend using this mount to drag and drop small files and to open scripts and logs in the integrated development environment (IDE). However, we do not recommend using this mount for large file transfers between the user workspace and the controller. Instead, we recommend using a command such as scp. For more information, please see Transferring Data.
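As an illustration of a larger transfer done with scp rather than through the SSHFS mount, here is a minimal sketch that assembles the command from Python. The user name, host name, and paths are hypothetical placeholders; the helper only builds the argument list and does not run anything.

```python
import shlex
from typing import List


def build_scp_command(source: str, user: str, host: str, destination: str,
                      recursive: bool = False) -> List[str]:
    """Build an scp argument list that copies source to user@host:destination."""
    command = ["scp"]
    if recursive:
        # -r copies a directory and everything beneath it.
        command.append("-r")
    command.extend([source, f"{user}@{host}:{destination}"])
    return command


# Hypothetical values, for illustration only.
command = build_scp_command("./dataset", "alice", "controller.example.com",
                            "/home/alice/", recursive=True)
print(shlex.join(command))
```

The list can be passed to subprocess.run(command, check=True) once the placeholders are replaced with real values.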
To meet your specific storage needs, you have the flexibility to create additional storage and filesystems that can be attached to your clusters. These options provide you with customizable storage solutions tailored to your workload. For more information on managing storage and attaching additional storage resources to your clusters, please see Storage.
Persistent Cloud Object Storage
Cloud storage is an excellent option for persistent and cost-effective data storage. You can use AWS S3, Azure Blob Storage, or Google Cloud Storage (GCS) object storage, which can be mounted to the controller and compute nodes using Filesystem in Userspace (FUSE). This type of storage allows you to store your results securely before turning off a cluster. For better performance than the FUSE mount provides, we recommend copying data manually with the corresponding command-line tool, such as aws s3 or the equivalent for your cloud provider.
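A manual copy with the AWS CLI could look like the sketch below, which assembles an aws s3 sync command. The bucket name and prefix are hypothetical, and the sketch assumes the AWS CLI is installed and has credentials on the node where it runs.

```python
from typing import List


def build_s3_sync_command(local_dir: str, bucket: str, prefix: str = "") -> List[str]:
    """Build an `aws s3 sync` argument list that uploads local_dir to a bucket."""
    cleaned = prefix.strip("/")
    # Omit the trailing path segment when no prefix is given.
    destination = f"s3://{bucket}/{cleaned}" if cleaned else f"s3://{bucket}"
    return ["aws", "s3", "sync", local_dir, destination]


# Hypothetical bucket and prefix, for illustration only.
sync_command = build_s3_sync_command("/home/alice/results", "my-results-bucket", "run-001")
```

Running the list with subprocess.run(sync_command, check=True) uploads new and changed files; sync skips files that are already up to date in the bucket, which keeps repeated uploads cheap.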
Persistent Cloud NFS Storage
As an alternative to object storage, you have access to managed filesystem services such as Filestore on GCP or EFS on AWS. These services can be mounted to the controller and compute nodes over NFS.
These filesystems, like object storage, are persistent, and can offer better performance than object storage. Please note that setting this up currently requires assistance from our support team.
You can use persistent filesystems to install custom, user-specific software that needs to be available whenever a cluster is started. They are also a reliable location for storing your results before shutting down a cluster, so that you can continue your work in later sessions.
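Because the home directory is removed when the cluster is turned off, results need to be copied to the persistent filesystem first. The following is a minimal sketch of that step; the function name, the run name, and the mount path passed as persistent_root are hypothetical and depend on how the filesystem is attached to your cluster.

```python
import shutil
from pathlib import Path


def save_results(results_dir: str, persistent_root: str, run_name: str) -> Path:
    """Copy results_dir into persistent_root/run_name and return the target path."""
    source = Path(results_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"results directory not found: {source}")
    target = Path(persistent_root) / run_name
    # dirs_exist_ok=True (Python 3.8+) lets a rerun update an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

A call such as save_results("/home/alice/results", "/mnt/persistent", "run-001") would be made before shutting the cluster down, with /mnt/persistent replaced by the actual mount point.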
Lustre storage offers a powerful parallel distributed filesystem that can be mounted to the controller and compute nodes. You can configure Lustre storage as either persistent or ephemeral, depending on your requirements. However, it's important to be mindful of the cost implications associated with Lustre filesystems; in general, Lustre is more suitable for large-scale HPC workloads with demanding I/O operations.
Image disks are created from a cloud image and attached directly to the controller. The controller automatically exports image disks over NFS, and compute nodes automatically mount those exports. Files written to these disks do not persist across sessions, because the disk is recreated from the image source each time a cluster is started. Image disks are a practical choice when you need to install organization-wide software and make it readily available to all users.