The main function of the HPCC storage systems is to give the worker nodes of the HPCC clusters rapid access to the data needed in high-speed calculations. For this reason, these systems are optimized for speed and are not intended for long-term or archival storage. We cannot guarantee that data will not be lost due to operational factors in the use of the clusters. As a result, it is each researcher's responsibility to back up their own important data externally.
The HPCC stores data on a set of resilient Lustre-based file systems, and backs up a limited amount of user data in home areas. We strongly encourage users to maintain an external copy of all data and not to use HPCC storage systems as the only copy of files critical to their research. In particular, the work, scratch and other special-purpose areas are not backed up and should not be used as the only long-term copies of important data.
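One way to act on this advice is to bundle a directory into a dated archive with a checksum before copying it off the cluster. The sketch below is illustrative only: the function name make_backup_archive and the archive naming scheme are hypothetical, and it assumes GNU tar and sha256sum are available on the system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pack a directory into a dated .tar.gz archive and
# write a SHA-256 checksum file next to it. Prints the archive path.
make_backup_archive() {
    local src="$1" dest_dir="$2"
    local base name
    base="$(basename "$src")"
    name="${base}-$(date +%Y%m%d).tar.gz"
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$dest_dir/$name" -C "$(dirname "$src")" "$base" || return 1
    # Record the checksum so the copy can be verified after transfer.
    (cd "$dest_dir" && sha256sum "$name" > "$name.sha256") || return 1
    printf '%s\n' "$dest_dir/$name"
}
```

For example, make_backup_archive "$WORK/myproject" "$SCRATCH" (where myproject is a placeholder) would write the archive and its .sha256 file into the scratch area. Both files can then be copied to an external system with a tool such as scp or rsync and checked there with sha256sum -c.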
In more detail, the trade-off among capacity, speed, cost, and reliability in the HPCC storage systems is generally resolved in favor of large capacity at high speed and relatively low cost. Most of the cluster disk storage is composed of redundant arrays of inexpensive disks (RAID) to be resilient against single disk failures. There are nearly 100 such arrays operating at this time in the HPCC. In most cases, at least three disks in any given array must fail for data to be lost.
Data Policy for Hrothgar and Quanah
On Hrothgar and Quanah:
- The $HOME area for every user is backed up and is subject to usage quotas.
- The $WORK area for every user is not backed up but is not purged, and is subject to usage quotas larger than those used in $HOME.
- Special researcher-owned storage areas may be purchased by individual researchers or research groups and are managed according to their own policies.
- The Scratch partition is subject to purging in order to keep file usage below 80% of capacity:
  - If the Scratch partition becomes 80% or more full, the $SCRATCH area for every user is purged of its oldest files. See the Purge Policy below for details.
  - On a monthly basis, the $SCRATCH area for every user is purged according to the Purge Policy; see below for details.
The following table summarizes the locations, their sizes and backup details.
| Location | Size in GB | Alias | Backup | Purged |
Purge Policy
Individual files will be removed from /lustre/scratch/eraiderid ($SCRATCH) automatically if they have not been accessed in over 1 year. To check a file's last access time (atime), use the following command: ls -ulh filename.
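To review many files at once rather than one at a time, the access-time check can be applied to a whole directory tree. The following is a minimal sketch, assuming GNU find; the function name list_stale_files is hypothetical, and the 365-day default only approximates the "over 1 year" rule, since the exact cutoff used by the purge is not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list regular files under a directory that have not
# been accessed for more than a given number of days (default 365).
list_stale_files() {
    local dir="$1" days="${2:-365}"
    # find counts whole 24-hour periods, so "-atime +365" matches files
    # last accessed at least 366 days ago. Listing files this way reads
    # their metadata only and does not change their access times.
    find "$dir" -type f -atime +"$days" -print
}
```

Running list_stale_files "$SCRATCH" prints the paths that are likely candidates for the next purge, so that anything still needed can be copied elsewhere first.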
Users may not run "touch" or similar commands for the purpose of altering their files' timestamps to circumvent this purge policy. Users who violate this policy will have their accounts suspended by HPCC staff. This suspension may result in the user's jobs being killed.
HPCC staff will monitor the scratch space usage to ensure that it remains below 80% full. On a monthly basis, the $SCRATCH area for every user will be purged of all files that have not been accessed in over 1 year. This monthly purge will be run regardless of the current level of scratch space usage. In the event the Scratch partition reaches the 80% threshold, an immediate purge of every user's $SCRATCH space will be triggered and the oldest files for each user will be removed until usage is well below the 80% threshold.
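Users who want a rough idea of how close the partition is to this threshold can read the figure from the standard df utility. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the 80 default simply mirrors the "80% or more full" wording of this policy, and the value df reports may not match the measurement HPCC staff use.

```shell
#!/usr/bin/env bash
# Extract the "Capacity" column (percent used) from `df -P` output on stdin.
parse_capacity() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Percent of the file system holding the given path that is in use.
usage_percent() {
    df -P "$1" | parse_capacity
}

# Succeeds (exit status 0) when usage is at or above the threshold.
# Assumption: the default of 80 is taken from the wording of this policy.
at_or_over_threshold() {
    local pct="$1" threshold="${2:-80}"
    [ "$pct" -ge "$threshold" ]
}
```

A typical use would be pct=$(usage_percent /lustre/scratch) followed by at_or_over_threshold "$pct" && echo "scratch is at ${pct}%: an immediate purge may be triggered".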
To help us avoid the need to shorten the retention period, please use the scratch space conscientiously. The Scratch partition should be used only for files that have no need for long-term retention; ideally, files should stay there for a matter of days. The retention period is variable because it depends on usage. At this time, and with current usage patterns, the file retention period on the scratch area can be expected to be at least several days and most likely up to a few weeks, but in no case will files stay on disk more than a year after their last access. The scratch area should NOT be used for files that will be needed for longer periods.
We will try to keep the HPCC user community informed and give warnings if the expected retention period decreases significantly in the future.
Data Policy for Janus
On Janus, the D drive is the only area that is backed up. Users do not have write access to the C drive.
For additional assistance please contact firstname.lastname@example.org