Texas Tech University

Faculty/Staff account request

Please select the cluster that you would like to use, review the HPCC Operational Policies and Data Policies, and click the "Agree" button at the bottom to certify that you will comply with the policies.

Select Your Cluster

Welcome to the HPCC account creation page. Please select a cluster from the drop-down menu above to view information about that cluster. Once you have selected the cluster that you would like to use, please read our Operational Policies and Data Policies below, then click Agree to continue with your account request.

Operational Policies

Login Security

The HPCC relies on the TTU eRaider authentication system to check user credentials on our systems. All users log in to HPCC clusters with their eRaider ID and password.

HPCC systems have RSA public-key authentication enabled for passwordless login. You may choose whether to enable this on each system by adding public keys from remote systems to ~/.ssh/authorized_keys. Please be careful: if one of your accounts is compromised, this allows an intruder to reach all of them. We strongly suggest that you do not add a key for a system that is itself insecure (MS Windows, security not up to date, telnet enabled, etc.), as this gives intruders user-level access to HPCC systems, which can then be escalated to root access and total control.
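As a sketch of the mechanics, adding a remote system's public key looks like the following; the key string here is a placeholder, and you would paste in the contents of your remote machine's own public key file (typically ~/.ssh/id_rsa.pub) instead:

```shell
# Create ~/.ssh with owner-only access if it does not exist yet.
mkdir -p ~/.ssh
chmod 700 ~/.ssh
# Placeholder key string; substitute your remote machine's real public key.
echo 'ssh-rsa AAAAB3NzaC1yc2E...example user@laptop' >> ~/.ssh/authorized_keys
# SSH refuses keys in a file that is writable by group or others.
chmod 600 ~/.ssh/authorized_keys
```

The permission steps matter: SSH silently ignores authorized_keys if the file or the ~/.ssh directory is too permissive.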

Cluster Internal Security

On first login to a cluster head node, SSH may ask you for a key passphrase. This is not generally needed; it is reasonably secure, and makes login to the compute nodes simpler, if you leave the passphrase blank (press Enter at the prompt). From the head node, you should be able to either ssh or rsh to all of the compute nodes in that cluster without a password. If either ssh or rsh prompts for a password on head-node-to-compute-node login, please contact HPCC staff at hpccsupport@ttu.edu, as parallel software generally depends on passwordless login. More complex methods will be required if you set a non-blank SSH passphrase on the cluster head nodes.
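For reference, generating a key pair with a blank passphrase non-interactively (-N '') is equivalent to pressing Enter at the passphrase prompt; the file name id_rsa_demo below is illustrative:

```shell
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# Remove any leftover demo key so ssh-keygen does not prompt to overwrite.
rm -f ~/.ssh/id_rsa_demo ~/.ssh/id_rsa_demo.pub
# -N '' sets an empty passphrase; -q suppresses the banner output.
ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa_demo
```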

Remote shell, or rsh, only works within each cluster. If you are extremely concerned about security, you may wish to use only ssh within clusters. rsh is faster because it does not encrypt each transmission, but those transmissions can be intercepted and decoded. This is generally not an issue, since intercepting the messages requires root access to the cluster, and a cracker who already had root access would not need to intercept them.

MPI on the clusters also uses either ssh or rsh for data transmission. Alternative MPI library builds, one for each combination of remote shell (ssh or rsh) and Fortran compiler, are provided in /home/local. Please include, link, and mpirun from the same library build. The Fortran compiler version does not matter if you use only gcc/g++.

Data Backup

We currently store data on a resilient system and back up a limited amount of user data; however, we strongly encourage users to maintain a copy of all data that is absolutely critical to their research. Ultimately, we do not have the budget to guarantee that data will not be lost. As a result, it is each researcher's responsibility to back up their own important data on their own systems. In HPCC systems, the trade-off between performance, size, cost, and reliability is generally resolved in favor of large size at medium performance and low cost; reliability necessarily suffers. Most of the cluster disk storage consists of arrays of consumer-grade disks. Disk failures are relatively common, but single-disk failures are usually handled automatically by RAID software.

Critical files, such as source code and output required for a paper or dissertation, should be periodically copied to each user's personal machine and further backed up to removable media and stored offsite.

Access Permissions

By default on Linux systems, users have read, write, and execute permissions on the directories and files that they own, while those directories and files are readable and executable by other users, including users in the owner's group. Each user owns the directories /home/user-id, /lustre/work/user-id, and /lustre/scratch/user-id, as well as all files and directories under them. A user also owns any temporary files or directories in /state/partition1 on compute nodes, if their jobs create temporary output there. If you are concerned about these permission settings (for example, you do not want others to read your files), you can change them with the "chmod" command and appropriate options. For details, run "man chmod" to read the chmod manual, or contact hpccsupport@ttu.edu.
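For example, to make a directory readable only by you (the directory name ~/private-project is hypothetical):

```shell
mkdir -p ~/private-project
# 700 = read/write/execute for the owner, no access for group or others.
chmod 700 ~/private-project
# Strip group/other permission bits from everything already inside it.
chmod -R go-rwx ~/private-project
```

With the top-level directory set to 700, other users cannot list or enter it regardless of the permissions on the files within.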

Regardless of the directory permissions, root users (HPCC staff and anyone who might completely crack the system) and your sponsoring faculty/staff can read your files. If you want to make extremely sure that no one reads your files, install a PGP package and encrypt your critical files. You would also need to eliminate backup copies from the Tivoli system. However, if you encrypt files and then forget your passphrase, they can be extremely difficult to decrypt, depending on the size of your encryption key.
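A sketch using GnuPG's symmetric mode, assuming the gpg command is available on the system; the file name and passphrase are placeholders, and in real use you would let gpg prompt for the passphrase rather than putting it on the command line (where it can appear in the process list and shell history):

```shell
echo 'sensitive notes' > ~/notes.txt
# --batch and --pinentry-mode loopback let the passphrase come from the
# flag for this demo; interactively, omit them and gpg will prompt you.
gpg --batch --yes --pinentry-mode loopback --passphrase 'example-pass' \
    --symmetric --cipher-algo AES256 -o ~/notes.txt.gpg ~/notes.txt
rm ~/notes.txt   # keep only the encrypted copy
# Decryption writes the plaintext to stdout.
gpg --batch --quiet --pinentry-mode loopback --passphrase 'example-pass' \
    --decrypt ~/notes.txt.gpg
```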

Data Policies

The main function of the HPCC storage systems is to provide rapid access to and from the worker nodes of the HPCC clusters for data needed in high speed calculations.  For this reason, these systems are optimized for speed and are not intended for long-term or archival storage. We cannot guarantee that data will not be lost due to operational factors in the use of the clusters. As a result, it is the researcher's responsibility to back up their own important data externally.

The HPCC stores data on a set of resilient Lustre-based file systems, and backs up a limited amount of user data in home areas. We strongly encourage users to maintain an external copy of all data and not to use HPCC storage systems as the only copy of files critical to their research. In particular, the work, scratch and other special-purpose areas are not backed up and should not be used as the only long-term copies of important data.

In more detail, in HPCC storage systems the trade-off between performance, size, cost, and reliability is generally resolved in favor of large size at high speed and relatively low cost. Most of the cluster disk storage is composed of redundant arrays of inexpensive disks (RAID), which are resilient against single-disk failures. There are nearly 100 such arrays operating in the HPCC at this time. In most cases, at least three disks in a given array must fail for data to be lost.

Data Policy for Hrothgar and Quanah

On Hrothgar and Quanah,

  • The $HOME area for every user is backed up and is subject to usage quotas.
  • The $WORK area for every user is not backed up but is also not purged, and is subject to larger usage quotas than those used in $HOME.
  • Special researcher-owned storage areas may be purchased by individual researchers or research groups and are managed according to their own policies.
  • The Scratch partition is subject to purging in order to keep usage below 80% full.
    • If the Scratch partition becomes 80% or more full, the $SCRATCH area for every user is purged of its oldest files. See the Purge Policy below for details.
    • On a monthly basis, the $SCRATCH area for every user is purged according to the Purge Policy (see below for details).

The following table summarizes the locations, their sizes and backup details.

Location, quota and backup summary
Location                    Quota    Alias      Backed up   Purged
/home/eraiderid             150GB    $HOME      Yes         No
/lustre/work/eraiderid      700GB    $WORK      No          No
/lustre/scratch/eraiderid   none     $SCRATCH   No          Monthly

Purge Policy

Individual files will be removed from /lustre/scratch/eraiderid ($SCRATCH) automatically if they have not been accessed in over one year. To check a file's last access time (atime), you can use the following command: ls -ulh filename.
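To preview your own purge candidates, GNU find can list files whose access time is older than a year. The demo below fabricates an old file in a local stand-in directory; on the cluster you would point find at $SCRATCH itself:

```shell
# Local demo directory standing in for $SCRATCH.
demo=~/scratch-demo
mkdir -p "$demo"
# Fabricate a file whose last access time is two years ago (GNU touch -a -d).
touch -a -d '2 years ago' "$demo/old_output.dat"
# List files not accessed in over a year, i.e. purge candidates.
find "$demo" -type f -atime +365
```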

Users may not run "touch" or similar commands for the purpose of altering their files' timestamps to circumvent this purge policy. Users who violate this policy will have their accounts suspended by HPCC staff. This suspension may result in the user's jobs being killed.

HPCC Staff will monitor the scratch space usage to ensure that it remains below 80% full.  On a monthly basis the $SCRATCH area for every user will be purged of all files that have not been accessed in over 1 year.  This monthly purge will be run regardless of the current level of scratch space usage.  In the event the Scratch partition goes above the 80% threshold, an immediate purge of every user's $SCRATCH space will be triggered and the oldest files for each user will be removed until we are well below the 80% threshold.

To help us avoid the need to shorten the retention period, please use the scratch space conscientiously. The Scratch partition should be used only for files that have no need for long-term retention; ideally, this period should be measured in days. The retention period is variable because it depends on usage. At this time, with current usage patterns, the file retention period on the scratch area can be expected to be at least several days and most likely up to a few weeks, but in no case will files stay on disk more than a year after their last access. The scratch area should NOT be used for files that will be needed for longer periods.

We will try to keep the HPCC user community informed and give warnings if the expected retention period decreases significantly in the future.

Data Policy for Janus

On Janus, the D: drive is the only area that is backed up. Users do not have write access to the C: drive.

Additional Assistance

For additional assistance, please contact hpccsupport@ttu.edu.

High Performance Computing Center