Update 19 Jan 2021 6:30 pm:
Working with the storage equipment vendor, we have resolved the problems that the HPCC storage systems encountered after the maintenance work. The fix was to install an updated version of Lustre. HPCC staff members are now performing the remaining work planned for the shutdown and expect to return the full RedRaider cluster to operation by 6:00 pm on Thursday, Jan. 21.
Further updates will be posted on this page when available. Please contact email@example.com if you have any questions.
Planned shutdowns for system maintenance are usually reserved for the 2nd full week of the 2nd month of each calendar quarter. A planned shutdown may be skipped if it is not needed, extended if necessary, or shortened. In addition, special planned shutdowns are sometimes necessary to carry out upgrades or alterations to the clusters.
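As a rough illustration of the quarterly rule, the Python sketch below computes the default maintenance window for each quarter of a given year. It assumes that a "full week" runs Monday through Friday and that the 2nd full week starts on the month's second Monday. That reading is an assumption rather than stated HPCC policy, although it reproduces the May, August, and November 2021 dates listed on this page. The function names are hypothetical.

```python
from datetime import date, timedelta

def second_full_week(year, month):
    """Return (Monday, Friday) of the 2nd full week of a month.

    Assumption: a full week starts on a Monday that falls inside the month.
    """
    first = date(year, month, 1)
    # date.weekday() is 0 for Monday, so this is the gap to the first Monday.
    days_to_monday = (7 - first.weekday()) % 7
    first_monday = first + timedelta(days=days_to_monday)
    start = first_monday + timedelta(days=7)   # second Monday of the month
    return start, start + timedelta(days=4)    # through that week's Friday

def quarterly_windows(year):
    """Default windows: the 2nd month of each quarter (Feb, May, Aug, Nov)."""
    return [second_full_week(year, month) for month in (2, 5, 8, 11)]

if __name__ == "__main__":
    for start, end in quarterly_windows(2021):
        print(start.isoformat(), "to", end.isoformat())
```

The rule only gives a default. The 2021 schedule on this page lists a special January shutdown rather than a February window, so actual dates should always be taken from the posted schedule.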
For 2021, the planned shutdowns will be on the following schedule:
1) January 4-15, 2021 (Special shutdown to merge Quanah and RedRaider clusters)*
2) May 10-14, 2021
3) August 9-13, 2021
4) November 8-12, 2021
* The schedule for the shutdown to merge the Quanah and RedRaider clusters is as follows:
Jan. 1, 2021, 5 pm: Draining of the general Quanah omni queue in preparation for merging clusters.
Jan. 4 - 15, 2021: All clusters offline to merge Quanah into RedRaider (including control Ethernet, batch scheduler change to Slurm, rebuilding of software library, operating system update).
Jan. 15, 2021, end of business day: Full RedRaider cluster returns to service. General availability for the combined RedRaider cluster begins.
Jan. 19, 2021: Training in Slurm and the RedRaider cluster as below.
Introduction to the new Red Raider Cluster: Jan. 19, 2 pm - 4 pm, on Zoom (register to attend).
If you have any questions, please contact firstname.lastname@example.org.
Maintenance downtime policy for HPCC systems:
- Special periods may be reserved for performing routine maintenance. Users will be notified as early as possible when we plan to bring the systems down for any reason.
- Systems may also go down at any time to fix security issues, but every effort will be made to give the earliest possible notice.
- Maintenance may also be required without prior notice in the event of system crashes or any unstable behavior.
Users should always keep extra copies of any files that are critical to their research on systems that are outside of the HPCC to guard against possible loss in the event of unforeseen catastrophic failure.
This policy is just good general practice and applies to all critical research files, regardless of where they are stored or whether or not they are located on HPCC resources.
If you have any questions, concerns, or problems, please contact us at email@example.com.