HPCC Facilities and Equipment

For a description of HPCC software and other services, click here. For information on how to use the equipment described below, please consult our HPCC User Guides and HPCC Training pages.

Facilities

The High Performance Computing Center (HPCC) operates and maintains hardware located in three separate data centers. Our main production clusters, along with the central file system servers and several smaller systems, are in the Experimental Sciences Building I (ESB I) on the main TTU campus. Our secondary locations, containing isolated specialty clusters for specific research applications, are found in the Chemistry Building on the main campus and at Reese Center several miles west of the main campus.

Equipment

The HPCC resources include both generally available public systems and researcher-owned private or dedicated systems. Public nodes and storage are available to any TTU faculty member or research grant leader, to TTU students working with them, and, upon special request, to external research partners of TTU faculty. These resources, including a standard per-user allotment of storage for data used on the clusters, are provided at no cost to these users. Processing time on all generally available nodes is allocated on a fair-share queueing basis.

For those with large-scale needs, individual faculty members, research grant leaders, or research groups may purchase dedicated priority on one or more CPU nodes or GPUs, or storage beyond the base amounts, on an annual or pre-paid basis. Dedicated resources owned by individual researchers or groups are administered by HPCC staff when operated as part of HPCC clusters.

Note: The HPCC does not provide support for clusters or equipment not housed in HPCC machine room facilities.

Primary Campus-Based Computing Clusters

The HPCC's primary cluster is RedRaider, consisting of the Nocona and Quanah/XLQuanah CPU partitions, the Matador and Toreador GPU partitions, and the Ivy interactive and high-memory nodes, totaling approximately 2.2 PFLOPS of raw peak computing power. An overview of the configuration of the primary cluster partitions is shown in the table below.

Partition:	Nocona	Quanah / XLQuanah	Matador	Toreador	Ivy
Type	CPU	CPU	GPU	GPU	Auxiliary CPU*
Total Nodes	240	467 / 16	20	11	50 / 2
Theoretical Max	983 TFLOPS	565 TFLOPS / 19 TFLOPS	280 TFLOPS	287 TFLOPS	40 TFLOPS
Benchmarked	804 TFLOPS	485 TFLOPS / (N/A)	226 TFLOPS		N/A
OS	CentOS 8.1	CentOS 7.4 / CentOS 8.1	CentOS 8.1	CentOS 8.1	Rocky Linux 8.5 / CentOS 8.1
Manufacturer	Dell	Dell	Dell	Dell	Dell
Node Model	PowerEdge C6525	PowerEdge C6320	PowerEdge R740	PowerEdge R7525	PowerEdge C6220 II
Cooling	Liquid Cooled	Air Cooled	Air Cooled	Air Cooled	Air Cooled
Processor Make and Model	AMD EPYC™ 7702	Intel Xeon E5-2695 v4	Intel Xeon Gold 6242	AMD EPYC™ 7262	Intel Xeon E5-2670 v2
Family	Rome	Broadwell	Cascade Lake	Rome	Ivy Bridge
Cores/Processor	64	18	20	8	10
Cores/Node	128	36	40 CPU + 1,280 tensor + 10,240 CUDA	16 CPU + 1,296 tensor + 20,736 CUDA	20
Total Cores In Partition	30,720	16,812 / 576	800 CPU + 25,600 tensor + 204,800 CUDA	176 CPU + 14,256 tensor + 228,096 CUDA	1,000 / 40
GPU (if present)	N/A	N/A	NVIDIA Tesla V100	NVIDIA Ampere A100	N/A
GPUs/Node	0	0	2	3	0
Total GPUs	0	0	40	33	0
Memory/Node	512 GB	192 GB / 256 GB	384 GB	192 GB	128 GB / 1536 GB
Memory/Core	4 GB	5.33 GB / 7.11 GB	9.6 GB	12 GB	6.4 GB / 76.8 GB
High-Speed Fabric	Mellanox HDR 200 InfiniBand	Intel OmniPath / Mellanox FDR	Mellanox HDR 100 InfiniBand	Mellanox HDR 100 InfiniBand	Mellanox FDR InfiniBand
Fabric Speed	200 Gbps	100 Gbps / 56 Gbps	100 Gbps	100 Gbps	56 Gbps
Topology	Non-Blocking Fat-Tree	Non-Blocking Fat-Tree / Partitioned	Non-Blocking Fat-Tree	Non-Blocking Fat-Tree	Up to 2:1 Oversubscribed Fat-Tree
Ethernet	25 GbE	10 GbE	25 GbE	25 GbE	1 GbE
Efficiency	81.5%	85.8% / (N/A)	80.6%		N/A

* Auxiliary CPUs are used to support workflow management, interactive use, and specialty nodes such as high-memory instances. At present, two Ivy Bridge class nodes with 1536 GB of memory each are available in the himem-ivy partition for special cases that require high memory per node.
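Several rows of the table are derived from others: total CPU cores are nodes times cores per node, memory per core is memory per node divided by cores per node, and efficiency is the benchmarked figure divided by the theoretical maximum. The Python sketch below recomputes these quantities from the rounded table values. The dictionary layout, field names, and function names are illustrative assumptions, not an HPCC data format or tool.

```python
# Figures transcribed from the partition table above (CPU cores only).
# The layout and field names are illustrative assumptions.
PARTITIONS = {
    "Nocona":   dict(nodes=240, cores_per_node=128, mem_gb=512, peak_tflops=983, bench_tflops=804),
    "Quanah":   dict(nodes=467, cores_per_node=36,  mem_gb=192, peak_tflops=565, bench_tflops=485),
    "XLQuanah": dict(nodes=16,  cores_per_node=36,  mem_gb=256, peak_tflops=19,  bench_tflops=None),
    "Matador":  dict(nodes=20,  cores_per_node=40,  mem_gb=384, peak_tflops=280, bench_tflops=226),
    "Toreador": dict(nodes=11,  cores_per_node=16,  mem_gb=192, peak_tflops=287, bench_tflops=None),
    # The two 1536 GB high-memory Ivy nodes are left out of the node count here;
    # the table's single 40 TFLOPS figure for Ivy is used as given.
    "Ivy":      dict(nodes=50,  cores_per_node=20,  mem_gb=128, peak_tflops=40,  bench_tflops=None),
}


def derived(name):
    """Recompute the derived table rows for one partition."""
    p = PARTITIONS[name]
    bench = p["bench_tflops"]
    return {
        "total_cpu_cores": p["nodes"] * p["cores_per_node"],
        "mem_gb_per_core": round(p["mem_gb"] / p["cores_per_node"], 2),
        # Efficiency is only defined where a benchmark figure is listed.
        "efficiency_pct": None if bench is None else round(100 * bench / p["peak_tflops"], 1),
    }


def total_peak_tflops():
    """Sum the theoretical maxima across all partitions."""
    return sum(p["peak_tflops"] for p in PARTITIONS.values())


if __name__ == "__main__":
    for name in PARTITIONS:
        print(name, derived(name))
    print("Total theoretical peak (TFLOPS):", total_peak_tflops())
```

For Nocona this gives 30,720 cores and 4 GB per core, and the theoretical maxima sum to 2,174 TFLOPS, consistent with the approximately 2.2 PFLOPS quoted for the cluster. Because the inputs are rounded, the recomputed efficiencies can differ from the listed ones by a few tenths of a percentage point.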

Cluster Details

Texas Tech worked with our partners at Dell to implement one of the first large-scale university-based AMD EPYC™ Rome clusters in the world. This new cluster, named RedRaider, consists of three distinct parts: the Nocona CPU partition and the Matador and Toreador GPU partitions. To honor Texas Tech's official colors, the internal management nodes are named Scarlet and Midnight. The RedRaider cluster was commissioned in Fall 2020 and began production operation in January 2021.

The Quanah cluster was first commissioned in early 2017 and then was upgraded to twice its capacity in Q4 of 2017. To connect with West Texas history, the cluster is named for Quanah Parker, and its internal management node Charlie is named for Charles Goodnight. Its nodes are now operated as a partition of the RedRaider cluster. In 2022, 16 nodes were added as the XLQuanah partition to accommodate needs for long-running non-checkpointed workflows, with hardware similar to Quanah but optimized for single-node jobs.

The Ivy partitions are named for the Intel "Ivy Bridge" processor family and are primarily intended to support interactive computational use on the remaining portions of the former Hrothgar cluster. This capability is currently under development. A small number of nodes with 1.5 TB of memory each are also in operation to assess the need for further high-memory computing support. Please contact hpccsupport@ttu.edu if you require these capabilities.

Other, Specialty, and Off-Campus Clusters

The Texas Tech HPCC also manages access to a portion of the resources on Lonestar 6, operated by the Texas Advanced Computing Center in Austin. These resources can be made available through competitive allocations for specific projects on special request and serve as a "safety valve" for certain large-scale projects. The Texas Tech portion of Lonestar 6 corresponds to continuous annual use of approximately 38 nodes (4,860 cores, roughly 190 teraflops) of AMD EPYC 7763 Milan processors out of the cluster's 71,680-core total. The Lonestar 6 cluster was commissioned in 2022. Codes proposing to use it must be able to occupy entire 128-core nodes. Contact hpccsupport@ttu.edu for more details if you are interested in using this system.
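To make the size of this share concrete, the sketch below works through the arithmetic using only the figures quoted above (38 nodes, 128 cores per node, 71,680 cores in total). The function names are hypothetical, and the calculation assumes a 365-day year of continuous use. Note that 38 whole nodes correspond to 4,864 cores, in line with the approximate figure in the text.

```python
import math

CORES_PER_NODE = 128      # from the text: codes must occupy entire 128-core nodes
TTU_NODES = 38            # approximate Texas Tech share, from the text
CLUSTER_CORES = 71_680    # Lonestar 6 total, from the text
HOURS_PER_YEAR = 365 * 24


def share_summary(nodes=TTU_NODES):
    """Summarize a node share as cores, cluster fraction, and yearly hours."""
    cores = nodes * CORES_PER_NODE
    return {
        "cores": cores,
        "fraction_of_cluster": cores / CLUSTER_CORES,
        "node_hours_per_year": nodes * HOURS_PER_YEAR,
        "core_hours_per_year": cores * HOURS_PER_YEAR,
    }


def nodes_needed(requested_cores):
    """Whole nodes a job occupies, since partial nodes cannot be requested."""
    return math.ceil(requested_cores / CORES_PER_NODE)


if __name__ == "__main__":
    print(share_summary())
    # A job that wants 200 cores still occupies two full nodes (256 cores).
    print(nodes_needed(200), nodes_needed(200) * CORES_PER_NODE)
```

Under these assumptions the share is a little under 7% of the cluster, or 332,880 node-hours per year.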

Additionally, research group needs sometimes require the use of dedicated cluster resources. A dedicated cluster is a standalone cluster that is paid for by a specific Texas Tech faculty member or research group and housed in the HPCC machine rooms, which provide UPS power and cooling. An example of this type of cluster is Nepag, dedicated to health physics dose calculation codes. The HPCC also houses the DISCI cluster, used by researchers in the Department of Computer Science, and test resources, including the Redfish cluster, that support the National Science Foundation Cloud and Autonomic Computing Industry/University Cooperative Research Center. Realtime 2 is a dedicated private weather modeling cluster owned by the Atmospheric Sciences group. System administration for these clusters must be provided by the researcher or research group itself; consultation with HPCC staff is available on request during business hours. The HPCC accepts dedicated clusters for operation within its machine room facilities only on a space-available basis, and space is not guaranteed for such systems.

Cluster-Wide Storage

The HPCC operates a DataDirect Networks storage system capable of storing up to 6.9 petabytes of data. This storage space is configured using Lustre to provide a set of common file systems to the RedRaider, Quanah, and Hrothgar clusters, and is provisioned with a 1.0 petabyte backup system that can be used to protect research data. The file system uses a 200-Gbps storage network and a set of Lustre Network (LNet) routers to bridge traffic from the distinct cluster fabric networks into the storage fabric network. A set of central NFS servers also provides access to applications used across the clusters.
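As a rough guide to what the 200-Gbps storage network implies, the sketch below estimates a lower bound on transfer time for a given data size and the fraction of total capacity that the backup system could cover. It assumes decimal units (1 TB = 10^12 bytes) and an idealized link. The function name and the `efficiency` parameter are hypothetical, and real throughput will be lower, since it also depends on the disks, the LNet routers, and the slower fabric of the originating cluster.

```python
STORAGE_LINK_GBPS = 200.0   # storage network speed, from the text
TOTAL_CAPACITY_PB = 6.9     # total storage capacity, from the text
BACKUP_CAPACITY_PB = 1.0    # backup system capacity, from the text


def transfer_seconds(size_bytes, link_gbps=STORAGE_LINK_GBPS, efficiency=1.0):
    """Idealized time to move size_bytes over a link of link_gbps.

    efficiency is an assumed fraction (0 to 1] of line rate actually achieved;
    1.0 gives the best case, which real transfers will not reach.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9 * efficiency)


# Share of the full capacity that fits in the backup system.
backup_fraction = BACKUP_CAPACITY_PB / TOTAL_CAPACITY_PB

if __name__ == "__main__":
    one_tb = 1e12
    print("1 TB at line rate:", transfer_seconds(one_tb), "s")
    print("1 TB at 50% of line rate:", transfer_seconds(one_tb, efficiency=0.5), "s")
    print("Backup covers about", round(100 * backup_fraction, 1), "% of capacity")
```

At full line rate, 1 TB takes 40 seconds, so this is best read as a floor on transfer time rather than an expected figure.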

Researcher-Owned Compute and Storage Capacity

Researchers who need computing capacity beyond the generally available fair-share queue resources and are considering buying dedicated hardware or storage may wish to talk with us about purchasing researcher-specific CPU and/or GPU capacity within the main cluster resources. This capability allows researcher workloads to be scheduled at a "guaranteed class of service" level on an annual basis for pre-reserved CPU or GPU priority access or dedicated storage. Resources purchased in this manner may be shared by users belonging to a given research group. Additions of this nature are subject to space and infrastructure limitations. Please check with the HPCC staff and management for the current options.

Under this option, researchers may choose to purchase time on physical compute nodes, GPUs, and/or storage that are operated as part of the HPCC equipment, and will receive priority access equal to the purchased resource capacity. The HPCC will house, operate, and maintain the resources according to a formal operational agreement that typically lasts as long as the resources remain in warranty or are covered by a service contract paid for by the researcher. The warranty period is usually determined at the time the equipment is purchased and is typically five years, although extensions are possible. The HPCC also offers researchers the opportunity to purchase backup services for data housed on the HPCC storage system. Contact hpccsupport@ttu.edu for more details.

High Performance Computing Center

Phone
806.742.4350
Email
hpccsupport@ttu.edu
