Texas Tech University

HPCC Facilities and Equipment

For a description of HPCC software and other services, please see our software and services pages. For information on how to use the equipment described below, please consult our HPCC User Guides and HPCC Training pages.

Facilities

The High Performance Computing Center (HPCC) operates and maintains hardware located in three separate data centers. Our main production clusters, along with the central file system servers and several smaller systems, are in the Experimental Sciences Building I (ESB I) on the main TTU campus. Our secondary locations, containing isolated specialty clusters for specific research applications, are found in the Chemistry Building on the main campus and at Reese Center several miles west of the main campus.

RedRaider Cluster

Equipment

The HPCC resources include both generally available public systems and researcher-owned private or dedicated systems. Public nodes and storage are available to any TTU researcher or, upon special request, to external research partners of TTU faculty. These resources, including a standard per-user allocation of data storage for use on the clusters, are provided at no cost to TTU researchers. Access to processing time is allocated on a fair-share queueing basis for all generally available nodes. Dedicated computational resource priority on one or more CPU nodes or GPUs may be purchased on an annual or pre-paid basis by individual researchers or research groups. Dedicated resources owned by individual researchers or groups are administered by HPCC staff when operated as part of HPCC clusters.

Note: The HPCC does not provide support for clusters or equipment not housed in HPCC machine room facilities.
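
Access to the generally available nodes is through the batch scheduler's fair-share queues. The short sketch below illustrates what a request for public CPU time might look like. It is only a sketch: it assumes a Slurm-style scheduler and assumes that the partition names listed in the table below (for example, nocona) are used as queue names, neither of which is specified on this page, and my_mpi_app stands in for the user's own program. Consult the HPCC User Guides for the actual submission procedure.

    # Sketch only: build a batch-job script for the HPCC fair-share queues.
    # Assumptions (not stated on this page): a Slurm-style scheduler, and that
    # partition names from the table below (e.g. "nocona") serve as queue names.

    def make_job_script(partition: str, nodes: int, tasks_per_node: int,
                        walltime: str, command: str) -> str:
        """Return the text of a minimal Slurm-style batch script."""
        return "\n".join([
            "#!/bin/bash",
            "#SBATCH --job-name=example",
            f"#SBATCH --partition={partition}",             # e.g. the Nocona CPU partition
            f"#SBATCH --nodes={nodes}",
            f"#SBATCH --ntasks-per-node={tasks_per_node}",  # Nocona nodes have 128 cores
            f"#SBATCH --time={walltime}",
            "",
            command,
            "",
        ])

    if __name__ == "__main__":
        # One full Nocona node (128 cores) for one hour; "my_mpi_app" is a placeholder.
        script = make_job_script("nocona", nodes=1, tasks_per_node=128,
                                 walltime="01:00:00", command="srun ./my_mpi_app")
        with open("job.sh", "w") as fh:   # submit later with: sbatch job.sh
            fh.write(script)
        print(script)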

Primary Campus-Based Computing Clusters

The HPCC's primary cluster is RedRaider, consisting of the Nocona and Quanah CPU partitions, the Matador and Toreador GPU partitions, and the Ivy partition and associated Community Cluster portions of the older Hrothgar cluster, totaling 2.2 PFLOPS of raw peak computing power. An overview of the configuration of the primary cluster partitions is shown in the table below.

Partition                | Nocona                      | Quanah                | Matador                                | Toreador                               | Ivy
-------------------------+-----------------------------+-----------------------+----------------------------------------+----------------------------------------+------------------------
Type                     | CPU                         | CPU                   | GPU                                    | GPU                                    | Interactive
Total Nodes              | 240                         | 467                   | 20                                     | 11                                     | 100
Theoretical Peak         | 983 TFLOPS                  | 565 TFLOPS            | 280 TFLOPS                             | 287 TFLOPS                             | 80 TFLOPS
Benchmarked              | 804 TFLOPS                  | 485 TFLOPS            | 226 TFLOPS                             |                                        | N/A
OS                       | CentOS 8.1                  | CentOS 7.4            | CentOS 8.1                             | CentOS 8.1                             | CentOS 7.4
Manufacturer             | Dell                        | Dell                  | Dell                                   | Dell                                   | Dell
Node Model               | PowerEdge C6525             | PowerEdge C6320       | PowerEdge R740                         | PowerEdge R7525                        | PowerEdge C6220 II
Cooling                  | Liquid Cooled               | Air Cooled            | Air Cooled                             | Air Cooled                             | Air Cooled
Processor Make and Model | AMD EPYC™ 7702              | Intel Xeon E5-2695 v4 | Intel Xeon Gold 6242                   | AMD EPYC™ 7262                         | Intel Xeon E5-2670 v2
Processor Family         | Rome                        | Broadwell             | Cascade Lake                           | Rome                                   | Ivy Bridge
Cores/Processor          | 64                          | 18                    | 20                                     | 8                                      | 10
Cores/Node               | 128                         | 36                    | 40 CPU + 1,280 tensor + 10,240 CUDA    | 16 CPU + 1,296 tensor + 20,736 CUDA    | 20
Total Cores in Partition | 30,720                      | 16,812                | 800 CPU + 25,600 tensor + 204,800 CUDA | 528 CPU + 14,256 tensor + 228,096 CUDA | 2,000
GPU (if present)         | N/A                         | N/A                   | NVIDIA Tesla V100                      | NVIDIA Ampere A100                     | N/A
GPUs/Node                | 0                           | 0                     | 2                                      | 3                                      | 0
Total GPUs               | 0                           | 0                     | 40                                     | 33                                     | 0
Memory/Node              | 512 GB                      | 192 GB                | 384 GB                                 | 192 GB                                 | 64 GB
Memory/Core              | 4 GB                        | 5.33 GB               | 9.6 GB                                 | 12 GB                                  | 3.2 GB
High-Speed Fabric        | Mellanox HDR 200 InfiniBand | Intel Omni-Path       | Mellanox HDR 100 InfiniBand            | Mellanox HDR 100 InfiniBand            | Mellanox QDR InfiniBand
Fabric Speed             | 200 Gbps                    | 100 Gbps              | 100 Gbps                               | 100 Gbps                               | 40 Gbps
Topology                 | Non-Blocking Fat-Tree       | Non-Blocking Fat-Tree | Non-Blocking Fat-Tree                  | Non-Blocking Fat-Tree                  | Non-Blocking Fat-Tree
Ethernet                 | 25 GbE                      | 10 GbE                | 25 GbE                                 | 25 GbE                                 | 1 GbE
Efficiency               | 81.5%                       | 85.8%                 | 80.6%                                  |                                        | N/A
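
Two of the rows above are derived from the others: Memory/Core is Memory/Node divided by the CPU cores per node, and the 2.2 PFLOPS figure quoted earlier is the sum of the per-partition theoretical peaks. The short Python sketch below, which simply re-uses the numbers from the table (it is illustrative, not an HPCC-provided tool), reproduces both.

    # Illustrative only: reproduce two derived figures from the table above,
    # the Memory/Core column and the ~2.2 PFLOPS aggregate theoretical peak.

    partitions = {
        # name:     (CPU cores/node, memory/node in GB, theoretical peak in TFLOPS)
        "Nocona":   (128, 512, 983),
        "Quanah":   ( 36, 192, 565),
        "Matador":  ( 40, 384, 280),
        "Toreador": ( 16, 192, 287),
        "Ivy":      ( 20,  64,  80),
    }

    total_peak_tflops = 0
    for name, (cores, mem_gb, peak) in partitions.items():
        mem_per_core = mem_gb / cores   # e.g. 512 GB / 128 cores = 4 GB on Nocona
        total_peak_tflops += peak
        print(f"{name:9s} {mem_per_core:5.2f} GB/core, {peak:4d} TFLOPS peak")

    # 983 + 565 + 280 + 287 + 80 = 2,195 TFLOPS, i.e. roughly 2.2 PFLOPS
    print(f"Aggregate theoretical peak: {total_peak_tflops} TFLOPS (~2.2 PFLOPS)")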

Cluster Details

Texas Tech worked with our partners at Dell to implement one of the first large-scale university-based AMD EPYC™ Rome clusters in the world. This new cluster, named RedRaider, consists of three distinct parts: the Nocona CPU partition and the Matador and Toreador GPU partitions. To honor Texas Tech's official colors, the internal management nodes are named Scarlet and Midnight. The RedRaider cluster was commissioned in Fall 2020 and began production operation in January 2021.

The Quanah cluster was first commissioned in early 2017 and was then upgraded to twice its original capacity in Q4 of 2017. To connect with West Texas history, the cluster is named for Quanah Parker, and its internal management node, Charlie, is named for Charles Goodnight. Its nodes are now operated as a partition of the RedRaider cluster.

The Ivy partition is named in reference to the Intel "Ivy Bridge" processor family and is primarily devoted to supporting interactive computational use. Other remaining portions of the former Hrothgar cluster include the serial and community-cluster nodes described below.

Other, Specialty, and Off-Campus Clusters

The HPCC also operates a mix of older and specialty clusters and equipment. The community cluster (22 nodes of various researcher-owned processor and memory configurations, operated as part of the Ivy InfiniBand fabric) and the serial resources (310 nodes, each with dual 6-core Westmere processors and 24 GB of memory, operated as standalone nodes with no parallel fabric) are the remaining special-purpose portions of the Hrothgar cluster.

A dedicated cluster is a standalone cluster paid for by a specific TTU faculty member or research group and housed in the HPCC machine rooms, which provide UPS power and cooling. An example of this type of cluster is Nepag, dedicated to health physics dose calculation codes. The HPCC also houses the DISCI cluster, used by researchers in the Department of Computer Science, and test resources, including the Redfish cluster, supporting the National Science Foundation Cloud and Autonomic Computing Industry/University Cooperative Research Center. Realtime 2 is a dedicated private weather modeling cluster owned by the Atmospheric Sciences group. For these clusters, system administration support is provided by the researcher or research group itself, with consultation from HPCC staff available on request during business hours. The HPCC accepts dedicated clusters for operation within its machine room facilities only on a space-available basis; space is not guaranteed for such systems.

Additionally, TTU has access to resources on Lonestar 5, operated by the Texas Advanced Computing Center in Austin. These resources can be made available through competitive allocations for specific projects on special request and serve as a "safety valve" for certain large-scale projects. The TTU portion corresponds to approximately 1,600 cores (roughly 64 TFLOPS) of Intel Haswell processors out of the cluster's 30,048-core total. Lonestar 5 was commissioned in 2015 and is currently at the end of its service life; contact hpccsupport@ttu.edu for more details if you are interested in using this system during the remainder of its available usage period.

Cluster-Wide Storage

The HPCC operates a DataDirect Networks (DDN) storage system capable of providing storage for up to 6.9 petabytes of data. This storage space is configured using Lustre to provide a set of common file systems to the RedRaider, Quanah, and Hrothgar clusters, and is provisioned with a 1.0-petabyte backup system that can be used to protect research data. The file system uses a 100 Gbps storage network and a set of Lustre Networking (LNet) routers to bridge traffic from the distinct cluster fabric networks into the storage fabric network. A set of central NFS servers also provides access to applications used across each of the clusters.

Researcher-Owned Compute and Storage Capacity

Researchers who need additional computing capacity beyond the generally available fair-share queue resources, and who are considering buying dedicated hardware or storage, may wish to talk with us about purchasing researcher-specific CPU capacity out of the main cluster resources. This capability allows researcher workloads to be scheduled at a "guaranteed class of service" level for CPU or storage. Purchased CPU time or storage space may be shared by users belonging to a given research group. Additions of this nature are subject to space and infrastructure limitations, so please check with HPCC staff and management for the current options.

Under this option, researchers may purchase time on physical compute nodes and/or storage that are operated as part of the HPCC equipment, and they will receive priority access equal to the purchased resource capacity. The HPCC will house, operate, and maintain the resources according to a formal operational agreement that typically lasts as long as the equipment is covered by a researcher-funded service contract or remains under warranty. The warranty period is usually determined at the time of purchase and is typically five years, although extensions are possible. The HPCC also offers researchers the opportunity to purchase backup services for data housed on the HPCC storage system. Contact hpccsupport@ttu.edu for more details.
