Texas Tech University

HPCC Facilities and Equipment

A description of HPCC software and other services is provided separately. For information on how to use the equipment described below, please consult our HPCC User Guides and HPCC Training pages.

Facilities

The High Performance Computing Center (HPCC) operates and maintains hardware located in three separate data centers. Our main production clusters, along with the central file system servers and several smaller systems, are in the Experimental Sciences Building I (ESB I) on the main TTU campus. Our secondary locations, containing isolated specialty clusters for specific research applications, are found in the Chemistry Building on the main campus and at Reese Center several miles west of the main campus.

RedRaider Cluster

Equipment

The HPCC resources include both generally available public systems and researcher-owned private or dedicated systems. Public nodes and storage are available to any TTU faculty member or research grant leader, to TTU students working with them, and, upon special request, to external research partners of TTU faculty. These resources, including a standard per-user allocation of storage for data used on the clusters, are provided at no cost to TTU faculty members, research grant leaders, and the students and research partners working with them; processing time on all generally available nodes is allocated on a fair-share queueing basis.

For those with large-scale needs, dedicated priority on one or more CPU nodes or GPUs, or storage beyond the base allocation, may be purchased on an annual or pre-paid basis by individual faculty members, research grant leaders, or research groups. Dedicated resources owned by individual researchers or groups are administered by HPCC staff when operated as part of HPCC clusters.

Note: The HPCC does not provide support for clusters or equipment not housed in HPCC machine room facilities.

Primary Campus-Based Computing Clusters

The HPCC's primary cluster is RedRaider, consisting of the Nocona and Quanah/XLQuanah CPU partitions, the Matador and Toreador GPU partitions, and the Ivy interactive and high-memory nodes, totaling approximately 2.2 PFLOPS of raw peak computing power. An overview of the configuration of each primary cluster partition is shown in the table below.

Partition | Nocona | Quanah / XLQuanah | Matador | Toreador | Ivy
Type | CPU | CPU | GPU | GPU | Auxiliary CPU*
Total Nodes | 240 | 467 / 16 | 20 | 11 | 50 / 2
Theoretical Max | 983 TFLOPS | 565 TFLOPS / 19 TFLOPS | 280 TFLOPS | 287 TFLOPS | 40 TFLOPS
Benchmarked | 804 TFLOPS | 485 TFLOPS / N/A | 226 TFLOPS | N/A | N/A
OS | CentOS 8.1 | CentOS 7.4 / CentOS 8.1 | CentOS 8.1 | CentOS 8.1 | Rocky Linux 8.5 / CentOS 8.1
Manufacturer | Dell | Dell | Dell | Dell | Dell
Node Model | PowerEdge C6525 | PowerEdge C6320 | PowerEdge R740 | PowerEdge R7525 | PowerEdge C6220 II
Cooling | Liquid Cooled | Air Cooled | Air Cooled | Air Cooled | Air Cooled
Processor Make and Model | AMD EPYC™ 7702 | Intel Xeon E5-2695 v4 | Intel Xeon Gold 6242 | AMD EPYC™ 7262 | Intel Xeon E5-2670 v2
Family | Rome | Broadwell | Cascade Lake | Rome | Ivy Bridge
Cores/Processor | 64 | 18 | 20 | 8 | 10
Cores/Node | 128 | 36 | 40 CPU + 1,280 tensor + 10,240 CUDA | 16 CPU + 1,296 tensor + 20,736 CUDA | 20
Total Cores in Partition | 30,720 | 16,812 / 576 | 800 CPU + 25,600 tensor + 204,800 CUDA | 528 CPU + 14,256 tensor + 228,096 CUDA | 1,000 / 40
GPU (if present) | N/A | N/A | NVIDIA Tesla V100 | NVIDIA Ampere A100 | N/A
GPUs/Node | 0 | 0 | 2 | 3 | 0
Total GPUs | 0 | 0 | 40 | 33 | 0
Memory/Node | 512 GB | 192 GB / 256 GB | 384 GB | 192 GB | 128 GB / 1,536 GB
Memory/Core | 4 GB | 5.33 GB / 7.11 GB | 9.6 GB | 12 GB | 6.4 GB / 76.8 GB
High-Speed Fabric | Mellanox HDR 200 InfiniBand | Intel OmniPath / Mellanox FDR InfiniBand | Mellanox HDR 100 InfiniBand | Mellanox HDR 100 InfiniBand | Mellanox FDR InfiniBand
Fabric Speed | 200 Gbps | 100 Gbps / 56 Gbps | 100 Gbps | 100 Gbps | 56 Gbps
Topology | Non-Blocking Fat-Tree | Non-Blocking Fat-Tree / Partitioned | Non-Blocking Fat-Tree | Non-Blocking Fat-Tree | Up to 2:1 Oversubscribed Fat-Tree
Ethernet | 25 GbE | 10 GbE | 25 GbE | 25 GbE | 1 GbE
Efficiency | 81.5% | 85.8% / N/A | 80.6% | N/A | N/A

* Auxiliary CPUs are used to support workflow management, interactive use, and specialty nodes such as high-memory instances. At present, two Ivy Bridge-class nodes with 1,536 GB of memory each are available in the himem-ivy partition for use cases that require high memory per node.
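
As a rough cross-check of the figures above, the theoretical peak of a CPU partition is approximately cores × clock rate × double-precision FLOPs per core per cycle. The short Python sketch below reproduces the Nocona and Quanah values and sums the listed theoretical maxima to recover the approximately 2.2 PFLOPS quoted for the cluster as a whole; the base clock rates (2.0 GHz for the EPYC 7702, 2.1 GHz for the Xeon E5-2695 v4) and the figure of 16 FLOPs per core per cycle are assumptions not given in the table.

    # Back-of-the-envelope check of the "Theoretical Max" figures above.
    # Assumed values (not taken from the table): base clock rates and
    # 16 double-precision FLOPs per core per cycle (AVX2 with FMA).

    def peak_tflops(cores, clock_ghz, flops_per_cycle=16):
        """Theoretical peak in TFLOPS: cores * clock (GHz) * FLOPs per cycle."""
        return cores * clock_ghz * flops_per_cycle / 1000.0

    print(peak_tflops(30_720, 2.0))   # Nocona: ~983 TFLOPS (table: 983 TFLOPS)
    print(peak_tflops(16_812, 2.1))   # Quanah: ~565 TFLOPS (table: 565 TFLOPS)

    # Summing the table's theoretical maxima across the partition columns
    # gives the "approximately 2.2 PFLOPS" quoted for RedRaider overall.
    listed_tflops = [983, 565, 19, 280, 287, 40]
    print(sum(listed_tflops) / 1000.0)   # ~2.17 PFLOPS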

Cluster Details

Texas Tech worked with our partners at Dell to implement one of the first large-scale university-based AMD EPYC™ Rome clusters in the world. This new cluster, named RedRaider, consists of three distinct parts: the Nocona CPU partition and the Matador and Toreador GPU partitions. To honor Texas Tech's official colors, the internal management nodes are named Scarlet and Midnight. The RedRaider cluster was commissioned in Fall 2020 and began production operation in January 2021.

The Quanah cluster was first commissioned in early 2017 and then was upgraded to twice its capacity in Q4 of 2017. To connect with West Texas history, the cluster is named for Quanah Parker, and its internal management node Charlie is named for Charles Goodnight. Its nodes are now operated as a partition of the RedRaider cluster. In 2022, 16 nodes were added as the XLQuanah partition to accommodate needs for long-running non-checkpointed workflows, with hardware similar to Quanah but optimized for single-node jobs. 

The Ivy partitions are named in reference to the Intel "Ivy Bridge" processor family and are primarily intended to support interactive computational use on remaining portions of the former Hrothgar cluster. This capability is currently under development. A small number of nodes with 1.5 TB of memory each are also in operation to assess the need for future high-memory computing support. Please contact hpccsupport@ttu.edu if you require these capabilities.

Other, Specialty, and Off-Campus Clusters

The Texas Tech HPCC also manages access to a portion of the resources on Lonestar 6, operated by the Texas Advanced Computing Center in Austin. This allocation can be made available for competitive, project-specific allocations on special request and serves as a "safety valve" for certain large-scale projects. The Texas Tech portion of Lonestar 6 corresponds to annual continuous use of approximately 38 nodes (4,860 cores), roughly 190 teraflops, of AMD EPYC 7763 Milan processors out of the cluster's 71,680-core total. Lonestar 6 was commissioned in 2022, and codes proposing to use it must be able to occupy entire 128-core nodes. Contact hpccsupport@ttu.edu for more details if you are interested in using this system.
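
The same back-of-the-envelope estimate reproduces the roughly 190 teraflops quoted for the Texas Tech share; the 2.45 GHz base clock assumed here for the EPYC 7763 and the 16 double-precision FLOPs per core per cycle are assumptions not stated above.

    # Rough estimate of the TTU share of Lonestar 6. Only the core count
    # comes from the text above; clock and FLOPs/cycle are assumed.
    cores = 4_860                    # approximately 38 nodes x 128 cores
    clock_ghz = 2.45                 # assumed EPYC 7763 base clock
    flops_per_cycle = 16             # assumed AVX2 with FMA, double precision
    print(cores * clock_ghz * flops_per_cycle / 1000.0)   # ~190 TFLOPS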

Additionally, research group needs sometimes require the use of dedicated cluster resources. A dedicated cluster is a standalone cluster, paid for by a specific Texas Tech faculty member or research group, that is housed in the HPCC machine rooms to take advantage of UPS power and cooling. An example of this type of cluster is Nepag, dedicated to health physics dose calculation codes. The HPCC also houses the DISCI cluster, used by researchers in the Department of Computer Science, and test resources, including the Redfish cluster, supporting the National Science Foundation Cloud and Autonomic Computing Industry/University Cooperative Research Center. Realtime 2 is a dedicated private weather-modeling cluster owned by the Atmospheric Sciences group. System administration support for these clusters must be provided by the researcher or research group itself, with consultation from HPCC staff available on request during business hours. Dedicated clusters are accepted for operation within HPCC machine room facilities only on a space-available basis; space is not guaranteed for such systems.

Cluster-Wide Storage

The HPCC operates a DataDirect Network storage system capable of providing storage for up to 6.9 petabytes of data. This storage space is configured using Lustre to provide a set of common file systems to the RedRaider, Quanah and Hrothgar clusters, and is provisioned with a 1.0 petabyte backup system that can be used to protect research data. The file system uses a 200-Gbps storage network and a combination of Lustre Network (LNet) routers to bridge traffic from the distinct cluster fabric networks into the storage fabric network. A set of central NFS servers also provides access to applications used across each of the clusters.
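
For a sense of scale, the sketch below estimates the line-rate time needed to move the full 6.9 PB file system, or the 1.0 PB backup capacity, across a single 200 Gbps link. It is purely illustrative: it assumes decimal petabytes and ignores protocol overhead, metadata traffic, and disk or server throughput limits, none of which are specified in this section.

    # Illustrative only: wire-time to move a given capacity over a 200 Gbps
    # link at line rate, ignoring protocol and storage-hardware overheads.
    LINK_GBPS = 200

    def transfer_days(petabytes, link_gbps=LINK_GBPS):
        bits = petabytes * 1e15 * 8            # decimal PB -> bits
        seconds = bits / (link_gbps * 1e9)     # line-rate transfer time
        return seconds / 86_400                # seconds -> days

    print(transfer_days(6.9))   # full 6.9 PB file system: ~3.2 days
    print(transfer_days(1.0))   # 1.0 PB backup capacity: ~0.46 days (~11 hours)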

Researcher-Owned Compute and Storage Capacity

Researchers who need additional computing capacity beyond the generally available fair-share queue resources, and who are considering buying dedicated hardware or storage, may wish to talk with us about purchasing researcher-specific CPU and/or GPU capacity out of the main cluster resources. This option allows researcher workloads to be scheduled at a "guaranteed class of service" level, with CPU or GPU priority access or dedicated storage reserved on an annual basis. Resources purchased in this manner may be shared by users belonging to a given research group. Additions of this nature are subject to space and infrastructure limitations; please check with HPCC staff and management for the current options.

In this option, researchers may purchase time on physical compute nodes, GPUs, and/or storage that are operated as part of the HPCC equipment, and they receive priority access equal to the purchased resource capacity. The HPCC houses, operates, and maintains these resources under a formal operational agreement that typically lasts as long as the equipment is covered by a researcher-funded service contract or remains under warranty. The warranty period is usually determined at the time of purchase and is typically five years, although extensions are possible. The HPCC also offers researchers the option to purchase backup services for data housed on the HPCC storage system. Contact hpccsupport@ttu.edu for more details.
