HPCC RedRaider OS and Software Update 2025
Nocona Partition
How to access the upgraded nodes in the Nocona partition?
By default, all job submissions to the Nocona partition will continue to be queued
for the pool of nodes with the old CentOS and software packages. In order to access
the reserved nodes with the new Rocky Linux 9 operating system and environment, either
via interactive sessions or batch job submissions, you will need to specify the “nocona_rocky9”
reservation explicitly as follows:
- For Interactive sessions:
$ interactive -p nocona -r nocona_rocky9 [<other options>]
- For batch job submissions (sbatch):
#!/bin/bash
#SBATCH -J job_name
#SBATCH -N 1
#SBATCH --ntasks-per-node=128
#SBATCH -p nocona
#SBATCH --reservation=nocona_rocky9
[<other slurm options>]
Please note that Slurm will treat jobs submitted to the “nocona_rocky9” reserved pool
in the same way as all other regular jobs in terms of maximum wall time per job and
fair-share priority calculation.
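If you would like to confirm that your jobs are landing in the reserved pool, standard Slurm commands such as the following can help (a general sketch; the node list and output format depend on the cluster configuration and are not part of this announcement):
# Show the nodes and time window covered by the nocona_rocky9 reservation
$ scontrol show reservation nocona_rocky9
# Check the state of a submitted job (replace <jobid> with your job ID)
$ squeue -j <jobid>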
How to adjust your jobs and software packages for the updated Nocona software environment
before October 6, 2025?
- All C/C++/Fortran codes and software packages installed locally in your account will
need to be recompiled against the provided versions of the GCC and Intel oneAPI (formerly
known as Intel Parallel Studio) compilers and/or OpenMPI. Otherwise, running the same
code in the new OS environment may fail or produce unexpected results (see the
recompilation sketch after this list).
- R, Python, and Conda environments are expected to remain functional as before, except
in rare cases where Python or R modules were compiled against the local GCC or OS
library packages and may need to be rebuilt.
- To view the full list of available cluster-wide software packages and compilers, visit
the “HPCC RedRaider Cluster Software Packages” webpage by navigating to the HPCC Website
and selecting “RedRaider Cluster -> Software Packages” from the menu, or simply clicking
on This Link.
- We highly encourage HPCC researchers, students, and system users to leverage this
four-week transition period to adjust their jobs and codes to the new operating system,
software packages, and the latest compiler versions provided for each partition.
- HPCC staff are available during this transition period to work closely with researchers through the technical support channel and ensure a smooth transition for everyone.
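As a hedged example of the recompilation step mentioned above, the sketch below rebuilds a simple MPI code against the cluster-provided GCC and OpenMPI modules. The module names are shown without versions and the source/executable names (my_mpi_app.c, my_app.c) are placeholders for illustration; run “module avail” on the upgraded nodes to see the exact versions provided.
# Load the cluster-provided compiler and MPI modules
# (check "module avail" for the actual versions on the Rocky Linux 9 nodes)
$ module load gcc openmpi
# Recompile an MPI code from source against the new toolchain
$ mpicc -O2 -o my_mpi_app my_mpi_app.c
# For a plain C/C++/Fortran code without MPI
$ gcc -O2 -o my_app my_app.c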
Matador Partition
The “matador” partition, along with its test partition, “gpu-build,” has been upgraded
to Rocky Linux 9 and configured with the latest Nvidia GPU driver to support newer
versions of the CUDA toolkit and a wide range of GPU-intensive scientific and AI/ML
software applications.
As before, the “gpu-build” test partition includes one GPU worker node with an Nvidia
V100 GPU device and is configured for multiple simultaneous logins to provide an interactive
environment for testing and developing GPU and CUDA applications. Please continue
using the “interactive” command from the login nodes as before to access this partition.
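For example, an interactive session on the test partition can be requested with the same command shown above for Nocona (the partition name comes from this announcement; any other options are up to your workflow):
$ interactive -p gpu-build [<other options>]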
See the commands below to list the currently available modules for the gpu-build and
matador partitions. These modules include a set of pre-built and containerized applications,
along with tools for compiling your own CUDA code if required, or for testing your code
before submitting a job to the Matador partition.
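For instance, from a gpu-build or Matador session you can list and search the module tree with the standard Lmod commands (a general sketch; the exact modules you see depend on the partition):
# List all modules visible on the current node
module avail
# Search the module tree for a specific package, e.g., CUDA
module spider cuda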
Please review the following points regarding the changes and GCC/CUDA compatibility in the updated software environment for these partitions before resuming your job submissions.
- The current and future Nvidia GPU driver updates on Matador nodes support all versions
of the CUDA toolkit up to version 13.0. Please note, however, that CUDA versions beyond
version 12 (i.e., CUDA 13.0 and later) have discontinued support for Nvidia V100 and
older GPU devices. Therefore, CUDA 12.x.x is the latest supported version on the Matador
and gpu-build nodes.
- No default version of CUDA is loaded for interactive sessions or batch job submissions
to the Matador nodes. To access the cluster-wide CUDA toolkit packages, you will need
to load the corresponding Lmod modules in your job submission scripts or after starting
an interactive session. Currently, the following CUDA versions are available on these
partitions:
module load cuda/11.8.0
module load gcc/12.2.0 cuda/12.3.2
module load gcc/13.2.0 cuda/12.9.0
- The HPCC does not support CUDA versions earlier than 11.8 or versions beyond 12.x
on the Matador partition. However, HPCC account holders may install any version of
CUDA not listed above in their accounts under the Home or Work areas, or through
Conda/MiniForge channels for Python packages, if required. Please keep in mind, however,
that CUDA versions 13.0 and later are not supported on the Matador partition.
- If you use compiled C/C++/Fortran CUDA codes installed locally by you or your research
group, please recompile them with the latest compatible versions of CUDA and GCC before
resubmitting the jobs that depend on those software packages; a sample build-and-submit
sketch follows this list. Failure to do so may result in job failures or unexpected
behavior.
- Most Python modules and CUDA packages installed in Conda environments are expected to continue working on the updated Matador nodes, with a few exceptions. However, we strongly recommend testing your CUDA/GPU Python scripts in interactive sessions to ensure they work correctly before submitting long-running jobs to the Matador partition.
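To illustrate the points above, here is a hedged sketch of a Matador batch script that loads one of the GCC/CUDA module pairs listed earlier, rebuilds a local CUDA code, and runs it on one GPU. The job name, file names, and GPU request syntax (--gres=gpu:1) are assumptions for illustration, not taken from this announcement; adjust them to your own workflow and the partition's actual configuration.
#!/bin/bash
#SBATCH -J cuda_job                # placeholder job name
#SBATCH -N 1
#SBATCH -p matador
#SBATCH --gres=gpu:1               # request one GPU (assumed gres syntax)

# Load a compatible GCC/CUDA module pair from the list above
module load gcc/12.2.0 cuda/12.3.2

# Recompile the local CUDA code against the loaded toolkit
# (sm_70 targets the V100 architecture; my_kernel.cu is a placeholder)
nvcc -O2 -arch=sm_70 -o my_kernel my_kernel.cu

# Run the rebuilt binary
./my_kernel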
To view the full list of available cluster-wide software packages and compilers, please
visit the “HPCC RedRaider Cluster Software Packages” webpage by navigating to the
HPCC Website and selecting “RedRaider Cluster -> Software Packages” from the menu,
or simply clicking on This Link.
As always, please do not hesitate to contact the HPCC support team via hpccsupport@ttu.edu if you have any questions or need assistance with software installation or adjusting
your job submissions.
High Performance Computing Center
Phone: 806.742.4350
Email: hpccsupport@ttu.edu