Software Environment Setup
Modules are used to set up a user's environment. Typically, users initialize their environment at login by setting environment information for every application they will reference during the session. The HPCC uses the Environment Modules package to simplify shell initialization and to let users easily modify their environment during a session with modulefiles.
Modules are used to load in the necessary environment for different applications available at the HPCC. Because some applications may rely on others in order to function, you may need to load the modules for the prerequisite applications before you can load the application you wish to use. Here are the common commands and a short description of each:
# module avail : lists the available modules. This list will update as more modules are added
# module list : lists all modules currently loaded
# module load module_name : loads a module
# module unload module_name : unloads a module
# module spider keyword : finds a module and shows what prerequisites it needs
# module purge : unloads all modules the user has loaded
# module swap loaded_module new_module : swaps modules and their dependencies
# module help : prints a help menu to the screen
The "module avail" command lists the modules that are currently available, which depends on the modules you already have loaded. As you load additional modules using "module load", the list of available modules will expand to show modules that require those modules as prerequisites. In the "module avail" output, you may also notice that some modules are marked (D) or (L), which stand for default and loaded respectively. The default module is the one that will load if you leave off the version information, and loaded modules are any modules you have already loaded with "module load".
The "module spider" command can be used to list all possible modules available to HPCC users or to search for all possible modules that contain a particular keyword. If you run the command "module spider" then you would get a list of every possible application. However, if you were looking for a particular application, such as singularity, you could use the command "module spider singularity" to view information about all applications that contain the name singularity.
The "module spider" command can also be used to determine which modules are required as prerequisites for the module you are looking at. Using singularity as an example, running the command "module spider singularity/2.3.1-gnu" will state that you need to first load "gnu/5.4.0" before you can load singularity.
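As a rough sketch of the lookup-then-load sequence described above, the commands can be grouped in a small shell function. The function name load_singularity is hypothetical; the module names and versions are the ones quoted in the example above and may differ on the system you are using.

```shell
# Hypothetical helper: inspect a module's prerequisites, then load them in order.
load_singularity() {
    # Show details for this exact version, including any modules to load first.
    module spider singularity/2.3.1-gnu
    # Load the prerequisite reported by spider, then the application itself.
    module load gnu/5.4.0
    module load singularity/2.3.1-gnu
}
```

Calling load_singularity in a shell where the module command is available runs the three steps in order.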
Adding a module to your environment is done using the "module load" command. For instance, if you wanted to add the Intel compilers to your path, you would use the command "module load intel". This will add the Intel compilers to your path and also cause "module avail" to update and show you additional modules you can now load.
Once the Intel module has been loaded, more modules can be loaded, for example impi with "module load impi".
With both loaded, "module avail" shows all applications that have been compiled using Intel and Intel MPI. Running "module swap impi openmpi" will change the MPI implementation from Intel MPI to OpenMPI and will also update the list of available modules.
To save typing, you can load several modules in one line, as follows:
module load intel impi lammps
While you can add module commands to your ~/.bashrc, we recommend that you add them to your job scripts. Adding them to a job script will set up the environment for that job only but will ensure that everything needed to run that job is encapsulated in that job script - making the script more portable and easier to debug if issues arise.
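One way to keep the environment setup inside a job script is sketched below. The function name setup_job_environment is hypothetical, the scheduler directives are left out because they depend on the batch system, and the module names are the ones from the one-line example above.

```shell
#!/bin/bash
# Scheduler directives for your batch system would go here (omitted in this sketch).

# Hypothetical helper: build the job's environment from a clean slate.
setup_job_environment() {
    module purge                    # drop any modules that are already loaded
    module load intel impi lammps   # load the compiler, MPI and application modules
    module list 2>&1                # record the loaded modules in the job output
}
```

A job script would call setup_job_environment once, before the commands that run the application. The 2>&1 is there because module tools commonly print their listings to standard error.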
Unloading or removing a module from your environment can be done using the "module unload" command. This command will remove the module from your environment and deactivate any modules that had that module as a prerequisite. For instance, if we had loaded the intel, impi and lammps modules and then unloaded the intel module, we would expect the intel module to be unloaded and the impi and lammps modules to become inactive, because they both require the intel module to be loaded.
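The unload scenario above could be tried with a sequence like the following. This is only a sketch: the function name demo_unload is hypothetical, and the exact wording of the resulting listing depends on the module tool, so no output is shown.

```shell
# Hypothetical helper reproducing the unload scenario described above.
demo_unload() {
    module load intel impi lammps   # start with all three modules loaded
    module unload intel             # remove the compiler module
    module list 2>&1                # inspect which modules remain loaded or inactive
}
```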
Swapping a module in your environment with another module can be done using the "module swap" command. This command will remove the first module from your environment and replace it with the second module specified. This is useful for switching the compiler or MPI module you have loaded. The swap function will also attempt to replace any modules that used the swapped module as a prerequisite with identical modules that use the new module as a prerequisite. For instance, if we had loaded the intel, impi and nwchem modules and then wanted to switch from impi to openmpi, we would expect impi to be unloaded and openmpi to be loaded. We would also expect the impi version of nwchem to be replaced with the openmpi version, if one exists.
In that case, nwchem is reloaded with the openmpi version in place of the impi version.
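The swap scenario can be sketched as the following sequence of commands. The function name demo_swap_mpi is hypothetical, and whether nwchem is actually reloaded depends on an openmpi build of nwchem being installed, so the listing is left for you to inspect rather than shown here.

```shell
# Hypothetical helper reproducing the swap scenario described above.
demo_swap_mpi() {
    module load intel impi nwchem   # Intel compilers, Intel MPI and nwchem
    module swap impi openmpi        # replace Intel MPI with OpenMPI
    module list 2>&1                # check which nwchem build is now loaded
}
```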
Below are some tips and recommendations to follow when using modules.
- Always try to include the version number of a module in the module load command.
  - For instance, use module load nwchem/6.6-intel instead of just module load nwchem.
  - Keep in mind that the Default Module (D) will always point to the most recent version of a package. Re-running a job submission script that uses module load nwchem may therefore run a different version of nwchem than you ran previously, which may cause unintended consequences.
  - Including the version makes it easier to track which experiments and jobs used which versions of different software, which will in turn make writing your research paper's "methods" section easier.
- Try to keep all of your module load commands as part of your job submission scripts.
  - This makes debugging and switching between experiments easier.
  - It prevents issues where you change which modules you have loaded and then try to re-run an older job submission script.
  - It gives you and future researchers a way of tracking which software was used for each experiment. This makes adding the version number (discussed above) even more important!
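To help follow the version-number recommendation, a job script could check its module names before loading them. The sketch below is a heuristic and not part of the module tool: the function name all_versions_pinned is hypothetical, and it simply treats a name of the form name/version as pinned.

```shell
# Hypothetical check: succeed only if every module name includes an explicit version.
all_versions_pinned() {
    local spec
    for spec in "$@"; do
        case "$spec" in
            ?*/?*) ;;   # name/version form, e.g. nwchem/6.6-intel
            *) echo "unpinned module: $spec" >&2
               return 1 ;;
        esac
    done
    return 0
}
```

For example, all_versions_pinned nwchem/6.6-intel && module load nwchem/6.6-intel loads the module only when a version is given, while all_versions_pinned nwchem reports the unpinned name and returns a non-zero status.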