Installing Python packages locally
Setting up Python for your own use involves two steps: first, choose a Python distribution and set up the environment using modules; second, add any custom packages to your environment locally. Both steps are discussed below.
Choosing a Python distribution and setting up the environment
As with all user-selectable, system-supplied software at the HPCC, modules are used to select the Python distribution and set up a user's environment. Typically, users initialize their environment at login by setting environment information for every application they will use during the session. The HPCC uses the Environment Modules package to simplify shell initialization and to let users easily modify their environment during the session with module files.
For more information about how to load and maintain your software environment using modules, please refer to the user guide "Software Environment Setup".
To select a distribution and set the environment variables for Python, first check which versions of Python are available using the "module spider python" command. This command returns a description of the software and the versions currently available through the modules system.
At this time, the default version of Python loaded by "module load intel python" or "module load gnu python" is Python 2. To load the most recent system-installed version of Python, run "module load intel python3" or "module load gnu python3". You can then run the "module list" command to verify that the Python module has been loaded successfully. Use one of the following sets of commands:
# To load the Intel-compiled version of Python, run one of the following:
module load intel python # for the older version (Python 2)
module load intel python3 # for the newer version (Python 3)
# Check which modules are loaded
module list
# To load the GNU-compiled version of Python, run one of the following:
module load gnu python # for the older version (Python 2)
module load gnu python3 # for the newer version (Python 3)
# Check which modules are loaded
module list
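Once a module is loaded, the interpreter itself can confirm which Python is active. The following is a minimal sketch using only the standard library; the function name describe_interpreter is hypothetical and is not an HPCC-supplied tool.

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter running this code."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return {
        "executable": sys.executable,    # path of the running interpreter
        "version": version,              # major.minor.micro as a string
        "major": sys.version_info[0],    # 2 or 3
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("Interpreter: {}".format(info["executable"]))
    print("Version: {}".format(info["version"]))
    if info["major"] < 3:
        print("Python 2 is active; load a python3 module for the newer version.")
```

Saving this to a file and running it with "python" and then with "python3" should show whether the two commands resolve to different interpreters in your session.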
In general, the Intel distribution of Python is more fully featured and includes tools for improving multi-threaded code, which can be important when optimizing the performance of long-running Python applications. The GNU distribution may be more portable if you move your application to other systems, as it does not require a commercial license. Most Python code will work equally well with either distribution. Alternatively, you can use the Conda package manager, as described below, to install a custom version of Python and manage packages in user-controlled environments.
If you are unsure which option to choose, a good starting point is Python 3 in the Intel distribution, which can be selected using "module load intel python3".
Installing packages locally to supplement or replace a system distribution
By default, installing Python packages into a system distribution must be performed by the 'root' user. However, most Python package installers and managers also allow you to install a package into your HOME folder to supplement the features of these distributions.
Alternatively, you can use Anaconda or Miniconda through the Conda package manager, which lets you use a newer distribution of Python than the system defaults. Conda gives you full control over the Python setup for your code, without any dependency on the system Python versions.
Below you will find instructions for installing Python packages locally using the following methods: pip, easy_install, setup.py, and Conda.
Pip is the tool recommended by the PyPA (Python Packaging Authority) for installing Python packages, and many Python packages can be installed with it. However, if you attempt to install a package without root access, pip will simply fail and give no hint that the package can be installed without root. To bypass the need for root access, you can instruct pip to install into your HOME folder by adding the --user option, as shown below:
For Python 2: pip install --user <package>
For Python 3: pip3 install --user <package>
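To see where a --user install ends up for the interpreter you have loaded, you can ask Python's standard site module. This is a minimal sketch; the helper names user_install_paths and on_search_path are hypothetical, and the exact directories depend on your Python version and on whether the PYTHONUSERBASE environment variable is set.

```python
import site
import sys


def user_install_paths():
    """Return the per-user install locations for the running interpreter."""
    return {
        "user_base": site.getuserbase(),           # root of the per-user install tree
        "user_site": site.getusersitepackages(),   # directory that receives --user packages
        "enabled": bool(site.ENABLE_USER_SITE),    # False when the user site is disabled
    }


def on_search_path(directory):
    """Report whether a directory is on sys.path and therefore importable from."""
    return directory in sys.path


if __name__ == "__main__":
    paths = user_install_paths()
    for key in ("user_base", "user_site", "enabled"):
        print("{}: {}".format(key, paths[key]))
    print("on sys.path: {}".format(on_search_path(paths["user_site"])))
```

Note that the user site directory is only added to sys.path if it exists when Python starts, so it may be reported as absent until the first --user install has been done.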
Easy_install is another commonly used tool for installing Python packages and is a supported installation method for many packages. Like pip, it will fail if you attempt to install a package without root access. Unlike pip, when easy_install fails it does hint that installing without root is possible, but it does not give you the command to make it work. To bypass the need for root access, you can instruct easy_install to install into your HOME folder by adding the --user option, as shown below:
easy_install --user <package>
Many Python packages can also be installed directly from their source code. These packages often provide a "setup.py" file for this purpose. Much like pip and easy_install, adding the --user flag is often sufficient for setup.py to install the package directly into your HOME folder. This can be done using the command shown below:
python setup.py install --user
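After any of the --user installs above, it can be useful to confirm what actually landed in your HOME folder. The sketch below lists the distributions found in the per-user site directory. It assumes Python 3.8 or newer (for importlib.metadata), so it will not run under the Python 2 modules; the function name list_user_packages is hypothetical.

```python
import site
from importlib import metadata  # standard library in Python 3.8+


def list_user_packages():
    """Return sorted (name, version) pairs found in the per-user site directory."""
    user_site = site.getusersitepackages()
    found = set()
    # Restrict discovery to the user site instead of the whole of sys.path.
    for dist in metadata.distributions(path=[user_site]):
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version") or "unknown"
        if name:  # skip entries whose metadata lacks a name
            found.add((name, version))
    return sorted(found)


if __name__ == "__main__":
    for name, version in list_user_packages():
        print("{} {}".format(name, version))
```

An empty result simply means that nothing has been installed with --user for this interpreter yet.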
Some users need a different version of a package than the one the HPCC has installed; the system version will then often override the version you install into your HOME directory. Users also sometimes encounter Python packages with so many dependencies that installing them is difficult. If you need to install a complex Python package, need a package version different from the one we provide, or simply need a specific version of Python, we strongly suggest that you install a copy of the Conda package manager into your HOME folder. Conda lets you fully control your Python environment and often makes the installation of complex Python workloads as simple as a few Conda commands. The HPCC provides documentation for the installation and usage of Conda, which can be found here: Installing a local copy of Python
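Before moving to Conda, it can help to confirm which copy of a package Python is actually importing, since a system-installed version may take precedence over the one in your HOME folder. The following is a minimal sketch using the standard library under Python 3; the helper names locate_package and is_from_user_site are hypothetical.

```python
import importlib.util
import site


def locate_package(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None      # package is not installed anywhere on sys.path
    return spec.origin   # may be "built-in" or None for special modules


def is_from_user_site(name):
    """Report whether the package resolves to the per-user site directory."""
    origin = locate_package(name)
    if not origin:
        return False
    return origin.startswith(site.getusersitepackages())


if __name__ == "__main__":
    package = "json"  # replace with the package you installed
    print("{} -> {}".format(package, locate_package(package)))
    print("from user site: {}".format(is_from_user_site(package)))
```

If a package you installed with --user resolves to a system path instead, the system copy is the one being imported, which is the situation in which a Conda environment is the more reliable choice.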