Python and CUDA 11.3 for CBS server GPU

Hi everyone,

CBS server power users now have access to shared NVIDIA A100 GPUs with 10GB of GPU memory.

Because of hardware and software compatibility constraints in this shared GPU set-up, CUDA 11.3 must be used. This means a regular pip install of dependencies doesn’t always work, since pip may pull in a build compiled for a different CUDA version.

PyTorch

Installation instructions are on the PyTorch website (select CUDA 11.3 in the install selector). For convenience, we have also placed wheels in /srv/software/wheels/torch-cu113, so you can, for example:

virtualenv /localscratch/venv
source /localscratch/venv/bin/activate
pip install /srv/software/wheels/torch-cu113/*.whl
pip install <your_code>
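
After installing, it can be worth confirming that the wheel you ended up with was built against the CUDA version you expect and that the GPU is visible. The following is a minimal sketch, assuming it is run inside the activated virtualenv; `describe_torch` and its `expected_cuda` parameter are hypothetical names for illustration, not part of PyTorch:

```python
import importlib.util


def describe_torch(expected_cuda="11.3"):
    """Return facts about the installed PyTorch build, or None if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return None  # torch is not installed in this environment
    import torch

    # CUDA version the wheel was compiled against; None for CPU-only builds.
    build_cuda = torch.version.cuda
    return {
        "torch_version": torch.__version__,
        "build_cuda": build_cuda,
        "matches_expected": build_cuda == expected_cuda,
        "gpu_visible": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch())
```

If `matches_expected` is false, a later pip install has probably replaced the wheel with a build for another CUDA version. Pass a different `expected_cuda` if the system CUDA version changes.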

TensorFlow

For TensorFlow 1.x:

pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]

For TensorFlow 2.x:

pip install tensorflow
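
A similar check can be done for TensorFlow. This is a rough sketch that assumes TensorFlow 2.1 or later (it has not been checked against the 1.x build above); `describe_tensorflow` is a hypothetical helper name:

```python
import importlib.util


def describe_tensorflow():
    """Return GPU-related facts about the installed TensorFlow, or None if it is missing."""
    if importlib.util.find_spec("tensorflow") is None:
        return None  # tensorflow is not installed in this environment
    import tensorflow as tf

    return {
        "tf_version": tf.__version__,
        # True when this TensorFlow build was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Names of the GPUs TensorFlow can see; an empty list means none were found.
        "gpus": [device.name for device in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(describe_tensorflow())
```

An empty `gpus` list with `built_with_cuda` set to true usually points at a mismatch between the installed build and the CUDA libraries on the system.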

Because of licensing issues with the vGPU (NVIDIA virtual GPU) set-up, we had to downgrade to CUDA 11.0 on the system.

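A PyTorch wheel for this version of CUDA can now be installed with the command below. Note that this wheel is built for CPython 3.8 on 64-bit Linux (the cp38 and linux_x86_64 tags in the filename), so your virtualenv needs to use Python 3.8: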

pip install https://download.pytorch.org/whl/cu110/torch-1.7.1%2Bcu110-cp38-cp38-linux_x86_64.whl
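
If pip rejects a wheel as "not a supported wheel on this platform", the tags in the filename usually explain why. Here is a small sketch that splits a wheel filename into its parts and compares the Python tag with the running interpreter. It assumes CPython and the standard wheel naming scheme; `parse_wheel_name` and `matches_running_python` are hypothetical helper names:

```python
import sys
from urllib.parse import unquote


def parse_wheel_name(url_or_path):
    """Split a wheel filename into name, version, and compatibility tags."""
    filename = unquote(url_or_path.rsplit("/", 1)[-1])  # "%2B" decodes to "+"
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Wheel names have 5 parts, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def matches_running_python(wheel_info):
    """True if the wheel's Python tag covers the running CPython version."""
    running = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # Python tags may be compound, e.g. "py2.py3", so compare piece by piece.
    return running in wheel_info["python_tag"].split(".")


if __name__ == "__main__":
    url = "https://download.pytorch.org/whl/cu110/torch-1.7.1%2Bcu110-cp38-cp38-linux_x86_64.whl"
    info = parse_wheel_name(url)
    print(info)
    print("matches this interpreter:", matches_running_python(info))
```

This only looks at the Python tag. pip also checks the ABI and platform tags, so a match here is necessary but not sufficient.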