Hi everyone,
CBS server power users now have access to shared NVIDIA A100 GPUs with 10GB of GPU memory.
Because of hardware and software compatibility constraints in this shared GPU setup, CUDA 11.3 must be used. This means a regular pip install of dependencies doesn’t always work, since the default packages may be built for a different CUDA version.
PyTorch:
Installation instructions can be found on the PyTorch website by selecting CUDA 11.3. Alternatively, for convenience, we have also included wheels in /srv/software/wheels/torch-cu113, so you can do e.g.:
virtualenv /localscratch/venv
source /localscratch/venv/bin/activate
pip install /srv/software/wheels/torch-cu113/*.whl
pip install <your_code>
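
To double-check that the environment picked up a CUDA 11.3 build, something like the following should work. This is only a rough sketch: check_torch_cuda is a made-up helper name, and it assumes python3 inside the activated virtualenv is the interpreter you installed into.

```shell
# Hypothetical helper (not part of the server setup): report which CUDA
# version the installed torch build targets and whether a GPU is visible.
check_torch_cuda() {
    python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    # torch.version.cuda is the CUDA version the wheel was built against
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    print("GPU visible:", torch.cuda.is_available())
EOF
}

check_torch_cuda
```

If the wheels from /srv/software/wheels/torch-cu113 were used, the reported CUDA version should be 11.3. If a later pip install of your own code pulled in a different torch build, this is where it would show up.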
TensorFlow:
For TensorFlow 1.x:
pip install nvidia-pyindex
pip install "nvidia-tensorflow[horovod]"
For TensorFlow 2.x:
pip install tensorflow
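
A similar quick check can be done for TensorFlow. Again, this is just a sketch under assumptions: check_tf_gpu is a made-up helper name, and it only uses tf.test calls that, as far as we know, are available in both 1.x and 2.x.

```shell
# Hypothetical helper (not part of the server setup): report whether the
# installed TensorFlow was built with CUDA and which GPU device it sees.
check_tf_gpu() {
    python3 - <<'EOF'
try:
    import tensorflow as tf
except ImportError:
    print("tensorflow is not installed in this environment")
else:
    print("tensorflow", tf.__version__)
    print("built with CUDA:", tf.test.is_built_with_cuda())
    # gpu_device_name() returns an empty string when no GPU is found
    print("GPU device:", tf.test.gpu_device_name() or "none found")
EOF
}

check_tf_gpu
```

If no GPU device is reported even though you are on a GPU node, the installed build probably does not match the CUDA version on the server.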