# CUDA Installer for TensorFlow and PyTorch

## LAST UPDATED: (09/04/2024)
* There is currently no way for TensorFlow and Torch to share the same local CUDA install on Python 3.12
* If you would like to use the same CUDA version for both, use Python 3.8-3.11 (see the sketch below)
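
A minimal sketch of setting up such an environment, assuming a `python3.11` interpreter is already installed and on PATH (how you install it is up to you):

```bash
# Create a Python 3.11 virtual environment so TensorFlow and Torch can share
# the local CUDA install. Assumes a python3.11 binary is available on PATH.
python3.11 -m venv ~/.venvs/cuda-shared
source ~/.venvs/cuda-shared/bin/activate
python --version   # should report 3.11.x
```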

## Keeping the list updated
* Submit a pull request changing [pacman_cuda.json](pacman_cuda.json)
* Open an issue if you encounter any bugs with the script.

### Changes
**Python 3.12 support**
- Script (Recent):
  - Installs CUDA 12.3, the only CUDA version TensorFlow supports on Python 3.12.
  - For Torch, CUDA has to be installed within the virtual environment (see the example after this list).
- Script (Compatible):
  - Installs CUDA 11.8, the last CUDA version that TensorFlow and Torch shared.
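
For the Torch-inside-the-venv case, a minimal sketch is to install the wheels that bundle their own CUDA runtime; the `cu121` index below is an assumption and should be matched to the CUDA version you want:

```bash
# Inside the activated virtual environment: install PyTorch wheels that ship
# their own CUDA 12.1 runtime, independent of the system CUDA 12.3 install.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```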

---

## Script
```sh
curl -O https://raw.githubusercontent.com/cartersusi/pacman_cuda/main/install
chmod +x install
./install
```

## DIY

### ❗❗❗DO NOT USE YAY OR GIT❗❗❗

##### My experiences from the first 10+ times using yay and git
- 30+ minute gcc compile times ✅
- Linker Errors ✅
- Auto-updates & Version Mismatches ✅
- Nvidia doesn't like you ✅
- They actually hate you ✅

---

### Current Compatibility
https://www.tensorflow.org/install/source#gpu \
https://pytorch.org/get-started/locally/

| Version | Python version | Compiler | Build tools | cuDNN | CUDA |
| :----: | :----: | :----: | :----: | :----: | :----: |
| tensorflow-2.17.0 | 3.9-3.12 | Clang 17.0.6 | Bazel 6.5.0 | 8.9 | 12.3 |
| tensorflow-2.13.0 | 3.8-3.11 | Clang 16.0.0 | Bazel 5.3.0 | 8.6 | 11.8 |
| PyTorch (Stable) | 3.8+ | | | | 11.8, 12.1 |

---

Torch typically bundles pre-compiled CUDA binaries and does not require a system CUDA install.
```bash
# Current:
pip install torch torchvision torchaudio
```
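
Once the NVIDIA driver from step 1 below is in place, a quick check that the bundled runtime actually sees the GPU:

```bash
# Verify that the Torch wheels detect the GPU and report their bundled CUDA version.
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```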

---

1. Update the system and install the NVIDIA drivers ('nvidia' and 'nvidia-dkms' are interchangeable; there is no need to replace your 'nvidia' package if it is already installed)
```bash
sudo pacman -Syu nvidia-dkms opencl-nvidia nvidia-utils nvidia-settings curl
```
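
`nvidia-smi` ships with 'nvidia-utils', so after a reboot the driver can be sanity-checked before installing CUDA:

```bash
# Confirm the kernel module and userspace driver are loaded and agree on a version.
nvidia-smi
```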

2. Download and install gcc12
```bash
curl -O https://archive.archlinux.org/packages/g/gcc12/gcc12-12.3.0-6-x86_64.pkg.tar.zst
curl -O https://archive.archlinux.org/packages/g/gcc12-libs/gcc12-libs-12.3.0-6-x86_64.pkg.tar.zst
sudo pacman -U gcc12-12.3.0-6-x86_64.pkg.tar.zst gcc12-libs-12.3.0-6-x86_64.pkg.tar.zst
```
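
`gcc12` is needed because CUDA's `nvcc` requires an older host compiler than Arch's current GCC. A minimal sketch of selecting it explicitly, assuming the package installs the compiler as `/usr/bin/g++-12`:

```bash
# Compile a CUDA source file with nvcc, selecting the GCC 12 host compiler
# via -ccbin (the path is an assumption based on the gcc12 package layout).
nvcc -ccbin /usr/bin/g++-12 hello.cu -o hello
```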

3. Download and install CUDA and cuDNN
```bash
curl -O https://archive.archlinux.org/packages/c/cuda/cuda-12.3.2-1-x86_64.pkg.tar.zst
curl -O https://archive.archlinux.org/packages/c/cudnn/cudnn-8.9.7.29-1-x86_64.pkg.tar.zst
sudo pacman -U cuda-12.3.2-1-x86_64.pkg.tar.zst cudnn-8.9.7.29-1-x86_64.pkg.tar.zst
```
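
Arch's 'cuda' package installs under /opt/cuda (the same path used in the XLA fix below). If `nvcc` is not found after a re-login, exporting the paths manually is a reasonable sketch of a fix:

```bash
# Optional: make the CUDA toolchain and libraries visible to the current shell.
# /opt/cuda is where the Arch cuda package installs; adjust if your layout differs.
export PATH=/opt/cuda/bin:$PATH
export LD_LIBRARY_PATH=/opt/cuda/lib64:${LD_LIBRARY_PATH:-}
```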

4. Update /etc/pacman.conf to exclude cuda and cudnn
- Uncomment the line "#IgnorePkg =", then add cuda and cudnn
```conf
IgnorePkg = cuda cudnn
```
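
With the packages pinned, `pacman -Syu` will skip them on future upgrades; the pinned versions can be confirmed at any time:

```bash
# Show the installed (and now frozen) toolkit versions.
pacman -Qi cuda cudnn | grep -E '^(Name|Version)'
```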

### Common Tensorflow error
```bash
# ERROR: libdevice not found at ./libdevice.10.bc
export XLA_FLAGS=--xla_gpu_cuda_data_dir=/opt/cuda
```
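
After exporting the flag (or adding it to your shell rc), a quick check that TensorFlow can enumerate the GPU:

```bash
# Verify TensorFlow sees the GPU with the local CUDA/cuDNN install.
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```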

### Common Docker error
```sh
sudo nvim /etc/nvidia-container-runtime/config.toml # set no-cgroups = false, then save

sudo systemctl restart docker
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```

### Links for CUDA 11.8
gcc11:
- https://archive.archlinux.org/packages/g/gcc11/gcc11-11.3.0-5-x86_64.pkg.tar.zst
- https://archive.archlinux.org/packages/g/gcc11-libs/gcc11-libs-11.3.0-5-x86_64.pkg.tar.zst

cuda:
- https://archive.archlinux.org/packages/c/cuda/cuda-11.8.0-1-x86_64.pkg.tar.zst

cudnn:
- https://archive.archlinux.org/packages/c/cudnn/cudnn-8.6.0.163-1-x86_64.pkg.tar.zst
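
Installing the 11.8 stack mirrors steps 2-3 above, just with the archive packages listed here; step 4 (IgnorePkg) still applies afterwards:

```bash
# Same procedure as steps 2-3, using the CUDA 11.8 era packages linked above.
curl -O https://archive.archlinux.org/packages/g/gcc11/gcc11-11.3.0-5-x86_64.pkg.tar.zst
curl -O https://archive.archlinux.org/packages/g/gcc11-libs/gcc11-libs-11.3.0-5-x86_64.pkg.tar.zst
curl -O https://archive.archlinux.org/packages/c/cuda/cuda-11.8.0-1-x86_64.pkg.tar.zst
curl -O https://archive.archlinux.org/packages/c/cudnn/cudnn-8.6.0.163-1-x86_64.pkg.tar.zst
sudo pacman -U gcc11-11.3.0-5-x86_64.pkg.tar.zst gcc11-libs-11.3.0-5-x86_64.pkg.tar.zst \
               cuda-11.8.0-1-x86_64.pkg.tar.zst cudnn-8.6.0.163-1-x86_64.pkg.tar.zst
```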