Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/alankrantas/tensorflow-cuda-gpu-devcontainer
Tensorflow CUDA DevContainer Configuration for Supporting NVIDIA GPU
https://github.com/alankrantas/tensorflow-cuda-gpu-devcontainer
cuda cudnn deep-learning devcontainer gpu-acceleration keras machine-learning nvidia nvidia-gpu tensorflow
Last synced: about 2 months ago
JSON representation
Tensorflow CUDA DevContainer Configuration for Supporting NVIDIA GPU
- Host: GitHub
- URL: https://github.com/alankrantas/tensorflow-cuda-gpu-devcontainer
- Owner: alankrantas
- License: mit
- Created: 2023-08-15T17:25:53.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-01T07:00:53.000Z (8 months ago)
- Last Synced: 2024-05-02T10:19:50.471Z (8 months ago)
- Topics: cuda, cudnn, deep-learning, devcontainer, gpu-acceleration, keras, machine-learning, nvidia, nvidia-gpu, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 74.2 KB
- Stars: 7
- Watchers: 3
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Tensorflow CUDA DevContainer Configuration for Supporting NVIDIA GPU
To create a Docker Linux DevContainer that supports Tensorflow GPU without frustrating and complicated local installation, especially on Windows platforms.
The current version willl create the following setup:
* Python `3.11`
* Tensorflow `2.15.0`
* CUDA `12.2` + cuDNN> The current version utilizes `tensorflow[and-cuda]` to install compatible CUDA/cuDNN on a regular Python container. My original and previous version used a NVIDIA CUDA image to install Python and cuDNN.
>
> For now Tensorflow `2.16.1` cannot be installed currectly. And, I still have no success to have installed TensorRT detected. Let me know if you managed to get it running!The current version has been tested on:
* A Windows 11 gaming laptop with a built-in RTX 3070 Ti
> I've tried with my another laptop connecting to a eGPU (GTX 1660 Ti) without success so far.
## Prerequisites
* An amd64 (x64) machine with a CUDA-compatible NVIDIA GPU card
* [Docker engine](https://docs.docker.com/engine/install/) or [Docker Desktop](https://docs.docker.com/desktop/install/windows-install/) (and setup [.wslconfig](https://learn.microsoft.com/en-us/windows/wsl/wsl-config) to use more cores and memory than default if you are on Windows.)
* Latest version of the [NVIDIA graphic card driver](https://www.nvidia.com/download/index.aspx)
* [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) (which is already included in Windows’ Docker Desktop)
* [Visual Studio Code](https://code.visualstudio.com/download) with [DevContainer extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) installedSee [here](https://www.tensorflow.org/install/pip#hardware_requirements) for more detailed hardware and system requirements of running Tensorflow.
> Note: some deep learning models require more GPU memory than others and may cause the Python kernel to crash. You may need to try setting a smaller training batch size.
>
> Some older cards may only support older drivers thus older CUDA versions. See the CUDA [Support Matrix](https://docs.nvidia.com/deeplearning/cudnn/latest/reference/support-matrix.html) for details, although I cannot test if `tensorflow[and-cuda]` will install older libraries automatically.## Download Repo
Download the repo as a .zip file, unzip then open the folder in VS Code, or to use [Git](https://git-scm.com/):
```bash
git clone https://github.com/alankrantas/tensorflow-cuda-gpu-devcontainer.git
cd tensorflow-cuda-gpu-devcontainer
code .
```Modify [requirements.txt](https://github.com/alankrantas/windows-cuda-gpu-devcontainer/blob/main/.devcontainer/requirements.txt) to include packages you'd like to install. `ipykernel` is required for executing IPython notebook cells in VS Code.
> Note: VS Code/Git may messed up the new line characters of `.devcontainer/install-dev-tools.sh` and `.devcontainer/requirements` which will cause the DevContainer failed to run the scripts.
>
> Click `CRLF` on the button right and change them to `LF`.## Start DevContainer
In VS Code, press `F1` to bring up the Command Palette, and select **Dev Containers: Open Folder in Container...**
Wait until the DevContainer is up and running (`nvidia-smi` is executed):
```
Wed May 1 03:24:54 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.76.01 Driver Version: 552.22 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3070 ... On | 00000000:01:00.0 Off | N/A |
| N/A 46C P0 29W / 130W | 0MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
```> Although `nvidia-smi` informed us the current NVIDIA driver uses CUDA `12.4`, it doesn't mean Tensorflow supports it. This is why using `tensorflow[and-cuda]` is easier to use existing NVIDIA CUDA/cuDNN images.
Then open a new terminal (`Terminal` -> `New Terminal`) to test if the Tensorflow detects the GPU correctly:
```python
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```If success, you should see something like
```
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
```Afterwards, simply open the folder in VS Code to restart the built container.
## Test Script
Test run using the example file - open `autokeras-test.py` and seletct `Run` -> `Run Without Debugging`, or run in the terminal
```python
python3 autokeras-test.py
```Which generates a result like below:
```
Trial 1 Complete [00h 03m 51s]
val_loss: 0.03894084319472313Best val_loss So Far: 0.03894084319472313
Total elapsed time: 00h 03m 51s...
Prediction loss: 0.0315
Prediction accuracy: 0.9910
precision recall f1-score support0 0.99 1.00 0.99 980
1 1.00 1.00 1.00 1135
2 0.99 0.99 0.99 1032
3 0.99 1.00 0.99 1010
4 0.99 0.99 0.99 982
5 0.99 0.99 0.99 892
6 1.00 0.98 0.99 958
7 0.99 0.99 0.99 1028
8 0.99 0.99 0.99 974
9 0.99 0.99 0.99 1009accuracy 0.99 10000
macro avg 0.99 0.99 0.99 10000
weighted avg 0.99 0.99 0.99 10000
```Or open `autokeras-test.ipynb`, run the cells or select `Run All` on top of the notebook.
## Resources
* [Developing inside a Container](https://code.visualstudio.com/docs/devcontainers/containers)
* [Install TensorFlow with pip](https://www.tensorflow.org/install/pip)
* [Setup a NVIDIA DevContainer with GPU Support for Tensorflow/Keras on Windows](https://alankrantas.medium.com/setup-a-nvidia-devcontainer-with-gpu-support-for-tensorflow-keras-on-windows-d00e6e204630)