{"id":13593905,"url":"https://github.com/philferriere/dlwin","last_synced_at":"2025-04-05T02:10:06.674Z","repository":{"id":71374545,"uuid":"62470273","full_name":"philferriere/dlwin","owner":"philferriere","description":"GPU-accelerated Deep Learning on Windows 10 native","archived":false,"fork":false,"pushed_at":"2022-07-21T20:02:53.000Z","size":2846,"stargazers_count":517,"open_issues_count":0,"forks_count":100,"subscribers_count":49,"default_branch":"master","last_synced_at":"2025-03-29T01:12:02.334Z","etag":null,"topics":["cntk","cudnn","deep-learning","gpu-acceleration","gpu-mode","keras","tensorflow","theano"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/philferriere.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2016-07-02T21:24:26.000Z","updated_at":"2025-02-15T16:54:42.000Z","dependencies_parsed_at":"2024-01-14T04:39:23.713Z","dependency_job_id":"3b2709ef-44a8-4822-8730-4a045abc0a8d","html_url":"https://github.com/philferriere/dlwin","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philferriere%2Fdlwin","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philferriere%2Fdlwin/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philferriere%2Fdlwin/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philferriere%2Fdlwin/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/philferrie
re","download_url":"https://codeload.github.com/philferriere/dlwin/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247276189,"owners_count":20912288,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cntk","cudnn","deep-learning","gpu-acceleration","gpu-mode","keras","tensorflow","theano"],"created_at":"2024-08-01T16:01:26.109Z","updated_at":"2025-04-05T02:10:06.655Z","avatar_url":"https://github.com/philferriere.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"GPU-accelerated Deep Learning on Windows 10 native (Keras/Tensorflow/CNTK/MXNet and PyTorch)\n===============================================================================\n\n**\u003e\u003e LAST UPDATED JUNE, 2018 \u003c\u003c**\n\n**This latest update:**\n- **supports 5 frameworks (Keras/Tensorflow/CNTK/MXNet and PyTorch),**\n- **supports 3 GPU-accelerated Keras backends (CNTK, Tensorflow, or MXNet),**\n- **doesn't require installing MinGW separately,**\n- **uses more recent versions of many python libraries.**\n\nThere are certainly a lot of guides to assist you build great deep learning (DL) setups on Linux or Mac OS (including with Tensorflow which, unfortunately, as of this posting, cannot be easily installed on Windows), but few care about building an efficient Windows 10-**native** setup. 
Most focus on running an Ubuntu VM hosted on Windows or using Docker, unnecessary - and ultimately sub-optimal - steps.\n\nWe also found enough misleading or outdated information out there to make it worthwhile putting together a step-by-step guide for the latest stable versions of Keras, Tensorflow, CNTK, MXNet, and PyTorch. Used either together (e.g., Keras with Tensorflow backend), or independently -- PyTorch cannot be used as a Keras backend, TensorFlow can be used on its own -- they make for some of the most powerful deep learning python libraries to work natively on Windows.\n\nIf you **must** run your DL setup on Windows 10, then the information contained here will hopefully be useful to you.\n\nOlder installation instructions from [July 2017](README_July2017.md), [May 2017](README_May2017.md) and [January 2017](README_Jan2017.md) are still available. They allow you to use Theano as a Keras backend.\n\n# TOC\n\n- [Dependencies](#dependencies)\n- [Hardware](#hardware)\n- [Installation steps](#installation-steps)\n  * [Windows toolkits](#toolkits)\n    + [Visual Studio 2015 Community Edition Update 3 w. 
Windows Kit 10.0.10240.0](#visual-studio-2015-community-edition-update-3-w-windows-kit-100102400)\n    + [Anaconda 5.2.0 (64-bit) (Python 3.6 TF support / Python 2.7 no TF support)](#anaconda-520-64-bit-python-36-tf-support-python-27-no-tf-support)\n      - [Create a `dlwin36` conda environment](#create-a-dlwin36-conda-environment)\n      - [Optional but highly-recommended image processing libraries](#optional-but-highly-recommended-image-processing-libraries)\n    + [CUDA 9.0.176 (64-bit)](#cuda-90176-64-bit)\n    + [cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0](#cudnn-v704-nov-13-2017-for-cuda-90)\n  * [Deep learning python libraries](#deep-learning-python-libraries)\n    + [Installing `keras` 2.1.6](#installing-keras-216)\n    + [Installing `tensorflow-gpu` 1.8.0 (solo, or as a Keras backend)](#installing-tensorflow-gpu-180-solo-or-as-a-keras-backend)\n    + [Installing `cntk-gpu` 2.5.1 (solo, or as a Keras backend)](#installing-cntk-gpu-251-solo-or-as-a-keras-backend)\n    + [Installing `mxnet-cu90` 1.2.0 (solo, or as a Keras backend)](#installing-mxnet-cu90-120-solo-or-as-a-keras-backend)\n    + [Installing `pytorch` 0.4.0](#installing-pytorch-040)\n  * [Quick checks](#quick-checks)\n    + [Checking the list of Python libraries installed](#checking-the-list-of-python-libraries-installed)\n    + [Checking our PATH sysenv var](#checking-our-path-sysenv-var)\n    + [Quick-checking each main Python library install](#quick-checking-each-main-python-library-install)\n  * [GPU tests](#gpu-tests)\n    + [Validating our GPU install with Keras](#validating-our-gpu-install-with-keras)\n    + [Keras with Tensorflow backend (GPU disabled)](#keras-with-tensorflow-backend-gpu-disabled)\n    + [Keras with Tensorflow backend (using GPU)](#keras-with-tensorflow-backend-using-gpu)\n    + [Keras with CNTK backend (using GPU)](#keras-with-cntk-backend-using-gpu)\n    + [Keras with MXNet backend (using GPU)](#keras-with-mxnet-backend-using-gpu)\n    + [Validating our GPU install with 
PyTorch](#validating-our-gpu-install-with-pytorch)\n- [Suggested viewing and reading](#suggested-viewing-and-reading)\n- [About the Author](#about-the-author)\n\n\u003csmall\u003e\u003ci\u003e\u003ca href='http://ecotrust-canada.github.io/markdown-toc/'\u003eTable of contents generated with markdown-toc\u003c/a\u003e\u003c/i\u003e\u003c/small\u003e\n\n# Dependencies\n\nHere's a summary list of the tools and libraries we use for deep learning on Windows 10 (Version 1709 OS Build 16299.371):\n\n1. Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0\n   - Used for its C/C++ compiler (not its IDE) and SDK. This specific version has been selected due to [Windows Compiler Support in CUDA](http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#system-requirements).\n2. Anaconda (64-bit) w. Python 3.6 (Anaconda3-5.2.0) [for Tensorflow support] or Python 2.7 (Anaconda2-5.2.0) [no Tensorflow support] with MKL 2018.0.3\n   - A Python distro that gives us NumPy, SciPy, and other scientific libraries\n   - MKL is used for its CPU-optimized implementation of many linear algebra operations\n3. CUDA 9.0.176 (64-bit)\n   - Used for its GPU math libraries, card driver, and CUDA compiler\n4. cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0.176\n   - Used to run vastly faster convolutional neural networks\n5. Keras 2.1.6 with three different backends: Tensorflow-gpu 1.8.0, CNTK-gpu 2.5.1, and MXNet-cuda90 1.2.0\n   - Keras is used for deep learning on top of Tensorflow, CNTK, or MXNet\n   - Tensorflow, CNTK, and MXNet are backends used to evaluate mathematical expressions on multi-dimensional arrays\n   - Theano is a legacy backend no longer in active development\n6. PyTorch v0.4.0\n\n# Hardware\n\n1. Dell Precision T7900, 64GB RAM\n   - Intel Xeon E5-2630 v4 @ 2.20 GHz (1 processor, 10 cores total, 20 logical processors)\n2. NVIDIA GeForce Titan X, 12GB RAM\n   - Driver version: 390.77 / Win 10 64\n3. 
NVIDIA GeForce GTX 1080 Ti, 11GB RAM\n   - Driver version: 390.77 / Win 10 64   \n\n# Installation steps\n\nWe like to keep our toolkits and libraries in a single root folder boringly called `e:\\toolkits.win`, so whenever you see a Windows path that starts with `e:\\toolkits.win` below, make sure to replace it with whatever you decide your own toolkit drive and folder ought to be.\n\n## Toolkits\n\n### Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0\n\nDownload [Visual Studio Community 2015 with Update 3 (x86)](https://www.visualstudio.com/vs/older-downloads). It is used by the CUDA toolkit.\n\u003e Note that for downloading, a free [Visual Studio Dev Essentials](https://www.visualstudio.com/dev-essentials/) license or a full Visual Studio Subscription is required.\n\nRun the downloaded executable to install Visual Studio, using whatever additional config settings work best for you:\n\n![](img/vs2015-install-part1-2016-10.png)\n\n![](img/vs2015-install-part2-2016-10.png)\n\n![](img/vs2015-install-part3b-2016-10.png)\n\n![](img/vs2015-install-part4b-2016-10.png)\n\n1. Add `C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin` to your `PATH`, based on where you installed VS 2015.\n2. Define sysenv variable `INCLUDE` with the value `C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt`\n3. Define sysenv variable `LIB` with the value `C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.10240.0\\um\\x64;C:\\Program Files (x86)\\Windows Kits\\10\\Lib\\10.0.10240.0\\ucrt\\x64`\n\n\u003e Reference Note: We couldn't run any Theano python files until we added the last two env variables above. We would get a `c:\\program files (x86)\\microsoft visual studio 14.0\\vc\\include\\crtdefs.h(10): fatal error C1083: Cannot open include file: 'corecrt.h': No such file or directory` error at compile time and missing `kernel32.lib uuid.lib ucrt.lib` errors at link time. 
True, you could probably run `C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin\\amd64\\vcvars64.bat` (with proper params) every single time you open a MINGW cmd prompt, but, obviously, none of the sysenv vars would stick from one session to the next.\n\n### Anaconda 5.2.0 (64-bit) (Python 3.6 TF support / Python 2.7 no TF support)\n\nThis tutorial was initially created using Python 2.7. As Tensorflow has become the backend of choice for Keras, we've decided to document installation steps using Python 3.6 by default. Depending on your own preferred configuration, use `e:\\toolkits.win\\anaconda3-5.2.0` or `e:\\toolkits.win\\anaconda2-5.2.0` as the Anaconda installation folder.\n\nDownload the Python 3.6 Anaconda version from [here](https://repo.continuum.io/archive/Anaconda3-5.2.0-Windows-x86_64.exe) and the Python 2.7 version from [there](https://repo.continuum.io/archive/Anaconda2-5.2.0-Windows-x86_64.exe):\n\n[![](img/anaconda-5.2.0-download-2018-06.png)](https://repo.continuum.io/archive/)\n\nRun the downloaded executable to install Anaconda:\n\n![](img/anaconda-5.2.0-setup1-2018-06.png)\n![](img/anaconda-5.2.0-setup2-2018-06.png)\n\n\u003e Warning: Below, we enabled the second of the `Advanced Options` because it works for us, but that may not be the best option for you!\n\n![](img/anaconda-5.2.0-setup3-2018-06.png)\n\nDefine the following variable and update PATH as shown here:\n\n1. Define sysenv variable `PYTHON_HOME` with the value `e:\\toolkits.win\\anaconda3-5.2.0`\n2. 
Add `%PYTHON_HOME%`, `%PYTHON_HOME%\\Scripts`, and `%PYTHON_HOME%\\Library\\bin` to `PATH`\n\n#### Create a `dlwin36` conda environment\n\nAfter Anaconda installation, open a Windows command prompt and execute:\n\n```\n$ conda create --yes -n dlwin36 numpy scipy mkl-service m2w64-toolchain libpython matplotlib pandas scikit-learn tqdm jupyter h5py cython\n```\n\nHere's the [output log](installed_files/dlwin36_log.txt) for the command above.\n\nNext, use `activate dlwin36` to activate this new environment. By the way, if you already have an older `dlwin36` environment, you can delete it using `conda env remove -n dlwin36`.\n\n#### Optional but highly-recommended image processing libraries\n\nIf we're going to use the GPU, why did we install a CPU-optimized linear algebra library like MKL? With our setup, most of the deep learning grunt work is performed by the GPU, that is correct, but *the CPU isn't idle*. An important part of image-based Kaggle competitions is **data augmentation**. In that context, data augmentation is the process of manufacturing additional input samples (more training images) by transformation of the original training samples, via the use of image processing operators. Basic transformations such as downsampling and (mean-centered) normalization are also needed. If you feel adventurous, you'll want to try additional pre-processing enhancements (noise removal, histogram equalization, etc.). You certainly could use the GPU for that purpose and save the results to file. 
In practice, however, those operations are often executed **in parallel on the CPU** while the GPU is busy learning the weights of the deep neural network, and the augmented data is discarded after use.\n\nIf your deep learning projects are image-based, we recommend also installing the following libraries:\n\n- `scikit-image`: open source image processing library for the Python programming language that includes algorithms for segmentation, geometric transformations, color space manipulation, analysis, filtering, morphology, feature detection, and more. See [this page](http://scikit-image.org/) for more info.\n- `opencv`: a library of programming functions mainly aimed at real-time computer vision. It has C++, Python and Java interfaces and supports many OS platforms, including Windows. See [this page](https://opencv.org/) for additional info.\n- `imgaug`: a staple of image-based Kaggle competitions, this python library helps you with augmenting images for your machine learning projects by converting a set of input images into a new, much larger set of slightly altered images. See [this page](https://github.com/aleju/imgaug) for details.\n\nTo install these libraries, use the following commands:\n\n```\n$ activate dlwin36\n(dlwin36) $ conda install --yes pillow scikit-image\n(dlwin36) $ conda install --yes -c conda-forge opencv\n(dlwin36) $ pip install git+https://github.com/aleju/imgaug\n```\n\nHere's an [output log](installed_files/dlwin36_imgproc_log.txt) for the commands above.\n\n### CUDA 9.0.176 (64-bit)\n\nDownload CUDA 9.0.176 (64-bit) from the [NVIDIA website](https://developer.nvidia.com/cuda-90-download-archive)\n\nWhy not install CUDA 9.1? 
Simply because, as of this writing, Tensorflow 1.8 still uses CUDA 9.0 (see issue [#15140](https://github.com/tensorflow/tensorflow/issues/15140)).\n\nSelect the proper target platform:\n\n![](img/cuda-9.0.176-setup1-2018-06.png)\n\nDownload all the installers:\n\n![](img/cuda-9.0.176-setup2-2018-06.png)\n\nRun the downloaded installers one after the other. Install the files in `e:\\toolkits.win\\cuda-9.0.176`:\n\n![](img/cuda-9.0.176-setup3-2018-06.png)\n\n![](img/cuda-9.0.176-setup4-2018-06.png)\n\n![](img/cuda-9.0.176-setup5-2018-06.png)\n\n![](img/cuda-9.0.176-setup6-2018-06.png)\n\nAfter completion, the installer should have created a system environment (sysenv) variable named `CUDA_PATH` and added `%CUDA_PATH%\\bin` as well as `%CUDA_PATH%\\libnvvp` to `PATH`. Check that it is indeed the case. If, for some reason, the CUDA env vars are missing, then:\n\n1. Define a system environment (sysenv) variable named `CUDA_PATH` with the value `e:\\toolkits.win\\cuda-9.0.176`\n2. Add `%CUDA_PATH%\\bin` and `%CUDA_PATH%\\libnvvp` to `PATH`\n\n### cuDNN v7.0.4 (Nov 13, 2017) for CUDA 9.0\n\nPer NVIDIA's [website](https://developer.nvidia.com/cudnn), \"cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers,\" hallmarks of convolutional network architectures. Download cuDNN from [here](https://developer.nvidia.com/rdp/cudnn-download). Choose the cuDNN Library for Windows 10 that matches the CUDA version:\n\nNVIDIA has recently removed the option for the 7.0.4 Windows download. You can download it [here](https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.4/prod/9.0_20171031/cudnn-9.0-windows10-x64-v7).\n\n![](img/cudnn-7.0.4-for-cuda-9.0-setup1-2018-06.png)\n\nThe downloaded ZIP file contains three directories (`bin`, `include`, `lib`). 
Extract and copy their content to the identically-named `bin`, `include` and `lib` directories in `%CUDA_PATH%`.\n\n## Deep learning python libraries\n\n### Installing `keras` 2.1.6\n\nWhy not just install the latest bleeding-edge/dev version of Keras and various backends (Tensorflow, CNTK or Theano)? Simply put, because it makes [reproducible research](https://www.coursera.org/learn/reproducible-research) harder. If your work colleagues or Kaggle teammates install the latest code from the dev branch at a different time than you did, you will most likely be running different code bases on your machines, increasing the odds that even though you're using the same input data (the same random seeds, etc.), you still end up with different results when you shouldn't. For this reason alone, we highly recommend only using point releases, the same one across machines, and always documenting which one you use if you can't just use a setup script.\n\nInstall Keras as follows:\n\n```\n(dlwin36) $ pip install keras==2.1.6\nCollecting keras==2.1.6\n  Using cached https://files.pythonhosted.org/packages/54/e8/eaff7a09349ae9bd40d3ebaf028b49f5e2392c771f294910f75bb608b241/Keras-2.1.6-py2.py3-none-any.whl\nRequirement already satisfied: numpy\u003e=1.9.1 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (1.14.5)\nRequirement already satisfied: scipy\u003e=0.14 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (1.1.0)\nRequirement already satisfied: h5py in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (2.8.0)\nRequirement already satisfied: pyyaml in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (3.12)\nRequirement already satisfied: six\u003e=1.9.0 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from keras==2.1.6) (1.11.0)\ndistributed 1.22.0 requires msgpack, 
which is not installed.\nInstalling collected packages: keras\nSuccessfully installed keras-2.1.6\n```\n\n### Installing `tensorflow-gpu` 1.8.0 (solo, or as a Keras backend)\n\nRun the following command to install Tensorflow:\n\n```\n$ pip install tensorflow-gpu==1.8.0\nCollecting tensorflow-gpu==1.8.0\n  Using cached https://files.pythonhosted.org/packages/42/a8/4c96a2b4f88f5d6dfd70313ebf38de1fe4d49ba9bf2ef34dc12dd198ab9a/tensorflow_gpu-1.8.0-cp36-cp36m-win_amd64.whl\nRequirement already satisfied: six\u003e=1.10.0 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from tensorflow-gpu==1.8.0) (1.11.0)\nCollecting grpcio\u003e=1.8.6 (from tensorflow-gpu==1.8.0)\n  Downloading https://files.pythonhosted.org/packages/5d/8b/104918993129d6c919a16826e6adcfa4a106c791da79fb9655c5b22ad9ff/grpcio-1.12.1-cp36-cp36m-win_amd64.whl (1.4MB)\n    100% |████████████████████████████████| 1.4MB 6.6MB/s\nCollecting gast\u003e=0.2.0 (from tensorflow-gpu==1.8.0)\nCollecting tensorboard\u003c1.9.0,\u003e=1.8.0 (from tensorflow-gpu==1.8.0)\n  Using cached https://files.pythonhosted.org/packages/59/a6/0ae6092b7542cfedba6b2a1c9b8dceaf278238c39484f3ba03b03f07803c/tensorboard-1.8.0-py3-none-any.whl\nRequirement already satisfied: wheel\u003e=0.26 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from tensorflow-gpu==1.8.0) (0.31.1)\nCollecting termcolor\u003e=1.1.0 (from tensorflow-gpu==1.8.0)\nRequirement already satisfied: numpy\u003e=1.13.3 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from tensorflow-gpu==1.8.0) (1.14.5)\nCollecting protobuf\u003e=3.4.0 (from tensorflow-gpu==1.8.0)\n  Downloading https://files.pythonhosted.org/packages/75/7a/0dba607e50b97f6a89fa3f96e23bf56922fa59d748238b30507bfe361bbc/protobuf-3.6.0-cp36-cp36m-win_amd64.whl (1.1MB)\n    100% |████████████████████████████████| 1.1MB 6.6MB/s\nCollecting absl-py\u003e=0.1.6 (from tensorflow-gpu==1.8.0)\n  Downloading 
https://files.pythonhosted.org/packages/57/8d/6664518f9b6ced0aa41cf50b989740909261d4c212557400c48e5cda0804/absl-py-0.2.2.tar.gz (82kB)\n    100% |████████████████████████████████| 92kB 5.9MB/s\nCollecting astor\u003e=0.6.0 (from tensorflow-gpu==1.8.0)\n  Using cached https://files.pythonhosted.org/packages/b2/91/cc9805f1ff7b49f620136b3a7ca26f6a1be2ed424606804b0fbcf499f712/astor-0.6.2-py2.py3-none-any.whl\nCollecting html5lib==0.9999999 (from tensorboard\u003c1.9.0,\u003e=1.8.0-\u003etensorflow-gpu==1.8.0)\nCollecting werkzeug\u003e=0.11.10 (from tensorboard\u003c1.9.0,\u003e=1.8.0-\u003etensorflow-gpu==1.8.0)\n  Using cached https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl\nCollecting bleach==1.5.0 (from tensorboard\u003c1.9.0,\u003e=1.8.0-\u003etensorflow-gpu==1.8.0)\n  Using cached https://files.pythonhosted.org/packages/33/70/86c5fec937ea4964184d4d6c4f0b9551564f821e1c3575907639036d9b90/bleach-1.5.0-py2.py3-none-any.whl\nCollecting markdown\u003e=2.6.8 (from tensorboard\u003c1.9.0,\u003e=1.8.0-\u003etensorflow-gpu==1.8.0)\n  Using cached https://files.pythonhosted.org/packages/6d/7d/488b90f470b96531a3f5788cf12a93332f543dbab13c423a5e7ce96a0493/Markdown-2.6.11-py2.py3-none-any.whl\nRequirement already satisfied: setuptools in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from protobuf\u003e=3.4.0-\u003etensorflow-gpu==1.8.0) (39.2.0)\nBuilding wheels for collected packages: absl-py\n  Running setup.py bdist_wheel for absl-py ... 
done\n  Stored in directory: C:\\Users\\Phil\\AppData\\Local\\pip\\Cache\\wheels\\a0\\f8\\e9\\1933dbb3447ea6ef557062fd5461cb118deb8c2ed074e8344bf\nSuccessfully built absl-py\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: grpcio, gast, html5lib, werkzeug, bleach, markdown, protobuf, tensorboard, termcolor, absl-py, astor, tensorflow-gpu\n  Found existing installation: html5lib 1.0.1\n    Uninstalling html5lib-1.0.1:\n      Successfully uninstalled html5lib-1.0.1\n  Found existing installation: bleach 2.1.3\n    Uninstalling bleach-2.1.3:\n      Successfully uninstalled bleach-2.1.3\nSuccessfully installed absl-py-0.2.2 astor-0.6.2 bleach-1.5.0 gast-0.2.0 grpcio-1.12.1 html5lib-0.9999999 markdown-2.6.11 protobuf-3.6.0 tensorboard-1.8.0 tensorflow-gpu-1.8.0 termcolor-1.1.0 werkzeug-0.14.1\n```\n\nIf you want TensorFlow to be the default Keras backend, define a system environment variable named `KERAS_BACKEND` with the value `tensorflow`.\n\n### Installing `cntk-gpu` 2.5.1 (solo, or as a Keras backend)\n\nAs documented at [this link](https://docs.microsoft.com/en-us/cognitive-toolkit/setup-windows-python), install CNTK GPU as follows:\n\n```\n(dlwin36) $ pip install https://cntk.ai/PythonWheel/GPU/cntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl\nCollecting cntk-gpu==2.5.1 from https://cntk.ai/PythonWheel/GPU/cntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl\n  Downloading https://cntk.ai/PythonWheel/GPU/cntk_gpu-2.5.1-cp36-cp36m-win_amd64.whl (428.6MB)\n    100% |████████████████████████████████| 428.6MB 53kB/s\nRequirement already satisfied: scipy\u003e=0.17 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from cntk-gpu==2.5.1) (1.1.0)\nRequirement already satisfied: numpy\u003e=1.11 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from cntk-gpu==2.5.1) (1.14.5)\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: cntk-gpu\nSuccessfully installed 
cntk-gpu-2.5.1\n```\n\nIf you want CNTK to be the default Keras backend, define a system environment variable named `KERAS_BACKEND` with the value `cntk`.\n\n### Installing `mxnet-cu90` 1.2.0 (solo, or as a Keras backend)\n\nMXNet is a deep learning framework with strong backing from Amazon (through AWS). It is also supported by Microsoft on Azure. To install it, run the following command:\n\n```\n(dlwin36) $ pip install mxnet-cu90==1.2.0 keras-mxnet==2.1.6.1\nCollecting mxnet-cu90==1.2.0\n  Downloading https://files.pythonhosted.org/packages/72/a8/9226bd6913b7ba4657a218b9a252b60de98938dd41e8517a0b4ab4291203/mxnet_cu90-1.2.0-py2.py3-none-win_amd64.whl (457.0MB)\n    100% |████████████████████████████████| 457.0MB 47kB/s\nRequirement already satisfied: numpy in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from mxnet-cu90==1.2.0) (1.14.5)\nRequirement already satisfied: graphviz in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from mxnet-cu90==1.2.0) (0.8.3)\nCollecting keras-mxnet==2.1.6.1\n  Downloading https://files.pythonhosted.org/packages/99/93/13ec18147fcef7c393e3fbf2d2c20171975be14e68d4c915b194be174ab6/keras_mxnet-2.1.6.1-py2.py3-none-any.whl (388kB)\n    100% |████████████████████████████████| 389kB 3.3MB/s\nCollecting requests (from mxnet-cu90==1.2.0)\n  Downloading https://files.pythonhosted.org/packages/65/47/7e02164a2a3db50ed6d8a6ab1d6d60b69c4c3fdf57a284257925dfc12bda/requests-2.19.1-py2.py3-none-any.whl (91kB)\n    100% |████████████████████████████████| 92kB 1.2MB/s\nCollecting urllib3\u003c1.24,\u003e=1.21.1 (from requests-\u003emxnet-cu90==1.2.0)\n  Downloading https://files.pythonhosted.org/packages/bd/c9/6fdd990019071a4a32a5e7cb78a1d92c53851ef4f56f62a3486e6a7d8ffb/urllib3-1.23-py2.py3-none-any.whl (133kB)\n    100% |████████████████████████████████| 143kB 2.2MB/s\nCollecting chardet\u003c3.1.0,\u003e=3.0.2 (from requests-\u003emxnet-cu90==1.2.0)\n  Downloading 
https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)\n    100% |████████████████████████████████| 143kB 2.2MB/s\nRequirement already satisfied: certifi\u003e=2017.4.17 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from requests-\u003emxnet-cu90==1.2.0) (2018.4.16)\nCollecting idna\u003c2.8,\u003e=2.5 (from requests-\u003emxnet-cu90==1.2.0)\n  Downloading https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl (58kB)\n    100% |████████████████████████████████| 61kB 3.9MB/s\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: urllib3, chardet, idna, requests, mxnet-cu90\nSuccessfully installed chardet-3.0.4 idna-2.7 mxnet-cu90-1.2.0 requests-2.19.1 urllib3-1.23\n```\n\nIf you want MXNet to be the default Keras backend, define a system environment variable named `KERAS_BACKEND` with the value `mxnet`.\n\n### Installing `pytorch` 0.4.0\n\nPyTorch is Facebook AI Research (FAIR)'s answer to Google's Tensorflow.  Only with version v0.4.0 does it **officially** support Windows (x64). 
Setup requires installing `pytorch`, `cuda90`, and `torchvision` so, first, run the following command:\n\n```\n(dlwin36) $ conda install --yes pytorch==0.4.0 cuda90 -c pytorch\nSolving environment: done\n\n## Package Plan ##\n\n  environment location: e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\n\n  added / updated specs:\n    - cuda90\n    - pytorch==0.4.0\n\n\nThe following packages will be downloaded:\n\n    package                    |            build\n    ---------------------------|-----------------\n    cuda90-1.0                 |                0           2 KB  pytorch\n    certifi-2018.4.16          |           py36_0         143 KB\n    pytorch-0.4.0              |py36_cuda90_cudnn7he774522_1       577.6 MB  pytorch\n    ------------------------------------------------------------\n                                           Total:       577.7 MB\n\nThe following NEW packages will be INSTALLED:\n\n    cffi:      1.11.5-py36h945400d_0\n    cuda90:    1.0-0                              pytorch\n    pycparser: 2.18-py36hd053e01_1\n    pytorch:   0.4.0-py36_cuda90_cudnn7he774522_1 pytorch     [cuda90]\n\nThe following packages will be UPDATED:\n\n    certifi:   2018.4.16-py36_0                   conda-forge --\u003e 2018.4.16-py36_0\n\n\nDownloading and Extracting Packages\ncuda90-1.0           |    2 KB | ############################################################################## | 100%\ncertifi-2018.4.16    |  143 KB | ############################################################################## | 100%\npytorch-0.4.0        | 577.6 MB | ############################################################################# | 100%\nPreparing transaction: done\nVerifying transaction: done\nExecuting transaction: done\n```\n\nSecond, install `torchvision` with this command:\n\n```\n(dlwin36) $ pip install torchvision==0.2.1\nCollecting torchvision==0.2.1\n  Using cached 
https://files.pythonhosted.org/packages/ca/0d/f00b2885711e08bd71242ebe7b96561e6f6d01fdb4b9dcf4d37e2e13c5e1/torchvision-0.2.1-py2.py3-none-any.whl\nRequirement already satisfied: numpy in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (1.14.5)\nRequirement already satisfied: pillow\u003e=4.1.1 in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (5.1.0)\nRequirement already satisfied: six in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (1.11.0)\nRequirement already satisfied: torch in e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages (from torchvision==0.2.1) (0.4.0)\ndistributed 1.22.0 requires msgpack, which is not installed.\nInstalling collected packages: torchvision\nSuccessfully installed torchvision-0.2.1\n```\n\nIf you have issues with PyTorch on Windows, I highly recommend reading their [Windows FAQ](http://pytorch.org/docs/stable/notes/windows.html).\n\n## Quick checks\n\n### Checking the list of Python libraries installed\n\nYou should end up with the following list of libraries in your `dlwin36` conda environment:\n\n```\n(dlwin36) $ conda list\n# packages in environment at e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36:\n#\n# Name                    Version                   Build  Channel\nabsl-py                   0.2.2                     \u003cpip\u003e\nastor                     0.6.2                     \u003cpip\u003e\nbackcall                  0.1.0                    py36_0  \nblas                      1.0                         mkl  \nbleach                    1.5.0                     \u003cpip\u003e\nbleach                    2.1.3                    py36_0  \nbokeh                     0.12.16                  py36_0  \nca-certificates           2018.4.16                     0    conda-forge\ncertifi                   2018.4.16                py36_0  \ncffi                      
1.11.5           py36h945400d_0  \nchardet                   3.0.4                     \u003cpip\u003e\nclick                     6.7              py36hec8c647_0  \ncloudpickle               0.5.3                    py36_0  \ncntk-gpu                  2.5.1                     \u003cpip\u003e\ncolorama                  0.3.9            py36h029ae33_0  \ncuda90                    1.0                           0    pytorch\ncycler                    0.10.0           py36h009560c_0  \ncython                    0.28.3           py36hfa6e2cd_0  \ncytoolz                   0.9.0.1          py36hfa6e2cd_0  \ndask                      0.18.0                   py36_0  \ndask-core                 0.18.0                   py36_0  \ndecorator                 4.3.0                    py36_0  \ndistributed               1.22.0                   py36_0  \nentrypoints               0.2.3            py36hfd66bb0_2  \nfreetype                  2.8.1                    vc14_0  [vc14]  conda-forge\ngast                      0.2.0                     \u003cpip\u003e\ngraphviz                  0.8.3                     \u003cpip\u003e\ngrpcio                    1.12.1                    \u003cpip\u003e\nh5py                      2.8.0            py36h3bdd7fb_0  \nhdf5                      1.10.2                   vc14_0  [vc14]  conda-forge\nheapdict                  1.0.0                    py36_2  \nhtml5lib                  1.0.1            py36h047fa9f_0  \nhtml5lib                  0.9999999                 \u003cpip\u003e\nicc_rt                    2017.0.4             h97af966_0  \nicu                       58.2                     vc14_0  [vc14]  conda-forge\nidna                      2.7                       \u003cpip\u003e\nimageio                   2.3.0                    py36_0  \nimgaug                    0.2.5                     \u003cpip\u003e\nintel-openmp              2018.0.3                      0  \nipykernel                 4.8.2                    py36_0  
\nipython                   6.4.0                    py36_0  \nipython_genutils          0.2.0            py36h3c5d0ee_0  \nipywidgets                7.2.1                    py36_0  \njedi                      0.12.0                   py36_1  \njinja2                    2.10             py36h292fed1_0  \njpeg                      9b                       vc14_2  [vc14]  conda-forge\njsonschema                2.6.0            py36h7636477_0  \njupyter                   1.0.0                    py36_4  \njupyter_client            5.2.3                    py36_0  \njupyter_console           5.2.0            py36h6d89b47_1  \njupyter_core              4.4.0            py36h56e9d50_0  \nKeras                     2.1.6                     \u003cpip\u003e\nkiwisolver                1.0.1            py36h12c3424_0  \nlibpng                    1.6.34                   vc14_0  [vc14]  conda-forge\nlibpython                 2.1                      py36_0  \nlibsodium                 1.0.16                   vc14_0  [vc14]  conda-forge\nlibtiff                   4.0.9                    vc14_0  [vc14]  conda-forge\nlibwebp                   0.5.2                    vc14_7  [vc14]  conda-forge\nlocket                    0.2.0            py36hfed976d_1  \nm2w64-binutils            2.25.1                        5  \nm2w64-bzip2               1.0.6                         6  \nm2w64-crt-git             5.0.0.4636.2595836               2  \nm2w64-gcc                 5.3.0                         6  \nm2w64-gcc-ada             5.3.0                         6  \nm2w64-gcc-fortran         5.3.0                         6  \nm2w64-gcc-libgfortran     5.3.0                         6  \nm2w64-gcc-libs            5.3.0                         7  \nm2w64-gcc-libs-core       5.3.0                         7  \nm2w64-gcc-objc            5.3.0                         6  \nm2w64-gmp                 6.1.0                         2  \nm2w64-headers-git         5.0.0.4636.c0ad18a               2  
\nm2w64-isl                 0.16.1                        2  \nm2w64-libiconv            1.14                          6  \nm2w64-libmangle-git       5.0.0.4509.2e5a9a2               2  \nm2w64-libwinpthread-git   5.0.0.4634.697f757               2  \nm2w64-make                4.1.2351.a80a8b8               2  \nm2w64-mpc                 1.0.3                         3  \nm2w64-mpfr                3.1.4                         4  \nm2w64-pkg-config          0.29.1                        2  \nm2w64-toolchain           5.3.0                         7  \nm2w64-tools-git           5.0.0.4592.90b8472               2  \nm2w64-windows-default-manifest 6.4                           3  \nm2w64-winpthreads-git     5.0.0.4634.697f757               2  \nm2w64-zlib                1.2.8                        10  \nMarkdown                  2.6.11                    \u003cpip\u003e\nmarkupsafe                1.0              py36h0e26971_1  \nmatplotlib                2.2.2                    py36_1    conda-forge\nmistune                   0.8.3            py36hfa6e2cd_1  \nmkl                       2018.0.3                      1  \nmkl-service               1.1.2            py36h57e144c_4  \nmkl_fft                   1.0.1            py36h452e1ab_0  \nmkl_random                1.0.1            py36h9258bd6_0  \nmsgpack-python            0.5.6            py36he980bc4_0  \nmsys2-conda-epoch         20160418                      1  \nmxnet-cu90                1.2.0                     \u003cpip\u003e\nnbconvert                 5.3.1            py36h8dc0fde_0  \nnbformat                  4.4.0            py36h3a5bc1b_0  \nnetworkx                  2.1                      py36_0  \nnotebook                  5.5.0                    py36_0  \nnumpy                     1.14.5           py36h9fa60d3_0  \nnumpy-base                1.14.5           py36h5c71026_0  \nolefile                   0.45.1                   py36_0  \nopencv                    3.4.1                  py36_200   
 conda-forge\nopenssl                   1.0.2o                   vc14_0  [vc14]  conda-forge\npackaging                 17.1                     py36_0  \npandas                    0.23.1           py36h830ac7b_0  \npandoc                    1.19.2.1             hb2460c7_1  \npandocfilters             1.4.2            py36h3ef6317_1  \nparso                     0.2.1                    py36_0  \npartd                     0.3.8            py36hc8e763b_0  \npickleshare               0.7.4            py36h9de030f_0  \npillow                    5.1.0            py36h0738816_0  \npip                       10.0.1                   py36_0  \nprompt_toolkit            1.0.15           py36h60b8f86_0  \nprotobuf                  3.6.0                     \u003cpip\u003e\npsutil                    5.4.6            py36hfa6e2cd_0  \npycparser                 2.18             py36hd053e01_1  \npygments                  2.2.0            py36hb010967_0  \npyparsing                 2.2.0            py36h785a196_1  \npyqt                      5.6.0                    py36_2  \npython                    3.6.5                h0c2934d_0  \npython-dateutil           2.7.3                    py36_0  \npytorch                   0.4.0           py36_cuda90_cudnn7he774522_1  [cuda90]  pytorch\npytz                      2018.4                   py36_0  \npywavelets                0.5.2            py36hc649158_0  \npywinpty                  0.5.4                    py36_0  \npyyaml                    3.12             py36h1d1928f_1  \npyzmq                     17.0.0           py36hfa6e2cd_1  \nqt                        5.6.2                    vc14_1  [vc14]  conda-forge\nqtconsole                 4.3.1            py36h99a29a9_0  \nrequests                  2.19.1                    \u003cpip\u003e\nscikit-image              0.13.1           py36hfa6e2cd_1  \nscikit-learn              0.19.1           py36h53aea1b_0  \nscipy                     1.1.0            py36h672f292_0  \nsend2trash 
               1.5.0                    py36_0  \nsetuptools                39.2.0                   py36_0  \nsimplegeneric             0.8.1                    py36_2  \nsip                       4.19.8           py36h6538335_0  \nsix                       1.11.0           py36h4db2310_1  \nsortedcontainers          2.0.4                    py36_0  \nsqlite                    3.22.0                   vc14_0  [vc14]  conda-forge\ntblib                     1.3.2            py36h30f5020_0  \ntensorboard               1.8.0                     \u003cpip\u003e\ntensorflow-gpu            1.8.0                     \u003cpip\u003e\ntermcolor                 1.1.0                     \u003cpip\u003e\nterminado                 0.8.1                    py36_1  \ntestpath                  0.3.1            py36h2698cfe_0  \ntk                        8.6.7                    vc14_0  [vc14]  conda-forge\ntoolz                     0.9.0                    py36_0  \ntorchvision               0.2.1                     \u003cpip\u003e\ntornado                   5.0.2                    py36_0  \ntqdm                      4.23.4                   py36_0  \ntraitlets                 4.3.2            py36h096827d_0  \nurllib3                   1.23                      \u003cpip\u003e\nvc                        14                   h0510ff6_3  \nvs2015_runtime            14.0.25123                    3  \nwcwidth                   0.1.7            py36h3d5aa90_0  \nwebencodings              0.5.1            py36h67c50ae_1  \nWerkzeug                  0.14.1                    \u003cpip\u003e\nwheel                     0.31.1                   py36_0  \nwidgetsnbextension        3.2.1                    py36_0  \nwincertstore              0.2              py36h7fe50ca_0  \nwinpty                    0.4.3                         4  \nyaml                      0.1.7                    vc14_0  [vc14]  conda-forge\nzeromq                    4.2.5                    vc14_1  [vc14]  
conda-forge\nzict                      0.1.3            py36h2d8e73e_0  \nzlib                      1.2.11                   vc14_0  [vc14]  conda-forge\n```\n\n### Checking our PATH sysenv var\n\nAt this point, whenever the `dlwin36` conda environment is active, the `PATH` environment variable should look something like:\n\n```\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\mingw-w64\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\usr\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Library\\bin\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\Scripts\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\bin\nE:\\toolkits.win\\cuda-9.0.176\\bin\nE:\\toolkits.win\\cuda-9.0.176\\libnvvp\ne:\\toolkits.win\\anaconda3-5.2.0\ne:\\toolkits.win\\anaconda3-5.2.0\\Scripts\ne:\\toolkits.win\\anaconda3-5.2.0\\Library\\bin\nC:\\ProgramData\\Oracle\\Java\\javapath\nC:\\WINDOWS\\system32\nC:\\WINDOWS\nC:\\WINDOWS\\System32\\Wbem\nC:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\\nC:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\nC:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin\nC:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\\nC:\\Program Files\\Git\\cmd\nC:\\Program Files\\Git\\mingw64\\bin\nC:\\Program Files\\Git\\usr\\bin\nC:\\WINDOWS\\System32\\OpenSSH\\\n...\n```\n\n\u003e Note: To get a line-by-line display of the directories on your path (as shown above), enter this incantation at a command prompt: `ECHO.%PATH:;= \u0026 ECHO.%`.\n\n### Quick-checking each main Python library install\n\nTo do a quick check of the installed backends, run the following:\n\n```\n(dlwin36) $ python -c \"import tensorflow; print('tensorflow: %s, %s' % (tensorflow.__version__, tensorflow.__file__))\"\ntensorflow: 1.8.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\tensorflow\\__init__.py\n(dlwin36) $ python -c \"import cntk; print('cntk: 
%s, %s' % (cntk.__version__, cntk.__file__))\"\ncntk: 2.5.1, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\cntk\\__init__.py\n(dlwin36) $ python -c \"import mxnet; print('mxnet: %s, %s' % (mxnet.__version__, mxnet.__file__))\"\nmxnet: 1.2.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\__init__.py\n(dlwin36) $ python -c \"import keras; print('keras: %s, %s' % (keras.__version__, keras.__file__))\"\nUsing TensorFlow backend.\nkeras: 2.1.6, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\__init__.py\n(dlwin36) $ python -c \"import torch; print('torch: %s, %s' % (torch.__version__, torch.__file__))\"\ntorch: 0.4.0, e:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\torch\\__init__.py\n```\n\n## GPU tests\n\n### Validating our GPU install with Keras\n\nWe can train a simple convnet ([convolutional neural network](https://en.wikipedia.org/wiki/Convolutional_neural_network)) on the [MNIST dataset](https://en.wikipedia.org/wiki/MNIST_database) by using one of the example scripts provided with Keras. The file is called `mnist_cnn.py` and can be found in Keras' `examples` folder, [here](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py). 
The code is as follows:\n\n```python\n'''Trains a simple convnet on the MNIST dataset.\n\nGets to 99.25% test accuracy after 12 epochs\n(there is still a lot of margin for parameter tuning).\n16 seconds per epoch on a GRID K520 GPU.\n'''\n\nfrom __future__ import print_function\nimport keras\nfrom keras.datasets import mnist\nfrom keras.models import Sequential\nfrom keras.layers import Dense, Dropout, Flatten\nfrom keras.layers import Conv2D, MaxPooling2D\nfrom keras import backend as K\n\nbatch_size = 128\nnum_classes = 10\nepochs = 12\n\n# input image dimensions\nimg_rows, img_cols = 28, 28\n\n# the data, split between train and test sets\n(x_train, y_train), (x_test, y_test) = mnist.load_data()\n\nif K.image_data_format() == 'channels_first':\n    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)\n    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)\n    input_shape = (1, img_rows, img_cols)\nelse:\n    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n    input_shape = (img_rows, img_cols, 1)\n\nx_train = x_train.astype('float32')\nx_test = x_test.astype('float32')\nx_train /= 255\nx_test /= 255\nprint('x_train shape:', x_train.shape)\nprint(x_train.shape[0], 'train samples')\nprint(x_test.shape[0], 'test samples')\n\n# convert class vectors to binary class matrices\ny_train = keras.utils.to_categorical(y_train, num_classes)\ny_test = keras.utils.to_categorical(y_test, num_classes)\n\nmodel = Sequential()\nmodel.add(Conv2D(32, kernel_size=(3, 3),\n                 activation='relu',\n                 input_shape=input_shape))\nmodel.add(Conv2D(64, (3, 3), activation='relu'))\nmodel.add(MaxPooling2D(pool_size=(2, 2)))\nmodel.add(Dropout(0.25))\nmodel.add(Flatten())\nmodel.add(Dense(128, activation='relu'))\nmodel.add(Dropout(0.5))\nmodel.add(Dense(num_classes, activation='softmax'))\n\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n 
             optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'])\n\nmodel.fit(x_train, y_train,\n          batch_size=batch_size,\n          epochs=epochs,\n          verbose=1,\n          validation_data=(x_test, y_test))\nscore = model.evaluate(x_test, y_test, verbose=0)\nprint('Test loss:', score[0])\nprint('Test accuracy:', score[1])\n```\n\n### Keras with Tensorflow backend (GPU disabled)\n\nTo activate and test the Tensorflow backend in **CPU-only mode**, and get a good baseline to compare against, use the following commands:\n\n```\n(dlwin36) $ set KERAS_BACKEND=tensorflow\n(dlwin36) $ set CUDA_VISIBLE_DEVICES=-1\n(dlwin36) $ python mnist_cnn.py\nUsing TensorFlow backend.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1/12\n2018-06-15 11:59:57.047920: I T:\\src\\github\\tensorflow\\tensorflow\\core\\platform\\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2\n2018-06-15 11:59:58.152643: E T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE\n2018-06-15 11:59:58.164753: I T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: SERVERP\n2018-06-15 11:59:58.173767: I T:\\src\\github\\tensorflow\\tensorflow\\stream_executor\\cuda\\cuda_diagnostics.cc:165] hostname: SERVERP\n60000/60000 [==============================] - 60s 997us/step - loss: 0.2603 - acc: 0.9195 - val_loss: 0.0502 - val_acc: 0.9836\nEpoch 2/12\n60000/60000 [==============================] - 57s 952us/step - loss: 0.0873 - acc: 0.9734 - val_loss: 0.0390 - val_acc: 0.9868\nEpoch 3/12\n60000/60000 [==============================] - 57s 947us/step - loss: 0.0657 - acc: 0.9803 - val_loss: 0.0346 - val_acc: 0.9888\nEpoch 4/12\n60000/60000 [==============================] - 
57s 945us/step - loss: 0.0543 - acc: 0.9842 - val_loss: 0.0348 - val_acc: 0.9886\nEpoch 5/12\n60000/60000 [==============================] - 56s 941us/step - loss: 0.0470 - acc: 0.9862 - val_loss: 0.0354 - val_acc: 0.9878\nEpoch 6/12\n60000/60000 [==============================] - 56s 939us/step - loss: 0.0410 - acc: 0.9871 - val_loss: 0.0290 - val_acc: 0.9905\nEpoch 7/12\n60000/60000 [==============================] - 56s 941us/step - loss: 0.0369 - acc: 0.9888 - val_loss: 0.0290 - val_acc: 0.9901\nEpoch 8/12\n60000/60000 [==============================] - 58s 960us/step - loss: 0.0337 - acc: 0.9892 - val_loss: 0.0261 - val_acc: 0.9916\nEpoch 9/12\n60000/60000 [==============================] - 57s 953us/step - loss: 0.0313 - acc: 0.9904 - val_loss: 0.0291 - val_acc: 0.9906\nEpoch 10/12\n60000/60000 [==============================] - 57s 958us/step - loss: 0.0286 - acc: 0.9913 - val_loss: 0.0317 - val_acc: 0.9889\nEpoch 11/12\n60000/60000 [==============================] - 58s 961us/step - loss: 0.0269 - acc: 0.9915 - val_loss: 0.0290 - val_acc: 0.9914\nEpoch 12/12\n60000/60000 [==============================] - 59s 976us/step - loss: 0.0270 - acc: 0.9915 - val_loss: 0.0304 - val_acc: 0.9916\nTest loss: 0.030398282517803726\nTest accuracy: 0.9916\n```\n\n\u003e Note: If you've run the sequence of commands above, to restore CUDA's ability to detect the presence of your GPU(s), just set the environment variable `CUDA_VISIBLE_DEVICES` to the list of IDs of the installed GPU devices on your machine. In other words, if you have only one GPU, use `set CUDA_VISIBLE_DEVICES=0`. If you have two GPUs, use `set CUDA_VISIBLE_DEVICES=0,1`. 
And so on.\n\n### Keras with Tensorflow backend (using GPU)\n\nTo activate and test the Tensorflow backend, use the following commands:\n\n```\n(dlwin36) $ set KERAS_BACKEND=tensorflow\n(dlwin36) $ python mnist_cnn.py\nUsing TensorFlow backend.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1/12\n2018-06-15 12:14:21.774082: I T:\\src\\github\\tensorflow\\tensorflow\\core\\platform\\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2\n2018-06-15 12:14:22.219436: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1356] Found device 0 with properties:\nname: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.645\npciBusID: 0000:04:00.0\ntotalMemory: 11.00GiB freeMemory: 9.09GiB\n2018-06-15 12:14:22.345166: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1356] Found device 1 with properties:\nname: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076\npciBusID: 0000:03:00.0\ntotalMemory: 12.00GiB freeMemory: 10.06GiB\n2018-06-15 12:14:22.360064: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1435] Adding visible gpu devices: 0, 1\n2018-06-15 12:14:23.731981: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:\n2018-06-15 12:14:23.741080: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:929]      0 1\n2018-06-15 12:14:23.747608: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:942] 0:   N N\n2018-06-15 12:14:23.753642: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:942] 1:   N N\n2018-06-15 12:14:23.759825: I 
T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8804 MB memory) -\u003e physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)\n2018-06-15 12:14:24.168800: I T:\\src\\github\\tensorflow\\tensorflow\\core\\common_runtime\\gpu\\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 9737 MB memory) -\u003e physical GPU (device: 1, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0, compute capability: 5.2)\n60000/60000 [==============================] - 10s 161us/step - loss: 0.2613 - acc: 0.9198 - val_loss: 0.0563 - val_acc: 0.9811\nEpoch 2/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0875 - acc: 0.9743 - val_loss: 0.0435 - val_acc: 0.9853\nEpoch 3/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0652 - acc: 0.9808 - val_loss: 0.0338 - val_acc: 0.9886\nEpoch 4/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0531 - acc: 0.9844 - val_loss: 0.0324 - val_acc: 0.9896\nEpoch 5/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0466 - acc: 0.9861 - val_loss: 0.0307 - val_acc: 0.9895\nEpoch 6/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0421 - acc: 0.9869 - val_loss: 0.0323 - val_acc: 0.9906\nEpoch 7/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0402 - acc: 0.9879 - val_loss: 0.0286 - val_acc: 0.9907\nEpoch 8/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0326 - acc: 0.9896 - val_loss: 0.0299 - val_acc: 0.9909\nEpoch 9/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0311 - acc: 0.9907 - val_loss: 0.0262 - val_acc: 0.9922\nEpoch 10/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0310 - acc: 0.9902 - val_loss: 0.0256 - val_acc: 
0.9918\nEpoch 11/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0267 - acc: 0.9914 - val_loss: 0.0310 - val_acc: 0.9905\nEpoch 12/12\n60000/60000 [==============================] - 4s 71us/step - loss: 0.0262 - acc: 0.9917 - val_loss: 0.0281 - val_acc: 0.9919\nTest loss: 0.028108230106867086\nTest accuracy: 0.9919\n```\n\nKeras with the tensorflow backend operating in GPU-accelerated mode is about **14.5 times faster** than in CPU mode (58/4=14.5).\n\n### Keras with CNTK backend (using GPU)\n\nTo activate and test the CNTK backend, use the following commands:\n\n```\n(dlwin36) $ set KERAS_BACKEND=cntk\n(dlwin36) $ python mnist_cnn.py\nUsing CNTK backend\nSelected GPU[0] GeForce GTX 1080 Ti as the process wide default device.\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1/12\n60000/60000 [==============================] - 7s 110us/step - loss: 0.2594 - acc: 0.9211 - val_loss: 0.0561 - val_acc: 0.9806\nEpoch 2/12\n60000/60000 [==============================] - 6s 93us/step - loss: 0.0855 - acc: 0.9752 - val_loss: 0.0425 - val_acc: 0.9864\nEpoch 3/12\n60000/60000 [==============================] - 6s 93us/step - loss: 0.0646 - acc: 0.9805 - val_loss: 0.0327 - val_acc: 0.9887\nEpoch 4/12\n60000/60000 [==============================] - 6s 93us/step - loss: 0.0537 - acc: 0.9839 - val_loss: 0.0303 - val_acc: 0.9892\nEpoch 5/12\n60000/60000 [==============================] - 6s 94us/step - loss: 0.0466 - acc: 0.9863 - val_loss: 0.0280 - val_acc: 0.9906\nEpoch 6/12\n60000/60000 [==============================] - 6s 93us/step - loss: 0.0410 - acc: 0.9872 - val_loss: 0.0289 - val_acc: 0.9916\nEpoch 7/12\n60000/60000 [==============================] - 6s 93us/step - loss: 0.0356 - acc: 0.9896 - val_loss: 0.0278 - val_acc: 0.9917\nEpoch 8/12\n60000/60000 [==============================] - 6s 94us/step - loss: 0.0341 - acc: 0.9899 - val_loss: 0.0293 - val_acc: 
0.9905\nEpoch 9/12\n60000/60000 [==============================] - 6s 94us/step - loss: 0.0325 - acc: 0.9903 - val_loss: 0.0249 - val_acc: 0.9920\nEpoch 10/12\n60000/60000 [==============================] - 6s 94us/step - loss: 0.0302 - acc: 0.9903 - val_loss: 0.0275 - val_acc: 0.9910\nEpoch 11/12\n60000/60000 [==============================] - 6s 94us/step - loss: 0.0277 - acc: 0.9913 - val_loss: 0.0258 - val_acc: 0.9915\nEpoch 12/12\n60000/60000 [==============================] - 6s 94us/step - loss: 0.0253 - acc: 0.9923 - val_loss: 0.0277 - val_acc: 0.9906\nTest loss: 0.027684621373889287\nTest accuracy: 0.9906\n```\n\nIn this specific experiment, CNTK in GPU mode is fast but not as fast as Tensorflow.\n\n### Keras with MXNet backend (using GPU)\n\nTo activate and test the MXNet backend, use the following command:\n\n```\n(dlwin36) $ set KERAS_BACKEND=mxnet\n```\n\nPlease note that, at the time of this writing, per [issue #106](https://github.com/awslabs/keras-apache-mxnet/issues/106), the same Keras code will not yet run unmodified with MXNet on GPU. 
You will need to modify **ONE LINE** in the sample file `mnist_cnn.py` as shown here:\n\n```python\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'])\n```\n\nshould be:\n\n```python\nmodel.compile(loss=keras.losses.categorical_crossentropy,\n              optimizer=keras.optimizers.Adadelta(),\n              metrics=['accuracy'],\n              context= [\"gpu(0)\"])\n```\n\nAlternatively, use the file [`mnist_cnn_mxnet.py`](mnist_cnn_mxnet.py) (it includes the change above) included in this repo, as follows:\n\n```\n(dlwin36) $ set KERAS_BACKEND=mxnet\n(dlwin36) $ python mnist_cnn_mxnet.py\nUsing MXNet backend\nx_train shape: (60000, 28, 28, 1)\n60000 train samples\n10000 test samples\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:89: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https://github.com/awslabs/keras-apache-mxnet/tree/master/docs/mxnet_backend/performance_guide.md\n  train_symbol = func(*args, **kwargs)\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:92: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. 
For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https://github.com/awslabs/keras-apache-mxnet/tree/master/docs/mxnet_backend/performance_guide.md\n  test_symbol = func(*args, **kwargs)\nTrain on 60000 samples, validate on 10000 samples\nEpoch 1/12\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\module\\bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.0078125). Is this intended?\n  force_init=force_init)\n[04:55:20] c:\\jenkins\\workspace\\mxnet-tag\\mxnet\\src\\operator\\nn\\cudnn\\./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)\n60000/60000 [==============================] - 12s 192us/step - loss: 0.3480 - acc: 0.8934 - val_loss: 0.0817 - val_acc: 0.9743\nEpoch 2/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.1177 - acc: 0.9660 - val_loss: 0.0524 - val_acc: 0.9828\nEpoch 3/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0859 - acc: 0.9750 - val_loss: 0.0432 - val_acc: 0.9857\nEpoch 4/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0704 - acc: 0.9792 - val_loss: 0.0363 - val_acc: 0.9882\nEpoch 5/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0608 - acc: 0.9817 - val_loss: 0.0344 - val_acc: 0.9884\nEpoch 6/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0561 - acc: 0.9839 - val_loss: 0.0328 - val_acc: 0.9889\nEpoch 7/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0503 - acc: 0.9853 - 
val_loss: 0.0322 - val_acc: 0.9890\nEpoch 8/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0473 - acc: 0.9860 - val_loss: 0.0290 - val_acc: 0.9905\nEpoch 9/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0440 - acc: 0.9870 - val_loss: 0.0304 - val_acc: 0.9899\nEpoch 10/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0413 - acc: 0.9877 - val_loss: 0.0280 - val_acc: 0.9906\nEpoch 11/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0388 - acc: 0.9888 - val_loss: 0.0281 - val_acc: 0.9913\nEpoch 12/12\n60000/60000 [==============================] - 7s 119us/step - loss: 0.0382 - acc: 0.9883 - val_loss: 0.0285 - val_acc: 0.9904\nTest loss: 0.028510591367455346\nTest accuracy: 0.9904\n```\n\nFrom this single experiment, MXNet appears to be the slowest of the three Keras backends. If you are set on using MXNet, however, you may want to implement the changes in the warning above:\n\n```\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\keras\\backend\\mxnet_backend.py:89: UserWarning: MXNet Backend performs best with `channels_first` format. Using `channels_last` will significantly reduce performance due to the Transpose operations. 
For performance improvement, please use this API`keras.utils.to_channels_first(x_input)`to transform `channels_last` data to `channels_first` format and also please change the `image_data_format` in `keras.json` to `channels_first`.Note: `x_input` is a Numpy tensor or a list of Numpy tensorRefer to: https://github.com/awslabs/keras-apache-mxnet/tree/master/docs/mxnet_backend/performance_guide.md\n  train_symbol = func(*args, **kwargs)\n```\n\nYou can use the following lines to effect those changes:\n\n```\n(dlwin36) $ %SystemDrive%\n(dlwin36) $ cd %USERPROFILE%\\.keras\n(dlwin36) $ cp keras.json keras.json.bak\n(dlwin36) $ (echo { \u0026 echo     \"image_data_format\": \"channels_first\", \u0026 echo     \"epsilon\": 1e-07, \u0026 echo     \"floatx\": \"float32\", \u0026 echo     \"backend\": \"mxnet\" \u0026 echo }) \u003e keras_mxnet.json\n(dlwin36) $ (echo { \u0026 echo     \"image_data_format\": \"channels_last\", \u0026 echo     \"epsilon\": 1e-07, \u0026 echo     \"floatx\": \"float32\", \u0026 echo     \"backend\": \"tensorflow\" \u0026 echo }) \u003e keras_tensorflow.json\n(dlwin36) $ (echo { \u0026 echo     \"image_data_format\": \"channels_last\", \u0026 echo     \"epsilon\": 1e-07, \u0026 echo     \"floatx\": \"float32\", \u0026 echo     \"backend\": \"cntk\" \u0026 echo }) \u003e keras_cntk.json\n(dlwin36) $ cp -f keras_mxnet.json keras.json\n```\n\nNote 1: If you want to go back to TensorFlow or CNTK after this, all you have to do is copy the proper `json` file to `keras.json` (e.g., `cp -f keras_tensorflow.json keras.json`) and set `KERAS_BACKEND` to the matching framework (e.g., `set KERAS_BACKEND=tensorflow`).\n\nNote 2: After switching to the `channels_first` channel ordering, I got the following results:\n\n```\n(dlwin36) $ python mnist_cnn_mxnet.py\nUsing MXNet backend\nx_train shape: (60000, 1, 28, 28)\n60000 train samples\n10000 test samples\nTrain on 60000 samples, validate on 10000 samples\nEpoch 
1/12\ne:\\toolkits.win\\anaconda3-5.2.0\\envs\\dlwin36\\lib\\site-packages\\mxnet\\module\\bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.0078125). Is this intended?\n  force_init=force_init)\n[05:39:39] c:\\jenkins\\workspace\\mxnet-tag\\mxnet\\src\\operator\\nn\\cudnn\\./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)\n60000/60000 [==============================] - 9s 152us/step - loss: 0.3485 - acc: 0.8923 - val_loss: 0.0851 - val_acc: 0.9732\nEpoch 2/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.1191 - acc: 0.9652 - val_loss: 0.0529 - val_acc: 0.9824\nEpoch 3/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0874 - acc: 0.9741 - val_loss: 0.0435 - val_acc: 0.9865\nEpoch 4/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0740 - acc: 0.9784 - val_loss: 0.0402 - val_acc: 0.9867\nEpoch 5/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0642 - acc: 0.9809 - val_loss: 0.0328 - val_acc: 0.9884\nEpoch 6/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0585 - acc: 0.9826 - val_loss: 0.0346 - val_acc: 0.9897\nEpoch 7/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0534 - acc: 0.9843 - val_loss: 0.0315 - val_acc: 0.9889\nEpoch 8/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0491 - acc: 0.9852 - val_loss: 0.0336 - val_acc: 0.9888\nEpoch 9/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0441 - acc: 0.9865 - val_loss: 0.0302 - val_acc: 0.9899\nEpoch 10/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0421 - acc: 0.9877 - val_loss: 0.0303 - val_acc: 0.9903\nEpoch 11/12\n60000/60000 
[==============================] - 7s 109us/step - loss: 0.0404 - acc: 0.9878 - val_loss: 0.0294 - val_acc: 0.9903\nEpoch 12/12\n60000/60000 [==============================] - 7s 109us/step - loss: 0.0381 - acc: 0.9889 - val_loss: 0.0272 - val_acc: 0.9904\nTest loss: 0.027214839413274603\nTest accuracy: 0.9904\n```\n\nThis is a bit faster, but not as fast as Keras with a CNTK or TensorFlow backend.\n\n### Validating our GPU install with PyTorch\n\nHere too, we can train a convnet on the MNIST dataset with a network similar to the one used in the Keras case by modifying a sample from PyTorch's `examples` [folder](https://github.com/pytorch/examples/blob/master/mnist/main.py). The new code is as follows:\n\n```python\nfrom __future__ import print_function\nimport sys, argparse\nfrom time import time\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.optim as optim\nfrom torchvision import datasets, transforms\n\ntracker_length = 30\n\nclass Net(nn.Module):\n    def __init__(self):\n        super(Net, self).__init__()\n        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)\n        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)\n        self.fc1 = nn.Linear(12*12*64, 128)\n        self.fc2 = nn.Linear(128, 10)\n\n    def forward(self, x):\n        x = F.relu(self.conv1(x))      # 28x28x1 -\u003e 26x26x32\n        x = F.relu(self.conv2(x))      # 26x26x32 -\u003e 24x24x64\n        x = F.max_pool2d(x, 2) # 24x24x64 -\u003e 12x12x64\n        x = F.dropout(x, p=0.25, training=self.training)\n        x = x.view(-1, 12*12*64)       # flatten 12x12x64 = 9216\n        x = F.relu(self.fc1(x))        # fc 9216 -\u003e 128\n        x = F.dropout(x, p=0.5, training=self.training)\n        x = self.fc2(x)                # fc 128 -\u003e 10\n        return F.log_softmax(x, dim=1) # log-probabilities over the 10 classes\n\ndef train(args, model, device, train_loader, optimizer):\n    model.train()\n    start_time = time()\n\n    for batch_idx, (data, target) in 
enumerate(train_loader):\n        data, target = data.to(device), target.to(device)\n        optimizer.zero_grad()\n        output = model(data)\n        loss = F.nll_loss(output, target)\n        loss.backward()\n        optimizer.step()\n        if batch_idx % args.log_interval == 0:\n            percentage = 100. * batch_idx / len(train_loader)\n            cur_length = int((tracker_length * int(percentage)) / 100)\n            bar = '=' * cur_length + '\u003e' + '-' * (tracker_length - cur_length)\n            sys.stdout.write('\\r{}/{} [{}] - loss: {:.4f}'.format(\n                batch_idx * len(data), len(train_loader.dataset),\n                bar, loss.item()))\n            sys.stdout.flush()\n\n    train_time = time() - start_time\n    sys.stdout.write('\\r{}/{} [{}] - {:.1f}s {:.1f}us/step - loss: {:.4f}'.format(\n        len(train_loader.dataset), len(train_loader.dataset), '=' * tracker_length, \n        train_time, (train_time / len(train_loader.dataset)) * 1000000.0, loss.item()))\n    sys.stdout.flush()\n\n    return len(train_loader.dataset), train_time, loss.item()\n\ndef test(args, model, device, test_loader):\n    model.eval()\n    test_loss = 0\n    correct = 0\n\n    with torch.no_grad():\n        for data, target in test_loader:\n            data, target = data.to(device), target.to(device)\n            output = model(data)\n            test_loss += F.nll_loss(output, target, size_average=False).item() # sum up batch loss\n            pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability\n            correct += pred.eq(target.view_as(pred)).sum().item()\n\n    test_loss /= len(test_loader.dataset)\n    test_accuracy = correct / len(test_loader.dataset)\n\n    return test_loss, test_accuracy\n\ndef main():\n    # Training settings\n    parser = argparse.ArgumentParser(description='PyTorch MNIST Example')\n    parser.add_argument('--batch-size', type=int, default=64, metavar='N',\n                        help='input 
batch size for training (default: 64)')\n    parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',\n                        help='input batch size for testing (default: 1000)')\n    parser.add_argument('--epochs', type=int, default=12, metavar='N',\n                        help='number of epochs to train (default: 12)')\n    parser.add_argument('--lr', type=float, default=0.01, metavar='LR',\n                        help='learning rate (default: 0.01)')\n    parser.add_argument('--momentum', type=float, default=0.5, metavar='M',\n                        help='SGD momentum (default: 0.5)')\n    parser.add_argument('--no-cuda', action='store_true', default=False,\n                        help='disables CUDA training')\n    parser.add_argument('--seed', type=int, default=1, metavar='S',\n                        help='random seed (default: 1)')\n    parser.add_argument('--log-interval', type=int, default=10, metavar='N',\n                        help='how many batches to wait before logging training status')\n    args = parser.parse_args()\n    use_cuda = not args.no_cuda and torch.cuda.is_available()\n\n    torch.manual_seed(args.seed)\n\n    device = torch.device(\"cuda\" if use_cuda else \"cpu\")\n\n    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}\n    train_loader = torch.utils.data.DataLoader(\n        datasets.MNIST('../data', train=True, download=True,\n                       transform=transforms.Compose([\n                           transforms.ToTensor(),\n                           transforms.Normalize((0.1307,), (0.3081,))\n                       ])),\n        batch_size=args.batch_size, shuffle=True, **kwargs)\n    test_loader = torch.utils.data.DataLoader(\n        datasets.MNIST('../data', train=False, transform=transforms.Compose([\n                           transforms.ToTensor(),\n                           transforms.Normalize((0.1307,), (0.3081,))\n                       ])),\n        
batch_size=args.test_batch_size, shuffle=True, **kwargs)\n\n\n    model = Net().to(device)\n    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)\n\n    for epoch in range(1, args.epochs + 1):\n        print(\"\\nEpoch {}/{}\".format(epoch, args.epochs))\n        train_len, train_time, train_loss = train(args, model, device, train_loader, optimizer)\n        test_loss, test_accuracy = test(args, model, device, test_loader)\n        sys.stdout.write('\\r{}/{} [{}] - {:.1f}s {:.1f}us/step - loss: {:.4f} - val_loss: {:.4f} - val_acc: {:.4f}'.format(\n            train_len, train_len, '=' * tracker_length, \n            train_time, (train_time / train_len) * 1000000.0, train_loss,\n            test_loss, test_accuracy))\n        sys.stdout.flush()\n\n\nif __name__ == '__main__':\n    main()\n```\n\nWe include the modified version of this sample in our repo under the name [`mnist_cnn_pytorch.py`](mnist_cnn_pytorch.py). You can run it as follows:\n\n```\n(dlwin36) $ python mnist_cnn_pytorch.py\nEpoch 1/12\n60000/60000 [==============================] - 7.1s 118.6us/step - loss: 0.2592 - val_loss: 0.1883 - val_acc: 0.9438\nEpoch 2/12\n60000/60000 [==============================] - 6.1s 102.0us/step - loss: 0.1917 - val_loss: 0.1412 - val_acc: 0.9575\nEpoch 3/12\n60000/60000 [==============================] - 6.1s 101.5us/step - loss: 0.2335 - val_loss: 0.1074 - val_acc: 0.9679\nEpoch 4/12\n60000/60000 [==============================] - 6.1s 101.2us/step - loss: 0.2038 - val_loss: 0.0828 - val_acc: 0.9741\nEpoch 5/12\n60000/60000 [==============================] - 6.1s 101.8us/step - loss: 0.1733 - val_loss: 0.0676 - val_acc: 0.9783\nEpoch 6/12\n60000/60000 [==============================] - 6.1s 101.2us/step - loss: 0.0952 - val_loss: 0.0587 - val_acc: 0.9810\nEpoch 7/12\n60000/60000 [==============================] - 6.1s 101.8us/step - loss: 0.0521 - val_loss: 0.0527 - val_acc: 0.9832\nEpoch 8/12\n60000/60000 [==============================] 
- 6.1s 101.5us/step - loss: 0.0993 - val_loss: 0.0484 - val_acc: 0.9834\nEpoch 9/12\n60000/60000 [==============================] - 6.0s 100.3us/step - loss: 0.2031 - val_loss: 0.0449 - val_acc: 0.9853\nEpoch 10/12\n60000/60000 [==============================] - 6.0s 100.0us/step - loss: 0.2267 - val_loss: 0.0429 - val_acc: 0.9868\nEpoch 11/12\n60000/60000 [==============================] - 6.1s 100.9us/step - loss: 0.0819 - val_loss: 0.0426 - val_acc: 0.9857\nEpoch 12/12\n60000/60000 [==============================] - 6.0s 100.7us/step - loss: 0.0312 - val_loss: 0.0370 - val_acc: 0.9872\n```\n\nAs expected, training speed with PyTorch (about 6 s per epoch, or roughly 100 us/step) is on par with the other frameworks.\n\n# Suggested viewing and reading\n\nDeep Learning with Keras - Python, by The SemiColon:\n\n@ https://www.youtube.com/playlist?list=PLVBorYCcu-xX3Ppjb_sqBd_Xf6GqagQyl\n\nDeep Learning with Python, François Chollet\n\n@ https://www.manning.com/books/deep-learning-with-python\n\n# About the Author\n\nFor information about the author, please visit:\n\n[![https://www.linkedin.com/in/philferriere](img/LinkedInDLDev.png)](https://www.linkedin.com/in/philferriere)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphilferriere%2Fdlwin","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fphilferriere%2Fdlwin","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphilferriere%2Fdlwin/lists"}