https://github.com/aws/sagemaker-tensorflow-training-toolkit
Toolkit for running TensorFlow training scripts on SageMaker. Dockerfiles used for building SageMaker TensorFlow Containers are at https://github.com/aws/deep-learning-containers.
- Host: GitHub
- URL: https://github.com/aws/sagemaker-tensorflow-training-toolkit
- Owner: aws
- License: apache-2.0
- Created: 2018-01-23T17:41:21.000Z (almost 7 years ago)
- Default Branch: tf-2
- Last Pushed: 2023-04-10T08:52:31.000Z (over 1 year ago)
- Last Synced: 2024-10-04T21:05:14.732Z (about 1 month ago)
- Topics: aws, docker, sagemaker, tensorflow
- Language: Python
- Homepage:
- Size: 14.1 MB
- Stars: 270
- Watchers: 54
- Forks: 160
- Open Issues: 9
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
=====================================
SageMaker TensorFlow Training Toolkit
=====================================

The SageMaker TensorFlow Training Toolkit is an open source library for making the
TensorFlow framework run on Amazon SageMaker.

This repository also contains Dockerfiles which install this library, TensorFlow, and dependencies
for building SageMaker TensorFlow images.

For information on running TensorFlow jobs on SageMaker:

- SageMaker Python SDK documentation
- SageMaker Notebook Examples

Table of Contents
-----------------

#. `Getting Started <#getting-started>`__
#. `Building your Image <#building-your-image>`__
#. `Running the tests <#running-the-tests>`__

Getting Started
---------------

Prerequisites
~~~~~~~~~~~~~

Make sure you have installed all of the following prerequisites on your
development machine:

- Docker

For Testing on GPU
^^^^^^^^^^^^^^^^^^

- Nvidia-Docker

Recommended
^^^^^^^^^^^

- A Python environment management tool (e.g. PyEnv, VirtualEnv).

Building your Image
-------------------

Amazon SageMaker uses Docker containers to run all training jobs and inference endpoints.

The Docker images are built from the Dockerfiles specified in ``docker/``.

The Dockerfiles are grouped based on TensorFlow version and separated
based on Python version and processor type.

The Dockerfiles for TensorFlow 2.0+ are available in the ``tf-2`` branch.

To build the images, first copy the files under ``docker/build_artifacts/``
to the folder containing the Dockerfile you wish to build.

::

    # Example for building a TF 2.1 image with Python 3
    cp docker/build_artifacts/* docker/2.1.0/py3/.

After that, go to the directory containing the Dockerfile you wish to build,
and run ``docker build`` to build the image.

::

    # Example for building a TF 2.1 image for CPU with Python 3
    cd docker/2.1.0/py3
    docker build -t tensorflow-training:2.1.0-cpu-py3 -f Dockerfile.cpu .

Don't forget the period at the end of the ``docker build`` command!

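The tag and Dockerfile name in the example above follow a visible pattern (version, processor, Python version). A minimal sketch of composing the same ``docker build`` command for other combinations is shown here. The helper names ``image_tag`` and ``docker_build_command`` are hypothetical and not part of this toolkit, and the ``Dockerfile.gpu`` name and the ``cpu``/``gpu`` value set are assumptions extrapolated from the CPU example.

```python
# Hypothetical helpers (not part of the toolkit): compose the image tag and
# the `docker build` argument list in the style of the README example.

def image_tag(framework_version, processor, py_version,
              repository="tensorflow-training"):
    """Return a tag such as 'tensorflow-training:2.1.0-cpu-py3'."""
    # Assumption: only 'cpu' and 'gpu' processor types exist.
    if processor not in ("cpu", "gpu"):
        raise ValueError("processor must be 'cpu' or 'gpu'")
    return f"{repository}:{framework_version}-{processor}-{py_version}"


def docker_build_command(framework_version, processor, py_version):
    """Return the argument list for `docker build`.

    The command is meant to be run from docker/<version>/<py_version>,
    as in the README example. 'Dockerfile.<processor>' is an assumption
    generalized from 'Dockerfile.cpu'.
    """
    return [
        "docker", "build",
        "-t", image_tag(framework_version, processor, py_version),
        "-f", f"Dockerfile.{processor}",
        ".",  # build context: the trailing period from the README
    ]


if __name__ == "__main__":
    print(" ".join(docker_build_command("2.1.0", "cpu", "py3")))
```

The function only builds the argument list; it does not copy ``docker/build_artifacts/`` or invoke Docker, so those steps still need to be performed as described above.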
Running the tests
-----------------

Running the tests requires installation of the SageMaker TensorFlow Training Toolkit code and its test
dependencies.

::

    git clone https://github.com/aws/sagemaker-tensorflow-training-toolkit.git
    cd sagemaker-tensorflow-training-toolkit
    pip install -e .[test]

Tests are defined in ``test/`` and include unit, integration and functional tests.

Unit Tests
~~~~~~~~~~

If you want to run unit tests, then use:

::

    # All test instructions should be run from the top level directory
    pytest test/unit

Integration Tests
~~~~~~~~~~~~~~~~~

Running integration tests requires Docker and AWS credentials,
as the integration tests make calls to a couple of AWS services. The integration and functional
tests require configurations specified within their respective ``conftest.py``.
Make sure to update the account-id and region at a minimum.

Integration tests on GPU require Nvidia-Docker.

Before running integration tests:

#. Build your Docker image.
#. Pass in the correct pytest arguments to run tests against your Docker image.

If you want to run local integration tests, then use:

::

    # Required arguments for integration tests are found in test/integration/conftest.py
    pytest test/integration --docker-base-name <docker-base-name> \
                            --tag <tag> \
                            --framework-version <framework-version> \
                            --processor <processor>

::

    # Example
    pytest test/integration --docker-base-name preprod-tensorflow \
                            --tag 1.0 \
                            --framework-version 1.4.1 \
                            --processor cpu

Functional Tests
~~~~~~~~~~~~~~~~

Functional tests have been removed from the current branch; see the older ``r1.0`` branch.

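Returning to the local integration test invocation shown under Integration Tests: when the same command is run against several images, it can help to assemble the arguments programmatically. The sketch below is illustrative only. ``integration_test_command`` is a hypothetical helper, not part of this repository, and restricting ``processor`` to ``cpu`` or ``gpu`` is an assumption based on the CPU and GPU cases mentioned in this README.

```python
# Hypothetical helper (not part of the toolkit): assemble the pytest
# invocation for local integration tests from the four documented options.

def integration_test_command(docker_base_name, tag, framework_version,
                             processor, test_path="test/integration"):
    """Return the pytest argument list for a local integration test run."""
    # Assumption: the --processor option takes 'cpu' or 'gpu'.
    if processor not in ("cpu", "gpu"):
        raise ValueError("processor must be 'cpu' or 'gpu'")
    return [
        "pytest", test_path,
        "--docker-base-name", docker_base_name,
        "--tag", tag,
        "--framework-version", framework_version,
        "--processor", processor,
    ]


if __name__ == "__main__":
    # Mirrors the example values used earlier in this README.
    cmd = integration_test_command("preprod-tensorflow", "1.0", "1.4.1", "cpu")
    print(" ".join(cmd))
```

The resulting list can be passed to ``subprocess.run(cmd, check=True)`` from the top level directory of the repository, once the Docker image has been built and AWS credentials are configured.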
Contributing
------------

Please read ``CONTRIBUTING.md``
for details on our code of conduct, and the process for submitting pull
requests to us.

License
-------

SageMaker TensorFlow Containers is licensed under the Apache 2.0 License. It is copyright 2018
Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at:
http://aws.amazon.com/apache2.0/