{"id":15710278,"url":"https://github.com/aws/sagemaker-tensorflow-training-toolkit","last_synced_at":"2025-12-12T00:35:56.361Z","repository":{"id":37692127,"uuid":"118646184","full_name":"aws/sagemaker-tensorflow-training-toolkit","owner":"aws","description":"Toolkit for running TensorFlow training scripts on SageMaker. Dockerfiles used for building SageMaker TensorFlow Containers are at https://github.com/aws/deep-learning-containers. ","archived":false,"fork":false,"pushed_at":"2025-02-11T01:29:11.000Z","size":14782,"stargazers_count":271,"open_issues_count":9,"forks_count":160,"subscribers_count":53,"default_branch":"tf-2","last_synced_at":"2025-05-08T00:08:05.092Z","etag":null,"topics":["aws","docker","sagemaker","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aws.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-23T17:41:21.000Z","updated_at":"2025-04-04T17:14:18.000Z","dependencies_parsed_at":"2024-06-21T02:15:46.520Z","dependency_job_id":"29569eb6-4851-480e-bbc3-5c081d8c095b","html_url":"https://github.com/aws/sagemaker-tensorflow-training-toolkit","commit_stats":{"total_commits":251,"total_committers":39,"mean_commits":6.435897435897436,"dds":0.7211155378486056,"last_synced_commit":"18813ea3d5261feaa62a0635fa46893a92f2d996"},"previous_names":[],"tags_count":76,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fsagemaker-tensorflow-
training-toolkit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fsagemaker-tensorflow-training-toolkit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fsagemaker-tensorflow-training-toolkit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fsagemaker-tensorflow-training-toolkit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aws","download_url":"https://codeload.github.com/aws/sagemaker-tensorflow-training-toolkit/tar.gz/refs/heads/tf-2","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254254041,"owners_count":22039792,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","docker","sagemaker","tensorflow"],"created_at":"2024-10-03T21:05:31.511Z","updated_at":"2025-12-12T00:35:56.299Z","avatar_url":"https://github.com/aws.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"=====================================\nSageMaker TensorFlow Training Toolkit\n=====================================\n\nThe SageMaker TensorFlow Training Toolkit is an open source library for making the\nTensorFlow framework run on `Amazon SageMaker \u003chttps://aws.amazon.com/documentation/sagemaker/\u003e`__.\n\nThis repository also contains Dockerfiles which install this library, TensorFlow, and dependencies\nfor building SageMaker TensorFlow images.\n\nFor information on running TensorFlow jobs on SageMaker:\n\n- `SageMaker Python SDK documentation 
\u003chttps://sagemaker.readthedocs.io/en/stable/using_tf.html\u003e`__\n- `SageMaker Notebook Examples \u003chttps://github.com/awslabs/amazon-sagemaker-examples\u003e`__\n\nTable of Contents\n-----------------\n\n#. `Getting Started \u003c#getting-started\u003e`__\n#. `Building your Image \u003c#building-your-image\u003e`__\n#. `Running the tests \u003c#running-the-tests\u003e`__\n\nGetting Started\n---------------\n\nPrerequisites\n~~~~~~~~~~~~~\n\nMake sure you have installed all of the following prerequisites on your\ndevelopment machine:\n\n- `Docker \u003chttps://www.docker.com/\u003e`__\n\nFor Testing on GPU\n^^^^^^^^^^^^^^^^^^\n\n-  `Nvidia-Docker \u003chttps://github.com/NVIDIA/nvidia-docker\u003e`__\n\nRecommended\n^^^^^^^^^^^\n\n-  A Python environment management tool (e.g.\n   `PyEnv \u003chttps://github.com/pyenv/pyenv\u003e`__,\n   `VirtualEnv \u003chttps://virtualenv.pypa.io/en/stable/\u003e`__)\n\nBuilding your Image\n-------------------\n\n`Amazon SageMaker \u003chttps://aws.amazon.com/documentation/sagemaker/\u003e`__\nutilizes Docker containers to run all training jobs \u0026 inference endpoints.\n\nThe Docker images are built from the Dockerfiles specified in\n`docker/ \u003chttps://github.com/aws/sagemaker-tensorflow-containers/tree/master/docker\u003e`__.\n\nThe Dockerfiles are grouped based on TensorFlow version and separated\nbased on Python version and processor type.\n\nThe Dockerfiles for TensorFlow 2.0+ are available in the\n`tf-2 \u003chttps://github.com/aws/sagemaker-tensorflow-container/tree/tf-2\u003e`__ branch.\n\nTo build the images, first copy the files under\n`docker/build_artifacts/ \u003chttps://github.com/aws/sagemaker-tensorflow-container/tree/tf-2/docker/build_artifacts\u003e`__\nto the folder containing the Dockerfile you wish to build.\n\n::\n\n    # Example for building a TF 2.1 image with Python 3\n    cp docker/build_artifacts/* docker/2.1.0/py3/.\n\nAfter that, go to the directory containing the Dockerfile you wish to 
build,\nand run ``docker build`` to build the image.\n\n::\n\n    # Example for building a TF 2.1 image for CPU with Python 3\n    cd docker/2.1.0/py3\n    docker build -t tensorflow-training:2.1.0-cpu-py3 -f Dockerfile.cpu .\n\nDon't forget the period at the end of the ``docker build`` command!\n\nRunning the tests\n-----------------\n\nRunning the tests requires installation of the SageMaker TensorFlow Training Toolkit code and its test\ndependencies.\n\n::\n\n    git clone https://github.com/aws/sagemaker-tensorflow-container.git\n    cd sagemaker-tensorflow-container\n    pip install -e .[test]\n\nTests are defined in\n`test/ \u003chttps://github.com/aws/sagemaker-tensorflow-container/tree/master/test\u003e`__\nand include unit, integration and functional tests.\n\nUnit Tests\n~~~~~~~~~~\n\nIf you want to run unit tests, then use:\n\n::\n\n    # All test instructions should be run from the top level directory\n    pytest test/unit\n\nIntegration Tests\n~~~~~~~~~~~~~~~~~\n\nRunning integration tests requires `Docker \u003chttps://www.docker.com/\u003e`__ and `AWS\ncredentials \u003chttps://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html\u003e`__,\nas the integration tests make calls to a couple of AWS services. The integration and functional\ntests require configurations specified within their respective\n`conftest.py \u003chttps://github.com/aws/sagemaker-tensorflow-containers/blob/master/test/integration/conftest.py\u003e`__. Make sure to update the account-id and region at a minimum.\n\nIntegration tests on GPU require `Nvidia-Docker \u003chttps://github.com/NVIDIA/nvidia-docker\u003e`__.\n\nBefore running integration tests:\n\n#. Build your Docker image.\n#. 
Pass in the correct pytest arguments to run tests against your Docker image.\n\nIf you want to run local integration tests, then use:\n\n::\n\n    # Required arguments for integration tests are found in test/integration/conftest.py\n    pytest test/integration --docker-base-name \u003cyour_docker_image\u003e \\\n                            --tag \u003cyour_docker_image_tag\u003e \\\n                            --framework-version \u003ctensorflow_version\u003e \\\n                            --processor \u003ccpu_or_gpu\u003e\n\n::\n\n    # Example\n    pytest test/integration --docker-base-name preprod-tensorflow \\\n                            --tag 1.0 \\\n                            --framework-version 1.4.1 \\\n                            --processor cpu\n\nFunctional Tests\n~~~~~~~~~~~~~~~~\n\nFunctional tests have been removed from the current branch; see the older `r1.0 \u003chttps://github.com/aws/sagemaker-tensorflow-container/tree/r1.0#functional-tests\u003e`__ branch.\n\nContributing\n------------\n\nPlease read\n`CONTRIBUTING.md \u003chttps://github.com/aws/sagemaker-tensorflow-containers/blob/master/CONTRIBUTING.md\u003e`__\nfor details on our code of conduct, and the process for submitting pull\nrequests to us.\n\nLicense\n-------\n\nSageMaker TensorFlow Containers is licensed under the Apache 2.0 License. It is copyright 2018\nAmazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at:\nhttp://aws.amazon.com/apache2.0/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws%2Fsagemaker-tensorflow-training-toolkit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faws%2Fsagemaker-tensorflow-training-toolkit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws%2Fsagemaker-tensorflow-training-toolkit/lists"}