{"id":13856806,"url":"https://github.com/jrieke/awstrainer","last_synced_at":"2026-04-30T01:34:48.591Z","repository":{"id":78974024,"uuid":"277640411","full_name":"jrieke/awstrainer","owner":"jrieke","description":"🛠️ Command line tool for machine learning on AWS","archived":false,"fork":false,"pushed_at":"2020-07-08T20:32:29.000Z","size":3117,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-01T19:59:14.630Z","etag":null,"topics":["amazon-web-services","aws","command-line-tool","deep-learning","ec2","machine-learning","server","sync","training"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jrieke.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2020-07-06T20:19:37.000Z","updated_at":"2022-11-10T16:04:58.000Z","dependencies_parsed_at":"2023-08-20T22:40:28.686Z","dependency_job_id":null,"html_url":"https://github.com/jrieke/awstrainer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrieke%2Fawstrainer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrieke%2Fawstrainer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrieke%2Fawstrainer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrieke%2Fawstrainer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jrieke","download_url":"https://codeload.github.com/jrieke/awstrainer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246922240,"owners_count":20855343,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amazon-web-services","aws","command-line-tool","deep-learning","ec2","machine-learning","server","sync","training"],"created_at":"2024-08-05T03:01:14.110Z","updated_at":"2026-04-30T01:34:43.546Z","avatar_url":"https://github.com/jrieke.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# awstrainer\n\n🛠️ Command line tool for machine learning on AWS\n\nawstrainer helps you run machine learning tasks (or any other long-running computations) \non AWS. With one simple command, it spins up an AWS instance (from your own account), \ntransfers your code \u0026 dataset, starts the training run, syncs all output files back to \nyour computer, and terminates the instance after training has finished. It really shines \nwhen you need to quickly launch multiple, long-running jobs in parallel (e.g. for \nhyperparameter optimization). \n\n\n## Demo\n\n![](docs/images/demo.gif)\n\n\n## Installation\n\n1. `pip install git+https://github.com/jrieke/awstrainer`\n\n2. Install the AWS CLI from [here](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) \nand run `aws configure` to [connect your AWS account](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html) (alternatively, you can create a credentials file as \ndescribed [here](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#configuration)). \n\n\n## Usage\n\n### Starting a training run\n\nFirst, you need to create a launch template for your AWS instance. This specifies which \ninstance type should be used, how big the storage is, which packages should be \ninstalled, etc. You can either follow the instructions [here](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-launch-templates.html#create-launch-template) or create a launch \ntemplate [from an existing instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-launch-templates.html#create-launch-template-from-instance). \n\nThen, navigate into your project dir and run:\n\n    awstrainer run --launch_template_id \u003cid\u003e \"/home/ubuntu/anaconda3/bin/python train.py\"\n\nThis launches an AWS instance (based on your launch template), uploads the project dir \n(excluding subdirs `.git` and `out`), executes a command via ssh (here it's starting a \ntraining script, but this can be any command - note that you have to use absolute \npaths because $PATH won't be available), and terminates the instance after \ntraining has finished. Note that this assumes your private key file from AWS to be \nstored as `aws-key.pem` in the project dir. To adapt this, set the `--key_file` option. \nBased on which operating system your instance uses, you may also need to set the \n`--user` option (default: `ubuntu`). \n\nFor a complete list of options, run `awstrainer run --help`. \n\n\n### Syncing output back to your machine\n\nawstrainer also allows you to sync any output files from the AWS instances back to your\nlocal machine. For this to work, you need to write output files to a folder `out`. \nThen, on your local machine, run:\n\n    awstrainer sync --every 60\n\nThis pulls output files from all running AWS instances every 60 seconds and syncs them \nto a local dir `aws-synced-out`. You can also run `awstrainer sync` without the \n`--every` option for a one-time sync. \n\nFor a complete list of options, run `awstrainer sync --help`. \n\n\n## Known issues\n\nIf `awstrainer run` shows a \"Connection refused\" error, try increasing the \nwaiting time after instance launch via the `--wait_time` option (default: 20). \nSometimes, the instance doesn't allow a connection even though the AWS API reports it \nas ready, which may lead to this error. \n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjrieke%2Fawstrainer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjrieke%2Fawstrainer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjrieke%2Fawstrainer/lists"}