{"id":24175839,"url":"https://github.com/iht/ml-in-prod","last_synced_at":"2025-09-20T20:31:20.086Z","repository":{"id":39853553,"uuid":"233203616","full_name":"iht/ml-in-prod","owner":"iht","description":"Template for Python apps that implement training and inference of Machine Learning models with Tensorflow","archived":false,"fork":false,"pushed_at":"2023-03-25T01:29:53.000Z","size":118,"stargazers_count":3,"open_issues_count":2,"forks_count":4,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-04-18T00:13:56.848Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iht.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-01-11T08:58:54.000Z","updated_at":"2024-04-18T00:13:56.849Z","dependencies_parsed_at":"2022-08-29T15:12:06.674Z","dependency_job_id":null,"html_url":"https://github.com/iht/ml-in-prod","commit_stats":null,"previous_names":[],"tags_count":14,"template":true,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iht%2Fml-in-prod","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iht%2Fml-in-prod/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iht%2Fml-in-prod/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iht%2Fml-in-prod/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iht","download_url":"https://codeload.github.com/iht/ml-in-prod/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_
count":233684401,"owners_count":18713885,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-13T02:33:17.728Z","updated_at":"2025-09-20T20:31:14.783Z","avatar_url":"https://github.com/iht.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Machine Learning in Production using Google Cloud Platform\n\nThe goal of this project is to serve as an accelerator for the deployment of production-ready machine learning pipelines.\nThe use case is NLP sentiment analysis as binary classification (positive/negative), trained and evaluated on reviews from [IMDB](https://imdb.com).\n\nIt provides:\n\n* CI/CD pipeline to deploy a new binary to Google Cloud Storage using Cloud Build\n* DataFlow pipeline to ingest and preprocess training data\n* Vertex pipeline to train a new model\n* Deployment of the Vertex model\n\n## CI/CD pipeline\n\nA Cloud Build trigger launches the build process, which creates a dist package of the pipeline code and stores it in Google Cloud Storage.\n\nThis process is triggered by commits being pushed to the `pipelines` branch.\n\n## DataFlow preprocessing\n\nThe Beam pipeline to be executed on DataFlow is launched from the `launch_preprocess.sh` script, which provides the necessary configuration:\nthe project in which to run, input/output data locations, etc. The script calls `run_preprocess.py`. 
The actual\nprocessing is defined in `pipeline/preprocess_pipeline.py`.\n\nTo execute it:\n\n```\n$ gcloud auth application-default login\n$ cd models\n$ venv ...\n$ source activate\n$ pip3 install -r requirements\n$ bin/launch_preprocess.sh\n```\n\nFunctionally you could say there are three pipelines. First for the training data:\n\n* \"Read train set\": `read_set.py` loads the files with reviews from the `pos` and `neg` subdirectories and builds a PCollection dataset with a label column (1 for a positive review, 0 for a negative review) and a column for the review text.\n* \"Anlyz. and Transf.\": `preprocess_pipeline.py preprocessing_fn` uses TensorFlow Transform to build n-grams, calculate the TF-IDF, and output records as `[label (pos or neg), ngram token index, weight]`\n* \"TrainToExamples\": Convert the result to `TFRecord`\n* \"Write Train Data\": Output to Google Cloud Storage\n\nFor the test data:\n\n* \"Read test set\"\n* \"Transform test\" takes the raw test dataset and the transform function from \"Analyz. and Transf.\" that was applied to the training dataset, and applies it to the test data\n* \"TestToExamples\": Convert the result to `TFRecord`\n* \"Write Test Data\": Output to Google Cloud Storage\n\nFinally we store the transform function to reuse it later for inference:\n\n* \"Write Transform fn\" to Google Cloud Storage\n\nThis last step is so that we can correctly tokenize and transform any review that we want to run the model on, to see if it's a positive or negative review.\n\n## Training on Vertex AI\n\nWe see significant performance improvement when we use input data in the binary `TFRecord` format, but this means that in Vertex we must use \"Custom Jobs\" \nrather than \"Training Pipelines\". 
Launch using \n\n`$ bin/launch_training.sh`\n\nAs suggested in the output from this command, now you can:\n\n* Check the job status: `gcloud ai custom-jobs describe projects/237148598933/locations/europe-west4/customJobs/3627616514498101248`\n* Or stream the logs while it's training: `gcloud ai custom-jobs stream-logs projects/237148598933/locations/europe-west4/customJobs/3627616514498101248`\n\nNote that depending on various factors, it's not uncommon for a training job to remain pending for 10 minutes.\n\n## Hyperparameter Tuning\n\nTo tune the hyperparameters according to the specification in `training_config_ht.yaml`:\n\n`$ bin/launch_training_ht.sh`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiht%2Fml-in-prod","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fiht%2Fml-in-prod","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiht%2Fml-in-prod/lists"}