{"id":15338453,"url":"https://github.com/srowen/cdsw-simple-serving","last_synced_at":"2025-10-10T14:31:36.753Z","repository":{"id":77221050,"uuid":"85097136","full_name":"srowen/cdsw-simple-serving","owner":"srowen","description":"Modeling Lifecycle with ACME Occupancy Detection and Cloudera","archived":true,"fork":false,"pushed_at":"2017-09-06T15:19:48.000Z","size":79735,"stargazers_count":14,"open_issues_count":0,"forks_count":18,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-01-28T03:35:08.880Z","etag":null,"topics":["cloudera","cloudera-data-science","data-science","openscoring","pmml","workbench"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/srowen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-03-15T16:41:48.000Z","updated_at":"2023-01-28T20:16:54.000Z","dependencies_parsed_at":null,"dependency_job_id":"f264606e-4f98-4cc5-a3dc-ff9635446903","html_url":"https://github.com/srowen/cdsw-simple-serving","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/srowen/cdsw-simple-serving","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srowen%2Fcdsw-simple-serving","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srowen%2Fcdsw-simple-serving/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srowen%2Fcdsw-simple-serving/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/srowen%2Fcdsw-simple-serving/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/srowen","download_url":"https://codeload.github.com/srowen/cdsw-simple-serving/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srowen%2Fcdsw-simple-serving/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279004176,"owners_count":26083688,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloudera","cloudera-data-science","data-science","openscoring","pmml","workbench"],"created_at":"2024-10-01T10:25:17.683Z","updated_at":"2025-10-10T14:31:36.748Z","avatar_url":"https://github.com/srowen.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Modeling Lifecycle with ACME Occupancy Detection and Cloudera\n\nData science is more than just modeling. The complete data science lifecycle also includes data\nengineering and model deployment. 
This project offers a simplified yet credible example of \nall three elements, as implemented using [Apache Spark](http://spark.apache.org), the\n[Cloudera Data Science Workbench](https://www.cloudera.com/products/data-science-and-engineering/data-science-workbench.html),\nand [JPMML / OpenScoring](https://github.com/openscoring/openscoring).\n\nIn this project, the ACME corporation is productionizing a connected-house platform. Part of this\nservice requires predicting the occupancy of a room given sensor readings.\n\nThis example project includes simplified examples of:\n\n- Data Engineering\n  - Ingest\n  - Cleaning\n- Data Science\n  - Modeling\n  - Tuning and evaluation\n- Model Serving\n  - Model management\n  - Testing\n  - REST API\n\n## Requirements\n\n- [Cloudera Data Science Workbench 1.0](https://www.cloudera.com/products/data-science-and-engineering/data-science-workbench.html)\n- CDH 5.10+ cluster\n- [Spark 2.1 CSD](https://www.cloudera.com/downloads/spark2/2-1.html) for CDH\n- [Apache Maven](https://maven.apache.org) 3.2+\n\n## Get Started\n\nTo continue, review the documentation for each of the three modules, which contains more information\nabout what it shows and how to run it.\n\n- [Data Engineering](acme-dataeng/) \n- [Data Science](acme-datasci/) \n- [Model Serving](acme-serving/) \n\n\n[![Build Status](https://travis-ci.org/srowen/cdsw-simple-serving.svg?branch=master)](https://travis-ci.org/srowen/cdsw-simple-serving)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrowen%2Fcdsw-simple-serving","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrowen%2Fcdsw-simple-serving","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrowen%2Fcdsw-simple-serving/lists"}