{"id":27057913,"url":"https://github.com/dumbmachine/code-nov20-ratinkumar","last_synced_at":"2025-10-20T00:21:46.126Z","repository":{"id":96766992,"uuid":"314601553","full_name":"DumbMachine/code-nov20-ratinkumar","owner":"DumbMachine","description":null,"archived":false,"fork":false,"pushed_at":"2020-11-21T00:29:44.000Z","size":19,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-04-05T11:35:23.415Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DumbMachine.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-20T15:59:12.000Z","updated_at":"2020-11-21T00:29:46.000Z","dependencies_parsed_at":"2023-03-13T16:24:48.480Z","dependency_job_id":null,"html_url":"https://github.com/DumbMachine/code-nov20-ratinkumar","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/DumbMachine/code-nov20-ratinkumar","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DumbMachine%2Fcode-nov20-ratinkumar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DumbMachine%2Fcode-nov20-ratinkumar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DumbMachine%2Fcode-nov20-ratinkumar/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DumbMachine%2Fcode-nov20-ratinkumar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/owners/DumbMachine","download_url":"https://codeload.github.com/DumbMachine/code-nov20-ratinkumar/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DumbMachine%2Fcode-nov20-ratinkumar/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273847021,"owners_count":25178631,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-05T02:00:09.113Z","response_time":402,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-05T11:34:00.337Z","updated_at":"2025-10-20T00:21:41.107Z","avatar_url":"https://github.com/DumbMachine.png","language":"Python","readme":"# code-nov20-ratinkumar\nThis document details the design choices and the solution.\n\n# Testing Criteria:\n1. Production-grade Python program:\n    - [X] calculate BMI\n    - [X] get the \"BMI Category\" for each row/patient\n    - [X] get the \"Health risk\" for each row/patient\n    - [X] Provided solution scalable? Immensely\n2. 
Overweight statistics check:\n    - [X] number of overweight (manually checked): Only 1 (`29.4` is the one that is Overweight)\n    - [X] Cross-check if the same value is obtained by the program.\n    ```bash\n    # to run this test:\n    ❯ cd server\n    ❯ uvicorn main:app --reload\n    # making the request to the server with the given data\n    ❯ curl -X POST \"http://127.0.0.1:8000/native\" -H  \"accept: application/json\" -H  \"Content-Type: application/x-www-form-urlencoded\" -d \"items=[{\\\"Gender\\\":\\\"Male\\\",\\\"HeightCm\\\":171,\\\"WeightKg\\\":96},{\\\"Gender\\\":\\\"Male\\\",\\\"HeightCm\\\":161,\\\"WeightKg\\\":85},{\\\"Gender\\\":\\\"Male\\\",\\\"HeightCm\\\":180,\\\"WeightKg\\\":77},{\\\"Gender\\\":\\\"Female\\\",\\\"HeightCm\\\":166,\\\"WeightKg\\\":62},{\\\"Gender\\\":\\\"Female\\\",\\\"HeightCm\\\":150,\\\"WeightKg\\\":70},{\\\"Gender\\\":\\\"Female\\\",\\\"HeightCm\\\":167,\\\"WeightKg\\\":82}]\"\n    ```\n    ![Image Proof](https://user-images.githubusercontent.com/23381512/99857107-87f2b200-2bb0-11eb-9bf7-112544cec85a.png)\n3. Automate setup, build, package:\n    - [X] The Python package is installable via `setup.py`, or usable by copying the files and importing them.\n    - [X] I have also added an automated pipeline for hosting the service (from the `server` folder directly on a Heroku server) with the `Procfile`.\n    - [X] Tests? All the functions used in this program are tested using `pytest` (find the tests in the `tests` folder).\n    ```bash\n    ❯ bash run_test.sh\n    ```\n    - [X] To run the code as a serverless `AWS Lambda Function`, one would only have to zip the `utils.py` file and upload it to the `AWS console`.\n\n# Getting the BMI\nExtracting the BMI information (and the other columns, which include the \"BMI Category\" and \"Health Risk\")\ninvolves arithmetic on the given columns of each row, followed by the application of conditionals.\nThe following 2 methods can be employed here:\n1. 
Simple Native Python Loops over Data:\nLoop over each data point in the given sample and calculate the required information.\nAdvantage of this approach? Easy to implement. This implementation is coded as the function `parse_bmi_native`, where the `native`\nkey is indicative of the usage of native Python loops.\nDisadvantage of this approach? It relies on native loops; since the solution involves arithmetic ops, usage of `numpy` makes a lot of sense.\n2. Numpy Vectorization of the Problem\nUsing the built-in numpy ops to perform row-wise operations means that the problem is effectively vectorized and benefits from years of matrix optimizations.\nTo put the speed difference in perspective, look at the screenshots below:\n![image](https://user-images.githubusercontent.com/23381512/99855001-e49f9e00-2bab-11eb-92a0-07550d927d95.png)\n![image](https://user-images.githubusercontent.com/23381512/99855039-f6814100-2bab-11eb-835d-89166856e801.png)\nThe performance gained from the faster arithmetic ops is partly offset by the requirement that each row must also have the\n\"BMI Category\" and \"Health Risk\" computed. 
But even then, for most use cases where the input data is large (\u003e100K records), numpy is faster and more scalable.\nAs expected, the numpy method surpasses the native loops method when the input size is large:\n```bash\n❯ bash run_bench.sh\nTime taken by native python loop:  1.8524301052093506\nTime taken by numpy matrix:  1.5414109230041504\nNumpy is faster than native loops by:  1.2017756443551313\n```\nConversely, the native loops method surpasses the numpy method when the input size is small:\n```bash\n# Here numpy is much slower\n❯ python perf_test.py 1000\nTime taken by native python loop:  0.0016388893127441406\nTime taken by numpy matrix:  0.02517867088317871\nNumpy is faster than native loops by:  0.06509038226632705\n```\nThe slowness is due to the added overhead of traversing each element of the numpy array (for getting the BMI Category and Health Risk), which dominates when the total number of elements is low.\n\n# How to use the code?\nFor on-premise environments, two scalable options are available to deploy this \"product\" that calculates BMI.\n1. REST-like Interface:\nUse a server to open an endpoint that takes the data as input (like the given `dict`) and, after performing the tasks, returns the data to either be used elsewhere or stored in a data lake.\nThis is useful when thinking of this product as a SaaS product that provides an API for its users to calculate BMI.\n2. Python Package:\nCall the function directly from another service by importing the processor functions. This would make sense in an in-house setting.\n\nFor cloud-like environments:\nSince this program solves a single problem, it can easily be hosted in a serverless manner. 
An AWS Lambda function or a Cloud Function from GCP are good possible solutions.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdumbmachine%2Fcode-nov20-ratinkumar","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdumbmachine%2Fcode-nov20-ratinkumar","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdumbmachine%2Fcode-nov20-ratinkumar/lists"}