{"id":25022043,"url":"https://github.com/luisvalgoi/predia","last_synced_at":"2026-05-05T00:38:44.521Z","repository":{"id":118140032,"uuid":"296423886","full_name":"LuisValgoi/predia","owner":"LuisValgoi","description":"Machine Learning Model for Final Paper @ UNISINOS","archived":false,"fork":false,"pushed_at":"2020-10-15T03:14:16.000Z","size":8974,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-02-05T13:43:00.422Z","etag":null,"topics":["ensemble-learning","keras","machine-learning","neural-network","sklearn","streamlit"],"latest_commit_sha":null,"homepage":"https://predia.herokuapp.com","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LuisValgoi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-17T19:31:59.000Z","updated_at":"2021-09-09T13:06:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"3a4b61b6-1fee-4a4c-a5e3-1a45d2e1fc17","html_url":"https://github.com/LuisValgoi/predia","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuisValgoi%2Fpredia","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuisValgoi%2Fpredia/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuisValgoi%2Fpredia/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuisValgoi%2Fpredia/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/LuisValgoi","download_url":"https://codeload.github.com/LuisValgoi/predia/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246314118,"owners_count":20757457,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ensemble-learning","keras","machine-learning","neural-network","sklearn","streamlit"],"created_at":"2025-02-05T13:39:53.399Z","updated_at":"2026-05-05T00:38:39.485Z","avatar_url":"https://github.com/LuisValgoi.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Summary\n\n- [Status](https://github.com/LuisValgoi/predia#status)\n- [Context](https://github.com/LuisValgoi/predia#context)\n- [Techniques](https://github.com/LuisValgoi/predia#techniques)\n- [Tooling](https://github.com/LuisValgoi/predia#tooling)\n- [WebAPI](https://github.com/LuisValgoi/predia#webapi---getting-started)\n- [Notebook](https://github.com/LuisValgoi/predia#jupyter-notebook---getting-started)\n- [Heroku](https://github.com/LuisValgoi/predia#heroku---getting-started)\n- [Components Diagram](https://github.com/LuisValgoi/predia#components-diagram)\n- [Architecture Basic Diagram](https://github.com/LuisValgoi/predia#architecture-basic-diagram)\n- [Architecture Detail Diagram](https://github.com/LuisValgoi/predia#architecture-detail-diagram)\n\n# Status\n\n![Heroku](https://pyheroku-badge.herokuapp.com/?app=predia\u0026style=flat)\n\nYou can access the WebAPI which consumes the model @ [predia.herokuapp.com](https://predia.herokuapp.com/).\n\n# Context\n\nThis 
repository contains the Machine Learning Model \u0026 the WebAPI of PREDIA – Modelo Híbrido Multifatorial, built for my final paper @ Unisinos. The work starts by applying the OneHotEncoding technique to the dataset. After that, Exploratory Data Analysis and, more specifically, Correlation Analysis were performed to find the features that were decreasing the models' performance. Model building then starts with the selection of 3 heterogeneous algorithms, each of which makes a prediction following a pipeline composed of: Feature Engineering + Permutation Importance + Randomized Search \u0026 Feature Scaling (w/ MinMaxScaler). Once the pipeline is finished, an Ensemble Learning technique called Aggregation combines the individual predictions, generating a final estimate of the number of sales for the next day. The final model has an RMSE of 17.42, which represents 14% of the sales mean.\n\n# Techniques\n\n- **OneHotEncoding**: to optimize the algorithms' predictions by transforming the dataset.\n- **Exploratory Data Analysis**: to check how my data is structured.\n- **Correlational Feature Analysis**: to clean the dataset by removing the features which have no meaning.\n- **Cross Validation**: to use the whole dataset instead of only one period.\n- **Permutation Importance**: to identify the most important features for each algorithm.\n- **MinMaxScaler**: to scale the dataset from 0 to 1 so the algorithms do not suffer from differences in feature scale.\n- **Randomized Search**: to identify the best hyperparameters for each algorithm.\n- **Ensemble Learning**: to aggregate the results of each model into one to increase performance.\n\n# Tooling\n\n- **Python**: as the main language.\n- **Jupyter Notebook**: as the IDE to develop the model.\n- **SKLearn**: as the ML library.\n- **Keras**: as the deep learning library.\n- **Streamlit**: as the framework to build the WebAPI.\n- **Heroku**: as the server to host the entire model \u0026 WebAPI.\n\n# WebAPI - Getting Started\n\n```\npipenv shell\npip install 
streamlit\npip install plotly\npip install scikit-learn\npip install keras\npip install tensorflow\nstreamlit run app.py\n```\n\n# Jupyter Notebook - Getting Started\n\n```\ncd predia\njupyter notebook\n```\n\n# Heroku - Getting Started\n\n```\nheroku login\n...\ngit push heroku master\nheroku logs --tail\n```\n\n# Components Diagram\n\n![03_Components](https://user-images.githubusercontent.com/8363610/93719289-e0c5b000-fb57-11ea-807e-1e223dad1534.png)\n\n# Architecture Basic Diagram\n\n![03_Steps](https://user-images.githubusercontent.com/8363610/94078669-9e5cd700-fdd4-11ea-980e-6afa44c18601.png)\n\n# Architecture Detail Diagram\n\n![04_Architecture_Detail](https://user-images.githubusercontent.com/8363610/95402488-3cc55e00-08e6-11eb-868c-ebc1ab16ccae.png)\n\n# Sales History\n\n![image](https://user-images.githubusercontent.com/8363610/94081715-9d787500-fdd5-11ea-89d7-87c1982bfe7a.png)\n\n# Algorithms Prediction Combined\n\n![image](https://user-images.githubusercontent.com/8363610/94083521-c0a52380-fdd9-11ea-9294-14a483701aa8.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluisvalgoi%2Fpredia","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fluisvalgoi%2Fpredia","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluisvalgoi%2Fpredia/lists"}