{"id":14958400,"url":"https://github.com/akoury/ml-helper","last_synced_at":"2025-10-24T14:32:00.808Z","repository":{"id":57442289,"uuid":"171712203","full_name":"akoury/ml-helper","owner":"akoury","description":"Python library with helpers to speed up and structure machine learning projects.","archived":false,"fork":false,"pushed_at":"2019-06-05T12:30:18.000Z","size":12266,"stargazers_count":10,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-31T02:22:44.901Z","etag":null,"topics":["data","data-visualization","machine-learning","ml","python","scikit-learn","sklearn"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/ml-helper/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/akoury.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-02-20T16:48:09.000Z","updated_at":"2024-06-23T20:52:32.000Z","dependencies_parsed_at":"2022-09-26T16:30:59.231Z","dependency_job_id":null,"html_url":"https://github.com/akoury/ml-helper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akoury%2Fml-helper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akoury%2Fml-helper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akoury%2Fml-helper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akoury%2Fml-helper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/akoury","download_url":"https://codeload.github.com/
akoury/ml-helper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237990585,"owners_count":19398453,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","data-visualization","machine-learning","ml","python","scikit-learn","sklearn"],"created_at":"2024-09-24T13:16:57.080Z","updated_at":"2025-10-24T14:31:55.758Z","avatar_url":"https://github.com/akoury.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ML Helper\n---\nHelpers to speed up and structure machine learning projects.\n\nThe library is available on [PyPI](https://pypi.org/project/ml-helper/)\n\n### Installing\n---\n\n\nThe easiest way to install ml-helper is through ```pip```\n\n```shell\npip install ml-helper\n```\n\nTo use it in your project, you must first import the library\n\n```python\nfrom ml_helper.helper import Helper\n```\n\nAnd then create a Helper object with a dictionary of keys related to your project\n\n```python\nKEYS = {\n    'SEED': 1,\n    'TARGET': 'y',\n    'METRIC': 'r2',\n    'TIMESERIES': True,\n    'SPLITS': 5\n}\n\nhp = Helper(KEYS)\n```\n\nAfter this, you may use the helper object's many functions\n\n#### Dependencies\n\nML-Helper requires:\n* Python (\u003e3.5)\n* NumPy (\u003e=1.16)\n* Pandas (\u003e=0.23.4)\n* Seaborn (\u003e=0.9)\n* Scikit-learn (\u003e=0.20)\n* Matplotlib (\u003e=3)\n* SciPy (\u003e=1)\n* Imblearn\n* Vecstack\n\n### Functionality\n---\n\nThe functionality is separated into the following groups:\n* Data Exploration\n    * Missing Data\n    * Boxplot of numerical variables\n 
   * Coefficient of variation\n    * Correlation (numerical and categorical)\n    * Under Represented Features\n    * Target Variable Distribution\n    * Feature Importance\n    * PCA Component Variance\n* Data Preparation\n    * Convert features to categories\n    * Drop multiple columns\n* Modeling\n    * Cross Validation (with stratified kfolds, or time series split depending on use case)\n        * Randomized Grid Search\n    * Pipeline: Collection of models and pipeline steps that get performed and scored\n    * Predict: Predict on unseen data\n    * Stack Predict: Build a stacked model and perform a prediction\n* Regression\n    * Plots for predictions\n* Classification\n    * ROC Curve\n    * Classification Report\n* Others\n    * Select features based on types\n    * Split X and y\n    * Plot models/pipelines\n\n### Working Examples\n---\nIf you wish to see the library in use, you may view the notebooks in the [examples](examples) section.\n\nYou can also see the library in use in the corresponding Kaggle Kernels:\n\n* [Bike Sharing in Washington D.C.: Time Series Regression](https://www.kaggle.com/akoury/bike-sharing-in-washington-d-c-using-ml-helper)\n\n* [Employee Attrition: Classification](https://www.kaggle.com/akoury/employee-attrition-basis-to-create-ml-helper-lib)\n\n### ML-Helper Coding Style\n---\nML-Helper complies with PEP 8 and uses ```black``` for code formatting\n\n### Versioning\n---\n[SemVer](http://semver.org/) is used for versioning.\n\n### License\n---\nThis project is licensed under the MIT License - see the [License](license.txt) file for details","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fakoury%2Fml-helper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fakoury%2Fml-helper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fakoury%2Fml-helper/lists"}