{"id":28976877,"url":"https://github.com/marta-barea/mlp-diabetes-classifier","last_synced_at":"2025-06-24T14:10:51.683Z","repository":{"id":297937814,"uuid":"996783438","full_name":"Marta-Barea/mlp-diabetes-classifier","owner":"Marta-Barea","description":"A simple project to train and evaluate a multilayer perceptron on the Pima Indians Diabetes Dataset using TensorFlow, SciKeras, and Scikit-Learn.","archived":false,"fork":false,"pushed_at":"2025-06-19T13:11:35.000Z","size":136,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-19T14:24:40.535Z","etag":null,"topics":["classification-algorithm","deep-learning","diabetes-prediction","multilayer-perceptron","neural-networks","pima-indians-diabetes","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Marta-Barea.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-05T13:07:20.000Z","updated_at":"2025-06-19T13:24:05.000Z","dependencies_parsed_at":"2025-06-08T13:34:11.792Z","dependency_job_id":"2c2c86ed-aef7-4b31-b517-0afde58f7bd0","html_url":"https://github.com/Marta-Barea/mlp-diabetes-classifier","commit_stats":null,"previous_names":["marta-barea/mlp-diabetes-classifier"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Marta-Barea/mlp-diabetes-classifier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marta-Barea%2Fmlp-diabetes-classifier","tags_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marta-Barea%2Fmlp-diabetes-classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marta-Barea%2Fmlp-diabetes-classifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marta-Barea%2Fmlp-diabetes-classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Marta-Barea","download_url":"https://codeload.github.com/Marta-Barea/mlp-diabetes-classifier/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marta-Barea%2Fmlp-diabetes-classifier/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261691328,"owners_count":23195051,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification-algorithm","deep-learning","diabetes-prediction","multilayer-perceptron","neural-networks","pima-indians-diabetes","python"],"created_at":"2025-06-24T14:10:48.054Z","updated_at":"2025-06-24T14:10:51.673Z","avatar_url":"https://github.com/Marta-Barea.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MLP Diabetes Classifier\n\nA simple project to train and evaluate a multilayer perceptron on the Pima Indians Diabetes Dataset using TensorFlow, SciKeras, and Scikit-Learn.\n\n---\n\n# Installation\n\n1. Clone the repo\n\n```bash\ngit clone https://github.com/yourusername/mlp-diabetes-classifier.git\ncd mlp-diabetes-classifier\n```\n\n2. 
Set up the Conda environment\n\nAn `environment.yml` is included for Conda users: \n\n```bash \nconda env create -f environment.yml\nconda activate mlp-diabetes\n```\n\n# Dependencies \n\n- Python 3.7+\n- numpy, scikit-learn, tensorflow, scikeras, PyYAML, matplotlib. \n\nYou can also install them with:\n\n```bash\npip install -r requirements.txt\n```\n\n# Usage\n\n1. Verify the dataset\n\nThe [Pima Indians Diabetes Dataset](https://www.kaggle.com/datasets/mathchi/diabetes-data-set) from Kaggle is already included under `data/diabetes.csv`.\n\n2. Adjust settings\n\nOpen `config.yaml` and tweak any values you like (seed, test_size_hyperparameters list, etc.)\n\n3. Run the full pipeline\n\n```bash\npython run_all.py\n```\n\nThis will: \n\n- Train the MLP with randomized hyperparameter search\n- Save the best model to `models/best_mlp.h5`\n- Print train/test accuracy and predictions\n- Save evaluation plots to `reports/`\n\n# Project Structure\n\n```\nmlp-diabetes-classifier/\n│\n├── config.yaml          # Experiment settings\n├── environment.yml      # Conda environment spec\n├── requirements.txt     # Pinned pip dependencies (for Docker)\n├── docker-compose.yml   # Docker Compose setup\n├── Dockerfile           # Image build definition\n├── .dockerignore       \n├── .gitignore           \n├── pytest.ini           \n│\n├── data/\n│   └── diabetes.csv     # Pima Indians Diabetes Dataset\n│\n├── models/              # (Auto-created) Trained model \u0026 params\n│\n├── reports/\n│   └── figures          # (Auto-created) Plots\n│\n├── tests/               # Test suite\n│   ├── unit\n│   ├── integration\n│   └── e2e  \n│       \n├── src/\n│   ├── config.py        # Loads config.yaml\n│   ├── data_loader.py   # Reads \u0026 splits data\n│   ├── model_builder.py # Defines the Keras MLP\n│   ├── train.py         # Hyperparameter search \u0026 model saving\n│   └── evaluate.py      # Loads model \u0026 prints metrics\n│\n└── run_all.py           # Runs train.py then 
evaluate.py\n\n```\n\n# Dockerized Support\n\nThis project is fully containerized for portability and reproducibility.\n\n## Docker Dependencies \n\nBefore using Docker, you need to have the following installed locally on your system:\n\n- [Docker Engine](https://docs.docker.com/get-started/get-docker/)\n- [Docker Compose](https://docs.docker.com/compose/install/)\n\n✅ Note: These tools are required only if you want to run the project in a containerized environment. If you're using Conda, Docker is optional.\n\n## How to Run \n\nTo build the image and run the project inside a container:\n\n```bash\ndocker-compose up --build\n```\n\nThis will:\n\n- Build the Docker image using the included Dockerfile\n- Run the run_all.py pipeline (training + evaluation)\n- Save the best trained model in the `models/` directory\n- Save plots and metrics in the `reports/` directory\n\n✅ Note: Both `models/` and `reports/` are mounted to your host machine, so your outputs are preserved outside the container.\n\n\n# Testing\n\nThe project includes a complete test suite using [pytest](https://docs.pytest.org/en/stable/). Tests use temporary directories and mock inputs, and validate expected outputs, including saved models and plots.\n\n## Run all tests\n\n```bash\npytest\n```\n\nThis will automatically discover and run:\n\n- Unit tests (`tests/unit/`)\n- Integration tests (`tests/integration/`)\n- End-to-End tests (`tests/e2e/`)\n\n## Run a specific group\n\n```bash \npytest tests/unit/\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarta-barea%2Fmlp-diabetes-classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarta-barea%2Fmlp-diabetes-classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarta-barea%2Fmlp-diabetes-classifier/lists"}