{"id":22305323,"url":"https://github.com/steveee27/bank-subscription-prediction-fastapi","last_synced_at":"2026-04-17T06:33:45.366Z","repository":{"id":263102608,"uuid":"889355718","full_name":"steveee27/Bank-Subscription-Prediction-FASTAPI","owner":"steveee27","description":"A machine learning API built using FastAPI to predict customer subscription to long-term deposits based on marketing campaign data. This project preprocesses input data, trains models, and serves predictions through a RESTful API.","archived":false,"fork":false,"pushed_at":"2024-11-16T07:43:08.000Z","size":1217,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-03T13:13:21.788Z","etag":null,"topics":["fastapi","logistic-regression","machine-learning","rest-api","subscription-prediction"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/steveee27.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-16T06:42:57.000Z","updated_at":"2025-06-26T02:41:05.000Z","dependencies_parsed_at":"2025-01-30T21:29:28.973Z","dependency_job_id":"57165f52-2daf-440b-b671-190bd7ab2bb9","html_url":"https://github.com/steveee27/Bank-Subscription-Prediction-FASTAPI","commit_stats":null,"previous_names":["steveee27/bank-subscription-prediction-fastapi"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/steveee27/Bank-Subscription-Prediction-FASTAPI","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/steveee27%2FBank-Subscription-Prediction-FASTAPI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steveee27%2FBank-Subscription-Prediction-FASTAPI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steveee27%2FBank-Subscription-Prediction-FASTAPI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steveee27%2FBank-Subscription-Prediction-FASTAPI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/steveee27","download_url":"https://codeload.github.com/steveee27/Bank-Subscription-Prediction-FASTAPI/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steveee27%2FBank-Subscription-Prediction-FASTAPI/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31918621,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-16T18:22:33.417Z","status":"online","status_checked_at":"2026-04-17T02:00:06.879Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","logistic-regression","machine-learning","rest-api","subscription-prediction"],"created_at":"2024-12-03T19:10:57.424Z","updated_at":"2026-04-17T06:33:45.316Z","avatar_url":"https://github.com/steveee27.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Bank Subscription Prediction using Machine Learning 
and FASTAPI\n\nThis project predicts whether a customer is likely to subscribe to a long-term deposit based on their demographic and campaign-related data. It includes a machine learning pipeline for data preprocessing, model training, evaluation, and a RESTful API built with FastAPI for deployment.\n\n---\n\n## Table of Contents\n\n- [Overview](#overview)\n- [Technologies Used](#technologies-used)\n- [Project Structure](#project-structure)\n- [Setup Instructions](#setup-instructions)\n- [API Endpoints](#api-endpoints)\n- [Features](#features)\n- [Model Evaluation Results](#model-evaluation-results)\n- [License](#license)\n\n---\n\n## Overview\n\nThe **Bank Marketing Campaign Prediction** project is designed to help a bank focus its marketing efforts on customers who are most likely to subscribe to long-term deposits. It involves:\n- Preprocessing campaign data with feature encoding and scaling.\n- Training and evaluating machine learning models (Logistic Regression and Random Forest).\n- Deploying the best-performing model through a FastAPI-based API.\n\n### Dataset\n\nThe project uses the **Bank Marketing Dataset**, which includes customer demographic data, campaign-related information, and a target variable `y` that indicates whether the customer subscribed to a long-term deposit (`yes`/`no`). 
The dataset consists of 16 features and is located in the `data/bank-marketing.csv` file.\n\n---\n\n## Technologies Used\n\n- **Python 3.10**\n- **FastAPI** for building the RESTful API.\n- **Scikit-learn** for preprocessing and machine learning.\n- **Pandas** for data manipulation.\n- **Uvicorn** as the ASGI server.\n\n---\n\n## Project Structure\n\n```plaintext\n├── data/\n│   └── bank-marketing.csv                # Dataset\n├── models/\n│   ├── logistic_classifier_best.pkl      # Trained Logistic Regression model\n│   └── robust_scaler.pkl                 # Scaler used during preprocessing\n├── src/\n│   └── Training-Model.ipynb              # Notebook for training models\n├── main.py                               # FastAPI application\n├── requirements.txt                      # Python dependencies\n└── README.md                             # Project documentation\n```\n\n---\n\n## Setup Instructions\n\n### Prerequisites\n\n- Python 3.10 or higher\n- Pipenv or virtualenv for environment management\n- `git` installed on your system\n\n### Installation\n\n1. **Clone the repository:**\n\n    Clone the repository from GitHub to your local machine:\n    ```bash\n    git clone https://github.com/steveee27/Bank-Subscription-Prediction-FASTAPI.git\n    cd Bank-Subscription-Prediction-FASTAPI\n    ```\n\n2. **Create and activate a virtual environment:**\n\n    Create a virtual environment to isolate project dependencies:\n    ```bash\n    python -m venv env\n    ```\n\n    Activate the virtual environment:\n    - On Linux/macOS:\n        ```bash\n        source env/bin/activate\n        ```\n    - On Windows:\n        ```bash\n        env\\Scripts\\activate\n        ```\n\n3. **Install project dependencies:**\n\n    Install all required dependencies specified in the `requirements.txt` file:\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n4. 
**Prepare the model and scaler files:**\n\n    Ensure the following files are present in the `models/` directory:\n    - `logistic_classifier_best.pkl` (the trained model)\n    - `robust_scaler.pkl` (the scaler used for preprocessing)\n\n    If the files are missing, refer to the `src/Training-Model.ipynb` notebook to retrain the model and generate these files.\n\n---\n\n### Running the API\n\n1. **Start the FastAPI server:**\n\n    Run the FastAPI server locally:\n    ```bash\n    uvicorn main:app --reload\n    ```\n\n2. **Access the API documentation:**\n\n    Open your browser and navigate to:\n    - Swagger UI: [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs)\n    - ReDoc: [http://127.0.0.1:8000/redoc](http://127.0.0.1:8000/redoc)\n\n---\n\n## API Endpoints\n\n### **1. GET /**\n- **Description**: A welcome endpoint to check if the API is running.\n- **Method**: `GET`\n- **Request**: No parameters required.\n- **Response**:\n    ```json\n    {\n      \"message\": \"Welcome to the Bank Subscription Prediction API!\"\n    }\n    ```\n\n---\n\n### **2. POST /predict**\n- **Description**: Predicts whether a customer is likely to subscribe to a long-term deposit based on input data.\n- **Method**: `POST`\n- **Request Body**: Accepts a JSON object with the following fields:\n\n    | **Field**         | **Type**   | **Description**                               | **Example**          |\n    |-------------------|------------|-----------------------------------------------|----------------------|\n    | `age`             | `integer`  | Age of the customer.                          | 30                   |\n    | `job`             | `string`   | Type of job the customer has.                 | \"technician\"         |\n    | `marital`         | `string`   | Marital status of the customer.               | \"single\"             |\n    | `education`       | `string`   | Education level of the customer.              
| \"university.degree\"  |\n    | `default`         | `string`   | Whether the customer has credit in default.   | \"no\"                 |\n    | `housing`         | `string`   | Whether the customer has a housing loan.      | \"yes\"                |\n    | `loan`            | `string`   | Whether the customer has a personal loan.     | \"no\"                 |\n    | `contact`         | `string`   | Contact communication type.                   | \"cellular\"           |\n    | `month`           | `string`   | Last contact month of the year.               | \"may\"                |\n    | `day_of_week`     | `string`   | Last contact day of the week.                 | \"mon\"                |\n    | `duration`        | `integer`  | Last contact duration in seconds.             | 300                  |\n    | `campaign`        | `integer`  | Number of contacts during this campaign.      | 1                    |\n    | `pdays`           | `integer`  | Number of days since the client was last contacted. | 999             |\n    | `previous`        | `integer`  | Number of contacts performed before this campaign. | 0                 |\n    | `poutcome`        | `string`   | Outcome of the previous marketing campaign.   
| \"nonexistent\"        |\n\n- **Sample Request**:\n    ```json\n    {\n      \"age\": 42,\n      \"job\": \"admin.\",\n      \"marital\": \"single\",\n      \"education\": \"university.degree\",\n      \"default\": \"no\",\n      \"housing\": \"yes\",\n      \"loan\": \"yes\",\n      \"contact\": \"telephone\",\n      \"month\": \"may\",\n      \"day_of_week\": \"wed\",\n      \"duration\": 938,\n      \"campaign\": 1,\n      \"pdays\": 999,\n      \"previous\": 0,\n      \"poutcome\": \"nonexistent\"\n    }\n    ```\n\n- **Response**:\n    - A JSON object indicating the prediction result (`yes` or `no`).\n    ```json\n    {\n      \"prediction\": \"yes\"\n    }\n    ```\n\n- **Sample cURL Command**:\n    ```bash\n    curl -X POST \"http://127.0.0.1:8000/predict\" \\\n    -H \"Content-Type: application/json\" \\\n    -d '{\"age\":42,\"job\":\"admin.\",\"marital\":\"single\",\"education\":\"university.degree\",\"default\":\"no\",\"housing\":\"yes\",\"loan\":\"yes\",\"contact\":\"telephone\",\"month\":\"may\",\"day_of_week\":\"wed\",\"duration\":938,\"campaign\":1,\"pdays\":999,\"previous\":0,\"poutcome\":\"nonexistent\"}'\n    ```\n\n### Notes\n- **Data Validation**: The API validates the input data. If any required field is missing or contains invalid data, the API will return an error response.\n- **Default Port**: The API runs on port `8000` by default. To use a different port, pass `--port` to the `uvicorn` command.\n\n---\n\n## Features\n\nThe project is built to provide a complete end-to-end solution for predicting customer subscription likelihood. 
Below are the core features:\n\n- **Data Preprocessing**: \n  - Handles missing values to ensure data integrity.\n  - Encodes categorical features (e.g., job, marital status) into numerical representations for machine learning compatibility.\n  - Scales numerical features (e.g., age, duration, campaign) to normalize the data for better model performance.\n\n- **Model Training**:\n  - Implements two machine learning algorithms: Logistic Regression and Random Forest.\n  - Tunes hyperparameters using GridSearchCV for optimal performance.\n  - Evaluates models using precision, recall, F1-score, and accuracy to select the best-performing model.\n\n- **RESTful API**:\n  - Deploys the selected model using FastAPI to serve predictions in real-time.\n  - Provides user-friendly API endpoints for integration with other systems.\n  - Features automated input validation to ensure robust and reliable API interactions.\n\nThis combination of features ensures that the project is both technically robust and user-friendly, offering valuable insights and predictions to support marketing decisions.\n\n---\n\n## Model Evaluation Results\n\nThis project compares the performance of **Random Forest** and **Logistic Regression** models, tuned using GridSearchCV. 
Below are the hyperparameters and evaluation metrics for both **Class 0** and **Class 1**:\n\n### Evaluation Metrics\n\n| **Model**              | **Hyperparameters**                                                                                      | **Precision (Class 0)** | **Recall (Class 0)** | **F1-Score (Class 0)** | **Precision (Class 1)** | **Recall (Class 1)** | **F1-Score (Class 1)** | **Accuracy** |\n|-------------------------|---------------------------------------------------------------------------------------------------------|-------------------------|----------------------|-------------------------|-------------------------|----------------------|-------------------------|--------------|\n| **Random Forest**       | `{'criterion': 'gini', 'max_depth': None, 'min_samples_split': 5, 'n_estimators': 50}`                 | 0.93                    | 0.97                 | 0.95                    | 0.51                    | 0.29                 | 0.37                    | 91%          |\n| **Logistic Regression** | `{'penalty': 'l2', 'C': 1, 'max_iter': 100}`                                                           | 0.94                    | 0.97                 | 0.96                    | 0.60                    | 0.37                 | 0.46                    | 92%          |\n\nThe dataset exhibits significant class imbalance, with most examples belonging to Class 0 (no subscription). While both Random Forest and Logistic Regression perform well for Class 0, Logistic Regression outperforms Random Forest for Class 1, achieving higher recall (0.37 vs. 0.29) and F1-Score (0.46 vs. 0.37), making it better suited for identifying potential subscribers. 
Both models achieve high overall accuracy (Random Forest: 91%, Logistic Regression: 92%), but accuracy is inflated by the class imbalance, so minority-class performance matters more than accuracy alone.\n\n### Conclusion\n\n**Logistic Regression** is selected as the final model due to its superior performance in predicting the minority class (**Class 1**) while maintaining high performance for the majority class (**Class 0**). This makes it more effective for identifying potential customers likely to subscribe, supporting better marketing decisions.\n\n---\n\n## License\n\nThis project is licensed under the [MIT License](./LICENSE).\n\nYou are free to use, modify, and distribute this project as long as the copyright notice and license text are retained in copies of the software. See the `LICENSE` file for more details.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsteveee27%2Fbank-subscription-prediction-fastapi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsteveee27%2Fbank-subscription-prediction-fastapi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsteveee27%2Fbank-subscription-prediction-fastapi/lists"}