{"id":30304803,"url":"https://github.com/ynstf/boston-housing-api","last_synced_at":"2025-08-17T07:33:40.718Z","repository":{"id":293633539,"uuid":"984659447","full_name":"ynstf/boston-housing-api","owner":"ynstf","description":null,"archived":false,"fork":false,"pushed_at":"2025-05-16T09:48:00.000Z","size":9,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-16T10:42:40.256Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ynstf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-16T09:41:57.000Z","updated_at":"2025-05-16T09:48:03.000Z","dependencies_parsed_at":"2025-05-16T10:42:44.357Z","dependency_job_id":"f27de4bc-5dd4-4567-8af9-5cf003c52399","html_url":"https://github.com/ynstf/boston-housing-api","commit_stats":null,"previous_names":["ynstf/boston-housing-api"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ynstf/boston-housing-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ynstf%2Fboston-housing-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ynstf%2Fboston-housing-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ynstf%2Fboston-housing-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ynstf%2Fboston-housing-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/owners/ynstf","download_url":"https://codeload.github.com/ynstf/boston-housing-api/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ynstf%2Fboston-housing-api/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270820675,"owners_count":24651515,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-17T02:00:09.016Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-08-17T07:33:40.132Z","updated_at":"2025-08-17T07:33:40.704Z","avatar_url":"https://github.com/ynstf.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Boston Housing API\n\n## 1. Introduction\n\nThis report documents the development of a Boston Housing Predictor API, a web service that predicts housing prices based on the Boston Housing dataset. The project integrates machine learning with a web API framework to provide real-time housing price predictions and recommendations.\n\nThe application is built using FastAPI as the web framework, SQLite for data persistence, and scikit-learn for the machine learning capabilities. 
The API exposes endpoints for creating new housing entries, retrieving housing data, making price predictions, and receiving housing recommendations based on target prices.\n\nKey features of the Boston Housing API include:\n- Data storage and retrieval for Boston housing information\n- ML-powered price prediction from housing attributes\n- House recommendation system based on price similarity\n- RESTful API interface with automatic documentation\n\n## 2. Regression Model\n\n### 2.1 Dataset Overview\n\nThe project uses the Boston Housing dataset, a well-known dataset in machine learning containing information about housing in various suburbs of Boston. The dataset includes the following features used in our implementation:\n\n- `rm`: Average number of rooms per dwelling\n- `lstat`: Percentage of lower status population\n- `dis`: Weighted distances to employment centers\n- `tax`: Property tax rate\n- `ptratio`: Pupil-teacher ratio\n- `age`: Proportion of owner-occupied units built prior to 1940\n- `indus`: Proportion of non-retail business acres per town\n- `medv`: Median value of owner-occupied homes in $1000s (target variable)\n\n### 2.2 Model Development\n\nThe machine learning model was developed using scikit-learn's pipeline architecture, which combines preprocessing steps with the regression algorithm. The pipeline includes:\n\n1. **Data Splitting**: The dataset was split into training (80%) and testing (20%) sets.\n2. **Feature Engineering**: The pipeline includes:\n   - Feature scaling using StandardScaler\n   - Polynomial feature generation (degree=2) to capture non-linear relationships\n3. 
**Regression Model**: Ridge Regression with alpha=10 to prevent overfitting\n\n```python\nfrom sklearn.linear_model import Ridge\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.preprocessing import PolynomialFeatures, StandardScaler\n\n# Build a pipeline: scaling -\u003e polynomial features -\u003e ridge regression\npipeline = Pipeline([\n    ('scaler', StandardScaler()),\n    ('poly', PolynomialFeatures(degree=2, include_bias=False)),\n    ('model', Ridge(alpha=10))\n])\n```\n\n### 2.3 Model Performance\n\nThe improved model outperforms the baseline model on both R² and MAPE:\n\n| Metric | Baseline Model | Improved Model |\n|--------|--------------|---------------|\n| R² Score | 0.639 | 0.821 |\n| MAPE | 18.0% | 12.2% |\n| RMSE | N/A | 3.618 k$ |\n\nWith a Mean Absolute Percentage Error (MAPE) of 12.2%, predictions deviate from the actual values by about 12% on average. This report quotes the complement, 100% - 12.2% = 87.8%, as the model's regression accuracy.\n\n### 2.4 Model Deployment\n\nThe trained pipeline was serialized with Python's `pickle` module and saved as `pipeline.pkl` for use in the FastAPI application:\n\n```python\nimport pickle\n\n# Serialize the trained pipeline (preprocessing steps + Ridge model) to disk\nwith open('pipeline.pkl', 'wb') as f:\n    pickle.dump(pipeline, f, protocol=pickle.HIGHEST_PROTOCOL)\n```\n\n## 3. 
UML Diagrams\n\n### 3.1 Class Diagram\n\nBased on the project's implementation, the following class diagram represents the core components of the Boston Housing API:\n\n```mermaid\nclassDiagram\n    %% Database ORM Model\n    class Home {\n      +Integer id\n      +Float rm\n      +Float lstat\n      +Float dis\n      +Float tax\n      +Float ptratio\n      +Float age\n      +Float indus\n      +Float medv\n    }\n\n    %% Pydantic Schemas\n    class HomeBase {\n      +Float rm\n      +Float lstat\n      +Float dis\n      +Float tax\n      +Float ptratio\n      +Float age\n      +Float indus\n    }\n    class HomeCreate {\n      +Float medv\n      +create_home(home: HomeCreate)\n    }\n    class HomeOut {\n      +Integer id\n      +Float medv\n    }\n    class Prediction {\n      +Float predicted_price_dh\n      +predict_price(home: HomeBase)\n    }\n\n    %% FastAPI App and Endpoints\n    class FastAPIApp {\n      +read_root()\n      +create_home(home: HomeCreate)\n      +list_homes(skip: int, limit: int)\n      +predict_price(home: HomeBase)\n      +recommendation(price: float, limit: int)\n    }\n\n    %% Relationships\n    Home \u003c|-- HomeOut\n    HomeBase \u003c|-- HomeCreate\n    HomeBase \u003c|-- Prediction\n    HomeCreate --\u003e Home : creates\n    HomeOut --\u003e Home : reads\n    Prediction ..\u003e HomeBase : input schema\n    FastAPIApp o-- Home : uses\n    FastAPIApp o-- HomeCreate : input\n    FastAPIApp o-- HomeOut : output\n    FastAPIApp o-- Prediction : output\n\n    %% Recommendation logic uses FUNC\n    class Recommendation {\n      +recommendation(price: float, limit: int)\n    }\n    FastAPIApp --\u003e Recommendation : provides endpoint\n```\n\nThis diagram shows the core components of the application:\n- Data models (SQLAlchemy and Pydantic)\n- API controllers (FastAPI routes)\n- ML prediction model\n- Database dependency\n\n### 3.2 Sequence Diagram for Price Prediction\n\n```mermaid\nsequenceDiagram\n    participant Client\n    
participant FastAPI\n    participant PredictionModel\n    \n    Client-\u003e\u003eFastAPI: POST /predict/ (with housing features)\n    FastAPI-\u003e\u003ePredictionModel: Predict price\n    PredictionModel--\u003e\u003eFastAPI: Return prediction value\n    FastAPI-\u003e\u003eFastAPI: Convert to dirham\n    FastAPI--\u003e\u003eClient: Return prediction response\n```\n\n### 3.3 Sequence Diagram for House Recommendation\n\n```mermaid\nsequenceDiagram\n    participant Client\n    participant FastAPI\n    participant Database\n    \n    Client-\u003e\u003eFastAPI: GET /recommendation/ (with target price)\n    FastAPI-\u003e\u003eFastAPI: Convert price to thousands of dollars\n    FastAPI-\u003e\u003eDatabase: Query houses closest to target price\n    Database--\u003e\u003eFastAPI: Return matching houses\n    FastAPI--\u003e\u003eClient: Return recommendations response\n```\n\n## 4. FastAPI Technology and Built-in Functionality\n\n### 4.1 FastAPI Overview\n\nFastAPI is a modern, high-performance web framework for building APIs with Python based on standard Python type hints. Key advantages that made it suitable for this project include:\n\n1. **Performance**: FastAPI is built on Starlette and Pydantic, making it one of the fastest Python frameworks available.\n2. **Automatic Documentation**: FastAPI generates interactive API documentation (Swagger UI and ReDoc) automatically.\n3. **Data Validation**: Using Pydantic models, FastAPI validates request data automatically.\n4. **Dependency Injection**: The framework provides a clean way to inject dependencies like database sessions.\n5. 
**Modern Python Features**: FastAPI leverages Python 3.6+ features including type hints and async/await.\n\n### 4.2 Key FastAPI Features Used in the Project\n\n#### 4.2.1 Pydantic Models for Data Validation\n\nThe project uses Pydantic models to validate incoming request data and define response schemas:\n\n```python\nfrom pydantic import BaseModel\n\nclass HomeBase(BaseModel):\n    rm: float\n    lstat: float\n    dis: float\n    tax: float\n    ptratio: float\n    age: float\n    indus: float\n\nclass HomeCreate(HomeBase):\n    medv: float  # Include actual median value when creating\n\nclass HomeOut(HomeBase):\n    id: int\n    medv: float\n    class Config:\n        orm_mode = True  # Allow building the schema from SQLAlchemy ORM objects\n```\n\n#### 4.2.2 Dependency Injection\n\nThe project uses FastAPI's dependency injection system to manage database sessions:\n\n```python\ndef get_db():\n    # Open a session for the request and always close it afterwards\n    db = SessionLocal()\n    try:\n        yield db\n    finally:\n        db.close()\n```\n\n#### 4.2.3 Path Operations (Route Handlers)\n\nThe API exposes several endpoints defined as path operation functions:\n\n```python\n@app.get(\"/homes/\", response_model=list[HomeOut])\ndef list_homes(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):\n    # Paginate the homes table with skip/limit\n    return db.query(Home).offset(skip).limit(limit).all()\n```\n\n#### 4.2.4 Automatic Documentation\n\nFastAPI automatically generates interactive API documentation using Swagger UI and ReDoc:\n\n- Swagger UI available at `/docs`\n- ReDoc available at `/redoc`\n\nThis documentation includes:\n- Request and response schemas\n- Example request bodies\n- Available query parameters\n- HTTP status codes\n- Authentication requirements (if any)\n\n## 5. Database and API Endpoints\n\n### 5.1 Database Structure\n\nThe application uses a SQLite database with SQLAlchemy as the ORM. 
The database consists of a single table:\n\n**Table: homes**\n\n| Column | Type | Description |\n|--------|------|-------------|\n| id | Integer | Primary key |\n| rm | Float | Average number of rooms per dwelling |\n| lstat | Float | Percentage of lower status population |\n| dis | Float | Weighted distances to employment centers |\n| tax | Float | Property tax rate |\n| ptratio | Float | Pupil-teacher ratio |\n| age | Float | Proportion of owner-occupied units built prior to 1940 |\n| indus | Float | Proportion of non-retail business acres per town |\n| medv | Float | Median value in $1000s |\n\n### 5.2 API Endpoints\n\nThe API exposes the following endpoints:\n\n#### 5.2.1 Root Endpoint\n\n```\nGET /\n```\n\n**Description**: Returns a welcome message.  \n**Response**:\n```json\n{\n  \"message\": \"Welcome to the Boston Housing Predictor API\"\n}\n```\n\n#### 5.2.2 Create Home\n\n```\nPOST /homes/\n```\n\n**Description**: Creates a new home entry in the database.  \n**Request Body**:\n```json\n{\n  \"rm\": 6.5,\n  \"lstat\": 4.98,\n  \"dis\": 6.0,\n  \"tax\": 296,\n  \"ptratio\": 15.3,\n  \"age\": 65.2,\n  \"indus\": 2.31,\n  \"medv\": 24.0\n}\n```\n**Response**: The created home object with its ID.\n\n#### 5.2.3 List Homes\n\n```\nGET /homes/\n```\n\n**Description**: Returns a list of homes from the database.  \n**Query Parameters**:\n- `skip` (int, default: 0): Number of records to skip\n- `limit` (int, default: 100): Maximum number of records to return\n\n**Response**: Array of home objects.\n\n#### 5.2.4 Predict Price\n\n```\nPOST /predict/\n```\n\n**Description**: Predicts the price of a home based on its features.  
\n**Request Body**:\n```json\n{\n  \"rm\": 6.5,\n  \"lstat\": 4.98,\n  \"dis\": 6.0,\n  \"tax\": 296,\n  \"ptratio\": 15.3,\n  \"age\": 65.2,\n  \"indus\": 2.31\n}\n```\n**Response**:\n```json\n{\n  \"predicted_price_dh\": 259600.0\n}\n```\nNote: The response is in dirhams. The model's prediction in $1000s is multiplied by 1000 and then by 10, so a prediction of 25.96 yields 259600.0.\n\n#### 5.2.5 Get Recommendations\n\n```\nGET /recommendation/\n```\n\n**Description**: Returns homes with values closest to the specified price.  \n**Query Parameters**:\n- `price` (float): Target price in dirhams\n- `limit` (int, default: 20): Maximum number of recommendations\n\n**Response**: Array of home objects ordered by price similarity.\n\n### 5.3 Request and Response Flow\n\nDepending on the endpoint, a request passes through the following steps:\n\n1. **Request Validation**: FastAPI validates incoming requests against Pydantic models\n2. **Database Interaction** (for `/homes/` and `/recommendation/`): The API interacts with the SQLite database via the SQLAlchemy ORM\n3. **Model Prediction** (for `/predict/`): The API uses the loaded scikit-learn model\n4. **Response Serialization**: FastAPI serializes the response using Pydantic models\n\n## 6. 
Project Structure\n\nThe project is organized with the following structure:\n\n```\nboston-housing-api/\n├── homes.db               # SQLite database\n├── load_data.py           # Script to load data from CSV to database\n├── main.py                # FastAPI application\n├── pipeline.pkl           # Serialized ML model\n├── README.md              # Project documentation\n└── requirements.txt       # Project dependencies\n```\n\n### Component Description:\n\n- **homes.db**: SQLite database storing the Boston housing data\n- **load_data.py**: Script that downloads the Boston housing dataset, filters required columns, and inserts records into the database\n- **main.py**: Main FastAPI application defining models, routes, and API logic\n- **pipeline.pkl**: Serialized scikit-learn pipeline containing preprocessing steps and Ridge regression model\n- **requirements.txt**: Project dependencies for easy installation\n\n## 7. Conclusion\n\nThe Boston Housing API project successfully integrates machine learning with a modern web API framework to create a practical housing price prediction service. The combination of FastAPI, SQLAlchemy, and scikit-learn provides a powerful, efficient, and developer-friendly solution.\n\nKey achievements of the project include:\n- Development of a high-performance regression model with 87.8% accuracy\n- Creation of a RESTful API with automatic documentation\n- Implementation of a recommendation system based on price similarity\n- Integration of data persistence using SQLAlchemy and SQLite\n\n### Frontend Work\n\nThe API endpoints developed in this project provide a solid backend foundation ready to be integrated with frontend applications. 
An Angular-based frontend could leverage these endpoints to create an interactive housing price prediction and recommendation system for end users.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fynstf%2Fboston-housing-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fynstf%2Fboston-housing-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fynstf%2Fboston-housing-api/lists"}