https://github.com/msaakaash/hospital-data-warehouse
A data warehouse project designed to demonstrate SQL and data modeling skills.
https://github.com/msaakaash/hospital-data-warehouse
etl-pipeline python sql
Last synced: 12 months ago
JSON representation
A data warehouse project designed to demonstrate SQL and data modeling skills.
- Host: GitHub
- URL: https://github.com/msaakaash/hospital-data-warehouse
- Owner: msaakaash
- License: mit
- Created: 2025-05-10T12:47:43.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-03T16:52:57.000Z (about 1 year ago)
- Last Synced: 2025-06-04T02:14:09.487Z (about 1 year ago)
- Topics: etl-pipeline, python, sql
- Language: SQL
- Homepage:
- Size: 1.04 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: docs/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: docs/CODE_OF_CONDUCT.md
- Security: docs/SECURITY.md
Awesome Lists containing this project
README
# Hospita Data Warehouse
A data warehouse project designed to demonstrate SQL and data modeling skills. It follows a star schema structure to track and analyze hospital visit records, patient data, diagnoses, and readmission trends.
## Project Objectives
- Design a dimensional model for healthcare data.
- Simulate hospital visit data using a **star schema**.
- Build a foundational warehouse for advanced analytics.
- Write SQL queries to derive insights on patient behavior, doctor performance, and readmission rates.
## Diagram

## Phases
- Phase 1 – Schema Design & Setup
- Phase 2 – Sample Data Ingestion
- Phase 3 – Query Building & Insights
- Phase 4 – BI Dashboard & Deployment
- Phase 5 – ETL pipeline
- Phase 6 - Analysis
## Star Schema Design
### Fact Table: `fact_visits`
Captures core metrics related to patient hospital visits.
| Column | Type | Description |
|-----------------|--------------|---------------------------------|
| visit_id | SERIAL (PK) | Unique visit identifier |
| patient_id | INT (FK) | Reference to the patient |
| doctor_id | INT (FK) | Reference to the doctor |
| department_id | INT (FK) | Department involved |
| admit_date | DATE | Date of admission |
| discharge_date | DATE | Date of discharge |
| diagnosis_code | VARCHAR(50) | Code of diagnosis |
| cost | DECIMAL | Total cost of treatment |
| readmitted | BOOLEAN | Readmission flag |
### Dimension Tables
#### `dim_patients`
Stores patient demographic information.
#### `dim_doctors`
Stores doctor details and specialization.
#### `dim_departments`
Stores department-level metadata.
#### `dim_diagnosis`
Stores diagnosis code, description, and category.
## Sample Use Cases
- Track average length of stay by department.
- Analyze readmission rates by diagnosis.
- Rank doctors by total cost billed.
## Tools Used
- PostgreSQL (or MySQL)
- SQL
- ER Diagram Tools (dbdiagram.io / DrawSQL)
- CSV (for loading sample data)
## Contributing
Contributions are welcome! Please see [CONTRIBUTING.md](docs/CONTRIBUTING.md) for guidelines.
## Code of Conduct
Please read our [Code of Conduct](docs/CODE_OF_CONDUCT.md) before contributing to this project.
## Security
If you discover a vulnerability, please refer to our [Security Policy](docs/SECURITY.md) for instructions on how to report it responsibly.
## License
This project is licensed under the [MIT License](LICENSE).
## Author
[**Aakaash M S**](https://github.com/msaakaash)