{"id":31054510,"url":"https://github.com/4techsadiq/sql-data-warehouse-project","last_synced_at":"2026-02-12T08:32:02.017Z","repository":{"id":312514474,"uuid":"1047748348","full_name":"4TechSadiq/sql-data-warehouse-project","owner":"4TechSadiq","description":"SQL data warehouse project","archived":false,"fork":false,"pushed_at":"2025-09-29T06:51:11.000Z","size":1103,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-29T08:27:46.242Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TSQL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/4TechSadiq.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-31T06:12:48.000Z","updated_at":"2025-09-29T06:51:14.000Z","dependencies_parsed_at":"2025-09-29T08:21:31.825Z","dependency_job_id":"7f93bd3b-d8eb-45c5-ab09-ce3560092c6d","html_url":"https://github.com/4TechSadiq/sql-data-warehouse-project","commit_stats":null,"previous_names":["4techsadiq/sql-data-warehouse-project"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/4TechSadiq/sql-data-warehouse-project","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/4TechSadiq%2Fsql-data-warehouse-project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/4TechSadiq%2Fsql-data-warehouse-project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/4TechSadiq%2Fsql-data-warehouse-project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/4TechSadiq%2Fsql-data-warehouse-project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/4TechSadiq","download_url":"https://codeload.github.com/4TechSadiq/sql-data-warehouse-project/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/4TechSadiq%2Fsql-data-warehouse-project/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29361818,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-12T01:03:07.613Z","status":"online","status_checked_at":"2026-02-12T02:00:06.911Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-15T04:00:06.430Z","updated_at":"2026-02-12T08:32:02.003Z","avatar_url":"https://github.com/4TechSadiq.png","language":"TSQL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CRM \u0026 ERP Data Warehouse Project\n\n## 📊 Overview\nThis project implements a comprehensive data warehouse solution for Customer Relationship Management (CRM) and Enterprise Resource Planning (ERP) datasets using SQL Server. 
The solution follows industry best practices including the Medallion Architecture pattern for data processing and standardized naming conventions for maintainable and scalable data infrastructure.\n\n## 🏗️ Architecture\n\n### Medallion Architecture Implementation\nThe project follows the **Medallion Architecture** (Bronze-Silver-Gold) pattern:\n\n- **Bronze Layer (Raw Data)**: Contains raw, unprocessed data ingested from CSV files\n- **Silver Layer (Cleaned Data)**: Cleaned and standardized data with basic transformations\n- **Gold Layer (Business-Ready Data)**: Aggregated, business-ready data optimized for analytics and reporting\n\n```\nRaw Data (CSV) → Bronze Layer → Silver Layer → Gold Layer → Analytics \u0026 Reporting\n```\n\n## 📁 Project Structure\n```\n├── schemas/\n│   ├── bronze_schema.sql\n│   ├── silver_schema.sql\n│   └── gold_schema.sql\n├── stored_procedures/\n│   ├── data_ingestion/\n│   ├── data_transformation/\n│   └── data_loading/\n├── views/\n│   ├── analytical_views/\n│   └── reporting_views/\n├── data_pipeline/\n│   └── pipeline_design.drawio\n├── documentation/\n│   ├── data_dictionary.md\n│   ├── transformation_rules.md\n│   └── business_requirements.md\n└── sample_data/\n    ├── crm_data.csv\n    └── erp_data.csv\n```\n\n## 🎯 Key Features\n\n### Data Processing Pipeline\n- **Automated CSV Data Ingestion**: Stored procedures for loading data from CSV files\n- **Data Quality Assurance**: Comprehensive data cleaning and validation processes\n- **Business Logic Implementation**: Transformations aligned with business requirements\n- **Incremental Data Loading**: Efficient data processing with change data capture\n\n### Database Design\n- **Standardized Naming Conventions**: Consistent table, column, and schema naming patterns\n- **Optimized Table Structures**: Proper indexing and partitioning strategies\n- **Data Integrity**: Comprehensive constraints and referential integrity\n- **Performance Optimization**: Query optimization and efficient 
data access patterns\n\n### Analytics \u0026 Reporting\n- **Pre-built Analytical Views**: Ready-to-use views for common business queries\n- **Reporting Infrastructure**: Optimized views for BI tools and reporting platforms\n- **Data Mart Structure**: Subject-specific data marts for different business domains\n\n## 🛠️ Technologies Used\n- **Database**: Microsoft SQL Server\n- **Data Processing**: T-SQL, Stored Procedures\n- **Pipeline Design**: Draw.io for visual pipeline documentation\n- **Version Control**: Git\n- **Documentation**: Markdown\n\n## 📋 Database Schema\n\n### Naming Convention Standards\n- **Schemas**: `{layer}_{domain}` (e.g., `bronze_crm`, `silver_erp`, `gold_analytics`)\n- **Tables**: `{entity_name}_{table_type}` (e.g., `customer_dim`, `sales_fact`)\n- **Columns**: `snake_case` with descriptive names\n- **Stored Procedures**: `sp_{action}_{entity}` (e.g., `sp_load_customer_data`)\n- **Views**: `vw_{purpose}_{entity}` (e.g., `vw_sales_summary`)\n\n### Key Entities\n- **CRM Data**: Customers, Contacts, Opportunities, Sales Activities\n- **ERP Data**: Products, Orders, Inventory, Financial Transactions\n- **Integrated Views**: Customer 360, Sales Performance, Inventory Analysis\n\n## 🔧 Installation \u0026 Setup\n\n### Prerequisites\n- SQL Server 2017 or later\n- SQL Server Management Studio (SSMS)\n- Appropriate permissions for database creation and data loading\n\n### Setup Instructions\n\n1. **Clone the Repository**\n   ```bash\n   git clone https://github.com/4TechSadiq/sql-data-warehouse-project.git\n   cd sql-data-warehouse-project\n   ```\n\n2. **Database Setup**\n   ```sql\n   -- Create the warehouse database, then switch to it.\n   -- GO ends each batch so that USE runs only after the database exists.\n   CREATE DATABASE CRM_ERP_DataWarehouse;\n   GO\n   USE CRM_ERP_DataWarehouse;\n   GO\n   \n   -- Run schema scripts in order\n   -- 1. Bronze layer schemas\n   -- 2. Silver layer schemas\n   -- 3. Gold layer schemas\n   ```\n\n3. 
**Deploy Stored Procedures**\n   ```sql\n   -- Execute all stored procedure scripts\n   -- Located in /stored_procedures/ directory\n   ```\n\n4. **Create Views**\n   ```sql\n   -- Execute view creation scripts\n   -- Located in /views/ directory\n   ```\n\n## 📊 Data Pipeline\n\nThe data pipeline is visually documented using Draw.io and includes:\n\n1. **Data Ingestion**: CSV file processing and validation\n2. **Data Transformation**: Cleaning, standardization, and business rule application\n3. **Data Loading**: Efficient loading into respective layers\n4. **Data Quality Checks**: Automated validation and error handling\n5. **Reporting Layer**: View creation and optimization\n\n### Pipeline Flow\n```\nCSV Files → Data Validation → Bronze Layer → Data Cleaning → Silver Layer → Business Logic → Gold Layer → Analytics Views\n```\n\n## 📈 Usage Examples\n\n### Loading Data from CSV\n```sql\n-- Load customer data from CSV\nEXEC sp_load_customer_data @file_path = 'C:\\data\\customers.csv'\n\n-- Load sales data from CSV\nEXEC sp_load_sales_data @file_path = 'C:\\data\\sales.csv'\n```\n\n### Analytical Queries\n```sql\n-- Customer 360 view\nSELECT * FROM gold_analytics.vw_customer_360 \nWHERE customer_status = 'Active'\n\n-- Sales performance analysis\nSELECT * FROM gold_analytics.vw_sales_performance\nWHERE sales_date \u003e= '2024-01-01'\n```\n\n## 📝 Documentation\n\n### Available Documentation\n- **Data Dictionary**: Complete field definitions and business rules\n- **Transformation Rules**: Detailed data transformation logic\n- **Business Requirements**: Original requirements and implementation mapping\n- **Code Documentation**: Inline comments and procedure documentation\n\n## 🔍 Data Quality \u0026 Governance\n\n### Data Quality Measures\n- **Data Validation**: Input validation and constraint checking\n- **Duplicate Detection**: Automated duplicate identification and handling\n- **Data Completeness**: Missing value identification and treatment\n- **Referential 
Integrity**: Cross-table relationship validation\n\n### Monitoring \u0026 Logging\n- **ETL Logging**: Comprehensive process logging and error tracking\n- **Performance Monitoring**: Query performance and resource utilization tracking\n- **Data Lineage**: Complete data transformation tracking\n\n## 🚀 Future Enhancements\n- [ ] Real-time data streaming integration\n- [ ] Machine learning model integration\n- [ ] Advanced analytics capabilities\n- [ ] Cloud migration (Azure SQL Database)\n- [ ] API development for external access\n- [ ] Advanced security implementation\n\n## 🤝 Contributing\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/AmazingFeature`)\n3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)\n4. Push to the branch (`git push origin feature/AmazingFeature`)\n5. Open a Pull Request\n\n## 📄 License\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 📞 Contact\n- **Project Maintainer**: [Mohammed Sadiq Ali]\n- **Email**: [sadikcp2014@gmail.com]\n- **LinkedIn**: [https://www.linkedin.com/in/moh-sadiq-ali/]\n- **GitHub**: [https://github.com/4TechSadiq]\n\n## 🙏 Acknowledgments\n- Thanks to the business stakeholders for providing clear requirements\n- Appreciation for the data engineering community for best practices\n- Recognition of open-source tools and resources used in this project\n\n---\n\n**Note**: This project demonstrates enterprise-level data warehousing practices and can serve as a template for similar CRM/ERP integration projects.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F4techsadiq%2Fsql-data-warehouse-project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F4techsadiq%2Fsql-data-warehouse-project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F4techsadiq%2Fsql-data-warehouse-project/lists"}