https://github.com/keerthanapalanikumar/data-cleaning-on-sql

This repository contains SQL scripts and documentation for cleaning and standardizing data in the NashvilleHousing table within the sqlproject2 database. The project aims to prepare the dataset for analysis by addressing inconsistencies, filling missing values, standardizing formats, and removing duplicates.

data-cleaning data-deduplication data-manipulation data-standardization database-management mssql ssms

# SQL Project 2: Nashville Housing Data Cleaning

This repository contains SQL scripts and documentation for cleaning and standardizing data in the `NashvilleHousing` table within the `sqlproject2` database. The project aims to prepare the dataset for analysis by addressing inconsistencies, filling missing values, standardizing formats, and removing duplicates.

## Key Features

- **Database Creation**: Initializes the `sqlproject2` database.
- **Data Standardization**: Converts date formats and standardizes field values.
- **Address Processing**: Splits combined address fields into separate columns for easier analysis.
- **Data Deduplication**: Identifies and removes duplicate records to ensure data integrity.
- **Column Cleanup**: Removes unused columns to streamline the dataset.
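The cleaning steps listed above could look roughly like the following T-SQL. This is a minimal sketch, not the repository's actual scripts: the column names (`UniqueID`, `ParcelID`, `PropertyAddress`, `SaleDate`, `SalePrice`, `LegalReference`) and the new columns (`SaleDateConverted`, `PropertyStreet`, `PropertyCity`) are assumptions made for illustration, as is the "street, city" layout of the address field.

```sql
USE sqlproject2;
GO

-- Data standardization: keep the sale date as a plain DATE.
-- SaleDate and SaleDateConverted are assumed column names.
ALTER TABLE NashvilleHousing ADD SaleDateConverted DATE;
GO
UPDATE NashvilleHousing
SET SaleDateConverted = CONVERT(DATE, SaleDate);
GO

-- Address processing: split an assumed "street, city" value at the first comma.
ALTER TABLE NashvilleHousing
    ADD PropertyStreet NVARCHAR(255),
        PropertyCity   NVARCHAR(255);
GO
UPDATE NashvilleHousing
SET PropertyStreet = LTRIM(RTRIM(LEFT(PropertyAddress,
                         CHARINDEX(',', PropertyAddress) - 1))),
    PropertyCity   = LTRIM(RTRIM(SUBSTRING(PropertyAddress,
                         CHARINDEX(',', PropertyAddress) + 1,
                         LEN(PropertyAddress))))
WHERE PropertyAddress LIKE '%,%';  -- skip rows that have no comma to split on
GO

-- Data deduplication: number the rows within each group of identical
-- business keys and delete every row after the first.
WITH RankedRows AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY ParcelID, PropertyAddress, SaleDate,
                            SalePrice, LegalReference
               ORDER BY UniqueID
           ) AS RowNum
    FROM NashvilleHousing
)
DELETE FROM RankedRows
WHERE RowNum > 1;
GO

-- Column cleanup: drop the original columns once their cleaned
-- replacements have been checked.
ALTER TABLE NashvilleHousing
    DROP COLUMN PropertyAddress, SaleDate;
GO
```

The `GO` separators matter here: SQL Server compiles a batch as a whole, so a column added by `ALTER TABLE` cannot be referenced by an `UPDATE` in the same batch. Because the deduplication and column-drop steps are destructive, it is safer to run them against a copy of the table first.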

## Usage

1. **Setup**: Create and populate the `NashvilleHousing` table in the `sqlproject2` database.
2. **Execution**: Run the provided SQL scripts in SQL Server Management Studio (SSMS) to clean the data.
3. **Verification**: Query the cleaned table to confirm the changes, for example by checking the row count, the column list, and that no duplicate records remain.
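As a rough illustration of the setup and verification steps, the queries below create the database if it is missing and then run a few sanity checks after cleaning. The grouping columns (`ParcelID`, `SalePrice`, `LegalReference`) are assumed names, not taken from the repository, and loading the data into `NashvilleHousing` is left to whichever import method you use.

```sql
-- Setup: create the database only if it does not exist yet.
IF DB_ID('sqlproject2') IS NULL
    CREATE DATABASE sqlproject2;
GO
USE sqlproject2;
GO

-- Verification 1: how many rows remain after cleaning?
SELECT COUNT(*) AS TotalRows
FROM NashvilleHousing;

-- Verification 2: any groups that still look like duplicates?
-- (ParcelID, SalePrice, LegalReference are assumed column names.)
SELECT ParcelID, SalePrice, LegalReference, COUNT(*) AS Copies
FROM NashvilleHousing
GROUP BY ParcelID, SalePrice, LegalReference
HAVING COUNT(*) > 1;

-- Verification 3: list the table's current columns and data types
-- to confirm that conversions and column drops took effect.
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'NashvilleHousing'
ORDER BY ORDINAL_POSITION;
```

An empty result from the second query suggests the deduplication step worked for the chosen key columns. It does not rule out duplicates defined by a different set of columns.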

## Documentation

- **README**: Provides an overview of the project, step-by-step instructions, and usage guidelines.
- **SQL Scripts**: Contain the SQL commands for each data cleaning step, with comments explaining what each one does.