https://github.com/anas436/analyzing-real-world-data-set-with-sqlite3-and-sqlmagic-using-python
- Host: GitHub
- URL: https://github.com/anas436/analyzing-real-world-data-set-with-sqlite3-and-sqlmagic-using-python
- Owner: Anas436
- Created: 2022-05-31T16:43:44.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-05-31T16:48:50.000Z (almost 4 years ago)
- Last Synced: 2025-02-01T15:31:02.674Z (about 1 year ago)
- Topics: csv, jupyter-notebook, matplotlib, pandas, python3, seaborn, sql-magic, sqlite3
- Language: Jupyter Notebook
- Homepage:
- Size: 19.5 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Analyzing-Real-World-Data-Set-with-SQLite3-and-SQLMagic-using-Python
The city of Chicago released a dataset of socioeconomic data to the Chicago City Portal. This dataset contains a selection of six socioeconomic indicators of public health significance and a “hardship index,” for each Chicago community area, for the years 2008 – 2012.
Scores on the hardship index can range from 1 to 100, with a higher index number representing a greater level of hardship.
A detailed description of the dataset can be found on the city of Chicago's website, but to summarize, the dataset has the following variables:
Community Area Number (ca): Used to uniquely identify each row of the dataset
Community Area Name (community_area_name): The name of the region in the city of Chicago
Percent of Housing Crowded (percent_of_housing_crowded): Percent of occupied housing units with more than one person per room
Percent Households Below Poverty (percent_households_below_poverty): Percent of households living below the federal poverty line
Percent Aged 16+ Unemployed (percent_aged_16_unemployed): Percent of persons over the age of 16 years that are unemployed
Percent Aged 25+ without High School Diploma (percent_aged_25_without_high_school_diploma): Percent of persons over the age of 25 years without a high school education
Percent Aged Under 18 or Over 64 (percent_aged_under_18_or_over_64): Percent of population under 18 or over 64 years of age (i.e., dependents)
Per Capita Income (per_capita_income_): Community Area per capita income is estimated as the sum of tract-level aggregate incomes divided by the total population
Hardship Index (hardship_index): Score that incorporates each of the six selected socioeconomic indicators
In this lab, we'll take a look at the variables in the socioeconomic indicators dataset and do some basic analysis with Python.
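The variable list above maps naturally onto a single SQLite table. The following is a minimal sketch of that mapping using only Python's built-in `sqlite3` module. The table name `chicago_socioeconomic_data`, the column types, and the three sample rows are assumptions made for illustration; the rows are placeholders, not real figures from the Chicago dataset.

```python
import sqlite3

# Column names follow the variable list above. The table name and the
# column types are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE chicago_socioeconomic_data (
    ca INTEGER PRIMARY KEY,
    community_area_name TEXT,
    percent_of_housing_crowded REAL,
    percent_households_below_poverty REAL,
    percent_aged_16_unemployed REAL,
    percent_aged_25_without_high_school_diploma REAL,
    percent_aged_under_18_or_over_64 REAL,
    per_capita_income_ INTEGER,
    hardship_index INTEGER
)
"""

con = sqlite3.connect(":memory:")  # throwaway in-memory database
con.execute(SCHEMA)

# Placeholder rows for illustration only (NOT real Chicago data).
rows = [
    (1, "Example Area A", 5.0, 20.0, 9.0, 15.0, 30.0, 25000, 40),
    (2, "Example Area B", 1.0, 8.0, 5.0, 3.0, 22.0, 60000, 5),
    (3, "Example Area C", 12.0, 35.0, 20.0, 40.0, 42.0, 12000, 90),
]
con.executemany(
    "INSERT INTO chicago_socioeconomic_data VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    rows,
)

# A higher hardship_index means greater hardship, so sort descending.
high_hardship = con.execute(
    """
    SELECT community_area_name, hardship_index
    FROM chicago_socioeconomic_data
    WHERE hardship_index > 50
    ORDER BY hardship_index DESC
    """
).fetchall()
print(high_hardship)
```

With the real data loaded in place of the placeholder rows, the same query would list the community areas whose hardship index exceeds 50.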
## Connect to the database
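Let us first load the SQL extension and establish a connection with the database.
The syntax for connecting to a SQLite database with SQL magic is
%sql sqlite:///DatabaseName.db
where DatabaseName.db is your .db file. Note the three slashes: the connection string is a SQLAlchemy-style URL, and a relative file path follows `sqlite:///`.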
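As a rough sketch of the connection step, the snippet below opens a SQLite database file with the standard `sqlite3` module and builds the matching SQL magic URL. The file name `socioeconomic.db` is a hypothetical choice for this example. The notebook magics appear only as comments, since they run inside Jupyter (with the `ipython-sql` extension installed) rather than in plain Python.

```python
import sqlite3

DB_FILE = "socioeconomic.db"  # hypothetical file name, chosen for this sketch

# In a Jupyter notebook, the equivalent SQL magic commands would be:
#   %load_ext sql
#   %sql sqlite:///socioeconomic.db
# SQLAlchemy-style URLs put three slashes before a relative file path.
magic_url = f"sqlite:///{DB_FILE}"

# sqlite3.connect creates the file if it does not already exist.
con = sqlite3.connect(DB_FILE)
cur = con.cursor()

# A trivial query to confirm that the connection works.
sqlite_version = cur.execute("SELECT sqlite_version()").fetchone()[0]
print(f"Connected via {magic_url} (SQLite {sqlite_version})")

con.close()
```

Keeping a plain `sqlite3` connection alongside the magic is convenient when the query results are later handed to pandas or plotted.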