{"id":19337222,"url":"https://github.com/martachesnova/advanced-data-storage-and-retrieval--sqlalchemy-flask","last_synced_at":"2026-04-14T06:03:52.972Z","repository":{"id":107617978,"uuid":"396642069","full_name":"martachesnova/Advanced-Data-Storage-and-Retrieval--SQLAlchemy-Flask","owner":"martachesnova","description":"Analysis of Hawaii weather using Python's libraries SQLAlchemy and Matplotlib, and to create a weather API to do a climate analysis based on the data stored in SQLite database.","archived":false,"fork":false,"pushed_at":"2021-10-29T23:49:38.000Z","size":4059,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-06T10:13:39.646Z","etag":null,"topics":["flask","python","sql","sqlalchemy"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/martachesnova.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-16T05:55:26.000Z","updated_at":"2021-11-02T01:36:03.000Z","dependencies_parsed_at":"2023-06-08T15:00:20.351Z","dependency_job_id":null,"html_url":"https://github.com/martachesnova/Advanced-Data-Storage-and-Retrieval--SQLAlchemy-Flask","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martachesnova%2FAdvanced-Data-Storage-and-Retrieval--SQLAlchemy-Flask","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martachesnova%2FAdvanced-Data-Stor
age-and-Retrieval--SQLAlchemy-Flask/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martachesnova%2FAdvanced-Data-Storage-and-Retrieval--SQLAlchemy-Flask/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martachesnova%2FAdvanced-Data-Storage-and-Retrieval--SQLAlchemy-Flask/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/martachesnova","download_url":"https://codeload.github.com/martachesnova/Advanced-Data-Storage-and-Retrieval--SQLAlchemy-Flask/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240441955,"owners_count":19801793,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["flask","python","sql","sqlalchemy"],"created_at":"2024-11-10T03:13:45.352Z","updated_at":"2026-04-14T06:03:52.940Z","avatar_url":"https://github.com/martachesnova.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Advanced Data Storage \u0026 Retrieval with SQLAlchemy \u0026 Flask  \n\n\nI've decided to treat myself to a long holiday vacation in Honolulu, Hawaii! To help with my trip planning, I needed to do some climate analysis on the area. \n\n![surfs-up.png](Images/surfs-up.png)\n## Step 1 - Climate Analysis and Exploration\n\nUsing Python and SQLAlchemy, I did basic climate analysis and data exploration of the climate database. 
All of the following analysis was completed using SQLAlchemy ORM queries, Pandas, and Matplotlib.\n\n* Completed the climate analysis and data exploration using the [Climate Notebook](climate.ipynb) and [hawaii.sqlite](Resources/hawaii.sqlite) files.\n\n* Used SQLAlchemy `create_engine` to connect to the SQLite database.\n\n* Used SQLAlchemy `automap_base()` to reflect my tables into classes and saved references to those classes, called `Station` and `Measurement`.\n\n* Linked Python to the database by creating an SQLAlchemy session.\n\n### Precipitation Analysis\n\n* Started by finding the most recent date in the dataset.\n\n* Using this date, retrieved the preceding 12 months of precipitation data.\n\n* Selected only the `date` and `prcp` values.\n\n* Loaded the query results into a Pandas DataFrame and set the index to the date column.\n\n* Sorted the DataFrame values by `date`.\n\n* Plotted the results using the DataFrame `plot` method.\n\n![precipitation](Images/precipitation.png)\n\n* Used Pandas to print the summary statistics for the precipitation data.\n\n### Station Analysis\n\n* Designed a query to calculate the total number of stations in the dataset.\n\n* Designed a query to find the most active stations (i.e., 
which stations have the most rows?).\n\n  * Listed the stations and observation counts in descending order.\n\n  * Found which station ID has the highest number of observations.\n\n  * Using the most active station ID, calculated the lowest, highest, and average temperature.\n\n  * Used functions such as `func.min`, `func.max`, `func.avg`, and `func.count` in my queries.\n\n* Designed a query to retrieve the last 12 months of temperature observation data (TOBS).\n\n  * Filtered by the station with the highest number of observations.\n\n  * Queried the last 12 months of temperature observation data for this station.\n\n  * Plotted the results as a histogram with `bins=12`.\n\n  ![station-histogram](Images/station-histogram.png)\n\n* Closed out my session. \n\n- - -\n\n## Step 2 - Climate App\n\nWith the initial analysis complete, I designed a Flask API based on the queries I had just developed.\n\n* Used Flask to create my routes.\n\n### Routes\n\n* `/`\n\n  * Home page.\n\n  * Listed all routes that are available.\n\n* `/api/v1.0/precipitation`\n\n  * Converted the query results to a dictionary using `date` as the key and `prcp` as the value.\n\n  * Returned the JSON representation of my dictionary.\n\n* `/api/v1.0/stations`\n\n  * Returned a JSON list of stations from the dataset.\n\n* `/api/v1.0/tobs`\n\n  * Queried the dates and temperature observations of the most active station for the last year of data.\n\n  * Returned a JSON list of temperature observations (TOBS) for the previous year.\n\n* `/api/v1.0/\u003cstart\u003e` and `/api/v1.0/\u003cstart\u003e/\u003cend\u003e`\n\n  * Returned a JSON list of the minimum temperature, the average temperature, and the maximum temperature for a given start or start-end range.\n\n  * When given the start only, calculated `TMIN`, `TAVG`, and `TMAX` for all dates greater than or equal to the start date.\n\n  * When given the start and the end date, calculated `TMIN`, `TAVG`, and `TMAX` for dates between the 
start and end dates, inclusive.\n\n- - -\n\n## Other Analyses\n\n### Temperature Analysis I\n\n* Hawaii is reputed to enjoy mild weather all year. Is there a meaningful difference between the temperature in, for example, June and December?\n\n* Used Pandas for this analysis; see my [Temperature Analysis I](temp_analysis_part_1.ipynb).\n\n  * Converted the date column format from string to datetime.\n\n  * Set the date column as the DataFrame index.\n\n  * Dropped the date column.\n\n* Identified the average temperature in June at all stations across all available years in the dataset. Did the same for December temperatures.\n\n* Used a t-test to determine whether the difference in the means, if any, is statistically significant.\n\n### Temperature Analysis II\n\n* I was looking to take a trip from August 1st to August 7th of this year, but was worried that the weather would be less than ideal. Using historical data in the dataset, I found out what the temperature had previously looked like.\n\n* The [Temperature Analysis II](temp_analysis_bonus_part_2.ipynb) notebook contains a function called `calc_temps` that accepts a start date and end date in the format `%Y-%m-%d`. 
The function returns the minimum, average, and maximum temperatures for that range of dates.\n\n* Used the `calc_temps` function to calculate the min, avg, and max temperatures for my trip using the matching dates from a previous year (e.g., \"2017-08-01\").\n\n* Plotted the min, avg, and max temperature from my previous query as a bar chart.\n\n  * Used \"Trip Avg Temp\" as the title.\n\n  * Used the average temperature as the bar height (y value).\n\n  * Used the peak-to-peak (TMAX-TMIN) value as the y error bar (YERR).\n\n  ![temperature](Images/temperature.png)\n\n### Daily Rainfall Average\n\n* Now that I have an idea of the temperature, let's check what the rainfall has been. I wouldn't want it to rain the whole time!\n\n* Calculated the rainfall per weather station using the previous year's matching dates.\n\n  * Sorted this in descending order by precipitation amount and listed the station, name, latitude, longitude, and elevation.\n\n\n### Daily Temperature Normals\n\n* Calculated the daily normals for the duration of my trip. Normals are the averages for the min, avg, and max temperatures. A function called `daily_normals` calculates the daily normals for a specific date. This date string is in the format `%m-%d`. The function uses all historic TOBS that match that date string.\n\n  * Set the start and end date of the trip.\n\n  * Used the start and end dates to create a range of dates.\n\n  * Stripped off the year and saved a list of strings in the format `%m-%d`.\n\n  * Used the `daily_normals` function to calculate the normals for each date string and appended the results to a list called `normals`.\n\n* Loaded the list of daily normals into a Pandas DataFrame and set the index equal to the date.\n\n* Used Pandas to plot an area plot (`stacked=False`) for the daily normals.\n\n![daily-normals](Images/daily-normals.png)\n\n* Closed out my session.\n\n- - -\n\n## References\n\nMenne, M.J., I. Durre, R.S. Vose, B.E. Gleason, and T.G. 
Houston, 2012: An overview of the Global Historical Climatology Network-Daily Database. Journal of Atmospheric and Oceanic Technology, 29, 897-910, [https://doi.org/10.1175/JTECH-D-11-00103.1](https://doi.org/10.1175/JTECH-D-11-00103.1)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartachesnova%2Fadvanced-data-storage-and-retrieval--sqlalchemy-flask","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmartachesnova%2Fadvanced-data-storage-and-retrieval--sqlalchemy-flask","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartachesnova%2Fadvanced-data-storage-and-retrieval--sqlalchemy-flask/lists"}