{"id":25368836,"url":"https://github.com/jbalooshie/pyber_analysis","last_synced_at":"2026-05-12T07:41:26.235Z","repository":{"id":193947649,"uuid":"327443768","full_name":"jbalooshie/PyBer_Analysis","owner":"jbalooshie","description":"Analysis of ride share data using Matplotlib and pandas, executed in Jupyter Notebook. Breakdowns are provided based on the city size, average fare,  and number of rides taken.","archived":false,"fork":false,"pushed_at":"2022-09-28T00:54:38.000Z","size":1637,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-15T00:53:59.239Z","etag":null,"topics":["data-analysis","data-science","data-visualization","jupyter-notebook","matplotlib","pandas","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jbalooshie.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-01-06T22:30:15.000Z","updated_at":"2022-09-21T21:42:41.000Z","dependencies_parsed_at":"2023-09-10T23:38:30.046Z","dependency_job_id":null,"html_url":"https://github.com/jbalooshie/PyBer_Analysis","commit_stats":null,"previous_names":["jbalooshie/pyber_analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jbalooshie%2FPyBer_Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jbalooshie%2FPyBer_Analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jbalooshie%2FPyBer_Analysis/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/jbalooshie%2FPyBer_Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jbalooshie","download_url":"https://codeload.github.com/jbalooshie/PyBer_Analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247990796,"owners_count":21029580,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","data-visualization","jupyter-notebook","matplotlib","pandas","python"],"created_at":"2025-02-15T00:52:10.809Z","updated_at":"2026-05-12T07:41:21.201Z","avatar_url":"https://github.com/jbalooshie.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PyBer_Analysis\nThis repository was created as part of a 6-month Data Analytics Bootcamp administered by George Washington University. This is the repository for the Module 5 Challenge. This challenge served as an introduction to Matplotlib and further experience with pandas. Topics covered included creating DataFrame visualizations and cleaning datasets. Final project work is in PyBer_Challenge.ipynb. \n\n27 SEP 2022 - Updated repo to better organize files.\n\n## Overview of the analysis\nThe purpose of this analysis is to explain how the type of city (rural, suburban, or urban) affects the price of the fare for the ride. This relationship is important to understand because it impacts both the driver and rider. Riders prefer to pay lower fares, while drivers want to ensure they are being adequately compensated for their work. 
Looking at this data can help us determine if there are gaps or inconsistencies in this relationship that need to be addressed. \n\n### Methods\nA data frame was constructed from two CSV files: one with rideshare data for specific cities, and one with data for individual rides. They were combined using a common detail (the name of the city) to create one unified data frame. From there, we extracted the total rides, total drivers, total fares, average fare per ride, and average fare per driver for each city type. Here is an example of how the total rides were extracted: \n\n`total_rides = pyber_data_df.groupby([\"type\"]).count()[\"ride_id\"]`\n\nFrom there a formatted data frame was constructed. We then took this one step further and created another data frame that is arranged by city type and date. We used this frame to create a breakdown of the total fares based on city type and date. This was compiled into a line chart for easy analysis. \n\n## Results\n\nThe table below displays the total rides, drivers, and fares paid across the three city types. It also shows the average fare per ride and average fare per driver. \n\n![Results Summary](https://github.com/jbalooshie/PyBer_Analysis/blob/main/analysis/summary.PNG)\n\nThe chart below shows the total fares collected over a four-month period in each of the three city types. \n\n![Line Chart](https://github.com/jbalooshie/PyBer_Analysis/blob/main/analysis/PyBer_fare_summary.png)\n\nBelow, we break down notable takeaways across each of the six categories that were investigated. \n\n### Description of Differences\n\n- The urban city type has the most total rides taken, with 1,625. This is 13 times as many as the rural city type, with 125. Suburban cities have 625 rides taken.\n- The urban city type also has the most drivers, at 2,405. This is almost 31 times as many drivers as the rural city type, with 78. Suburban cities have 490 drivers.\n- The urban city type generates the most fares at $39,854.38. 
This is about 9 times as much as rural cities generate ($4,327.93), and about twice as much as suburban cities generate ($19,356.33). \n- The average fare per ride is highest in rural cities, at $34.62. Suburban cities have a cheaper average fare ($30.97), and urban cities have the cheapest fare ($24.53). \n- The average fare per driver is also highest in rural cities, at $55.49. Drivers in suburban cities earn less on average ($39.50), and urban drivers earn the least ($16.57).\n- The total fare by city type (see above line chart) has one instance where there is an increase in total fare, followed by a decrease, for all 3 city types. There is an increase in the middle of February, and then a decrease at the start of March. This is the only point across the four months where there is a trend shared by all 3 city types. \n\n## Summary\n\nBased on the above results, we would like to present three recommendations for the company to address disparities among the different city types. \n\n### Recommendations\n\n- Limit the number of drivers in urban cities. There are 13 times as many rides taken in urban cities as in rural cities, but almost 31 times as many drivers. This means fares are being spread across more drivers, resulting in a very low average fare per driver in urban cities ($16.57). \n\n- Advertise using rideshares as an affordable alternative for urban transportation. The average fare in urban areas is the lowest of the three city types. Individuals in rural and suburban areas might not realize that they will be paying less when they use rideshares in urban cities (versus when they use a rideshare in a rural or suburban city). Alerting users to this fact might increase usage in urban areas. \n\n- Investigate why fares fluctuate for urban cities from mid-February to the start of April. During this time, the total fares for urban cities fluctuate in a way not seen for other city types. 
This implies that there are big week-to-week differences in the number of rides being taken, though it is not immediately clear why. Understanding this could help the company retarget its marketing efforts to ensure more consistent demand.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjbalooshie%2Fpyber_analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjbalooshie%2Fpyber_analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjbalooshie%2Fpyber_analysis/lists"}