{"id":22089526,"url":"https://github.com/fatihilhan42/olympics-data-analysis-with-python","last_synced_at":"2026-04-30T18:31:49.762Z","repository":{"id":154677253,"uuid":"482840056","full_name":"fatihilhan42/Olympics-Data-Analysis-with-Python","owner":"fatihilhan42","description":"I will examine the Data Analysis of the Olympics between 1896-2016, which we have done on Python.","archived":false,"fork":false,"pushed_at":"2022-04-18T13:29:56.000Z","size":4957,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-23T22:48:44.204Z","etag":null,"topics":["data","data-science","dataanalysis","datavisualization","jupyter-notebook","olympics","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fatihilhan42.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-18T12:33:46.000Z","updated_at":"2022-10-27T10:54:39.000Z","dependencies_parsed_at":null,"dependency_job_id":"57e57eba-1fe7-44b8-94a4-73e2b38bf232","html_url":"https://github.com/fatihilhan42/Olympics-Data-Analysis-with-Python","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/fatihilhan42/Olympics-Data-Analysis-with-Python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatihilhan42%2FOlympics-Data-Analysis-with-Python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatihilhan42%2FOlympics-Data-Analysis-with-Python/tags"
,"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatihilhan42%2FOlympics-Data-Analysis-with-Python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatihilhan42%2FOlympics-Data-Analysis-with-Python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fatihilhan42","download_url":"https://codeload.github.com/fatihilhan42/Olympics-Data-Analysis-with-Python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatihilhan42%2FOlympics-Data-Analysis-with-Python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32473804,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"ssl_error","status_checked_at":"2026-04-30T13:12:06.837Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","data-science","dataanalysis","datavisualization","jupyter-notebook","olympics","python"],"created_at":"2024-12-01T02:13:11.890Z","updated_at":"2026-04-30T18:31:49.756Z","avatar_url":"https://github.com/fatihilhan42.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Olympics-Data-Analysis-with-Python\nI will examine the Data Analysis of the Olympics between 1896-2016, which we have done on Python.\n\nFirst of all, we define the libraries that we 
will use in the import block below.\n### Import\n\n```Python\nimport numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\nimport seaborn as sns\n%matplotlib inline\n```\n## Load dataset\n\n```Python\nathletes = pd.read_csv('C:/Users/Yusuf/Desktop/Olympics Data Analysis/athlete_events.csv')\nregion = pd.read_csv('C:/Users/Yusuf/Desktop/Olympics Data Analysis/noc_regions.csv')\n```\nThe code above loads the two CSV files we will use by giving their location on the computer. We then preview the first rows of each.\n```Python\nathletes.head()\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163809895-31307a55-e3eb-4c50-b6df-e05b891a07eb.png)\n\n```Python\nregion.head()\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163810006-7feb9152-7bbd-472e-a22b-ce3c2034a986.png)\n\n## Join the dataframes\n```Python\n# Left join: keep every athlete row and attach the region that matches its NOC code\nathletes_df = athletes.merge(region, how = 'left', on = 'NOC')\nathletes_df.head()\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163810231-e4121545-bda2-449a-b126-7afd973c31c1.png)\n\nIn these code blocks, we left-joined the athlete records with the region table on the `NOC` code. Each row of the merged table now shows the athlete's name, gender, and team, the year and sport they competed in, and finally the region they belong to.\n\n```Python\n# Number of rows and columns in the merged table\nathletes_df.shape\n```\n\n## Make column names consistent\n```Python\n# Capitalize the two lowercase columns that came from the region table\nathletes_df.rename(columns={'region':'Region','notes':'Notes'}, inplace=True)\n```\n```Python\nathletes_df.head()\n```\n![image](https://user-images.githubusercontent.com/63750425/163810407-237bda5c-b72f-4003-aea9-5775d5065a17.png)\n\n```Python\nathletes_df.describe()\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163810482-217f0fdb-63d1-41ec-8876-4e412d452dea.png)\n```Python\n# Which columns contain at least one missing value\nnan_values = athletes_df.isna()\nnan_columns = nan_values.any()\nnan_columns\n```\n```Python\n# Number of missing values per column\nathletes_df.isnull().sum()\n```\n\n## India Details\n```Python\nathletes_df.query('Team == 
\"India\"').head(5)\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163810667-ef69a9cb-f61d-40f5-a6eb-4ac7fdb82144.png)\n\n## Top Countries Participating\n```Python\ntop_10_countries = athletes_df.Team.value_counts().sort_values(ascending=False).head(10)\ntop_10_countries\n```\n```Python\n# Plot the top 10 countries\nplt.figure(figsize=(12,6))\n#plt.xticks(rotation=20)\nplt.title('Overall Participation by Country')\nsns.barplot(x=top_10_countries.index, y=top_10_countries, palette = 'Set2' );\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163810891-9c6c1e0a-c35f-40ac-9361-be1af891ab61.png)\n\nHere, after finding the 10 teams with the most entries (rows) in the dataset, we plotted them as a bar chart.\n\n## Age Distribution of the participants\n```Python\nplt.figure(figsize=(12,6))\nplt.title(\"Age distribution of the athletes\")\nplt.xlabel('Age')\nplt.ylabel('Number of participants')\nplt.hist(athletes_df.Age,bins = np.arange(10,80,2),color='red',edgecolor = 'white');\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163811029-f842c779-5773-4153-900c-c5548bb1f2f8.png)\n\nHere, a histogram shows how many athletes fall into each two-year age bin between 10 and 78.\n```Python\n# Winter Olympics sports\nwinter_sports = athletes_df[athletes_df.Season == 'Winter'].Sport.unique()\nwinter_sports\n```\n```Python\n# Summer Olympics sports\nsummer_sports = athletes_df[athletes_df.Season == 'Summer'].Sport.unique()\nsummer_sports\n```\nWe used the code above to list the sports held at the Winter and Summer Games.\n```Python\n# Male and female participants\ngender_counts = athletes_df.Sex.value_counts()\ngender_counts\n```\n```Python\n# Pie plot for male and female athletes\nplt.figure(figsize=(12,6))\nplt.title('Gender Distribution')\nplt.pie(gender_counts,labels=gender_counts.index, autopct='%1.1f%%', startangle=150, 
shadow=True);\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163811309-71173301-dfd9-44e0-8241-fc99cb4a1ea8.png)\n\nHere, we used matplotlib's `plt.pie` to graphically see the total male-female ratio in the Olympics.\n```Python\n# Total medals by type\nathletes_df.Medal.value_counts()\n```\n```Python\n# Total number of female entries in each Summer Olympics (earliest years)\nfemale_participants = athletes_df[(athletes_df.Sex=='F') \u0026 (athletes_df.Season=='Summer')][['Sex','Year']]\nfemale_participants = female_participants.groupby('Year').count().reset_index()\nfemale_participants.head()\n```\n```Python\n# Total number of female entries in each Summer Olympics (most recent years)\nfemale_participants = athletes_df[(athletes_df.Sex=='F') \u0026 (athletes_df.Season=='Summer')][['Sex','Year']]\nfemale_participants = female_participants.groupby('Year').count().reset_index()\nfemale_participants.tail()\n```\n\nThe code blocks above count the women competing in each Summer Games, showing the earliest years with `head()` and the most recent years with `tail()`.\n```Python\nwomenOlympics = athletes_df[(athletes_df.Sex == 'F') \u0026 (athletes_df.Season == 'Summer')]\n```\n```Python\nsns.set(style=\"darkgrid\")\nplt.figure(figsize=(20,10))\nsns.countplot(x='Year', data=womenOlympics,palette=\"Spectral\")\nplt.title('Women Participation')\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163811611-d964650a-74b4-42cb-ac45-c2a8425f7c85.png)\n\nThe graph above shows the increase in female participants over the years. 
We did this using seaborn's `countplot`.\n```Python\n# Count entries per year and sex, then draw the female counts as a line\npart = womenOlympics.groupby('Year')['Sex'].value_counts()\nplt.figure(figsize=(20,10))\npart.loc[:,'F'].plot()\nplt.title('Plot of Female Athletes over time')\n```\n![image](https://user-images.githubusercontent.com/63750425/163811709-32e46cf3-16c7-424d-9ac6-d851fbc4187d.png)\n\nThe graph above shows the same increase in female participants as a line plot. We drew it with the pandas `plot` method, which uses matplotlib.\n```Python\n# Gold medal athletes\ngoldMedals = athletes_df[(athletes_df.Medal == 'Gold')]\ngoldMedals.head()\n```\n```Python\n# Keep only the rows where Age is not NaN\ngoldMedals = goldMedals[np.isfinite(goldMedals['Age'])]\n```\n```Python\n# Number of gold medals won by athletes older than 60\ngoldMedals['ID'][goldMedals['Age'] \u003e 60].count()\n```\n```Python\n# Sports in which athletes older than 60 won gold\nsporting_event = goldMedals['Sport'][goldMedals['Age']\u003e60]\nsporting_event\n```\n\nIn the code blocks above, we first listed the gold medal winners, then counted those over the age of 60 and displayed the sports in which they won.\n```Python\n# Plot for sporting_event\nplt.figure(figsize=(10,5))\nplt.tight_layout()\n# Pass the Series as x explicitly; recent seaborn versions no longer accept it positionally\nsns.countplot(x=sporting_event)\nplt.title('Gold Medals for Athletes over 60 years')\n```\n![image](https://user-images.githubusercontent.com/63750425/163812108-4bbcc305-e93c-4ab0-b763-defcb7c87386.png)\n\n```Python\n# Gold medals from each country\ngoldMedals.Region.value_counts().reset_index(name='Medal').head(5)\n```\n\nIn the code blocks above, we plotted the sports in which athletes over 60 won gold and then counted the gold medals per region. 
Afterwards, we used seaborn to graphically show how many gold medals the leading regions received.\n\n```Python\n# rename_axis gives the region column a fixed name on both older and newer pandas versions\ntotalgoldMedals = goldMedals.Region.value_counts().rename_axis('Region').reset_index(name='Medal').head(6)\ng = sns.catplot(x=\"Region\", y=\"Medal\", data=totalgoldMedals,height=5, kind=\"bar\",palette=\"rocket\")\ng.despine(left=True)\ng.set_xlabels(\"Top Countries\")\ng.set_ylabels(\"Number of Medals\")\nplt.title('Gold Medals per Country')\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163812223-bd0c6e68-c136-4f0d-b1ce-44098e56721a.png)\n\nIn the code blocks above, after listing the countries that won the most gold medals, we showed the top countries and their number of medals on the graph.\n```Python\n# Rio Olympics (the most recent year in the dataset)\nmax_year = athletes_df.Year.max()\nprint(max_year)\nteam_names = athletes_df[(athletes_df.Year == max_year) \u0026 (athletes_df.Medal == 'Gold')].Team\nteam_names.value_counts().head(10)\n```\n```Python\nsns.barplot(x=team_names.value_counts().head(20), y=team_names.value_counts().head(20).index)\nplt.ylabel(None);\nplt.xlabel('Country-wise Gold Medals for the year 2016')\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163812364-f37e2eeb-bcc3-4c92-825b-e8e9994a67de.png)\n\nThe code blocks above first list the gold medal counts per team at the 2016 Rio Olympics as numbers and then draw them as a seaborn bar plot.\n\n```Python\n# Keep only the rows where both Height and Weight are known\nnot_null_medals = athletes_df[(athletes_df['Height'].notnull()) \u0026 (athletes_df['Weight'].notnull())]\n```\n```Python\nplt.figure(figsize=(12,10))\naxis = sns.scatterplot(x=\"Height\", y=\"Weight\", data=not_null_medals, hue= \"Sex\")\nplt.title(\"Height vs Weight of Olympic Athletes\")\n```\n\n![image](https://user-images.githubusercontent.com/63750425/163812486-8a9428af-d79c-4ef7-811c-38a850c06510.png)\n\nIn the code blocks above, the height and weight of the athletes are shown on a scatter plot, with points colored as male and 
female\n\n\n\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffatihilhan42%2Folympics-data-analysis-with-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffatihilhan42%2Folympics-data-analysis-with-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffatihilhan42%2Folympics-data-analysis-with-python/lists"}