{"id":15195495,"url":"https://github.com/hasanyahya101/pareto-tutorial-python","last_synced_at":"2026-02-07T15:09:26.186Z","repository":{"id":242728981,"uuid":"810388896","full_name":"HasanYahya101/Pareto-Tutorial-Python","owner":"HasanYahya101","description":"This tutorial demonstrates how to create Pareto charts using Python libraries like pandas, NumPy, and Matplotlib. All code for this is given in the Jupyter Notebook.","archived":false,"fork":false,"pushed_at":"2024-06-05T14:29:52.000Z","size":288,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-12T17:09:04.489Z","etag":null,"topics":["chart","charts","code-blocks","ipynb","jupyter-notebook","license","markdown","matplotlib","mit","numpy","pandas","pareto","py","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HasanYahya101.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-04T15:43:08.000Z","updated_at":"2024-06-05T14:29:56.000Z","dependencies_parsed_at":"2024-09-11T12:32:26.338Z","dependency_job_id":"d928a2fc-4679-47c1-a967-cad98cd9bf60","html_url":"https://github.com/HasanYahya101/Pareto-Tutorial-Python","commit_stats":null,"previous_names":["hasanyahya101/pareto-tutorial-python"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HasanYahya101%2FPareto-Tutorial-Python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/HasanYahya101%2FPareto-Tutorial-Python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HasanYahya101%2FPareto-Tutorial-Python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HasanYahya101%2FPareto-Tutorial-Python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HasanYahya101","download_url":"https://codeload.github.com/HasanYahya101/Pareto-Tutorial-Python/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241459116,"owners_count":19966509,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chart","charts","code-blocks","ipynb","jupyter-notebook","license","markdown","matplotlib","mit","numpy","pandas","pareto","py","python"],"created_at":"2024-09-27T23:40:19.407Z","updated_at":"2026-02-07T15:09:26.156Z","avatar_url":"https://github.com/HasanYahya101.png","language":"Jupyter Notebook","readme":"# Pareto Chart Tutorial (Example):\n\nThis tutorial demonstrates how to create Pareto charts using Python libraries like pandas, NumPy, and Matplotlib.\n\n## What is a Pareto Chart?\n\nA Pareto chart, also known as the 80/20 rule chart, is a visualization tool that combines a bar chart and a line chart. It reveals the distribution of a population of data. 
Typically, it highlights that a small portion (around 20%) of the factors contributes to a large portion (around 80%) of the outcome.\n\nHere's a breakdown of the chart's components:\n\n- __Bar chart:__ Represents categories or items on the x-axis and their corresponding frequencies or values on the y-axis.\n- __Line chart:__ Shows the cumulative percentage of the values, plotted on a secondary y-axis.\n\nPareto charts are widely used in various fields, including quality control, business process improvement, and customer relationship management. They help identify areas for improvement by focusing on the factors contributing to the most significant impact.\n\nYou can use the following steps to create the chart in Python:\n\n### 1. Import Libraries:\n\n```py\nimport pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\n```\n\n### 2. Prepare Data:\nCreate a pandas DataFrame with columns for defect category and frequency.\n\n```py\n# Insert required data in a DataFrame\ndata = {'Category': ['Defect A', 'Defect B', 'Defect C', 'Defect D', 'Others'],\n        'Frequency': [30, 20, 15, 5, 10]}\n\n# Create pandas DataFrame\ndf = pd.DataFrame(data)\n\n# Print the data\nprint(\"Raw Data:\")\nprint(df.to_string(index=False))\n\n# Create a table with the data\nfig, ax = plt.subplots()\nax.axis('off')\nax.axis('tight')\nax.table(cellText=df.values, colLabels=df.columns, cellLoc='center', loc='center', colLoc='center')\nplt.title('Defects Frequency')\nplt.show()\n```\n### Sort Data in Descending Order:\n```py\n# Sort DataFrame by frequency (descending)\ndf_sorted = df.sort_values(by=['Frequency'], ascending=False)\n\n# Print sorted data\nprint(\"\\nData sorted by Frequency (descending):\")\nprint(df_sorted.to_string(index=False))\n\n# Create a table with the sorted data\nfig, ax = plt.subplots()\nax.axis('off')\nax.axis('tight')\nax.table(cellText=df_sorted.values, colLabels=df_sorted.columns, cellLoc='center', loc='center', colLoc='center')\nplt.title('Defects 
Frequency (sorted)')\nplt.show()\n```\n### Calculate Cumulative Percentage:\n```py\n# Calculate the cumulative percentage of frequencies\ndf_sorted['Cumulative Percentage'] = df_sorted['Frequency'].cumsum() / df_sorted['Frequency'].sum() * 100\n\n# Print data with cumulative percentages\nprint(\"\\nData with Cumulative Percentages:\")\nprint(df_sorted.to_string(index=False))\n\n# Create a table with the data and cumulative percentages\nfig, ax = plt.subplots()\nax.axis('off')\nax.axis('tight')\nax.table(cellText=df_sorted.values, colLabels=df_sorted.columns, cellLoc='center', loc='center', colLoc='center')\nplt.title('Defects Frequency (sorted) with Cumulative Percentages')\nplt.show()\n```\n\n### Create Pareto Chart:\n\n```py\n# Create the figure and subplots\nfig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 6))\n\n# Bar chart for frequencies on the left subplot (ax1)\nax1.bar(df_sorted['Category'], df_sorted['Frequency'], color='skyblue')\nax1.set_xlabel('Category')\nax1.set_ylabel('Frequency')\nax1.set_title('Frequency of Defects')\n\n# Line chart for cumulative percentages on the right subplot (ax2)\nax2.plot(df_sorted['Category'], df_sorted['Cumulative Percentage'], color='red', marker='o', linestyle='-')\nax2.set_xlabel('Category')\nax2.set_ylabel('Cumulative Percentage (%)')\nax2.set_title('Cumulative Percentage of Defects')\n\n# Set y-axis limits for cumulative percentages (0 to 100%)\nax2.set_ylim(0, 100)\n\n# Rotate x-axis labels for better readability\nplt.setp(ax1.xaxis.get_majorticklabels(), rotation=45)\nplt.setp(ax2.xaxis.get_majorticklabels(), rotation=45)\n\n# Display the plots\nplt.tight_layout()\nplt.show()\n\n# Plot the combined Pareto chart: frequency bars with the cumulative percentage line on a secondary y-axis\nfig, ax1 = plt.subplots(figsize=(10, 6))\nax2 = ax1.twinx()\nax1.bar(df_sorted['Category'], df_sorted['Frequency'], color='skyblue')\nax2.plot(df_sorted['Category'], df_sorted['Cumulative Percentage'], color='red', marker='o', 
linestyle='-')\nax1.set_xlabel('Category')\nax1.set_ylabel('Frequency')\nax2.set_ylabel('Cumulative Percentage (%)')\nax1.set_title('Frequency and Cumulative Percentage of Defects')\nplt.setp(ax1.xaxis.get_majorticklabels(), rotation=45)\nplt.show()\n```\n\nThis will give you a chart similar to the one given below:\n\n![defects_frequency_cumulative_percentage_merged](https://github.com/HasanYahya101/Pareto-Tutorial-Python/assets/118683092/1321b072-9b53-40e6-b838-694342fc4220)\n\nAfter that you can also save the image in an output file (such as png or jpeg).\n\n### Note: \nAll code for this is given in the __Jupyter Notebook__.\n\n## License:\n\nThis repository is under the __MIT License__.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhasanyahya101%2Fpareto-tutorial-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhasanyahya101%2Fpareto-tutorial-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhasanyahya101%2Fpareto-tutorial-python/lists"}