{"id":19590989,"url":"https://github.com/ahammadmejbah/ibm-project-03-analyzing-spreadsheet-data-with-python","last_synced_at":"2025-04-27T13:31:56.863Z","repository":{"id":174158932,"uuid":"651855161","full_name":"ahammadmejbah/IBM-Project-03-Analyzing-Spreadsheet-Data-with-Python","owner":"ahammadmejbah","description":"Spreadsheets are computer programs that allow users to enter, view, and change information in a gridlike format. One of the most useful programs on a PC is a spreadsheet, which allows users to organize data in tabular form. A spreadsheet's primary purpose is to store numerical data and sometimes a few words of text.","archived":false,"fork":false,"pushed_at":"2023-06-10T10:10:31.000Z","size":1464,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-08-28T20:40:51.584Z","etag":null,"topics":["data-analysis","data-analysis-python","data-analyst","data-science","excel","python","spreadsheet"],"latest_commit_sha":null,"homepage":"https://cognitiveclass.ai/courses/course-v1:IBM+GPXX0AYQEN+v1","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ahammadmejbah.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-06-10T09:52:51.000Z","updated_at":"2024-01-13T21:10:15.000Z","dependencies_parsed_at":null,"dependency_job_id":"39b678ca-8bcb-4384-98b8-4314c6281340","html_url":"https://github.com/ahammadmejbah/IBM-Project-03-Analyzing-Spreadsheet-Data-with-Python","commit_stats":null,"previous_names":["ahammadmejbah/ibm-project-03-analyzing-spreadsheet-data-with-python","bytesofintelligences/ibm-project-03-analyzing-spreadsheet-data-with-python"],"tags_count":null,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahammadmejbah%2FIBM-Project-03-Analyzing-Spreadsheet-Data-with-Python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahammadmejbah%2FIBM-Project-03-Analyzing-Spreadsheet-Data-with-Python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahammadmejbah%2FIBM-Project-03-Analyzing-Spreadsheet-Data-with-Python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahammadmejbah%2FIBM-Project-03-Analyzing-Spreadsheet-Data-with-Python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ahammadmejbah","download_url":"https://codeload.github.com/ahammadmejbah/IBM-Project-03-Analyzing-Spreadsheet-Data-with-Python/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224072153,"owners_count":17251020,"icon_url":"https://github.com/github.png","version":null,"creat
ed_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-analysis-python","data-analyst","data-science","excel","python","spreadsheet"],"created_at":"2024-11-11T08:27:10.818Z","updated_at":"2024-11-11T08:27:11.916Z","avatar_url":"https://github.com/ahammadmejbah.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n      \u003ch1\u003e \u003cimg src=\"https://github.com/ahammadmejbah/IBM-Project-02-Transform-Photos-to-Sketches-and-Paintings-with-OpenCV/blob/main/Additional%20Files/SN_web_lightmode.svg\" width=\"300px\"\u003e\u003cbr/\u003eIBM Project 03 Analyzing Spreadsheet Data with Python\u003c/h1\u003e\n     \u003c/div\u003e\n\u003cp align=\"center\"\u003e \u003ca href=\"https://github.com/ahammadmejbah\" target=\"_blank\"\u003e\u003cimg alt=\"\" src=\"https://img.shields.io/badge/Website-EA4C89?style=normal\u0026logo=dribbble\u0026logoColor=white\" style=\"vertical-align:center\" /\u003e\u003c/a\u003e \u003ca href=\"https://twitter.com/ahammadmejbah\" target=\"_blank\"\u003e\u003cimg alt=\"\" src=\"https://img.shields.io/badge/Twitter-1DA1F2?style=normal\u0026logo=twitter\u0026logoColor=white\" style=\"vertical-align:center\" /\u003e\u003c/a\u003e \u003ca href=\"https://www.facebook.com/ahammadmejbah\" target=\"_blank\"\u003e\u003cimg alt=\"\" src=\"https://img.shields.io/badge/Facebook-1877F2?style=normal\u0026logo=facebook\u0026logoColor=white\" style=\"vertical-align:center\" /\u003e\u003c/a\u003e \u003ca href=\"https://www.instagram.com/ahammadmejbah/\" target=\"_blank\"\u003e\u003cimg alt=\"\" 
src=\"https://img.shields.io/badge/Instagram-E4405F?style=normal\u0026logo=instagram\u0026logoColor=white\" style=\"vertical-align:center\" /\u003e\u003c/a\u003e \u003ca href=\"https://www.linkedin.com/in/ahammadmejbah/}\" target=\"_blank\"\u003e\u003cimg alt=\"\" src=\"https://img.shields.io/badge/LinkedIn-0077B5?style=normal\u0026logo=linkedin\u0026logoColor=white\" style=\"vertical-align:center\" /\u003e\u003c/a\u003e \u003c/p\u003e\n\n# Description\nSpreadsheets are computer programs that allow users to enter, view, and change information in a gridlike format. One of the most useful programs on a PC is a spreadsheet, which allows users to organize data in tabular form. A spreadsheet's primary purpose is to store numerical data and sometimes a few words of text.\n\n# Features\n# Analyzing Spreadsheet Data with Python\n\nEstimated time needed: **40 minutes**\n\nWe've all been there: sitting at the computer, a cup of coffee in hand, ready to begin the day by tackling a spreadsheet. But once we open it, all we see are numbers upon numbers staring back at us. It can be a little intimidating tackling these numbers, especially with larger spreadsheets, but not to worry! With a few tricks up its sleeve, Python is here to give us a hand!\n\nInterested in learning more about this? Well, you're in the right place!\n\nIn this project, we'll introduce the tools that you can use to load spreadsheet data into Python, manipulate it, and write it back to the original file.\n\n## What You'll Need\n\nYou should be familiar with basic programming concepts and with Python syntax, including lists, dictionaries, functions, classes, instance variables, and methods.\n\n## What You'll Learn\n\nAfter completing this guided project, you will be able to:\n\n1.  Understand the definition of a dataset.\n2.  Understand the concepts of cells and sheets in a spreadsheet.\n3.  
Load a sheet from an `xlsx` or `xls` file into Python.\n4.  Perform basic spreadsheet operations such as subsetting, filtering, and calculating a column's mean, median, and max.\n5.  Write to a sheet in an `xlsx` or `xls` file from Python.\n\n## Exercise 1: Let's Define Data!\n\nSo, let's talk about data. We've been throwing this term around a lot: analyzing *data*, manipulating *data*, and writing *data*. What exactly is this \"data\"?\n\nData is defined as \"a collection of numbers, characters, images or other items that provide information of something\" (SDM, 3rd CE, Deveaux et al.). In statistical sciences, we store data in files that record *cases* and *variables*:\n\n![image](https://github.com/ahammadmejbah/IBM-Project-03-Analyzing-Spreadsheet-Data-with-Python/assets/56669333/d341ef39-5461-41a0-aa87-1327da880f12)\n\n\nAs shown in the table above, each entry is called a *case*, and each *case* includes value(s) for *variable*(s). Cases with missing data are often deleted.\n\n\n## Exercise 2: Explore a Spreadsheet!\n\nNow that we know the definition of data, we can dive into a spreadsheet!\n\nA spreadsheet is a computer application for organization, analysis, and storage of data in tabular form. `xlsx` and `xls` are the two most popular formats for spreadsheet files. They can be created, opened, and saved using spreadsheet applications such as Microsoft Office Excel and Google Sheets.\n\n![image](https://github.com/ahammadmejbah/IBM-Project-03-Analyzing-Spreadsheet-Data-with-Python/assets/56669333/abf96391-43d7-4554-aff3-7a261f59dabc)\n\n\n\n``` python\nimport os\n\nimport requests\n\n# Download the sample workbook only if it is not already on disk\nif not os.path.isfile(\"./Airport_Data.xlsx\"):\n    r = requests.get(\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/analysing-spreadsheet-data-with-python/Airport_Data.xlsx\")\n    r.raise_for_status()  # stop early if the download failed\n    # The with block closes the file even if the write fails\n    with open(\"./Airport_Data.xlsx\", mode=\"wb\") as f:\n        f.write(r.content)\n\n```\n\nTo load the spreadsheet into Python, we can use the pandas `read_excel()` function. 
When calling this function, you'll need to specify:\n\n*   The path to the `xls` or `xlsx` file, and\n*   The name of the sheet that you wish to load into Python.\n\nBoth are passed as strings.\n\n\n``` python\nimport pandas as pd\n\n# Load the \"Facilities\" sheet of the workbook into a DataFrame\ndf_facilities = pd.read_excel(\"./Airport_Data.xlsx\", sheet_name=\"Facilities\")\n\n\n```\n\n\n# Author(s)\n\n[Weiqing Wang](https://www.linkedin.com/in/weiqing-wang-641640133/?utm_medium=Exinfluencer\u0026utm_source=Exinfluencer\u0026utm_content=000026UJ\u0026utm_term=10006555\u0026utm_id=NA-SkillsNetwork-Channel-SkillsNetworkQuickLabsanalysingspreadsheetdatawithpython28639550-2022-01-01)\n\nWeiqing is a Data Scientist intern at IBM Canada Ltd. Weiqing holds an Honours Bachelor of Science from the University of Toronto with two specialist degrees, in computer science and in statistical sciences. He is presently working towards a graduate degree in computer science at the University of Toronto.\n\n## Other contributors\n\nKathy An, Yasmine Hemmati\n\n# Change Log\n\n| Date (YYYY-MM-DD) | Version | Changed By   | Change Description       |\n| ----------------- | ------- | ------------ | ------------------------ |\n| 2021-08-10        | 1.0     | Weiqing Wang | Initial Version Created. |\n\n\n© IBM Corporation 2021. All rights reserved.\n\n\u003c!-- \u003c/\u003e with 💛 by readMD (https://readmd.itsvg.in) --\u003e\n    \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fahammadmejbah%2Fibm-project-03-analyzing-spreadsheet-data-with-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fahammadmejbah%2Fibm-project-03-analyzing-spreadsheet-data-with-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fahammadmejbah%2Fibm-project-03-analyzing-spreadsheet-data-with-python/lists"}