{"id":19770443,"url":"https://github.com/nandit123/python_on_excel","last_synced_at":"2026-05-16T18:35:02.706Z","repository":{"id":117201725,"uuid":"118861109","full_name":"nandit123/python_on_excel","owner":"nandit123","description":"Data Analysis using python libraries on excel data","archived":false,"fork":false,"pushed_at":"2018-01-30T15:16:28.000Z","size":135,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-07T07:11:40.723Z","etag":null,"topics":["csv","data-analysis","data-science","fill","fluctuations","graph","numpy","python","python-library"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nandit123.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-25T04:14:39.000Z","updated_at":"2018-05-05T17:19:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"82ec6e17-8bf2-4aef-92f6-9a6df6f530c4","html_url":"https://github.com/nandit123/python_on_excel","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nandit123/python_on_excel","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nandit123%2Fpython_on_excel","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nandit123%2Fpython_on_excel/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nandit123%2Fpython_on_excel/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/nandit123%2Fpython_on_excel/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nandit123","download_url":"https://codeload.github.com/nandit123/python_on_excel/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nandit123%2Fpython_on_excel/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267441611,"owners_count":24087772,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-27T02:00:11.917Z","response_time":82,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","data-analysis","data-science","fill","fluctuations","graph","numpy","python","python-library"],"created_at":"2024-11-12T04:47:58.253Z","updated_at":"2026-05-16T18:34:57.674Z","avatar_url":"https://github.com/nandit123.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# python_on_excel\nUsing Python libraries to perform operations on Excel data.\nThe input data is in the Excel file data.xlsx, which contains some missing values. We need to fill in those missing values so that the data as a whole makes more sense.\n\nWhat is done -\u003e\nI have created a code_numpy.py file that operates on data.xlsx to fill in the missing values and saves the result to data2.xlsx. 
Missing values are filled with the mean, mode or median, depending on the method the user selects when the Python program is run. The standard deviation is then computed before and after filling the missing values with the chosen method (mean, mode or median). A graph is then plotted for standard deviation vs column number (containing values).\n\nLibraries used - \nnumpy for mean and median,\nscipy for mode,\nmatplotlib for plotting the graph,\nxlrd and xlwt for reading and writing the Excel file (alternatively, xlutils can be used directly)\n\nImportant: Here the Excel file has been parsed and the operations are then conducted on it. Another way could be exporting a CSV file from Excel and then conducting the operations on the CSV file. The CSV file will automatically eliminate the empty cells. :)\n\n\nResults -\u003e Fluctuations -\u003e \n\nHere we have calculated the average of the differences in standard deviation for each of the three methods in our code, to find the one best suited for filling in the missing values. This average is called the fluctuation here.\n\nFor the given data, we get the following -\n\nMean Based Method -\u003e\nfluctuation = 0.06652462870452805\n\nMode Based Method -\u003e\nfluctuation = 0.052384111228187\n\nMedian Based Method -\u003e\nfluctuation = 0.06458214168009144\n\n\n\n\nConclusion – For the given data, the fluctuation is smallest for the mode-based method; hence, it is the preferred method of the three for filling in the missing values.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnandit123%2Fpython_on_excel","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnandit123%2Fpython_on_excel","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnandit123%2Fpython_on_excel/lists"}