{"id":19693059,"url":"https://github.com/christs8920/data-science-py","last_synced_at":"2026-04-29T23:01:27.505Z","repository":{"id":159434606,"uuid":"634639517","full_name":"ChrisTs8920/data-science-py","owner":"ChrisTs8920","description":"A collection of data science projects made in python.","archived":false,"fork":false,"pushed_at":"2024-08-03T10:00:00.000Z","size":31,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-08T07:28:47.226Z","etag":null,"topics":["data-science","data-visualization","machine-learning","matplotlib","nltk","numpy","pandas","python","sklearn","svm-classifier","visualization"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ChrisTs8920.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-30T19:18:56.000Z","updated_at":"2024-08-03T10:00:03.000Z","dependencies_parsed_at":"2024-06-22T20:23:37.480Z","dependency_job_id":"aa64ecb4-8659-452e-a817-987a60dbfb91","html_url":"https://github.com/ChrisTs8920/data-science-py","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ChrisTs8920/data-science-py","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChrisTs8920%2Fdata-science-py","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChrisTs8920%2Fdata-science-py/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChrisTs8920%2Fdata-science
-py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChrisTs8920%2Fdata-science-py/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ChrisTs8920","download_url":"https://codeload.github.com/ChrisTs8920/data-science-py/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChrisTs8920%2Fdata-science-py/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32447312,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T22:27:22.272Z","status":"ssl_error","status_checked_at":"2026-04-29T22:10:49.234Z","response_time":110,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","data-visualization","machine-learning","matplotlib","nltk","numpy","pandas","python","sklearn","svm-classifier","visualization"],"created_at":"2024-11-11T19:15:37.204Z","updated_at":"2026-04-29T23:01:27.458Z","avatar_url":"https://github.com/ChrisTs8920.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Science with Python\n\n## Description\n\nThis repository contains a collection of Data science projects, made in python.\n\nLibraries used:\n\n- Pandas\n- NumPy\n- sklearn\n- NLTK\n\n## Heart Disease Classification\n\nThis project uses Machine Learning (classification algorithm - S.V.M. 
or Support Vector Machine) to predict whether a patient has an increased chance of a heart attack. It then reports the accuracy of the classifier and plots the different parameters of the data set.\n\nData was provided by [kaggle.com](https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset/data).\n\n\u003e*This project was an assignment for my Big Data course at university.*\n\n## Corpus analysis\n\nThis project plots statistics for some of the texts bundled with [NLTK](https://www.nltk.org/):\n\n- Lexical richness and percentage of text taken up by various words.\n- Stemming vs lemmatization.\n- str.split() vs NLTK tokenization (nltk.tokenize).\n- Frequency distributions.\n\n\u003e*This project was an assignment for my Information Retrieval course at university.*\n\n## Salary Survey\n\nThis project plots statistics for programmers in Greece for the year 2022:\n\n- Education level\n- Most used programming languages\n- Remote work (both, remote, on-site)\n- Median wage\n\n\u003eData was provided by [SocialNerds](https://www.youtube.com/@SocialNerdsGR).\n\n## How to run\n\n1. Place the data file in the same directory as the script.\n2. Run ```\u003cfilename\u003e.py``` with Python.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchrists8920%2Fdata-science-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchrists8920%2Fdata-science-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchrists8920%2Fdata-science-py/lists"}