{"id":19491919,"url":"https://github.com/neelsoumya/public_open_source_data_science","last_synced_at":"2025-04-25T19:32:29.089Z","repository":{"id":118680916,"uuid":"284046875","full_name":"neelsoumya/public_open_source_data_science","owner":"neelsoumya","description":"A repository of open source data science projects for social good","archived":false,"fork":false,"pushed_at":"2025-02-08T08:43:13.000Z","size":29367,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-12T12:31:49.810Z","etag":null,"topics":["citizen-data-science","citizen-science","data-analysis","data-science","datascience","datascience-social-good","datascience-socialgood","deep-learning","machine-learning","paper","python","social"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/neelsoumya.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":null,"patreon":"soumyabanerjee","open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":null}},"created_at":"2020-07-31T13:47:20.000Z","updated_at":"2025-02-08T08:43:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"518ccdfc-6e13-4fc8-a0a1-1c54e9a76910","html_url":"https://github.com/neelsoumya/public_open_source_data_science","commit_stats":{"total_commits":41,"total_committers":1,"mean_commits":41.0,"dds":0.0,"last_synced_commit":"3
0a80541d9973f423516bebcf59ec5fc65a15551"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neelsoumya%2Fpublic_open_source_data_science","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neelsoumya%2Fpublic_open_source_data_science/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neelsoumya%2Fpublic_open_source_data_science/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neelsoumya%2Fpublic_open_source_data_science/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/neelsoumya","download_url":"https://codeload.github.com/neelsoumya/public_open_source_data_science/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250882637,"owners_count":21502341,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["citizen-data-science","citizen-science","data-analysis","data-science","datascience","datascience-social-good","datascience-socialgood","deep-learning","machine-learning","paper","python","social"],"created_at":"2024-11-10T21:18:45.791Z","updated_at":"2025-04-25T19:32:24.072Z","avatar_url":"https://github.com/neelsoumya.png","language":"Jupyter Notebook","funding_links":["https://patreon.com/soumyabanerjee"],"categories":[],"sub_categories":[],"readme":"# Introduction\n\nSource code and data for open source data science for social good. 
This is a data science portfolio.\n\n\n# List of projects\n\n\n1) university_sexcrimes\n\n    Analysis of data on sex crimes on US university campuses.\n\n2) heart_disease_risk_prediction\n\n    Predicting heart disease risk from open data.\n\n3) cancer_mortality_prediction\n\n    Predicting cancer survival using logistic regression from open data.\n\n4) predicting_news_popularity\n\n    Predicting popularity of news articles from open data.\n\n5) opensource_mapping_project\n\n    Open source mapping project. \n\n6) astroinformatics\n    \n    Analysis of astronomy data using machine learning techniques.\n\n7) scientific_collaboration\n\n    Project to analyze planetary-scale scientific collaboration data.\n\n8) accident_prediction\n\n    Road accident forecasting and data exploration project.\n    \n    Interactive website using Shiny at:\n    \n    https://neelsoumya.shinyapps.io/accident_prediction/\n\n9) patterns_in_crime\n\n    Predicting patterns of crime using data science. Larger cities have disproportionately more crime per capita than smaller cities (super-linear scaling of crime). We used techniques from dynamical systems and complex systems to explain the super-linear scaling of crime in cities and other socio-technological systems.\n\n10) spam_classification\n\n    Building an SVM-based spam classifier trained on data from the UCI repository.\n \n11) breast_cancer_prediction\n\n    Downloads data from the UCI machine learning repository to make predictions\n    for breast cancer. A few features, such as epithelial cell size, turn out to be especially important for prediction. This uses a random forest.\n\n12) funding_trends_science\n\n    Project to analyze data on funding trends in biomedical science.\n\n13) infectious_disease_prediction\n\n    Project to analyze data on emerging infectious diseases.\n\n14) forecasting_imports\n\n    Project to forecast imports and model supply chains.
\n\n15) deep_learning_basic\n\n    Basic deep learning model using keras for prediction.\n   \n16) ai_healthcare\n\n    Machine learning and AI applied to healthcare.\n    \n17) ai_social_good\n\n    Machine learning, data science and AI for social good. \n    \n18) ai_bigdata_biology\n\n    Machine learning and bioinformatics for big data in biology. \n\n19) browser_based_data_science\n\n    Browser based data science for democratic access to data science tools.\t\n    \n20) clinical_informatics\n\n    Open source privacy-preserving clinical informatics.\n    \n21) policy_paper_general_public\n\n    Policy paper for general public on Ethical Artificial Intelligence (EAI) for social good.\n    \n22) nlp\n\n    Resources, code and data for natural language processing.\n    \n23) self_organising_map_wine_dataset\n\n    A self organising map (SOM) on the UCI wine dataset using the Orange data science tool. \n    \n24) outreach\n\n    Outreach for machine learning and AI for general public\n    \n25) teaching_resources\n\n    Teaching resources for machine learning, data science and AI for a general audience\n\n\n\n### What is this repository for? ###\n\n* Quick summary\n\n\t* Open source code and data for open source data science.\n\n### Citation ###\n\n* If you use this code, please cite the paper and code\n     \n     * Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects https://doi.org/10.13140/RG.2.1.1846.6002\n\n     * Citizen Data Science for Social Good in Complex Systems, Interdisciplinary Description of Complex Systems, 16(1):88-91, 2018  http://indecs.eu/index.php?s=x\u0026y=2018\u0026p=88-91\t\n     \n     * Banerjee, Soumya. (2017, September 3). Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects (Supplementary Resources). Zenodo. 
http://doi.org/10.5281/zenodo.883783\n\n      [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.883783.svg)](https://doi.org/10.5281/zenodo.883783)\n\n* These projects are an example of my approach to data science for good. I work very closely with domain experts and stakeholders and use computational tools for good. I outline my design and work philosophy below.\n\n     * ![data science philosophy](research_philosophy.png)\n\n### Installation ###\n\nInstall R, RStudio, MATLAB and Python.\n\nInstall R \n\n   https://www.r-project.org/\n\nand RStudio \n\n   https://www.rstudio.com/products/rstudio/download/preview/\n\n```r\nsource(\"https://raw.githubusercontent.com/neelsoumya/rlib/master/INSTALL_MANY_MODULES.R\")\n```\n\nInstall Python dependencies as follows:\n\n```shell\n    pip3 install -r requirements.txt\n```\n\n### Contact ###\n\n     Soumya Banerjee\n     \n     https://sites.google.com/site/neelsoumya/\n     \n     sb2333@cam.ac.uk\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneelsoumya%2Fpublic_open_source_data_science","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneelsoumya%2Fpublic_open_source_data_science","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneelsoumya%2Fpublic_open_source_data_science/lists"}