Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/neelsoumya/public_open_source_data_science
A repository of open source data science projects for social good
https://github.com/neelsoumya/public_open_source_data_science
citizen-data-science citizen-science data-analysis data-science datascience datascience-social-good datascience-socialgood deep-learning machine-learning paper python social
Last synced: about 2 months ago
JSON representation
A repository of open source data science projects for social good
- Host: GitHub
- URL: https://github.com/neelsoumya/public_open_source_data_science
- Owner: neelsoumya
- Created: 2020-07-31T13:47:20.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-07-11T13:08:01.000Z (over 2 years ago)
- Last Synced: 2024-07-30T17:36:20.219Z (5 months ago)
- Topics: citizen-data-science, citizen-science, data-analysis, data-science, datascience, datascience-social-good, datascience-socialgood, deep-learning, machine-learning, paper, python, social
- Language: Jupyter Notebook
- Homepage:
- Size: 28 MB
- Stars: 3
- Watchers: 5
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- Citation: CITATION.cff
Awesome Lists containing this project
README
# Introduction
Source code and data for open source data science for social good. This is a data science portfolio.
# List of projects
1) university_sexcrimes
Analysis of data on sex crimes in US university campuses.
2) heart_disease_risk_prediction
Predicting heart disease risk from open data.
3) cancer_mortality_prediction
Predicting cancer survival using logistic regression from open data.
4) predicting_news_popularity
Predicting popularity of news articles from open data.
5) opensource_mapping_project
Open source mapping project.
6) astroinformatics
Analysis of astronomy data using machine learning techniques.7) scientific_collaboration
Project to analyze planetary scale scientific collaboration data.
8) accident_prediction
Road accident forecasting and data exploration project.
Interactive website using shiny at:
https://neelsoumya.shinyapps.io/accident_prediction/9) patterns_in_crime
Predicting patterns of crime using data science. Larger cities have disproportionately more crime per capita compared to smaller cities (super-linear scaling of crime). We used techniques from dynamical systems and complex systems to explain the super-linear scaling of crime in cities and other socio-technological systems
10) spam_classification
Building an SVM based spam classifier trained on data from the UCI repository
11) breast_cancer_predictionDownloads data from the UCI machine learning repository to make predictions
for breast cancer. A few features turn out to be really important for prediction like epithelial cell size. This uses a random forest.12) funding_trends_science
Project to analyze data on funding trends in biomedical science.
13) infectious_disease_prediction
Project to analyze data on emerging infectious diseases.
14) forecasting_imports
Project to forecast imports and model supply chains.
15) deep_learning_basic
Basic deep learning model using keras for prediction.
16) ai_healthcareMachine learning and AI applied to healthcare.
17) ai_social_goodMachine learning, data science and AI for social good.
18) ai_bigdata_biologyMachine learning and bioinformatics for big data in biology.
19) browser_based_data_science
Browser based data science for democratic access to data science tools.
20) clinical_informaticsOpen source privacy-preserving clinical informatics.
21) policy_paper_general_publicPolicy paper for general public on Ethical Artificial Intelligence (EAI) for social good.
22) nlpResources, code and data for natural language processing.
23) self_organising_map_wine_datasetA self organising map (SOM) on the UCI wine dataset using the Orange data science tool.
24) outreachOutreach for machine learning and AI for general public
25) teaching_resourcesTeaching resources for machine learning, data science and AI for a general audience
### What is this repository for? ###
* Quick summary
* Open source code and data for open source data science.
### Citation ###
* If you use this code, please cite the paper and code
* Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects https://doi.org/10.13140/RG.2.1.1846.6002* Citizen Data Science for Social Good in Complex Systems, Interdisciplinary Description of Complex Systems, 16(1):88-91, 2018 http://indecs.eu/index.php?s=x&y=2018&p=88-91
* Banerjee, Soumya. (2017, September 3). Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects (Supplementary Resources). Zenodo. http://doi.org/10.5281/zenodo.883783![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.883783.svg)](https://doi.org/10.5281/zenodo.883783)
* These projects are an example of my approach to data science for good. I work very closely with domain experts and stakeholders and use computational tools for good. I outline my design and work philosophy below.
* ![data science philosophy](research_philosophy.png)
### Installation ###
Install R, R Studio, MATLAB and Python
Install R
https://www.r-project.org/
and R Studio
https://www.rstudio.com/products/rstudio/download/preview/
```r
source("https://raw.githubusercontent.com/neelsoumya/rlib/master/INSTALL_MANY_MODULES.R")
```Install Python dependencies as follows:
```r
pip3 install -r requirements.txt
```### Contact ###
Soumya Banerjee
https://sites.google.com/site/neelsoumya/
[email protected]