Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-dlab
😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers
https://github.com/dlab-berkeley/awesome-dlab
-
Datasets
- DEA Pain Pills Database - The Washington Post published a significant portion of a database that tracks the path of every opioid pain pill, from manufacturer to pharmacy, in the United States between 2006 and 2012.
- Awesome Public Data - A list of topic-centric public data sources collected and tidied from blogs, answers, and user responses.
- tidytweetjson - R package for turning Tweet JSON files into a tidyverse-ready dataframe. The package takes 18 minutes to turn 1 million tweets into a dataframe.
- tidyethnicnews - R package for turning one of the largest databases on ethnic newspapers and magazines (Ethnic NewsWatch) into a tidyverse-ready dataframe. The package takes 0.0005 seconds to turn 100 newspaper articles into a tidy dataframe.
- California COVID Assessment Tool - This repository contains a Shiny application, usable with any US state, that helps assess the many models available for understanding COVID-19 transmission and spread. It brings together several publicly available data sources and can be supplemented with your own data to improve the assessment.
- Case.Law - all official, book-published United States case law — every volume designated as an official report of decisions by a court within the United States.
-
Rosetta Stones
- Stata to Pandas Cross-Walk
- Data Science Rosetta Stone - A Tutorial of and Translation between Data Science Programming Languages
- Rosetta: Python, R, Stata Rosetta Stone. Projects implemented in each language side-by-side.
-
R
- Awesome R - more awesomeness related to this topic.
- rio: A Swiss-Army Knife for Data I/O - Import, Export, and Convert Data Files including web-based import, reading compressed files directly without explicit decompression, and 'convert()' function for converting between file types.
- makereproducible
-
PDF
- Working with PDFs in Python - Describes a range of Python libraries and examples for working with PDFs: reading and splitting pages; adding images and watermarks; inserting, deleting, and reordering pages.
-
Python
- Awesome Python - more awesomeness related to this topic.
-
Databases
- SQLite - A completely embedded, full-featured relational database in a few 100k that you can include right into your project.
- fuzzy string matching with Postgresql - examples of different ways to match strings using PostgreSQL and extensions.
- SQL Join Types Explained in Visuals - Simple, useful visual explanation of joins in SQL.
- Understanding Joins in Relational data | R for Data Science - Visual explanation of joins in SQL with the addition of R code and variables.
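As a quick illustration of the join resources above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory SQLite database. The `authors` and `papers` tables, their columns, and the `join_demo` helper are hypothetical and exist only for this example; they do not come from any of the linked resources.

```python
import sqlite3


def join_demo():
    """Return (inner_rows, left_rows) for two tiny hypothetical tables."""
    # An in-memory database: nothing is written to disk.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE papers (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO papers VALUES (10, 1, 'Notes'), (11, 2, 'Compilers');
        """
    )
    # INNER JOIN keeps only authors that have at least one matching paper.
    inner = conn.execute(
        """
        SELECT a.name, p.title
        FROM authors a
        INNER JOIN papers p ON p.author_id = a.id
        ORDER BY a.id
        """
    ).fetchall()
    # LEFT JOIN keeps every author; a missing paper shows up as NULL (None).
    left = conn.execute(
        """
        SELECT a.name, p.title
        FROM authors a
        LEFT JOIN papers p ON p.author_id = a.id
        ORDER BY a.id
        """
    ).fetchall()
    conn.close()
    return inner, left


if __name__ == "__main__":
    inner_rows, left_rows = join_demo()
    print("INNER JOIN:", inner_rows)
    print("LEFT JOIN: ", left_rows)
```

The only difference between the two result sets is the author without a paper, which the left join keeps and the inner join drops.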
-
Systems Administration
- Ops School - Comprehensive program that will help you learn to be an operations engineer.
-
Cloud Computing
- Binder - Turns a Git repo into a collection of interactive notebooks. A great tool for teaching workshops.
-
Reproducibility
- MRAN Timemachine - For the purpose of reproducibility, MRAN hosts daily snapshots of the CRAN R packages and R releases as far back as Sept. 17, 2014.
- The Turing Way handbook - a handbook to reproducible, ethical and collaborative data science.
-
Bash
- jq - jq is a lightweight and flexible command-line JSON processor.
-
Natural Language Processing (NLP)
- Tracking Progress in Natural Language Processing - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.