Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dlab-berkeley/awesome-dlab
😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers
List: awesome-dlab
awesome awesome-list lists resources social-sciences ucberkeley
Last synced: 16 days ago
- Host: GitHub
- URL: https://github.com/dlab-berkeley/awesome-dlab
- Owner: dlab-berkeley
- Created: 2019-05-30T20:36:19.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2021-11-24T18:37:56.000Z (about 3 years ago)
- Last Synced: 2024-05-23T06:11:53.930Z (7 months ago)
- Topics: awesome, awesome-list, lists, resources, social-sciences, ucberkeley
- Size: 38.1 KB
- Stars: 17
- Watchers: 11
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- ultimate-awesome - awesome-dlab - 😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers. (Other Lists / Monkey C Lists)
README
# awesome-dlab
😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers

**What is an awesome list?**

Only put stuff on the list that you or another D-Labber can personally recommend. It is better to leave stuff out than to include too much. Read the [Awesome Manifesto](https://github.com/sindresorhus/awesome/blob/master/awesome.md) to find out more about what this list is for.
Or if you'd like to check out stuff that is awesome to people outside of D-Lab, then start here: [![Awesome](https://awesome.re/badge.svg)](https://awesome.re)
## Contents
- [Datasets](#datasets)
- [Natural Language Processing (NLP)](#natural-language-processing-nlp)
- [Rosetta Stones](#rosetta-stones)
- [R](#r)
- [Python](#python)
- [PDF](#pdf)
- [Databases](#databases)
- [Systems Administration](#systems-administration)
- [Cloud Computing](#cloud-computing)
- [Reproducibility](#reproducibility)

## Datasets
* [Case.Law](https://case.law/) - all official, book-published United States case law — every volume designated as an official report of decisions by a court within the United States.
* [DEA Pain Pills Database](https://www.washingtonpost.com/national/2019/07/18/how-download-use-dea-pain-pills-database/) - The Washington Post published a significant portion of a database that tracks the path of every opioid pain pill, from manufacturer to pharmacy, in the United States between 2006 and 2012.
* [Awesome Public Data](https://github.com/awesomedata/awesome-public-datasets) - a list of topic-centric public data sources collected and tidied from blogs, answers, and user responses.
* [tidytweetjson](https://github.com/jaeyk/tidytweetjson) - R package for turning Tweet JSON files into a tidyverse-ready dataframe. The package takes 18 minutes to turn 1 million tweets into a dataframe.
* [tidyethnicnews](https://github.com/jaeyk/tidyethnicnews) - R package for turning one of the largest databases on ethnic newspapers and magazines (Ethnic NewsWatch) into a tidyverse-ready dataframe. The package takes 0.0005 seconds to turn 100 newspaper articles into a tidy dataframe.
* [California COVID Assessment Tool](https://github.com/StateOfCalifornia/CalCAT) - This repository contains a Shiny application, usable with any US state, that helps assess the many different models available for understanding COVID-19 transmission and spread. It brings together several publicly available data sources and can be supplemented with your own data to improve the assessment.

## Natural Language Processing (NLP)
* [Tracking Progress in Natural Language Processing](https://github.com/sebastianruder/NLP-progress) - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

## Rosetta Stones
* [Rosetta: Python, R, Stata Rosetta Stone. Projects implemented in each language side-by-side.](https://github.com/adamrossnelson/rosetta)
* [Stata to Pandas Cross-Walk](https://github.com/adamrossnelson/StataQuickReference/blob/master/spcrosswlk.md)
* [Data Science Rosetta Stone](http://www.datasciencerosettastone.com/) - A Tutorial of and Translation between Data Science Programming Languages

## R
* [Awesome R](https://github.com/qinwf/awesome-R#readme) - more awesomeness related to this topic.
* [rio: A Swiss-Army Knife for Data I/O](https://cran.r-project.org/web/packages/rio/vignettes/rio.html) - Import, export, and convert data files, including web-based import, reading compressed files directly without explicit decompression, and a `convert()` function for converting between file types.
* [makereproducible](https://github.com/jaeyk/makereproducible) - R package for making a project computationally reproducible before sharing it.
* [Working with PDFs in Python](https://stackabuse.com/working-with-pdfs-in-python-reading-and-splitting-pages/) - Describes a range of Python libraries and examples for working with PDFs: reading and splitting pages; adding images and watermarks; inserting, deleting, and reordering pages.

## Python
* [Awesome Python](https://github.com/vinta/awesome-python#readme) - more awesomeness related to this topic.

## Databases
* [SQLite](http://www.sqlite.org/) - A completely embedded, full-featured relational database in a few hundred kilobytes that you can include right in your project.
* [sqlitebiter](https://github.com/thombashi/sqlitebiter) - a CLI tool to convert CSV / Excel / HTML / JSON / and many other formats to a SQLite database file.
* [Awesome SQL](https://github.com/danhuss/awesome-sql) - more awesomeness related to this topic.
* [Fuzzy string matching with PostgreSQL](https://www.freecodecamp.org/news/fuzzy-string-matching-with-postgresql/) - examples of different ways to match strings using PostgreSQL and its extensions.
* [binder-postgres](https://github.com/ouseful-template-repos/binder-postgres) - Demo of launching a binderhub notebook server with a free running Postgres server. [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ouseful-template-repos/binder-postgres/master?filepath=notebooks%2FTest%20Databases.ipynb)
* [SQL Join Types Explained in Visuals](https://dataschool.com/how-to-teach-people-sql/sql-join-types-explained-visually/) - Simple, useful visual explanation of joins in SQL.
* [Understanding Joins in Relational data | R for Data Science](https://r4ds.had.co.nz/relational-data.html#understanding-joins) - Visual explanation of joins in SQL with the addition of R code and variables.

## Bash
* [miller](https://github.com/johnkerl/miller) - With Miller, you get to use named fields without needing to count positional indices, using familiar formats such as CSV, TSV, JSON, and positionally-indexed.
* [q](https://github.com/harelba/q) - Run SQL directly on CSV or TSV files.
* [jq](https://stedolan.github.io/jq) - jq is a lightweight and flexible command-line JSON processor.
* [jid](https://github.com/simeji/jid) - JSON Incremental Digger to drill down interactively by using filtering queries like jq.

## Systems Administration
* [Ops School](http://www.opsschool.org) - Comprehensive program that will help you learn to be an operations engineer.
* [Awesome Sysadmin](https://github.com/kahun/awesome-sysadmin) - more awesomeness related to this topic.

## Cloud Computing
* [Binder](https://mybinder.org/) - Turns a Git repo into a collection of interactive notebooks. A great tool for teaching workshops.

## Reproducibility
* [The Turing Way handbook](https://github.com/alan-turing-institute/the-turing-way#readme) - a handbook for reproducible, ethical, and collaborative data science.
* [MRAN Timemachine](https://mran.microsoft.com/timemachine) - For the purpose of reproducibility, MRAN hosts daily snapshots of the CRAN R packages and R releases as far back as Sept. 17, 2014.