https://github.com/dlab-berkeley/awesome-dlab

😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers
List: awesome-dlab

awesome awesome-list lists resources social-sciences ucberkeley

Last synced: 7 months ago

Host: GitHub
URL: https://github.com/dlab-berkeley/awesome-dlab
Owner: dlab-berkeley
Created: 2019-05-30T20:36:19.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2021-11-24T18:37:56.000Z (over 3 years ago)
Last Synced: 2024-05-23T06:11:53.930Z (about 1 year ago)
Topics: awesome, awesome-list, lists, resources, social-sciences, ucberkeley
Size: 38.1 KB
Stars: 17
Watchers: 11
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

ultimate-awesome - awesome-dlab - 😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers. (Other Lists / TeX Lists)

README

# awesome-dlab

😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers

**What is an awesome list?**

Only put stuff on the list that you or another D-Labber can personally recommend. It is better to leave something out than to include too much. Read the [Awesome Manifesto](https://github.com/sindresorhus/awesome/blob/master/awesome.md) to find out more about what belongs on a list like this.

Or if you'd like to check out stuff that is awesome to people outside of D-Lab, then start here: [![Awesome](https://awesome.re/badge.svg)](https://awesome.re)

## Contents

- [Datasets](#datasets)

- [Natural Language Processing (NLP)](#natural-language-processing-nlp)

- [Rosetta Stones](#rosetta-stones)

- [R](#r)

- [PDF](#pdf)

- [Python](#python)

- [Databases](#databases)

- [Bash](#bash)

- [Systems Administration](#systems-administration)

- [Cloud Computing](#cloud-computing)

- [Reproducibility](#reproducibility)

## Datasets

* [Case.Law](https://case.law/) - all official, book-published United States case law — every volume designated as an official report of decisions by a court within the United States.

* [DEA Pain Pills Database](https://www.washingtonpost.com/national/2019/07/18/how-download-use-dea-pain-pills-database/) - The Washington Post published a significant portion of a database that tracks the path of every opioid pain pill, from manufacturer to pharmacy, in the United States between 2006 and 2012.

* [Awesome Public Data](https://github.com/awesomedata/awesome-public-datasets) - A topic-centric list of public data sources collected and tidied from blogs, answers, and user responses.

* [tidytweetjson](https://github.com/jaeyk/tidytweetjson) - R package for turning Tweet JSON files into a tidyverse-ready dataframe. The package takes 18 minutes to turn 1 million tweets into a dataframe.

* [tidyethnicnews](https://github.com/jaeyk/tidyethnicnews) - R package for turning one of the largest databases on ethnic newspapers and magazines (Ethnic NewsWatch) into a tidyverse-ready dataframe. The package takes 0.0005 seconds to turn 100 newspaper articles into a tidy dataframe.

* [California COVID Assessment Tool](https://github.com/StateOfCalifornia/CalCAT) - A Shiny application, usable with any US state, for assessing the many models available for understanding COVID-19 transmission and spread. It brings together several publicly available data sources and can be supplemented with your own data to improve the assessment.

## Natural Language Processing (NLP)

* [Tracking Progress in Natural Language Processing](https://github.com/sebastianruder/NLP-progress) - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

## Rosetta Stones

* [Rosetta](https://github.com/adamrossnelson/rosetta) - Python, R, Stata Rosetta Stone. Projects implemented in each language side by side.

* [Stata to Pandas Cross-Walk](https://github.com/adamrossnelson/StataQuickReference/blob/master/spcrosswlk.md)

* [Data Science Rosetta Stone](http://www.datasciencerosettastone.com/) - A Tutorial of and Translation between Data Science Programming Languages

## R

* [Awesome R](https://github.com/qinwf/awesome-R#readme) - more awesomeness related to this topic.

* [rio: A Swiss-Army Knife for Data I/O](https://cran.r-project.org/web/packages/rio/vignettes/rio.html) - Import, export, and convert data files, including web-based import, reading compressed files directly without explicit decompression, and a `convert()` function for converting between file types.

* [makereproducible](https://github.com/jaeyk/makereproducible) - R package for making a project computationally reproducible before sharing it.

## PDF

* [Working with PDFs in Python](https://stackabuse.com/working-with-pdfs-in-python-reading-and-splitting-pages/) - Describes a range of Python libraries, with examples, for working with PDFs: reading and splitting pages; adding images and watermarks; inserting, deleting, and reordering pages.

## Python

* [Awesome Python](https://github.com/vinta/awesome-python#readme) - more awesomeness related to this topic.

## Databases

* [SQLite](http://www.sqlite.org/) - A completely embedded, full-featured relational database in a few hundred kilobytes that you can include directly in your project.

* [sqlitebiter](https://github.com/thombashi/sqlitebiter) - a CLI tool to convert CSV / Excel / HTML / JSON / and many other formats to a SQLite database file.

* [Awesome SQL](https://github.com/danhuss/awesome-sql) - more awesomeness related to this topic.

* [Fuzzy string matching with PostgreSQL](https://www.freecodecamp.org/news/fuzzy-string-matching-with-postgresql/) - examples of different ways to match strings using PostgreSQL and extensions.

* [binder-postgres](https://github.com/ouseful-template-repos/binder-postgres) - Demo of launching a BinderHub notebook server with a free-running Postgres server. [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ouseful-template-repos/binder-postgres/master?filepath=notebooks%2FTest%20Databases.ipynb)

* [SQL Join Types Explained in Visuals](https://dataschool.com/how-to-teach-people-sql/sql-join-types-explained-visually/) - Simple, useful visual explanation of joins in SQL.

* [Understanding Joins in Relational data | R for Data Science](https://r4ds.had.co.nz/relational-data.html#understanding-joins) - Visual explanation of joins in SQL with the addition of R code and variables.
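
As a rough illustration of the join types the last two links describe, the sketch below uses Python's built-in `sqlite3` module with an in-memory SQLite database. The `authors` and `papers` tables and their rows are made up for this example and do not come from any of the linked resources.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical example tables; 'Linus' deliberately has no papers.
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE papers (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO papers VALUES (10, 1, 'Notes'), (11, 2, 'Compilers'), (12, 2, 'COBOL');
""")

# INNER JOIN keeps only authors that have at least one matching paper.
inner_rows = conn.execute("""
    SELECT a.name, p.title
    FROM authors AS a
    INNER JOIN papers AS p ON p.author_id = a.id
    ORDER BY a.id, p.id
""").fetchall()

# LEFT JOIN keeps every author; unmatched authors get NULL (None) for title.
left_rows = conn.execute("""
    SELECT a.name, p.title
    FROM authors AS a
    LEFT JOIN papers AS p ON p.author_id = a.id
    ORDER BY a.id, p.id
""").fetchall()

conn.close()

print(inner_rows)
print(left_rows)
```

The only difference between the two results is the extra row for the author without papers, which appears in the left join with `None` as the title.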

## Bash

* [miller](https://github.com/johnkerl/miller) - With Miller, you get to use named fields without needing to count positional indices, using familiar formats such as CSV, TSV, JSON, and positionally indexed data.

* [q](https://github.com/harelba/q) - Run SQL directly on CSV or TSV files.

* [jq](https://stedolan.github.io/jq) - jq is a lightweight and flexible command-line JSON processor.

* [jid](https://github.com/simeji/jid) - JSON Incremental Digger to drill down interactively by using filtering queries like jq.
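
The idea of running SQL directly on a CSV file, as `q` does, can be approximated with the Python standard library alone. The sketch below is not how `q` is implemented. It simply loads CSV text into an in-memory SQLite table and queries it. The function name `query_csv`, the table name `visits`, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, table, sql):
    """Load CSV text into an in-memory SQLite table and run one query on it.

    Sketch only: header and table names are interpolated into SQL, so they
    are assumed to be trusted and free of double quotes.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row supplies the column names
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        # Remaining rows are inserted as text values.
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Hypothetical sample data.
SAMPLE = "name,dept,hours\nana,soc,12\nben,econ,7\ncy,soc,5\n"

rows = query_csv(
    SAMPLE,
    "visits",
    "SELECT dept, SUM(CAST(hours AS INTEGER)) FROM visits GROUP BY dept ORDER BY dept",
)
print(rows)
```

Because every CSV value arrives as text, the query casts `hours` to an integer before summing. For real files, a dedicated tool such as `q` or `sqlitebiter` handles types and delimiters more carefully.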

## Systems Administration

* [Ops School](http://www.opsschool.org) - Comprehensive program that will help you learn to be an operations engineer.

* [Awesome Sysadmin](https://github.com/kahun/awesome-sysadmin) - more awesomeness related to this topic.

## Cloud Computing

* [Binder](https://mybinder.org/) - Turns a Git repo into a collection of interactive notebooks. A great tool for teaching workshops.

## Reproducibility

* [The Turing Way handbook](https://github.com/alan-turing-institute/the-turing-way#readme) - a handbook to reproducible, ethical and collaborative data science.

* [MRAN Timemachine](https://mran.microsoft.com/timemachine) - For the purpose of reproducibility, MRAN hosts daily snapshots of the CRAN R packages and R releases as far back as Sept. 17, 2014.
