{"id":14111224,"url":"https://github.com/dlab-berkeley/awesome-dlab","last_synced_at":"2025-08-01T12:31:44.413Z","repository":{"id":46785785,"uuid":"189479078","full_name":"dlab-berkeley/awesome-dlab","owner":"dlab-berkeley","description":"😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers","archived":false,"fork":false,"pushed_at":"2024-11-28T00:17:39.000Z","size":44,"stargazers_count":17,"open_issues_count":0,"forks_count":3,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-07-02T23:01:52.950Z","etag":null,"topics":["awesome","awesome-list","lists","resources","social-sciences","ucberkeley"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dlab-berkeley.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-30T20:36:19.000Z","updated_at":"2025-05-06T03:55:43.000Z","dependencies_parsed_at":"2022-09-02T12:11:57.905Z","dependency_job_id":null,"html_url":"https://github.com/dlab-berkeley/awesome-dlab","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dlab-berkeley/awesome-dlab","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dlab-berkeley%2Fawesome-dlab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dlab-berkeley%2Fawesome-dlab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dlab-berkeley%2Fawesome-dlab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dlab-berkeley%2Fawesome-dlab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/owners/dlab-berkeley","download_url":"https://codeload.github.com/dlab-berkeley/awesome-dlab/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dlab-berkeley%2Fawesome-dlab/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265452497,"owners_count":23767977,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["awesome","awesome-list","lists","resources","social-sciences","ucberkeley"],"created_at":"2024-08-14T10:03:11.940Z","updated_at":"2025-08-01T12:31:44.349Z","avatar_url":"https://github.com/dlab-berkeley.png","language":null,"funding_links":[],"categories":["Other Lists"],"sub_categories":["TeX Lists"],"readme":"# awesome-dlab\n😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers\n\n**What is an awesome list?**\n\nOnly put stuff on the list that you or another D-Labber can personally recommend. You should rather leave stuff out than include too much. 
Read the [Awesome Manifesto](https://github.com/sindresorhus/awesome/blob/master/awesome.md) to find out more about what this list is about.\n\nOr if you'd like to check out stuff that is awesome to people outside of D-Lab, then start here: [![Awesome](https://awesome.re/badge.svg)](https://awesome.re)\n\n## Contents\n\n- [Datasets](#datasets)\n- [Natural Language Processing (NLP)](#natural-language-processing-nlp)\n- [Rosetta Stones](#rosetta-stones)\n- [R](#r)\n- [Python](#python)\n- [PDF](#pdf)\n- [Databases](#databases)\n- [Systems Administration](#systems-administration)\n- [Cloud computing](#cloud-computing)\n- [Reproducibility](#reproducibility)\n\n## Datasets\n* [Case.Law](https://case.law/) - all official, book-published United States case law — every volume designated as an official report of decisions by a court within the United States.\n* [DEA Pain Pills Database](https://www.washingtonpost.com/national/2019/07/18/how-download-use-dea-pain-pills-database/) - The Washington Post published a significant portion of a database that tracks the path of every opioid pain pill, from manufacturer to pharmacy, in the United States between 2006 and 2012.\n* [Awesome Public Data](https://github.com/awesomedata/awesome-public-datasets) - a list of topic-centric public data sources collected and tidied from blogs, answers, and user responses.\n* [tidytweetjson](https://github.com/jaeyk/tidytweetjson) - R package for Turning Tweet JSON Files into a Tidyverse-ready Dataframe. The package takes 18 minutes to turn 1 million tweets into a dataframe.\n* [tidyethnicnews](https://github.com/jaeyk/tidyethnicnews) - R package for turning one of the largest databases on ethnic newspapers and magazines (Ethnic NewsWatch) into a tidyverse-ready dataframe. 
The package takes 0.0005 seconds to turn 100 newspaper articles into a tidy dataframe.\n* [California COVID Assessment Tool](https://github.com/StateOfCalifornia/CalCAT) - This repository contains an application written in Shiny for use with any US state to assist in assessing the many different models available for understanding COVID-19 transmission and spread. It brings together several data sources that are publicly available, and can be supplemented with your own data to improve the assessment.\n\n## Natural Language Processing (NLP)\n* [Tracking Progress in Natural Language Processing](https://github.com/sebastianruder/NLP-progress) - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.\n\n## Rosetta Stones\n* [Rosetta: Python, R, Stata Rosetta Stone. Projects implemented in each language side-by-side.](https://github.com/adamrossnelson/rosetta)\n* [Stata to Pandas Cross-Walk](https://github.com/adamrossnelson/StataQuickReference/blob/master/spcrosswlk.md)\n* [Data Science Rosetta Stone](http://www.datasciencerosettastone.com/) - A Tutorial of and Translation between Data Science Programming Languages\n\n\n## R\n* [Awesome R](https://github.com/qinwf/awesome-R#readme) - more awesomeness related to this topic.\n\n* [rio: A Swiss-Army Knife for Data I/O](https://cran.r-project.org/web/packages/rio/vignettes/rio.html) - Import, Export, and Convert Data Files including web-based import, reading compressed files directly without explicit decompression, and 'convert()' function for converting between file types.\n\n* [makereproducible](https://github.com/jaeyk/makereproducible): R package for making a project computationally reproducible before sharing it\n\n## PDF\n* [Working with PDFs in Python](https://stackabuse.com/working-with-pdfs-in-python-reading-and-splitting-pages/) - Describes a range of Python libraries and examples to work with PDFs: 
Reading and Splitting Pages; Adding Images and Watermarks; Inserting, Deleting, and Reordering Pages\n\n## Python\n* [Awesome Python](https://github.com/vinta/awesome-python#readme) - more awesomeness related to this topic.\n\n## Databases\n* [SQLite](http://www.sqlite.org/) - A completely embedded, full-featured relational database in a few 100k that you can include right into your project.\n* [sqlitebiter](https://github.com/thombashi/sqlitebiter) - a CLI tool to convert CSV / Excel / HTML / JSON / and many other formats to a SQLite database file.\n* [Awesome SQL](https://github.com/danhuss/awesome-sql) - more awesomeness related to this topic.\n* [fuzzy string matching with Postgresql](https://www.freecodecamp.org/news/fuzzy-string-matching-with-postgresql/) - examples of different ways to match strings using PostgreSQL and extensions.\n* [binder-postgres](https://github.com/ouseful-template-repos/binder-postgres) - Demo of launching a binderhub notebook server with a free running Postgres server. 
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ouseful-template-repos/binder-postgres/master?filepath=notebooks%2FTest%20Databases.ipynb)\n* [SQL Join Types Explained in Visuals](https://dataschool.com/how-to-teach-people-sql/sql-join-types-explained-visually/) - Simple, useful visual explanation of joins in SQL.\n* [Understanding Joins in Relational data | R for Data Science](https://r4ds.had.co.nz/relational-data.html#understanding-joins) - Visual explanation of joins in SQL with the addition of R code and variables.\n\n\n## Bash\n* [miller](https://github.com/johnkerl/miller) - With Miller, you get to use named fields without needing to count positional indices, using familiar formats such as CSV, TSV, JSON, and positionally-indexed.\n* [q](https://github.com/harelba/q) - Run SQL directly on CSV or TSV files.\n* [jq](https://stedolan.github.io/jq) - jq is a lightweight and flexible command-line JSON processor.\n* [jid](https://github.com/simeji/jid) - JSON Incremental Digger to drill down interactively by using filtering queries like jq.\n\n## Systems Administration\n* [Ops School](http://www.opsschool.org) - Comprehensive program that will help you learn to be an operations engineer.\n* [Awesome Sysadmin](https://github.com/kahun/awesome-sysadmin) - more awesomeness related to this topic.\n\n## Cloud Computing\n* [Binder](https://mybinder.org/) - To turn a Git repo into a collection of interactive notebooks. A great tool for teaching workshops.\n\n## Reproducibility\n* [The Turing Way handbook](https://github.com/alan-turing-institute/the-turing-way#readme) - a handbook to reproducible, ethical and collaborative data science.\n* [MRAN Timemachine](https://mran.microsoft.com/timemachine) - For the purpose of reproducibility, MRAN hosts daily snapshots of the CRAN R packages and R releases as far back as Sept. 
17, 2014.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdlab-berkeley%2Fawesome-dlab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdlab-berkeley%2Fawesome-dlab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdlab-berkeley%2Fawesome-dlab/lists"}