Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nolanbconaway/friends-omg
People say "oh my god" a lot in the show "Friends".
heroku python television
- Host: GitHub
- URL: https://github.com/nolanbconaway/friends-omg
- Owner: nolanbconaway
- Created: 2020-08-31T00:14:17.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2024-03-23T16:05:51.000Z (10 months ago)
- Last Synced: 2024-03-23T17:22:36.751Z (10 months ago)
- Topics: heroku, python, television
- Language: Python
- Homepage: https://friends-omg.herokuapp.com/
- Size: 24.4 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# Friends OMG
We were rewatching some old Friends episodes at home when I took notice that the phrase _"oh my god"_
comes up a lot in that show. As any reasonable person would, I compiled script data and built a website to
prove my point.

This webapp lets you check how often phrases like "oh my god" are said in Friends, Seinfeld,
and Sex and the City.

## Building the Dataset
I tried to make the data build as self-contained as possible. To that end I have hosted some source files on my personal heliohosting server. There are two unrecoverable aspects to the data:
1. The source Seinfeld data is no longer available online. [Colin](https://github.com/colinpollock) sent a copy to me, and that's how I have it.
2. The Sex and the City data were messy in their public form (via Kaggle). I cleaned those data up a fair amount and saved the file.

The [download-all](bin/download-all) script contains the URLs for the source data used to build the final dataset. After those raw files are downloaded, the [build](build/) module contains code to process them in order to produce the final dataset.
You can download a copy of it [here](http://nolanc.heliohost.org/omg-data/data.db.gz)!
### Data Credits
I did basically zero work obtaining the source data. Below are shout-outs to those who did that hard work:
1. [Colin Pollock](https://github.com/colinpollock/seinfeld-scripts) for the _excellent_ Seinfeld dataset.
2. [Yusuf Sohoye](https://quotennial.github.io/Friends-engineering) for the regex to parse through Friends script files.
3. I stole the Sex and the City data from [Kaggle](https://www.kaggle.com/snapcrack/every-sex-and-the-city-script).

You can download the full (gzipped) SQLite3 database [here](http://nolanc.heliohost.org/omg-data/data.db.gz)!
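
Since the download is a gzipped SQLite3 database, it can be opened with nothing but the Python standard library. Below is a minimal sketch, assuming the file has already been saved locally as `data.db.gz`. The helper names `decompress_gzip` and `list_tables` are made up for illustration and are not part of this repo, and because the table layout is not documented here, the sketch only lists whatever tables the file contains.

```python
import gzip
import shutil
import sqlite3
from pathlib import Path


def decompress_gzip(src: Path, dest: Path) -> Path:
    """Decompress the .gz file at src into dest and return dest."""
    with gzip.open(src, "rb") as f_in, open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dest


def list_tables(db_path: Path) -> list[str]:
    """Return the names of all tables in a SQLite database, sorted by name."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

With the file in the working directory, `list_tables(decompress_gzip(Path("data.db.gz"), Path("data.db")))` should show which tables are available to query.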