https://github.com/jtanwk/nytcrossword
An exploration of New York Times crossword answers from 1994-2017, i.e. the Will Shortz era.
crosswords dataviz linguistic-analysis nytimes nytimes-crossword rvest webscraping
- Host: GitHub
- URL: https://github.com/jtanwk/nytcrossword
- Owner: jtanwk
- License: mit
- Created: 2017-09-03T06:51:53.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-02-20T05:33:29.000Z (over 5 years ago)
- Last Synced: 2024-03-09T04:35:25.681Z (4 months ago)
- Topics: crosswords, dataviz, linguistic-analysis, nytimes, nytimes-crossword, rvest, webscraping
- Language: HTML
- Homepage: https://jtanwk.github.io/nytcrossword/
- Size: 7.43 MB
- Stars: 122
- Watchers: 5
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Lists
- awesome-stars - jtanwk/nytcrossword - An exploration of New York Times crossword answers from 1994-2017, i.e. the Will Shortz era. (HTML)
README
# 24 Years of NYTimes Crossword answers
September 2, 2017
[View the notebook here](https://jtanwk.github.io/nytcrossword/)
## Description
Exploratory data analysis of 24 years of New York Times Crossword answers. I use data visualization and computational linguistics concepts to discover trends in the Shortz-era puzzles (1994 - present).
Questions include:
- What are the most common answers?
- Are words getting longer? Shorter?
- How does puzzle letter density vary by day?
- What words have emerged in the crossword only in the past few years?
- How lexically diverse are the puzzles?
## Dependencies
- `tidyverse` for everything
- `plyr` for data wrangling
- `here` for OS-agnostic file paths
- `tidytext` for text analysis methods
- `stringr` for string-manipulation operations
- `viridis` for a simple, colorblind-friendly palette
## Data Sources
The original dataset for this project was scraped from XWordInfo.com. Upon their request, however, I have taken down my scraper code and removed the dataset from this repository. [Read the notebook for more details](https://jtanwk.github.io/nytcrossword/).
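Since the dataset is no longer in this repository, the questions listed in the Description can only be illustrated on made-up data. The sketch below is not the notebook's code (the notebook relies on the R packages listed under Dependencies). It is a minimal Python illustration that assumes answers are available as `(year, answer)` pairs. `ROWS` and every function name here are hypothetical, and lexical diversity is approximated by a simple type-token ratio, which may differ from the measure used in the notebook.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows of (year, answer). This is NOT real crossword data.
ROWS = [
    (1994, "ERA"),
    (1994, "AREA"),
    (1994, "ERA"),
    (2017, "ERA"),
    (2017, "EMOJI"),
    (2017, "SELFIE"),
]


def most_common_answers(rows, n=3):
    """Return the n most frequent answers as (answer, count) pairs."""
    return Counter(answer for _, answer in rows).most_common(n)


def mean_length_by_year(rows):
    """Return the average answer length for each year."""
    lengths = defaultdict(list)
    for year, answer in rows:
        lengths[year].append(len(answer))
    return {year: sum(vals) / len(vals) for year, vals in lengths.items()}


def type_token_ratio(rows):
    """Distinct answers divided by total answers (a crude diversity measure)."""
    answers = [answer for _, answer in rows]
    return len(set(answers)) / len(answers)


def new_answers(rows, since):
    """Answers seen in or after `since` that never appear in earlier years."""
    earlier = {answer for year, answer in rows if year < since}
    recent = {answer for year, answer in rows if year >= since}
    return sorted(recent - earlier)


if __name__ == "__main__":
    print(most_common_answers(ROWS, n=1))
    print(mean_length_by_year(ROWS))
    print(type_token_ratio(ROWS))
    print(new_answers(ROWS, since=2017))
```

On the toy rows, `new_answers(ROWS, since=2017)` returns `['EMOJI', 'SELFIE']`, since `ERA` already appears in 1994. With real data, the same grouping idea would apply per weekday as well as per year.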