https://github.com/diversen/shakespeare-plays
Scrape Shakespeare from MIT
shakespeare shakespeare-dataset shakespeare-plays
- Host: GitHub
- URL: https://github.com/diversen/shakespeare-plays
- Owner: diversen
- Created: 2023-12-27T19:22:37.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-28T13:38:40.000Z (over 1 year ago)
- Last Synced: 2025-02-07T22:28:37.488Z (4 months ago)
- Topics: shakespeare, shakespeare-dataset, shakespeare-plays
- Language: HTML
- Homepage:
- Size: 12.1 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Shakespeare's plays
This repository contains HTML and CSV versions of all of Shakespeare's plays.
The downloaded HTML files are in [html](html). They were fetched from [https://shakespeare.mit.edu/](https://shakespeare.mit.edu/).
The exported CSV files are in [csv](csv).
All plays combined into a single CSV file can be found in [csv/all.csv](csv/all.csv).
The CSV files have the following header:
Title,Chapter,Player,Line,Line ID,Stage Direction
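As a rough illustration of how the header above can be used, here is a minimal Python sketch that reads a CSV file with those columns and counts rows per `Player` for one play. The column names come from the header; everything else is an assumption: the helper names (`read_rows`, `lines_per_player`, `top_players`) are hypothetical, `Player` is assumed to hold the speaker's name, and the exact strings stored in `Title` are not documented here, so check the data before filtering on a title.

```python
import csv
from collections import Counter

# Column names taken from the header documented in this README.
EXPECTED_HEADER = ["Title", "Chapter", "Player", "Line", "Line ID", "Stage Direction"]


def read_rows(csv_file):
    """Yield each row of an open CSV file as a dict keyed by column name."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    yield from reader


def lines_per_player(rows, title):
    """Count rows per Player for one play.

    Assumption: the Player column names the speaker and is empty for
    rows that are not spoken lines.
    """
    counts = Counter()
    for row in rows:
        if row["Title"] == title and row["Player"]:
            counts[row["Player"]] += 1
    return counts


def top_players(path, title, n=5):
    """Return the n players with the most rows in the given play."""
    # newline="" is what the csv module recommends for files it reads.
    with open(path, newline="", encoding="utf-8") as f:
        return lines_per_player(read_rows(f), title).most_common(n)
```

From the repository root, `top_players("csv/all.csv", title)` would return the most frequent speakers for whatever title string the CSV actually uses. The UTF-8 encoding is also an assumption.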
The script [bin/generate_csv.py](bin/generate_csv.py) is used to generate the CSV files.
The script [bin/fetch_html.py](bin/fetch_html.py) is used to fetch the HTML files from MIT.