https://github.com/kallewesterling/drag-data-1930s

data-analysis-python dataset digital-humanities historical-newspapers history network-analysis newspapers performing-arts

Host: GitHub
URL: https://github.com/kallewesterling/drag-data-1930s
Owner: kallewesterling
Created: 2020-07-03T21:54:44.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2023-02-11T00:32:05.000Z (almost 2 years ago)
Last Synced: 2024-11-13T14:23:01.112Z (2 months ago)
Topics: data-analysis-python, dataset, digital-humanities, historical-newspapers, history, network-analysis, newspapers, performing-arts
Language: Jupyter Notebook
Homepage: https://kallewesterling.github.io/drag-data-browser
Size: 134 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 21
Metadata Files:
- Readme: README.md

# Drag data in the 1930s

[![ci](https://github.com/kallewesterling/drag-data-1930s/actions/workflows/cy.yml/badge.svg)](https://github.com/kallewesterling/drag-data-1930s/actions/workflows/cy.yml)

A GitHub Actions workflow generates the visualization files in a separate repository. To see the final visualization, visit [this website](https://kallewesterling.github.io/drag-network/).

## Running on local machines

### Step 1. Clone the Correct Repository

This repository contains the source code, but the most recent "compiled" `network-app` is located in the [`drag-network` repository](https://github.com/kallewesterling/drag-network). The easiest approach is therefore to clone that repository:

```sh
$ git clone https://github.com/kallewesterling/drag-network
```

### Step 2. Processing dataset

Because the `drag-network` repository already ships with the latest processed dataset, you do not need to process the dataset yourself.

~~To run the analysis, clone this package and run in your terminal:~~

~~$ python generate-cooccurrence-data.py~~

### Step 3. Run server

Navigate into the cloned directory:

```sh
$ cd drag-network
```

Then start a local HTTP server:

```sh
$ python -m http.server
```

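_Note that this only works with Python 3._ By default the server listens on port 8000, so the app should then be reachable at `http://localhost:8000`.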
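If you would rather start the server from a Python script, the standard library exposes the same functionality. The following is a minimal sketch, not part of this repository: the `make_server` helper and its defaults are assumptions made for illustration.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory=".", port=8000):
    """Build (but do not start) a static-file server rooted at `directory`.

    `make_server` is a hypothetical helper; it binds to localhost only.
    Passing port=0 asks the operating system for any free port.
    """
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#
#     server = make_server("drag-network")
#     server.serve_forever()
```

Binding to `127.0.0.1` keeps the server reachable only from your own machine, which is usually what you want for a local preview.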

## Who is the Researcher?

Kalle Westerling is a Ph.D. Candidate in Theatre and Performance at The Graduate Center, CUNY, where he works on a dissertation about the history and aesthetics of male-identified bodies in 20th-century burlesque and 21st-century boylesque. He is also the project manager for the NEH-funded project “Expanding Communities of Practice,” which aims to help create infrastructure for digital humanities at several higher education institutions across the U.S. [Read more about Kalle Westerling on his website.](https://westerling.nu/)

## What is this dataset?

The dataset was created through a combination of manual and automated steps. Searches were performed across a number of databases, the results were collated, and PDF files or images of the scanned newspapers were presented to the researcher (see above), who then manually coded the data into one row for each person who appeared on that particular date in that particular newspaper.

The dataset can be viewed in [this Google spreadsheet](https://docs.google.com/spreadsheets/d/1UlpFQ9WWA6_6X-RuMJ3vHdIbyqhCZ1VRYgcQYjXprAg/edit#gid=0).

The data was manually processed into each column of each row as follows.

Each row has some central data assigned to it, which includes:
- a date (in the format YYYY-MM-DD)
- a name of the performer
- a name of the venue
- if no venue is mentioned but a city is, the name of the city is filled in
- a source
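
As an illustration of the central fields listed above, here is a minimal sketch of one row as a Python record. The class name `Row`, its field names, and the `place` helper are hypothetical and are not the spreadsheet's actual column headers.

```python
import datetime
import re
from dataclasses import dataclass
from typing import Optional

# Strict YYYY-MM-DD shape; calendar validity is checked when parsing.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


@dataclass
class Row:
    """One coded observation: a performer named in one source on one date."""

    date: str                    # expected as YYYY-MM-DD
    performer: str
    source: str
    venue: Optional[str] = None  # may be missing in the source
    city: Optional[str] = None   # filled in when no venue is mentioned

    def parsed_date(self) -> datetime.date:
        """Return the date as a `datetime.date`, or raise ValueError."""
        if not ISO_DATE.fullmatch(self.date):
            raise ValueError(f"expected YYYY-MM-DD, got {self.date!r}")
        return datetime.date.fromisoformat(self.date)

    def place(self) -> Optional[str]:
        """Prefer the venue; fall back to the city if no venue was coded."""
        return self.venue or self.city
```

For example, `Row(date="1935-02-03", performer="Performer A", source="Source 1", city="City X").place()` falls back to the city because no venue is given.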

Optional data includes:
- If there is a revue name mentioned, it is also noted here.
- If there is a legal name mentioned for the given performer, the legal name is noted
- If there is an alleged age mentioned for the given performer, the alleged age (and, consequently, the assumed birth year) is noted
- ID number that identifies the source in the Entertainment Industry Magazine Archive (EIMA)
- How the source was found through a search in newspapers.com
- How the source was found through a search in Fulton archives
- How the source was found through a search in an already existing archive
- Edge comment, which refers to any comments on the source itself (meta)
- Whether the data point should be excluded from the final visualization
- Any interesting quotes from source
- Any interesting comments on the performer
- Any interesting comments on the venue
- Any interesting comments on the city
- Any interesting comments on the revue
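
The assumed birth year mentioned in the list above can be derived from the alleged age and the date of the source. The sketch below assumes the simplest possible rule (year of the source minus the alleged age); the README does not state the exact rule used, and the function name is hypothetical.

```python
import datetime


def assumed_birth_year(source_date: str, alleged_age: int) -> int:
    """Estimate a birth year from a YYYY-MM-DD source date and an alleged age.

    Assumption: the estimate is simply the source year minus the age. The
    true birth year may be one year earlier, depending on the birthday.
    """
    year = datetime.date.fromisoformat(source_date).year
    return year - alleged_age
```

Because only an age is reported, the result is an estimate and should be read alongside the "alleged" caveat in the source.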

Cleaned up data includes:
- Name of the performer
- Name of the venue
- Name of the city
- Source
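
Cleaned-up names matter because a network analysis depends on the same performer and venue being spelled identically in every row. The following sketch shows one way co-occurrence counts could be computed from cleaned rows. It assumes that two performers co-occur when they are listed at the same venue on the same date; this is an illustrative definition, not necessarily the one used by this project's scripts, and the function name is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(rows):
    """Count how often each pair of performers shares a (date, venue).

    `rows` is an iterable of (date, performer, venue) tuples that use the
    cleaned-up names. Returns a Counter keyed by sorted performer pairs.
    """
    # Group performers by the date and venue they were listed at.
    billed = defaultdict(set)
    for day, performer, venue in rows:
        billed[(day, venue)].add(performer)

    # Every pair within a group counts as one co-occurrence.
    edges = Counter()
    for performers in billed.values():
        for a, b in combinations(sorted(performers), 2):
            edges[(a, b)] += 1
    return edges
```

Each key in the returned `Counter` can be treated as an edge in the network, with the count as its weight.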