https://github.com/bghorvath/TextMiningTheBechdelTest

Text mining movie scripts to explore long-term trend of female representation in movies according to the Bechdel test

bechdel bechdel-test coreference-resolution neuralcoref spacy

## Text mining movie scripts to perform the Bechdel test using spaCy

The Bechdel test is a measure of the representation of women in fiction. It asks whether a work features at least two female characters who talk to each other about something other than a man.
In this project I applied text mining techniques to 1500+ movie scripts, downloaded and parsed from the internet, and ran the test on each of them to explore the long-term trend of female representation in movies.
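
To make the three criteria of the test concrete, here is a minimal sketch in plain Python. It is not the pipeline used in this project: it assumes the script has already been split into scenes of speaker-tagged lines, that the set of female characters is known, and it swaps in a crude keyword check for "talking about a man" in place of real language processing. The names `Line`, `MALE_WORDS`, `mentions_man` and `passes_bechdel` are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately tiny word list; a real analysis would need
# far more than keyword matching to decide who a line is about.
MALE_WORDS = {"he", "him", "his", "man", "men", "boy", "boyfriend",
              "husband", "father"}


@dataclass
class Line:
    speaker: str  # character name as it appears in the script
    text: str     # the spoken dialogue


def mentions_man(text, male_names=frozenset()):
    """Return True if the line contains a male keyword or a known male name."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return bool(tokens & (MALE_WORDS | {name.lower() for name in male_names}))


def passes_bechdel(scenes, female_characters, male_names=frozenset()):
    """Check the three criteria on a list of scenes (each a list of Line).

    Assumption: two consecutive lines in the same scene by two different
    female characters count as those characters talking to each other.
    """
    for scene in scenes:
        for prev, cur in zip(scene, scene[1:]):
            two_women = (
                prev.speaker in female_characters
                and cur.speaker in female_characters
                and prev.speaker != cur.speaker
            )
            if (two_women
                    and not mentions_man(prev.text, male_names)
                    and not mentions_man(cur.text, male_names)):
                return True
    return False
```

Calling `passes_bechdel(scenes, {"ALICE", "BETH"})` returns `True` as soon as one exchange between the two women avoids every word in `MALE_WORDS`. The keyword check is the weakest part of the sketch, since it misses indirect references and flags harmless uses of the listed words.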

Data files can be downloaded from:
https://drive.google.com/drive/folders/1konx-AYGYk2zGTdHR97vgQAl_IB2r9Q2

Data files not included:
- Raw, unprocessed movie scripts (2071 txt/pdf/rtf/doc files, ~1.1GB) - can be downloaded using the .py files
- Results evaluation from 3 judges
- Exported CSV files

The project report can be accessed at:
https://www.dropbox.com/s/96czpl7e5xerhtp/IRTM_project_report.pdf?dl=0

Inspiration for the data structures was taken from:
https://www.youtube.com/watch?v=jRKKPYDs44o