{"id":18881615,"url":"https://github.com/ecrows/voices_of_chicago","last_synced_at":"2025-06-22T01:41:00.681Z","repository":{"id":124593509,"uuid":"124802540","full_name":"ecrows/voices_of_chicago","owner":"ecrows","description":"Statistical analysis of dialogue by gender within the 2002 film \"Chicago\", winner of \"Best Picture\" at the Academy Awards in 2003.","archived":false,"fork":false,"pushed_at":"2020-09-30T17:59:53.000Z","size":73,"stargazers_count":0,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-12-31T03:27:50.729Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ecrows.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-11T21:47:46.000Z","updated_at":"2018-03-11T23:30:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"38e8ceba-a028-4bf7-9e7c-91bc8a1aadfd","html_url":"https://github.com/ecrows/voices_of_chicago","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ecrows%2Fvoices_of_chicago","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ecrows%2Fvoices_of_chicago/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ecrows%2Fvoices_of_chicago/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ecrows%2Fvoices_of_chicago/manifests","owner_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/owners/ecrows","download_url":"https://codeload.github.com/ecrows/voices_of_chicago/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239850446,"owners_count":19707348,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T06:50:21.988Z","updated_at":"2025-02-20T13:48:50.921Z","avatar_url":"https://github.com/ecrows.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Voices of \"Chicago\"\n\nBreakdown of lines by gender within the 2002 movie \"Chicago\".  Inspired by the [Polygraph project on Dialogue and Film](http://polygraph.cool/films/embed.html).  From the results, it appears to be the only movie to win \"Best Picture\" in the last 25+ years with more words spoken by female characters than male characters.\n\n## Requirements\n\n[Python3](https://www.python.org/downloads/)\n\nFor visualization, install numpy, matplotlib, python3-tk.\n\nOn Debian Linux:\n\n```\nsudo apt-get install python3 python3-tk\nwget https://bootstrap.pypa.io/get-pip.py\npython3 get-pip.py\nsudo pip3 install numpy\nsudo pip3 install matplotlib\n```\n\n## Methodology\n\nData sourced from [the film script available here](http://www.cswap.com/2002/Chicago/song/Full_Script).\n\nThe data was sanitized and formatted into the format present in script.txt.  
To my knowledge, the script is largely accurate, though minor inaccuracies in wording are likely.\n\nAttributing some lines to characters was somewhat challenging, given the musical nature of the movie and the formatting of the source data.  Some minor lines remain unassigned, either because the speaker's gender could not easily be determined from IMDb or because the line was spoken by a crowd in unison.  Please open an issue if you know the gender of the speaker of an unassigned line.\n\nIn the musical number \"All I Care About Is Love\", lines attributed to \"Billy Flynn, girls\" have been counted under \"Billy Flynn\" (other characters are performing backup vocals).\n\nA more rigorous transcription of the script and attribution of each line to its speaker would yield more exact results, but the provided data is sufficient to identify the overall division of lines.\n\n## Results\n\n### Final breakdown\n\n```\nMale characters: approximately 4791 words\nFemale characters: approximately 6317 words\nUnassigned: approximately 79 words\n```\n![Graph of results](https://github.com/ecrows/voices_of_chicago/blob/93131d89925f6ea755a27373666fab325e92c05c/util/results.png?raw=true)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fecrows%2Fvoices_of_chicago","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fecrows%2Fvoices_of_chicago","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fecrows%2Fvoices_of_chicago/lists"}