{"id":32639658,"url":"https://github.com/cogcomp/content-analysis-experiments","last_synced_at":"2025-10-31T02:14:28.590Z","repository":{"id":53884613,"uuid":"305768372","full_name":"CogComp/content-analysis-experiments","owner":"CogComp","description":null,"archived":false,"fork":false,"pushed_at":"2021-08-04T14:36:17.000Z","size":129,"stargazers_count":2,"open_issues_count":2,"forks_count":1,"subscribers_count":7,"default_branch":"master","last_synced_at":"2023-07-31T20:06:47.248Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CogComp.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-20T16:25:58.000Z","updated_at":"2023-07-31T20:06:47.249Z","dependencies_parsed_at":"2022-08-30T19:51:01.087Z","dependency_job_id":null,"html_url":"https://github.com/CogComp/content-analysis-experiments","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/CogComp/content-analysis-experiments","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CogComp%2Fcontent-analysis-experiments","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CogComp%2Fcontent-analysis-experiments/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CogComp%2Fcontent-analysis-experiments/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CogComp%2Fcontent-analysis-experiments/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CogComp","download_url":"https://codeload.github.com/CogComp/co
ntent-analysis-experiments/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CogComp%2Fcontent-analysis-experiments/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281914598,"owners_count":26583094,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-31T02:00:07.401Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-31T02:14:08.164Z","updated_at":"2025-10-31T02:14:28.582Z","avatar_url":"https://github.com/CogComp.png","language":"Python","readme":"This repository contains the code to reproduce the results from [Understanding the Extent to which Summarization Evaluation Metrics Measure the Information Quality of Summaries](https://arxiv.org/abs/2010.12495).\nThe code is based on an early version of the [SacreROUGE](https://github.com/danieldeutsch/sacrerouge) library, which has changed somewhat significantly since.\nThe version of ROUGE which was decomposed into different parts based on POS, dependency labels, etc., has also been included in SacreROUGE.\nSee [here](https://github.com/danieldeutsch/decomposed-rouge).\n\n## Environment\nFirst, run these commands to set up the environment:\n```\npip install -r requirements.txt\npython -m spacy download en_core_web_sm\n```\n\n## Data\nThen prepare the data.\nWe use the outputs from two models on the 
CNN/DailyMail dataset, which can be set up using this command:\n```\nsh datasets/cnndm/setup.sh\n```\nThe data will be downloaded and reformatted.\n\nThe other datasets are TAC 2008 and 2009.\nDue to license restrictions, we cannot release these datasets.\nHowever, if you have access to them, follow the instructions [here](./datasets/tac/Readme.md) for how to set up the data.\n\n## Experiments\nThe proportion of ROUGE/BERTScore that can be explained by matches between SCUs (Section 4) can be calculated by running\n```\nsh experiments/scu-comparison/run.sh\n```\nThe output plots will be in `experiments/scu-comparison/output/{tac2008,tac2009}/plots` and aggregate statistics in `experiments/scu-comparison/output/{tac2008,tac2009}/stats.json`.\n\nThe contributions of each category to the overall score, the contributions of each category type to the overall score, and the difference in categories between the two systems trained on CNN/DM can be calculated using:\n```\nsh experiments/metric-decomposition/{cnndm,tac}/run.sh\n```\nThe respective directories will contain an `output` folder with the results written to `.tex` files, which were used to create the tables in the paper.\n\nThe pairwise correlations between all of the metrics can be calculated using the [SacreROUGE](https://github.com/danieldeutsch/sacrerouge) library.\nThe scripts are not included here.\n\n## Notes\nOur experiments using BERTScore are based on [our fork](https://github.com/danieldeutsch/bert_score_content_analysis) of the [official repository](https://github.com/Tiiiger/bert_score), which includes code to return the alignment used by BERTScore.\nThe `requirements.txt` will automatically install our fork.\nBecause BERTScore aligns subword tokens, we aggregate the alignment at the token level.\nFurther, BERTScore automatically adds BOS and EOS tokens to the sequence.\nOur processing code adds dummy tokens after calculating the alignment to account for 
this.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcogcomp%2Fcontent-analysis-experiments","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcogcomp%2Fcontent-analysis-experiments","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcogcomp%2Fcontent-analysis-experiments/lists"}