{"id":18423332,"url":"https://github.com/webis-de/summary-explorer","last_synced_at":"2025-04-07T15:32:44.402Z","repository":{"id":39699745,"uuid":"382302191","full_name":"webis-de/summary-explorer","owner":"webis-de","description":"Summary Explorer is a tool to visually explore the state-of-the-art in text summarization.","archived":false,"fork":false,"pushed_at":"2024-05-13T19:07:29.000Z","size":48200,"stargazers_count":44,"open_issues_count":1,"forks_count":7,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-03-22T20:26:26.849Z","etag":null,"topics":["evaluation","summarization"],"latest_commit_sha":null,"homepage":"https://tldr.webis.de/","language":"CSS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/webis-de.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-02T09:51:14.000Z","updated_at":"2025-01-14T10:15:50.000Z","dependencies_parsed_at":"2024-04-17T01:58:52.566Z","dependency_job_id":null,"html_url":"https://github.com/webis-de/summary-explorer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/webis-de%2Fsummary-explorer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/webis-de%2Fsummary-explorer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/webis-de%2Fsummary-explorer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/webis-de%2Fsummary-explorer/manifests","owner_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/owners/webis-de","download_url":"https://codeload.github.com/webis-de/summary-explorer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247679868,"owners_count":20978146,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["evaluation","summarization"],"created_at":"2024-11-06T04:36:50.277Z","updated_at":"2025-04-07T15:32:39.390Z","avatar_url":"https://github.com/webis-de.png","language":"CSS","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Summary Explorer\n[Summary Explorer](https://tldr.webis.de/) is a tool to visually inspect the summaries from several state-of-the-art neural summarization models across multiple datasets. It provides a guided assessment of summary quality dimensions such as coverage, faithfulness and position bias. You can inspect summaries from a single model or compare multiple models. \n\nThe tool currently hosts the outputs of [55 summarization models](https://tldr.webis.de/models) across three datasets: [CNN DailyMail](https://huggingface.co/datasets/cnn_dailymail), [XSum](https://huggingface.co/datasets/xsum), and [Webis TL;DR](https://huggingface.co/datasets/reddit).\n\nTo integrate your model in Summary Explorer, please prepare your summaries as described [here](https://tldr.webis.de/about) and contact us.\n\n\u003eAccepted at EMNLP 2021 (Demo track). A pre-print version of the paper is available [here](https://arxiv.org/abs/2108.01879).\n\n**Update 17.03.2022**\n1. Refactored the text processing pipeline.\n2. 
Updated [local deployment instructions](https://github.com/webis-de/summary-explorer/blob/main/README.md#adding-your-own-model-locally) for custom models.\n\n\n### Use cases\n\n**1. View Content Coverage of the Summaries**\n![Content Coverage](ui/frontend/static/frontend/images/gifs/q1.gif)\n\n\n**2. Inspect Hallucinations**\n![Hallucinations](ui/frontend/static/frontend/images/gifs/q2.gif)\n\n**3. View Named Entity Coverage of the Summaries**\n![Named Entity Coverage](ui/frontend/static/frontend/images/gifs/q3.gif)\n\n\n**4. Inspect Faithfulness via Relation Alignment**\n![Relation Coverage](ui/frontend/static/frontend/images/gifs/q4.gif)\n\n**5. Compare Agreement among Summaries**\n![Summary Agreement](ui/frontend/static/frontend/images/gifs/q5.gif)\n\n**6. View Position Bias of a Model**\n![Position Bias](ui/frontend/static/frontend/images/gifs/q6.gif)\n\n### Adding your own model locally\n\n**Text processing**\n\nRun the 5-step text processing pipeline from the `text-processing` sub-directory in the order shown below. Each step reads the output directory of the previous step.\n\n1. Tokenization, sentence segmentation, named entity recognition, relation extraction, flattening redundant relations\n   `python3 step_1_nlp_pipeline.py --input_dir ../data/raw_files/ --output_dir ../data/nlp-processed/`\n\n2. Lexical alignment of the summary with the source document using ROUGE\n   `python3 step_2_lexical_alignment.py --input_dir ../data/nlp-processed/ --output_dir ../data/lexical-alignments/`\n\n3. Semantic alignment of the summary with the source document using BERTScore\n   `python3 step_3_semantic_alignment.py --input_dir ../data/lexical-alignments/ --output_dir ../data/semantic-alignments/`\n\n4. ROUGE scores\n   `python3 step_4_automatic_evaluation.py --input_dir ../data/semantic-alignments/ --output_dir ../data/automatic-metrics/`\n\n5. 
Summary compression, summary factuality (entity and relation level), n-gram abstractiveness\n   `python3 step_5_document_overlap_metrics.py --input_dir ../data/automatic-metrics/ --output_dir ../data/document-overlap-metrics/`\n\nNext, create a `models_details.jsonl` file that contains metadata about your models. For example:\n\n```\n{\"name\": \"model_1\", \"title\": \"model_1 title\", \"abstract\": \"TBD\", \"human evaluation\": \"TBD\", \"url\": \"\"}\n{\"name\": \"model_2\", \"title\": \"model_2 title\", \"abstract\": \"TBD\", \"human evaluation\": \"TBD\", \"url\": \"\"}\n{\"name\": \"references\", \"title\": \"References\", \"abstract\": \"TBD\", \"human evaluation\": \"TBD\", \"url\": \"\"}\n```\n\nThen, create a `config.json` file with the paths to the processed articles, the summaries, and the `models_details.jsonl` file. This file must also contain your dataset description. For example:\n\n```\n{\n  \"dataset\": {\n    \"name\": \"Dataset X\",\n    \"description\": \"DATASET \\n N Articles \\n M Models\"\n  },\n  \"path_to_models_details_file\": \"models_details.jsonl\",\n  \"path_to_articles_file\": \"articles.jsonl\",\n  \"path_to_summaries_files\": {\n    \"references\": \"references.jsonl\",\n    \"model_1\": \"model_1.jsonl\",\n    \"model_2\": \"model_2.jsonl\"\n  }\n}\n```\n\n**Setting up the Django app and importing to the local database**\n\nIn the `ui` sub-directory:\n\n1. Install the dependencies\n\n   `pip install -r requirements.txt`\n\n2. Create a PostgreSQL database\n\n   ``````\n   psql --username=postgres\n   CREATE DATABASE sumviz;\n   ``````\n\n3. To import the database dump [file](https://files.webis.de/summary-explorer/database/dbexport.sql) of all 55 models hosted online, run\n   `psql -h hostname -d sumviz -U username -f dbexport.sql`\n\n4. To import your own models (via the `config.json` file created above), run the `import_dataset` command\n   `python manage.py import_dataset -c PATH-TO-config.json`\n\n5. 
Create a `.env` file with your database settings. For example:\n\n   ``` \n   DEBUG=0\n   SECRET_KEY=**********\n   DJANGO_ALLOWED_HOSTS=localhost 127.0.0.1 [::1]\n   SQL_ENGINE=django.db.backends.postgresql_psycopg2\n   SQL_DATABASE=sumviz\n   SQL_USER=postgres\n   SQL_PASSWORD=****\n   SQL_HOST=db\n   SQL_PORT=5432\n   DATABASE=postgres\n   ```\n\n6. Start the server\n   `python manage.py runserver`\n7. Visit http://127.0.0.1:8000/\n\n\n\n**Note**: The tool is in active development and we plan to add new features. Please feel free to report any issues and provide suggestions.\n\n### Citation\n```\n@inproceedings{syed:2021,\n    title = \"Summary Explorer: Visualizing the State of the Art in Text Summarization\",\n    author = {Syed, Shahbaz  and\n      Yousef, Tariq  and\n      Al Khatib, Khalid  and\n      J{\\\"a}nicke, Stefan  and\n      Potthast, Martin},\n    booktitle = \"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations\",\n    year = \"2021\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2021.emnlp-demo.22\"\n}\n\n```\n\n\n### Acknowledgements\nWe sincerely thank all the authors who made their code and model outputs publicly available. We also acknowledge the meta-evaluations of [Fabbri et al., 2020](https://github.com/Yale-LILY/SummEval) and [Bhandari et al., 2020](https://github.com/neulab/REALSumm), and the summarization leaderboard at [NLP-Progress](https://nlpprogress.com/english/summarization.html).\n\nWe hope this encourages more authors to share their models and summaries to help track the *qualitative progress* in text summarization research.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwebis-de%2Fsummary-explorer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwebis-de%2Fsummary-explorer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwebis-de%2Fsummary-explorer/lists"}