{"id":31156111,"url":"https://github.com/aborroy/summary-comparison-tool","last_synced_at":"2025-09-18T20:54:57.404Z","repository":{"id":306447945,"uuid":"1026227823","full_name":"aborroy/summary-comparison-tool","owner":"aborroy","description":"Comparing the quality of two summaries against a source Markdown document","archived":false,"fork":false,"pushed_at":"2025-08-11T06:38:17.000Z","size":20,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-11T08:34:59.401Z","etag":null,"topics":["nlp","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aborroy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-25T14:18:21.000Z","updated_at":"2025-08-11T06:38:20.000Z","dependencies_parsed_at":"2025-07-25T21:31:10.542Z","dependency_job_id":"09b78cbf-93ac-43e4-b333-a02ce8076943","html_url":"https://github.com/aborroy/summary-comparison-tool","commit_stats":null,"previous_names":["aborroy/summary-comparison-tool"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aborroy/summary-comparison-tool","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fsummary-comparison-tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fsummary-comparison-tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fsummary-comparison-tool/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fsummary-comparison-tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aborroy","download_url":"https://codeload.github.com/aborroy/summary-comparison-tool/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aborroy%2Fsummary-comparison-tool/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275830188,"owners_count":25536280,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-18T02:00:09.552Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nlp","python"],"created_at":"2025-09-18T20:54:54.067Z","updated_at":"2025-09-18T20:54:57.388Z","avatar_url":"https://github.com/aborroy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Summary Comparison Tool\n\nA comprehensive evaluation tool for comparing the quality of two summaries against a source Markdown document using multiple AI-powered metrics.\n\n## Features\n\n- **Multi-metric evaluation**: BARTScore, semantic similarity, coverage, conciseness, and factual consistency\n- **Markdown support**: Direct processing of Markdown documents with proper text extraction\n- **Flexible scoring**: Weighted combination of multiple evaluation criteria\n- **GPU acceleration**: CUDA support for faster processing\n- 
**Detailed analysis**: Optional breakdown of individual metrics\n\n## Installation\n\n### Prerequisites\n\n- Python 3.7+\n- PyTorch (CPU or GPU version)\n\nor\n\n- Docker 4.40+\n\n### Setup with local deployment\n\n1. **Clone the repository**\n```bash\ngit clone https://github.com/aborroy/summary-comparison-tool.git\ncd summary-comparison-tool\n```\n\n2. **Create a virtual environment**\n```bash\npython3 -m venv venv\nsource venv/bin/activate\n```\n\n3. **Install Python dependencies**\n```bash\npip install torch transformers sentence-transformers markdown beautifulsoup4 tqdm numpy\n```\n\n4. **Clone BARTScore dependency**\n```bash\ngit clone https://github.com/neulab/BARTScore.git\n```\n\n### GPU Support (Optional)\n\nFor CUDA acceleration, install PyTorch with GPU support:\n```bash\npip install torch --index-url https://download.pytorch.org/whl/cu118\n```\n\n## Usage\n\n### Basic Comparison\n\n```bash\npython summary_comparison.py document.md \"First summary text\" \"Second summary text\"\n```\n\n### With GPU Acceleration\n\n```bash\npython summary_comparison.py document.md \"First summary\" \"Second summary\" --device cuda\n```\n\n### Detailed Analysis\n\n```bash\npython summary_comparison.py document.md \"First summary\" \"Second summary\" --detailed\n```\n\n## Running with Docker\n\n### How to use (CPU)\n\n```bash\ndocker build -t summary-compare .\ndocker run --rm -v \"$PWD\":/work -w /work summary-compare \\\n  examples/sample_document.md \"First summary\" \"Second summary\"\n```\n\n### GPU (optional)\n\nIf you want CUDA, build with the CUDA wheel index and run with NVIDIA GPU access:\n\n```bash\ndocker build --build-arg TORCH_INDEX_URL=https://download.pytorch.org/whl/cu121 \\\n  -t summary-compare-gpu .\ndocker run --rm --gpus all -v \"$PWD\":/work -w /work summary-compare-gpu \\\n  examples/sample_document.md \"A\" \"B\" --device cuda\n```\n\n\u003e Note. Docker Desktop 4.44 switched the default builder to a containerized Buildx driver. 
With that driver, `docker build` doesn’t load the image into your local image store unless you ask it to. With this Docker version, build with `--load`:\n\n```bash\ndocker build --load -t summary-compare .\n```\n\n## Example Output\n\n### Standard Output\n```\n============================================================\nSUMMARY EVALUATION RESULTS\n============================================================\nSummary 1 Overall Score: -0.167\nSummary 2 Overall Score: -0.389\n\n* Better Summary: Summary 1\n  Margin: +0.222\n```\n\n### Detailed Output (with `--detailed` flag)\n```\n============================================================\nSUMMARY EVALUATION RESULTS\n============================================================\n\nSUMMARY 1 BREAKDOWN:\n  BARTScore:           -1.727\n  Semantic Similarity: 0.527\n  Coverage:            0.187\n  Conciseness:         0.800\n  Factual Consistency: 0.923\n  * Overall Score:     -0.167\n\nSUMMARY 2 BREAKDOWN:\n  BARTScore:           -2.447\n  Semantic Similarity: 0.476\n  Coverage:            0.160\n  Conciseness:         1.000\n  Factual Consistency: 0.857\n  * Overall Score:     -0.389\n\n* Better Summary: Summary 1\n  Margin: +0.222\n```\n\n## Evaluation Metrics\n\nThe tool evaluates summaries across five key dimensions:\n\n### 1. BARTScore (Weight: 30%)\n- Semantic similarity using a BART model\n- Measures how well the summary captures document meaning\n- Higher (less negative) scores indicate better semantic alignment\n\n### 2. Semantic Similarity (Weight: 25%)\n- Sentence embedding-based similarity\n- Uses sentence-transformers for deep semantic understanding\n- Falls back to word overlap if sentence-transformers is unavailable\n\n### 3. Coverage Score (Weight: 25%)\n- Measures how well the summary covers key document content\n- Based on important keyword overlap\n- Filters out common stop words for better accuracy\n\n### 4. 
Conciseness Score (Weight: 10%)\n- Evaluates appropriate compression ratio\n- Optimal range: 10-30% of original document length\n- Penalizes both over-compression and verbosity\n\n### 5. Factual Consistency (Weight: 10%)\n- Checks if summary facts appear in source document\n- Focuses on numbers and proper nouns\n- Helps identify potential hallucinations\n\n## Command Line Options\n\n| Option | Description | Default |\n|--------|-------------|---------|\n| `md_file` | Path to Markdown source document | Required |\n| `summary1` | First candidate summary text | Required |\n| `summary2` | Second candidate summary text | Required |\n| `--device` | Computation device (`cpu`, `cuda`, `cuda:0`) | `cpu` |\n| `--detailed` | Show detailed metric breakdown | `false` |\n\n## File Structure\n\n```\nsummary-comparison-tool/\n├── summary_comparison.py            # Main evaluation script\n├── README.md                        # This file\n├── requirements.txt                 # Python dependencies\n├── examples/                        # Example documents and summaries\n│   ├── sample_document.md\n│   ├── good_summary.txt\n│   └── best_summary.txt\n└── BARTScore/                      # Git submodule (clone separately)\n```\n\n## Requirements\n\nSee `requirements.txt` for complete dependency list:\n\n```txt\ntorch\u003e=1.9.0\ntransformers\u003e=4.20.0\nsentence-transformers\u003e=2.2.0\nmarkdown\u003e=3.4.0\nbeautifulsoup4\u003e=4.11.0\ntqdm\u003e=4.64.0\nnumpy\u003e=1.21.0\n```\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Acknowledgments\n\n- [BARTScore](https://github.com/neulab/BARTScore) for semantic evaluation\n- [Sentence Transformers](https://www.sbert.net/) for embedding-based similarity\n- [Hugging Face Transformers](https://huggingface.co/transformers/) for model infrastructure\n\n## Related Work\n\n- **ROUGE**: Traditional n-gram based evaluation metrics\n- **BERTScore**: BERT-based semantic 
similarity\n- **BLEURT**: Learned evaluation metric for text generation\n- **Factual Consistency**: Various approaches for hallucination detection","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faborroy%2Fsummary-comparison-tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faborroy%2Fsummary-comparison-tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faborroy%2Fsummary-comparison-tool/lists"}