{"id":18631003,"url":"https://github.com/aimagelab/pacscore","last_synced_at":"2025-10-18T20:25:22.928Z","repository":{"id":82715494,"uuid":"594803977","full_name":"aimagelab/pacscore","owner":"aimagelab","description":"[CVPR 2023 \u0026 IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation","archived":false,"fork":false,"pushed_at":"2025-07-29T11:59:57.000Z","size":7506,"stargazers_count":62,"open_issues_count":3,"forks_count":8,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-07-29T14:00:46.735Z","etag":null,"topics":["captioning","captioning-images","captioning-videos","computer-vision","cvpr","cvpr2023","vision-and-language"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aimagelab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-01-29T17:35:04.000Z","updated_at":"2025-07-29T12:18:36.000Z","dependencies_parsed_at":null,"dependency_job_id":"b9f74768-cb3b-4d36-91ce-9c807e1d3d9b","html_url":"https://github.com/aimagelab/pacscore","commit_stats":{"total_commits":18,"total_committers":4,"mean_commits":4.5,"dds":"0.38888888888888884","last_synced_commit":"df633a306782f95bc790e6080b93e9c0e2ead8c1"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aimagelab/pacscore","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fpacscore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fpacscore/tags"
,"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fpacscore/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fpacscore/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aimagelab","download_url":"https://codeload.github.com/aimagelab/pacscore/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fpacscore/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279609827,"owners_count":26199048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-18T02:00:06.492Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["captioning","captioning-images","captioning-videos","computer-vision","cvpr","cvpr2023","vision-and-language"],"created_at":"2024-11-07T05:05:31.109Z","updated_at":"2025-10-18T20:25:22.915Z","avatar_url":"https://github.com/aimagelab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003ch1\u003ePAC-Score: Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation\u003c/br\u003e(CVPR 2023 \u0026 IJCV 2025)\u003c/h1\u003e\n\n\n\u003ca href=\"https://pytorch.org/get-started/locally/\"\u003e\u003cimg alt=\"PyTorch\" 
src=\"https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch\u0026logoColor=white\"\u003e\u003c/a\u003e\n[![Conference](https://img.shields.io/badge/CVPR-2023(Highlight)-f9f107.svg)](https://openaccess.thecvf.com/content/CVPR2023/html/Sarto_Positive-Augmented_Contrastive_Learning_for_Image_and_Video_Captioning_Evaluation_CVPR_2023_paper.html)\n[![Paper](https://img.shields.io/badge/Paper-arxiv.2303.12112-B31B1B.svg)](https://arxiv.org/abs/2303.12112)\n  \n\u003c/div\u003e\n\nThis repository contains the reference code for the main paper and its extension:\n* [Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation](https://arxiv.org/abs/2303.12112), **CVPR 2023 Highlight✨** (top 2.5% of initial submissions and top 10% of accepted papers). \n* [Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training](https://arxiv.org/abs/2410.07336), **IJCV 2025**. \n\nPlease cite with the following BibTeX:\n```\n@inproceedings{sarto2023positive,\n  title={{Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation}},\n  author={Sarto, Sara and Barraco, Manuele and Cornia, Marcella and Baraldi, Lorenzo and Cucchiara, Rita},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  year={2023}\n}\n```\n```\n@inproceedings{sarto2024positive,\n  title={{Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training}},\n  author={Sarto, Sara and Moratelli, Nicholas and Cornia, Marcella and Baraldi, Lorenzo and Cucchiara, Rita},\n  booktitle={arxiv},\n  year={2024}\n}\n```\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"images/model.png\" alt=\"PACS\" width=\"820\" /\u003e\n\u003c/p\u003e \n\nTry out the [web demo](https://ailb-web.ing.unimore.it/pacscore), built with [Gradio](https://github.com/gradio-app/gradio). 
\n\n## Environment Setup\nClone the repository and create the ```pacs``` conda environment using the ```environment.yml``` file:\n\n\n```\nconda env create -f environment.yml\nconda activate pacs\n```\n\n## Loading CLIP Models\n\nCheckpoints of different backbones are available at [this link](https://drive.google.com/drive/folders/15Da_nh7CYv8xfryIdETG6dPFSqcBiqpd?usp=sharing).\n\nOnce you have downloaded the checkpoints, place them under the ```checkpoints/``` folder.\n\n\u003ctable style=\"border-collapse: collapse; width: auto; border: none;\"\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"padding: 0; border: none;\"\u003e\n      \u003ctable style=\"border-collapse: collapse; width: auto; border: none;\"\u003e\n        \u003ctr\u003e\n          \u003ctd rowspan=\"2\" style=\"border: none;\"\u003e\u003cb\u003ePAC-S\u003c/b\u003e\u003c/td\u003e\n          \u003ctd style=\"border: none;\"\u003e\u003cb\u003eCLIP ViT-B/32\u003c/b\u003e\u003c/td\u003e\n          \u003ctd style=\"border: none;\"\u003e\n            \u003ca href=\"https://drive.google.com/file/d/1F-0Pma-vfJPAiDzeyl-iEdSXZIO1cDae/view?usp=drive_link\" target=\"_blank\"\u003eclip_ViT-B-32.pth\u003c/a\u003e\n          \u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd style=\"border: none;\"\u003e\u003cb\u003eOpenCLIP ViT-L/14\u003c/b\u003e\u003c/td\u003e\n          \u003ctd style=\"border: none;\"\u003e  \n          \u003ca href=\"https://drive.google.com/file/d/1F-0Pma-vfJPAiDzeyl-iEdSXZIO1cDae/view?usp=drive_link\" target=\"_blank\"\u003eopenClip_ViT-L-14.pth\u003c/a\u003e\n          \u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd rowspan=\"2\" style=\"border: none;\"\u003e\u003cb\u003ePAC-S++\u003c/b\u003e\u003c/td\u003e\n          \u003ctd style=\"border: none;\"\u003e\u003cb\u003eCLIP ViT-B/32\u003c/b\u003e\u003c/td\u003e\n          \u003ctd style=\"border: none;\"\u003e\n            \u003ca 
href=\"https://ailb-web.ing.unimore.it/publicfiles/pac++/PAC++_clip_ViT-B-32.pth\" target=\"_blank\"\u003ePAC++_clip_ViT-B-32.pth\u003c/a\u003e\n          \u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd style=\"border: none;\"\u003e\u003cb\u003eCLIP ViT-L/14\u003c/b\u003e\u003c/td\u003e\n          \u003ctd style=\"border: none;\"\u003e\n            \u003ca href=\"https://ailb-web.ing.unimore.it/publicfiles/pac++/PAC++_clip_ViT-L-14.pth\" target=\"_blank\"\u003ePAC++_clip_ViT-L-14.pth\u003c/a\u003e\n          \u003c/td\u003e\n        \u003c/tr\u003e\n      \u003c/table\u003e\n    \u003c/td\u003e\n    \u003ctd style=\"padding: 0; border: none;\"\u003e\n      \u003cimg src=\"images/radar_new.png\" alt=\"Model Image\" width=\"360\" style=\"display: block; border: none;\"/\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n\n\n\n\n## Data Preparation\n\nAn example set of inputs, including a candidate json, image directory, and references json is provided in this repository under ```example/```. 
The input files are formatted as follows.\n\nThe candidates json should be a dictionary that maps from {\"image_identifier\": \"candidate_captions\"}:\n```\n{\"image1\": \"A white dog is laying on the ground with its head on its paws .\",\n  ...}\n```\nThe image directory should be a directory containing the images that act as the keys in the candidates json:\n```\nimages/\n├── image1.jpg\n└── image2.jpg\n```\nThe references json should be a dictionary that maps from {\"image_identifier\": [\"list\", \"of\", \"references\"]}:\n```\n{\"image1\":\n    [\n        \"A closeup of a white dog that is laying its head on its paws .\",\n        \"a large white dog lying on the floor .\", \n        \"A white dog has its head on the ground .\",\n        \"A white dog is resting its head on a tiled floor with its eyes open .\",\n        \"A white dog rests its head on the patio bricks .\"\n    ]}\n```\n\n## Compute PAC-S\n\nRun ```python -u compute_metrics.py``` to obtain standard captioning metrics (_e.g._ BLEU, METEOR) and PAC-S.\n\nTo compute RefPAC-S, run ```python -u compute_metrics.py --compute_refpac```.\n\nThe default backbone is the CLIP ViT-B/32 model. To use a different backbone (_e.g._ OpenCLIP ViT-L/14), pass ```--clip_model open_clip_ViT-L/14``` on the command line. \n\nTo try the enhanced version, PAC++, you can follow these instructions. 
\n\n```\nBLEU-1: 0.6400\nBLEU-4: 0.3503\nMETEOR: 0.3057\nROUGE: 0.5012\nCIDER: 1.4918\nPAC-S: 0.8264\nRefPAC-S: 0.8393\n```\nWorse captions should get lower scores:\n\n``` \npython -u compute_metrics.py --candidates_json example/bad_captions.json --compute_refpac  \n\nBLEU-1: 0.4500\nBLEU-4: 0.0000\nMETEOR: 0.0995\nROUGE: 0.3268\nCIDER: 0.4259\nPAC-S: 0.5772\nRefPAC-S: 0.6357\n\n```\n## Human Correlation Scores\n\n#### Flickr8k\n\nThe Flickr8k dataset can be downloaded at [this link](https://drive.google.com/drive/folders/1oQY8zVCmf0ZGUfsJQ_OnqP2_kw1jGIXp?usp=sharing).\n\nOnce you have downloaded the dataset, place it under the ```datasets/flickr8k``` folder.\n\n\n#### Run Code and Expected Output\n\nRun ```python -u compute_correlations.py``` to compute correlation scores on the **Flickr8k-Expert** and **Flickr8k-CF** datasets.\n\n\n``` \nComputing correlation scores on dataset: flickr8k_expert\nBLEU-1   Kendall Tau-b: 32.175    Kendall Tau-c: 32.324\nBLEU-4   Kendall Tau-b: 30.599    Kendall Tau-c: 30.776\nMETEOR   Kendall Tau-b: 41.538    Kendall Tau-c: 41.822\nROUGE    Kendall Tau-b: 32.139    Kendall Tau-c: 32.314\nCIDER    Kendall Tau-b: 43.602    Kendall Tau-c: 43.891\nPAC-S    Kendall Tau-b: 53.919    Kendall Tau-c: 54.292\n\nComputing correlation scores on dataset: flickr8k_cf\nBLEU-1   Kendall Tau-b: 17.946    Kendall Tau-c: 9.256\nBLEU-4   Kendall Tau-b: 16.863    Kendall Tau-c: 8.710\nMETEOR   Kendall Tau-b: 22.269    Kendall Tau-c: 11.510\nROUGE    Kendall Tau-b: 19.903    Kendall Tau-c: 10.274\nCIDER    Kendall Tau-b: 24.619    Kendall Tau-c: 12.724\nPAC-S    Kendall Tau-b: 36.037    Kendall Tau-c: 18.628\n```\n\nFor the reference-based version of PAC-S (RefPAC-S), add ```--compute_refpac```.\n\n## Compute PAC-S++\n\nRun ```python -u compute_correlations_pac++.py``` to compute correlation scores on the **Flickr8k-Expert** and **Flickr8k-CF** datasets. 
\n\nFor the reference-based version of PAC-S++, add ```--compute_refpac```.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faimagelab%2Fpacscore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faimagelab%2Fpacscore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faimagelab%2Fpacscore/lists"}