{"id":17187491,"url":"https://github.com/davidmchan/aloha","last_synced_at":"2025-04-13T19:08:16.357Z","repository":{"id":230510719,"uuid":"779532248","full_name":"DavidMChan/aloha","owner":"DavidMChan","description":"A new reliable, localizable, and generalizable metric for hallucination detection in image captioning models.","archived":false,"fork":false,"pushed_at":"2024-07-22T22:27:57.000Z","size":16335,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-05T23:34:05.897Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DavidMChan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-30T04:25:27.000Z","updated_at":"2024-07-29T16:01:19.000Z","dependencies_parsed_at":"2024-03-30T06:31:38.064Z","dependency_job_id":"f482e1d5-c22f-49e7-80d8-501827868651","html_url":"https://github.com/DavidMChan/aloha","commit_stats":null,"previous_names":["davidmchan/aloha"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidMChan%2Faloha","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidMChan%2Faloha/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidMChan%2Faloha/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavidMChan%2Faloha/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DavidMChan","download_url":"https://codeload.github.com/DavidMChan/aloha/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240113736,"owners_count":19749828,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-15T01:06:31.721Z","updated_at":"2025-02-23T22:33:18.866Z","avatar_url":"https://github.com/DavidMChan.png","language":"Python","readme":"# ALOHa: A New Measure for Hallucination in Captioning Models\n\n### [Project](https://davidmchan.github.io/aloha/) | [Paper](https://arxiv.org/abs/2404.02904)\n\nOfficial implementation of the paper: [\"ALOHa: A New Measure for Hallucination in Captioning Models\"](https://arxiv.org/abs/2404.02904).\n\u003cbr\u003e\n\nDespite recent advances in multimodal pre-training for visual description, state-of-the-art models still produce captions containing errors, such as hallucinating objects not present in a scene. The existing prominent metric for object hallucination, CHAIR, is limited to a fixed set of MS COCO objects and synonyms. 
## Getting started

### Setup

```bash
# Install this package from GitHub
pip install git+https://github.com/DavidMChan/aloha.git

# Install the spaCy model if you haven't already
pip install -U spacy
python -m spacy download en_core_web_lg
```

### Usage

To compute the ALOHa score for a single caption:

```python
from aloha.metrics import ALOHa
from aloha.object_parser import GPT35TurboObjectParser
from aloha.string_similarity import MPNetSimilarity

# Initialize the ALOHa metric
evaluator = ALOHa(
    name="aloha",
    object_parser=GPT35TurboObjectParser,
    similarity_measure=MPNetSimilarity,
    num_reference_examples=3,
    num_target_examples=3,
    detect_objects=True,
)

candidate_caption = "A cat is sitting on a table"
reference_captions = ["A dog is sitting on a table", "A hound is sitting on a table"]
optional_image_path = None
optional_precomputed_detections = None

# Compute the ALOHa score
score, matches = evaluator(
    target=candidate_caption,
    references=reference_captions,
    image_path=optional_image_path,
    object_detections=optional_precomputed_detections,
)

print(score)
# 0.6081229448318481

print(matches)
# {'matches': [{'ref_word': 'table', 'similarity': 1.0, 'target_word': 'table'},
#              {'ref_word': 'dog',
#               'similarity': 0.6081229448318481,
#               'target_word': 'cat'}],
#  'reference_objects': [['dog'],
#                        ['dog'],
#                        ['table'],
#                        ['table'],
#                        ['hound'],
#                        ['hound']],
#  'target_objects': [['cat'], ['table']],
#  'unparsed_reference_objects': '- dog\n- table\n- hound',
#  'unparsed_target_objects': '- cat\n- table'}
```

To compute scores for a full dataset of samples, use the `evaluate-dataset` script. First, prepare your dataset as a JSON file with the following format:

```json
[
    {
        "caption": "A caption",
        "references": ["Ref 1", "Ref 2", ...],
        "image_path": "path/to/image.jpg"
    },
    ...
]
```
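If your data lives elsewhere (for example, in COCO-style annotations), a few lines of Python are enough to write this file. The helper below is a hypothetical sketch, not part of the package; whether `image_path` may be null when object detection is disabled is an assumption based on the single-caption API above:

```python
import json

# Assemble samples in the format expected by `aloha evaluate-dataset`.
samples = [
    {
        "caption": "A cat is sitting on a table",
        "references": ["A dog is sitting on a table", "A hound is sitting on a table"],
        "image_path": None,  # assumption: may be null when object detection is disabled
    },
]

with open("dataset.json", "w") as f:
    json.dump(samples, f, indent=4)
```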
Then, run the following command:

```bash
aloha evaluate-dataset -m aloha path/to/dataset.json
```

The above command has many options to customize the evaluation. You can see them by running:

```bash
aloha evaluate-dataset --help
```

## Citation

If you find this repository useful, please cite our paper:

```bibtex
@inproceedings{petryk2024aloha,
    title = "ALOHa: A New Measure for Hallucination in Captioning Models",
    author = "Petryk, Suzanne and
        Chan, David M and
        Kachinthaya, Anish and
        Zou, Haodi and
        Canny, John and
        Gonzalez, Joseph E and
        Darrell, Trevor",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    year = "2024",
    publisher = "Association for Computational Linguistics",
}
```