{"id":18421946,"url":"https://github.com/spcl/checkembed","last_synced_at":"2025-06-11T20:11:55.071Z","repository":{"id":242755682,"uuid":"805438982","full_name":"spcl/CheckEmbed","owner":"spcl","description":"Official Implementation of \"CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks\"","archived":false,"fork":false,"pushed_at":"2024-12-11T18:16:47.000Z","size":30629,"stargazers_count":17,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-20T14:45:32.890Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://arxiv.org/abs/2406.02524","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spcl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-24T15:16:37.000Z","updated_at":"2025-01-18T01:56:29.000Z","dependencies_parsed_at":"2024-08-21T03:04:04.359Z","dependency_job_id":null,"html_url":"https://github.com/spcl/CheckEmbed","commit_stats":null,"previous_names":["spcl/checkembed"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FCheckEmbed","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FCheckEmbed/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FCheckEmbed/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2FCheckEmbed/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/spcl","download_url":"h
ttps://codeload.github.com/spcl/CheckEmbed/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247669995,"owners_count":20976490,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T04:27:24.057Z","updated_at":"2025-04-07T14:32:02.194Z","avatar_url":"https://github.com/spcl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CheckEmbed\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"paper/pics/checkembed_overview.svg\" width=\"80%\"\u003e\n\u003c/p\u003e\n\nThis is the official implementation of [CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks](https://arxiv.org/abs/2406.02524).\n\nThis framework gives you the ability to verify LLM answers, especially for\nintricate open-ended tasks such as consolidation, summarization, and extraction\nof knowledge. 
CheckEmbed implements verification by running the LLMs' answers through\nan embedding model and comparing the corresponding answer-level embeddings.\nThis reduction of a complex textual answer to a single embedding facilitates a\nstraightforward, fast, and meaningful verification, while showcasing\nsignificant improvements in accuracy, cost-effectiveness, and runtime\nperformance compared to existing token-, sentence-, and fact-level schemes such\nas BERTScore or SelfCheckGPT.\n\n\n## Setup Guide\n\nIn order to use this framework, you need to have a working installation of Python 3.8 or newer.\n\n\n### Installing CheckEmbed\n\nBefore running either of the following two installation methods, make sure to activate your Python environment (if any).\nIf you are a user and you just want to use `CheckEmbed`, you can install it from source:\n```bash\ngit clone https://github.com/spcl/CheckEmbed.git\ncd CheckEmbed\npip install .\n\n# If you want to use a CUDA GPU, please install the optional CUDA dependencies as well.\npip install \".[cuda]\"\n```\nIf you are a developer and you want to modify the code, you can install it in editable mode from source:\n```bash\ngit clone https://github.com/spcl/CheckEmbed.git\ncd CheckEmbed\npip install -e .\n\n# If you want to use a CUDA GPU, please install the optional CUDA dependencies as well.\npip install -e \".[cuda]\"\n```\n\n### Configuring the Models\n\nIn order to use parts of the framework, you need to have access to an LLM and/or an embedding model.\nPlease follow the instructions in the READMEs of the respective modules to configure the [LLMs](CheckEmbed/language_models/README.md) and [embedding models](CheckEmbed/embedding_models/README.md) of your choice.\nPlease create a copy of `config_template.json` named `config.json` in the CheckEmbed directory and update its details according to your needs.\n\n\n## Documentation\nThe paper gives a high-level overview of the framework and its components.\nIn order to understand the framework in more detail, you can read the documentation of the individual modules.\nThe [Scheduler](CheckEmbed/scheduler/scheduler.py) module is especially important for understanding how to make the most out of the framework,\nas is the [Operation](CheckEmbed/operations/README.md) module for interpreting the results.\n\n\n## Examples\n\nThe [examples](examples) directory contains several examples of use cases that can be solved using the framework, including the ones presented in the paper.\nIt is a great starting point for learning how to use the framework to solve real problems.\nEach example contains a `README.md` file with instructions on how to run it and play with it.\n\n\n## Paper Results\n\nYou can run the experiments from the paper by following the instructions in the [examples](examples) directory.\nHowever, if you just want to inspect and replot the results, you can use the [paper](paper) directory.\n\n\n## Citations\n\nIf you find this repository valuable, please give it a star!\nGot any questions or feedback? Feel free to reach out and open an issue.\nUsing this in your work? Please reference us using the provided citation:\n\n```bibtex\n@misc{besta2024checkembed,\n  title = {{CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks}},\n  author = {Besta, Maciej and Paleari, Lorenzo and Kubicek, Ales and Nyczyk, Piotr and Gerstenberger, Robert and Iff, Patrick and Lehmann, Tomasz and Niewiadomski, Hubert and Hoefler, Torsten},\n  year = 2024,\n  month = Jun,\n  eprinttype = {arXiv},\n  eprint = {2406.02524}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Fcheckembed","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspcl%2Fcheckembed","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Fcheckembed/lists"}