{"id":25128778,"url":"https://github.com/JayZhang42/SLED","last_synced_at":"2025-10-23T08:31:14.864Z","repository":{"id":266010991,"uuid":"894727712","full_name":"JayZhang42/SLED","owner":"JayZhang42","description":"SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model    https://arxiv.org/pdf/2411.02433","archived":false,"fork":false,"pushed_at":"2024-12-05T19:15:02.000Z","size":25472,"stargazers_count":19,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-27T09:51:09.270Z","etag":null,"topics":["decoding","factuality","google","large-language-models","llama","llama2","llama3","llm","llm-inference","meta","openai"],"latest_commit_sha":null,"homepage":"https://jayzhang42.github.io/sled_page/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JayZhang42.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-26T22:00:26.000Z","updated_at":"2025-01-21T07:28:03.000Z","dependencies_parsed_at":"2024-12-02T04:36:27.652Z","dependency_job_id":null,"html_url":"https://github.com/JayZhang42/SLED","commit_stats":null,"previous_names":["jayzhang42/sled"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JayZhang42%2FSLED","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JayZhang42%2FSLED/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JayZhang42%2FSLED/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/JayZhang42%2FSLED/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JayZhang42","download_url":"https://codeload.github.com/JayZhang42/SLED/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237801562,"owners_count":19368576,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["decoding","factuality","google","large-language-models","llama","llama2","llama3","llm","llm-inference","meta","openai"],"created_at":"2025-02-08T12:01:32.870Z","updated_at":"2025-10-23T08:31:10.465Z","avatar_url":"https://github.com/JayZhang42.png","language":"Python","funding_links":[],"categories":["A01_文本生成_文本对话"],"sub_categories":["大语言对话模型及数据"],"readme":"# \u003cspan style=\"color:#4285F4\"\u003eS\u003c/span\u003e\u003cspan style=\"color:#EA4335\"\u003eL\u003c/span\u003e\u003cspan style=\"color:#FBBC04\"\u003eE\u003c/span\u003e\u003cspan style=\"color:#34A853\"\u003eD\u003c/span\u003e: Self Logits Evolution Decoding for Improving Factuality in Large Language Models [NeurIPS 2024]\nThe official implementation for our NeurIPS 2024 paper \"SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models\"\n\n**[Jianyi Zhang\u003csup\u003e1\u003c/sup\u003e](https://jayzhang42.github.io/)** **Da-Cheng Juan\u003csup\u003e2\u003c/sup\u003e** **Cyrus Rashtchian\u003csup\u003e2\u003c/sup\u003e** **Chun-Sung Ferng\u003csup\u003e2\u003c/sup\u003e** **Heinrich Jiang\u003csup\u003e2\u003c/sup\u003e** **Yiran Chen\u003csup\u003e1\u003c/sup\u003e**\n\n[//]: # 
([\u003csup\u003e1\u003c/sup\u003e]\u0026#40;https://cei.pratt.duke.edu/\u0026#41; ![Duke University Logo]\u0026#40;assets/cei_log.jpg\u0026#41;[\u003csup\u003e2\u003c/sup\u003e]\u0026#40;https://research.google.com/\u0026#41; ![Google Research Logo]\u0026#40;assets/google_log.jpg\u0026#41;)\n\n\u003csup\u003e1\u003c/sup\u003e[![Duke University Logo](assets/cei_log.jpg)](https://cei.pratt.duke.edu/)\n\n\u003csup\u003e2\u003c/sup\u003e[![Google Research Logo](assets/google_log.jpg)](https://research.google.com/)\n\n\n## 📌News\n[2024.11.27] - We released the latest code on GitHub.  \n[2024.11.26] - We launched the official project website [here](https://jayzhang42.github.io/sled_page/)!  \n[2024.11.01] - The paper is available on [arXiv](https://arxiv.org/abs/2411.02433).  \n[2024.09.25] - Our SLED paper was accepted to NeurIPS 2024!  \n\n\n## 🧨 Why Choose SLED?\n\n- \u003cspan style=\"color:#4285F4\"\u003eModel Versatility:\u003c/span\u003e Compatible with most large language model (LLM) families due to their multi-layered structures, such as LLaMA 2, LLaMA 3, Gemma, and MoE LLMs; scalable from 2B to 70B parameters.\n- \u003cspan style=\"color:#4285F4\"\u003eTask Versatility:\u003c/span\u003e Tested for factual accuracy improvements across various tasks and benchmarks, such as TruthfulQA, StrategyQA, FACTOR, GSM8K, HotPotQA, Natural Questions, and TriviaQA.\n- \u003cspan style=\"color:#4285F4\"\u003eHigh Compatibility:\u003c/span\u003e SLED can be flexibly combined with other decoding methods, enhancing their performance.  \n- \u003cspan style=\"color:#4285F4\"\u003eHigh-Quality Outputs:\u003c/span\u003e Reduces repetition and ensures fluent responses.  \n- \u003cspan style=\"color:#4285F4\"\u003eNegligible Computational Overhead:\u003c/span\u003e Minimal additional costs, suited for real-time use.  \n- \u003cspan style=\"color:#4285F4\"\u003eInterpretability:\u003c/span\u003e Provides new insights into inference-time computing algorithms.  
\n\n\n## 🔮Overview of SLED\n![SLED](assets/sled_page.png)\n\nWe introduce \u003cstrong\u003eS\u003c/strong\u003eelf \u003cstrong\u003eL\u003c/strong\u003eogits \u003cstrong\u003eE\u003c/strong\u003evolution \u003cstrong\u003eD\u003c/strong\u003eecoding (SLED), a novel factuality decoding approach that leverages the latent knowledge within LLMs by contrasting the final layer's logits with the early layers' logits. SLED tracks the logits evolution process to unearth this latent knowledge, and then lets the output distribution self-evolve so that it aligns more closely with real-world facts.  \n\n\n## 🛠Installation\n- **Hardware**: We recommend using the NVIDIA A100 80GB GPU for efficient inference. Other hardware configurations also work but could yield slightly different performance outcomes.\n- **Python**: We recommend Python 3.10 or higher.\n- **PyTorch**: We recommend using PyTorch version 2.0.1 with CUDA 11.8. You can install this specific version of PyTorch using the following command:\n  ```bash\n  pip3 install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118\n  ```\n- **Transformers**: Install the `transformers` library from the local directory included in the project folder.\n  ```bash\n  pip install -e transformers\n  ```\n- **Other Dependencies**: \n  ```bash\n  pip install -r requirements.txt\n  ```\n\n## 📈Evaluation\nBelow we provide example scripts for running `SLED` and baseline methods such as `dola` and `Greedy Decoding`. For `SLED` and `dola`, the default setting for `--early-exit-layers` includes all layers of the LLM before the final output layer. 
\n\n### Dataset Preparation\n```bash\ntar -xzvf demo_dataset.tar.gz\n```\n\n### FACTOR (Multiple Choices)\n  \n```bash\npython run_factor.py --model-name meta-llama/Llama-2-7b-hf  --data-path Data/FACTOR/wiki_factor.csv  --output-path output-path.json --num-gpus 1 --decoding_method VanillaGreedy\npython run_factor.py --model-name meta-llama/Llama-2-7b-hf  --data-path Data/FACTOR/wiki_factor.csv  --output-path output-path.json --num-gpus 1 --decoding_method dola\npython run_factor.py --model-name meta-llama/Llama-2-7b-hf  --data-path Data/FACTOR/wiki_factor.csv  --output-path output-path.json --num-gpus 1 --decoding_method SLED --evolution_rate 2  --evolution_scale 10\n```\n\n### TruthfulQA (Multiple Choices)\n  \n```bash\npython run_tfqa.py --model-name meta-llama/Llama-2-7b-hf  --data-path Data/TruthfulQA --output-path output-path.json --num-gpus 1 --decoding_method VanillaGreedy\npython run_tfqa.py --model-name meta-llama/Llama-2-7b-hf  --data-path Data/TruthfulQA --output-path output-path.json --num-gpus 1 --decoding_method dola\npython run_tfqa.py --model-name meta-llama/Llama-2-7b-hf  --data-path Data/TruthfulQA --output-path output-path.json --num-gpus 1 --decoding_method SLED --evolution_rate 2.5  --evolution_scale 75\n```\n\n### StrategyQA \n  \n```bash\npython run_strqa.py  --model-name meta-llama/Llama-2-7b-hf  --data-path Data/StrategyQA --output-path output-path.json --num-gpus 1 --decoding_method VanillaGreedy\npython run_strqa.py  --model-name meta-llama/Llama-2-7b-hf  --data-path Data/StrategyQA --output-path output-path.json --num-gpus 1 --decoding_method dola\npython run_strqa.py  --model-name meta-llama/Llama-2-7b-hf  --data-path Data/StrategyQA --output-path output-path.json --num-gpus 1 --decoding_method SLED --evolution_rate 1.75 --evolution_scale 5\n```\n\n### GSM8K\n  \n```bash\npython run_gsm8k.py  --model-name meta-llama/Llama-2-7b-hf  --data-path Data/gsm8k_test --output-path output-path.json --num-gpus 1 --decoding_method 
VanillaGreedy\npython run_gsm8k.py  --model-name meta-llama/Llama-2-7b-hf  --data-path Data/gsm8k_test --output-path output-path.json --num-gpus 1 --decoding_method dola\npython run_gsm8k.py  --model-name meta-llama/Llama-2-7b-hf  --data-path Data/gsm8k_test --output-path output-path.json --num-gpus 1 --decoding_method SLED --evolution_rate 2 --evolution_scale 10\n```\nAdditional experiments involving various models can be found in the `scripts` folder.\n\n\n## 💡Important Recommendations\n\n\nWe strongly encourage you to try the `SLED` method on your own **open-ended generation** tasks and datasets. For good performance, consider the following recommended parameters:\n\n- **Evolution Rate**: Set `--evolution_rate` within a range of **0.5 to 3**. \n- **Evolution Scale**: Set `--evolution_scale` to **5, 10, or 20**. \n- **Repetition Penalty**: Set `--repetition_penalty` between **1.01 and 1.05**.\n\n**We hope this will be a good starting point for your experiments!**\n\n\n\n## Acknowledgement\n\nThis codebase is based on the official repo of [DoLa](https://github.com/voidism/DoLa). 
We also highly recommend reading their [excellent work](https://arxiv.org/abs/2309.03883).\n\n\n## Citation\n\nIf you find our repository helpful for your research or projects, we would greatly appreciate it if you cite our SLED paper.\n```\n@inproceedings{\nzhang2024sled,\ntitle={SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models},\nauthor={Jianyi Zhang and Da-Cheng Juan and Cyrus Rashtchian and Chun-Sung Ferng and Heinrich Jiang and Yiran Chen},\nbooktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)},\nyear={2024},\nurl={https://arxiv.org/abs/2411.02433}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJayZhang42%2FSLED","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJayZhang42%2FSLED","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJayZhang42%2FSLED/lists"}