https://github.com/melvinebenezer/liah-lie_in_a_haystack
needle in a haystack for LLMs
- Host: GitHub
- URL: https://github.com/melvinebenezer/liah-lie_in_a_haystack
- Owner: melvinebenezer
- License: mit
- Created: 2024-03-31T04:16:18.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-15T11:10:45.000Z (about 2 years ago)
- Last Synced: 2025-06-11T17:18:54.710Z (12 months ago)
- Topics: llm, llm-inference, llms-benchmarking, long-context, needle-in-haystack
- Language: Python
- Homepage:
- Size: 2.42 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
# 🤥 LIAH - a Lie-in-haystack

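As context lengths for LLMs grow, it becomes increasingly difficult to test
whether fine-tuned models attend to all depths of the context.
The needle-in-a-haystack test is a popular approach. However, an LLM can often answer
a question about the needle from what it already knows instead of retrieving the needle itself.
Tests have shown that a "lie" works well in this context 😊
A lie contradicts what the model learned in training, so a matching answer can only come from the context.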
[Lost in the Middle - Paper](https://arxiv.org/abs/2307.03172)
lie: **"Picasso painted the Mona Lisa"**
retrieve: **"Who painted the Mona Lisa?"**
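
The idea can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not LIAH's actual implementation: the helper names `insert_lie`, `build_prompt` and `retrieved_lie`, the filler sentences, and the substring-based scoring are all hypothetical.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of the liah package.

LIE = "Picasso painted the Mona Lisa."
QUESTION = "Who painted the Mona Lisa?"


def insert_lie(sentences, lie, depth):
    """Return the haystack text with `lie` placed at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    position = round(depth * len(sentences))
    return " ".join(sentences[:position] + [lie] + sentences[position:])


def build_prompt(sentences, depth):
    """Combine the haystack (with the lie inserted) and the retrieval question."""
    context = insert_lie(sentences, LIE, depth)
    return f"{context}\n\nAnswer using only the text above. {QUESTION}"


def retrieved_lie(response):
    """Naive scoring (an assumption): the lie counts as retrieved if the answer names Picasso."""
    return "picasso" in response.lower()


# Ten filler sentences stand in for a real long document.
haystack = [f"Filler sentence number {i}." for i in range(10)]
prompt = build_prompt(haystack, depth=0.5)
```

A model that answers "Leonardo da Vinci" is relying on what it memorised in training; one that answers "Picasso" has read the inserted sentence.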
## Installation
```bash
pip install liah
```
## Example Usage
```python
# Set OPENAI_API_KEY in your environment to your token
# if you need OpenAI models for the final evaluation.
from liah import Liah
from vllm import LLM, SamplingParams

# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=4096)
llm = LLM(model="meta-llama/Llama-2-70b-hf", tensor_parallel_size=4, max_model_len=1500)  # needs 4 A100s (40 GB)

# Create Liah
liah = Liah(
    model_name="Your Model",
    max_context_length=2000,
    context_length_interval=10,
    test_mode=True,
)

# Get a sample from different depths and context lengths
for i, sample in enumerate(liah.getSample()):
    # Test the sample text with your model
    output = llm.generate([sample["prompt"]], sampling_params)[0]
    # Update liah with the response
    liah.update(sample, output.outputs[0].text)

# Contains the plot file from Liah
plotFilePath = liah.evaluate()
```
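
The loop above visits samples at different depths and context lengths. As a rough illustration of how per-sample outcomes could be tallied into a grid of retrieval rates per context length and depth, here is a minimal sketch. It does not use or reproduce Liah's internals: the `tally` function and the `(context_length, depth, passed)` record layout are assumptions made for this example.

```python
from collections import defaultdict


def tally(records):
    """Aggregate (context_length, depth, passed) records into a retrieval rate per cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [passed, total]
    for context_length, depth, passed in records:
        cell = counts[(context_length, depth)]
        cell[0] += int(passed)
        cell[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


# Hypothetical outcomes: (context length in tokens, depth in percent, lie retrieved?)
records = [
    (1000, 0, True),
    (1000, 50, False),
    (1000, 50, True),
    (2000, 100, True),
]
rates = tally(records)
```

Each key in `rates` is one (context length, depth) cell, so a low value points at a region of the context the model fails to attend to.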
## Sample plot

## Contribute
```bash
pip install pre-commit
```
then (in the repository, just once)
```bash
pre-commit install
```
## Before commit (optional)
```bash
pre-commit run --all-files
```