https://github.com/saketh1702/data-leakage-detection-in-llms

A research repository exploring potential data leakage vulnerabilities in Large Language Models (LLMs). This work analyzes existing literature, methodologies, and privacy implications in modern LLM architectures, providing comprehensive summaries and insights from various research papers.
data-leakage llama2 llms mistral-7b nlp

# Data Leakage Detection in LLMs

A framework for detecting data leakage and bias in LLMs (e.g., Llama-2, Mistral) using n-gram metrics and one-shot prompting. BLEURT and ROUGE-L scores are used to measure the similarity between reference text and model outputs under guided and general prompts. The framework analyzes model behavior on the MMLU and TruthfulQA benchmarks to identify training data memorization and gender stereotyping patterns.
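
To make the ROUGE-L comparison concrete, here is a minimal pure-Python sketch of the metric as an F1 score over the longest common subsequence of whitespace tokens. It is an illustration, not the repository's implementation: the function names `lcs_length`, `rouge_l_f1`, and `overlap_gap` are hypothetical, the tokenization is a simplifying assumption, and the idea of comparing guided against general prompt scores is one plausible reading of the description above.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 between two strings (lowercased, whitespace-tokenized)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def overlap_gap(reference, guided_output, general_output):
    """Hypothetical helper: how much closer the guided-prompt output is
    to the reference than the general-prompt output (assumed signal)."""
    return rouge_l_f1(reference, guided_output) - rouge_l_f1(reference, general_output)


if __name__ == "__main__":
    # Made-up strings for illustration only, not benchmark data.
    reference = "the mitochondria is the powerhouse of the cell"
    guided = "the mitochondria is the powerhouse of the cell"
    general = "cells get their energy from small internal structures"
    print(overlap_gap(reference, guided, general))
```

Under this reading, a large positive gap suggests the model reproduces the reference far more closely when the prompt names the dataset, which may indicate memorization. A learned metric such as BLEURT would need its own model checkpoint and is not sketched here.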