https://github.com/locuslab/llm-idiosyncrasies
Code release for "Idiosyncrasies in Large Language Models"
- Host: GitHub
- URL: https://github.com/locuslab/llm-idiosyncrasies
- Owner: locuslab
- License: mit
- Created: 2025-02-07T04:49:27.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-18T05:56:24.000Z (over 1 year ago)
- Last Synced: 2025-02-18T06:32:40.146Z (over 1 year ago)
- Language: Python
- Homepage: https://eric-mingjie.github.io/llm-idiosyncrasies/index.html
- Size: 21.5 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Idiosyncrasies in Large Language Models
Official code for the paper "Idiosyncrasies in Large Language Models".
> [**Idiosyncrasies in Large Language Models**](https://arxiv.org/abs/2502.12150)
> *[Mingjie Sun](https://eric-mingjie.github.io)\*, [Yida Yin](https://davidyyd.github.io)\*, [Zhiqiu Xu](https://oscarxzq.github.io), [J. Zico Kolter](https://zicokolter.com), [Zhuang Liu](https://liuzhuang13.github.io)* (* indicates equal contribution)
> Carnegie Mellon University, UC Berkeley, University of Pennsylvania, and Princeton University
>[[Paper]](https://arxiv.org/abs/2502.12150) [[Project page]](https://eric-mingjie.github.io/llm-idiosyncrasies/index.html)
```bibtex
@article{sun2025idiosyncrasies,
title = {Idiosyncrasies in Large Language Models},
author = {Sun, Mingjie and Yin, Yida and Xu, Zhiqiu and Kolter, J. Zico and Liu, Zhuang},
year = {2025},
journal = {arXiv preprint arXiv:2502.12150}
}
```
---

We study idiosyncrasies in Large Language Models (LLMs) -- unique patterns in their outputs. We consider a simple classification task: given a text output, a neural network is trained to predict which LLM generated it.
## Setup
Installation instructions can be found in [INSTALL.md](INSTALL.md).
## Pre-generated Responses
We host a collection of pre-generated responses for Chat APIs, Instruct LLMs, and Base LLMs.
| | ChatGPT | Claude | Grok | Gemini | DeepSeek | Phi-4 |
| :-- | :--: | :--: | :--: | :--: | :--: | :--: |
| links | [download](https://drive.google.com/file/d/1O1dEROw21KePNMF9ewlkXkkzL8Z-5qrN/view?usp=sharing) | [download](https://drive.google.com/file/d/1sifL_hsFiSDKZgnEeahiT20wPW8NDmRG/view?usp=sharing) | [download](https://drive.google.com/file/d/1yUA-8RYYXIkSV2xMbUCqTU8o6F6LrEFg/view?usp=share_link) | [download](https://drive.google.com/file/d/1dsvpXmLCNa4Gehd9jmantMNSDiw2eS4f/view?usp=share_link) | [download](https://drive.google.com/file/d/1a31HZgMwppwXjzEiY1fj3VfhAco5RWhG/view?usp=share_link) | [download](https://drive.google.com/file/d/1C6xDdvOuczJq1j4OSXJgqxB75kvwSoVK/view?usp=share_link) |

| | Llama3.1-8b-it | Gemma2-9b-it | Qwen2.5-7b-it | Mistral-7b-v3-it |
| :-- | :--: | :--: | :--: | :--: |
| links |[download](https://drive.google.com/file/d/1JuT1UpCw6ijDIgYSa2JM1AmDcSTxrTLu/view?usp=sharing) | [download](https://drive.google.com/file/d/1gw_z-XsUHSip71qkHdoM4SnpflwcM_g_/view?usp=sharing) | [download](https://drive.google.com/file/d/1EnVOL4WhxU3-hFvPOEZ21moOeEyX5eSb/view?usp=sharing) | [download](https://drive.google.com/file/d/1uIRtNvapwfmOWBhlknOP8rRvdExn5wNW/view?usp=sharing) |

| | Llama3.1-8b | Gemma2-9b | Qwen2.5-7b | Mistral-7b-v3 |
| :-- | :--: | :--: | :--: | :--: |
| links |[download](https://drive.google.com/file/d/1b37J7btQ1jFhs0bwfUPpXRzxp5Yxm_eS/view?usp=sharing) | [download](https://drive.google.com/file/d/1o3TTBxOBaytFKyGf6D7T5b8iCH-0kwLu/view?usp=share_link) | [download](https://drive.google.com/file/d/1py9tJBpZaZPh0ryvMBS08SlB-LWjWdOh/view?usp=share_link) | [download](https://drive.google.com/file/d/1S1nAojlpMrl9LKkYYA6EBDS2cfLzVk1W/view?usp=share_link) |
## Response Generation
### Chat APIs
We call official APIs to generate responses for Chat APIs.
Below is an example command to generate 11K responses for ``ChatGPT`` on the ``UltraChat`` dataset.
- Change the ``--model`` argument to generate responses for different Chat API models, including ``ChatGPT``, ``Claude``, ``Grok``, ``Gemini``, and ``DeepSeek``.
```bash
python generate_responses.py \
--model ChatGPT --api_key $api_key \
--dataset UltraChat --num_samples 11_000 \
--output_path /path/to/output.json
```
### Instruct and Base LLMs
We use [vLLM](https://github.com/vllm-project/vllm) to generate responses for instruct / base LLMs in our paper.
Below is an example command to generate 11K responses for ``Llama3.1-8b-it`` on the ``UltraChat`` dataset with greedy decoding.
- The ``--model`` argument selects the LLM used to generate responses. Our code currently supports nine LLMs from our paper: ``Llama3.1-8b-it``, ``Gemma2-9b-it``, ``Qwen2.5-7b-it``, ``Mistral-7b-v3-it``, ``Phi-4``, ``Llama3.1-8b``, ``Gemma2-9b``, ``Qwen2.5-7b``, and ``Mistral-7b-v3``. We recommend using temperature ``0.6`` and repetition penalty ``1.1`` for base LLMs.
- The ``--dataset`` argument specifies the prompt dataset to generate responses on: ``UltraChat``, ``Cosmopedia``, ``LmsysChat``, ``WildChat``, or ``FineWeb``.
- To generate responses on multiple GPUs, set the ``--num_gpus`` argument. This uses vLLM's tensor parallelism.
```bash
python generate_responses.py \
--model Llama3.1-8b-it --temperature 0 \
--dataset UltraChat --num_samples 11_000 \
--output_path /path/to/output.json
```
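The decoding recommendations above can be summarized in a small helper. This is only a sketch and not part of the repository: the function name ``sampling_kwargs`` and the ``BASE_MODELS`` set are hypothetical, and it assumes greedy decoding for every non-base model, as in the example command. The dictionary keys mirror the ``temperature`` and ``repetition_penalty`` fields of vLLM's ``SamplingParams``.

```python
# Hypothetical helper, not part of the repository.
# Base LLMs listed in this README (the ones without the "-it" suffix).
BASE_MODELS = {"Llama3.1-8b", "Gemma2-9b", "Qwen2.5-7b", "Mistral-7b-v3"}


def sampling_kwargs(model: str) -> dict:
    """Return decoding settings for a model name as a plain dict."""
    if model in BASE_MODELS:
        # Settings recommended in this README for base LLMs.
        return {"temperature": 0.6, "repetition_penalty": 1.1}
    # Assumption: greedy decoding (temperature 0, no repetition penalty)
    # for all other models, matching the example command above.
    return {"temperature": 0.0, "repetition_penalty": 1.0}


if __name__ == "__main__":
    for name in ("Llama3.1-8b-it", "Llama3.1-8b"):
        print(name, sampling_kwargs(name))
```

With vLLM installed, such a dict could be unpacked as ``SamplingParams(**sampling_kwargs(model))``.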
## Transformations
Below we provide scripts to perform various transformations on the generated responses. The supported transformations are ``remove_special_characters``, ``shuffle_word``, ``shuffle_letter``, ``markdown_elements_only``, ``paraphrase``, ``translate``, and ``summarize``.
Here is an example command to shuffle the words in the generated responses.
```bash
python transform.py \
--input_path /path/to/input.json \
--output_path /path/to/output.json \
--transform_mode shuffle_word
```
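To illustrate what the shuffling transformations do, here is a minimal sketch. It is not the implementation in ``transform.py``: the function names, the ``seed`` parameter, and the whitespace-based word splitting are assumptions, as is the choice to shuffle letters within each word rather than across the whole response.

```python
import random


def shuffle_words(text: str, seed: int = 0) -> str:
    """Randomly reorder the whitespace-separated words of a response."""
    rng = random.Random(seed)  # local RNG so results are reproducible
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def shuffle_letters(text: str, seed: int = 0) -> str:
    """Randomly reorder the characters inside each word, keeping word order."""
    rng = random.Random(seed)
    shuffled = []
    for word in text.split():
        chars = list(word)
        rng.shuffle(chars)
        shuffled.append("".join(chars))
    return " ".join(shuffled)


if __name__ == "__main__":
    sample = "Large language models have unique patterns"
    print(shuffle_words(sample))
    print(shuffle_letters(sample))
```

Word shuffling keeps the vocabulary of a response but destroys its word order, while letter shuffling keeps word lengths and positions but destroys the words themselves.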
To rewrite (e.g., paraphrase, translate, summarize) the generated responses, you also need to provide the API key for the rewriting model (e.g., GPT-4o-mini) through the ``--api_key`` argument.
```bash
python transform.py \
--input_path /path/to/input.json \
--output_path /path/to/output.json \
--transform_mode paraphrase \
--api_key $api_key
```
## Classification
Below is an example command to classify responses from two different models. For $N$-way classification, pass $N$ whitespace-separated response paths to the ``--response_paths`` argument.
You can change the ``--classifier`` argument to use different classifiers. Our code currently supports the following classifiers: ``llm2vec``, ``gpt2``, ``t5``, and ``bert``. Each classifier can be run on a single GPU with 24 GB of memory and bfloat16 support.
```bash
python classification.py \
--response_paths /path/to/model1.json /path/to/model2.json \
--classifier llm2vec \
--output_dir /path/to/output_dir
```
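Conceptually, $N$-way classification assigns one integer label per source model and pools all responses into a single labeled dataset. The sketch below shows that labeling step only. It is not the code in ``classification.py``: ``build_dataset`` is a hypothetical helper that takes responses already loaded into memory, so it makes no assumption about the JSON layout of the response files.

```python
from typing import Dict, List, Tuple


def build_dataset(
    responses_by_model: Dict[str, List[str]],
) -> Tuple[List[str], List[int], List[str]]:
    """Pool responses from N models into texts plus integer class labels."""
    # Sort model names so the label assignment is deterministic.
    label_names = sorted(responses_by_model)
    texts: List[str] = []
    labels: List[int] = []
    for label, name in enumerate(label_names):
        for response in responses_by_model[name]:
            texts.append(response)
            labels.append(label)
    return texts, labels, label_names


if __name__ == "__main__":
    texts, labels, names = build_dataset(
        {"model1": ["response a", "response b"], "model2": ["response c"]}
    )
    print(names, labels)
```

With equally many responses per model, a classifier that guesses at random reaches an accuracy of $1/N$, which is the baseline to compare classification results against.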
## License
This project is released under the MIT license. Please see the [LICENSE](LICENSE) file for more information.
## Questions
Feel free to discuss papers/code with us through issues/emails!
mingjies at cs.cmu.edu
davidyinyida0609 at berkeley.edu