https://github.com/locuslab/llm-idiosyncrasies

Code release for "Idiosyncrasies in Large Language Models"

Host: GitHub
URL: https://github.com/locuslab/llm-idiosyncrasies
Owner: locuslab
License: mit
Created: 2025-02-07T04:49:27.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-18T05:56:24.000Z (over 1 year ago)
Last Synced: 2025-02-18T06:32:40.146Z (over 1 year ago)
Language: Python
Homepage: https://eric-mingjie.github.io/llm-idiosyncrasies/index.html
Size: 21.5 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Idiosyncrasies in Large Language Models

Official code for "Idiosyncrasies in Large Language Models".

> [**Idiosyncrasies in Large Language Models**](https://arxiv.org/abs/2502.12150) 

> *[Mingjie Sun](https://eric-mingjie.github.io)\*, [Yida Yin](https://davidyyd.github.io)\*, [Zhiqiu Xu](https://oscarxzq.github.io), [J. Zico Kolter](https://zicokolter.com), [Zhuang Liu](https://liuzhuang13.github.io)* (* indicates equal contribution) 


> Carnegie Mellon University, UC Berkeley, University of Pennsylvania, and Princeton University


>[[Paper]](https://arxiv.org/abs/2502.12150) [[Project page]](https://eric-mingjie.github.io/llm-idiosyncrasies/index.html)

```bibtex
@article{sun2025idiosyncrasies,
    title    = {Idiosyncrasies in Large Language Models},
    author   = {Sun, Mingjie and Yin, Yida and Xu, Zhiqiu and Kolter, J. Zico and Liu, Zhuang},
    year     = {2025},
    journal  = {arXiv preprint arXiv:2502.12150}
}
```

--- 

We study idiosyncrasies in Large Language Models (LLMs), that is, unique patterns in their outputs. We consider a simple classification task: given a text output, we train a neural network to predict which LLM generated it.

## Setup

Installation instructions can be found in [INSTALL.md](INSTALL.md).

## Pre-generated Responses

We host a collection of pre-generated responses for Chat APIs, Instruct LLMs, and Base LLMs.

| | ChatGPT | Claude | Grok | Gemini | DeepSeek | Phi-4 |
| :-- | :--: | :--: | :--: | :--: | :--: | :--: |
| links | [download](https://drive.google.com/file/d/1O1dEROw21KePNMF9ewlkXkkzL8Z-5qrN/view?usp=sharing) | [download](https://drive.google.com/file/d/1sifL_hsFiSDKZgnEeahiT20wPW8NDmRG/view?usp=sharing) | [download](https://drive.google.com/file/d/1yUA-8RYYXIkSV2xMbUCqTU8o6F6LrEFg/view?usp=share_link) | [download](https://drive.google.com/file/d/1dsvpXmLCNa4Gehd9jmantMNSDiw2eS4f/view?usp=share_link) | [download](https://drive.google.com/file/d/1a31HZgMwppwXjzEiY1fj3VfhAco5RWhG/view?usp=share_link) | [download](https://drive.google.com/file/d/1C6xDdvOuczJq1j4OSXJgqxB75kvwSoVK/view?usp=share_link) |

| | Llama3.1-8b-it | Gemma2-9b-it | Qwen2.5-7b-it | Mistral-7b-v3-it |
| :-- | :--: | :--: | :--: | :--: |
| links | [download](https://drive.google.com/file/d/1JuT1UpCw6ijDIgYSa2JM1AmDcSTxrTLu/view?usp=sharing) | [download](https://drive.google.com/file/d/1gw_z-XsUHSip71qkHdoM4SnpflwcM_g_/view?usp=sharing) | [download](https://drive.google.com/file/d/1EnVOL4WhxU3-hFvPOEZ21moOeEyX5eSb/view?usp=sharing) | [download](https://drive.google.com/file/d/1uIRtNvapwfmOWBhlknOP8rRvdExn5wNW/view?usp=sharing) |

| | Llama3.1-8b | Gemma2-9b | Qwen2.5-7b | Mistral-7b-v3 |
| :-- | :--: | :--: | :--: | :--: |
| links | [download](https://drive.google.com/file/d/1b37J7btQ1jFhs0bwfUPpXRzxp5Yxm_eS/view?usp=sharing) | [download](https://drive.google.com/file/d/1o3TTBxOBaytFKyGf6D7T5b8iCH-0kwLu/view?usp=share_link) | [download](https://drive.google.com/file/d/1py9tJBpZaZPh0ryvMBS08SlB-LWjWdOh/view?usp=share_link) | [download](https://drive.google.com/file/d/1S1nAojlpMrl9LKkYYA6EBDS2cfLzVk1W/view?usp=share_link) |

## Response Generation

### Chat APIs

We call the official APIs to generate responses for the Chat API models.

Below is an example command that generates 11K responses for ``ChatGPT`` on the ``UltraChat`` dataset.

- Change the ``--model`` argument to generate responses for different Chat API models, including ``ChatGPT``, ``Claude``, ``Grok``, ``Gemini``, and ``DeepSeek``.

```bash
python generate_responses.py \
    --model ChatGPT --api_key $api_key \
    --dataset UltraChat --num_samples 11_000 \
    --output_path /path/to/output.json
```

### Instruct and Base LLMs

We use [vLLM](https://github.com/vllm-project/vllm) to generate responses for instruct / base LLMs in our paper.

Below is an example command that generates 11K responses for ``Llama3.1-8b-it`` on the ``UltraChat`` dataset with greedy decoding.

- The ``--model`` argument selects the LLM used to generate responses. Our code currently supports the nine LLMs from our paper: ``Llama3.1-8b-it``, ``Gemma2-9b-it``, ``Qwen2.5-7b-it``, ``Mistral-7b-v3-it``, ``Phi-4``, ``Llama3.1-8b``, ``Gemma2-9b``, ``Qwen2.5-7b``, and ``Mistral-7b-v3``. We recommend temperature ``0.6`` and repetition penalty ``1.1`` for base LLMs.

- The ``--dataset`` argument specifies the prompt dataset to generate responses on. Supported datasets are ``UltraChat``, ``Cosmopedia``, ``LmsysChat``, ``WildChat``, and ``FineWeb``.

- To generate responses on multiple GPUs, set the ``--num_gpus`` argument. This uses vLLM's tensor parallelism.

```bash
python generate_responses.py \
    --model Llama3.1-8b-it --temperature 0 \
    --dataset UltraChat --num_samples 11_000 \
    --output_path /path/to/output.json
```
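The decoding settings mentioned above can be summarized in a small helper. This is only a sketch: the function name `sampling_settings`, the `BASE_MODELS` set, and the neutral repetition penalty of `1.0` for greedy decoding are assumptions made for illustration and are not taken from the released code.

```python
# Hypothetical helper (not part of the repository): choose decoding
# settings depending on whether a model name refers to a base LLM.

# Base LLM names as listed in this README.
BASE_MODELS = {"Llama3.1-8b", "Gemma2-9b", "Qwen2.5-7b", "Mistral-7b-v3"}


def sampling_settings(model: str) -> dict:
    """Return decoding settings for `model`."""
    if model in BASE_MODELS:
        # Values recommended in this README for base LLMs.
        return {"temperature": 0.6, "repetition_penalty": 1.1}
    # Greedy decoding, as in the example command above. A repetition
    # penalty of 1.0 (no penalty) is an assumption.
    return {"temperature": 0.0, "repetition_penalty": 1.0}
```

With vLLM installed, a dictionary like this could plausibly be unpacked into `vllm.SamplingParams(**sampling_settings(model))`, since `temperature` and `repetition_penalty` are both parameters of that class. How `generate_responses.py` actually wires these values is not shown here.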

## Transformations

Below we provide scripts to perform various transformations on the generated responses. The supported transformations are ``remove_special_characters``, ``shuffle_word``, ``shuffle_letter``, ``markdown_elements_only``, ``paraphrase``, ``translate``, and ``summarize``.

Here is an example command that shuffles the words in the generated responses.

```bash
python transform.py \
    --input_path /path/to/input.json \
    --output_path /path/to/output.json \
    --transform_mode shuffle_word
```
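To give a feel for what the two shuffling transformations listed above might do, here is a minimal sketch. It is not the implementation in `transform.py`: the function bodies, the `seed` parameter, whitespace tokenization, and the choice to shuffle letters within each word (rather than across the whole text) are all assumptions.

```python
import random


def shuffle_word(text: str, seed: int = 0) -> str:
    """Randomly reorder the whitespace-separated words of `text`."""
    rng = random.Random(seed)  # seeded for reproducibility (assumption)
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def shuffle_letter(text: str, seed: int = 0) -> str:
    """Randomly reorder the letters inside each word, keeping word order."""
    rng = random.Random(seed)
    shuffled_words = []
    for word in text.split():
        letters = list(word)
        rng.shuffle(letters)
        shuffled_words.append("".join(letters))
    return " ".join(shuffled_words)
```

Both functions keep the same characters as the input: `shuffle_word` removes word order while keeping each word intact, and `shuffle_letter` keeps word order and word lengths while scrambling spelling.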

To rewrite (e.g., paraphrase, translate, summarize) the generated responses, you also need to provide the API key for the rewriting model (e.g., GPT-4o-mini) through the ``--api_key`` argument.

```bash
python transform.py \
    --input_path /path/to/input.json \
    --output_path /path/to/output.json \
    --transform_mode paraphrase \
    --api_key $api_key
```

## Classification

Below is an example command to classify responses from two different models. For $N$-way classification, pass $N$ whitespace-separated response paths to the ``--response_paths`` argument.

You can change the ``--classifier`` argument to use a different classifier. Our code currently supports the following classifiers: ``llm2vec``, ``gpt2``, ``t5``, and ``bert``. Each classifier can be run on a single GPU with 24 GB of memory and bfloat16 support.

```bash
python classification.py \
    --response_paths /path/to/model1.json /path/to/model2.json \
    --classifier llm2vec \
    --output_dir /path/to/output_dir
```
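The $N$-way setup described above amounts to giving every response a label equal to the index of the file it came from. The sketch below illustrates that idea only. The function name `load_labeled_responses` is hypothetical, and it assumes each file is a JSON list of response strings, which may not match the actual format of the response files.

```python
import json
from pathlib import Path


def load_labeled_responses(response_paths):
    """Return (texts, labels); label i marks responses from the i-th file."""
    texts, labels = [], []
    for label, path in enumerate(response_paths):
        # Assumption: each file holds a JSON list of response strings.
        responses = json.loads(Path(path).read_text(encoding="utf-8"))
        texts.extend(responses)
        labels.extend([label] * len(responses))
    return texts, labels
```

The resulting `texts` and `labels` lists are the kind of input a text classifier would typically be trained on, with the number of classes equal to the number of paths.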

## License

This project is released under the MIT license. Please see the [LICENSE](LICENSE) file for more information.

## Questions

Feel free to discuss papers/code with us through issues/emails!

mingjies at cs.cmu.edu  

davidyinyida0609 at berkeley.edu
