{"id":13721939,"url":"https://github.com/choosewhatulike/trainable-agents","last_synced_at":"2025-05-15T23:06:44.660Z","repository":{"id":200641675,"uuid":"701972055","full_name":"choosewhatulike/trainable-agents","owner":"choosewhatulike","description":"Code  and datasets for \"Character-LLM: A Trainable Agent for Role-Playing\"","archived":false,"fork":false,"pushed_at":"2024-10-29T04:28:07.000Z","size":18404,"stargazers_count":527,"open_issues_count":2,"forks_count":34,"subscribers_count":18,"default_branch":"main","last_synced_at":"2025-04-08T10:31:54.287Z","etag":null,"topics":["agent","character","language-model","large-language-models","llm","natural-language-processing","roleplay","sft"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/choosewhatulike.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-08T05:44:06.000Z","updated_at":"2025-04-08T07:28:40.000Z","dependencies_parsed_at":"2024-11-14T11:42:26.962Z","dependency_job_id":null,"html_url":"https://github.com/choosewhatulike/trainable-agents","commit_stats":null,"previous_names":["choosewhatulike/trainable-agents"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/choosewhatulike%2Ftrainable-agents","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/choosewhatulike%2Ftrainable-agents/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/choosewhatulike%2Ftrainable-agents
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/choosewhatulike%2Ftrainable-agents/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/choosewhatulike","download_url":"https://codeload.github.com/choosewhatulike/trainable-agents/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254436944,"owners_count":22070946,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","character","language-model","large-language-models","llm","natural-language-processing","roleplay","sft"],"created_at":"2024-08-03T01:01:22.967Z","updated_at":"2025-05-15T23:06:39.644Z","avatar_url":"https://github.com/choosewhatulike.png","language":"Python","funding_links":[],"categories":["Project List"],"sub_categories":["\u003cspan id=\"tool\"\u003eLLM (LLM \u0026 Tool)\u003c/span\u003e"],"readme":"# Character-LLM: A Trainable Agent for Role-Playing\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://github.com/choosewhatulike/character-llm/blob/main/LICENSE\"\u003e\n\u003cimg src='https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg'\u003e\u003c/a\u003e\n\u003cimg src='https://img.shields.io/badge/Data%20License-CC%20By%20NC%204.0-red.svg'\u003e\n\u003cimg src='https://img.shields.io/badge/python-3.8+-blue.svg'\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n🤗 \u003ca href=\"https://huggingface.co/fnlp/\" target=\"_blank\"\u003eModels\u003c/a\u003e • 🤗 \u003ca href=\"https://huggingface.co/datasets/fnlp/character-llm-data\" 
target=\"_blank\"\u003eDataset\u003c/a\u003e • 📃 \u003ca href=\"https://arxiv.org/abs/2310.10158\" target=\"_blank\"\u003eCharacter-LLM\u003c/a\u003e\u003cbr\u003e\n\u003c/p\u003e\n\nThis is the official repository of our [EMNLP 2023 paper](https://arxiv.org/abs/2310.10158). Welcome! 🤩🤩🤩\n\nWe introduce **Character-LLMs**, trainable agents for role-playing that learn from actual experiences, characteristics, and emotions. Unlike prompted agents, Character-LLMs are trained specifically for role-playing and can act as specific people, such as Beethoven, Queen Cleopatra, and Julius Caesar, with detailed character-related knowledge and representative character personalities. No additional prompt or reference document is needed. To achieve this, we propose **Experience Reconstruction**, a data generation process that generates detailed and diverse experience data of a given character for training. For more details, please refer to the [paper](https://arxiv.org/abs/2310.10158).\n\n\u003cp align=\"center\"\u003e\n    Overview of the construction flow of Character-LLM.\n    \u003cimg src=\"./images/method1.png\" width=\"100%\"\u003e \u003cbr\u003e\n    \u003cbr\u003e\n\u003c/p\u003e\n\n## Dataset \u0026 Model Weights 📚\n### Model Weights\nWe release models for the nine characters mentioned in the paper.\n\n| Model | Checkpoint | Character  | License |\n| ----- |------| ---- | ----- |\n| Character-LLM-Cleopatra-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-cleopatra-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-cleopatra-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Cleopatra\" target=\"_blank\"\u003eCleopatra VII \u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| Character-LLM-Voldemort-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-voldemort-7b-wdiff\" 
target=\"_blank\"\u003echaracter-llm-voldemort-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Lord_Voldemort\" target=\"_blank\"\u003eLord Voldemort\u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| Character-LLM-Spartacus-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-spartacus-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-spartacus-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Spartacus\" target=\"_blank\"\u003eSpartacus\u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| Character-LLM-Hermione-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-hermione-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-hermione-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Hermione_Granger\" target=\"_blank\"\u003eHermione Granger \u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| Character-LLM-Newton-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-newton-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-newton-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Isaac_Newton\" target=\"_blank\"\u003eIsaac Newton\u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| Character-LLM-Caesar-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-caesar-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-caesar-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Julius_Caesar\" target=\"_blank\"\u003eJulius Caesar\u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| 
Character-LLM-Beethoven-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-beethoven-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-beethoven-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Ludwig_van_Beethoven\" target=\"_blank\"\u003eLudwig van Beethoven\u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| Character-LLM-Socrates-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-socrates-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-socrates-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Socrates\" target=\"_blank\"\u003eSocrates\u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n| Character-LLM-Martin-7b | 🤗 \u003ca href=\"https://huggingface.co/fnlp/character-llm-martin-7b-wdiff\" target=\"_blank\"\u003echaracter-llm-martin-7b-wdiff\u003c/a\u003e | 🌐 \u003ca href=\"https://en.wikipedia.org/wiki/Martin_Luther_King\" target=\"_blank\"\u003eMartin Luther King\u003c/a\u003e\t| \u003ca href=\"https://github.com/facebookresearch/llama/tree/llama_v1\" target=\"_blank\"\u003eLlama 1  \u003c/a\u003e |\n\nDue to the license used by Llama 1, we release only the weight differences; you need to recover the full weights by running the following command.\n```bash\ncd FastChat\npython3 -m fastchat.model.apply_delta \\\n    --base-model-path /path/to/hf-model/llama-7b \\\n    --target-model-path /path/to/hf-model/character-llm-beethoven-7b \\\n    --delta-path fnlp/character-llm-beethoven-7b-wdiff\n```\n\nYou can then use the model as a chatbot with the meta prompt.\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\ntokenizer = AutoTokenizer.from_pretrained(\"/path/to/hf-model/character-llm-beethoven-7b\")\nmodel = 
AutoModelForCausalLM.from_pretrained(\"/path/to/hf-model/character-llm-beethoven-7b\").cuda()\n\nmeta_prompt = \"\"\"I want you to act like {character}. I want you to respond and answer like {character}, using the tone, manner and vocabulary {character} would use. You must know all of the knowledge of {character}. \n\nThe status of you is as follows:\nLocation: {loc_time}\nStatus: {status}\n\nThe interactions are as follows:\"\"\"\n\nname = \"Beethoven\"\nloc_time = \"Coffee Shop - Afternoon\"\nstatus = f'{name} is casually chatting with a man from the 21st century.'\nprompt = meta_prompt.format(character=name, loc_time=loc_time, status=status) + '\\n\\n'\n# move the inputs to the same device as the model before generating\ninputs = tokenizer([prompt], return_tensors=\"pt\").to(model.device)\noutputs = model.generate(**inputs, do_sample=True, temperature=0.5, top_p=0.95, max_new_tokens=50)\nresponse = tokenizer.decode(outputs[0], skip_special_tokens=True)\nprint(response)\n```\n\n### Training Datasets\n\nTraining datasets can be downloaded at 🤗 \u003ca href=\"https://huggingface.co/datasets/fnlp/character-llm-data\" target=\"_blank\"\u003ethis link\u003c/a\u003e, which contains the experience data of the nine characters used to train Character-LLMs.\nTo download the dataset, run the following Python code; the downloaded data will be in `/path/to/local_dir`.\n```python\nfrom huggingface_hub import snapshot_download\nsnapshot_download(\n    local_dir_use_symlinks=True, \n    repo_type=\"dataset\",\n    repo_id=\"fnlp/character-llm-data\", \n    local_dir=\"/path/to/local_dir\")\n```\n\nThe `prompted/` directory contains datasets that can be used for supervised fine-tuning directly. 
The `generated/` directory consists of raw data generated by gpt-3.5-turbo, which can be converted into the `prompted` style.\nHere are the statistics of the training data.\n\n|                      | # Scenes | # Words | # Turns |\n|----------------------|---------|--------|--------|\n| Cleopatra VII        | 1.4K    | 723K   | 14.3   |\n| Lord Voldemort       | 1.4K    | 599K   | 13.1   |\n| Spartacus            | 1.4K    | 646K   | 12.3   |\n| Hermione Granger     | 1.5K    | 628K   | 15.5   |\n| Isaac Newton         | 1.6K    | 772K   | 12.6   |\n| Julius Caesar        | 1.6K    | 820K   | 12.9   |\n| Ludwig van Beethoven | 1.6K    | 663K   | 12.2   |\n| Socrates             | 1.6K    | 896K   | 14.1   |\n| Martin Luther King   | 2.2K    | 1,038K | 12.0   |\n| Avg.                 | 1.6K    | 754K   | 13.2   |\n\n\n## Character Creation\n\n### Dataset\n**1) Profile Construction:** Choose one character (e.g. Beethoven) and obtain a profile for the character, consisting of paragraphs separated by `\\n\\n`. You can refer to the data format of `data/seed_data/profiles/wiki_Beethoven.txt`.\n\n**2) Scene Extraction:** Add your API keys to `apikeys.py`, and use an LLM (gpt-3.5-turbo) to generate scenes based on the profile. Then you can parse the generated results into scene data.\n```bash\npython run_api_gen_data.py --prompt_name gen_scene --character Beethoven\npython parser/parse_data_scene.py result/2023-10-08/gen_scene/gpt-3.5-turbo-temp-0.2-char-Beethoven.jsonl\n```\n**Note:** The data generation code supports recovery from failure. You can re-run it multiple times to ensure sufficient samples are generated.\n\n**3) Experience Completion:** Prompt an LLM (gpt-3.5-turbo) to generate interactions of different characters given the scenes. 
Then you can parse the results into experience data.\n\n```bash\npython run_api_gen_data.py --prompt_name gen_dialogue --character Beethoven --data_path processed/2023-10-08/\npython parser/parse_data_dialogue.py result/2023-10-08/gen_dialogue/gpt-3.5-turbo-temp-0.2-char-Beethoven.jsonl\n```\n\n**4) Protective Scene:** Prompt an LLM (gpt-3.5-turbo) to generate interactions for protective scenes, which help to reduce Character Hallucination.\n```bash\npython run_api_gen_data.py --prompt_name gen_hallucination --character Beethoven --data_path processed/2023-10-08/\npython parser/parse_data_hallucination.py result/2023-10-08/gen_hallucination/gpt-3.5-turbo-temp-0.2-char-Beethoven.jsonl\n```\n\n**5) Convert to Training Format:** Run the following script to obtain the training data for SFT.\n```bash\npython parser/convert_prompt_data.py processed/2023-10-08/generated_agent_dialogue_Hermione.json\n```\n\n\n### Training\nThe training is based on `FastChat` with minor bug fixes. You may need to install some third-party packages to run this code.\n\nYou need to prepare the base model (e.g. llama-7b, llama2-7b or another model you like) and run the following training script with the corresponding hyper-parameters to train Character-LLM.\nTraining should take 30-45 minutes on 8 A100 GPUs. 
Once the model is trained, you can load it with `from_pretrained` and use it as in the example above.\n\n```bash\ncd FastChat\nexport CHARACTER=Beethoven\ntorchrun --nproc_per_node=8 --master_port=20031 fastchat/train/train_mem.py \\\n    --model_name_or_path /path/hf_model/llama-7b  \\\n    --data_path /path/to/prompted_agent_dialogue_$CHARACTER.json \\\n    --already_preprocess True \\\n    --bf16 True \\\n    --output_dir /path/to/ckpt/${CHARACTER}_7b \\\n    --num_train_epochs 10 \\\n    --per_device_train_batch_size 2 \\\n    --per_device_eval_batch_size 16 \\\n    --gradient_accumulation_steps 4 \\\n    --evaluation_strategy epoch \\\n    --save_strategy epoch \\\n    --save_total_limit 10 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0.1 \\\n    --warmup_ratio 0.04 \\\n    --lr_scheduler_type cosine \\\n    --logging_steps 1 \\\n    --fsdp 'full_shard auto_wrap' \\\n    --fsdp_transformer_layer_cls_to_wrap LlamaDecoderLayer \\\n    --tf32 True \\\n    --model_max_length 2048 \\\n    --gradient_checkpointing True\n```\n\n### Inference\nThe inference also requires `FastChat`. 
You can start the model inference server with the following commands:\n```bash\ncd FastChat\n\n# start the controller\nexport IP=$(hostname -i)\npython3 -m fastchat.serve.controller --host $IP \u0026\n\n# start the OpenAI-format API server (in the background, so the worker can be started from the same shell)\npython3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 28001 --controller-address http://$IP:21001 \u0026\n\n# start the model worker\nexport MODEL_PATH=/path/to/ckpt/${CHARACTER}_7b/\nexport MODEL_NAME=${CHARACTER}_7b\nCUDA_VISIBLE_DEVICES=0 python3 -m fastchat.serve.model_worker --model-path $MODEL_PATH --model-names $MODEL_NAME --controller-address http://$IP:21001 --host $IP --port 21009 --worker-address http://$IP:21009\n```\n\nYou can connect multiple model workers to the controller to speed up inference.\nThen run single-turn and multi-turn interviews with the following code.\n\n#### Single-Turn Interview\n```bash\npython run_api_interview_single.py\n```\n\n#### Multi-Turn Interview\n```bash\npython run_api_interview_turns.py sft\n```\n\nFor generated samples of Character-LLM and other baselines, please check `data/gen_results`, in which `interview_single` stores the single-turn interviews of different models and `interview_turns` stores the multi-turn interview results.\n\n## Generated Samples Demonstration 📝\n\n\u003cp align=\"center\"\u003e\n    Single-turn interview outputs from different methods simulating Beethoven.\n    \u003cimg src=\"./images/result1.png\" width=\"95%\"\u003e \u003cbr\u003e\n    \u003cbr\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    Multi-turn interview outputs from our trainable agent of Cleopatra VII.\n    \u003cimg src=\"./images/result2.png\" width=\"95%\"\u003e \u003cbr\u003e\n    \u003cbr\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    Multi-turn interview outputs from our trainable agent of Socrates.\n    \u003cimg src=\"./images/result3.png\" width=\"95%\"\u003e \u003cbr\u003e\n    \u003cbr\u003e\n\u003c/p\u003e\n\n\n\n## Citation 📖\n\nPlease cite our 
work if you find the resources in this repository useful:\n```bib\n@inproceedings{shao2023character,\n    title={Character-LLM: A Trainable Agent for Role-Playing},\n    author={Yunfan Shao and Linyang Li and Junqi Dai and Xipeng Qiu},\n    booktitle={EMNLP},\n    year=2023\n}\n```\n\n## Acknowledgements 🥰\n- We especially thank Ming Zhong for the helpful proofreading and suggestions on the paper.\n- This work was supported by the National Key Research and Development Program of China (No. 2022ZD0160102) and the National Natural Science Foundation of China (No. 62022027). \n\n\n## Limitations ❗\nThe resources associated with this project, including generated data, code and models, are restricted to academic research purposes only and cannot be used for commercial purposes. The contents produced by Character-LLMs are influenced by uncontrollable variables such as randomness, and therefore the accuracy and quality of the output cannot be guaranteed by this project. The authors of this project are not responsible for any potential consequences caused by the use of the resources in this project. \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchoosewhatulike%2Ftrainable-agents","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchoosewhatulike%2Ftrainable-agents","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchoosewhatulike%2Ftrainable-agents/lists"}