{"id":13791351,"url":"https://chats-lab.github.io/KokoMind/","last_synced_at":"2025-05-12T10:31:43.121Z","repository":{"id":177712362,"uuid":"660793223","full_name":"CHATS-lab/KokoMind","owner":"CHATS-lab","description":" KokoMind: Can LLMs Understand Social Interactions?","archived":false,"fork":false,"pushed_at":"2023-10-03T22:45:21.000Z","size":233935,"stargazers_count":106,"open_issues_count":3,"forks_count":7,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-12T10:09:57.721Z","etag":null,"topics":["chatgpt","deep-learning","gpt-4","language-model","neural-network","nlp"],"latest_commit_sha":null,"homepage":"https://chats-lab.github.io/KokoMind/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CHATS-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-06-30T21:35:21.000Z","updated_at":"2025-02-28T01:14:15.000Z","dependencies_parsed_at":null,"dependency_job_id":"c0027ea9-938a-44fd-a8c9-ee9955880f5b","html_url":"https://github.com/CHATS-lab/KokoMind","commit_stats":null,"previous_names":["chats-nlp/kokomind"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CHATS-lab%2FKokoMind","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CHATS-lab%2FKokoMind/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CHATS-lab%2FKokoMind/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CHATS-lab%2F
KokoMind/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CHATS-lab","download_url":"https://codeload.github.com/CHATS-lab/KokoMind/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253719955,"owners_count":21952930,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatgpt","deep-learning","gpt-4","language-model","neural-network","nlp"],"created_at":"2024-08-03T22:00:59.181Z","updated_at":"2025-05-12T10:31:39.260Z","avatar_url":"https://github.com/CHATS-lab.png","language":"JavaScript","funding_links":[],"categories":["Perspectives"],"sub_categories":["Multi-Agent Simulation Projects"],"readme":"# KokoMind \n\n[![License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/CHATS-lab/KokoMind/blob/main/LICENSE)\n[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/release/python-390/)\n\nThis is the repo for **KokoMind**, a dataset with multi-party social interactions to evaluate LLMs' social understanding abilities. 
The repo contains:\n\n- The evaluation [data](https://github.com/CHATS-lab/KokoMind/tree/main/data) of social interactions.\n- The [code](https://github.com/CHATS-lab/KokoMind/tree/main/eval) for model evaluation.\n- Check out the [blog post of KokoMind](https://chats-lab.github.io/KokoMind) to see some demos.\n\n\u003c!-- [[Project Page](https://chats-lab.github.io/KokoMind/)] [Paper] --\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./website/img/gorilla.png\" width=\"15%\"\u003e \u003cbr\u003e\n  Logo of \u003cb\u003eKokoMind\u003c/b\u003e.\n\u003c/p\u003e\n\n## News\n\n- **[2023.07.05]** **KokoMind** is released at https://chats-lab.github.io/KokoMind/.\n\n## Demo\nhttps://github.com/CHATS-lab/KokoMind/assets/13882237/731427bf-0d3c-4870-b36e-e146f954309b\n\n## Dataset\n\n**KokoMind** contains 150 complex multi-party social interactions (50 per source) with free-text questions and answers. To ensure diversity and scalability and avoid data contamination, all the social interactions, questions, and answers are generated by GPT-4 and verified by human experts later. These generations are based on three different sources:\n\n- 🤖 GPT-4-only: This subset is created solely by GPT-4 through prompting, without grounding on existing sources.\n- 🎦 Movie-based: To avoid data contamination, this portion of the data is grounded on diverse scenarios pulled from movies released after 2022. GPT-4 shapes these situations, maintaining the core essence while adding its own elements.\n- 🧠 ToMi-based: This segment contains data backboned by a simulated dataset, ToMi, which involves moving physical objects to different places, a classic test for theory of mind. 
These social interactions are again embellished and expanded by GPT-4.\n\nFor each social interaction, we ask various questions designed to probe the following aspects of social understanding.\n\n- 🧠 Theory of Mind: Questions evaluating understanding of others' mental states and perspectives.\n- 👍 Social Norm: Questions aiming to discern societal values and norms within the situations.\n- 😃 Emotion Recognition: Questions targeted at identifying and understanding emotional elements within the context.\n- 👨‍👩‍👧 Social Relation: Queries focusing on interpersonal dynamics and relationships.\n- 🤔 Counterfactual Questions: Hypothetical queries designed to explore alternative outcomes or possibilities.\n- 📝 Social Advice: Questions eliciting advice or action recommendations relevant to the given situation.\n\n`question_nonverbal_yes_v0.1.jsonl` contains 770 samples in total. This [JSON Lines](https://jsonlines.org/) file is a list of dictionaries, each containing the following fields:\n\n- `question_id`: int, the unique ID of the question.\n- `text`: str, social interaction context and question.\n- `answer`: str, the GPT-4 answer, further verified by humans.\n- `source`: str, one of the three data sources: `gpt-4`, `movie`, `tomi`.\n- `category`: str, one of six question categories: `ToM`, `Social Norm`, `Emotion Recognition`, `Social Relation`, `Counterfactual`, `Social Advice`.\n\n`question_nonverbal_no_v0.1.jsonl` contains the same social interactions and questions, but with the non-verbal cues in parentheses (e.g., nervously sipping coffee) removed from the context.\n\n## Evaluation\n\n### Pre-requisite\n\n```bash\npip install -r requirements.txt\nexport OPENAI_API_KEY=\u003cyour_api_key\u003e\nexport ANTHROPIC_API_KEY=\u003cyour_api_key\u003e\n```\n\n### Generate model answers\n\n``` bash\n# Generate local model answers\n# Use vicuna-7b as an example\npython eval/get_model_answer.py --model-path ${PATH_TO_LOCAL_HF_MODEL} --model-id 
vicuna-7b --question-file data/question_nonverbal_yes_v0.1.jsonl --answer-file data/answer/answer_vicuna-7b.jsonl --num-gpus 8\n\n# GPT-3 answer (reference model by alpaca-eval)\npython eval/qa_baseline_gpt3.py -q data/question_nonverbal_yes_v0.1.jsonl -o data/answer/answer_gpt3.jsonl\n\n# GPT-3.5 answer\npython eval/qa_baseline_gpt35.py -q data/question_nonverbal_yes_v0.1.jsonl -o data/answer/answer_gpt35.jsonl\n\n# GPT-4.0 answer\npython eval/qa_baseline_gpt4.py -q data/question_nonverbal_yes_v0.1.jsonl -o data/answer/answer_gpt4.jsonl\n\n# Claude answer\npython eval/qa_baseline_claude.py -q data/question_nonverbal_yes_v0.1.jsonl -o data/answer/answer_claude.jsonl\n```\n\n### Run evaluation\n\nOur evaluation is based on [Alpaca-Eval](https://github.com/tatsu-lab/alpaca_eval).\n\n```bash\n# Convert to alpaca_eval input format\npython eval/generate_alpaca_eval.py -q data/question_nonverbal_yes_v0.1.jsonl -a data/answer/answer_gpt3.jsonl -o data/alpaca_eval/answer_gpt3.json\n\nalpaca_eval make_leaderboard --leaderboard_path data/alpaca_results/leaderboard.csv --all_model_outputs \"./data/alpaca_eval/answer_*\" --reference_outputs data/alpaca_eval/answer_gpt3.json --is_overwrite_leaderboard True\n```\n\n## License\n\nThis project is an early-stage research showcase, designed solely for non-commercial purposes. It adheres to [OpenAI's data usage terms](https://openai.com/policies/terms-of-use), and [ShareGPT's privacy practices](https://chrome.google.com/webstore/detail/sharegpt-share-your-chatg/daiacboceoaocpibfodeljbdfacokfjb). Let us know if you spot any potential violations. 
The software's code is available under the Apache License 2.0.\n\n## Acknowledgement\n\nWe would like to thank [Yejin Choi](https://homes.cs.washington.edu/~yejin/) from UW, [Louis-Philippe Morency](https://www.cs.cmu.edu/~morency/) from CMU, [Jason Weston](https://scholar.google.com/citations?user=lMkTx0EAAAAJ\u0026hl=en) from Meta, and [Diyi Yang](https://cs.stanford.edu/~diyiy/) from Stanford for their enlightening dialogues and constructive inputs. The theoretical foundation of KokoMind is based on Liang's PhD research with [Song-Chun Zhu](https://zhusongchun.net/) from Peking University, Tsinghua University and Beijing Institute for General Artificial Intelligence (BIGAI) and [Ying Nian Wu](https://scholar.google.com/citations?user=7k_1QFIAAAAJ\u0026hl=en) from UCLA.\n\n## Citation\n\nPlease cite our work if you find it useful.\n\n``` bib\n@misc{Shi_KokoMind_Can_Large_2023,\n  author = {Shi, Weiyan and Qiu, Liang and Xu, Dehong and Sui, Pengwei and Lu, Pan and Yu, Zhou},\n  title = {{KokoMind: Can Large Language Models Understand Social Interactions?}},\n  month = jul,\n  year = {2023},\n  url = {https://chats-lab.github.io/KokoMind/}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/chats-lab.github.io%2FKokoMind%2F","html_url":"https://awesome.ecosyste.ms/projects/chats-lab.github.io%2FKokoMind%2F","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/chats-lab.github.io%2FKokoMind%2F/lists"}