{"id":21015613,"url":"https://github.com/gersteinlab/step-back-profiling","last_synced_at":"2025-05-15T05:32:15.743Z","repository":{"id":245219517,"uuid":"803985307","full_name":"gersteinlab/step-back-profiling","owner":"gersteinlab","description":null,"archived":false,"fork":false,"pushed_at":"2024-08-02T06:07:43.000Z","size":2528,"stargazers_count":4,"open_issues_count":1,"forks_count":0,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-04-03T04:41:22.805Z","etag":null,"topics":["llms","personalized-generation"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gersteinlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-21T18:28:33.000Z","updated_at":"2025-02-10T02:58:15.000Z","dependencies_parsed_at":"2024-06-20T16:21:01.597Z","dependency_job_id":"ff6bb4c0-911c-4304-b2e6-41ad82bea4cb","html_url":"https://github.com/gersteinlab/step-back-profiling","commit_stats":null,"previous_names":["gersteinlab/step-back-profiling"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Fstep-back-profiling","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Fstep-back-profiling/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Fstep-back-profiling/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gersteinlab%2Fstep-back-profiling/manifests","owne
r_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gersteinlab","download_url":"https://codeload.github.com/gersteinlab/step-back-profiling/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254282237,"owners_count":22045123,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llms","personalized-generation"],"created_at":"2024-11-19T10:10:34.784Z","updated_at":"2025-05-15T05:32:15.164Z","avatar_url":"https://github.com/gersteinlab.png","language":"Jupyter Notebook","readme":"# STEP-BACK PROFILING: Distilling User History for Personalized Scientific Writing\r\n\r\n![GitHub](https://img.shields.io/github/license/gersteinlab/step-back-profiling)\r\n![GitHub repo size](https://img.shields.io/github/repo-size/gersteinlab/step-back-profiling)\r\n![GitHub last commit](https://img.shields.io/github/last-commit/gersteinlab/step-back-profiling)\r\n\r\nThis repository contains the code and dataset for the paper \"STEP-BACK PROFILING: Distilling User History for Personalized Scientific Writing\".\r\n\r\n![Overview](assets/step-back.png)\r\n\r\n## Table of Contents\r\n- [Dataset Generation](#dataset-generation)\r\n- [PSW Results Generation](#psw-results-generation)\r\n- [LaMP Results Generation](#lamp-results-generation)\r\n- [Citation](#citation)\r\n- [License](#license)\r\n\r\n## Dataset Generation\r\n\r\nThe dataset generation process involves the following steps:\r\n\r\n0. Download the raw data: [s2orc_4000.json](https://drive.google.com/file/d/1s9DCidREnhLLCLfTqYu0WTNd808XgCyz/view?usp=sharing)\r\n\r\n1. 
Get the sampled author list and paper list in JSON format:\r\n   - `dataset/data_construction.ipynb`\r\n\r\n2. Extract authors' research interests:\r\n   - `dataset/s2orc-rq.ipynb`\r\n\r\n3. Extract research questions from papers:\r\n   - `dataset/research_question_extraction.ipynb`\r\n\r\n## PSW Results Generation\r\n\r\nTo generate results for each task, follow these steps:\r\n\r\n1. Get the user profile:\r\n   - `psw_result/author_profiling_cot.ipynb`\r\n\r\n2. Generate a title for a single author:\r\n   - `psw_result/single_agent_title_generation.ipynb`\r\n\r\n3. Generate results for multiple authors \u0026 run the evaluation for each task:\r\n   - `psw_result/task1_solving.ipynb`\r\n   - `psw_result/task2_solving.ipynb`\r\n   - `psw_result/task3_solving.ipynb`\r\n   - `psw_result/task4_solving.ipynb`\r\n\r\n## LaMP Results Generation\r\n\r\n![LaMP](assets/LaMP_radar_chart.png)\r\n\r\nThe `lamp_result/` directory contains the following notebooks:\r\n\r\n- `lamp_result/cot_generation.ipynb`\r\n- `lamp_result/final_output_generation.ipynb`\r\n- `lamp_result/user_profile_generation.ipynb`\r\n\r\nThese notebooks generate user profiles and final outputs for the LaMP dataset.\r\n\r\n## Citation\r\n\r\nIf you find this work useful, please cite our paper:\r\n\r\n```\r\n@misc{tang2024stepback,\r\n      title={Step-Back Profiling: Distilling User History for Personalized Scientific Writing}, \r\n      author={Xiangru Tang and Xingyao Zhang and Yanjun Shao and Jie Wu and Yilun Zhao and Arman Cohan and Ming Gong and Dongmei Zhang and Mark Gerstein},\r\n      year={2024},\r\n      eprint={2406.14275},\r\n      archivePrefix={arXiv},\r\n      primaryClass={cs.CL},\r\n}\r\n```\r\n\r\n## License\r\n\r\nThis project is licensed under the MIT License. 
See the [LICENSE](LICENSE) file for more details.\r\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgersteinlab%2Fstep-back-profiling","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgersteinlab%2Fstep-back-profiling","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgersteinlab%2Fstep-back-profiling/lists"}