{"id":46843910,"url":"https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code","last_synced_at":"2026-03-10T14:13:05.144Z","repository":{"id":339721898,"uuid":"1163012422","full_name":"fairdataihub/nih-dmp-llm-evaluation-paper-code","owner":"fairdataihub","description":null,"archived":false,"fork":false,"pushed_at":"2026-02-21T06:36:31.000Z","size":355,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-21T13:27:35.366Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fairdataihub.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-21T00:57:30.000Z","updated_at":"2026-02-21T06:36:34.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code","commit_stats":null,"previous_names":["fairdataihub/nih-dmp-llm-evaluation-paper-code"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/fairdataihub/nih-dmp-llm-evaluation-paper-code","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2Fnih-dmp-llm-evaluation-paper-code","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2Fnih-dmp-llm-evaluation-paper-code/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/f
airdataihub%2Fnih-dmp-llm-evaluation-paper-code/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2Fnih-dmp-llm-evaluation-paper-code/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fairdataihub","download_url":"https://codeload.github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2Fnih-dmp-llm-evaluation-paper-code/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30336298,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-10T12:41:07.687Z","status":"ssl_error","status_checked_at":"2026-03-10T12:41:06.728Z","response_time":106,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-03-10T14:13:04.535Z","updated_at":"2026-03-10T14:13:05.108Z","avatar_url":"https://github.com/fairdataihub.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Code: NIH DMPs LLM Evaluation Paper\n\n## Overview\n\nThis repository contains the code associated with the paper, “Evaluating the Performance of LLMs in Drafting NIH Data Management Plans.” In the paper, we evaluated the performance of Llama 3.3 and GPT 4.1 in drafting NIH-compliant Data Management Plans (DMPs) using two complementary approaches: automatic 
reference-based evaluation and human expert evaluation.\n\nThe repository includes the complete automated and human evaluation workflows. Please refer to the project **[inventory](https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-inventory)** for all related resources, including the paper.\n\n---\n\n## Standards followed\n\nThe overall codebase is organized in alignment with the **[FAIR-BioRS guidelines](https://fair-biors.org/)**. All Python code follows **[PEP 8](https://peps.python.org/pep-0008/)** conventions, including consistent formatting, inline comments, and docstrings. Project dependencies are fully captured in **[requirements.txt](https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code/blob/main/requirements.txt)**.\n\n---\n\n## Getting Started\n\n### Step 1 — Clone the repository\n\n```bash\ngit clone https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code.git\ncd nih-dmp-llm-evaluation-paper-code\ncode .\n```\n\n### Step 2 — Create and activate a virtual environment\n\n**Windows (cmd):**\n\n```bash\npython -m venv venv\nvenv\\Scripts\\activate.bat\n```\n\n**macOS/Linux:**\n\n```bash\npython -m venv venv\nsource venv/bin/activate\n```\n\n### Step 3 — Install dependencies\n\n```bash\npip install -r requirements.txt\n```\n\n---\n\n### Step 4 — Run the notebook for your evaluation approach\n\nThis repository supports two complementary evaluation workflows. Use the notebook that matches the evaluation approach you want to run.\n\n#### Automatic reference-based evaluation\n\nUse: [`Automated-evaluation.ipynb`](https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code/blob/main/Automated-evaluation.ipynb)\n\n#### Human expert evaluation\n\nUse: [`Human-evaluation.ipynb`](https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code/blob/main/Human-evaluation.ipynb)\n\n---\n\n## Inputs and Outputs\n\nThe Jupyter notebooks use files from the dataset associated with the paper. 
You will need to download the dataset and add it to the input folder (name the dataset folder 'dataset'). Please refer to the project **[inventory](https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-inventory)** for a link to the dataset.\n\nAll outputs from both evaluation pipelines (tables and figures) are saved under the `outputs/` directory.\n\n---\n\n## License\n\nThis work is licensed under the **[MIT License](https://opensource.org/license/mit/)**. See **[LICENSE](LICENSE.txt)** for more information.\n\n---\n\n## Feedback and contribution\n\nUse **[GitHub Issues](https://github.com/fairdataihub/nih-dmp-llm-evaluation-paper-code/issues)** to submit feedback, report problems, or suggest improvements. You can also **fork** the repository and submit a **Pull Request** with your changes.\n\n---\n\n## How to cite\n\nIf you use this code, please cite this repository following the instructions in the [CITATION.cff](CITATION.cff) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffairdataihub%2Fnih-dmp-llm-evaluation-paper-code","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffairdataihub%2Fnih-dmp-llm-evaluation-paper-code","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffairdataihub%2Fnih-dmp-llm-evaluation-paper-code/lists"}