{"id":15556457,"url":"https://github.com/gforge/medbench-dataprep","last_synced_at":"2025-09-05T19:34:53.192Z","repository":{"id":251222676,"uuid":"836760107","full_name":"gforge/MedBench-DataPrep","owner":"gforge","description":"Preparation of data for the MedBench project","archived":false,"fork":false,"pushed_at":"2024-10-11T20:29:35.000Z","size":102,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-03T13:15:28.111Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gforge.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-01T13:56:06.000Z","updated_at":"2024-10-11T20:29:38.000Z","dependencies_parsed_at":"2024-08-27T00:58:22.890Z","dependency_job_id":"04cec259-21c9-487f-95a5-ada3b976f4ed","html_url":"https://github.com/gforge/MedBench-DataPrep","commit_stats":{"total_commits":23,"total_committers":1,"mean_commits":23.0,"dds":0.0,"last_synced_commit":"aa25f0d070e397f4054ebd41773fa186e89a181b"},"previous_names":["gforge/medbench-dataprep"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gforge%2FMedBench-DataPrep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gforge%2FMedBench-DataPrep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gforge%2FMedBench-DataPrep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gforge%2FMedBench-DataPrep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gforge","download_url":"https://codeload.github.com/gforge/MedBench-DataPrep/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246132086,"owners_count":20728428,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-02T15:13:56.122Z","updated_at":"2025-03-29T03:22:32.883Z","avatar_url":"https://github.com/gforge.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MedBench-DataPrep\n\n## Overview\n\nWelcome to the MedBench-DataPrep repository. This repository contains R functions necessary for generating data for the MedBench research study. The study aims to create a benchmark dataset of fictional Electronic Health Records (EHRs) to evaluate and improve the performance of Large Language Models (LLMs) in medical documentation tasks.\n\n## Study Objectives\n\n- **Dataset Creation**: Develop a comprehensive dataset of fictional EHRs, covering a wide range of medical scenarios.\n- **Benchmarking**: Establish benchmarks to assess LLM performance in handling medical jargon, demographic diversity, and factual inconsistencies.\n- **Evaluation**: Implement methods to quantitatively and qualitatively evaluate the generated medical documentation.\n\n## Methods\n\n- **EHR Structure**: Each case includes core medical notes and related lab and medication data.\n- **Dataset Composition**: The dataset will reflect standard medical scenarios and include variations to test LLM capabilities under different conditions.\n- **Prompt Methodology**: Explore advanced prompting techniques to enhance LLM performance in generating medical documentation.\n\n## Contributing\n\nWe welcome contributions from the community. Please refer to our contribution guidelines for more information.\n\n## License\n\nThis project is licensed under the MIT License. See the LICENSE file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgforge%2Fmedbench-dataprep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgforge%2Fmedbench-dataprep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgforge%2Fmedbench-dataprep/lists"}