{"id":15672683,"url":"https://github.com/weimin17/multimodal_transformer","last_synced_at":"2025-05-06T22:13:40.822Z","repository":{"id":192717230,"uuid":"468087758","full_name":"weimin17/Multimodal_Transformer","owner":"weimin17","description":"A Multimodal Transformer: Fusing Clinical Notes With Structured EHR Data for Interpretable In-Hospital Mortality Prediction","archived":false,"fork":false,"pushed_at":"2022-07-26T03:25:48.000Z","size":11386,"stargazers_count":31,"open_issues_count":1,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-05-06T22:13:20.502Z","etag":null,"topics":["clinical-notes","clinical-variables","ehr","interpretability","interpretable-ai","mortality","mortality-prediction","multimodal","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/weimin17.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2022-03-09T20:49:13.000Z","updated_at":"2025-03-14T02:03:28.000Z","dependencies_parsed_at":"2023-09-05T22:03:32.749Z","dependency_job_id":null,"html_url":"https://github.com/weimin17/Multimodal_Transformer","commit_stats":null,"previous_names":["weimin17/multimodal_transformer"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weimin17%2FMultimodal_Transformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weimin17%2FMultimodal_Transformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weimin17%2FMultimodal_Transformer/releases","manifests_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weimin17%2FMultimodal_Transformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/weimin17","download_url":"https://codeload.github.com/weimin17/Multimodal_Transformer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252776600,"owners_count":21802469,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clinical-notes","clinical-variables","ehr","interpretability","interpretable-ai","mortality","mortality-prediction","multimodal","transformer"],"created_at":"2024-10-03T15:30:08.450Z","updated_at":"2025-05-06T22:13:40.400Z","avatar_url":"https://github.com/weimin17.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Multimodal Transformer\nThis is the repository for the paper \"A Multimodal Transformer: Fusing Clinical Notes With Structured EHR Data for Interpretable In-Hospital Mortality Prediction\", submitted to the AMIA'22 Annual Symposium.\n\n# Setup\nThe code was tested on CUDA 11.4 with a GPU with 24 GB of memory. For environment setup, please follow the installation instructions in the section 'Clinical Data Processing'. \n\n# Clinical Data Processing\n## Structured Clinical Variables Processing\nClone https://github.com/YerevaNN/mimic3-benchmarks (Harutyunyan et al.) into the 'Multimodal_Transformer/mimic3-benchmarks' folder. 
Set up the environment and run all data generation steps to generate the training data without text features.\n\nCreate a folder 'data-mimic3' under the 'Multimodal_Transformer' folder; all processed MIMIC-III data will be stored in the 'data-mimic3' folder.\n\n## Unstructured Clinical Notes Processing\nClinical notes processing is based on the repository at https://github.com/kaggarwal/ClinicalNotesICU. \n\n### Requirements\nSet up the environment for notes processing and model training. Install the dependencies:\n\n~~~~\npip install -r requrements.txt\n~~~~\n\nUpdate all paths and configuration in 'mmtransformer/config.py'. \n\n\n### Notes Processing\n\n+ Run 'mmtransformer/scripts/extract_notes.py'; the folders 'data-mimic3/root/test_text_fixed/' and 'data-mimic3/root/text_fixed/' will be generated.\n+ Run 'mmtransformer/scripts/extract_T0.py'.\n\n# Train and Test\n\nOur trained model can be downloaded from [GoogleDrive](https://drive.google.com/file/d/1Wch0pEgQ8PeWE9p77B6rdNuo9l28CZNv/view?usp=sharing). Unzip the file and put its contents in './Multimodal_Transformer/mmtransformer/models/Checkpoints' and './Multimodal_Transformer/mmtransformer/models/Data' accordingly. 
Alternatively, you can generate the files yourself.\n\n## Test\n\nFor the model with only clinical notes (mbert), run\n\n~~~~\npython mbert.py --gpu_id 1\n~~~~\n\nFor the multimodal transformer, run\n\n~~~~\npython IHM_mmtransformer.py --mode test --model_type both --model_name BioBert --TSModel Transformer --checkpoint_path Multimodal_Transformer --MaxLen 512 --NumOfNotes 0 --TextModelCheckpoint BioClinicalBERT_FT --freeze_model 1 --number_epoch 5 --batch_size 5 --load_model 1 --gpu_id 1\n~~~~\n\n## Train\n\nFor multimodal transformer training, run\n\n~~~~\npython IHM_mmtransformer.py --mode train --model_type both --model_name BioBert --TSModel Transformer --checkpoint_path Multimodal_Transformer --MaxLen 512 --NumOfNotes 0 --TextModelCheckpoint BioClinicalBERT_FT --freeze_model 1 --number_epoch 5 --batch_size 5 --load_model 0 --gpu_id 1\n~~~~\n\n\n# Visualization\nThe outputs of all analyses are in the 'Analysis' folder. For analysis and visualization of important clinical words in clinical notes:\n\n1. Run 'notes_analysis.py' to get the IG values with their associated words, stored in the file 'Analysis/bert_analysis_pred_all2.pkl'.\n\n2. Run 'notes_analysis3.py' to get the word list with frequencies, stored in 'pred_tokenlist_top10_l0_2.txt'. We further filtered the list to remove irrelevant words and tokens; the filtered list is stored in 'filter_pred_tokenlist_top10_l0_2.txt'.\n\nIt will also generate the word cloud 'filter_pred_tokenlist_top10_l0_2.png'.\n\n\n# Credits\nThe code is based on the repository by Khadanga et al. at https://github.com/kaggarwal/ClinicalNotesICU, and on the repository by Deznabi et al. at https://github.com/Information-Fusion-Lab-Umass/ClinicalNotes_TimeSeries for the experimental setup.\n\n\nThe MIMIC-III clinical variables pre-processing is cloned from the repository by Harutyunyan et al. 
given in https://github.com/YerevaNN/mimic3-benchmarks\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweimin17%2Fmultimodal_transformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fweimin17%2Fmultimodal_transformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweimin17%2Fmultimodal_transformer/lists"}