{"id":14959022,"url":"https://github.com/wangleihitcs/medicalreportgeneration","last_synced_at":"2025-05-02T13:30:28.273Z","repository":{"id":201599846,"uuid":"152841428","full_name":"wangleihitcs/MedicalReportGeneration","owner":"wangleihitcs","description":"A Base Tensorflow Project for Medical Report Generation","archived":false,"fork":false,"pushed_at":"2019-06-16T07:16:44.000Z","size":73126,"stargazers_count":71,"open_issues_count":6,"forks_count":18,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-07T02:08:01.105Z","etag":null,"topics":["captioning","medical-report-generate","tensorflow-models"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wangleihitcs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-10-13T06:12:54.000Z","updated_at":"2025-03-23T11:03:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"5eadc9fa-768b-4963-8215-de82d094820d","html_url":"https://github.com/wangleihitcs/MedicalReportGeneration","commit_stats":null,"previous_names":["wangleihitcs/medicalreportgeneration"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangleihitcs%2FMedicalReportGeneration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangleihitcs%2FMedicalReportGeneration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangleihitcs%2FMedicalReportGeneration/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangleihitcs%2FMedicalReportGeneration/mani
fests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wangleihitcs","download_url":"https://codeload.github.com/wangleihitcs/MedicalReportGeneration/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252045990,"owners_count":21685931,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["captioning","medical-report-generate","tensorflow-models"],"created_at":"2024-09-24T13:18:42.555Z","updated_at":"2025-05-02T13:30:23.772Z","avatar_url":"https://github.com/wangleihitcs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Medical Report Generation\nA baseline TensorFlow project for medical report generation.\n\n## Config\n- Python 2.7 / TensorFlow 1.8.0\n- Extra packages: nltk, json, PIL, numpy\n\n## Data Download\n- IU X-Ray Dataset\n    * The raw data comes from the [Open-i service of the National Library of Medicine](https://openi.nlm.nih.gov/), which hosts many public datasets.\n    * The processed data is available at [Medical-Report/NLMCXR_png_pairs.zip](https://pan.baidu.com/s/1CwChGVu6HWFDN2Xsy_htTA) (extraction code: vdb6); unzip it to the directory 'data/NLMCXR_png_pairs/'.\n- Pretrained InceptionV3 model\n    * The raw model comes from the [TensorFlow-Slim Image Classification Model Library](https://github.com/tensorflow/models/tree/master/research/slim).\n    * The processed model is available at [Medical-Report/pretrain_model.zip](https://pan.baidu.com/s/1CwChGVu6HWFDN2Xsy_htTA) (extraction code: vdb6); unzip it to the directory 'data/pretrain_model/'.\n\n## Train\n#### First, get the post-processed data (I have already done this)\n- get 
'data/data_entry.json', which contains the report sentences.\n- get 'data/train_split.json' and 'data/test_split.json', which contain the ids for train/val/test.\n- get 'data/vocabulary.json', which is the vocabulary extracted from the reports.\n\n#### Second, get TFRecord files\n- get 'data/train.tfrecord' and 'data/test.tfrecord'\n    ```shell\n    $ python datasets.py\n    ```\n    Note: once you have the tfrecord files, comment out the code for the function 'get_train_tfrecord()'.\n#### Third, train\n- You can train directly.\n    ```shell\n    $ python train.py\n    ```\n- You can monitor the training process.\n    ```shell\n    $ cd ./data\n    $ tensorboard --logdir='summary'\n    ```\n\n## Demo\n- You can test with two chest X-ray images (one frontal, one lateral).\n    ```shell\n    $ python demo.py --img_frontal_path='./data/experiments/CXR1900_IM-0584-1001.png' --img_lateral_path='./data/experiments/CXR1900_IM-0584-2001.png' --model_path='./data/model/my-test-2500'\n    ```\n- Example\n\n    ![example2](data/experiments/CXR1900_IM-0584-1001.png)\n    \n    ```shell\n    $ The generate report:\n         no acute cardiopulmonary abnormality\n         the lungs are clear\n         there is no focal consolidation\n         there is no focal consolidation\n         there is no pneumothorax or pneumothorax\n    ```\n\n## Framework\n#### Core Framework\n![example](data/experiments/framework.png)\n\nReference: Yuan Xue et al., **Multimodal Recurrent Model with Attention for Automated Radiology Report Generation**, MICCAI 2018\n\n## Experiments\n#### Metrics Results\n|  | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | ROUGE | CIDEr |\n| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| CNN-RNN\u003csup\u003e[10]\u003c/sup\u003e | 0.3087 | 0.2018 | 0.1400 | 0.0986 | 0.1528 | 0.3208 | 0.3068 |\n| CNN-RNN-Att\u003csup\u003e[11]\u003c/sup\u003e | 0.3274 | 0.2155 | 0.11478 | 0.1036 | 0.1571 | 0.3184 | 0.3649 |\n| Hier-RNN\u003csup\u003e[9]\u003c/sup\u003e | 0.3426 | 0.2318 | 0.1602 | 0.1121 | 0.1583 | 0.3343 | 0.2755 |\n| 
MRNA\u003csup\u003e[6]\u003c/sup\u003e | 0.3721 | 0.2445 | 0.1729 | 0.1234 | 0.1647 | 0.3224 | 0.3054 |\n| Ours | 0.4431 | 0.3116 | 0.2137 | 0.1473 | 0.2004 | 0.3611 | 0.4128 |\n\n- CNN-RNN and CNN-RNN-Att are simple baseline models for image captioning.\n- Hier-RNN is a baseline model for image paragraph description generation. Because we have no bounding boxes, we use visual features \nfrom the CNN directly to decode each sentence word by word.\n- MRNA is a baseline model from the MICCAI 2018 paper\u003csup\u003e[6]\u003c/sup\u003e. We use visual features from the CNN to generate the first sentence, \nthen concatenate the visual features with semantic features (the previous sentence encoded by 1D-conv layers) to generate the second through final sentences\nword by word.\n- Ours is based on MRNA, with improvements.\n\nNote: I have only released code for Hier-RNN and MRNA because the others are easy to implement.\n\n#### Details\nI split the dataset into 2811 training and 300 test samples, and use Adam with an initial learning rate of 1e-4, decayed by a factor of 0.9 every 5 epochs. Generation is capped \nat 8 sentences per report and 50 words per sentence. The word embedding size is 512 and the RNN has 512 units. More details are in\nconfig.py\n\n#### IU X-Ray Dataset\nThere are 7470 raw images, but only 3391 studies have both a frontal_view and a lateral_view (3391*2 images). There are 3927 raw reports, of which 3631 have sentence num \u003e= 4. \nBecause reports with between 4 and 8 sentences account for more than 90%, I set max sentence num = 8. 
Samples that have both an image pair and a report (sentence num \u003e= 4) \nnumber 3111.\n\n#### Results on Normal and Abnormal Reports\nWhen I analysed the reports in the dataset, I found that **Normal Reports : Abnormal Reports = 2.5 : 1, which is unbalanced**.\nMy best results (code not released):\n\n|  | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | ROUGE | CIDEr |\n| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| Total Test Data | 0.4431 | 0.3116 | 0.2137 | 0.1473 | 0.2004 | 0.3611 | 0.4128 |\n| Normal Test Data | 0.5130 | 0.3628 | 0.2615 | 0.1750 | 0.2313 | 0.3894 | 0.4478 |\n| Abnormal Test Data | 0.2984 | 0.1903 | 0.1274 | 0.0934 | 0.1289 | 0.2397 | 0.2641 |\n\nNote: Total is the whole test set, Normal is the normal reports (no disease) in the test set, and Abnormal \nis the abnormal reports (with disease or abnormality).\n\n## Summary\n#### Process\nHere I summarize the process of my research on medical report generation.\n- First, it is natural to connect this task with the Image2Text task, so I applied image captioning methods, such as CNN+RNN, to it.\n- Second, I found that image captioning methods can handle a single short sentence, but this task requires many sentences. 
So I used image paragraph description generation methods, such as CNN+Hierarchical RNN.\n- Next, I found that the reports in this task have Impression and Findings sections, so I applied a QA + Hierarchical RNN method to the task.\n- Finally, I found that language information matters more than image information because the dataset is small, which is interesting.\n\n#### Problems\nThere are many challenges in this task; I draw on some points from \u003csup\u003e**[1]**\u003c/sup\u003e.\n- **Very Small Medical Data**: most medical datasets contain only images, with almost no bounding boxes or reports, so models overfit very easily.\n- **Highly Variable Report Descriptions**: different doctors write diagnostic reports in different styles.\n- **More Like Dense Captioning than Story Generation**: each description sentence should be grounded in the relevant image region.\n- **Unsuitable Metrics**: BLEU (designed for machine translation), CIDEr (designed for captioning) and similar metrics are not well suited to this task.\n- **Impractical**: so far there are 4-5 papers \u003csup\u003e**[5][6][7][8]**\u003c/sup\u003e 
published for this task, but to be honest they exist only as papers;\nthey do not release code.\n\n#### Little Advice\n- If you want to research medical report generation, try to get more data, and when data is scarce focus on **Semantic Information** rather than **Visual Information**.\nIn the VQA task, some researchers have found that language is more useful than the image.\n- You could use a stronger language model (BERT, ELMo or Transformer), which may help.\n\n## References\n- [1][A Survey of Papers on Medical Diagnostic Report Generation (in Chinese)](https://blog.csdn.net/wl1710582732/article/details/85345285)\n- [2][TensorFlow Models: im2txt](https://github.com/tensorflow/models/tree/master/research/im2txt)\n- [3][MS COCO Caption Evaluation Toolkit](https://github.com/tylin/coco-caption)\n- [4]**TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays**, Xiaosong Wang et al, CVPR 2018, NIH\n- [5]**On the Automatic Generation of Medical Imaging Reports**, Baoyu Jing et al, ACL 2018, CMU\n- [6]**Multimodal Recurrent Model with Attention for Automated Radiology Report Generation**, Yuan Xue et al, MICCAI 2018, PSU\n- [7]**Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation**, Christy Y. Li et al, NIPS 2018, CMU\n- [8]**Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation**, Christy Y. 
Li et al, AAAI 2019, DU\n- [9]**A Hierarchical Approach for Generating Descriptive Image Paragraphs**, Jonathan Krause et al, CVPR 2017, Stanford\n- [10]**Show and Tell: A Neural Image Caption Generator**, Oriol Vinyals et al, CVPR 2015, Google\n- [11]**Show, Attend and Tell: Neural Image Caption Generation with Visual Attention**, Kelvin Xu et al, ICML 2015\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwangleihitcs%2Fmedicalreportgeneration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwangleihitcs%2Fmedicalreportgeneration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwangleihitcs%2Fmedicalreportgeneration/lists"}