{"id":19428997,"url":"https://github.com/bptlab/mimic-log-extraction","last_synced_at":"2025-07-17T18:31:47.547Z","repository":{"id":53708467,"uuid":"498300567","full_name":"bptlab/mimic-log-extraction","owner":"bptlab","description":"A CLI tool for extracting event logs out of MIMIC Databases.","archived":false,"fork":false,"pushed_at":"2023-03-29T08:07:45.000Z","size":4044,"stargazers_count":10,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-24T18:48:22.106Z","etag":null,"topics":["event-log","mimic-iv","process-mining"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bptlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-05-31T11:08:18.000Z","updated_at":"2025-04-16T02:53:26.000Z","dependencies_parsed_at":"2025-04-24T18:38:29.118Z","dependency_job_id":"6921e5e9-9fec-4929-b850-a50f45e4c66f","html_url":"https://github.com/bptlab/mimic-log-extraction","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bptlab/mimic-log-extraction","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bptlab%2Fmimic-log-extraction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bptlab%2Fmimic-log-extraction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bptlab%2Fmimic-log-extraction/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bptlab
%2Fmimic-log-extraction/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bptlab","download_url":"https://codeload.github.com/bptlab/mimic-log-extraction/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bptlab%2Fmimic-log-extraction/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265645388,"owners_count":23804183,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["event-log","mimic-iv","process-mining"],"created_at":"2024-11-10T14:17:27.259Z","updated_at":"2025-07-17T18:31:47.529Z","avatar_url":"https://github.com/bptlab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mimic-log-extraction\n\n[![Pylint](https://github.com/bptlab/mimic-log-extraction/actions/workflows/pylint.yml/badge.svg)](https://github.com/bptlab/mimic-log-extraction/actions/workflows/pylint.yml) [![Typecheck](https://github.com/bptlab/mimic-log-extraction/actions/workflows/mypy.yml/badge.svg)](https://github.com/bptlab/mimic-log-extraction/actions/workflows/mypy.yml)\n\nA CLI tool for extracting event logs out of MIMIC Databases. This branch is for MIMIC-IV 1.0. 
If you use MIMIC-IV 2.0 or 2.2, please pull from the respective branch: https://github.com/bptlab/mimic-log-extraction/tree/mimic-2.0 or https://github.com/bptlab/mimic-log-extraction/blob/mimic-2.2/\n\n- Requires Python 3.8.10 (newer versions may also work)\n- Using a Python virtual environment is recommended\n\nThe official Python documentation provides a [good overview](https://docs.python.org/3/library/venv.html) of how to create virtual environments. We recommend having the environment either in this directory, or one level above.\n\n## usage\n\n```\nusage: extract_log.py [-h] [--db_name DB_NAME] [--db_host DB_HOST] [--db_user DB_USER] [--db_pw DB_PW] [--subject_ids SUBJECT_IDS]\n                      [--hadm_ids HADM_IDS] [--icd ICD] [--icd_version ICD_VERSION] [--icd_sequence_number ICD_SEQUENCE_NUMBER] [--drg DRG]\n                      [--drg_type DRG_TYPE] [--age AGE] [--type TYPE] [--tables TABLES] [--tables_activities TABLES_ACTIVITIES]\n                      [--tables_timestamps TABLES_TIMESTAMPS] [--notion NOTION] [--case_attribute_list CASE_ATTRIBUTE_LIST] [--config CONFIG]\n                      [--save_intermediate] [--ignore_intermediate]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --db_name DB_NAME     Database Name\n  --db_host DB_HOST     Database Host\n  --db_user DB_USER     Database User\n  --db_pw DB_PW         Database Password\n  --subject_ids SUBJECT_IDS\n                        Subject IDs of cohort\n  --hadm_ids HADM_IDS   Hospital Admission IDs of cohort\n  --icd ICD             ICD code(s) of cohort\n  --icd_codes_intersection Optional argument, if one wants to filter for disease combinations, such that patients have to have an icd code from icd_codes and from icd_codes_intersection\n  --icd_version ICD_VERSION\n                        ICD version\n  --icd_sequence_number ICD_SEQUENCE_NUMBER\n                        Ranking threshold of diagnosis\n  --drg DRG             DRG 
code(s) of cohort\n  --drg_type DRG_TYPE   DRG type (HCFA, APR)\n  --age AGE             Patient Age of cohort\n  --type TYPE           Event Type\n  --tables TABLES       Low level tables\n  --tables_activities TABLES_ACTIVITIES\n                        Activity Columns for Low level tables\n  --tables_timestamps TABLES_TIMESTAMPS\n                        Timestamp Columns for Low level tables\n  --notion NOTION       Case Notion\n  --case_attribute_list CASE_ATTRIBUTE_LIST\n                        Case Attributes\n  --config CONFIG       Config file for providing all options via file\n  --save_intermediate   Store intermediate extraction results as csv. For debugging purposes.\n  --ignore_intermediate\n                        Explicitly disable storing of intermediate results.\n  --csv_log             Store resulting log as a .csv file instead of as an .xes event log\n```\n\nCall the tool via\n\n```bash\npython3 -m extract_log \u003c...\u003e\n```\n\npassing the required parameters.\n\nIf you installed the tool by cloning this repository, run the following instead\n\n```bash\npython3 ./extract_log.py \u003c...\u003e\n```\n\n## config file\n\nTo provide parameters via a `.yml` config file, pass the path to that file with the `--config` flag.\nThe config file overrides any setting provided via prompt or command-line flag, so be careful. See the `example_config.yml` file for how to specify options. The config keys `icd_codes`, `drg_codes`, and `additional_event_attributes` must be explicitly set to `[]` to avoid being prompted for them during extraction. `include_medications` only needs to be set for POE event logs to avoid the prompt. When `case_attributes` is set to `[]`, the default attributes for the chosen case notion are used. If the key is not provided, no case attributes are added. 
To be prompted for it during execution, `prompt_case_attributes` needs to be set to true.\n\n```yaml\ndb:\n    name: mimic\n    host: 127.0.0.1\n    user: some_db_user\n    pw: some_db_password\nsave_intermediate: True # True, False\ncsv_log: False # True, defaults to False\ncohort:\n    subject_ids: # Omitting does not consider subject_ids\n        - some subject_ids\n        - ...\n    hadm_ids: # Omitting does not consider hadm_ids\n        - some hadm_ids\n        - ...\n    icd_codes: # could also be [] to avoid ICD filtering. Omitting makes the tool prompt for input.\n        - some ICD code\n        - ...\n    icd_codes_intersection: # optional argument, if one wants to filter for disease combinations, such that patients have to have an icd code from icd_codes and from icd_codes_intersection\n        - some ICD code\n        - ...   \n    icd_version: 10 # 9, 10, 0\n    icd_seq_num: 1\n    drg_codes: [] # could also contain keys to filter for DRG codes. Omitting makes the tool prompt for input. \n    drg_ontology: APR # APR, HCFA\n    age: # could also be [] to avoid age range filtering. Omitting makes the tool prompt for input.\n        - 0:25\n        - 50:90\nevent_type: admission # admission, transfer, poe\ninclude_medications: False # False, True. Only needed if POE event_type\ncase_notion: hospital admission # subject, hospital admission\ncase_attributes: [] # could also be None. [] uses default case attributes for case notion.\nprompt_case_attributes: False # False, True. Setting True forces case attributes to be determined if not provided\nlow_level_tables: # only if event type OTHER\n    - pharmacy\n    - labevents\nlow_level_activities:\n    - medication\n    - label\nlow_level_timestamps:\n    - starttime\n    - charttime\nadditional_event_attributes: # Can be set to []. 
Omitting makes the tool prompt for input\n    - \n        start_column: a\n        end_column: b\n        time_column: c\n        table_to_aggregate: d\n        column_to_aggregate: f\n        aggregation_method: g\n        filter_column: h # can be omitted\n        filter_values:\n            - one\n            - other\n    -\n        start_column: a\n        end_column: b\n        time_column: c\n        table_to_aggregate: d\n        column_to_aggregate: f\n        aggregation_method: g\n        filter_column: h # can be omitted\n```\n\n## installation\n\nSimply run the pip installation command to install the extraction tool:\n\n```bash\npip install git+https://github.com/bptlab/mimic-log-extraction/\n```\n\nAlternatively, clone this repo and execute\n\n```bash\npip install -e .\n```\n\nFor development and testing, all dev dependencies can be installed using\n\n```bash\npip install -e .[dev]\n```\n\nIf you're using `zsh`, escape the square brackets: `pip install -e .\\[dev\\]`\n\n## development\n\nAfter installing all required dev dependencies, make sure to regularly call\n\n```bash\npylint extract_log.py extractor --rcfile .pylintrc\nmypy --config-file mypy.ini .\n```\n\nto ensure linted and typechecked code.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbptlab%2Fmimic-log-extraction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbptlab%2Fmimic-log-extraction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbptlab%2Fmimic-log-extraction/lists"}