{"id":21229719,"url":"https://github.com/flynncao/eda","last_synced_at":"2025-09-03T02:38:54.304Z","repository":{"id":91686873,"uuid":"591329947","full_name":"flynncao/EDA","owner":"flynncao","description":"Explortory Data Analysis of Preterm Infant Cardio-Respiratory Signals Database","archived":false,"fork":false,"pushed_at":"2024-04-01T17:58:12.000Z","size":3688,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-15T02:15:00.028Z","etag":null,"topics":["ecg","eda","infant","ml","resp"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flynncao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-20T13:50:56.000Z","updated_at":"2023-06-20T15:21:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"dcbb8c57-757a-4215-882b-88664f8be570","html_url":"https://github.com/flynncao/EDA","commit_stats":null,"previous_names":["flynncao/eda"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flynncao%2FEDA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flynncao%2FEDA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flynncao%2FEDA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flynncao%2FEDA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/flynncao","download_url":"https://codeload.gi
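thub.com/flynncao/EDA/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243672484,"owners_count":20328768,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ecg","eda","infant","ml","resp"],"created_at":"2024-11-20T23:29:10.368Z","updated_at":"2025-03-15T02:15:08.503Z","avatar_url":"https://github.com/flynncao.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# EDA\n## Data source\nhttps://physionet.org/content/picsdb/1.0.0/\n\n## Test\n\nTest\n\n## Reference links\nEDA Example https://colab.research.google.com/drive/19b9IhncD3AVdCVVZ3oGHs80ZcgTj0d-w?usp=sharing\n\nwfdb Demo\n\nhttps://github.com/MIT-LCP/wfdb-python/blob/main/demo.ipynb\n\n\n## 20230124 (updates for this part are in ECG.ipynb; the main EDA work is in EDA.ipynb)\n\n(1) About annotations: the provided files do not appear to contain any. In the several infant records I unpacked, the annotations were all empty, so the records cannot be labeled from them. Also, what group 7 shows is peaks, not annotations.\n\n(This is what an annotation marker looks like)\n\n![](assets/images/1.png)\n\n(2) The peaks are still useful to us: we can add labels ourselves, since normal versus abnormal is defined by a heart-rate range anyway. I recall we already have a method for converting to heart rate, so we can use it to decide the labels and export them to Excel. After that the rest is straightforward and lines up with the EDA workflow DR presented in class.\n\n\n(3) TODO: add annotations by checking whether the ECG peaks fall within a reasonable range, and classify infants as normal or abnormal accordingly (some EDA charts are complex, so the clipped data also needs to be annotated case by case).\n\n\n\u003e Finding so far: converting ECG to BPM is complicated, and what we are studying is the relationship between RESP and ECG, so calibrating by BPM range is not recommended.\n\n\u003e Suggestion: study the frequency distribution of the ECG itself. The y-axis (vertical) of an ECG shows the waveform amplitude, measured in millivolts. One large square represents 0.5 mV and one small square represents 0.1 mV.\n\n\n## 20230125\n\n![](assets/images/r.png)\n\n### segments\n\n\n### waveforms\n`f = 1 / T`  frequency formula (frequency `f` is the reciprocal of the period `T`)\n\n\n```py\n# Variables for a single infant record\n# x_ecg_full: the full ECG signal\n# x_resp_full: the full RESP signal\n# fs_ecg: ECG sampling rate\n# fs_resp: RESP sampling rate\n```\nOutput:\n```\nLoading ECG file:  infant2_ecg\nLoading RESP file:  infant2_resp\nECG 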
sampling frequency:  500  Hz\nRESP sampling frequency:  50  Hz\nECG sampling interval dt =  0.002  sec.\nRESP sampling interval dt =  0.02  sec.\n```\n\n### Preloaded data\n* infantIndexesReadyArr: indices of the infant files to study; 1, 2, 3 and 10 all work\n\n### Step2 -\u003e Wrapper functions\n\nCollect the relevant data for each infant, for later resampling or R-wave extraction.\n```python\ndef GetRelevantDataForTargetSegment(data_dir, file_index):\n  # Load both full signals and their sampling rates for one infant file\n  x_ecg_full, x_resp_full, fs_ecg, fs_resp = load_waveforms(data_dir, file_index)\n  return {\"x_ecg_full\": x_ecg_full, \"e_resp_full\": x_resp_full, \"fs_ecg\": fs_ecg, \"fs_resp\": fs_resp}\n```\n\n\u003e x_ecg_full and x_resp_full are each infant's own signal data (sig_p in the data structure)\n\u003e fs_ecg and fs_resp are the sampling frequencies, as given by the data source\n\nExample data structure\n```\n{'infant2': {'x_ecg_full': array([-0.00982986, -0.01474478, -0.01474478, ...,  0.10321349,\n        0.10894757,  0.10976672]), 'e_resp_full': array([21.92315057, 22.01228904, 22.14775144, ..., 22.99842723,\n       23.04896243, 23.05176994]), 'fs_ecg': 500, 'fs_resp': 50}, 'infant3': {'x_ecg_full': array([ 0.09467079,  0.09993028,  0.09467079, ..., -0.83976496,\n       -0.91427437, -0.94407814]), 'e_resp_full': array([-1.38832109, -1.37854909, -1.39250908, ..., 24.72174772,\n       24.79782967, 24.94022157]), 'fs_ecg': 500, 'fs_resp': 50}}\n```\n\n\nClip the required segment for each infant from `segments`.\n```python\n# Definition\ndef ClipDataToSegmentBorders(segments, file_index, segment_index):\n   # Segment borders in seconds, converted to sample indices (fs_ecg and fs_resp are read from the enclosing scope)\n   t0_sec, t1_sec = segments[f\"infant{file_index:d}\"][segment_index]\n   t0_sample_ecg = round(t0_sec * fs_ecg)\n   t1_sample_ecg = round(t1_sec * fs_ecg)\n   t0_sample_resp = round(t0_sec * fs_resp)\n   t1_sample_resp = round(t1_sec * fs_resp)\n   return {\"t0_sample_ecg\": t0_sample_ecg, \"t1_sample_ecg\": t1_sample_ecg, \"t0_sample_resp\": t0_sample_resp, \"t1_sample_resp\": t1_sample_resp}\n\n# Usage\ninfantsBordersObj = {}\nfor infant in relevantDataArr.values():\n  # segment_index defaults to 0 here, so the first segment tuple of every infant is used\n  infantsBordersObj[f\"infant{infant['file_index']}\"] = 
ClipDataToSegmentBorders(segments=segments, file_index=infant['file_index'], segment_index=segment_index)\n```\n\n\n\n## step3: Wrapper functions\n\nSlice each infant's data to the given sample interval to obtain the paired x_ecg and x_resp segments.\n```py\n# Definition\ndef ExtractPairedDataByIntervalsAndFlatThem(t0_sample_ecg, t1_sample_ecg, t0_sample_resp, t1_sample_resp):\n  # Slice the full signals (read from the enclosing scope) to the segment borders\n  x_ecg = x_ecg_full[t0_sample_ecg:t1_sample_ecg]\n  x_resp = x_resp_full[t0_sample_resp:t1_sample_resp]\n  return x_ecg, x_resp\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflynncao%2Feda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflynncao%2Feda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflynncao%2Feda/lists"}