{"id":28532356,"url":"https://github.com/predict-idlab/data-quality-challenges-wearables","last_synced_at":"2025-10-12T10:40:00.307Z","repository":{"id":213944071,"uuid":"735270792","full_name":"predict-idlab/data-quality-challenges-wearables","owner":"predict-idlab","description":"Addressing Data Quality Challenges in Ambulatory Wrist-worn Wearable Monitoring Through Analytical and Practical Approaches","archived":false,"fork":false,"pushed_at":"2024-05-27T11:33:58.000Z","size":15784,"stargazers_count":6,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-07-07T14:41:11.099Z","etag":null,"topics":["ambulatory-care","data-quality-assessment","remote-monitoring","time-series","wearable","wearable-devices"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/predict-idlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-24T09:51:05.000Z","updated_at":"2025-05-11T13:39:43.000Z","dependencies_parsed_at":"2024-05-16T03:58:00.423Z","dependency_job_id":null,"html_url":"https://github.com/predict-idlab/data-quality-challenges-wearables","commit_stats":null,"previous_names":["predict-idlab/data-quality-challenges-wearables"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/predict-idlab/data-quality-challenges-wearables","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/predict-idlab%2Fdata-quality-challenges-wearables","tags_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/predict-idlab%2Fdata-quality-challenges-wearables/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/predict-idlab%2Fdata-quality-challenges-wearables/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/predict-idlab%2Fdata-quality-challenges-wearables/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/predict-idlab","download_url":"https://codeload.github.com/predict-idlab/data-quality-challenges-wearables/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/predict-idlab%2Fdata-quality-challenges-wearables/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279011058,"owners_count":26084865,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ambulatory-care","data-quality-assessment","remote-monitoring","time-series","wearable","wearable-devices"],"created_at":"2025-06-09T15:38:14.831Z","updated_at":"2025-10-12T10:40:00.301Z","avatar_url":"https://github.com/predict-idlab.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\u003ch1\u003e:mag: Addressing Data Quality Challenges \u003cbr\u003ein Remote Wearable 
Monitoring\n\u003c/div\u003e\n\nCodebase \u0026 further details for the paper:\n\u003e Addressing Data Quality Challenges in Observational Ambulatory Studies: Analysis, methodologies and practical solutions for wrist-worn wearable monitoring\n\nIn this project, we address data quality challenges encountered in remote wearable monitoring by utilizing two distinct datasets:\n\n1. ETRI Lifelog 2020: Accessible at ETRI Lifelog 2020\u003cbr\u003e https://nanum.etri.re.kr/share/schung/ETRILifelogDataset2020?lang=En_us\n\n2. mBrain21:\u003cbr\u003e https://www.kaggle.com/datasets/jonvdrdo/mbrain21/data\n\nFor each identified challenge, denoted as `C\u003cID\u003e`, we have curated a dedicated notebook. These notebooks are specifically designed to demonstrate effective countermeasures against the respective challenges.\n\n## 📖 Table of contents\n- [📰 How is the repository structured?](#-how-is-the-repository-structured)\n  - [🛠️ Installation](#installation_ref)\n- [🗃️ How to acquire the data](#️-how-to-acquire-the-data)\n  - [ETRI lifelog 2020](#etri-lifelog-2020)\n  - [mBrain21](#mbrain21)\n  - [Utilizing this repository](#utilizing-this-repository)\n- [✨ Challenges \u0026 features](#-challenges--features)\n  - [📷 Dashboards](#-dashboards)\n    - [ETRI](#etri)\n    - [mBrain](#mbrain)\n  - [⌚ off-wrist detection](#-off-wrist-detection)\n  - [✒️ Data annotation](#data_annotation_ref)\n- [📖 Citation](#citation_ref)\n- [📝 License](#license_ref)\n\n\n## 📰 How is the repository structured\n\n```txt\n├── code_utils              \u003c- module containing all shared code\n│   ├── empatica            \u003c- Empatica E4 specific code (signal processing pipelines)\n│   ├── etri                \u003c- ETRI specific code (data parsing, visualization, dashboard)\n│   ├── mbrain              \u003c- mBrain specific code (data parsing, visualization, dashboard)\n│   └── utils               \u003c- utility code (dashboard, dataframes, interaction analysis)\n├── loc_data                
\u003c- local data folder in which intermediate data is stored\n└── notebooks               \u003c- Etri and mBrain specific notebooks \n    ├── EmbracePlus.ipynb   \u003c- EmbracePlus demo notebook\n    ├── etri\n    └── mBrain\n```\n\n\n\u003ca name=\"installation_ref\"\u003e\u003c/a\u003e\n### 🛠️ Installation\n\nThis repository uses [poetry](https://python-poetry.org/) as its dependency manager.\nA specification of the dependencies is provided in the [`pyproject.toml`](pyproject.toml) and [`poetry.lock`](poetry.lock) files.\n\nYou can install the dependencies in your Python environment by executing the following steps:\n1. Install poetry: https://python-poetry.org/docs/#installation\n2. Activate your poetry environment by calling `poetry shell`\n3. Install the dependencies by calling `poetry install`\n\n## 🗃️ How to acquire the data\n### ETRI lifelog 2020\nThe ETRI lifelog 2020 is made available at https://nanum.etri.re.kr/share/schung/ETRILifelogDataset2020?lang=En_us.\n\nIn order to download the dataset, you should first create an account on the ETRI Nanum website.\nAfterwards, fill in the license agreement form, and upon approval, you will be able to download the dataset via the web platform.\n\n### mBrain21\nA subset of the mBrain21 dataset is made available on [Kaggle datasets](https://www.kaggle.com/datasets/jonvdrdo/mbrain21/data).\nThe dataset can be downloaded via the following command:\n```bash\nkaggle datasets download -d jonasvdd/mbrain21\n```\n\n### Utilizing this repository\nMake sure that you've extended the [path_conf.py](code_utils/path_conf.py) file's hostname *if-statement* with your machine's hostname and that you've configured the paths to the `mBrain` and `ETRI` datasets.\n\n# ✨ Challenges \u0026 features\nBelow, a subset of the challenges and features is listed.\n\n## 📷 Dashboards\nThis section elaborates on the longitudinal time series visualization dashboards for both the ETRI and mBrain datasets.\n\nEach dashboard contains, as can 
be observed in the figures below, a left column with selection boxes.\nThe general flow to visualize a specific time series excerpt is as follows:\n- Select a `folder` (in our case, all data from the ETRI and mBrain datasets are stored in the same folder, so there is only one option to select)\n- Select a user (e.g., user30 for the ETRI dataset)\n\u003e *note*: After selecting a folder and user, the time-span selection will be updated to the available time span for the selected user-folder combination\n- Select sensors (e.g., 'E4 accelerometer' and 'E4 temperature')\n\nFinally, to visualize, press the *run interact* button.\n\n\n### ETRI\nOnce the ETRI dataset has been downloaded and parsed via the [ETRI parsing](notebooks/etri/0_parse_etri.ipynb) notebook, the corresponding [dashboard script](code_utils/etri/dashboard.py) can be used to explore \u0026 analyse the data.\nThe dashboard can be run via the following command (after activating the poetry shell):\n\n```bash\npython code_utils/etri/dashboard.py\n```\nThe output should show the following:\n\u003e *Dash is running on http://0.0.0.0:\\\u003cPORT\\\u003e*\n\n\nIn the dashboard screenshot below, both the wearable data and the application event labels are visualized. One can immediately observe that this participant tends to be more alone during evenings (light blue shaded area of the lower row in the upper subplot). During the weekends (indicated with a gray shaded area), this participant tends to be alone and spend a lot of time at home.\n\n![](figures/ETRI_dashboard.png)\n\n### mBrain\nThe dashboard can be run via the following command (after activating the poetry shell):\n```bash\npython code_utils/mBrain/dashboard.py\n```\nThe output will show the following:\n\u003e *Dash is running on http://0.0.0.0:\\\u003cPORT\\\u003e*\n\nBelow, we provide a screenshot of the mBrain dashboard. 
As can be observed from the selection box on the left side, the dashboard shows the headache timeline of the participant, along with the Empatica E4's accelerometer signal and the smartphone light data. When hovering over a headache event, as shown in the upper plot, one can see the associated characteristics of the headache event.\n\n![](figures/MBRAIN_dashboard.png)\n\n## ⌚ off-wrist detection\nThe wearable non-wear detection is demonstrated in the [C5.1_off_wrist_detection](notebooks/mBrain/C5_wearable_off_wrist.ipynb) notebook.\u003cbr\u003e\u003cbr\u003e\nMoreover, the [C7_missing_data](notebooks/mBrain/C7_missing_data.ipynb) notebook demonstrates how this off-wrist pipeline can be used to remove non-wear bouts as a preprocessing step.\n\nBelow, a screenshot of the off-wrist pipeline devised by [Böttcher et al. (2022)](https://www.nature.com/articles/s41598-022-25949-x) is shown.\n![](figures/off_wrist_bottcher.png)\n\n\u003ca name=\"data_annotation_ref\"\u003e\u003c/a\u003e\n\n## ✒️ Data annotation\nThe [C5.1_label_off_wrist](notebooks/mBrain/C5.1_Label_off_wrist.ipynb) mBrain notebook demonstrates how large bouts of time-series data can be annotated using [plotly-resampler](https://github.com/predict-idlab/plotly-resampler).\n\nBelow, a demo shows how this annotation tool can be used to label `off-wrist` periods.\n![](figures/annotation_demo.gif)\n\n\u003ca name=\"citation_ref\"\u003e\u003c/a\u003e\n## 📖 Citation\n```bibtex\n@article{van2024addressing,\n  title={Addressing Data Quality Challenges in Observational Ambulatory Studies: Analysis, Methodologies and Practical Solutions for Wrist-worn Wearable Monitoring},\n  author={Van Der Donckt, Jonas and Vandenbussche, Nicolas and Van Der Donckt, Jeroen and Chen, Stephanie and Stojchevska, Marija and De Brouwer, Mathias and Steenwinckel, Bram and Paemeleire, Koen and Ongenae, Femke and Van Hoecke, Sofie},\n  journal={arXiv preprint arXiv:2401.13518},\n  year={2024}\n}\n```\n\u003ca 
name=\"license_ref\"\u003e\u003c/a\u003e\n## 📝 License\nThe code is available under the *imec* [license](LICENSE).\n\n---\n\n\u003cp align=\"center\"\u003e\n👤 \u003ci\u003eJonas Van Der Donckt\u003c/i\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpredict-idlab%2Fdata-quality-challenges-wearables","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpredict-idlab%2Fdata-quality-challenges-wearables","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpredict-idlab%2Fdata-quality-challenges-wearables/lists"}