https://github.com/dime-worldbank/iehfc
IEHFC - Shiny App for high frequency checks
- Host: GitHub
- URL: https://github.com/dime-worldbank/iehfc
- Owner: dime-worldbank
- License: mit
- Created: 2023-07-05T17:46:39.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-17T21:41:55.000Z (2 months ago)
- Last Synced: 2024-10-20T08:23:03.607Z (2 months ago)
- Language: R
- Homepage:
- Size: 326 KB
- Stars: 5
- Watchers: 7
- Forks: 4
- Open Issues: 11
Metadata Files:
- Readme: README.md
- License: LICENSE
# `iehfc`
Welcome to the `iehfc` platform by DIME Analytics! The `iehfc` platform is a Shiny Dashboard that simplifies the process of setting up data quality checks. `iehfc` provides easy-to-create, customizable, and shareable high-frequency check outputs.
This platform is currently under construction. If you find any issues or have any suggestions, please open an issue and let us know!
## Contents
This repository will eventually be converted into a functional R package, which will contain:
- R scripts with functions to run simple high-frequency checks on datasets
- An `iehfc` Shiny dashboard to allow users to engage with the functions in an interactive manner.

## iehfc Platform: Current Use Instructions
NOTE — Work still needs to be done to create a fully independent set of scripts that can find each other on any individual's local setup. For now, the user is requested to **open the `iehfc.Rproj` file**, which will automatically set up the correct working directory for any user.
1. Either (a) clone the `iehfc` repository using GitHub Desktop or an equivalent tool, or (b) download the files into a standalone folder
2. Open `iehfc.Rproj`
3. Run `iehfc_app/global.R`. This should install the required packages and launch the `iehfc` application. If at any point you want to relaunch the `iehfc` application after closing it, you can run `iehfc_app()` in the console.

## iehfc Platform: Quick Guide
Once you have managed to open the `iehfc` Shiny Dashboard, you can follow the steps below to obtain your data quality check outputs. The `iehfc` Platform is composed of three principal tabs.
### Upload Data
This is where you can upload the dataset whose quality you want to check and inspect its contents. You can either upload your own dataset or use the provided test dataset. Currently, the platform only supports .csv files. Once you have successfully loaded your dataset, you can move to the next tab.
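Before uploading, it can help to confirm that a .csv file reads cleanly in R. The snippet below is only a sketch in base R, not the platform's own loading code; the file and the column names `hhid` and `income` are made up for illustration.

```r
# Write a tiny hypothetical dataset to a temporary .csv so the example is
# self-contained. In practice, csv_path would point at your own file.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(hhid = 1:3, income = c(100, 250, 175)),
  csv_path,
  row.names = FALSE
)

# Read the file back and inspect its structure and first rows.
survey <- read.csv(csv_path, stringsAsFactors = FALSE)
str(survey)
head(survey)
```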
### Check Selection and Setup
This is where you set up the content and parameters of the checks you want to run. There are currently four available types of checks, with two more under construction.
---
**Duplicate Checks**: Allows you to verify whether an ID variable uniquely identifies each observation.
- Please provide the name of the variable, as well as any additional variables that would help address the duplicates in the outputs.
- The output is a table with the duplicate observations.

---
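The idea behind the duplicate check can be sketched in base R as follows. This is an illustration under assumptions, not the platform's implementation: the helper `find_duplicates()`, the dataset, and the variable names `hhid`, `enumerator`, and `village` are all hypothetical.

```r
# Hypothetical survey data; `hhid` plays the role of the ID variable.
survey <- data.frame(
  hhid       = c(101, 102, 102, 103, 104, 104),
  enumerator = c("A", "A", "B", "B", "C", "C"),
  village    = c("X", "X", "Y", "Y", "Z", "Z"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return every row whose ID value appears more than once,
# keeping the ID plus any extra variables that help resolve the duplicates.
find_duplicates <- function(data, id_var, extra_vars = character(0)) {
  ids <- data[[id_var]]
  # Flag both the first and the later occurrences of a repeated ID.
  is_dup <- duplicated(ids) | duplicated(ids, fromLast = TRUE)
  out <- data[is_dup, c(id_var, extra_vars), drop = FALSE]
  out[order(out[[id_var]]), , drop = FALSE]
}

duplicates <- find_duplicates(survey, "hhid",
                              extra_vars = c("enumerator", "village"))
print(duplicates)
```

An empty result means the ID variable uniquely identifies the observations in the data.

---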
**Outlier Checks** — Allows you to check whether an individual variable or a group of variables has any outliers.
- You can provide the names of individual (numeric) variables to check for outliers.
- You can also provide a common prefix (e.g. "income" for "income_01", "income_02", etc.) for a group of (numeric) variables whose aggregated values you would like to check for outliers. This is particularly useful if you have the same indicator divided into multiple variables, such as income for multiple household members, for instance.
- The platform currently considers values that are over three standard deviations from the mean to be outliers. In the next version of the platform, the user will be able to set manual limits or adjust the distance to define outliers.
- You'll need to provide the dataset's ID variable and can add additional variables here as well.
- The output is a table with a row for each identified outlier.

---
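The three-standard-deviation rule described above can be sketched in base R. The data, the helper `is_outlier()`, and the variable names are hypothetical, and treating the "aggregated values" of a prefix group as a row sum is an assumption made for this example; the platform may aggregate differently.

```r
# Hypothetical data: 20 households, with one implausibly large income_01 value.
survey <- data.frame(
  hhid      = 1:20,
  income_01 = c(rep(c(90, 100, 110), 6), 100, 1000),
  income_02 = rep(c(40, 50, 60, 50), 5)
)

# TRUE where a value lies more than `n_sd` standard deviations from the mean.
is_outlier <- function(x, n_sd = 3) {
  abs(x - mean(x, na.rm = TRUE)) > n_sd * sd(x, na.rm = TRUE)
}

# Individual variable: one output row per flagged observation.
# which() drops any NA flags that would arise from missing values.
single_outliers <- survey[which(is_outlier(survey$income_01)),
                          c("hhid", "income_01")]

# Group of variables sharing the prefix "income" (assumption: aggregate by row sum).
group_cols <- names(survey)[startsWith(names(survey), "income")]
survey$income_total <- rowSums(survey[group_cols], na.rm = TRUE)
group_outliers <- survey[which(is_outlier(survey$income_total)),
                         c("hhid", "income_total")]

print(single_outliers)
print(group_outliers)
```

With very small datasets no value can be three standard deviations from the mean, so a rule like this only becomes informative once a reasonable number of observations is available.

---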
**Enumerator Checks** — Allows you to check average values and progress for individual enumerators if you are conducting primary data collection.
- Please provide the variable that identifies the enumerator in the dataset.
- You can then add (numeric) variables for which you'd like to see the average value per enumerator. This can be useful to check whether an enumerator's values differ unusually from those of the others.
- You can also add a submission date variable. This is strongly encouraged. This allows you to see the number of submissions per enumerator per day, and thus to track their progress.
- You can also add a "complete submission" variable. This will allow you to see the percentage of complete submissions per enumerator.
- The outputs are (1) a table with submissions per enumerator and, if a submission date was provided, submissions per day, (2) a table with average values of variables per enumerator if variables were provided in the "Enumerator Average Value Variable" fields, and (3) a graph showing cumulative submissions per enumerator if a submission date was provided.

---
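The quantities behind the enumerator outputs can be approximated in base R as below. This is a rough sketch, not the platform's code: the dataset and the variable names `enumerator`, `submission_date`, `complete`, and `income` are hypothetical, and the cumulative counts are computed as a table rather than drawn as a graph.

```r
# Hypothetical submissions from three enumerators over two days.
survey <- data.frame(
  enumerator      = c("A", "A", "B", "B", "B", "C"),
  submission_date = as.Date(c("2024-01-01", "2024-01-02", "2024-01-01",
                              "2024-01-01", "2024-01-02", "2024-01-02")),
  complete        = c(1, 1, 1, 0, 1, 0),
  income          = c(100, 120, 300, 280, 320, 90)
)

# (1) Submissions per enumerator per day, and in total.
daily  <- table(survey$enumerator, survey$submission_date)
totals <- rowSums(daily)

# Percentage of complete submissions per enumerator.
complete_pct <- aggregate(complete ~ enumerator, data = survey,
                          FUN = function(x) 100 * mean(x))

# (2) Average value of a numeric variable per enumerator.
averages <- aggregate(income ~ enumerator, data = survey, FUN = mean)

# (3) Cumulative submissions per enumerator over time
#     (rows are enumerators, columns are dates).
cumulative <- t(apply(daily, 1, cumsum))

print(daily)
print(averages)
print(cumulative)
```

In this made-up data, enumerator B's average income stands well above the others, which is the kind of pattern the average-value table is meant to surface.

---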
**Administrative Unit-Level Checks** — Allows you to check submissions and progress for individual administrative units (e.g. villages) in your dataset.
- Please provide the variable that identifies the administrative unit of interest in the dataset.
- You can then add additional, higher-level administrative units (e.g. if your administrative unit of interest is "village", these could be "district" and "county") that would help either locate your administrative units of interest or uniquely identify them.
- You can also add a submission date variable. This is strongly encouraged. This allows you to see the number of submissions per administrative unit per day, and thus to track their progress.
- You can also add a "complete submission" variable. This will allow you to see the percentage of complete submissions per administrative unit.
- The outputs are (1) a table with submissions per administrative unit and, if a submission date was provided, submissions per day, and (2) a graph showing cumulative submissions per administrative unit if a submission date was provided.

---
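The sketch below illustrates why higher-level administrative units can matter: two different villages may share a name. It uses base R on hypothetical data (`district`, `village`, and `complete` are assumed variable names) and is not the platform's implementation.

```r
# Hypothetical data: "Rivertown" exists in both the North and South districts.
survey <- data.frame(
  district = c("North", "North", "North", "South", "South"),
  village  = c("Rivertown", "Rivertown", "Hillside", "Rivertown", "Rivertown"),
  complete = c(1, 0, 1, 1, 1)
)

# Counting by village alone merges the two distinct Rivertowns.
village_only <- table(survey$village)

# Adding the higher-level unit identifies each village uniquely.
survey$submissions <- 1
by_unit <- aggregate(cbind(submissions, complete) ~ district + village,
                     data = survey, FUN = sum)
by_unit$pct_complete <- 100 * by_unit$complete / by_unit$submissions

print(village_only)
print(by_unit)
```

Grouping by district and village yields three units here instead of two, with separate completion rates for each Rivertown.

---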
**Under Construction** — Unit of Observation-Level Checks and Survey Logic Checks