{"id":19004993,"url":"https://github.com/dhis2/metadata-assessment","last_synced_at":"2025-06-17T10:36:41.678Z","repository":{"id":42975389,"uuid":"414272165","full_name":"dhis2/metadata-assessment","owner":"dhis2","description":"Metadata Assessment report","archived":false,"fork":false,"pushed_at":"2023-10-26T08:20:12.000Z","size":645,"stargazers_count":4,"open_issues_count":3,"forks_count":2,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-04-22T18:58:22.323Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dhis2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-06T15:39:35.000Z","updated_at":"2024-09-03T09:45:00.000Z","dependencies_parsed_at":"2023-02-15T23:01:06.167Z","dependency_job_id":"83446b8c-6e38-4bf4-b272-3e0727352e2b","html_url":"https://github.com/dhis2/metadata-assessment","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dhis2/metadata-assessment","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dhis2%2Fmetadata-assessment","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dhis2%2Fmetadata-assessment/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dhis2%2Fmetadata-assessment/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dhis2%2Fmetadata-assessment/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/own
ers/dhis2","download_url":"https://codeload.github.com/dhis2/metadata-assessment/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dhis2%2Fmetadata-assessment/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260341299,"owners_count":22994634,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T18:25:43.443Z","updated_at":"2025-06-17T10:36:36.656Z","avatar_url":"https://github.com/dhis2.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DHIS2 Metadata Assessment Tool\n\n## Purpose of this tool\n\nManagement of metadata integrity should be a primary concern for DHIS2 implementers.\nThe DHIS2 API enforces a number of restrictions on various objects and their \nrelationships, but under certain circumstances, metadata objects may become\ncorrupted. This may become especially apparent on DHIS2 systems which have been\nrunning for a number of years, and which have undergone extensive changes to the \nsystem's metadata.\n\nAnother common problem is the accumulation of metadata or analytic objects\nwhich are no longer in use (or perhaps never were). In order to keep the \nsystem tidy and performing well, it may be appropriate to regularly\nreview the metadata in the DHIS2 database and determine whether unused objects\nshould be removed.\n\nThere are a number of integrity checks available in DHIS2 which help\nto diagnose various metadata problems. 
This tool is intended to serve\nas a complement to those checks, and to provide additional guidance to \nDHIS2 implementers on how to fix these problems in their system.\n\n\n## About this tool\n\nThis toolkit consists of a series of SQL queries which have been created\nby the DHIS2 Implementation and Development Teams to identify potential metadata issues in \nDHIS2 databases. The metadata checks in this tool have been organized in a series of YAML\nfiles. YAML is a user-friendly data serialization language which can be \nparsed by a number of different programming languages. It is also exceptionally\neasy to read. Each check included in this tool has a separate YAML file, \nwhich consists of a number of key-value pairs. Each of these keys is explained below. \n\n- *summary_uid*: A predefined DHIS2 UID which is used to identify the \nsummary SQL query for this check.\n- *name*: The name of the SQL query. This name should be relatively short but descriptive. It should also\nbe written in snake case so that a valid database view name can be created using this field.\n- *description*: A short description of the issue.\n- *section*: Used in the R Markdown report to group related issues together. \nGenerally these are related metadata objects like indicators or data elements.\n- *section_order*: Used in the R Markdown report to order issues within a section.\n- *summary_sql*: An SQL query which is used to produce a single row which summarizes\nthe particular issue. Each query should return four columns and one row.\n     - indicator: This should be the same as the `name` field above.\n     - count: This should return the total number of objects which are flagged \n       by the particular check. The field should be returned as a `varchar`. 
\n     - percent: Where possible, this field should calculate the percentage of \n       the objects flagged by this particular test versus the \n       total number of objects in the same class.\n     - description: A brief description of the issue, probably the same as \n       the description field above.\n- *details_sql*: An SQL query which should return one row for each \nmetadata object which violates this particular metadata check. At the very least,\nthe query should return the UID and name of the object, and in certain cases\nmay return other fields which make it easier to identify the specific object and rectify the problem.\n- *is_slow*: Whether the query is potentially long-running/slow, typically because it queries against the `datavalue` table.\n- *severity*: This field is used to indicate the overall severity of a particular problem. \n    - INFO: Indicates that this is for information only.\n    - WARNING: A warning indicates that this may be a problem, but not \n    necessarily an error. It is, however, recommended to triage these issues.\n    - SEVERE: An error which should be fixed, but which may not necessarily lead to\n    the system not functioning. \n    - CRITICAL: An error which must be fixed, and which may lead to end-user\n    error or system crashes.\n- *introduction*: A brief synopsis of the problem including its significance and origin.\n- *recommendation*: A recommended procedure for addressing the issue. \n\n\nAn R Markdown report has been included to run all of \nthe queries and organize them into an HTML report. More information on\nhow to run the R Markdown report can be found in the next section.  \n\nIt is also possible to run the queries individually directly on the DHIS2 database, \nif you are looking to isolate and address a particular problem. 
You can simply\ncopy the `details_sql` query from the particular YAML file of interest, and \ncreate an SQL View in DHIS2 to view the results there. Alternatively,\nif you have access to the DHIS2 database, you could retrieve the results\ndirectly from a database console.\n\n## How to use the R Markdown report\n\n- [Download](https://www.r-project.org/) and install a version of R for your \noperating system. \n- [Download](https://www.rstudio.com/products/rstudio/download/) and install\nRStudio for your operating system. \n- [Download](https://git-scm.com/downloads) and install the Git source control management software for your operating system.\n- Clone the [source](https://github.com/dhis2/metadata-assessment) of this repository to your system. \n- Install dependencies by invoking the following commands in the R console. \n\n```R\nif (!require(\"pacman\")) install.packages(\"pacman\")\npacman::p_load(jsonlite, httr, purrr, knitr, magrittr, ggplot2, DT, dplyr, yaml, rmarkdown, readr)\n```\n\n- Edit or create the file called `.Rprofile` in the top-level directory of the cloned git repository.\n\nThis should only be done on a private computer, since you will need to store authentication \ndetails in this file. Alternatively, you can execute the commands in the R \nconsole, if you are not comfortable storing authentication details in a \ntext file. The file should contain the following commands. \n\n```R\nSys.setenv(baseurl=\"http://localhost:8080/\")\nSys.setenv(username=\"admin\")\nSys.setenv(password=\"district\")\nSys.setenv(cleanup_views = FALSE)\nSys.setenv(dhis2_checks = TRUE)\nSys.setenv(include_slow = FALSE)\n```\n\nYou should replace each of the values with your particular details. \n\n- *baseurl*: This should be the URL of your server. Take care to specify \nhttps rather than http if your server uses it. Also, the URL should end with a final \"/\".\n- *username*: Username of the user used to authenticate with the DHIS2 instance. 
\nThis user should at least have the ability to create SQL views. \n- *password*: Password of the user which will connect to DHIS2. Take note\nthat this password will be stored in clear text, so you should not store\nit in a shared computer environment.\n- *cleanup_views*: If set to `TRUE`, the SQL Views created during the generation\nof the report will be deleted after the report completes.\n- *dhis2_checks*: If set to `TRUE`, results from the DHIS2 data integrity\nchecks will also be integrated into the report. Please note that \nthese integrity checks may take a very long time to run on larger databases.\n- *include_slow*: If set to `TRUE`, checks that potentially take a long time\nto complete on large databases will also be included. This includes, for example, \nchecks that involve queries against the `datavalue` table.\n\nOnce you have completed each of these steps, open the file `dhis2_metadata_assessment.Rmd`\nin RStudio. Press the \"Knit\" button, and wait for the report to complete. The report \nwill upload and create a series of SQL Views on your DHIS2 instance, and then \nretrieve each of the results to combine them into a single HTML page. \n\nThe report is organized into a series of sections. The first section is a summary\ntable which contains an overview of the results of each query. The second\nsection presents summary figures and graphs related to users. The third section\ncontains essentially the same information as the summary table, but also includes\nwritten guidance which helps explain the particular details of each problem, as well\nas a recommended approach to solving it. 
The last optional section of the \nreport contains the results of the DHIS2 integrity checks, if you chose to enable\nthem during the generation of the report.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdhis2%2Fmetadata-assessment","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdhis2%2Fmetadata-assessment","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdhis2%2Fmetadata-assessment/lists"}