{"id":14066640,"url":"https://github.com/DoctorBJones/datadictionary","last_synced_at":"2025-07-29T23:31:55.945Z","repository":{"id":45777730,"uuid":"502802166","full_name":"DoctorBJones/datadictionary","owner":"DoctorBJones","description":"R data dictionary","archived":false,"fork":false,"pushed_at":"2025-03-24T01:33:44.000Z","size":129,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-15T14:24:16.056Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DoctorBJones.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-13T04:14:55.000Z","updated_at":"2025-03-24T01:33:48.000Z","dependencies_parsed_at":"2024-08-13T07:11:19.690Z","dependency_job_id":"db603311-5a51-42a1-8edd-aab583d99093","html_url":"https://github.com/DoctorBJones/datadictionary","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/DoctorBJones/datadictionary","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DoctorBJones%2Fdatadictionary","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DoctorBJones%2Fdatadictionary/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DoctorBJones%2Fdatadictionary/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DoctorBJones%2Fdatadictionary/manifests","owner_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/owners/DoctorBJones","download_url":"https://codeload.github.com/DoctorBJones/datadictionary/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DoctorBJones%2Fdatadictionary/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267780007,"owners_count":24143201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-13T07:05:11.877Z","updated_at":"2025-07-29T23:31:55.631Z","avatar_url":"https://github.com/DoctorBJones.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n \u003c!-- badges: start --\u003e\n [![CRAN status](https://www.r-pkg.org/badges/version/datadictionary)](https://cran.r-project.org/package=datadictionary)\n [![R-CMD-check](https://github.com/DoctorBJones/datadictionary/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/DoctorBJones/datadictionary/actions/workflows/R-CMD-check.yaml)\n [![Codecov test 
coverage](https://codecov.io/gh/DoctorBJones/datadictionary/branch/main/graph/badge.svg)](https://app.codecov.io/gh/DoctorBJones/datadictionary?branch=main)\n  \u003c!-- badges: end --\u003e\n\n# datadictionary\n\nThe goal of `datadictionary` is to create a data dictionary from any dataframe or tibble in your R environment. While other packages exist, I found they were complicated to use and/or the output wasn't what I was after. This package attempts to solve those problems by presenting tabular summaries of the dataset in a format that fits easily in a pane or screen, using a single line of code. \n\nIt includes an overall summary of the dataset and at-a-glance summaries of each variable. Every variable's summary includes a count of missing values, and different summaries are provided based on the data class.\n\nFor factors, labelled data and logicals, the summary includes the name of each level with the level number in parentheses where appropriate. A count of units in each level is included. \n\nFor dates, integers and other numeric types of data, the summary includes statistics such as mean, median, mode, minimum and maximum. A value for each is included in the table. \n\nCharacter variables include only a count of unique values and missing values. This is the default, so if you include a class of data that isn't yet implemented you should get this output.\n\nYou can nominate one or more identifier variables, for example individuals and clusters, so you only get a count of unique and missing values rather than nonsense numeric summaries. \n\nYou can also include a vector to add labels if you want descriptions included in the document. 
Lastly, you can opt to write the output directly to an Excel file.\n\n\n## Installation\n\nYou can install the current version of `datadictionary` from CRAN using:\n\n``` r\ninstall.packages(\"datadictionary\")\n```\n\nYou can install the development version of `datadictionary` from [GitHub](https://github.com/) with:\n\n``` r\n# install.packages(\"devtools\")\ndevtools::install_github(\"DoctorBJones/datadictionary\")\n```\n\n## Example\n\nYou can print a basic data dictionary directly to your console or assign it to an object in your environment:\n\n```{r}\nlibrary(datadictionary)\n\ncreate_dictionary(esoph)\n\nesoph_dictionary \u003c- create_dictionary(esoph)\n```\n\n\nYou can specify one or more identifier variables by passing a quoted string or vector of quoted strings to `id_var`. This is useful if you have hierarchical data, for example, with identifiers for individuals, clusters or blocks.\n\n```{r}\n\n# create fake id variables\nmtcars$id1 \u003c- 1:nrow(mtcars)\nmtcars$id2 \u003c- mtcars$id1*10\n\ncreate_dictionary(mtcars, id_var = c(\"id1\", \"id2\"))\n\n```\nYou can optionally add labels for unlabelled variables. You need to pass a named vector to `var_labels` where the names correspond to columns in your dataset. The vector must be of the same length as your dataset, that is, one label per column.\n\n```{r}\n\n# Create labels as a named vector. \niris.labels \u003c- c(Sepal.Length = \"Sepal length in cm\",\n                 Sepal.Width = \"Sepal width in cm\",\n                 Petal.Length = \"Petal length in cm\",\n                 Petal.Width = \"Petal width in cm\",\n                 Species = \"Species of iris\")\n\ncreate_dictionary(iris, var_labels = iris.labels)\n```\n\nYou can also write directly to Excel from the `create_dictionary` function if you pass a file path and name as a quoted string to the `file` parameter. 
There is no visible output for this use.\n\n```{r, eval = FALSE}\n\ncreate_dictionary(ChickWeight, file = \"chickweight_dictionary.xlsx\")\n\n```\n\nThe package also includes a function to create a summary of a single variable in your dataset. There are no other arguments to this function.\n```{r}\n\nsummarise_variable(iris, \"Sepal.Length\")\n\nsummarise_variable(ChickWeight, \"Diet\")\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDoctorBJones%2Fdatadictionary","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDoctorBJones%2Fdatadictionary","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDoctorBJones%2Fdatadictionary/lists"}