{"id":14066712,"url":"https://github.com/elastacloud/automatic-data-explorer","last_synced_at":"2025-07-29T23:32:01.352Z","repository":{"id":73991022,"uuid":"97841150","full_name":"elastacloud/automatic-data-explorer","owner":"elastacloud","description":"An R package to explore and quality check data","archived":false,"fork":false,"pushed_at":"2018-05-25T08:53:50.000Z","size":554,"stargazers_count":5,"open_issues_count":10,"forks_count":3,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-06-10T03:09:32.018Z","etag":null,"topics":["correlations","covariance","pca","summary-statistics"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/elastacloud.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-07-20T13:53:37.000Z","updated_at":"2025-04-24T07:50:43.000Z","dependencies_parsed_at":"2023-04-26T01:31:46.437Z","dependency_job_id":null,"html_url":"https://github.com/elastacloud/automatic-data-explorer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/elastacloud/automatic-data-explorer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elastacloud%2Fautomatic-data-explorer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elastacloud%2Fautomatic-data-explorer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elastacloud%2Fautomatic-data-explorer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/elastacloud%2Fautomatic-data-explorer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/elastacloud","download_url":"https://codeload.github.com/elastacloud/automatic-data-explorer/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elastacloud%2Fautomatic-data-explorer/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267780014,"owners_count":24143201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["correlations","covariance","pca","summary-statistics"],"created_at":"2024-08-13T07:05:13.687Z","updated_at":"2025-07-29T23:32:01.092Z","avatar_url":"https://github.com/elastacloud.png","language":"R","readme":"# Automatic Data Explorer  [![Build Status](https://travis-ci.org/elastacloud/automatic-data-explorer.svg?branch=master)](https://travis-ci.org/elastacloud/automatic-data-explorer)  [![codecov](https://codecov.io/gh/elastacloud/automatic-data-explorer/branch/master/graph/badge.svg)](https://codecov.io/gh/elastacloud/automatic-data-explorer)\n\nAn R package to explore and quality check data. 
Contains a variety of useful functions which enable automatic checking of data quality, factors and numeric data as well as correlations.\n\n- `targetCorrelations()`\n- `ggdensity()`\n- `gghistogram()`\n- `SummaryStatsCat()`\n- `SummaryStatsNum()`\n- `autoMarkdown()`\n\n## Using targetCorrelations\n\nTo get started, pass a data frame and name the column that you want target correlations for:\n\n    install.packages(\"purrr\")\n    library(purrr)\n\n    data \u003c- data.frame(A = rnorm(50,0,1),\n                       B = runif(50,10,20),\n                       C = seq(1,50,1),\n                       D = rep(LETTERS[1:5], 10))\n\n    targetCorrelations(data, \"B\")\n\nThis should produce a report similar to:\n\n             C          A \n    0.40549008 0.01356416 \n\n## Using autoMarkdown\n\nThe `autoMarkdown()` function can be used to automatically generate R Markdown files directly from one or more\nR scripts. The idea is to take the focus away from thinking about your Markdown styling when doing the\nmost important part of data science, the actual exploration and analysis.\n\nThe function requires that the R script has some formatting; the code that you wish to be incorporated into a\ncode chunk must be separated with a divider, e.g.\n\n    #' # Summary\n    #' This is the summary of the mtcars dataset\n    \n    #.#\n    summary(mtcars)\n    #.#\n    \n    #' ## Histogram of mpg\n    #' This is a histogram of the mpg variable\n    \n    #.#\n    autoHistogramPlot(mtcars, mpg, colour = \"black\", fill = \"blue\")\n    #.#\n    \nThere are two things to note in this example:\n- #.# are the dividers and mean that the code within should be treated as a code chunk\n- #' autoMarkdown recognises these as Roxygen comments and treats them accordingly\n\nSay that we have saved the above in an R script called `mtcars.R`. We can now write this as R Markdown to an existing\n`mtcars.Rmd` file with \n\n    autoMarkdown(\"mtcars.R\", \"mtcars.Rmd\")\n    \nMost projects 
will have multiple separate scripts, perhaps detailing different stages of the data science life-cycle.\nThis makes our workflow much easier to follow and keeps code neat and tidy. However, when it comes to reporting it\nis most likely that we want just one report. If we have multiple scripts these can all be written to the same .Rmd\nfile with\n\n    autoMarkdown(c(\"DataExploration.R\", \"DataCleaning.R\", \"Modelling.R\"), \"ProjectReport.Rmd\", overwrite = TRUE)\n    \nNote the `overwrite = TRUE` argument. With this setting, any existing markdown in the .Rmd file is automatically overwritten. This is useful in most circumstances but could potentially be dangerous if you specify the\nwrong .Rmd file, so use with caution.\n\nThe default setting is to create code chunks that are \"quiet\", that is, they will only display the results of the code,\nnot the code itself or any messages generated by it. Further development may include an option to specify a code chunk\nthat also displays the code itself.\n\n","funding_links":[],"categories":["R"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felastacloud%2Fautomatic-data-explorer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felastacloud%2Fautomatic-data-explorer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felastacloud%2Fautomatic-data-explorer/lists"}