{"id":19490271,"url":"https://github.com/rstudio/rstarthere","last_synced_at":"2025-10-08T09:46:54.154Z","repository":{"id":66205568,"uuid":"57074111","full_name":"rstudio/RStartHere","owner":"rstudio","description":"A guide to some of the most useful R Packages that we know about","archived":false,"fork":false,"pushed_at":"2019-09-16T16:42:30.000Z","size":358,"stargazers_count":665,"open_issues_count":6,"forks_count":218,"subscribers_count":77,"default_branch":"master","last_synced_at":"2025-05-24T09:08:01.486Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rstudio.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-04-25T20:27:24.000Z","updated_at":"2025-04-24T02:00:52.000Z","dependencies_parsed_at":"2023-03-10T18:00:42.791Z","dependency_job_id":null,"html_url":"https://github.com/rstudio/RStartHere","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rstudio/RStartHere","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rstudio%2FRStartHere","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rstudio%2FRStartHere/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rstudio%2FRStartHere/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rstudio%2FRStartHere/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rstudio","download_url":"https://codeload.github.com/rstudio/RStartHere/tar.gz/refs/heads/master","sbom_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rstudio%2FRStartHere/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278924141,"owners_count":26069400,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T21:12:17.237Z","updated_at":"2025-10-08T09:46:54.120Z","avatar_url":"https://github.com/rstudio.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n# RStartHere\nA guide to some of the most useful R Packages that we know about, organized by their role in data science.\n\n[Click here to suggest packages.](https://github.com/rstudio/RStartHere/edit/master/README.Rmd)\n\n\n## Data Science Workflow \nEach data science project is different, but each follows the same general steps. You:\n\n![\"The data science workflow\"](data-science.png)\n\n\n1. [Import](#import) your data into R\n2. [Tidy](#tidy) it\n3. Understand your data by iteratively \n    a. [visualizing](#visualize)\n    b. [transforming](#transform) and \n    c. [modeling](#modelinfer) your data\n4. [Infer](#modelinfer) how your understanding applies to other data sets (_including future data, i.e. predictions_)\n5. [Communicate](#communicate) your results to an audience, or\n6. 
[Automate](#automate) your analysis for easy reuse\n7. [Program](#program) the whole way through, since you do each of these things on a computer\n\nBelow we list the most useful R packages that we know of for each step.\n\n## Import\nThese packages help you import data into R and save data.\n\n* [feather](https://blog.rstudio.org/2016/03/29/feather/) - a fast, lightweight file format used by both R and Python\n* [readr](https://blog.rstudio.org/2015/10/28/readr-0-2-0/) - reads tabular data\n* [readxl](https://blog.rstudio.org/2015/04/15/readxl-0-1-0/) - reads Microsoft Excel spreadsheets\n* [openxlsx](https://github.com/awalker89/openxlsx) - reads and writes Microsoft Excel spreadsheets\n* [googlesheets](https://github.com/jennybc/googlesheets) - reads Google spreadsheets\n* [haven](https://blog.rstudio.org/2015/03/04/haven-0-1-0/) - reads SAS, SPSS, and Stata files\n* [httr](https://blog.rstudio.org/2016/02/02/httr-1-1-0/) - reads data from web APIs\n* [rvest](https://blog.rstudio.org/2014/11/24/rvest-easy-web-scraping-with-r/) - scrapes data from web pages\n* [xml2](https://github.com/hadley/xml2) - reads HTML and XML data\n* [webreadr](https://cran.r-project.org/web/packages/webreadr/vignettes/Introduction.html) - reads common web log formats\n* [DBI](https://github.com/rstats-db/DBI) - a universal interface to database management systems (DBMS)\n    + [RMySQL](https://github.com/rstats-db/RMySQL) - MySQL driver for DBI\n    + [RPostgres](https://github.com/rstats-db/RPostgres) - Postgres driver for DBI\n    + [RSQLite](https://github.com/rstats-db/RSQLite) - SQLite driver for DBI\n    + [bigrquery](https://github.com/rstats-db/bigrquery) - Google BigQuery driver for DBI\n    + [MonetDBLite](https://github.com/hannesmuehleisen/MonetDBLite) - MonetDBLite, an in-process columnar store\n* [PivotalR](https://github.com/pivotalsoftware/PivotalR) - reads data from and interfaces with [Postgres](http://www.postgresql.org), [Greenplum](http://greenplum.org), and 
[HAWQ](http://hawq.incubator.apache.org) \n* [dplyr](https://github.com/hadley/dplyr) - contains an interface to common databases\n* [data.table](https://github.com/Rdatatable/data.table) - `fread()` for fast table reading\n* [git2r](https://github.com/ropensci/git2r) - tools to access git repositories\n* [BioInstaller](https://github.com/JhuangLab/BioInstaller) - Downloader for biological software and database.\n\n\n## Tidy\nThese packages help you wrangle your data into a form that is easy to analyze in R.\n\n* [tidyr](https://github.com/hadley/tidyr) - tools for tidying layout of tabular data\n* [dplyr](https://github.com/hadley/dplyr) - tools for joining multiple tables into a tidy data set\n* [purrr](https://github.com/hadley/purrr) - tools for applying R functions to data structures, very useful when tidying\n* [broom](http://varianceexplained.org/r/broom-intro/) - tools for tidying statistical models into data frames\n* [zoo](https://www.google.com/webhp?sourceid=chrome-instant\u0026ion=1\u0026espv=2\u0026ie=UTF-8#q=r%20zoo) - data structures for time series data\n* [PivotalR](https://github.com/pivotalsoftware/PivotalR) - R wrappers for in-database SQL operations (i.e. 
join, group by)\n\n\n## Visualize\nThese packages help you visualize your data.\n\n* [ggplot2](http://docs.ggplot2.org/current/) with [extensions](http://www.ggplot2-exts.org/) - a versatile system for making plots\n    + [ggthemes](https://github.com/jrnold/ggthemes) - plot style themes\n    + [ggmap](https://github.com/dkahle/ggmap) - maps with Google Maps, Open Street Maps, etc.\n    + [ggiraph](http://davidgohel.github.io/ggiraph/introduction.html) - interactive ggplots\n    + [ggstance](https://github.com/lionel-/ggstance) - horizontal versions of common plots\n    + [GGally](https://github.com/ggobi/ggally) - scatterplot matrices\n    + [ggalt](https://github.com/hrbrmstr/ggalt) - additional coordinate systems, geoms, etc.\n    + [ggforce](https://github.com/thomasp85/ggforce) - additional geoms, etc.\n    + [ggrepel](https://github.com/slowkow/ggrepel) - prevent plot labels from overlapping\n    + [ggraph](https://github.com/thomasp85/ggraph) - graphs, networks, trees and more\n    + [ggpmisc](https://cran.rstudio.com/web/packages/ggpmisc/) - photo-biology related extensions\n    + [geomnet](https://github.com/sctyner/geomnet) - network visualization\n    + [ggExtra](https://github.com/daattali/ggExtra) - marginal histograms for a plot\n    + [gganimate](https://github.com/dgrtwo/gganimate) - animations\n    + [plotROC](https://github.com/sachsmc/plotROC) - interactive ROC plots\n    + [ggspectra](https://cran.rstudio.com/web/packages/ggspectra/) - tools for plotting light spectra\n    + [ggnetwork](https://github.com/briatte/ggnetwork) - geoms to plot networks\n    + [ggtech](https://github.com/ricardo-bion/ggtech) - style themes for plots\n    + [ggradar](https://github.com/ricardo-bion/ggradar) - radar charts\n    + [ggTimeSeries](https://github.com/Ather-Energy/ggTimeSeries) - time series visualizations\n    + [ggtree](https://bioconductor.org/packages/release/bioc/html/ggtree.html) - tree visualizations\n    + [ggseas](https://github.com/ellisp/ggseas) 
- seasonal adjustment tools\n* [lattice](http://lattice.r-forge.r-project.org/) - Trellis graphics\n* [rgl](https://cran.r-project.org/web/packages/rgl/vignettes/rgl.html) - interactive 3D plots\n* [ggvis](http://ggvis.rstudio.com/) - versatile system for interactive graphs\n* [htmlwidgets](http://www.htmlwidgets.org/) - framework for creating JavaScript widgets with R\n    + [leaflet](http://rstudio.github.io/leaflet/) - Interactive maps\n    + [dygraphs](http://rstudio.github.io/dygraphs) - Interactive time series plots\n    + [plotly](https://plot.ly/r/) - Interactive plots\n    + [rbokeh](http://hafen.github.io/rbokeh) - Interactive Bokeh plots\n    + [Highcharter](http://jkunst.com/highcharter/) - Interactive Highcharts plots\n    + [visNetwork](http://dataknowledge.github.io/visNetwork) - Interactive network graphs\n    + [networkD3](http://christophergandrud.github.io/networkD3/) - Interactive d3 network graphs\n    + [d3heatmap](https://github.com/rstudio/d3heatmap) - Interactive d3 heatmaps\n    + [DT](http://rstudio.github.io/DT/) - Interactive tables\n    + [threejs](https://github.com/bwlewis/rthreejs) - Interactive 3d plots and globes\n    + [rglwidget](http://cran.at.r-project.org/web/packages/rglwidget/index.html) - Interactive 3d plot\n    + [DiagrammeR](http://rich-iannone.github.io/DiagrammeR/) - Interactive diagrams\n    + [MetricsGraphics](http://hrbrmstr.github.io/metricsgraphics/) - Interactive MetricsGraphics plots\n* [rCharts](https://ramnathv.github.io/rCharts/) - many interactive JavaScript visualizations\n* [coefplot](http://github.com/jaredlander/coefplot) - visualizes model statistics\n* [quantmod](http://www.quantmod.com/) - candlestick financial charts\n* [colorspace](https://cran.r-project.org/web/packages/colorspace/vignettes/hcl-colors.pdf) - HCL-based color palettes\n* [viridis](https://github.com/sjmgarnier/viridis) - Matplotlib viridis color palette for R\n* [munsell](https://github.com/cwickham/munsell) - Munsell color palettes 
for R.\n* RColorBrewer - color palettes for plots. No manual or website.\n* dichromat - color-blind friendly palettes. No manual or website.\n* [igraph](http://igraph.org/) - Network Analysis and Visualization\n* [latticeExtra](http://latticeextra.r-forge.r-project.org/) - Extensions for lattice graphics\n* [sp](https://github.com/edzer/sp/) - tools for spatial data\n\n## Transform\nThese packages help you transform your data into new types of data.\n\n* [dplyr](https://github.com/hadley/dplyr) - a grammar of data transformation\n* [magrittr](https://github.com/smbache/magrittr) - a concise syntax for calling sequences of functions\n* [tibble](https://github.com/hadley/tibble) - efficient display structure for tabular data\n* [stringr](https://blog.rstudio.org/2015/05/05/stringr-1-0-0/) - tools for working with strings and regular expressions\n* [lubridate](https://cran.r-project.org/web/packages/lubridate/vignettes/lubridate.html) - tools for working with dates and times\n* [xts](http://r-forge.r-project.org/projects/xts) - tools for time series based data\n* [data.table](https://github.com/Rdatatable/data.table/wiki) - fast data manipulation\n* [vtreat](https://github.com/WinVector/vtreat) - tools for pre-processing variables for predictive modeling\n* [stringi](http://www.rexamine.com/resources/stringi/) - fast string processing facilities. \n* [Matrix](http://matrix.r-forge.r-project.org/) - LAPACK methods for dense and sparse matrix operations\n\n## Model/Infer\nThese packages help you build models and make inferences. 
Often the same packages will focus on both topics.\n\n* [car](https://r-forge.r-project.org/projects/car/) - functions from An R Companion to Applied Regression\n* [Hmisc](https://github.com/harrelfe/Hmisc) - miscellaneous functions for data analysis\n* [multcomp](http://multcomp.r-forge.r-project.org/) - Simultaneous Inference in General Parametric Models\n* [pbkrtest](http://people.math.aau.dk/~sorenh/software/pbkrtest/) - parametric bootstrap test for linear mixed effects models \n* [mvtnorm](http://mvtnorm.r-forge.r-project.org/) - Multivariate Normal and t Distributions\n* [MatrixModels](http://matrix.r-forge.r-project.org/) - Modelling with Sparse And Dense Matrices\n* [SparseM](http://www.econ.uiuc.edu/~roger/research/sparse/sparse.html) - linear algebra for sparse matrices\n* [lme4](https://github.com/lme4/lme4) - Linear Mixed-Effects Models using Eigen C++ library\n* [broom](http://varianceexplained.org/r/broom-intro/) - tools for tidying statistical models into data frames\n* [caret](http://topepo.github.io/caret/index.html) - tools for Classification And REgression Training\n* [glmnet](https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html) - generalized linear models via penalized maximum likelihood\n* [mosaic](http://mosaic-web.org/) - Tools for teaching mathematics, statistics, computation and modeling\n* [gbm](https://github.com/gbm-developers/gbm) - gradient boosted regression models\n* [xgboost](https://github.com/dmlc/xgboost) - Extreme Gradient Boosting\n* [randomForest](https://www.stat.berkeley.edu/~breiman/RandomForests/) - Random Forests for Classification and Regression\n* [ranger](https://github.com/imbs-hl/ranger) - a fast implementation of Random Forests\n* [rstan](https://github.com/stan-dev/rstan) - for Bayesian statistical modeling, data analysis, and prediction\n* [h2o](http://www.h2o.ai/) - parallel distributed machine learning algorithms\n* [ROCR](http://rocr.bioinf.mpi-sb.mpg.de/) - plots to visualize classifier performance\n* 
[pROC](http://web.expasy.org/pROC/) - Tools for visualizing, smoothing and comparing ROC curves\n* [PivotalR](https://github.com/pivotalsoftware/PivotalR) - R wrappers for [MADlib](http://madlib.incubator.apache.org)'s parallel distributed machine learning algorithms\n\n\n## Communicate\nThese packages help you communicate the results of data science to your audiences.\n\n* [rmarkdown](http://rmarkdown.rstudio.com/) - easy-to-use format for reproducible reports and dynamic documents in R\n* [knitr](http://yihui.name/knitr/) - embed R code within pdf and html reports\n* [flexdashboard](http://rstudio.github.io/flexdashboard/) - easy-to-create dashboards based on rmarkdown\n* [bookdown](https://bookdown.org/) - books and long documents built on R Markdown\n* [rticles](https://github.com/rstudio/rticles) - ready to use R Markdown templates\n* [tufte](http://rstudio.github.io/tufte/) - Tufte handout R Markdown template\n* [DT](http://rstudio.github.io/DT/) - Interactive data tables\n* [pixiedust](https://github.com/nutterb/pixiedust) - Customized tables\n* [xtable](https://cran.r-project.org/web/packages/xtable/vignettes/xtableGallery.pdf) - Customized tables\n* [highr](https://github.com/yihui/highr) - Syntax Highlighting for R Source Code\n* [formatR](http://yihui.name/formatR/) - `tidy_source()` to format R source code\n* [yaml](https://github.com/viking/r-yaml) - Methods to convert R data to YAML and back\n* [pander](http://rapporter.github.io/pander/) - renders R objects into Pandoc markdown.\n* [configr](https://github.com/Miachol/configr) - Integrated and improved configuration file parser (json,ini,yaml,toml).\n\n\n\n## Automate\nThese packages help you create data science products that automate your analyses.\n\n* [shiny](http://shiny.rstudio.com/) - tools to make interactive web apps with R\n    + [shinydashboard](http://rstudio.github.io/shinydashboard/) - interactive dashboards with R\n    + [shinythemes](http://rstudio.github.io/shinythemes/) - style 
themes for Shiny apps\n    + [shinyAce](http://trestletech.github.io/shinyAce/) - Ace text editor for Shiny apps\n    + [shinyjs](https://github.com/daattali/shinyjs/blob/master/README.md) - adds common JavaScript operations to Shiny apps\n    + [miniUI](https://github.com/rstudio/miniUI) - UI elements for Shiny gadgets, interactive apps integrated into the R command line workflow\n    + [shinyapps.io](https://www.shinyapps.io/) - hosting service for Shiny apps\n    + [Shiny Server Open Source](https://www.rstudio.com/products/shiny/shiny-server/) - open source server to host Shiny apps\n    + [Shiny Server Pro](https://www.rstudio.com/products/shiny/shiny-server/) - server to host Shiny apps enhanced with features for business enterprises\n* [rsconnect](http://shiny.rstudio.com/articles/shinyapps.html) - deploys Shiny apps to [shinyapps.io](https://www.shinyapps.io/)\n* [plumber](http://plumber.trestletech.com/) - converts R code to a web API\n* [rmarkdown](http://rmarkdown.rstudio.com/) - easy-to-use format for reproducible reports and dynamic documents in R\n* [rstudioapi](https://github.com/rstudio/rstudioapi) - safely access RStudio IDE's API\n\n## Program\nThese packages make it easier to program with the R language.\n\n* [RStudio Desktop IDE](https://www.rstudio.com/products/rstudio/#Desktop) - IDE application for R \n* [RStudio Server Open Source](https://www.rstudio.com/products/rstudio/#Server) - server based IDE for R\n* [RStudio Server Professional](https://www.rstudio.com/products/rstudio/#Server) - server based IDE for R enhanced with features for business enterprises\n* [devtools](https://github.com/hadley/devtools) - tools that make it easier to develop R packages\n* [packrat](https://rstudio.github.io/packrat/) - creates project specific libraries, which handle package versioning and enhance reproducibility\n* [drat](https://github.com/eddelbuettel/drat) - tools to create and use alternative R package repositories\n* 
[testthat](https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf) - easy-to-use system for unit testing packages\n* [roxygen2](https://github.com/klutometis/roxygen) - easy-to-use method for documenting packages\n* [purrr](https://github.com/hadley/purrr) - tools for applying R functions to data structures\n* [profvis](https://github.com/rstudio/profvis) - visualizes code profiling data from R\n* [Rcpp](http://www.rcpp.org/) - C++ API for R\n* [R6](https://github.com/wch/R6) - fast, simple object class that uses reference semantics\n* [htmltools](https://github.com/rstudio/htmltools) - Tools for HTML generation and output\n* [nloptr](https://github.com/jyypma/nloptr) - interface to NLopt non-linear optimization library.\n* [minqa](http://optimizer.r-forge.r-project.org/) - optimization algorithms.\n* [rngtools](http://renozao.github.io/rngtools/) - Utilities for working with Random Number Generators\n* [NMF](http://renozao.github.io/NMF/) - Nonnegative Matrix Factorization \n* [crayon](https://github.com/gaborcsardi/crayon) - Adds color to terminal output\n* [RJSONIO](https://github.com/duncantl/RJSONIO) - convert R objects to JSON notation\n* [jsonlite](https://github.com/jeroenooms/jsonlite) - a fast JSON parser and generator for R\n* [RcppArmadillo](https://github.com/RcppCore/RcppArmadillo) - interface to 'Armadillo' Templated Linear Algebra Library\n\n## Data \nThese packages contain data sets to use as training data or toy examples.\n\n* [babynames](https://github.com/hadley/babynames) - Names given to US babies 1880-2014\n* [neiss](https://github.com/hadley/neiss) - sample of all accidents reported to US emergency rooms 2009-2014\n* [yrbss](https://github.com/hadley/yrbss) - Youth Risk Behaviour Surveillance System data from 1991 to 2013\n* [nycflights13](https://github.com/hadley/nycflights13) - all out-bound flights from NYC in 2013\n* [hflights](https://github.com/hadley/hflights) - flights departing Houston in 2011\n* 
[USAboundaries](https://github.com/ropensci/USAboundaries) - Historical and Contemporary Boundaries of the United States of America\n* [rworldmap](https://github.com/AndySouth/rworldmap) - country border data\n* [usdanutrients](https://github.com/hadley/usdanutrients) - USDA nutrient database\n* [fueleconomy](https://github.com/hadley/fueleconomy) - EPA fuel economy data\n* [nasaweather](https://github.com/hadley/nasaweather) - geographic and atmospheric measures on a very coarse 24 by 24 grid covering Central America\n* [mexico-mortality](https://github.com/hadley/mexico-mortality) - deaths in Mexico\n* [data-movies](https://github.com/hadley/data-movies) and [ggplot2movies](https://cran.r-project.org/web/packages/ggplot2movies/) - data from the Internet Movie Database (IMDb)\n* [pop-flows](https://github.com/hadley/pop-flows) - Population flows around the USA in 2008\n* [data-housing-crisis](https://github.com/hadley/data-housing-crisis) - Clean data related to the 2008 US housing crisis\n* [gun-sales](https://github.com/NYTimes/gunsales) - Statistical analysis of monthly background checks of gun purchases from The New York Times\n* [stationaRy](https://github.com/rich-iannone/stationaRy) - hourly meteorological data from one of thousands of global stations\n* [gapminder](https://github.com/jennybc/gapminder) - Excerpt from the Gapminder data\n* [janeaustenr](https://github.com/juliasilge/janeaustenr) - Jane Austen's Complete Novels\n\n## Criteria\n\nWhat makes an R Package useful? A useful R package should perform a useful task, and it should do it well. 
Here are some criteria that we used to make the list.\n\n* The code in the package runs fast, with few errors.\n* The code in the package has an intuitive syntax that is easy to remember.\n* The package plays well with other packages; you do not need to munge your data into new forms to use the package.\n* The package is widely used and recommended by its users.\n* The package has a development website, or series of vignettes, that makes the package easy to learn.\n* The package is developed in the open (e.g. on GitHub or R-Forge).\n* The package uses tests to ensure that it will be stable and bug-free well into the future.\n* The package is stable and available from CRAN, or we are personally involved with the package and committed to its development.\n\nFor other useful choices, please check out our list of [popular packages that did not quite meet these criteria](runners-up.md).\n\nYou can learn more about packages in R with the [CRAN task views](https://cran.r-project.org/web/views/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frstudio%2Frstarthere","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frstudio%2Frstarthere","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frstudio%2Frstarthere/lists"}