Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/johnmyleswhite/RDatasets.jl
Julia package for loading many of the data sets available in R
https://github.com/johnmyleswhite/RDatasets.jl
Last synced: 3 months ago
JSON representation
Julia package for loading many of the data sets available in R
- Host: GitHub
- URL: https://github.com/johnmyleswhite/RDatasets.jl
- Owner: JuliaStats
- License: gpl-3.0
- Created: 2012-11-24T05:16:09.000Z (almost 12 years ago)
- Default Branch: master
- Last Pushed: 2024-04-05T12:02:12.000Z (7 months ago)
- Last Synced: 2024-07-30T01:52:28.136Z (3 months ago)
- Language: R
- Size: 49.2 MB
- Stars: 158
- Watchers: 14
- Forks: 56
- Open Issues: 23
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- awesome-julia-datasciences - RDataSets - Julia package for loading many of the data sets available in R. (APL / Data Analysis / Data Visualization)
README
# RDatasets.jl
[![Build status](https://github.com/JuliaStats/RDatasets.jl/workflows/CI/badge.svg)](https://github.com/JuliaStats/RDatasets.jl/actions?query=workflow%3ACI+branch%3Amaster)
The RDatasets package provides an easy way for Julia users to experiment with most of the standard data sets that are available in the core of R as well as datasets included with many of R's most popular packages. This package is essentially a simplistic port of the Rdatasets repo created by Vincent Arelbundock, who conveniently gathered data sets from many of the standard R packages in one convenient location on GitHub at https://github.com/vincentarelbundock/RdatasetsIn order to load one of the data sets included in the RDatasets package, you will need to have the `DataFrames` package installed. This package is automatically installed as a dependency of the `RDatasets` package if you install `RDatasets` as follows:
Pkg.add("RDatasets")
After installing the RDatasets package, you can then load data sets using the `dataset()` function, which takes the name of a package and a data set as arguments:
using RDatasets
iris = dataset("datasets", "iris")
neuro = dataset("boot", "neuro")# Data Sets
The `RDatasets.packages()` function returns a table of represented R packages:
Package | Title
-------------|------------------------------------------------------------------------
COUNT | Functions, data and code for count data.
Ecdat | Data sets for econometrics
HSAUR | A Handbook of Statistical Analyses Using R (1st Edition)
HistData | Data sets from the history of statistics and data visualization
ISLR | Data for An Introduction to Statistical Learning with Applications in R
KMsurv | Data sets from Klein and Moeschberger (1997), Survival Analysis
MASS | Support Functions and Datasets for Venables and Ripley's MASS
SASmixed | Data sets from "SAS System for Mixed Models"
Zelig | Everyone's Statistical Software
adehabitatLT | Analysis of Animal Movements
boot | Bootstrap Functions (Originally by Angelo Canty for S)
car | Companion to Applied Regression
cluster | Cluster Analysis Extended Rousseeuw et al.
datasets | The R Datasets Package
gamair | Datasets used in the book Generalized Additive Models: An Introduction with R
gap | Genetic analysis package
ggplot2 | An Implementation of the Grammar of Graphics
lattice | Lattice Graphics
lme4 | Linear mixed-effects models using Eigen and S4
mgcv | Mixed GAM Computation Vehicle with GCV/AIC/REML smoothness estimation
mlmRev | Examples from Multilevel Modelling Software Review
nlreg | Higher Order Inference for Nonlinear Heteroscedastic Models
plm | Linear Models for Panel Data
plyr | Tools for splitting, applying and combining data
pscl | Political Science Computational Laboratory, Stanford University
psych | Procedures for Psychological, Psychometric, and Personality Research
quantreg | Quantile Regression
reshape2 | Flexibly Reshape Data: A Reboot of the Reshape Package.
robustbase | Basic Robust Statistics
rpart | Recursive Partitioning and Regression Trees
sandwich | Robust Covariance Matrix Estimators
sem | Structural Equation Models
survival | Survival Analysis
vcd | Visualizing Categorical DataThe `RDatasets.datasets()` function returns a table describing the 700+ included datasets. Or pass in a package name (e.g. `RDatasets.datasets("mlmRev")`) for a targeted table:
Package|Dataset|Title|Rows|Columns
---|---|---|---:|---:
mlmRev|Chem97|Scores on A-level Chemistry in 1997|31022|8
mlmRev|Contraception|Contraceptive use in Bangladesh|1934|6
mlmRev|Early|Early childhood intervention study|309|4
mlmRev|Exam|Exam scores from inner London|4059|10
mlmRev|Gcsemv|GCSE exam score|1905|5
mlmRev|Hsb82|High School and Beyond - 1982|7185|8
mlmRev|Mmmec|Malignant melanoma deaths in Europe|354|6
mlmRev|Oxboys|Heights of Boys in Oxford|234|4
mlmRev|ScotsSec|Scottish secondary school scores|3435|6
mlmRev|bdf|Language Scores of 8-Graders in The Netherlands|2287|28
mlmRev|egsingle|US Sustaining Effects study|7230|12
mlmRev|guImmun|Immunization in Guatemala|2159|13
mlmRev|guPrenat|Prenatal care in Guatemala|2449|15
mlmRev|star|Student Teacher Achievement Ratio (STAR) project data|26796|18# Licensing and Intellectual Property
Following Vincent's lead, we have assumed that all of the data sets in this repository can be made available under the GPL-3 license. If you know that one of the datasets released here should not be released publicly or if you know that a data set can only be released under a different license, please contact me so that I can remove the data set from this repository.