https://github.com/xiaodaigh/data_manipulation_benchmarks
A set of data manipulation benchmarking code for Julia and R
- Host: GitHub
- URL: https://github.com/xiaodaigh/data_manipulation_benchmarks
- Owner: xiaodaigh
- Created: 2017-09-25T10:13:05.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-02-08T14:59:56.000Z (over 5 years ago)
- Last Synced: 2025-04-10T05:15:42.136Z (about 1 month ago)
- Topics: comparison, data-manipulation-prowess, julia, r
- Language: Julia
- Homepage:
- Size: 22.5 KB
- Stars: 5
- Watchers: 4
- Forks: 4
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
# Julia vs R data manipulation benchmark suite
A comparison of data manipulation prowess using synthetic data and the [GE Flight Quest data](https://www.kaggle.com/c/flight/data).

# Set up instructions
1. Change `data_path` in `settings.csv` to a path that you can write to.
2. Download the 7z file (https://www.kaggle.com/c/flight/download/InitialTrainingSet_rev1.7z).
3. Extract it into the folder `data_path/InitialTrainingSet_rev1`.

# Synthetic benchmarks
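As a rough illustration of what a synthetic grouping benchmark measures, the sketch below generates keyed random data and times a grouped sum. It is written in Python with the standard library only and is not the Julia or R code in this repository. The `id001`-style key labels, the value range, the row and group counts, and the helper names `make_synthetic_rows`, `grouped_sum` and `time_call` are all assumptions chosen for illustration. Taking the best of several runs is one common way to reduce timing noise, not necessarily the method the benchmarks use.

```python
import random
import time
from collections import defaultdict


def make_synthetic_rows(n_rows, n_groups, seed=0):
    """Return (group_key, value) pairs with keys drawn uniformly from n_groups labels."""
    rng = random.Random(seed)  # fixed seed so repeated runs see the same data
    keys = [f"id{i:03d}" for i in range(1, n_groups + 1)]  # hypothetical key labels
    return [(rng.choice(keys), rng.randint(1, 5)) for _ in range(n_rows)]


def grouped_sum(rows):
    """Sum the value column within each group key."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


def time_call(func, *args, repeats=3):
    """Run func(*args) several times; return the best wall-clock time and the last result."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


if __name__ == "__main__":
    # Sizes are illustrative assumptions, kept small enough to run quickly.
    rows = make_synthetic_rows(n_rows=200_000, n_groups=100)
    seconds, totals = time_call(grouped_sum, rows)
    print(f"grouped sum over {len(rows):,} rows, {len(totals)} groups: {seconds:.3f}s")
```
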
Adapted from data.table's [official benchmarks](https://github.com/Rdatatable/data.table/wiki/Benchmarks-:-Grouping#code-to-reproduce-the-timings-above-).

# "Real-life" benchmarks
Uses [GE Flight Quest data](https://www.kaggle.com/c/flight/data), the largest tabular dataset on Kaggle at the time of writing.

# Companion post
[Speed of data manipulations in Julia vs R](https://www.codementor.io/zhuojiadai/speed-of-data-manipulation-in-julia-vs-r-cd7praapv)