https://github.com/xiaodaigh/databench.jl
A package to benchmark data manipulation in Julia vs R data.table
- Host: GitHub
- URL: https://github.com/xiaodaigh/databench.jl
- Owner: xiaodaigh
- License: mit
- Created: 2017-10-21T05:26:48.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-08-29T15:11:07.000Z (over 4 years ago)
- Last Synced: 2025-01-21T10:08:26.680Z (4 months ago)
- Language: Julia
- Size: 106 KB
- Stars: 1
- Watchers: 4
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# DataBench - a Julia vs R data manipulation benchmark suite
A comparison of data manipulation prowess using synthetic data and the [GE Flight Quest data](https://www.kaggle.com/c/flight/data).

# Set up instructions
```julia
# Pkg.add("DataBench")
```

1. Change the `data_path` entry in `settings.csv` to a path that you can write to.
2. Download the 7z file (https://www.kaggle.com/c/flight/download/InitialTrainingSet_rev1.7z).
3. Extract it into the folder `data_path/InitialTrainingSet_rev1`.

# Synthetic benchmarks
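As a rough illustration of the kind of workload a grouping benchmark times, the sketch below generates synthetic data and sums a value column by a key column using only Julia's standard library. It is not DataBench's own code: the row count, number of groups, column names (`id`, `v`), and the helpers `make_synthetic` and `group_sum` are all hypothetical, chosen only for this example.

```julia
using Random  # stdlib; used only to make the synthetic data reproducible

# Build N rows with up to K distinct group keys and one integer value column.
# N, K, and the names id/v are illustrative assumptions, not DataBench settings.
function make_synthetic(N::Int, K::Int; seed::Int = 1)
    rng = MersenneTwister(seed)
    id = rand(rng, 1:K, N)   # group key
    v = rand(rng, 1:5, N)    # value to aggregate
    return id, v
end

# Sum v within each group of id, using a plain Dict as the accumulator.
function group_sum(id::Vector{Int}, v::Vector{Int})
    acc = Dict{Int,Int}()
    for i in eachindex(id, v)
        acc[id[i]] = get(acc, id[i], 0) + v[i]
    end
    return acc
end

id, v = make_synthetic(1_000_000, 100)
result = group_sum(id, v)               # first call also triggers compilation
elapsed = @elapsed group_sum(id, v)     # time a second call, excluding compilation
println("grouped sum over ", length(id), " rows took ", elapsed, " s")
```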
Adapted from data.table's [official benchmarks](https://github.com/Rdatatable/data.table/wiki/Benchmarks-:-Grouping#code-to-reproduce-the-timings-above-).

# "Real-life" benchmarks
Uses [GE Flight Quest data](https://www.kaggle.com/c/flight/data), the largest tabular dataset on Kaggle at the time of writing.

# Companion post
[Speed of data manipulations in Julia vs R](https://www.codementor.io/zhuojiadai/speed-of-data-manipulation-in-julia-vs-r-cd7praapv)

# Similar repos
https://github.com/szilard/benchm-databases