https://github.com/xiaodaigh/julia-data-science-base-docker-img
Julia Data Science Docker with data science packages compiled for instant loading!
- Host: GitHub
- URL: https://github.com/xiaodaigh/julia-data-science-base-docker-img
- Owner: xiaodaigh
- Created: 2020-01-02T12:07:25.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-09-24T03:34:57.000Z (over 4 years ago)
- Last Synced: 2025-01-21T10:08:25.538Z (4 months ago)
- Language: Dockerfile
- Size: 57.6 KB
- Stars: 13
- Watchers: 4
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# There is now a better way to do this with PackageCompiler.jl
# Intro
Julia Data Science Docker with data science packages compiled for instant loading!

Time-to-first-plot (TTFP) is often regarded as one of Julia's main pain points. The PackageCompiler.jl package can compile these packages and alleviate the pain. It works by pre-"compiling" the packages and baking them into the Julia sysimage, so that `using Pkg1` is fast, just as it is for base packages.

This is an experimental first attempt at building a Docker image with the data science packages I use pre-compiled.
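
As a rough illustration of the sysimage approach described above, the sketch below assembles the Julia expression that would ask PackageCompiler.jl to bake a list of packages into a custom sysimage. It assumes the `create_sysimage` function from PackageCompiler.jl 1.0 or later, which is not necessarily how this image was built. The helper name `sysimage_expr`, the file name `ds_sysimage.so`, and the choice of packages are hypothetical. The script only prints the expression, so it can be inspected before anything is compiled.

```shell
# Assemble a Julia expression that bakes the given packages into a sysimage.
# Usage: sysimage_expr OUTPUT_FILE PACKAGE [PACKAGE ...]
# Assumption: PackageCompiler.jl >= 1.0, which provides create_sysimage.
sysimage_expr() {
  out="$1"
  shift
  pkgs=""
  for p in "$@"; do
    # Join package names as Julia symbols: ":A, :B, ..."
    pkgs="${pkgs:+$pkgs, }:$p"
  done
  printf 'using PackageCompiler; create_sysimage([%s]; sysimage_path="%s")\n' "$pkgs" "$out"
}

# Hypothetical package selection and output file name.
sysimage_expr ds_sysimage.so DataFrames CSV
# prints: using PackageCompiler; create_sysimage([:DataFrames, :CSV]; sysimage_path="ds_sysimage.so")
```

To build for real, the printed expression could be passed to `julia -e` in a project where PackageCompiler and the listed packages are installed. Julia can then be started with `julia --sysimage ds_sysimage.so`.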
## Usage
Firstly, install Docker. If you are running Windows, I recommend installing git so you have access to git bash.
On Windows your IP address can be found using `ipconfig`, and on Linux using `ifconfig`. The IP address is needed if you wish to display plots from the Docker container.
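
The IP address ends up in the `DISPLAY` variable passed to the container. A minimal sketch of building that value follows; the helper name `display_for_ip` and the address `192.168.1.10` are hypothetical, and the `:0.0` suffix (display 0, screen 0) is taken from the command in this README.

```shell
# Turn a host IP address into the value used for DISPLAY (display 0, screen 0).
display_for_ip() {
  printf '%s:0.0\n' "$1"
}

# Hypothetical address; substitute the one reported by ipconfig or ifconfig.
display_for_ip 192.168.1.10
# prints: 192.168.1.10:0.0
```

On many Linux systems `hostname -I` lists the host's addresses, so `display_for_ip "$(hostname -I | awk '{print $1}')"` may save a manual lookup. That shortcut is an assumption about the host and is not part of this image.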
**Basic: Windows**
```bash
docker run --rm \
-e DISPLAY=YOUR_IP:0.0 \
-e JUPYTER_ENABLE_LAB=yes \
-v "$PWD":/home/jovyan/work \
-it -p 8888:8888 \
xiaodaidocker2019/julia-data-science-base
```

Often you may wish to save data somewhere on the hard drive. You can do this by mounting a local folder onto a directory such as `somedir` inside the container with the `-v` option.
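
A mount of this kind can be sketched as follows. The script prints the `docker run` command instead of executing it, so the paths can be checked first. The helper name `docker_run_cmd`, the host folder `$PWD/data`, and the container path `/home/jovyan/work/data` are hypothetical; the image name and port come from the command in this README.

```shell
# Print a docker run command that bind-mounts a host folder into the container.
# Usage: docker_run_cmd HOST_DIR CONTAINER_DIR
# HOST_DIR should be an absolute path; Docker treats a bare name as a named volume.
docker_run_cmd() {
  host_dir="$1"
  container_dir="$2"
  printf 'docker run --rm -it -p 8888:8888 -v "%s":"%s" xiaodaidocker2019/julia-data-science-base\n' \
    "$host_dir" "$container_dir"
}

# Hypothetical paths: a data folder under the current directory on the host.
docker_run_cmd "$PWD/data" /home/jovyan/work/data
```

Files written to the container path would then appear in the host folder and remain after the container exits, even with `--rm`.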
## Packages
The packages below are compiled into the image using PackageCompiler.jl:
| Package | Type | Notes |
| ------------------- | ------------------------------- | ------------------------------- |
| CategoricalArrays | Foundation | |
| Clustering | Unsupervised learning | |
| CSV | Data IO | |
| DataConvenience | Data Manipulation/Convenience | |
| DataFrames | Data Manipulation | |
| DataFramesMeta | Data Manipulation | |
| DecisionTree | Supervised learning | |
| FastGroupBy | Data Manipulation/Convenience | |
| Feather | Data IO | |
| FreqTables | Foundation/Statistics | |
| GLM | Supervised learning | |
| JDF | Data IO | For reading/writing JDF files |
| JLBoost | Supervised learning | |
| Lazy | Data Manipulation/Convenience | |
| Missings | Foundation | |
| Parquet | Data IO | ParquetFiles is quite broken at the moment |
| Plots | Plotting | |
| RDatasets | Data | |
| SortingLab | Data Manipulation/Convenience | |
| StatsBase | Foundation/Statistics | |
| StatsPlots | Plotting | |
| Tables | Data Manipulation/Convenience | |
| TableView | Data Viewing | |
| XGBoost | Supervised learning | |

The packages below are included but not compiled:
| Package | Type | Notes |
| -- | -- | -- |
| Pipe | Data Manipulation/Convenience | If compiled into base, there is a warning message with Pipe |
| TableView | Data Viewing | If compiled, it doesn't work with JupyterLab |