https://github.com/codekow/demo-containerized-dataset
How to make your dataset available and immutable via containers
containers data-science datasets docker
- Host: GitHub
- URL: https://github.com/codekow/demo-containerized-dataset
- Owner: codekow
- Created: 2023-10-18T18:57:06.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-14T22:34:59.000Z (over 2 years ago)
- Last Synced: 2025-02-13T18:49:37.200Z (over 1 year ago)
- Topics: containers, data-science, datasets, docker
- Homepage:
- Size: 9.77 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Containerized Datasets
This repo shows options for serving and maintaining datasets used to
train machine learning models, using common tools.
Containers provide immutability and versioning, two properties that support [reproducibility and replicability](https://www.ncbi.nlm.nih.gov/books/NBK547546/), which are key to **science**.
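To make the immutability idea concrete, here is a minimal sketch of content addressing, the same general idea behind container image digests. It is not taken from this repo: the function name `dataset_digest` and the directory layout are hypothetical, and the hashing scheme is an illustrative assumption rather than the format any container tool uses.

```python
import hashlib
from pathlib import Path


def dataset_digest(root: str) -> str:
    """Return a SHA-256 digest covering every file path and its bytes under root.

    Hypothetical helper for illustration; not the digest format used by
    container registries.
    """
    base = Path(root)
    h = hashlib.sha256()
    # Sort paths so the digest does not depend on filesystem listing order.
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        rel = path.relative_to(base).as_posix().encode()
        data = path.read_bytes()  # fine for a sketch; stream large files in practice
        # Length-prefix the name and contents so field boundaries are unambiguous.
        h.update(len(rel).to_bytes(8, "big"))
        h.update(rel)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return "sha256:" + h.hexdigest()
```

Any change to a file name or to a single byte of data yields a different digest, so recording the digest alongside a training run identifies exactly which version of the dataset was used. Referring to a container image by digest rather than by a mutable tag gives a similar guarantee.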
Disclaimer: This is one method among many and may not meet your needs or be the best option. This is **a way**, NOT **the way**.