https://github.com/vhoulbreque/dafter
📥 Command-line downloader for public datasets
brew-style command-line data database dataset download fetcher linux osx public-data unix
- Host: GitHub
- URL: https://github.com/vhoulbreque/dafter
- Owner: vhoulbreque
- License: mit
- Created: 2018-10-12T13:35:46.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-06-22T18:55:44.000Z (over 5 years ago)
- Last Synced: 2024-09-30T09:18:58.944Z (3 months ago)
- Topics: brew-style, command-line, data, database, dataset, download, fetcher, linux, osx, public-data, unix
- Language: Python
- Homepage: https://vinzeebreak.github.io/dafter-loader/
- Size: 114 KB
- Stars: 24
- Watchers: 2
- Forks: 3
- Open Issues: 37
Metadata Files:
- Readme: README.md
- License: LICENSE
# dafter: the data fetcher
![dafter-logo](docs/dafter_logo.png)
## You have just found dafter.
Dafter is a command-line downloader for public datasets. It takes care of downloading and formatting the datasets' files so that you can spend hours building models instead of looking for datasets and their URLs.
- [Install](#install-dafter)
- [Commands](#commands)
- [How to contribute](#how-to-contribute)

## Install dafter
To install dafter, just do:
```bash
pip install dafter
```

## Commands
To download the MNIST dataset:
```bash
dafter get mnist
```

To delete MNIST from your machine:
```bash
dafter delete mnist
```

To search among downloadable datasets:
```bash
# Search all available datasets
dafter search
# Search all available datasets that have the tags "image" and "deep-learning"
# and whose name contains "mni"
dafter search mni --tags image deep-learning
```

To list all the datasets that have been downloaded and are stored on your machine:
```bash
# Lists all datasets in database
dafter list
# Lists all datasets in database that have the tag "twitter" and whose name
# contains "sentiment"
dafter list sentiment --tags twitter
```

## Update
To update `dafter`, do:
```bash
pip install --upgrade dafter
```

## Uninstall
To uninstall `dafter`, do:
```bash
pip uninstall dafter
```

## How to contribute?
### Add a new dataset
To add a new dataset, add a JSON file named `name-of-the-dataset.json` to the `datasets-configs` folder:
```json
{
    "name": "name-of-the-dataset",
    "urls": [
        {
            "url": "https://site.com/file1.tar.gz",
            "bytes": 45221
        },
        {
            "url": "https://site.com/file2.tar.gz",
            "bytes": 1147803
        }
    ],
    "type": "tar.gz",
    "tags": ["tag1", "tag2", "tag3"],
    "description": "This is a description of the dataset",
    "source": "https://site.com/"
}
```
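
Before opening a pull request, it can help to sanity-check a new config file against the shape of the example above. The following is a minimal sketch, not part of dafter itself: the `check_config` helper and the `REQUIRED_KEYS` set are hypothetical names, and treating every key from the example as required is an assumption rather than a documented rule.

```python
import json

# Keys copied from the example config above. Treating all of them as
# required is an assumption of this sketch, not a documented dafter rule.
REQUIRED_KEYS = {"name", "urls", "type", "tags", "description", "source"}


def check_config(config, filename):
    """Return a list of problems found in a dataset config (empty if none)."""
    problems = []

    missing = REQUIRED_KEYS - set(config)
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))

    # The README asks for a file called name-of-the-dataset.json.
    name = config.get("name")
    if name is not None and filename != f"{name}.json":
        problems.append(f"file should be named {name}.json")

    # Each entry in "urls" is expected to carry a URL and a size in bytes.
    for i, entry in enumerate(config.get("urls", [])):
        if not isinstance(entry, dict):
            problems.append(f"urls[{i}]: expected an object")
            continue
        if not str(entry.get("url", "")).startswith(("http://", "https://")):
            problems.append(f"urls[{i}]: 'url' is not an http(s) URL")
        size = entry.get("bytes")
        if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
            problems.append(f"urls[{i}]: 'bytes' must be a positive integer")

    if not isinstance(config.get("tags", []), list):
        problems.append("'tags' must be a list")

    return problems


example = json.loads("""
{
    "name": "name-of-the-dataset",
    "urls": [
        {"url": "https://site.com/file1.tar.gz", "bytes": 45221},
        {"url": "https://site.com/file2.tar.gz", "bytes": 1147803}
    ],
    "type": "tar.gz",
    "tags": ["tag1", "tag2", "tag3"],
    "description": "This is a description of the dataset",
    "source": "https://site.com/"
}
""")

print(check_config(example, "name-of-the-dataset.json"))  # prints []
```

To check a real file, load it with `json.load` and pass its base name as `filename`; any returned strings describe what differs from the example layout.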