https://github.com/DataHerb/dataherb-flora
DataHerb Flora: The core of DataHerb
data data-mining data-science datascience dataset datasets
- Host: GitHub
- URL: https://github.com/DataHerb/dataherb-flora
- Owner: DataHerb
- Created: 2020-02-05T20:52:27.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-03-12T16:51:51.000Z (over 3 years ago)
- Last Synced: 2024-11-15T05:32:38.861Z (over 1 year ago)
- Topics: data, data-mining, data-science, datascience, dataset, datasets
- Homepage: https://dataherb.github.io/flora
- Size: 57.6 KB
- Stars: 1
- Watchers: 2
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# dataherb-flora
DataHerb Flora
A DataHerb Core Service to Bundle the Datasets into Flora.
## What is DataHerb
DataHerb is an open data initiative to make access to open datasets easier.
- A **DataHerb** or **Herb** is a dataset. A dataset comes with the data files and the metadata describing them.
- A **DataHerb Leaf** or **Leaf** is a data file in the DataHerb.
- A **Flora** is the combination of all the DataHerbs.
In many data projects, finding the right datasets to enhance your data is one of the most time-consuming parts. DataHerb adds flavor to your data project.
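To make the terminology above concrete, here is a minimal Python sketch of how a Herb, its Leaves, and the Flora relate. The class and field names (`Leaf`, `Herb`, `Flora`, `path`, `metadata`, `leaf_count`) are illustrative assumptions, not the API of any DataHerb package, and the example data is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    """A single data file in a dataset (hypothetical model)."""
    path: str


@dataclass
class Herb:
    """A dataset: its data files plus the metadata describing them."""
    name: str
    metadata: dict = field(default_factory=dict)
    leaves: list = field(default_factory=list)


@dataclass
class Flora:
    """The combination of all Herbs."""
    herbs: list = field(default_factory=list)

    def leaf_count(self) -> int:
        # Total number of data files across every dataset.
        return sum(len(herb.leaves) for herb in self.herbs)


# Made-up example data, for illustration only.
flora = Flora(herbs=[
    Herb(
        name="example-herb",
        metadata={"description": "demo"},
        leaves=[Leaf("data/a.csv"), Leaf("data/b.csv")],
    ),
])
print(flora.leaf_count())
```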
## What is DataHerb Flora
We designed the following workflow to share and index datasets.

This repository is used to list datasets (Listings in DataHerb flora repository).
## How to Add Your Dataset
> [A Complete **Tutorial**](https://dataherb.github.io/add/)
Simply create a `yml` file in the `flora` folder to link to your dataset repository. Your dataset repository should have a `.dataherb` folder and a `metadata.yml` file in it.
The indexing part will be done by [GitHub Actions](https://github.com/DataHerb/dataherb-flora/actions).
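Before adding a listing, it can help to confirm that the dataset repository has the expected layout. The sketch below is a hypothetical helper (`missing_dataherb_files` is not part of any DataHerb package), and it assumes that `metadata.yml` sits inside the `.dataherb` folder; adjust the path if your layout differs.

```python
from pathlib import Path


def missing_dataherb_files(repo_dir):
    """Return the expected paths that are absent from a local dataset repo.

    Assumption: metadata.yml is located inside the .dataherb folder.
    """
    repo = Path(repo_dir)
    missing = []
    if not (repo / ".dataherb").is_dir():
        missing.append(".dataherb/")
    if not (repo / ".dataherb" / "metadata.yml").is_file():
        missing.append(".dataherb/metadata.yml")
    return missing
```

Calling `missing_dataherb_files(".")` from the root of a dataset repository returns an empty list when both the folder and the file are present.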
## How is Everything Connected
There are three components to build the dataset index.
1. [dataherb-flora](https://github.com/DataHerb/dataherb-flora): Index datasets using yml files.
2. [dataherb-metadata-aggregator](https://github.com/DataHerb/dataherb-metadata-aggregator): Aggregates all information about the datasets and creates the database.
3. [dataherb.github.io](https://github.com/DataHerb/dataherb.github.io): Builds the website using the database.
Some packages have also been created to make accessing and creating datasets easier. Refer to [the website](https://dataherb.github.io/) for the details.