https://github.com/DataHerb/dataherb-flora

DataHerb Flora: The core of DataHerb

# dataherb-flora






A DataHerb Core Service to Bundle the Datasets into Flora.



## What is DataHerb

DataHerb is an open data initiative to make access to open datasets easier.

- A **DataHerb** or **Herb** is a dataset. A dataset comes with its data files and the metadata describing them.
- A **DataHerb Leaf** or **Leaf** is a data file in the DataHerb.
- A **Flora** is the combination of all the DataHerbs.
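
The relationship between the three terms can be sketched as plain Python data classes. This only illustrates the vocabulary above: the class and field names (`Leaf`, `Herb`, `Flora`, `path`, `metadata`, `leaf_count`) are hypothetical and are not the API of any DataHerb package.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Leaf:
    """A single data file inside a Herb (hypothetical model)."""
    path: str


@dataclass
class Herb:
    """A dataset: its data files plus the metadata describing them."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    leaves: List[Leaf] = field(default_factory=list)


@dataclass
class Flora:
    """The combination of all Herbs."""
    herbs: List[Herb] = field(default_factory=list)

    def leaf_count(self) -> int:
        # Total number of data files across every dataset in the Flora.
        return sum(len(herb.leaves) for herb in self.herbs)
```

Here a `Flora` is modeled as nothing more than a list of `Herb` objects, which mirrors the definition of a Flora as the combination of all DataHerbs.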

In many data projects, finding the right datasets to enhance your data is one of the most time-consuming parts. DataHerb adds flavor to your data project.

## What is DataHerb Flora

We designed the following workflow to share and index datasets.

![DataHerb Workflow](https://raw.githubusercontent.com/DataHerb/dataherb.github.io/master/assets/images/dataherb-components.png)

This repository is used to list the datasets (the listings in the DataHerb Flora repository).

## How to Add Your Dataset

> [A Complete **Tutorial**](https://dataherb.github.io/add/)

Simply create a `yml` file in the `flora` folder to link to your dataset repository. Your dataset repository should have a `.dataherb` folder and a `metadata.yml` file in it.

Indexing is then handled by [GitHub Actions](https://github.com/DataHerb/dataherb-flora/actions).
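
Before linking a dataset repository, it can help to check locally that the expected files exist. The following is a minimal sketch, not part of DataHerb itself: the function name `missing_dataherb_files` is hypothetical, and it assumes that `metadata.yml` lives inside the `.dataherb` folder. Adjust the paths if the tutorial specifies a different layout.

```python
from pathlib import Path
from typing import List


def missing_dataherb_files(repo_root: str) -> List[str]:
    """Return the expected paths that are missing from a dataset repository."""
    root = Path(repo_root)
    folder = root / ".dataherb"
    # Assumption: metadata.yml sits inside the .dataherb folder.
    metadata = folder / "metadata.yml"

    missing = []
    if not folder.is_dir():
        missing.append(".dataherb")
    if not metadata.is_file():
        missing.append(".dataherb/metadata.yml")
    return missing


if __name__ == "__main__":
    # Check the current working directory and report anything missing.
    problems = missing_dataherb_files(".")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Repository layout looks complete.")
```

The check only looks at the file layout. It does not validate the contents of `metadata.yml`.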

## How is Everything Connected

Three components work together to build the dataset index.

1. [dataherb-flora](https://github.com/DataHerb/dataherb-flora): Indexes datasets using `yml` files.
2. [dataherb-metadata-aggregator](https://github.com/DataHerb/dataherb-metadata-aggregator): Aggregates all information about the datasets and creates the database.
3. [dataherb.github.io](https://github.com/DataHerb/dataherb.github.io): Builds the website using the database.

Some packages have also been created to make accessing and creating datasets easier. Refer to [the website](https://dataherb.github.io/) for details.