https://github.com/reactual/datalibrary
An API for better datasets
api graphql mit-license
- Host: GitHub
- URL: https://github.com/reactual/datalibrary
- Owner: reactual
- License: mit
- Created: 2018-08-29T15:54:11.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T14:23:09.000Z (over 2 years ago)
- Last Synced: 2025-01-23T20:34:14.024Z (4 months ago)
- Topics: api, graphql, mit-license
- Language: JavaScript
- Homepage: https://datalibrary.com
- Size: 1020 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 19
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# DataLibrary
> An API for better datasets -- https://datalibrary.com

## Overview
DataLibrary was created to bring datasets from a range of subjects into a single API. Our primary goal is consistency and ease of use.

For example, take a random selection of datasets:
* List of Metric Units
* List of US States
* List of English Stopwords
* Air Pollution Measurement Data
* List of AWS & GCP Data Center Regions
* Public financial data from 2 different municipalities

Before DataLibrary, you would most likely access these datasets from different sources. Beyond the technical challenges, each provider would typically use different schema patterns, naming conventions, and formatting.
DataLibrary exists not only to bring datasets together into a single source, but also to clean and reformat data when possible.
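As a rough illustration of what cleaning and reformatting can mean in practice, here is a minimal sketch that maps records from two providers onto one consistent schema. The provider records, the field maps, and the function names (`normalizeRecord`, `normalizeStates`) are all hypothetical and are not part of DataLibrary's actual code or API.

```javascript
// Hypothetical raw records for the same subject (US states) from two
// providers that use different naming conventions and formatting.
const providerA = [{ STATE_NAME: " California ", ABBR: "ca" }];
const providerB = [{ name: "Texas", "postal-code": "TX" }];

// Hypothetical field maps: provider field name -> consistent field name.
const fieldMaps = {
  providerA: { STATE_NAME: "name", ABBR: "abbreviation" },
  providerB: { name: "name", "postal-code": "abbreviation" },
};

// Rename mapped fields and trim string values; unmapped fields are dropped.
function normalizeRecord(record, fieldMap) {
  const out = {};
  for (const [rawKey, value] of Object.entries(record)) {
    if (!Object.prototype.hasOwnProperty.call(fieldMap, rawKey)) continue;
    out[fieldMap[rawKey]] = typeof value === "string" ? value.trim() : value;
  }
  return out;
}

// Normalize a list of state records and upper-case the abbreviation.
function normalizeStates(records, fieldMap) {
  return records.map((record) => {
    const row = normalizeRecord(record, fieldMap);
    if (typeof row.abbreviation === "string") {
      row.abbreviation = row.abbreviation.toUpperCase();
    }
    return row;
  });
}

// Both providers now share one schema: { name, abbreviation }.
const states = [
  ...normalizeStates(providerA, fieldMaps.providerA),
  ...normalizeStates(providerB, fieldMaps.providerB),
];
console.log(states);
```

Keeping the field maps as plain data, separate from the normalizing functions, means a new provider can be supported by adding one more map rather than more code.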
For common subjects, data could be combined from several sources to create a new, richer dataset, with fields and metadata carefully renamed for a better experience.

## Access
The DataLibrary API will initially be available via GraphQL, with a RESTful HTTP API to follow. A frontend for searching datasets and other features will also be available.

## Copyright Notes
> **DataLibrary's goal is to make data more accessible.**
> We take licensing and copyrights seriously.

For datasets where a copyright wouldn't apply, DataLibrary will typically host a formatted version of the data directly. This especially applies to common or infrequently changing datasets.

DataLibrary also supports copyrighted, premium, and paid datasets when approved by a provider.
**A few example strategies:**
* Maintaining our own agreement/terms with a provider.
* Acting as a proxy where you bring your own license/token, not maintaining a local copy.
* Providing an API or local library for formatting raw data from a dataset template we have.
* Acting as a paid, *data* app store where we provide access to a dataset that generates revenue for a provider.
* Providing generic utilities for cleaning & working with your own data.

---
A project by Reactual