Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/apache/Gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
ai-catalog data-catalog datalake federated-query lakehouse metadata metalake model-catalog opendatacatalog skycomputing stratosphere
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/apache/Gravitino
- Owner: apache
- License: apache-2.0
- Created: 2023-04-23T02:09:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-06T02:37:58.000Z (4 months ago)
- Last Synced: 2024-07-06T18:13:18.152Z (4 months ago)
- Topics: ai-catalog, data-catalog, datalake, federated-query, lakehouse, metadata, metalake, model-catalog, opendatacatalog, skycomputing, stratosphere
- Language: Java
- Homepage: https://datastrato.ai/docs/
- Size: 18 MB
- Stars: 680
- Watchers: 23
- Forks: 208
- Open Issues: 488
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Governance: GOVERNANCE.md
- Roadmap: ROADMAP.md
Awesome Lists containing this project
- awesome-datalake - Apache Gravitino - Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages the metadata directly in different sources, types, and regions. It also provides users with unified metadata access for data and AI assets. (Data Catalog)
README
# Apache Gravitino™ (incubating)
[![GitHub Actions Build](https://github.com/apache/gravitino/actions/workflows/build.yml/badge.svg)](https://github.com/apache/gravitino/actions/workflows/build.yml)
[![GitHub Actions Integration Test](https://github.com/apache/gravitino/actions/workflows/integration-test.yml/badge.svg)](https://github.com/apache/gravitino/actions/workflows/integration-test.yml)
[![License](https://img.shields.io/github/license/apache/gravitino)](https://github.com/apache/gravitino/blob/main/LICENSE)
[![Contributors](https://img.shields.io/github/contributors/apache/gravitino)](https://github.com/apache/gravitino/graphs/contributors)
[![Release](https://img.shields.io/github/v/release/apache/gravitino)](https://github.com/apache/gravitino/releases)
[![Open Issues](https://img.shields.io/github/issues-raw/apache/gravitino)](https://github.com/apache/gravitino/issues)
[![Last Committed](https://img.shields.io/github/last-commit/apache/gravitino)](https://github.com/apache/gravitino/commits/main/)
[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/8358/badge)](https://www.bestpractices.dev/projects/8358)
## Introduction
Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions and provides users with unified metadata access for data and AI assets.
![Gravitino Architecture](docs/assets/gravitino-architecture.png)
Gravitino aims to provide several key features:
* Single Source of Truth for multi-regional data with geo-distributed architecture support.
* Unified Data and AI asset management for both users and engines.
* Security in one place, centralizing the security for different sources.
* Built-in data management and data access management.
## Contributing to Apache Gravitino
Gravitino is open source software available under the Apache 2.0 license. For information on how to contribute to Gravitino, please see the [Contribution guidelines](CONTRIBUTING.md).
## Online documentation
The latest Gravitino documentation is in the [docs folder](docs). This README file only contains basic setup instructions.
## Building Apache Gravitino
You can build Gravitino using Gradle. Builds are currently supported on Linux and macOS; Windows isn't supported.
To build Gravitino, please run:
```shell
./gradlew clean build -x test
```
To build a distribution package, please run:
```shell
./gradlew compileDistribution -x test
```
Or, to build a compressed distribution package, please run:
```shell
./gradlew assembleDistribution -x test
```
The directory `distribution` contains the generated binary distribution package.
Please see [How to build Gravitino](docs/how-to-build.md) for details on building and testing Gravitino.
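
The build commands above can be wrapped in a small helper that refuses to run on an unsupported platform. This is only a sketch: the function names and the `jars`/`package`/`archive` target labels are made up for illustration and are not part of Gravitino. It assumes `uname -s` reports `Linux` or `Darwin` on the supported systems and that it is run from the repository root.

```shell
#!/usr/bin/env bash
# Sketch only: every function name and target label here is hypothetical.

# Succeeds for the platforms this README lists as buildable (Linux, macOS).
is_supported_build_os() {
  case "$1" in
    Linux|Darwin) return 0 ;;
    *) return 1 ;;
  esac
}

# Maps an illustrative target label to the Gradle tasks named in this README.
gradle_tasks_for() {
  case "$1" in
    jars)    echo "clean build" ;;
    package) echo "compileDistribution" ;;
    archive) echo "assembleDistribution" ;;
    *)       echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# Runs the build for the given target, skipping tests as in the commands above.
build_gravitino() {
  local target os tasks
  target="${1:-jars}"
  os="$(uname -s)"
  if ! is_supported_build_os "$os"; then
    echo "Building Gravitino on ${os} is not supported" >&2
    return 1
  fi
  tasks="$(gradle_tasks_for "$target")" || return 1
  # Intentional word splitting: "clean build" must become two Gradle arguments.
  # shellcheck disable=SC2086
  ./gradlew $tasks -x test
}
```

For example, after sourcing the script, `build_gravitino archive` would run the same command as the compressed distribution build shown above.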
## Quick start
### Configure and start the Apache Gravitino server
If you already have a binary distribution package, go to the decompressed package directory.
Before starting the Gravitino server, please configure the Gravitino server configuration file. The configuration file, `gravitino.conf`, is in the `conf` directory and follows the standard property file format. You can modify the configuration within this file.
To start the Gravitino server, please run:
```shell
./bin/gravitino.sh start
```
To stop the Gravitino server, please run:
```shell
./bin/gravitino.sh stop
```
Alternatively, to run the Gravitino server in the foreground, please run:
```shell
./bin/gravitino.sh run
```
And press `CTRL+C` to stop the Gravitino server.
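
Because `gravitino.conf` uses the property file format, a script can read a setting from it and then wait for the server to respond after `start`. The sketch below is illustrative, not part of the Gravitino distribution. The helper names are hypothetical, and the property key `gravitino.server.webserver.httpPort`, the fallback port `8090`, and the `/api/version` endpoint are assumptions that you should check against your own `conf/gravitino.conf` and the documentation.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; the key, port, and endpoint used
# in the usage comments at the bottom are assumptions, not confirmed here.

# Prints the value of KEY from a properties FILE made of "key = value" lines.
# Lines starting with "#" are skipped; the last matching line wins.
# Returns non-zero when the key is absent.
get_conf_value() {
  local file="$1" key="$2"
  awk -F '=' -v key="$key" '
    /^[ \t]*#/ { next }
    {
      k = $1
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      if (k == key) {
        v = substr($0, index($0, "=") + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", v)
        found = 1
      }
    }
    END { if (found) print v; else exit 1 }
  ' "$file"
}

# Polls URL once per second until it answers successfully or ATTEMPTS run out.
wait_for_gravitino() {
  local url="$1" attempts="${2:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Possible usage from the decompressed package directory (assumed values):
#   ./bin/gravitino.sh start
#   port="$(get_conf_value conf/gravitino.conf gravitino.server.webserver.httpPort || echo 8090)"
#   wait_for_gravitino "http://localhost:${port}/api/version" && echo "server is up"
```

The helpers only define functions, so sourcing the script has no side effects. The commented usage lines show one way to combine them with `./bin/gravitino.sh start`.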
### Gravitino Iceberg REST catalog service
Gravitino provides an Iceberg REST catalog service for managing Iceberg tables. For more details, please refer to [Gravitino Iceberg REST catalog service](docs/iceberg-rest-service.md).
### Using Trino with Apache Gravitino
Gravitino provides a Trino connector to access the metadata in Gravitino. To use Trino with Gravitino, please follow the [trino-gravitino-connector doc](docs/trino-connector/index.md).
## Development guide
1. [How to build Gravitino](docs/how-to-build.md)
2. [How to test Gravitino](docs/how-to-test.md)
3. [How to publish Docker images](docs/publish-docker-images.md)
## License
Gravitino is licensed under the Apache License Version 2.0. For details, see the [LICENSE](LICENSE).
## ASF Incubator disclaimer
Apache Gravitino is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required for all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
Apache®, Apache Gravitino™, Apache Hadoop®, Apache Hive™, Apache Iceberg™, Apache Kafka®, Apache Spark™, Apache Submarine™, Apache Thrift™ and Apache Zeppelin™ are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.