{"id":12189329,"url":"https://github.com/apache/gravitino","last_synced_at":"2025-05-13T20:16:21.122Z","repository":{"id":212122829,"uuid":"631431061","full_name":"apache/gravitino","owner":"apache","description":"World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.","archived":false,"fork":false,"pushed_at":"2025-05-07T02:51:07.000Z","size":50241,"stargazers_count":1470,"open_issues_count":645,"forks_count":450,"subscribers_count":36,"default_branch":"main","last_synced_at":"2025-05-07T03:37:42.420Z","etag":null,"topics":["ai-catalog","data-catalog","datalake","federated-query","lakehouse","metadata","metalake","model-catalog","opendatacatalog","skycomputing","stratosphere"],"latest_commit_sha":null,"homepage":"https://gravitino.apache.org","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":"GOVERNANCE.md","roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-04-23T02:09:00.000Z","updated_at":"2025-05-07T02:51:10.000Z","dependencies_parsed_at":"2024-01-17T16:48:08.565Z","dependency_job_id":"d6e1590f-c5fd-4f70-a8e2-66c41ab7d6de","html_url":"https://github.com/apache/gravitino","commit_stats":{"total_commits":1716,"total_committers":120,"mean_commits":14.3,"dds":0.8758741258741258,"last_synced_commit":"f54bfc152c2c2bd48c300ef8f4cb881d50b9cc83"},"previous_names":["datastrato/gravitino","apache/gravitino"],"tags_count":74,"template":false,"template_full_name":null,"repository_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/gravitino/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254020646,"owners_count":22000756,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-catalog","data-catalog","datalake","federated-query","lakehouse","metadata","metalake","model-catalog","opendatacatalog","skycomputing","stratosphere"],"created_at":"2024-07-06T18:04:52.411Z","updated_at":"2025-05-13T20:16:16.112Z","avatar_url":"https://github.com/apache.png","language":"Java","readme":"\u003c!--\n  Licensed to the Apache Software Foundation (ASF) under one\n  or more contributor license agreements.  See the NOTICE file\n  distributed with this work for additional information\n  regarding copyright ownership.  The ASF licenses this file\n  to you under the Apache License, Version 2.0 (the\n  \"License\"); you may not use this file except in compliance\n  with the License.  
You may obtain a copy of the License at\n\n   http://www.apache.org/licenses/LICENSE-2.0\n\n  Unless required by applicable law or agreed to in writing,\n  software distributed under the License is distributed on an\n  \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n  KIND, either express or implied.  See the License for the\n  specific language governing permissions and limitations\n  under the License.\n--\u003e\n\n# Apache Gravitino™ (incubating)\n\n[![GitHub Actions Build](https://github.com/apache/gravitino/actions/workflows/build.yml/badge.svg)](https://github.com/apache/gravitino/actions/workflows/build.yml)\n[![GitHub Actions Integration Test](https://github.com/apache/gravitino/actions/workflows/integration-test.yml/badge.svg)](https://github.com/apache/gravitino/actions/workflows/integration-test.yml)\n[![License](https://img.shields.io/github/license/apache/gravitino)](https://github.com/apache/gravitino/blob/main/LICENSE)\n[![Contributors](https://img.shields.io/github/contributors/apache/gravitino)](https://github.com/apache/gravitino/graphs/contributors)\n[![Release](https://img.shields.io/github/v/release/apache/gravitino)](https://github.com/apache/gravitino/releases)\n[![Open Issues](https://img.shields.io/github/issues-raw/apache/gravitino)](https://github.com/apache/gravitino/issues)\n[![Last Committed](https://img.shields.io/github/last-commit/apache/gravitino)](https://github.com/apache/gravitino/commits/main/)\n[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/8358/badge)](https://www.bestpractices.dev/projects/8358)\n\n## Introduction\n\nApache Gravitino is a high-performance, geo-distributed, and federated metadata lake. 
It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.\n\n![Gravitino Architecture](docs/assets/gravitino-architecture.png)\n\nGravitino aims to provide several key features:\n* Unified Metadata Management: Gravitino provides a unified model and API to manage different types of metadata, including relational (e.g., Hive, MySQL) and file-based (e.g., HDFS, S3) metadata sources.\n* End-to-End Data Governance: Gravitino offers a unified governance layer for managing metadata with features like access control, auditing, and discovery.\n* Direct Metadata Management: Gravitino connects directly to metadata sources via connectors, ensuring changes are instantly reflected between Gravitino and the underlying systems.\n* Geo-Distribution Support: Gravitino enables deployment across multiple regions or clouds, allowing instances to share metadata for a global cross-region view.\n* Multi-Engine Support: Gravitino supports query engines enabling metadata access without modifying SQL dialects.\n* AI Asset Management (WIP): Gravitino is expanding to manage both data and AI assets, with support for AI models and features currently in development.\n\n## Contributing to Apache Gravitino\n\nGravitino is open source software available under the Apache 2.0 license. For information on contributing to Gravitino, please see the [Contribution guidelines](https://gravitino.apache.org/contrib/).\n\n## Online documentation\n\nThe latest Gravitino documentation is available on our [official website](https://gravitino.apache.org/docs/latest/). This README file only contains basic setup instructions.\n\n## Building Apache Gravitino\n\nYou can build Gravitino using Gradle. 
Currently, Gravitino can be built on Linux and macOS; Windows isn't supported.\n\nTo build Gravitino, please run:\n\n```shell\n./gradlew clean build -x test\n```\n\nTo build a distribution package, please run:\n\n```shell\n./gradlew compileDistribution -x test\n```\n\nOr, to build a compressed distribution package, please run:\n\n```shell\n./gradlew assembleDistribution -x test\n```\n\nThe directory `distribution` contains the generated binary distribution package.\n\nPlease see [How to build Gravitino](https://gravitino.apache.org/docs/latest/how-to-build/) for details on building and testing Gravitino.\n\n## Quick start\n\n### Use Gravitino playground\n\nThis is the recommended approach. Gravitino provides a docker-compose-based playground where you can experience a whole system alongside other components. Clone or download the [Gravitino playground repository](https://github.com/apache/gravitino-playground) and then follow the [README](https://github.com/apache/gravitino-playground/blob/main/README.md) to get everything running.\n\n### Configure and start the Gravitino server locally\n\nTo start Gravitino on your machine, download a binary package from the [download page](https://gravitino.apache.org/downloads) and decompress the package.\n\nBefore starting the Gravitino server, configure its settings by editing the `gravitino.conf` file located in the `conf` directory. 
This file follows the standard properties file format, allowing you to modify the server configuration as needed.\n\nTo start the Gravitino server, please run:\n\n```shell\n./bin/gravitino.sh start\n```\n\nTo stop the Gravitino server, please run:\n\n```shell\n./bin/gravitino.sh stop\n```\n\nAlternatively, to run the Gravitino server in the foreground, please run:\n\n```shell\n./bin/gravitino.sh run\n```\n\nThen press `CTRL+C` to stop the Gravitino server.\n\n### Gravitino Iceberg REST catalog service\n\nGravitino provides an Iceberg REST catalog service to manage Iceberg efficiently. For more details, refer to [Gravitino Iceberg REST catalog service](https://gravitino.apache.org/docs/latest/iceberg-rest-service/).\n\n### Using Trino with Apache Gravitino\n\nGravitino provides a Trino connector for accessing metadata within Gravitino. To use Trino with Gravitino, please follow the [trino-gravitino-connector doc](https://gravitino.apache.org/docs/latest/trino-connector/index/).\n\n## Development guide\n\n1. [How to build Gravitino](https://gravitino.apache.org/docs/latest/how-to-build/)\n2. [How to test Gravitino](https://gravitino.apache.org/docs/latest/how-to-test/)\n3. [How to publish Docker images](https://gravitino.apache.org/docs/latest/publish-docker-images)\n\n## License\n\nGravitino is licensed under the Apache License Version 2.0. For details, see the [LICENSE](LICENSE).\n\n## ASF Incubator disclaimer\n\nApache Gravitino is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required for all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. 
While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.\n\n\u003csub\u003eApache®, Apache Gravitino\u0026trade;, Apache Hadoop\u0026reg;, Apache Hive\u0026trade;, Apache Iceberg\u0026trade;, Apache Kafka\u0026reg;, Apache Spark\u0026trade;, Apache Submarine\u0026trade;, Apache Thrift\u0026trade; and Apache Zeppelin\u0026trade; are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.\u003c/sub\u003e\n\n\u003cimg src=\"https://analytics.apache.org/matomo.php?idsite=62\u0026rec=1\u0026bots=1\u0026action_name=ReadMe\" style=\"border:0;\" alt=\"\" /\u003e\n","funding_links":[],"categories":["Table of Contents","Java","GenAI Readiness Features","大数据","Data Lake Management"],"sub_categories":["Metadata Service","Technical Metadata \u0026 Query Analytics"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fgravitino","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2Fgravitino","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fgravitino/lists"}