{"id":15208977,"url":"https://github.com/apache/Gravitino","last_synced_at":"2025-10-03T01:31:43.663Z","repository":{"id":212122829,"uuid":"631431061","full_name":"apache/gravitino","owner":"apache","description":"World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.","archived":false,"fork":false,"pushed_at":"2024-10-29T08:05:40.000Z","size":36919,"stargazers_count":1022,"open_issues_count":504,"forks_count":317,"subscribers_count":29,"default_branch":"main","last_synced_at":"2024-10-29T09:23:17.639Z","etag":null,"topics":["ai-catalog","data-catalog","datalake","federated-query","lakehouse","metadata","metalake","model-catalog","opendatacatalog","skycomputing","stratosphere"],"latest_commit_sha":null,"homepage":"https://gravitino.apache.org","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":"GOVERNANCE.md","roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-23T02:09:00.000Z","updated_at":"2024-10-29T08:50:03.000Z","dependencies_parsed_at":"2024-01-17T16:48:08.565Z","dependency_job_id":"d6e1590f-c5fd-4f70-a8e2-66c41ab7d6de","html_url":"https://github.com/apache/gravitino","commit_stats":{"total_commits":1716,"total_committers":120,"mean_commits":14.3,"dds":0.8758741258741258,"last_synced_commit":"f54bfc152c2c2bd48c300ef8f4cb881d50b9cc83"},"previous_names":["datastrato/gravitino","apache/gravitino"],"tags_count":52,"template":false,"template_full_name":null,"repository_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fgravitino/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/gravitino/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235059234,"owners_count":18929279,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-catalog","data-catalog","datalake","federated-query","lakehouse","metadata","metalake","model-catalog","opendatacatalog","skycomputing","stratosphere"],"created_at":"2024-09-28T07:08:34.989Z","updated_at":"2025-10-03T01:31:43.655Z","avatar_url":"https://github.com/apache.png","language":"Java","funding_links":[],"categories":["Data Catalog"],"sub_categories":[],"readme":"\u003c!--\n  Licensed to the Apache Software Foundation (ASF) under one\n  or more contributor license agreements.  See the NOTICE file\n  distributed with this work for additional information\n  regarding copyright ownership.  The ASF licenses this file\n  to you under the Apache License, Version 2.0 (the\n  \"License\"); you may not use this file except in compliance\n  with the License.  
You may obtain a copy of the License at\n\n   http://www.apache.org/licenses/LICENSE-2.0\n\n  Unless required by applicable law or agreed to in writing,\n  software distributed under the License is distributed on an\n  \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n  KIND, either express or implied.  See the License for the\n  specific language governing permissions and limitations\n  under the License.\n--\u003e\n\n# Apache Gravitino™\n\n[![GitHub Actions Build](https://github.com/apache/gravitino/actions/workflows/build.yml/badge.svg)](https://github.com/apache/gravitino/actions/workflows/build.yml)\n[![GitHub Actions Integration Test](https://github.com/apache/gravitino/actions/workflows/integration-test.yml/badge.svg)](https://github.com/apache/gravitino/actions/workflows/integration-test.yml)\n[![License](https://img.shields.io/github/license/apache/gravitino)](https://github.com/apache/gravitino/blob/main/LICENSE)\n[![Contributors](https://img.shields.io/github/contributors/apache/gravitino)](https://github.com/apache/gravitino/graphs/contributors)\n[![Release](https://img.shields.io/github/v/release/apache/gravitino)](https://github.com/apache/gravitino/releases)\n[![Open Issues](https://img.shields.io/github/issues-raw/apache/gravitino)](https://github.com/apache/gravitino/issues)\n[![Last Committed](https://img.shields.io/github/last-commit/apache/gravitino)](https://github.com/apache/gravitino/commits/main/)\n[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/8358/badge)](https://www.bestpractices.dev/projects/8358)\n\n## Introduction\n\nApache Gravitino is a high-performance, geo-distributed, and federated metadata lake. 
It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.\n\n![Gravitino Architecture](docs/assets/gravitino-architecture.png)\n\n## 🚀 Key Features\n\n- **Unified Metadata Management**: Manage diverse metadata sources through a single model and API (e.g., Hive, MySQL, HDFS, S3).\n- **End-to-End Data Governance**: Features like access control, auditing, and discovery across all metadata assets.\n- **Direct Metadata Integration**: Changes in underlying systems are immediately reflected via Gravitino’s connectors.\n- **Geo-Distribution Support**: Share metadata across regions and clouds to support global architectures.\n- **Multi-Engine Compatibility**: Seamlessly integrates with query engines without modifying SQL dialects.\n- **AI Asset Management (WIP)**: Support for AI model and feature tracking.\n\n## 🌐 Common Use Cases\n\n- Federated metadata discovery across data lakes and data warehouses\n- Multi-region metadata synchronization for hybrid or multi-cloud setups\n- Data and AI asset governance with unified audit and access control\n- Plug-and-play access for engines like Trino or Spark\n- Support for evolving metadata standards, including AI model lineage\n\n## 📚 Documentation\n\nThe latest Gravitino documentation is available at [gravitino.apache.org/docs/latest](https://gravitino.apache.org/docs/latest/).\n\nThis README provides a basic overview; visit the site for full installation, configuration, and development documentation.\n\n## 🧪 Quick Start\n\n### Use Gravitino Playground (Recommended)\n\nGravitino provides a Docker Compose–based playground for a full-stack experience.  \nClone or download the [Gravitino Playground repository](https://github.com/apache/gravitino-playground) and follow its [README](https://github.com/apache/gravitino-playground/blob/main/README.md).\n\n### Run Gravitino Locally\n\n1. 
[Download](https://gravitino.apache.org/downloads) and extract a binary release.\n2. Edit `conf/gravitino.conf` to configure settings.\n3. Start the server:\n\n```bash\n./bin/gravitino.sh start\n```\n\n4. Stop the server:\n\n```bash\n./bin/gravitino.sh stop\n```\n\nIf the server is running in the foreground instead, press `CTRL+C` to stop it.\n\n## 🧊 Iceberg REST Catalog\n\nGravitino provides a native Iceberg REST catalog service.  \nSee: [Iceberg REST catalog service](https://gravitino.apache.org/docs/latest/iceberg-rest-service/)\n\n## 🔌 Trino Integration\n\nGravitino includes a Trino connector for federated metadata access.  \nSee: [Using Trino with Gravitino](https://gravitino.apache.org/docs/latest/trino-connector/index/)\n\n## 🛠️ Building from Source\n\nGravitino uses Gradle. Windows is not currently supported.\n\nClean build without tests:\n\n```bash\n./gradlew clean build -x test\n```\n\nBuild a distribution:\n\n```bash\n./gradlew compileDistribution -x test\n```\n\nOr build a compressed package:\n\n```bash\n./gradlew assembleDistribution -x test\n```\n\nArtifacts are output to the `distribution/` directory.\n\nMore build options: [How to build Gravitino](https://gravitino.apache.org/docs/latest/how-to-build/)\n\n## 👨‍💻 Developer Resources\n\n- [How to build Gravitino](https://gravitino.apache.org/docs/latest/how-to-build/)\n- [How to test Gravitino](https://gravitino.apache.org/docs/latest/how-to-test/)\n- [Publish Docker images](https://gravitino.apache.org/docs/latest/publish-docker-images)\n\n## 🤝 Contributing\n\nWe welcome all kinds of contributions: code, documentation, testing, connectors, and more!\n\nTo get started, please read our [CONTRIBUTING.md](CONTRIBUTING.md) guide.\n\n## 🔗 ASF Resources\n\n- 📬 Mailing List: [dev@gravitino.apache.org](mailto:dev@gravitino.apache.org) ([subscribe](mailto:dev-subscribe@gravitino.apache.org))\n- 🐞 Issue Tracker: [GitHub Issues](https://github.com/apache/gravitino/issues)\n\n## 🪪 License\n\nApache Gravitino is licensed under the Apache License, Version 2.0.  
\nSee the [LICENSE](LICENSE) file for details.\n\n\u003csub\u003eApache®, Apache Gravitino™, Apache Hadoop®, Apache Hive™, Apache Iceberg™, Apache Kafka®, Apache Spark™, Apache Submarine™, Apache Thrift™, and Apache Zeppelin™ are trademarks of the Apache Software Foundation in the United States and/or other countries.\u003c/sub\u003e\n\n\u003cimg src=\"https://analytics.apache.org/matomo.php?idsite=62\u0026rec=1\u0026bots=1\u0026action_name=ReadMe\" style=\"border:0;\" alt=\"\" /\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2FGravitino","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2FGravitino","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2FGravitino/lists"}