{"id":13569305,"url":"https://github.com/unitycatalog/unitycatalog","last_synced_at":"2025-05-10T00:45:01.379Z","repository":{"id":244279219,"uuid":"814708478","full_name":"unitycatalog/unitycatalog","owner":"unitycatalog","description":"Open, Multi-modal Catalog for Data \u0026 AI","archived":false,"fork":false,"pushed_at":"2025-04-29T06:08:05.000Z","size":21743,"stargazers_count":2831,"open_issues_count":255,"forks_count":466,"subscribers_count":55,"default_branch":"main","last_synced_at":"2025-05-10T00:44:38.154Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://unitycatalog.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/unitycatalog.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-06-13T14:39:25.000Z","updated_at":"2025-05-09T14:28:48.000Z","dependencies_parsed_at":"2024-08-05T18:27:59.520Z","dependency_job_id":"f3a54b45-1679-49cc-bf22-5733b84e6f48","html_url":"https://github.com/unitycatalog/unitycatalog","commit_stats":{"total_commits":413,"total_committers":82,"mean_commits":5.036585365853658,"dds":0.9322033898305084,"last_synced_commit":"465bc2ecf1720b34396c81e44d417890a1beec6c"},"previous_names":["unitycatalog/unitycatalog"],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unitycatalog%2Funitycatalog","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unitycatalog%2Funitycatalog/tags","releases_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unitycatalog%2Funitycatalog/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/unitycatalog%2Funitycatalog/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/unitycatalog","download_url":"https://codeload.github.com/unitycatalog/unitycatalog/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253346990,"owners_count":21894275,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T14:00:38.311Z","updated_at":"2025-05-10T00:45:00.879Z","avatar_url":"https://github.com/unitycatalog.png","language":"Python","funding_links":[],"categories":["Python","Data Catalog","Table of Contents","Java","大数据","🧱 Databricks"],"sub_categories":["Metadata Service","📚 Repos"],"readme":"\u003cimg src=\"./docs/assets/images/uc-logo.png\" width=\"600px\" /\u003e\n\n# Unity Catalog: Open, Multimodal Catalog for Data \u0026 AI\n\nUnity Catalog is the industry’s only universal catalog for data and AI.\n\n- **Multimodal interface supports any format, engine, and asset**\n  - Multi-format support: It is extensible and supports Delta Lake, Apache Iceberg and Apache Hudi via UniForm, Apache Parquet, JSON, CSV, and many others.\n  - Multi-engine support: With its open APIs, data cataloged in Unity can be read by many leading compute engines.\n  - Multimodal: It supports all your data and AI assets, including tables, files, functions, AI models.\n- **Open source API and implementation** - OpenAPI spec and OSS 
implementation (Apache 2.0 license). It is also compatible with Apache Hive's metastore API and Apache Iceberg's REST catalog API. Unity Catalog is currently a sandbox project with LF AI and Data Foundation (part of the Linux Foundation).\n- **Unified governance** for data and AI - Govern and secure tabular data, unstructured assets, and AI assets with a single interface.\n\nThe first release of Unity Catalog focuses on a core set of APIs for tables, unstructured data, and AI assets - with more to come soon on governance, access, and client interoperability. This is just the beginning!\n\n![UC Hero Image](./docs/assets/images/uc.png)\n\n### Vibrant ecosystem\n\nThis is a community effort. Unity Catalog is supported by\n\n- [Amazon Web Services](https://aws.amazon.com/)\n- [Confluent](https://www.confluent.io/)\n- [Daft (Eventual)](https://github.com/Eventual-Inc/Daft)\n- [dbt Labs](https://www.getdbt.com/)\n- [DuckDB](https://duckdblabs.com/)\n- [Fivetran](https://www.fivetran.com/)\n- [Google Cloud](https://cloud.google.com/)\n- [Granica](https://granica.ai/)\n- [Immuta](https://www.immuta.com/)\n- [Informatica](https://www.informatica.com/)\n- [LanceDB](https://lancedb.com/)\n- [LangChain](https://www.langchain.com/)\n- [LlamaIndex](https://www.llamaindex.ai/)\n- [Microsoft Azure](https://azure.microsoft.com)\n- [NVIDIA](https://www.nvidia.com/)\n- [Onehouse](https://www.onehouse.ai/)\n- [PuppyGraph](https://www.puppygraph.com/)\n- [Salesforce](https://www.salesforce.com/)\n- [StarRocks (CelerData)](https://celerdata.com/)\n- [Spice AI](https://github.com/spiceai/spiceai)\n- [Tecton](https://www.tecton.ai/)\n- [Unstructured](https://unstructured.io/)\n\nUnity Catalog is proud to be hosted by the LF AI \u0026 Data Foundation.\n\n\u003ca href=\"https://lfaidata.foundation/projects\"\u003e\n  \u003cimg src=\"./docs/assets/images/lfaidata-project-badge-sandbox-color.png\" width=\"200px\" /\u003e\n\u003c/a\u003e\n\n## Quickstart - Hello UC!\n\nLet's take Unity Catalog 
for a spin. In this guide, we are going to do the following:\n\n- In one terminal, run the UC server.\n- In another terminal, we will explore the contents of the UC server using a CLI.\n  An example project is provided to demonstrate how to use the UC SDK for various assets\n  as well as provide a convenient way to explore the content of any UC server implementation.\n\n\u003e If you prefer to run Unity Catalog in Docker, use `docker compose up`.\n\u003e See the [Docker Compose docs](./docs/docker_compose.md) for more details.\n\n### Prerequisites\n\nBefore you begin, complete the following steps:\n\n- Clone this repository.\n- Ensure the `JAVA_HOME` environment variable in your terminal is configured to point to JDK 17.\n- Compile the project using `build/sbt package`.\n\n\n### Run the UC Server\n\nIn a terminal, in the cloned repository root directory, start the UC server.\n\n```sh\nbin/start-uc-server\n```\n\nFor the remaining steps, continue in a different terminal.\n\n### Operate on Delta tables with the CLI\n\nLet's list the tables.\n\n```sh\nbin/uc table list --catalog unity --schema default\n```\n\nYou should see a few tables. Some details are truncated because of the nested nature of the data.\nTo see all the content, you can add `--output jsonPretty` to any command.\n\nNext, let's get the metadata of one of those tables.\n\n```sh\nbin/uc table get --full_name unity.default.numbers\n```\n\nYou can see that it is a Delta table. Now, specifically for Delta tables, this CLI can\nprint a snippet of the contents of a Delta table (powered by the [Delta Kernel Java](https://delta.io/blog/delta-kernel/) project).\nLet's try that.\n\n```sh\nbin/uc table read --full_name unity.default.numbers\n```\n\n### Operate on Delta tables with DuckDB\n\nFor operating on tables with DuckDB, you will have to [install it](https://duckdb.org/docs/installation/) (version 1.0).\nLet's start DuckDB and install a couple of extensions. 
To start DuckDB, run the command `duckdb` in the terminal.\nThen, in the DuckDB shell, run the following commands:\n\n```sql\ninstall uc_catalog from core_nightly;\nload uc_catalog;\ninstall delta;\nload delta;\n```\n\nIf you have installed these extensions before, you may have to run `update extensions` and restart DuckDB\nfor the following steps to work.\n\nNow that we have DuckDB all set up, let's try connecting to UC by specifying a secret.\n\n```sql\nCREATE SECRET (\n      TYPE UC,\n      TOKEN 'not-used',\n      ENDPOINT 'http://127.0.0.1:8080',\n      AWS_REGION 'us-east-2'\n );\n```\n\nYou should see it print a short table saying `Success` = `true`. Then we attach the `unity` catalog to DuckDB.\n\n```sql\nATTACH 'unity' AS unity (TYPE UC_CATALOG);\n```\n\nNow we are ready to query. Try the following:\n\n```sql\nSHOW ALL TABLES;\nSELECT * FROM unity.default.numbers;\n```\n\nYou should see the tables listed and the contents of the `numbers` table printed.\nTo quit DuckDB, press `Ctrl`+`D` (if your platform supports it), press `Ctrl`+`C`, or use the `.exit` command in the DuckDB shell.\n\n### Interact with the Unity Catalog UI\n\n![UC UI](./docs/assets/images/uc-ui.png)\n\nTo use the Unity Catalog UI, start a new terminal and ensure you have already started the UC server (e.g., `./bin/start-uc-server`).\n\n**Prerequisites**\n* Node: https://nodejs.org/en/download/package-manager\n* Yarn: https://classic.yarnpkg.com/lang/en/docs/install\n\n**How to start the UI through yarn**\n\nFrom the cloned repository root directory, run:\n```\ncd ui\nyarn install\nyarn start\n```\n\n\n## CLI tutorial\n\nYou can interact with a Unity Catalog server to create and manage catalogs, schemas and tables,\noperate on volumes and functions from the CLI, and much more.\nSee the [CLI usage](docs/usage/cli.md) for more details.\n\n## APIs and Compatibility\n\n- OpenAPI specification: See the [Unity Catalog REST API](https://docs.unitycatalog.io/swagger-docs/).\n- Compatibility and stability: The APIs are currently evolving and should 
not be assumed to be stable.\n\n## Building Unity Catalog\n\nUnity Catalog can be built using [sbt](https://www.scala-sbt.org/).\n\nTo build UC (incl. [Spark Integration](./connectors/spark) module), run the following command:\n\n```sh\nbuild/sbt clean package publishLocal spark/publishLocal\n```\n\nRefer to [sbt docs](https://www.scala-sbt.org/1.x/docs/) for more commands.\n\n## Deployment\n\n- To create a tarball that can be used to deploy the UC server or run the CLI, run the following:\n  ```sh\n  build/sbt createTarball\n  ```\n  This will create a tarball in the `target` directory. See the full [deployment guide](docs/deployment.md) for more details.\n\n## Compiling and testing\n\n- Install JDK 17 by whatever mechanism is appropriate for your system, and\n  set that version to be the default Java version (e.g. via the env variable `JAVA_HOME`)\n- To compile all the code without running tests, run the following:\n  ```sh\n  build/sbt clean compile\n  ```\n- To compile and execute tests, run the following:\n  ```sh\n  build/sbt -J-Xmx2G clean test\n  ```\n- To execute tests with coverage, run the following:\n  ```sh\n  build/sbt -J-Xmx2G jacoco\n  ```\n- To update the API specification, just update the `api/all.yaml` and then run the following:\n  ```sh\n  build/sbt generate\n  ```\n  This will regenerate the OpenAPI data models in the UC server and data models + APIs in the client SDK.\n- To format the code, run the following:\n  ```sh\n  build/sbt javafmtAll\n  ```\n\n## Setting up IDE\n\nIntelliJ is the recommended IDE to use when developing Unity Catalog. The below steps outline how to add the project to IntelliJ:\n\n1. Clone Unity Catalog into a local folder, such as `~/unitycatalog`.\n2. Select `File` \u003e `New Project` \u003e `Project from Existing Sources...` and select `~/unitycatalog`.\n3. Under `Import project from external model` select `sbt`. Click `Next`.\n4. 
Click `Finish`.\n\nJava code adheres to the [Google style](https://google.github.io/styleguide/javaguide.html), which is verified via `build/sbt javafmtCheckAll` during builds.\nTo automatically fix Java code style issues, please use `build/sbt javafmtAll`.\n\n### Configuring Code Formatter for Eclipse/IntelliJ\n\nFollow the instructions for [Eclipse](https://github.com/google/google-java-format#eclipse) or\n[IntelliJ](https://github.com/google/google-java-format#intellij-android-studio-and-other-jetbrains-ides) to install the **google-java-format** plugin (note the required manual actions for IntelliJ).\n\n### Using more recent JDKs\n\nThe build script [checks for a lower bound on the JDK](./build.sbt#L14), but the [current SBT version](./project/build.properties)\nimposes an upper bound. Please check the [JDK compatibility](https://docs.scala-lang.org/overviews/jdk-compatibility/overview.html) documentation for more information.\n\n### Serving the documentation with mkdocs\n\nFor an overview of how to contribute to the documentation, please see our introduction [here](./docs/README.md).\nFor the official documentation, please take a look at [https://docs.unitycatalog.io/](https://docs.unitycatalog.io/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funitycatalog%2Funitycatalog","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Funitycatalog%2Funitycatalog","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Funitycatalog%2Funitycatalog/lists"}