{"id":30524014,"url":"https://github.com/npiesco/unitycatalog-docker","last_synced_at":"2025-08-26T20:52:28.738Z","repository":{"id":250058773,"uuid":"832935609","full_name":"npiesco/unitycatalog-docker","owner":"npiesco","description":"Contains detailed walkthrough and necessary code and commands to set-up unity catalog to run on docker.","archived":false,"fork":false,"pushed_at":"2024-07-25T17:20:27.000Z","size":37,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-07-26T00:18:51.243Z","etag":null,"topics":["docker","guide","unitycatalog","walkthrough"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/npiesco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-24T03:14:14.000Z","updated_at":"2024-07-25T18:06:55.000Z","dependencies_parsed_at":"2024-07-25T00:25:18.209Z","dependency_job_id":null,"html_url":"https://github.com/npiesco/unitycatalog-docker","commit_stats":null,"previous_names":["npiesco/unitycatalog-docker"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/npiesco/unitycatalog-docker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npiesco%2Funitycatalog-docker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npiesco%2Funitycatalog-docker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npiesco%2Funitycatalog-docker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/npiesco%2Funitycatalog-docker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/npiesco","download_url":"https://codeload.github.com/npiesco/unitycatalog-docker/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npiesco%2Funitycatalog-docker/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272254552,"owners_count":24901064,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-26T02:00:07.904Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","guide","unitycatalog","walkthrough"],"created_at":"2025-08-26T20:52:28.028Z","updated_at":"2025-08-26T20:52:28.727Z","avatar_url":"https://github.com/npiesco.png","language":null,"readme":"To set up and run a Unity Catalog project using Docker, follow this walkthrough:\n\n## Phase 1: Set-up Unity Catalog Project Directory\n\n### Step 1: Clone the Unity Catalog Repository\nRun the following command to clone the Unity Catalog repository:\n```bash\ngit clone https://github.com/unitycatalog/unitycatalog.git\n```\n\n### Step 2: Setup Dockerfile for macOS and Windows\nNavigate to the cloned repository and create a Dockerfile named `unitycatalog.dockerfile` with the following content:\n\n#### macOS Dockerfile\n```dockerfile\n# Use Ubuntu as base image\nFROM ubuntu:20.04\n\n# Set working 
directory in container\nWORKDIR /app\n\n# Install OpenJDK 17, curl, and other necessary tools\nRUN apt-get update \u0026\u0026 \\\n    apt-get install -y openjdk-17-jdk curl gnupg\n\n# Install sbt\nRUN echo \"deb https://repo.scala-sbt.org/scalasbt/debian all main\" | tee /etc/apt/sources.list.d/sbt.list \u0026\u0026 \\\n    curl -sL \"https://keyserver.ubuntu.com/pks/lookup?op=get\u0026search=0x99E82A75642AC823\" | apt-key add \u0026\u0026 \\\n    apt-get update \u0026\u0026 \\\n    apt-get install -y sbt\n\n# Copy necessary files\nCOPY . /app\n\n# Build project\nRUN sbt package\n\n# Make sure scripts are executable\nRUN chmod +x /app/bin/start-uc-server /app/bin/uc\n\n# Add /app/bin to PATH\nENV PATH=\"/app/bin:${PATH}\"\n\n# Expose port app runs on\nEXPOSE 8080\n\n# Run Unity Catalog server\nCMD [\"/bin/bash\", \"/app/bin/start-uc-server\"]\n```\n\n#### Windows Dockerfile\n```dockerfile\n# Use Ubuntu as base image\nFROM ubuntu:20.04\n\n# Set working directory in container\nWORKDIR /app\n\n# Install OpenJDK 17, curl, and other necessary tools\nRUN apt-get update \u0026\u0026 \\\n    apt-get install -y openjdk-17-jdk curl gnupg dos2unix\n\n# Install sbt\nRUN echo \"deb https://repo.scala-sbt.org/scalasbt/debian all main\" | tee /etc/apt/sources.list.d/sbt.list \u0026\u0026 \\\n    curl -sL \"https://keyserver.ubuntu.com/pks/lookup?op=get\u0026search=0x99E82A75642AC823\" | apt-key add \u0026\u0026 \\\n    apt-get update \u0026\u0026 \\\n    apt-get install -y sbt\n\n# Copy necessary files\nCOPY . 
/app\n\n# Ensure scripts have LF line endings\nRUN dos2unix /app/bin/start-uc-server \u0026\u0026 \\\n    dos2unix /app/bin/uc\n\n# Build project\nRUN sbt package\n\n# Make sure scripts are executable\nRUN chmod +x /app/bin/start-uc-server /app/bin/uc\n\n# Add /app/bin to PATH\nENV PATH=\"/app/bin:${PATH}\"\n\n# Expose port app runs on\nEXPOSE 8080\n\n# Run Unity Catalog server\nCMD [\"/bin/bash\", \"/app/bin/start-uc-server\"]\n```\n\n## Phase 2: Running Unity Catalog in Docker\n\n### Step 1: Build the Docker Image\nRun the following command to build the Docker image:\n```bash\ndocker build -t unitycatalog -f unitycatalog.dockerfile .\n```\n\n### Step 2: Run the Docker Container\nRun the following command to start a new container:\n```bash\ndocker run -d --name unitycatalog -p 8080:8080 unitycatalog\n```\n\n### Step 3: Verify the Container is Running\nCheck the logs to verify the container is running:\n```bash\ndocker logs unitycatalog\n```\n*You should see the Unity Catalog logo and other startup messages.*\n\n```\n###################################################################\n#  _    _       _ _            _____      _        _              #\n# | |  | |     (_) |          / ____|    | |      | |             #\n# | |  | |_ __  _| |_ _   _  | |     __ _| |_ __ _| | ___   __ _  #\n# | |  | | '_ \\| | __| | | | | |    / _` | __/ _` | |/ _ \\ / _` | #\n# | |__| | | | | | |_| |_| | | |___| (_| | || (_| | | (_) | (_| | #\n#  \\____/|_| |_|_|\\__|\\__, |  \\_____\\__,_|\\__\\__,_|_|\\___/ \\__, | #\n#                      __/ |                                __/ | #\n#                     |___/               v0.1.0-SNAPSHOT  |___/  #\n###################################################################\n```\n\n## Phase 3: Initial Setup\n\n### Step 1: Create and List Catalogs\nCreate a new catalog and list all catalogs:\n```bash\ndocker exec -it unitycatalog uc catalog create --name my_local_catalog\ndocker exec -it unitycatalog uc catalog list\n```\n\n### Step 2: 
Create and List Schemas\nCreate a new schema within the catalog and list all schemas:\n```bash\ndocker exec -it unitycatalog uc schema create --catalog my_local_catalog --name my_schema\ndocker exec -it unitycatalog uc schema list --catalog my_local_catalog\n```\n\n### Step 3: Create a Delta Table\n\n#### Dependencies\nEnsure you have the following Python packages installed:\n- `deltalake`\n- `duckdb`\n- `mimesis`\n\n#### Code to Generate and Write Data to Delta Table\n```python\nimport duckdb\nfrom deltalake import write_deltalake, DeltaTable\nimport os\nfrom mimesis import Person\nfrom mimesis.locales import Locale\n\n# Generate 1000 records\nperson = Person(Locale.EN)\nrecords = []\nfor index in range(1, 1001):\n    record = {\n        \"Index\": index,\n        \"User_Id\": person.identifier(),\n        \"First_Name\": person.first_name(),\n        \"Last_Name\": person.last_name(),\n        \"Sex\": person.gender(),\n        \"Email\": person.email(),\n        \"Phone\": person.telephone(),\n        \"Date_of_birth\": person.birthdate().isoformat(),\n        \"Job_Title\": person.occupation()\n    }\n    records.append(record)\n\n# Create DuckDB table and insert records\ncon = duckdb.connect()\ncon.execute(\"\"\"\n    CREATE TABLE users (\n        \"Index\" INTEGER,\n        \"User_Id\" VARCHAR,\n        \"First_Name\" VARCHAR,\n        \"Last_Name\" VARCHAR,\n        \"Sex\" VARCHAR,\n        \"Email\" VARCHAR,\n        \"Phone\" VARCHAR,\n        \"Date_of_birth\" DATE,\n        \"Job_Title\" VARCHAR\n    )\n\"\"\")\n\ninsert_query = \"INSERT INTO users VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)\"\ncon.executemany(insert_query, [(record[\"Index\"], record[\"User_Id\"], record[\"First_Name\"], record[\"Last_Name\"], record[\"Sex\"], record[\"Email\"], record[\"Phone\"], record[\"Date_of_birth\"], record[\"Job_Title\"]) for record in records])\n\n# Convert DuckDB table to DataFrame and write to Delta table\nduck_df = con.execute(\"SELECT * FROM 
users\").fetchdf()\ndelta_table_path = \".../test_delta_table/\" # modify this to your desired directory\nwrite_deltalake(delta_table_path, duck_df, mode='append')\n\n# Verify Delta table directory contents and check metadata\nprint(os.listdir(delta_table_path))\n\nresult = con.execute(f\"SELECT * FROM delta_scan('{delta_table_path}')\").fetchdf()\nprint(result)\n\ndelta_table = DeltaTable(delta_table_path)\nprint(delta_table.history())\n```\n\n### Step 4: Register the Delta Table\nNow that the Delta table is created, register it with Unity Catalog:\n```bash\ndocker exec -it unitycatalog uc table create --full_name my_local_catalog.my_schema.sample_delta_table --columns \"Index INT, User_Id STRING, First_Name STRING, Last_Name STRING, Sex STRING, Email STRING, Phone STRING, Date_of_birth DATE, Job_Title STRING\" --format DELTA --storage_location file:///C:/.../.../.../test_delta_table\n```\n*Note: Update the storage location to the path of your Delta table.*\n\n### Step 5: Read the Table *(Delta Format Only)*\nRead the table to verify its contents:\n```bash\ndocker exec -it unitycatalog uc table read --full_name my_local_catalog.my_schema.sample_delta_table\n```\n\nYou have now set up Unity Catalog to run on Docker, created a catalog and a schema, generated a sample Delta table, registered it, and read it back using Unity Catalog's native capabilities.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnpiesco%2Funitycatalog-docker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnpiesco%2Funitycatalog-docker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnpiesco%2Funitycatalog-docker/lists"}