{"id":26152765,"url":"https://github.com/apache/spark-connect-swift","last_synced_at":"2025-04-14T06:03:59.420Z","repository":{"id":281752303,"uuid":"946296075","full_name":"apache/spark-connect-swift","owner":"apache","description":"Apache Spark Connect Client for Swift","archived":false,"fork":false,"pushed_at":"2025-04-14T02:33:10.000Z","size":317,"stargazers_count":10,"open_issues_count":1,"forks_count":3,"subscribers_count":19,"default_branch":"main","last_synced_at":"2025-04-14T06:02:55.754Z","etag":null,"topics":["big-data","spark","sql","swift"],"latest_commit_sha":null,"homepage":"https://spark.apache.org/","language":"Swift","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-10T23:20:04.000Z","updated_at":"2025-04-14T02:33:13.000Z","dependencies_parsed_at":"2025-03-11T01:20:44.038Z","dependency_job_id":"0d2cf61c-c44d-43d5-90ee-d15eec62cb62","html_url":"https://github.com/apache/spark-connect-swift","commit_stats":null,"previous_names":["apache/spark-connect-swift"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-connect-swift","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-connect-swift/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-connect-swift/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-connect-swift/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/spark-connect-swift/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248830394,"owners_count":21168272,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["big-data","spark","sql","swift"],"created_at":"2025-03-11T07:21:13.511Z","updated_at":"2025-04-14T06:03:59.409Z","avatar_url":"https://github.com/apache.png","language":"Swift","readme":"# Apache Spark Connect Client for Swift\n\n[![GitHub Actions Build](https://github.com/apache/spark-connect-swift/actions/workflows/build_and_test.yml/badge.svg)](https://github.com/apache/spark-connect-swift/blob/main/.github/workflows/build_and_test.yml)\n[![Swift Version Compatibility](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fapache%2Fspark-connect-swift%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/apache/spark-connect-swift)\n[![Platform Compatibility](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fapache%2Fspark-connect-swift%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/apache/spark-connect-swift)\n\nThis is an experimental Swift library to show how to connect to a remote Apache Spark Connect Server and run SQL statements to manipulate remote data.\n\nSo far, this library project is tracking the upstream changes like the [Apache Spark](https://spark.apache.org) 4.0.0 RC3 release and [Apache Arrow](https://arrow.apache.org) 
project's Swift-support.\n\n## Requirements\n- [Apache Spark 4.0.0 RC3 (March 2025)](https://dist.apache.org/repos/dist/dev/spark/v4.0.0-rc3-bin/)\n- [Swift 6.0 (2024)](https://swift.org)\n- [gRPC Swift 2.1 (March 2025)](https://github.com/grpc/grpc-swift/releases/tag/2.1.2)\n- [gRPC Swift Protobuf 1.1 (March 2025)](https://github.com/grpc/grpc-swift-protobuf/releases/tag/1.1.0)\n- [gRPC Swift NIO Transport 1.0 (March 2025)](https://github.com/grpc/grpc-swift-nio-transport/releases/tag/1.0.2)\n- [Apache Arrow Swift](https://github.com/apache/arrow/tree/main/swift)\n\n## How to use in your apps\n\nCreate a Swift project.\n```\n$ mkdir SparkConnectSwiftApp\n$ cd SparkConnectSwiftApp\n$ swift package init --name SparkConnectSwiftApp --type executable\n```\n\nAdd the `SparkConnect` package as a dependency, as follows.\n```\n$ cat Package.swift\nimport PackageDescription\n\nlet package = Package(\n  name: \"SparkConnectSwiftApp\",\n  platforms: [\n    .macOS(.v15)\n  ],\n  dependencies: [\n    .package(url: \"https://github.com/apache/spark-connect-swift.git\", branch: \"main\")\n  ],\n  targets: [\n    .executableTarget(\n      name: \"SparkConnectSwiftApp\",\n      dependencies: [.product(name: \"SparkConnect\", package: \"spark-connect-swift\")]\n    )\n  ]\n)\n```\n\nUse `SparkSession` from the `SparkConnect` module in Swift.\n\n```\n$ cat Sources/main.swift\n\nimport SparkConnect\n\nlet spark = try await SparkSession.builder.getOrCreate()\nprint(\"Connected to Apache Spark \\(await spark.version) Server\")\n\nlet statements = [\n  \"DROP TABLE IF EXISTS t\",\n  \"CREATE TABLE IF NOT EXISTS t(a INT) USING ORC\",\n  \"INSERT INTO t VALUES (1), (2), (3)\",\n]\n\nfor s in statements {\n  print(\"EXECUTE: \\(s)\")\n  _ = try await spark.sql(s).count()\n}\nprint(\"SELECT * FROM t\")\ntry await spark.sql(\"SELECT * FROM t\").cache().show()\n\ntry await spark.range(10).filter(\"id % 2 == 0\").write.mode(\"overwrite\").orc(\"/tmp/orc\")\ntry await 
spark.read.orc(\"/tmp/orc\").show()\n\nawait spark.stop()\n```\n\nRun your Swift application.\n\n```\n$ swift run\n...\nConnected to Apache Spark 4.0.0 Server\nEXECUTE: DROP TABLE IF EXISTS t\nEXECUTE: CREATE TABLE IF NOT EXISTS t(a INT) USING ORC\nEXECUTE: INSERT INTO t VALUES (1), (2), (3)\nSELECT * FROM t\n+---+\n| a |\n+---+\n| 2 |\n| 1 |\n| 3 |\n+---+\n+----+\n| id |\n+----+\n| 2  |\n| 6  |\n| 0  |\n| 8  |\n| 4  |\n+----+\n```\n\nYou can find this example in the following repository.\n- https://github.com/dongjoon-hyun/spark-connect-swift-app\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fspark-connect-swift","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2Fspark-connect-swift","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fspark-connect-swift/lists"}