
# Apache Spark Connect Client for Rust

This project houses the **experimental** client for [Spark
Connect](https://spark.apache.org/docs/latest/spark-connect-overview.html) for
[Apache Spark](https://spark.apache.org/), written in [Rust](https://www.rust-lang.org/).

## Current State of the Project

Currently, the Spark Connect client for Rust is **highly experimental** and **should
not be used in any production setting**. It is a proof of concept to identify the methods
of interacting with a Spark cluster from Rust.

The `spark-connect-rs` crate aims to provide an entrypoint to [Spark Connect](https://spark.apache.org/docs/latest/spark-connect-overview.html) and to provide *similar* DataFrame API interactions.
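
To give a feel for those interactions, below is a minimal sketch of connecting to a Spark Connect server and running a query. It follows the style of the bundled [/examples](https://github.com/sjrusso8/spark-connect-rs/tree/main/examples); since the crate is experimental, exact signatures may differ from the current API.

```rust
use spark_connect_rs::{SparkSession, SparkSessionBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to a local Spark Connect server (see "Getting Started" below)
    let spark: SparkSession = SparkSessionBuilder::remote("sc://127.0.0.1:15002/")
        .build()
        .await?;

    // Lazily build a plan with SQL, then trigger execution with `show`
    let df = spark.sql("SELECT 1 AS id, 'hello' AS value").await?;
    df.show(Some(5), None, None).await?;

    Ok(())
}
```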

## Project Layout

```
├── crates      <- crates for the implementation of the client-side spark-connect bindings
│   ├── connect  <- crate for 'spark-connect-rs'
│   └── protobuf <- connect protobuf for apache/spark
├── examples    <- examples of using different aspects of the crate
└── datasets    <- sample files from the main spark repo
```

A future goal is to add crates that make it easier to create bindings for other languages.

## Getting Started

This section explains how to run Spark Connect for Rust locally, starting from scratch.

**Step 1**: Install rust via rustup: https://www.rust-lang.org/tools/install

**Step 2**: Ensure you have [cmake](https://cmake.org/download/) and [protobuf](https://grpc.io/docs/protoc-installation/) installed on your machine

**Step 3**: Run the following commands to clone the repo and build the project

```bash
git clone https://github.com/sjrusso8/spark-connect-rs.git
cd spark-connect-rs

cargo build
```

**Step 4**: Set up the Spark driver on localhost, either by downloading Spark or with [docker](https://docs.docker.com/engine/install/).

With local spark:

1. [Download a Spark distribution](https://spark.apache.org/downloads.html) (3.5.1 recommended) and unzip the package.

2. Set your `SPARK_HOME` environment variable to the location where Spark was extracted.

3. Start the Spark Connect server with the following command (make sure to use a package version that matches your Spark distribution):

```bash
$ $SPARK_HOME/sbin/start-connect-server.sh --packages "org.apache.spark:spark-connect_2.12:3.5.1,io.delta:delta-spark_2.12:3.0.0" \
--conf "spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp" \
--conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
--conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
```

With docker:

1. Start the Spark Connect server using the `docker-compose.yml` included in this repo. This will start a Spark Connect server listening on port 15002:

```bash
$ docker compose up --build -d
```

**Step 5**: Run an example from the repo under [/examples](https://github.com/sjrusso8/spark-connect-rs/tree/main/examples/README.md)

## Features

The following sections outline the current functionality, including larger features that are not yet working with this Spark Connect implementation.

- ![done] TLS authentication & Databricks compatibility via the feature flag `feature = 'tls'`
- ![open] UDFs or any type of functionality that takes a closure (`foreach`, `foreachBatch`, etc.)

### SparkSession

[Spark Session](https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/spark_session.html) type object and its implemented traits

|SparkSession |API |Comment |
|------------------|----------|---------------------------------------|
|active |![open] | |
|addArtifact(s) |![open] | |
|addTag |![done] | |
|clearTags |![done] | |
|copyFromLocalToFs |![open] | |
|createDataFrame |![partial]|Only works for `RecordBatch` |
|getActiveSessions |![open] | |
|getTags |![done] | |
|interruptAll |![done] | |
|interruptOperation|![done] | |
|interruptTag |![done] | |
|newSession |![open] | |
|range |![done] | |
|removeTag |![done] | |
|sql |![done] | |
|stop |![open] | |
|table |![done] | |
|catalog |![done] |[Catalog](#catalog) |
|client |![done] |unstable developer api for testing only |
|conf |![done] |[Conf](#runtimeconfig) |
|read |![done] |[DataFrameReader](#dataframereader) |
|readStream |![done] |[DataStreamReader](#datastreamreader) |
|streams |![done] |[Streams](#streamingquerymanager) |
|udf |![open] |[Udf](#udfregistration) - may not be possible |
|udtf |![open] |[Udtf](#udtfregistration) - may not be possible |
|version |![done] | |

### SparkSessionBuilder
|SparkSessionBuilder|API |Comment |
|-------------------|----------|---------------------------------------|
|appName |![done] | |
|config |![done] | |
|master |![open] | |
|remote |![partial]|Validate using [spark connection string](https://github.com/apache/spark/blob/master/connector/connect/docs/client-connection-string.md)|
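
A short sketch of the builder in use, assuming the Rust methods are the snake_case equivalents of the table entries (`app_name` for `appName`):

```rust
use spark_connect_rs::{SparkSession, SparkSessionBuilder};

// A minimal sketch; the connection string follows the Spark Connect
// client-connection-string spec linked in the table above.
async fn build_session() -> Result<SparkSession, Box<dyn std::error::Error>> {
    let spark = SparkSessionBuilder::remote("sc://localhost:15002/;user_id=rust_client")
        .app_name("spark-connect-rs-demo") // snake_case name assumed
        .build()
        .await?;
    Ok(spark)
}
```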

### RuntimeConfig

|RuntimeConfig |API |Comment |
|------------------|----------|---------------------------------------|
|get |![done] | |
|isModifiable |![done] | |
|set |![done] | |
|unset |![done] | |
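
For example, a hedged sketch of round-tripping a config value (method receivers and the default argument to `get` are assumptions; cloning the session mirrors the pattern in the bundled examples):

```rust
use spark_connect_rs::SparkSession;

// A minimal sketch of RuntimeConfig usage.
async fn tune(spark: SparkSession) -> Result<(), Box<dyn std::error::Error>> {
    spark.clone().conf().set("spark.sql.shuffle.partitions", "8").await?;
    let value = spark.conf().get("spark.sql.shuffle.partitions", None).await?;
    println!("shuffle partitions = {value}");
    Ok(())
}
```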

### Catalog

|Catalog |API |Comment |
|------------------|----------|---------------------------------------|
|cacheTable |![done] | |
|clearCache |![done] | |
|createExternalTable|![done] | |
|createTable |![done] | |
|currentCatalog |![done] | |
|currentDatabase |![done] | |
|databaseExists |![done] | |
|dropGlobalTempView|![done] | |
|dropTempView |![done] | |
|functionExists |![done] | |
|getDatabase |![done] | |
|getFunction |![done] | |
|getTable |![done] | |
|isCached |![done] | |
|listCatalogs |![done] | |
|listDatabases |![done] | |
|listFunctions |![done] | |
|listTables |![done] | |
|recoverPartitions |![done] | |
|refreshByPath |![done] | |
|refreshTable |![done] | |
|registerFunction |![open] | |
|setCurrentCatalog |![done] | |
|setCurrentDatabase|![done] | |
|tableExists |![done] | |
|uncacheTable |![done] | |
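
A hedged sketch of a catalog call (snake_case names assumed for the camelCase entries above):

```rust
use spark_connect_rs::SparkSession;

// A minimal sketch; return and error types are assumptions.
async fn inspect_catalog(spark: SparkSession) -> Result<(), Box<dyn std::error::Error>> {
    let db = spark.catalog().current_database().await?;
    println!("current database: {db}");
    Ok(())
}
```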

### DataFrameReader

|DataFrameReader |API |Comment |
|------------------|----------|---------------------------------------|
|csv |![done] | |
|format |![done] | |
|json |![done] | |
|load |![done] | |
|option |![done] | |
|options |![done] | |
|orc |![done] | |
|parquet |![done] | |
|schema |![done] | |
|table |![done] | |
|text |![done] | |
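
For instance, a hedged sketch of reading a CSV with options (the path refers to the bundled `datasets` folder; `load` accepting a slice of paths and the error-handling shapes are assumptions):

```rust
use spark_connect_rs::{dataframe::DataFrame, SparkSession}; // import paths assumed

// A minimal sketch of DataFrameReader usage.
fn read_people(spark: SparkSession) -> Result<DataFrame, Box<dyn std::error::Error>> {
    let df = spark
        .read()
        .format("csv")
        .option("header", "true")
        .option("delimiter", ";")
        .load(["/datasets/people.csv"])?;
    Ok(df)
}
```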

### DataFrameWriter

Spark Connect *should* respect the format as long as your cluster supports the specified type and has the required jars.

|DataFrameWriter |API |Comment |
|------------------|----------|---------------------------------------|
|bucketBy |![done] | |
|csv |![done] | |
|format |![done] | |
|insertInto |![done] | |
|jdbc |![open] | |
|json |![done] | |
|mode |![done] | |
|option |![done] | |
|options |![done] | |
|orc |![done] | |
|parquet |![done] | |
|partitionBy |![done] | |
|save |![done] | |
|saveAsTable |![done] | |
|sortBy |![done] | |
|text |![done] | |
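
A hedged sketch of writing parquet with an explicit save mode (the `SaveMode` enum, module paths, and snake_case method names are assumptions):

```rust
use spark_connect_rs::dataframe::{DataFrame, SaveMode}; // paths assumed

// A minimal sketch of DataFrameWriter usage.
async fn write_parquet(df: DataFrame) -> Result<(), Box<dyn std::error::Error>> {
    df.write()
        .format("parquet")
        .mode(SaveMode::Overwrite) // `mode` is listed as done above
        .partition_by(["year"])    // snake_case for `partitionBy` assumed
        .save("/tmp/people.parquet")
        .await?;
    Ok(())
}
```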

### DataFrameWriterV2

|DataFrameWriterV2 |API |Comment |
|------------------|----------|---------------------------------------|
|append |![done] | |
|create |![done] | |
|createOrReplace |![done] | |
|option |![done] | |
|options |![done] | |
|overwrite |![done] | |
|overwritePartitions|![done] | |
|partitionedBy |![done] | |
|replace |![done] | |
|tableProperty |![done] | |
|using |![done] | |

### DataStreamReader

|DataStreamReader |API |Comment |
|------------------|----------|---------------------------------------|
|csv |![open] | |
|format |![done] | |
|json |![open] | |
|load |![done] | |
|option |![done] | |
|options |![done] | |
|orc |![open] | |
|parquet |![open] | |
|schema |![done] | |
|table |![open] | |
|text |![open] | |

### DataStreamWriter

Start a streaming job and return a `StreamingQuery` object to handle the stream operations.

|DataStreamWriter |API |Comment |
|------------------|----------|---------------------------------------|
|foreach |![open] | |
|foreachBatch |![open] | |
|format |![done] | |
|option |![done] | |
|options |![done] | |
|outputMode |![done] |Uses an Enum for `OutputMode` |
|partitionBy |![done] | |
|queryName |![done] | |
|start |![done] | |
|toTable |![done] | |
|trigger |![done] |Uses an Enum for `TriggerMode` |
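
Putting the stream reader and writer together, a hedged sketch of a small streaming job (enum names follow the table comments; variant names, module paths, and argument shapes are assumptions):

```rust
use spark_connect_rs::streaming::{OutputMode, TriggerMode}; // paths assumed
use spark_connect_rs::SparkSession;

// A minimal sketch: rate source -> console sink, returning via the query handle.
async fn run_stream(spark: SparkSession) -> Result<(), Box<dyn std::error::Error>> {
    let df = spark
        .read_stream()
        .format("rate")
        .option("rowsPerSecond", "5")
        .load(None)?;

    let query = df
        .write_stream()
        .format("console")
        .output_mode(OutputMode::Append)
        .trigger(TriggerMode::ProcessingTime("1 seconds".to_string())) // variant assumed
        .query_name("rate_to_console")
        .start(None)
        .await?;

    query.stop().await?; // `StreamingQuery` methods are listed in the next section
    Ok(())
}
```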

### StreamingQuery

|StreamingQuery |API |Comment |
|------------------|----------|---------------------------------------|
|awaitTermination |![done] | |
|exception |![done] | |
|explain |![done] | |
|processAllAvailable|![done] | |
|stop |![done] | |
|id |![done] | |
|isActive |![done] | |
|lastProgress |![done] | |
|name |![done] | |
|recentProgress |![done] | |
|runId |![done] | |
|status |![done] | |

### StreamingQueryManager

|StreamingQueryManager|API |Comment |
|---------------------|----------|---------------------------------------|
|awaitAnyTermination |![done] | |
|get |![done] | |
|resetTerminated |![done] | |
|active |![done] | |

### StreamingQueryListener

|StreamingQueryListener|API |Comment |
|----------------------|----------|---------------------------------------|
|onQueryIdle |![open] | |
|onQueryProgress |![open] | |
|onQueryStarted |![open] | |
|onQueryTerminated |![open] | |

### DataFrame

Spark [DataFrame](https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/dataframe.html) type object and its implemented traits.

| DataFrame | API | Comment |
|-------------------------------|---------|------------------------------------------------------------|
| agg | ![done] | |
| alias | ![done] | |
| approxQuantile | ![done] | |
| cache | ![done] | |
| checkpoint | ![open] | Not part of Spark Connect |
| coalesce | ![done] | |
| colRegex | ![done] | |
| collect | ![done] | |
| columns | ![done] | |
| corr | ![done] | |
| count | ![done] | |
| cov | ![done] | |
| createGlobalTempView | ![done] | |
| createOrReplaceGlobalTempView | ![done] | |
| createOrReplaceTempView | ![done] | |
| createTempView | ![done] | |
| crossJoin | ![done] | |
| crosstab | ![done] | |
| cube | ![done] | |
| describe | ![done] | |
| distinct | ![done] | |
| drop | ![done] | |
| dropDuplicatesWithinWatermark | ![done] | |
| drop_duplicates | ![done] | |
| dropna | ![done] | |
| dtypes | ![done] | |
| exceptAll | ![done] | |
| explain | ![done] | |
| fillna | ![done] | |
| filter | ![done] | |
| first | ![done] | |
| foreach | ![open] | |
| foreachPartition | ![open] | |
| freqItems | ![done] | |
| groupBy | ![done] | |
| head | ![done] | |
| hint | ![done] | |
| inputFiles | ![done] | |
| intersect | ![done] | |
| intersectAll | ![done] | |
| isEmpty | ![done] | |
| isLocal | ![done] | |
| isStreaming | ![done] | |
| join | ![done] | |
| limit | ![done] | |
| localCheckpoint | ![open] | Not part of Spark Connect |
| mapInPandas | ![open] | TBD on this exact implementation |
| mapInArrow | ![open] | TBD on this exact implementation |
| melt | ![done] | |
| na | ![done] | |
| observe | ![open] | |
| offset | ![done] | |
| orderBy | ![done] | |
| persist | ![done] | |
| printSchema | ![done] | |
| randomSplit | ![done] | |
| registerTempTable | ![done] | |
| repartition | ![done] | |
| repartitionByRange | ![done] | |
| replace | ![done] | |
| rollup | ![done] | |
| sameSemantics | ![done] | |
| sample | ![done] | |
| sampleBy | ![done] | |
| schema | ![done] | |
| select | ![done] | |
| selectExpr | ![done] | |
| semanticHash | ![done] | |
| show | ![done] | |
| sort | ![done] | |
| sortWithinPartitions | ![done] | |
| sparkSession | ![done] | |
| stat | ![done] | |
| storageLevel | ![done] | |
| subtract | ![done] | |
| summary | ![done] | |
| tail | ![done] | |
| take | ![done] | |
| to | ![done] | |
| toDF | ![done] | |
| toJSON | ![partial] | Does not return an `RDD` but a long JSON formatted `String` |
| toLocalIterator | ![open] | |
| ~~toPandas~~ to_polars & toPolars | ![partial] | Convert to a `polars::frame::DataFrame` |
| **new** to_datafusion & toDataFusion | ![done] | Convert to a `datafusion::dataframe::DataFrame` |
| transform | ![done] | |
| union | ![done] | |
| unionAll | ![done] | |
| unionByName | ![done] | |
| unpersist | ![done] | |
| unpivot | ![done] | |
| where                         | ![done] | Use `filter` instead; `where` is a reserved keyword in Rust |
| withColumn | ![done] | |
| withColumns | ![done] | |
| withColumnRenamed | ![done] | |
| withColumnsRenamed | ![done] | |
| withMetadata | ![done] | |
| withWatermark | ![done] | |
| write | ![done] | |
| writeStream | ![done] | |
| writeTo | ![done] | |

### Column

Spark [Column](https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/column.html) type object and its implemented traits

| Column | API | Comment |
|------------------|---------|------------------------------------------------------------------------------|
| alias | ![done] | |
| asc | ![done] | |
| asc_nulls_first | ![done] | |
| asc_nulls_last | ![done] | |
| astype | ![open] | |
| between | ![open] | |
| cast | ![done] | |
| contains | ![done] | |
| desc | ![done] | |
| desc_nulls_first | ![done] | |
| desc_nulls_last | ![done] | |
| dropFields | ![done] | |
| endswith | ![done] | |
| eqNullSafe | ![open] | |
| getField         | ![open] | This is deprecated but will need to be implemented |
| getItem          | ![open] | This is deprecated but will need to be implemented |
| ilike | ![done] | |
| isNotNull | ![done] | |
| isNull | ![done] | |
| isin | ![done] | |
| like | ![done] | |
| name | ![done] | |
| otherwise | ![open] | |
| over | ![done] | Refer to **Window** for creating window specifications |
| rlike | ![done] | |
| startswith | ![done] | |
| substr | ![done] | |
| when | ![open] | |
| withField | ![done] | |
| eq `==`          | ![done] | Rust requires `==` (via `PartialEq`) to return a `bool`, so column equality is implemented as `col('name').eq(col('id'))`. Not the best, but it works for now |
| addition `+`     | ![done] | |
| subtraction `-`  | ![done] | |
| multiplication `*` | ![done] | |
| division `/` | ![done] | |
| OR `\|` | ![done] | |
| AND `&` | ![done] | |
| XOR `^` | ![done] | |
| Negate `~` | ![done] | |
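
A short sketch of these operators in use (import paths and the `Column` type path are assumptions):

```rust
use spark_connect_rs::column::Column;        // path assumed
use spark_connect_rs::functions::{col, lit}; // paths assumed

// A minimal sketch of expressions built with the overloaded operators above.
fn expressions() -> Vec<Column> {
    vec![
        (col("price") * col("qty")) - col("discount"), // `*` and `-`
        col("active") & col("verified"),               // AND `&`
        col("name").eq(lit("bob")),                    // equality via `.eq`
    ]
}
```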

### Data Types

Data types are used for creating schemas and for casting columns to specific types

| Column | API | Comment |
|-----------------------|---------|-------------------|
| ArrayType | ![done] | |
| BinaryType | ![done] | |
| BooleanType | ![done] | |
| ByteType | ![done] | |
| DateType | ![done] | |
| DecimalType | ![done] | |
| DoubleType | ![done] | |
| FloatType | ![done] | |
| IntegerType | ![done] | |
| LongType | ![done] | |
| MapType | ![done] | |
| NullType | ![done] | |
| ShortType | ![done] | |
| StringType | ![done] | |
| CharType | ![done] | |
| VarcharType | ![done] | |
| StructField | ![done] | |
| StructType | ![done] | |
| TimestampType | ![done] | |
| TimestampNTZType | ![done] | |
| DayTimeIntervalType | ![done] | |
| YearMonthIntervalType | ![done] | |
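
As a sketch of casting with these types (string type names for `cast` and the import paths are assumptions):

```rust
use spark_connect_rs::dataframe::DataFrame; // path assumed
use spark_connect_rs::functions::col;       // path assumed

// A minimal sketch: cast columns while selecting.
fn normalize(df: DataFrame) -> DataFrame {
    df.select([
        col("id").cast("long"), // cast by type name
        col("amount").cast("double"),
    ])
}
```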

### Literal Types

Create Spark literal types from these Rust types, e.g. `lit(1_i64)` maps to a `LongType()` in the schema.

An array can be created from a slice, e.g. `lit([1_i16, 2_i16, 3_i16])` results in an `ArrayType(Short)`, since every value of the slice can be translated into a literal type.

| Spark Literal Type | Rust Type | Status |
|--------------------|---------------------|---------|
| Null | | ![open] |
| Binary | `&[u8]` | ![done] |
| Boolean | `bool` | ![done] |
| Byte | | ![open] |
| Short | `i16` | ![done] |
| Integer | `i32` | ![done] |
| Long | `i64` | ![done] |
| Float | `f32` | ![done] |
| Double | `f64` | ![done] |
| Decimal | | ![open] |
| String | `&str` / `String` | ![done] |
| Date | `chrono::NaiveDate` | ![done] |
| Timestamp | `chrono::DateTime` | ![done] |
| TimestampNtz | `chrono::NaiveDateTime` | ![done] |
| CalendarInterval | | ![open] |
| YearMonthInterval | | ![open] |
| DayTimeInterval | | ![open] |
| Array | `slice` / `Vec` | ![done] |
| Map | Create with the function `create_map` | ![done] |
| Struct | Create with the function `struct_col` or `named_struct` | ![done] |
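
A short sketch tying a few of these together (the argument shapes for `create_map` and `named_struct` are assumptions):

```rust
use spark_connect_rs::column::Column;                             // path assumed
use spark_connect_rs::functions::{create_map, lit, named_struct}; // paths assumed

// A minimal sketch of Rust values becoming Spark literals per the table above.
fn literals() -> Vec<Column> {
    vec![
        lit(1_i64),                 // Long
        lit([1_i16, 2_i16, 3_i16]), // Array(Short)
        lit(chrono::NaiveDate::from_ymd_opt(2024, 1, 1).unwrap()), // Date
        create_map([lit("key"), lit(1_i32)]),    // Map (alternating key/value)
        named_struct([lit("field"), lit(true)]), // Struct (alternating name/value)
    ]
}
```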

### Window & WindowSpec

For ease of use it's recommended to use `Window` to create the `WindowSpec`.

| Window | API | Comment |
|-------------------------|---------|---------|
| currentRow | ![done] | |
| orderBy | ![done] | |
| partitionBy | ![done] | |
| rangeBetween | ![done] | |
| rowsBetween | ![done] | |
| unboundedFollowing | ![done] | |
| unboundedPreceding | ![done] | |
| WindowSpec.orderBy | ![done] | |
| WindowSpec.partitionBy | ![done] | |
| WindowSpec.rangeBetween | ![done] | |
| WindowSpec.rowsBetween | ![done] | |
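
A hedged sketch of building and applying a window (the `Window::new` constructor, snake_case method names, and import paths are assumptions):

```rust
use spark_connect_rs::dataframe::DataFrame;         // path assumed
use spark_connect_rs::expressions::Window;          // path assumed
use spark_connect_rs::functions::{col, row_number}; // paths assumed

// A minimal sketch: rank rows within each department by salary.
fn rank_by_salary(df: DataFrame) -> DataFrame {
    let window = Window::new()
        .partition_by([col("department")])
        .order_by([col("salary").desc()]);

    df.select([
        col("department"),
        col("salary"),
        row_number().over(window).alias("rank"),
    ])
}
```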

### Functions

Only a few of the functions are covered by unit tests. Functions that take closures or lambdas are not feasible to implement over Spark Connect.
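
For example, a hedged sketch composing a few of the functions below into a select expression (import paths are assumptions):

```rust
use spark_connect_rs::dataframe::DataFrame;        // path assumed
use spark_connect_rs::functions::{col, upper};     // paths assumed

// A minimal sketch: functions compose into column expressions for `select`.
fn summarize(df: DataFrame) -> DataFrame {
    df.select([col("name"), upper(col("city")).alias("CITY")])
}
```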

| Functions | API | Comments |
|-----------------------------|---------|----------|
| abs | ![done] | |
| acos | ![done] | |
| acosh | ![done] | |
| add_months | ![done] | |
| aes_decrypt | ![done] | |
| aes_encrypt | ![done] | |
| aggregate | ![open] | |
| any_value | ![done] | |
| approx_count_distinct | ![done] | |
| approx_percentile | ![open] | |
| array | ![done] | |
| array_agg | ![done] | |
| array_append | ![done] | |
| array_compact | ![done] | |
| array_contains | ![done] | |
| array_distinct | ![done] | |
| array_except | ![done] | |
| array_insert | ![done] | |
| array_intersect | ![done] | |
| array_join | ![done] | |
| array_max | ![done] | |
| array_min | ![done] | |
| array_position | ![done] | |
| array_prepend | ![done] | |
| array_remove | ![done] | |
| array_repeat | ![done] | |
| array_size | ![done] | |
| array_sort | ![open] | |
| array_union | ![done] | |
| arrays_overlap | ![done] | |
| arrays_zip | ![done] | |
| asc | ![done] | |
| asc_nulls_first | ![done] | |
| asc_nulls_last | ![done] | |
| ascii | ![done] | |
| asin | ![done] | |
| asinh | ![done] | |
| assert_true | ![done] | |
| atan | ![done] | |
| atan2 | ![done] | |
| atanh | ![done] | |
| avg | ![done] | |
| base64 | ![done] | |
| bin | ![done] | |
| bit_and | ![done] | |
| bit_count | ![done] | |
| bit_get | ![done] | |
| bit_length | ![done] | |
| bit_or | ![done] | |
| bit_xor | ![done] | |
| bitmap_bit_position | ![done] | |
| bitmap_bucket_number | ![done] | |
| bitmap_construct_agg | ![done] | |
| bitmap_count | ![done] | |
| bitmap_or_agg | ![done] | |
| bitwise_not | ![done] | |
| bool_and | ![done] | |
| bool_or | ![done] | |
| broadcast | ![done] | |
| bround | ![done] | |
| btrim | ![done] | |
| bucket | ![done] | |
| call_function | ![open] | |
| call_udf | ![open] | |
| cardinality | ![done] | |
| cbrt | ![done] | |
| ceil | ![done] | |
| ceiling | ![done] | |
| char | ![done] | |
| char_length | ![done] | |
| character_length | ![done] | |
| coalesce | ![done] | |
| col | ![done] | |
| collect_list | ![done] | |
| collect_set | ![done] | |
| column | ![done] | |
| concat | ![done] | |
| concat_ws | ![done] | |
| contains | ![done] | |
| conv | ![done] | |
| convert_timezone | ![done] | |
| corr | ![done] | |
| cos | ![done] | |
| cosh | ![done] | |
| cot | ![done] | |
| count | ![done] | |
| count_distinct | ![done] | |
| count_if | ![done] | |
| count_min_sketch | ![done] | |
| covar_pop | ![done] | |
| covar_samp | ![done] | |
| crc32 | ![done] | |
| create_map | ![done] | |
| csc | ![done] | |
| cume_dist | ![done] | |
| curdate | ![done] | |
| current_catalog | ![done] | |
| current_database | ![done] | |
| current_date | ![done] | |
| current_schema | ![done] | |
| current_timestamp | ![done] | |
| current_timezone | ![done] | |
| current_user | ![done] | |
| date_add | ![done] | |
| date_diff | ![done] | |
| date_format | ![done] | |
| date_from_unix_date | ![done] | |
| date_part | ![done] | |
| date_sub | ![done] | |
| date_trunc | ![done] | |
| dateadd | ![done] | |
| datediff | ![done] | |
| datepart | ![open] | |
| day | ![done] | |
| dayofmonth | ![done] | |
| dayofweek | ![done] | |
| dayofyear | ![done] | |
| days | ![done] | |
| decode | ![done] | |
| degrees | ![done] | |
| dense_rank | ![done] | |
| desc | ![done] | |
| desc_nulls_first | ![done] | |
| desc_nulls_last | ![done] | |
| e | ![done] | |
| element_at | ![done] | |
| elt | ![done] | |
| encode | ![done] | |
| endswith | ![done] | |
| equal_null | ![done] | |
| every | ![done] | |
| exists | ![open] | |
| exp | ![done] | |
| explode | ![done] | |
| explode_outer | ![done] | |
| expm1 | ![done] | |
| expr | ![done] | |
| extract | ![done] | |
| factorial | ![done] | |
| filter | ![open] | |
| find_in_set | ![done] | |
| first | ![done] | |
| first_value | ![done] | |
| flatten | ![done] | |
| floor | ![done] | |
| forall | ![open] | |
| format_number | ![done] | |
| format_string | ![done] | |
| from_csv | ![done] | |
| from_json | ![done] | |
| from_unixtime | ![done] | |
| from_utc_timestamp | ![done] | |
| get | ![done] | |
| get_json_object | ![done] | |
| getbit | ![done] | |
| greatest | ![done] | |
| grouping | ![done] | |
| grouping_id | ![done] | |
| hash | ![done] | |
| hex | ![done] | |
| histogram_numeric | ![done] | |
| hll_sketch_agg | ![done] | |
| hll_sketch_estimate | ![done] | |
| hll_union | ![done] | |
| hll_union_agg | ![done] | |
| hour | ![done] | |
| hours | ![done] | |
| hypot | ![done] | |
| ifnull | ![done] | |
| ilike | ![done] | |
| initcap | ![done] | |
| inline | ![done] | |
| inline_outer | ![done] | |
| input_file_block_length | ![done] | |
| input_file_block_start | ![done] | |
| input_file_name | ![done] | |
| instr | ![done] | |
| isnan | ![done] | |
| isnotnull | ![done] | |
| isnull | ![done] | |
| java_method | ![done] | |
| json_array_length | ![done] | |
| json_object_keys | ![done] | |
| json_tuple | ![done] | |
| kurtosis | ![done] | |
| lag | ![done] | |
| last | ![done] | |
| last_day | ![done] | |
| last_value | ![done] | |
| lcase | ![done] | |
| lead | ![done] | |
| least | ![done] | |
| left | ![done] | |
| length | ![done] | |
| levenshtein | ![done] | |
| like | ![done] | |
| lit | ![done] | |
| ln | ![done] | |
| localtimestamp | ![done] | |
| locate | ![done] | |
| log | ![done] | |
| log10 | ![done] | |
| log1p | ![done] | |
| log2 | ![done] | |
| lower | ![done] | |
| lpad | ![done] | |
| ltrim | ![done] | |
| make_date | ![done] | |
| make_dt_interval | ![done] | |
| make_interval | ![done] | |
| make_timestamp | ![done] | |
| make_timestamp_ltz | ![done] | |
| make_timestamp_ntz | ![done] | |
| make_ym_interval | ![done] | |
| map_concat | ![done] | |
| map_contains_key | ![done] | |
| map_entries | ![done] | |
| map_filter | ![open] | |
| map_from_arrays | ![done] | |
| map_from_entries | ![done] | |
| map_keys | ![done] | |
| map_values | ![done] | |
| map_zip_with | ![open] | |
| mask | ![open] | |
| max | ![done] | |
| max_by | ![done] | |
| md5 | ![done] | |
| mean | ![done] | |
| median | ![done] | |
| min | ![done] | |
| min_by | ![done] | |
| minute | ![done] | |
| mode | ![done] | |
| monotonically_increasing_id | ![done] | |
| month | ![done] | |
| months | ![done] | |
| months_between | ![done] | |
| named_struct | ![done] | |
| nanvl | ![done] | |
| negate | ![done] | |
| negative | ![done] | |
| next_day | ![done] | |
| now | ![done] | |
| nth_value | ![done] | |
| ntile | ![done] | |
| nullif | ![done] | |
| nvl | ![done] | |
| nvl2 | ![done] | |
| octet_length | ![done] | |
| overlay | ![done] | |
| pandas_udf | ![open] | |
| parse_url | ![done] | |
| percent_rank | ![done] | |
| percentile | ![done] | |
| percentile_approx | ![done] | |
| pi | ![done] | |
| pmod | ![done] | |
| posexplode | ![done] | |
| posexplode_outer | ![done] | |
| position | ![done] | |
| positive | ![done] | |
| pow | ![done] | |
| power | ![done] | |
| printf | ![done] | |
| product | ![done] | |
| quarter | ![done] | |
| radians | ![done] | |
| raise_error | ![done] | |
| rand | ![done] | |
| randn | ![done] | |
| rank | ![done] | |
| reduce | ![open] | |
| reflect | ![done] | |
| regexp | ![done] | |
| regexp_count | ![done] | |
| regexp_extract | ![done] | |
| regexp_extract_all | ![done] | |
| regexp_instr | ![done] | |
| regexp_like | ![done] | |
| regexp_replace | ![done] | |
| regexp_substr | ![done] | |
| regr_avgx | ![done] | |
| regr_avgy | ![done] | |
| regr_count | ![done] | |
| regr_intercept | ![done] | |
| regr_r2 | ![done] | |
| regr_slope | ![done] | |
| regr_sxx | ![done] | |
| regr_sxy | ![done] | |
| regr_syy | ![done] | |
| repeat | ![done] | |
| replace | ![done] | |
| reverse | ![done] | |
| right | ![done] | |
| rint | ![done] | |
| rlike | ![done] | |
| round | ![done] | |
| row_number | ![done] | |
| rpad | ![done] | |
| rtrim | ![done] | |
| schema_of_csv | ![done] | |
| schema_of_json | ![done] | |
| sec | ![done] | |
| second | ![done] | |
| sentences | ![done] | |
| sequence | ![done] | |
| session_window | ![done] | |
| sha | ![done] | |
| sha1 | ![done] | |
| sha2 | ![done] | |
| shiftleft | ![done] | |
| shiftright | ![done] | |
| shiftrightunsigned | ![done] | |
| shuffle | ![done] | |
| sign | ![done] | |
| signum | ![done] | |
| sin | ![done] | |
| sinh | ![done] | |
| size | ![done] | |
| skewness | ![done] | |
| slice | ![done] | |
| some | ![done] | |
| sort_array | ![done] | |
| soundex | ![done] | |
| spark_partition_id | ![done] | |
| split | ![done] | |
| split_part | ![done] | |
| sqrt | ![done] | |
| stack | ![done] | |
| startswith | ![done] | |
| std | ![done] | |
| stddev | ![done] | |
| stddev_pop | ![done] | |
| stddev_samp | ![done] | |
| str_to_map | ![done] | |
| struct | ![open] | |
| substr | ![done] | |
| substring | ![done] | |
| substring_index | ![done] | |
| sum | ![done] | |
| sum_distinct | ![done] | |
| tan | ![done] | |
| tanh | ![done] | |
| timestamp_micros | ![done] | |
| timestamp_millis | ![done] | |
| timestamp_seconds | ![done] | |
| to_binary | ![done] | |
| to_char | ![done] | |
| to_csv | ![done] | |
| to_date | ![done] | |
| to_json | ![done] | |
| to_number | ![done] | |
| to_timestamp | ![done] | |
| to_timestamp_ltz | ![done] | |
| to_timestamp_ntz | ![done] | |
| to_unix_timestamp | ![done] | |
| to_utc_timestamp | ![done] | |
| to_varchar | ![done] | |
| to_degrees | ![done] | |
| to_radians | ![done] | |
| transform | ![open] | |
| transform_keys | ![open] | |
| transform_values | ![open] | |
| translate | ![done] | |
| trim | ![done] | |
| trunc | ![done] | |
| try_add | ![done] | |
| try_aes_decrypt | ![done] | |
| try_avg | ![done] | |
| try_divide | ![done] | |
| try_element_at | ![done] | |
| try_multiply | ![done] | |
| try_subtract | ![done] | |
| try_sum | ![done] | |
| try_to_binary | ![done] | |
| try_to_number | ![done] | |
| try_to_timestamp | ![done] | |
| typeof | ![open] | |
| ucase | ![done] | |
| udf | ![open] | |
| udtf | ![open] | |
| unbase64 | ![done] | |
| unhex | ![done] | |
| unix_date | ![done] | |
| unix_micros | ![open] | |
| unix_millis | ![done] | |
| unix_seconds | ![done] | |
| unix_timestamp | ![done] | |
| unwrap_udt | ![open] | |
| upper | ![done] | |
| url_decode | ![done] | |
| url_encode | ![done] | |
| user | ![done] | |
| var_pop | ![done] | |
| var_samp | ![done] | |
| variance | ![done] | |
| version | ![done] | |
| weekday | ![done] | |
| weekofyear | ![done] | |
| when | ![open] | |
| width_bucket | ![done] | |
| window | ![done] | |
| window_time | ![done] | |
| xpath | ![done] | |
| xpath_boolean | ![done] | |
| xpath_double | ![done] | |
| xpath_float | ![done] | |
| xpath_int | ![done] | |
| xpath_long | ![done] | |
| xpath_number | ![done] | |
| xpath_short | ![done] | |
| xpath_string | ![done] | |
| xxhash64 | ![done] | |
| year | ![done] | |
| years | ![done] | |
| zip_with | ![open] | |

### UdfRegistration (may not be possible)

|UDFRegistration |API |Comment |
|------------------|----------|---------------------------------------|
|register |![open] | |
|registerJavaFunction|![open] | |
|registerJavaUDAF |![open] | |

### UdtfRegistration (may not be possible)

|UDTFRegistration |API |Comment |
|------------------|----------|---------------------------------------|
|register |![open] | |

[open]: https://cdn.jsdelivr.net/gh/Readme-Workflows/Readme-Icons@main/icons/octicons/IssueNeutral.svg
[done]: https://cdn.jsdelivr.net/gh/Readme-Workflows/Readme-Icons@main/icons/octicons/ApprovedChanges.svg
[partial]: https://cdn.jsdelivr.net/gh/Readme-Workflows/Readme-Icons@main/icons/octicons/IssueDrafted.svg