https://github.com/hablapps/doric
Type safety for spark columns
- Host: GitHub
- URL: https://github.com/hablapps/doric
- Owner: hablapps
- License: apache-2.0
- Created: 2021-03-18T16:10:19.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2025-03-31T14:49:15.000Z (about 1 month ago)
- Last Synced: 2025-05-06T23:59:34.839Z (3 days ago)
- Topics: big, big-data, dataframe, scala, spark, spark-columns, typesafe
- Language: Scala
- Homepage: https://www.hablapps.com/doric/
- Size: 13.6 MB
- Stars: 78
- Watchers: 7
- Forks: 11
- Open Issues: 27
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
# [doric](https://en.wikipedia.org/wiki/Doric_order)
Type-safe columns for Spark DataFrames!

| Spark | Maven Central | Codecov |
|:-----:|:-------------:|:--------|
| 2.4.x | Deprecated [doric_2-4_2.11 0.0.7](https://mvnrepository.com/artifact/org.hablapps/doric_2-4_2.11/0.0.7) | [codecov](https://codecov.io/gh/hablapps/doric) |
| 3.0.x | [doric_3-0_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-0_2.12/0.0.8) | [codecov](https://codecov.io/gh/hablapps/doric) |
| 3.1.x | [doric_3-1_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-1_2.12/0.0.8) | [codecov](https://codecov.io/gh/hablapps/doric) |
| 3.2.x | [doric_3-2_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-2_2.12/0.0.8) | [codecov](https://codecov.io/gh/hablapps/doric) |
| 3.3.x | [doric_3-3_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-3_2.12/0.0.8) | [codecov](https://codecov.io/gh/hablapps/doric) |
| 3.4.x | [doric_3-4_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-4_2.12/0.0.8) | [codecov](https://codecov.io/gh/hablapps/doric) |
| 3.5.x | [doric_3-5_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-5_2.12/0.0.8) | [codecov](https://codecov.io/gh/hablapps/doric) |
----

Doric offers type-safety in DataFrame column expressions at a minimum
cost, without compromising performance. In particular, doric allows you
to:

* Get rid of malformed column expressions at compile time
* Avoid implicit type castings
* Run DataFrames only when it is safe to do so
* Get all errors at once
* Modularize your business logic

You'll get all these goodies:

* Without resorting to Datasets and sacrificing performance, i.e. sticking to DataFrames
* With minimal learning curve: almost no change in your code with respect to conventional column expressions
* Without fully committing to a strong static typing discipline throughout all your code

## User guide
Please check out this [notebook](notebooks/README.ipynb) for usage
examples and rationale (also available through the
[binder](https://mybinder.org/v2/gh/hablapps/doric/HEAD?filepath=notebooks/README.ipynb)
link).

You can also check our [documentation page](https://www.hablapps.com/doric/).
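
To give a flavour of what "type-safe column expressions" means in practice, here is a minimal sketch. It assumes Spark and doric are on the classpath and uses `colInt`, `colString` and `.lit` as we understand them from the doric documentation; the object name `DoricSketch` and the helper `addDoubled` are made up for illustration. Treat it as a starting point and check the notebook for the authoritative examples.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import doric._

// Hypothetical example object; not part of doric itself.
object DoricSketch {

  // colInt("value") carries the expected type (Int) in the expression itself,
  // so multiplying it by an Int literal type-checks.
  def addDoubled(df: DataFrame): DataFrame =
    df.withColumn("doubled", colInt("value") * 2.lit)

  def main(args: Array[String]): Unit = {
    // Local session just for the sketch.
    val spark = SparkSession
      .builder()
      .master("local[1]")
      .appName("doric-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = List(1, 2, 3).toDF("value")
    addDoubled(df).show()

    // Mixing types should be rejected by the compiler instead of being
    // implicitly cast, e.g. (left commented out on purpose):
    //   df.withColumn("bad", colInt("value") * true.lit)

    // A column whose name or type does not match the schema should be
    // reported when the expression is checked against the DataFrame, e.g.:
    //   df.select(colString("value"))   // "value" is an integer column

    spark.stop()
  }
}
```

The two commented-out lines are the interesting part: the first corresponds to "get rid of malformed column expressions at compile time", the second to "run DataFrames only when it is safe to do so".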
## Installation
Fetch the JAR from Maven:
_Sbt_
```scala
libraryDependencies += "org.hablapps" %% "doric_3-2" % "0.0.8"
```
_Maven_
```xml
<dependency>
  <groupId>org.hablapps</groupId>
  <artifactId>doric_3-2_2.12</artifactId>
  <version>0.0.8</version>
</dependency>
```
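
The artifact name encodes the Spark minor version (`doric_3-2` for Spark 3.2.x), so targeting another Spark line should only require changing that suffix. A possible `build.sbt` fragment for Spark 3.5.x is sketched below; the `sparkVersion` and `doricVersion` value names, and declaring Spark as `Provided`, are assumptions for illustration rather than project requirements.

```scala
// build.sbt (sketch)
val sparkVersion = "3.5.5" // assumption: the Spark version you deploy against
val doricVersion = "0.0.8"

libraryDependencies ++= Seq(
  // Spark itself is usually supplied by the cluster, hence Provided (assumption).
  "org.apache.spark" %% "spark-sql" % sparkVersion % Provided,
  // %% appends the Scala binary version, e.g. doric_3-5_2.12 or doric_3-5_2.13.
  "org.hablapps" %% "doric_3-5" % doricVersion
)
```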
`Doric` depends on Spark internals, and it's been tested against the
following Spark versions.

| Spark | Scala | Tested | doric |
|:------------------:|:-----------:|:------:|:-----:|
| 2.4.x (Deprecated) | 2.11 | ✅ | [doric_2-4_2.11 0.0.7](https://mvnrepository.com/artifact/org.hablapps/doric_2-4_2.11/0.0.7) |
| 3.0.0 | 2.12 | ✅ | You can use 3.0.2 version |
| 3.0.1 | 2.12 | ✅ | You can use 3.0.2 version |
| 3.0.2 | 2.12 | ✅ | [doric_3-0_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-0_2.12/0.0.8) |
| 3.1.0 | 2.12 | ✅ | You can use 3.1.2 version |
| 3.1.1 | 2.12 | ✅ | You can use 3.1.2 version |
| 3.1.2 | 2.12 | ✅ | [doric_3-1_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-1_2.12/0.0.8) |
| 3.2.0 | 2.12 / 2.13 | ✅ | You can use 3.2.4 version |
| 3.2.1 | 2.12 / 2.13 | ✅ | You can use 3.2.4 version |
| 3.2.2 | 2.12 / 2.13 | ✅ | You can use 3.2.4 version |
| 3.2.3 | 2.12 / 2.13 | ✅ | You can use 3.2.4 version |
| 3.2.4 | 2.12 | ✅ | [doric_3-2_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-2_2.12/0.0.8) |
| 3.2.4 | 2.13 | ✅ | [doric_3-2_2.13 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-2_2.13/0.0.8) |
| 3.3.0 | 2.12 / 2.13 | ✅ | You can use 3.3.4 version |
| 3.3.1 | 2.12 / 2.13 | ✅ | You can use 3.3.4 version |
| 3.3.2 | 2.12 / 2.13 | ✅ | You can use 3.3.4 version |
| 3.3.3 | 2.12 / 2.13 | ✅ | You can use 3.3.4 version |
| 3.3.4 | 2.12 | ✅ | [doric_3-3_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-3_2.12/0.0.8) |
| 3.3.4 | 2.13 | ✅ | [doric_3-3_2.13 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-3_2.13/0.0.8) |
| 3.4.0 | 2.12 / 2.13 | ✅ | You can use 3.4.4 version |
| 3.4.1 | 2.12 / 2.13 | ✅ | You can use 3.4.4 version |
| 3.4.2 | 2.12 / 2.13 | ✅ | You can use 3.4.4 version |
| 3.4.3 | 2.12 / 2.13 | ✅ | You can use 3.4.4 version |
| 3.4.4 | 2.12 | ✅ | [doric_3-4_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-4_2.12/0.0.8) |
| 3.4.4 | 2.13 | ✅ | [doric_3-4_2.13 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-4_2.13/0.0.8) |
| 3.5.0 | 2.12 / 2.13 | ✅ | You can use 3.5.5 version |
| 3.5.1 | 2.12 / 2.13 | ✅ | You can use 3.5.5 version |
| 3.5.2 | 2.12 / 2.13 | ✅ | You can use 3.5.5 version |
| 3.5.3 | 2.12 / 2.13 | ✅ | You can use 3.5.5 version |
| 3.5.4 | 2.12 / 2.13 | ✅ | You can use 3.5.5 version |
| 3.5.5 | 2.12 | ✅ | [doric_3-5_2.12 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-5_2.12/0.0.8) |
| 3.5.5 | 2.13 | ✅ | [doric_3-5_2.13 0.0.8](https://mvnrepository.com/artifact/org.hablapps/doric_3-5_2.13/0.0.8) |

## Contributing
Doric aims to offer a type-safe version of the whole Spark Column API.
Please check the list of [open issues](https://github.com/hablapps/doric/issues) and help us achieve that goal!

Please read the [contribution guide](CONTRIBUTING.md) 📋