https://github.com/sksamuel/centurion

Kotlin Bigdata Toolkit

Host: GitHub
URL: https://github.com/sksamuel/centurion
Owner: sksamuel
License: apache-2.0
Created: 2013-10-16T17:10:44.000Z (over 11 years ago)
Default Branch: master
Last Pushed: 2024-07-17T18:05:09.000Z (12 months ago)
Last Synced: 2025-03-29T13:47:51.274Z (3 months ago)
Topics: bigdata, java, kotlin, orc, parquet
Language: Kotlin
Homepage:
Size: 841 KB
Stars: 330
Watchers: 20
Forks: 44
Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: changelog.md
- License: LICENSE


# Centurion

![master](https://github.com/sksamuel/centurion/workflows/master/badge.svg)

[Maven Central](http://search.maven.org/#search%7Cga%7C1%7Ccenturion)

[Snapshots](https://s01.oss.sonatype.org/content/repositories/snapshots/com/sksamuel/centurion/)

![License](https://img.shields.io/github/license/sksamuel/centurion.svg?style=plastic)

## Introduction

Centurion is a JVM toolkit, written in Kotlin, for columnar and streaming data formats.

This library allows you to read, write and convert between the following formats:

* [Apache Parquet](https://parquet.apache.org)

* [Apache Orc](https://orc.apache.org)

* [Apache Arrow IPC](https://arrow.apache.org)

* [Apache Avro](https://avro.apache.org)

See [changelog](changelog.md) for release notes.

## Schema Conversions

Centurion converts schemas between any of the supported formats by going through its own internal format.

This internal format is a superset of the functionality of all the supported formats, and is intended only as an intermediate representation for conversions.
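The idea of converting through an intermediate format can be sketched in a few lines of Kotlin. This is only an illustration of the pattern, not Centurion's API: `HubType`, `FormatCodec`, `avroCodec`, `arrowCodec` and `convert` are hypothetical names, and the type names are borrowed from the Avro and Arrow columns of the type mapping table in this section.

```kotlin
// Hypothetical stand-in for an intermediate schema type; not Centurion's real classes.
enum class HubType { STRING, INT64, BOOLEAN }

// One codec per format: a mapping into the intermediate type and a mapping back out.
class FormatCodec(
    val name: String,
    private val names: Map<HubType, String>,
) {
    // Format-specific type name -> intermediate type.
    fun toHub(typeName: String): HubType =
        names.entries.firstOrNull { it.value == typeName }?.key
            ?: error("$name has no type named '$typeName'")

    // Intermediate type -> format-specific type name.
    fun fromHub(type: HubType): String =
        names[type] ?: error("$name cannot represent $type")
}

// Type names taken from the Avro and Arrow columns of the mapping table.
val avroCodec = FormatCodec(
    "Avro",
    mapOf(HubType.STRING to "String", HubType.INT64 to "Long", HubType.BOOLEAN to "Boolean"),
)
val arrowCodec = FormatCodec(
    "Arrow",
    mapOf(HubType.STRING to "Utf8", HubType.INT64 to "Int64 Signed", HubType.BOOLEAN to "Bool"),
)

// Any-to-any conversion is two hops through the intermediate type, so each new
// format needs one codec rather than a dedicated converter per existing format.
fun convert(typeName: String, from: FormatCodec, to: FormatCodec): String =
    to.fromHub(from.toHub(typeName))
```

With these definitions, `convert("Long", avroCodec, arrowCodec)` goes from the Avro name to `HubType.INT64` and then to the Arrow name. Supporting a further format under this design means writing one more codec, with no changes to the existing ones.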

The following table shows how types map between each of the formats.

| Centurion Type  | Avro                                     | Parquet                   | Orc         | Arrow               |
|-----------------|------------------------------------------|---------------------------|-------------|---------------------|
| Strings         | String                                   | Binary (String)           | String      | Utf8                |
| UUID            | String (UUID)                            | Binary (String)           | String      | Utf8                |
| Booleans        | Boolean                                  | Boolean                   | Boolean     | Bool                |
| Int64           | Long                                     | Int64                     | Long        | Int64 Signed        |
| Int32           | Int                                      | Int32                     | Int         | Int32 Signed        |
| Int16           | N/A (Int)                                | Int32 (Signed Int16)      | Short       | Int16 Signed        |
| Int8            | N/A (Int)                                | Int32 (Signed Int8)       | Byte        | Int8 Signed         |
| Float64         | Double                                   | Double                    | Double      | FloatingPointDouble |
| Float32         | Float                                    | Float                     | Float       | FloatingPointSingle |
| Enum            | Enum                                     | Enum                      | String      | String              |
| Decimal         | Binary / Fixed with annotation _Decimal_ | Decimal(precision, scale) | Decimal     | Decimal             |
| Varchar         | Fixed                                    | N/A (String)              | Varchar     | N/A (String)        |
| TimestampMillis | Long (TimestampMillis)                   | Int64 (Timestamp)         | Timestamp   | Timestamp (Millis)  |
| TimestampMicros | Long (TimestampMicros)                   | Int64 (Timestamp)         | Unsupported | Timestamp (Micros)  |
| Map             | Map                                      | Map                       | Map         | Map                 |
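To make the table easier to query, a few of its rows can be encoded as plain data. The sketch below is illustrative only and does not use Centurion's API: `Format`, `typeTable`, `mappedType` and `nativeFormats` are hypothetical names. It also assumes that a cell written as `N/A (...)` means the format has no native equivalent and falls back to the type in parentheses, which is an interpretation of the table rather than something the README states.

```kotlin
enum class Format { AVRO, PARQUET, ORC, ARROW }

// A few rows copied from the table above, keyed by Centurion type name.
val typeTable: Map<String, Map<Format, String>> = mapOf(
    "Strings" to mapOf(
        Format.AVRO to "String",
        Format.PARQUET to "Binary (String)",
        Format.ORC to "String",
        Format.ARROW to "Utf8",
    ),
    "Int16" to mapOf(
        Format.AVRO to "N/A (Int)",
        Format.PARQUET to "Int32 (Signed Int16)",
        Format.ORC to "Short",
        Format.ARROW to "Int16 Signed",
    ),
    "TimestampMicros" to mapOf(
        Format.AVRO to "Long (TimestampMicros)",
        Format.PARQUET to "Int64 (Timestamp)",
        Format.ORC to "Unsupported",
        Format.ARROW to "Timestamp (Micros)",
    ),
)

// Looks up one cell of the table; null when the row is not encoded here.
fun mappedType(centurionType: String, format: Format): String? =
    typeTable[centurionType]?.get(format)

// Formats whose cell is neither "Unsupported" nor an "N/A (...)" fallback
// (the fallback reading is an assumption, see the note above).
fun nativeFormats(centurionType: String): Set<Format> =
    typeTable[centurionType].orEmpty()
        .filterValues { it != "Unsupported" && !it.startsWith("N/A") }
        .keys
```

For example, `nativeFormats("Int16")` leaves out Avro because its cell is `N/A (Int)`, and `nativeFormats("TimestampMicros")` leaves out Orc because its cell is `Unsupported`.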
