https://github.com/exasol/cloud-storage-extension
Exasol Cloud Storage Extension for accessing formatted data (Avro, ORC and Parquet) on public cloud storage systems
- Host: GitHub
- URL: https://github.com/exasol/cloud-storage-extension
- Owner: exasol
- License: MIT
- Created: 2018-11-22T15:20:59.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2025-11-28T10:19:40.000Z (6 months ago)
- Last Synced: 2025-11-30T17:57:02.340Z (6 months ago)
- Topics: avro, azure-blob-storage, azure-storage, cloud-storage, exasol, exasol-integration, gcs, orc, parquet, s3
- Language: Scala
- Size: 34 MB
- Stars: 8
- Watchers: 15
- Forks: 11
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
- Authors: AUTHORS.md
README
# Exasol Cloud Storage Extension

## Overview
Exasol Cloud Storage Extension provides [Exasol][exasol] user-defined functions (UDFs) for accessing formatted data stored in public cloud storage systems.
## Features
* Imports formatted data from public cloud storage systems.
* Supports the following data formats for importing: [Apache Avro][avro], [Apache ORC][orc] and [Apache Parquet][parquet].
* Allows data import from [Delta Lake](https://delta.io/).
* Supports exporting tables in Apache Parquet format to public cloud storage systems.
* Supports the following cloud storage systems: [Amazon S3][s3], [Google Cloud Storage][gcs], [Azure Blob Storage][azure-blob], [Azure Data Lake (Gen1) Storage][azure-data-lake] and [Azure Data Lake (Gen2) Storage][azure-data-lake-gen2].
* Supports [Hadoop Distributed Filesystem (HDFS)][hdfs] and [Alluxio][alluxio-overview-link] filesystems.
* Allows configuring the number of parallel importer or exporter processes.
## Information for Users
For more information, see the following guides:
* [User Guide](doc/user_guide/user_guide.md)
* [Changelog](doc/changes/changelog.md)
## Information for Contributors
* [General Developer Guide for Import-Export UDF][developer-guide]
* [Project Specific Developer Guide](doc/developers_guide/developers_guide.md)
* [Dependencies](dependencies.md)
[exasol]: https://www.exasol.com/en/
[avro]: https://avro.apache.org/
[orc]: https://orc.apache.org/
[parquet]: https://parquet.apache.org/
[s3]: https://aws.amazon.com/s3/
[gcs]: https://cloud.google.com/storage/
[azure-blob]: https://azure.microsoft.com/en-us/services/storage/blobs/
[azure-data-lake]: https://azure.microsoft.com/en-us/solutions/data-lake/
[azure-data-lake-gen2]: https://azure.microsoft.com/en-us/services/storage/data-lake-storage/
[hdfs]: https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
[alluxio-overview-link]: https://docs.alluxio.io/os/user/stable/en/Overview.html
[developer-guide]: https://github.com/exasol/import-export-udf-common-scala/blob/master/doc/development/developer_guide.md