An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with orc

A curated list of projects in awesome lists tagged with orc.

https://github.com/apache/orc

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads

apache big-data cpp java orc

Last synced: 08 Jan 2026

https://github.com/sksamuel/centurion

Kotlin Bigdata Toolkit

avro java kotlin orc parquet

Last synced: 09 Apr 2026

https://github.com/Eugene-Mark/bigdata-file-viewer

A cross-platform (Windows, macOS, Linux) desktop application for viewing common big-data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.

avro bigdata hdfs orc parquet

Last synced: 20 Nov 2025

https://github.com/51zero/eel-sdk

Big Data Toolkit for the JVM

big-data etl hadoop hive kafka kudu orc parquet scala

Last synced: 13 Apr 2025

https://github.com/noirello/pyorc

Python module for Apache ORC file format

apache-orc orc python3

Last synced: 21 Oct 2025

https://github.com/ordbase/generative-orc-721

Documentation for the proposed Generative ORC-721 Protocol / Standard for Bitcoin & Co. (Also Known As OG, Ordgen, Ordinal Generative)

bitcoin brc diybirdies diycoolcats diypunks generative og orc orc-721 ordgen ordinals ordlite pixelart punks

Last synced: 27 Mar 2026

https://github.com/dbiir/paraflow

A real-time analytical system for ID-associated data

hadoop kafka orc parquet presto spark-sql

Last synced: 13 Mar 2026

https://github.com/k-orc/openstack-resource-controller

A set of Kubernetes controllers to manage your OpenStack infrastructure

controller kubernetes openstack orc

Last synced: 15 Dec 2025

https://github.com/eclecticlogic/eclectic-orc

Annotation-driven Java object writer for ORC with runtime code generation for speed.

converter java orc writer

Last synced: 01 Feb 2026

https://github.com/iflycn/hero

Quiz-answering assistant for 百万英雄 (Million Heroes) - compatible with all quiz apps

adb android crawler orc python3

Last synced: 09 Jul 2025

https://github.com/apache/orc-format

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads

apache big-data cpp java orc

Last synced: 09 Apr 2025

https://github.com/nlpoptimize/formatflex

🚀 A flexible Python library for easy handling and conversion of Hierarchical, Tabular, and Serialized data formats.

bson cbor cloudpickle csv dill excel feather hdf5 joblib json messagepack orc parquet pickle python toml ubjson xml yaml

Last synced: 06 May 2026

https://github.com/exasol/cloud-storage-extension

Exasol Cloud Storage Extension for accessing data in Avro, ORC, and Parquet formats on public cloud storage systems

avro azure-blob-storage azure-storage cloud-storage exasol exasol-integration gcs orc parquet s3

Last synced: 12 Feb 2026

https://github.com/gordonmurray/apache_flink_and_paimon

Trying out Apache Paimon with Apache Flink using Docker Compose

apache-flink orc paimon s3

Last synced: 24 Aug 2025

https://github.com/XCollab/NoteMaster-AI

NoteMaster AI's FastAPI component provides a robust backend service for transforming photos into structured notes using AI. It handles image processing, text extraction, and note generation, offering a RESTful API for seamless integration into various applications.

ai ai-notes app g4f imagetotext llm notes notes-app orc python3 streamlit streamlit-webapp

Last synced: 04 Apr 2025

https://github.com/shutterstock/orc-metadata-reader

Python ORC metadata reader

hadoop metadata orc python

Last synced: 13 Apr 2025

https://github.com/weliveindetail/blog

Sporadic details on compilers, code and tooling from the world of LLVM

cpp jit lldb llvm native orc remote

Last synced: 12 Apr 2025

https://github.com/nim-works/arc

a hack to access reference counters with atomics

arc atomics concurrency continuations counting isolate memory nim orc ref reference threads

Last synced: 13 Jun 2025

https://github.com/xbony2/atom-orc

Atom package for Orc syntax highlighting.

atom atom-package orc

Last synced: 11 Feb 2026

https://github.com/igor-suhorukov/arrow_to_database

Import data from Arrow Dataset API into relational DB via JDBC

arrow h2-database jdbc-connector orc parquet postgresql questdb

Last synced: 13 Apr 2026

https://github.com/davidkhala/data

Just an index of data indexes

arrow avro lakehouse notebook orc

Last synced: 05 Oct 2025

https://github.com/karolkiljan/krux

An orc from the mine. Says little. Knows a lot. Helps.

claude-code orc plugin polish token-compression

Last synced: 23 Apr 2026

https://github.com/silvanheller/parquet-demo

Parquet demo project for the Workshop in the Course DIS. Benchmarks Parquet versus ORC, JSON and CSV

benchmark orc parquet r scala spark university-project

Last synced: 16 Apr 2026

https://github.com/yaceychen/nimsync

🚀 Implement lock-free SPSC and MPSC channels in Nim, ensuring production-grade performance with thorough benchmarking and industry standards compliance.

async backpressure channels chronos concurrency high-performance lockfree low-latency mpmc nim orc runtime spsc structured-concurrency zero-gc

Last synced: 01 Jun 2026

https://github.com/aikuyun/orcdemo

ORC use cases, guides, study materials, and references

bigdata orc

Last synced: 23 Jul 2025

https://github.com/spektom/data-formats-samples

Spark-based generator of sample files in different data formats

avro json orc parquet spark

Last synced: 06 Jul 2025

https://github.com/mattpopovich/dataframeioperformancetesting

Tests the speed and file size of reading and writing DataFrames to/from disk with different file and compression types

csv csv-format feather file-io hdf5 hdf5-format orc orc-format pandas pandas-dataframe pandas-python parquet parquet-files parquet-format pickle pickle-file python python3

Last synced: 07 Oct 2025

https://github.com/DevRopen/Ragnar

🦀 Just Another Rust Crate!

clivern is-up orc orc-rs rust rust-crate

Last synced: 10 May 2025

https://github.com/barlou/tools

Reusable Python tools for data engineering pipelines — cloud storage client (AWS S3, OVH), structured logging with cloud flush strategies, and Hive-partitioned Parquet/ORC archiving. Built for Airflow tasks and RL training workloads.

airflow aws-s3 cloud-storage data-engineering github-actions logging orc ovh parquet python

Last synced: 03 May 2026