{"id":16120486,"url":"https://github.com/sksamuel/centurion","last_synced_at":"2026-04-09T00:32:23.979Z","repository":{"id":57727087,"uuid":"13625531","full_name":"sksamuel/centurion","owner":"sksamuel","description":"Kotlin Bigdata Toolkit","archived":false,"fork":false,"pushed_at":"2025-09-09T15:40:39.000Z","size":1191,"stargazers_count":335,"open_issues_count":1,"forks_count":45,"subscribers_count":18,"default_branch":"master","last_synced_at":"2026-01-14T01:46:44.961Z","etag":null,"topics":["avro","java","kotlin","orc","parquet"],"latest_commit_sha":null,"homepage":"","language":"Kotlin","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sksamuel.png","metadata":{"files":{"readme":"README.md","changelog":"changelog.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2013-10-16T17:10:44.000Z","updated_at":"2025-12-12T13:32:04.000Z","dependencies_parsed_at":"2025-05-18T04:06:09.091Z","dependency_job_id":"9a019d89-c42b-411a-bc7d-cfacb990a262","html_url":"https://github.com/sksamuel/centurion","commit_stats":null,"previous_names":["sksamuel/centurion","sksamuel/akka-patterns","sksamuel/rxhive","sksamuel/kotlin-big-data"],"tags_count":25,"template":false,"template_full_name":null,"purl":"pkg:github/sksamuel/centurion","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sksamuel%2Fcenturion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sksamuel%2Fcenturion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sksamuel%2Fcenturion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/sksamuel%2Fcenturion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sksamuel","download_url":"https://codeload.github.com/sksamuel/centurion/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sksamuel%2Fcenturion/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31579951,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T14:31:17.711Z","status":"ssl_error","status_checked_at":"2026-04-08T14:31:17.202Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["avro","java","kotlin","orc","parquet"],"created_at":"2024-10-09T20:58:31.368Z","updated_at":"2026-04-09T00:32:23.969Z","avatar_url":"https://github.com/sksamuel.png","language":"Kotlin","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Centurion \u003cimg src=\"logo.png\" height=\"50\"\u003e\n\n![master](https://github.com/sksamuel/centurion/workflows/master/badge.svg)\n[\u003cimg src=\"https://img.shields.io/maven-central/v/com.sksamuel.centurion/centurion-avro.svg?label=latest%20release\"/\u003e](https://central.sonatype.com/artifact/com.sksamuel.centurion/centurion-avro)\n[\u003cimg 
src=\"https://img.shields.io/maven-metadata/v?metadataUrl=https%3A%2F%2Fcentral.sonatype.com%2Frepository%2Fmaven-snapshots%2Fcom%2Fsksamuel%2Fcenturion%2Fcenturion-avro%2Fmaven-metadata.xml\u0026strategy=highestVersion\u0026label=maven-snapshot\"\u003e](https://central.sonatype.com/repository/maven-snapshots/com/sksamuel/centurion/centurion-avro/maven-metadata.xml)\n![License](https://img.shields.io/github/license/sksamuel/centurion.svg?style=plastic)\n\nCenturion is a high-performance Kotlin toolkit for working with columnar and streaming data formats in a type-safe, \nidiomatic way. Built on top of proven Apache libraries, it provides zero-copy serialization, automatic code generation,\nand seamless integration with modern JVM applications.\n\n## Why Centurion?\n\n- **Type-safe by design:** Leverage Kotlin's type system with compile-time guarantees and automatic null safety\n- **Zero-copy performance:** Optimized encoders/decoders with reflection caching and pooled resources\n- **Schema evolution made easy:** First-class support for forward/backward compatible schema changes\n- **Batteries included:** Support for 40+ types out of the box including temporal types, BigDecimal, collections\n- **Production ready:** Built on Apache Avro and Parquet - battle-tested formats used at scale\n\nSee [changelog](changelog.md) for release notes.\n\n## Features\n\n- **Type-safe schema definitions:** Define schemas using Kotlin's type system with compile-time safety\n- **Multiple format support:** Seamlessly work with Avro and Parquet formats\n- **High-performance Serde API:** Zero-copy serialization with automatic compression support  \n- **Schema evolution:** Forward and backward compatible schema changes for Avro\n- **Code generation:** Generate data classes and optimized encoders/decoders from Avro schemas\n- **Redis integration:** Built-in Lettuce codecs for caching Avro data\n- **Streaming operations:** Efficient streaming readers and writers for large datasets\n- 
**Kotlin-first design:** Idiomatic APIs with null safety, data classes, and extension functions\n\n## Getting Started\n\nAdd Centurion to your build depending on which formats you need:\n\n```kotlin\n// For Avro support\nimplementation(\"com.sksamuel.centurion:centurion-avro:\u003cversion\u003e\")\n\n// For Parquet support\nimplementation(\"com.sksamuel.centurion:centurion-parquet:\u003cversion\u003e\")\n```\n\n## Quick Start\n\nHere's a complete example to get you started:\n\n```kotlin\nimport com.sksamuel.centurion.avro.io.serde.BinarySerde\nimport java.math.BigDecimal\n\n// Define your domain model\ndata class Product(\n    val id: Long,\n    val name: String,\n    val price: BigDecimal,\n    val inStock: Boolean,\n    val tags: List\u003cString\u003e\n)\n\n// Create a serde (serializer/deserializer)\nval serde = BinarySerde\u003cProduct\u003e()\n\n// Your data\nval product = Product(\n    id = 12345L,\n    name = \"Kotlin in Action\",\n    price = BigDecimal(\"39.99\"),\n    inStock = true,\n    tags = listOf(\"books\", \"programming\", \"kotlin\")\n)\n\n// Serialize to bytes\nval bytes = serde.serialize(product)\n\n// Deserialize back to object\nval restored = serde.deserialize(bytes)\nprintln(restored) // Product(id=12345, name=Kotlin in Action, ...)\n```\n\n## Avro Operations\n\n### Writing Avro Data\n\n```kotlin\nimport com.sksamuel.centurion.Schema\nimport com.sksamuel.centurion.Struct\nimport com.sksamuel.centurion.avro.io.BinaryWriter\nimport com.sksamuel.centurion.avro.encoders.ReflectionRecordEncoder\nimport com.sksamuel.centurion.avro.schemas.toAvroSchema\nimport org.apache.avro.io.EncoderFactory\nimport java.io.FileOutputStream\n\n// Define your schema\nval schema = Schema.Struct(\n  Schema.Field(\"id\", Schema.Int64),\n  Schema.Field(\"name\", Schema.Strings),\n  Schema.Field(\"timestamp\", Schema.TimestampMillis)\n)\n\n// Create some data\nval records = listOf(\n  Struct(schema, 1L, \"Alice\", System.currentTimeMillis()),\n  Struct(schema, 2L, 
\"Bob\", System.currentTimeMillis()),\n  Struct(schema, 3L, \"Charlie\", System.currentTimeMillis())\n)\n\n// Write to Avro binary format\nFileOutputStream(\"users.avro\").use { output -\u003e\n  val avroSchema = schema.toAvroSchema()\n  val writer = BinaryWriter(\n    schema = avroSchema,\n    out = output,\n    ef = EncoderFactory.get(),\n    encoder = ReflectionRecordEncoder(avroSchema, Struct::class),\n    reuse = null\n  )\n  records.forEach { writer.write(it) }\n  writer.close()\n}\n```\n\n### Reading Avro Data\n\n```kotlin\nimport com.sksamuel.centurion.avro.io.BinaryReader\nimport com.sksamuel.centurion.avro.decoders.ReflectionRecordDecoder\nimport org.apache.avro.io.DecoderFactory\nimport java.io.FileInputStream\n\n// Read from Avro binary format  \nFileInputStream(\"users.avro\").use { input -\u003e\n  val avroSchema = schema.toAvroSchema()\n  val reader = BinaryReader(\n    schema = avroSchema,\n    input = input,\n    factory = DecoderFactory.get(),\n    decoder = ReflectionRecordDecoder(avroSchema, Struct::class),\n    reuse = null\n  )\n  // BinaryReader reads one record per file\n  val struct = reader.read()\n  println(\"User: ${struct[\"name\"]}, ID: ${struct[\"id\"]}\")\n}\n```\n\n## Parquet Operations\n\n### Writing Parquet Data\n\n```kotlin\nimport com.sksamuel.centurion.parquet.Parquet\nimport org.apache.hadoop.conf.Configuration\nimport org.apache.hadoop.fs.Path\n\n// Define schema and data\nval schema = Schema.Struct(\n  Schema.Field(\"product_id\", Schema.Strings),\n  Schema.Field(\"quantity\", Schema.Int32),\n  Schema.Field(\"price\", Schema.Decimal(Schema.Precision(10), Schema.Scale(2)))\n)\n\nval data = listOf(\n  Struct(schema, \"PROD-001\", 10, java.math.BigDecimal(\"29.99\")),\n  Struct(schema, \"PROD-002\", 5, java.math.BigDecimal(\"15.50\")),\n  Struct(schema, \"PROD-003\", 20, java.math.BigDecimal(\"8.75\"))\n)\n\n// Write to Parquet\nval path = Path(\"sales.parquet\")\nval conf = Configuration()\nval writer = Parquet.writer(path, 
schema, conf)\n\ndata.forEach { struct -\u003e\n  writer.write(struct)\n}\nwriter.close()\n```\n\n### Reading Parquet Data\n\n```kotlin\nimport com.sksamuel.centurion.parquet.Parquet\n\n// Read from Parquet\nval path = Path(\"sales.parquet\")\nval conf = Configuration()\nval reader = Parquet.reader(path, conf)\n\nvar struct = reader.read()\nwhile (struct != null) {\n  println(\"Product: ${struct[\"product_id\"]}, Qty: ${struct[\"quantity\"]}\")\n  struct = reader.read()\n}\nreader.close()\n\n// Count records efficiently\nval recordCount = Parquet.count(listOf(path), conf)\nprintln(\"Total records: $recordCount\")\n```\n\n## Schema Conversion\n\nConvert between different format schemas:\n\n```kotlin\nimport com.sksamuel.centurion.avro.schemas.toAvroSchema\nimport com.sksamuel.centurion.parquet.schemas.ToParquetSchema\n\n// Convert Centurion schema to Avro schema\nval centurionSchema = Schema.Struct(\n  Schema.Field(\"name\", Schema.Strings),\n  Schema.Field(\"age\", Schema.Int32)\n)\n\nval avroSchema = centurionSchema.toAvroSchema()\n\n// Convert to Parquet schema\nval parquetSchema = ToParquetSchema.toParquetType(centurionSchema)\n```\n\n## Advanced Types\n\n### Working with Complex Types\n\n```kotlin\n// Array/List schema\nval numbersSchema = Schema.Array(Schema.Int32)\n\n// Map schema\nval metadataSchema = Schema.Map(Schema.Strings) // String keys, String values\n\n// Nested struct\nval addressSchema = Schema.Struct(\n  Schema.Field(\"street\", Schema.Strings),\n  Schema.Field(\"city\", Schema.Strings),\n  Schema.Field(\"zipcode\", Schema.Strings)\n)\n\nval personSchema = Schema.Struct(\n  Schema.Field(\"name\", Schema.Strings),\n  Schema.Field(\"address\", addressSchema),\n  Schema.Field(\"phone_numbers\", Schema.Array(Schema.Strings))\n)\n```\n\n### Temporal Types\n\n```kotlin\n// Timestamp types\nval eventSchema = Schema.Struct(\n  Schema.Field(\"event_name\", Schema.Strings),\n  Schema.Field(\"timestamp_millis\", Schema.TimestampMillis),\n  
Schema.Field(\"timestamp_micros\", Schema.TimestampMicros)\n)\n\n// Create struct with temporal data\nval event = Struct(\n  eventSchema,\n  \"user_login\",\n  System.currentTimeMillis(),\n  System.currentTimeMillis() * 1000\n)\n```\n\n### Decimal Precision\n\n```kotlin\n// High-precision decimal for financial data\nval transactionSchema = Schema.Struct(\n  Schema.Field(\"transaction_id\", Schema.Strings),\n  Schema.Field(\"amount\", Schema.Decimal(\n    Schema.Precision(18), // 18 total digits\n    Schema.Scale(4)       // 4 decimal places\n  ))\n)\n\nval transaction = Struct(\n  transactionSchema,\n  \"TXN-123456\",\n  java.math.BigDecimal(\"1234.5678\")\n)\n```\n\n## Supported Types\n\nCenturion provides built-in encoders and decoders for a comprehensive set of types:\n\n### Avro Type Support\n\n| Type | Encoder/Decoder | Notes |\n|------|-----------------|-------|\n| **Primitives** | | |\n| `Byte`, `Short`, `Int`, `Long` | ✓ | Direct mapping to Avro types |\n| `Float`, `Double` | ✓ | IEEE 754 floating point |\n| `Boolean` | ✓ | |\n| `String` | ✓ | UTF-8 encoded, optimized with `globalUseJavaString` |\n| **Temporal Types** | | |\n| `Instant` | ✓ | TimestampMillis/TimestampMicros logical types |\n| `LocalDateTime` | ✓ | LocalTimestampMillis/LocalTimestampMicros |\n| `LocalTime` | ✓ | TimeMillis/TimeMicros logical types |\n| `OffsetDateTime` | ✓ | Converted to Instant |\n| **Numeric Types** | | |\n| `BigDecimal` | ✓ | Bytes/Fixed/String encodings with scale |\n| `UUID` | ✓ | String or fixed byte encoding |\n| **Collections** | | |\n| `List\u003cT\u003e`, `Set\u003cT\u003e` | ✓ | Generic support for any element type |\n| `Array\u003cT\u003e` | ✓ | Native array support |\n| `LongArray`, `IntArray` | ✓ | Optimized primitive arrays |\n| `Map\u003cString, T\u003e` | ✓ | String keys required by Avro |\n| **Binary** | | |\n| `ByteArray` | ✓ | Direct bytes type |\n| `ByteBuffer` | ✓ | Zero-copy when possible |\n| **Enums** | ✓ | Kotlin enum classes |\n| **Nullable Types** 
| ✓ | Full Kotlin null-safety support |\n| **Data Classes** | ✓ | Via reflection or code generation |\n\n## High-Performance Serde API\n\nThe Serde (Serializer/Deserializer) API provides a convenient way to convert between Kotlin objects and byte arrays with minimal overhead:\n\n```kotlin\nimport com.sksamuel.centurion.avro.io.serde.BinarySerde\n\n// Create a serde for your data class\ndata class User(val id: Long, val name: String, val email: String?)\n\nval serde = BinarySerde\u003cUser\u003e()\n\n// Serialize to bytes\nval user = User(123L, \"Alice\", \"alice@example.com\")\nval bytes = serde.serialize(user)\n\n// Deserialize from bytes\nval decoded = serde.deserialize(bytes)\n```\n\n### Compression Support\n\nApply compression transparently with `CompressingSerde`:\n\n```kotlin\nimport com.sksamuel.centurion.avro.io.serde.CompressingSerde\nimport org.apache.avro.file.CodecFactory\n\nval serde = CompressingSerde(\n    codec = CodecFactory.snappyCodec().createInstance(),\n    serde = BinarySerde\u003cUser\u003e()\n)\n\n// Automatically compresses on serialize, decompresses on deserialize\nval compressed = serde.serialize(user)\n```\n\n### Serde Factory Pattern\n\nFor applications managing multiple schemas:\n\n```kotlin\nimport com.sksamuel.centurion.avro.io.serde.SerdeFactory\nimport com.sksamuel.centurion.avro.io.serde.CachedSerdeFactory\n\n// Cache serde instances for reuse\nval factory = CachedSerdeFactory(SerdeFactory())\nval userSerde = factory.create\u003cUser\u003e()\nval orderSerde = factory.create\u003cOrder\u003e()\n```\n\n## Error Handling\n\nCenturion provides detailed error messages for schema mismatches and data validation:\n\n```kotlin\ntry {\n  // This will fail - wrong number of values\n  val invalidStruct = Struct(userSchema, 123L, \"John\") // Missing email and age\n} catch (e: IllegalArgumentException) {\n  println(\"Schema validation error: ${e.message}\")\n  // Output: Schema size 4 != values size 2\n}\n\ntry {\n  // This will fail - field 
doesn't exist\n  val value = user[\"nonexistent_field\"]\n} catch (e: IllegalStateException) {\n  println(\"Field access error: ${e.message}\")\n}\n```\n\n## Schema Evolution\n\nCenturion provides robust support for schema evolution, allowing your data formats to evolve over time without breaking compatibility:\n\n```kotlin\nimport com.sksamuel.centurion.avro.io.BinaryReader\nimport com.sksamuel.centurion.avro.io.BinaryWriter\nimport com.sksamuel.centurion.avro.encoders.ReflectionRecordEncoder\nimport com.sksamuel.centurion.avro.decoders.ReflectionRecordDecoder\nimport org.apache.avro.Schema\nimport org.apache.avro.SchemaBuilder\nimport org.apache.avro.io.DecoderFactory\nimport org.apache.avro.io.EncoderFactory\nimport java.io.FileInputStream\nimport java.io.FileOutputStream\n\n// Original schema\nval writerSchema = SchemaBuilder.record(\"User\").fields()\n    .requiredString(\"name\")\n    .requiredLong(\"id\")\n    .endRecord()\n\n// Evolved schema with new field\nval readerSchema = SchemaBuilder.record(\"User\").fields()\n    .requiredString(\"name\")\n    .requiredLong(\"id\")\n    .name(\"email\").type(Schema.create(Schema.Type.STRING)).withDefault(\"\")\n    .endRecord()\n\n// Old data can be read with new schema\ndata class UserV1(val name: String, val id: Long)\ndata class UserV2(val name: String, val id: Long, val email: String)\n\n// Write with old schema\nval output = FileOutputStream(\"user.avro\")\nval writer = BinaryWriter(\n  schema = writerSchema,\n  out = output,\n  ef = EncoderFactory.get(),\n  encoder = ReflectionRecordEncoder(writerSchema, UserV1::class),\n  reuse = null\n)\nwriter.write(UserV1(\"Alice\", 123L))\nwriter.close()\n\n// Read with new schema - email gets default value\nval input = FileInputStream(\"user.avro\")\nval reader = BinaryReader(\n  writerSchema = writerSchema,\n  readerSchema = readerSchema,\n  input = input,\n  factory = DecoderFactory.get(),\n  decoder = ReflectionRecordDecoder(readerSchema, UserV2::class),\n  reuse = 
null\n)\nval user: UserV2 = reader.read() // UserV2(\"Alice\", 123L, \"\")\n```\n\n## Redis Integration\n\nCenturion provides Redis codecs via the `centurion-avro-lettuce` module for high-performance caching:\n\n```kotlin\nimport com.sksamuel.centurion.avro.lettuce.GenericRecordCodec\nimport com.sksamuel.centurion.avro.lettuce.ReflectionDataClassCodec\nimport io.lettuce.core.RedisClient\nimport io.lettuce.core.codec.RedisCodec\nimport io.lettuce.core.codec.StringCodec\nimport org.apache.avro.io.DecoderFactory\nimport org.apache.avro.io.EncoderFactory\n\n// For Kotlin data classes\ndata class User(val id: Long, val name: String)\n\nval dataClassCodec = ReflectionDataClassCodec\u003cUser\u003e(\n    encoderFactory = EncoderFactory.get(),\n    decoderFactory = DecoderFactory.get(),\n    kclass = User::class\n)\n\n// For generic Avro records\nval recordCodec = GenericRecordCodec(\n    schema = myAvroSchema,\n    encoderFactory = EncoderFactory.get(),\n    decoderFactory = DecoderFactory.get()\n)\n\n// Use with Redis (key as String, value as User)\nval client = RedisClient.create(\"redis://localhost\")\nval connection = client.connect(\n    RedisCodec.of(StringCodec.UTF8, dataClassCodec)\n)\nval commands = connection.sync()\n\ncommands.set(\"user:123\", User(123L, \"Alice\"))\nval user = commands.get(\"user:123\")\n```\n\n## Gradle Plugin for Code Generation\n\nGenerate Kotlin data classes, encoders, and decoders from Avro schemas at build time:\n\n```kotlin\n// build.gradle.kts\nplugins {\n    id(\"com.sksamuel.centurion.avro\") version \"\u003cversion\u003e\"\n}\n\n// The plugin registers three tasks:\n\n// Generate data classes from Avro schemas\ntasks.generateDataClasses {\n    directory.set(\"src/main/avro\")\n}\n\n// Generate optimized encoders\ntasks.generateEncoders {\n    directory.set(\"src/main/avro\")\n}\n\n// Generate optimized decoders\ntasks.generateDecoders {\n    directory.set(\"src/main/avro\")\n}\n\n// Run code generation from the command line:\n// ./gradlew generateDataClasses generateEncoders 
generateDecoders\n```\n\n## Performance Optimizations\n\nCenturion includes several performance optimizations:\n\n### Reflection Caching\n- Uses `LambdaMetafactory` and `MethodHandles` for fast field access\n- Caches enum constant mappings\n- Optimizes primitive type handling\n\n### Resource Pooling\n```kotlin\n// Reuse an existing binary encoder instead of allocating a new one per writer\nval avroWriter = BinaryWriter(\n    schema = schema,\n    out = output,\n    ef = EncoderFactory.get(),\n    encoder = encoder,\n    reuse = myEncoder\n)\n\n// Create a Parquet writer once and reuse it for every record\nval parquetWriter = Parquet.writer(path, schema, conf).apply {\n    // Writer configuration\n}\n```\n\n### Streaming Processing\n```kotlin\n// Stream large Parquet files without loading into memory\nval reader = Parquet.reader(path, conf)\nreader.sequence().forEach { struct -\u003e\n    // Process one record at a time\n}\n```\n\n## Configuration Options\n\n### Parquet Writer Settings\n\n| Option | Default | Description |\n|--------|---------|-------------|\n| `compressionCodec` | `SNAPPY` | Compression algorithm: UNCOMPRESSED, SNAPPY, GZIP, LZO, BROTLI, LZ4, ZSTD |\n| `dictionaryEncoding` | `true` | Enable dictionary encoding for string columns |\n| `rowGroupSize` | `134217728` | Row group size in bytes (128MB) |\n| `pageSize` | `1048576` | Page size in bytes (1MB) |\n| `writerVersion` | `PARQUET_1_0` | Parquet format version |\n| `validation` | `true` | Validate written data |\n\n```kotlin\nval settings = ParquetWriterSettings(\n    compressionCodec = CompressionCodecName.ZSTD,\n    dictionaryEncoding = true,\n    rowGroupSize = 256 * 1024 * 1024, // 256MB\n    pageSize = 2 * 1024 * 1024 // 2MB\n)\n\nval writer = Parquet.writer(path, conf, schema, settings = settings)\n```\n\n## Performance Tips\n\n- **Reuse readers/writers** when processing multiple files with the same schema\n- **Use streaming APIs** for large datasets to avoid loading everything into memory\n- **Choose appropriate compression** for Parquet files based on your data characteristics\n- **Batch operations** when writing multiple records to improve throughput\n- 
**Enable `globalUseJavaString`** for Avro when working primarily with Java strings\n- **Use primitive array types** (`LongArray`, `IntArray`) instead of boxed collections\n\n## When to Use Centurion\n\nCenturion shines in scenarios where you need:\n\n- **High-performance serialization** with minimal overhead for Kotlin/JVM applications\n- **Type-safe data persistence** with compile-time guarantees\n- **Schema evolution support** for long-lived data formats\n- **Integration with big data tools** (Spark, Hadoop, Hive)\n- **Redis caching** of complex domain objects\n\n### Comparison with Alternatives\n\n| Feature | Centurion | Protocol Buffers | JSON | Apache Avro (Direct) |\n|---------|-----------|------------------|------|---------------------|\n| **Kotlin-first API** | ✓ Idiomatic | ✗ Java-style | ✗ Manual parsing | ✗ Java API |\n| **Type Safety** | ✓ Compile-time | ✓ Code generation | ✗ Runtime | ✗ Runtime |\n| **Schema Evolution** | ✓ Full support | ✓ Limited | ✗ None | ✓ Full support |\n| **Performance** | ✓ Optimized | ✓ Fast | ✗ Slower | ✓ Fast |\n| **File Size** | ✓ Compact | ✓ Compact | ✗ Larger | ✓ Compact |\n| **Human Readable** | ✗ Binary | ✗ Binary | ✓ Yes | ✗ Binary |\n| **Big Data Integration** | ✓ Native | ✗ Limited | ✓ Common | ✓ Native |\n\n## Common Issues and Solutions\n\n### Schema Mismatch Errors\n\n```kotlin\n// Problem: Field name mismatch\ndata class User(val username: String) // Schema expects \"name\"\n\n// Solution: Use @AvroName annotation or match schema exactly\ndata class User(@AvroName(\"name\") val username: String)\n```\n\n### Performance Issues\n\n```kotlin\n// Problem: Creating new serde for each operation\nfun processUser(user: User) {\n    val serde = BinarySerde\u003cUser\u003e() // Don't do this repeatedly\n    // ...\n}\n\n// Solution: Reuse serde instances\nclass UserService {\n    private val serde = BinarySerde\u003cUser\u003e() // Create once\n    \n    fun processUser(user: User) {\n        val bytes = 
serde.serialize(user)\n        // ...\n    }\n}\n```\n\n### Memory Issues with Large Files\n\n```kotlin\n// Problem: Loading entire file into memory\nval allRecords = reader.readAll() // May cause OOM\n\n// Solution: Use streaming\nreader.sequence().forEach { record -\u003e\n    // Process one at a time\n}\n```\n\n## Modules\n\n| Module | Description |\n|--------|-------------|\n| `centurion-schemas` | Core schema definitions and Struct implementations |\n| `centurion-avro` | Avro format support with binary and data file I/O |\n| `centurion-parquet` | Parquet format support with Hadoop integration |\n| `centurion-avro-lettuce` | Redis integration for Avro serialization |\n| `centurion-avro-gradle-plugin` | Gradle plugin for code generation from Avro schemas |\n\n## License\n\n```\nThis software is licensed under the Apache 2 license, quoted below.\n\nCopyright 2024 Stephen Samuel\n\nLicensed under the Apache License, Version 2.0 (the \"License\"); you may not\nuse this file except in compliance with the License. You may obtain a copy of\nthe License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT\nWARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the\nLicense for the specific language governing permissions and limitations under\nthe License.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsksamuel%2Fcenturion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsksamuel%2Fcenturion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsksamuel%2Fcenturion/lists"}