{"id":18046611,"url":"https://github.com/ltla/byteme","last_synced_at":"2026-04-01T21:18:05.063Z","repository":{"id":42991588,"uuid":"441788984","full_name":"LTLA/byteme","owner":"LTLA","description":"C++ utilities for simple buffered inputs.","archived":false,"fork":false,"pushed_at":"2026-03-25T01:07:39.000Z","size":3212,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2026-03-26T07:20:03.375Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://ltla.github.io/byteme/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LTLA.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-12-26T01:38:12.000Z","updated_at":"2026-03-25T01:05:56.000Z","dependencies_parsed_at":"2025-03-17T18:23:30.549Z","dependency_job_id":"28e3abcd-c56d-4d68-b51e-e3578b710133","html_url":"https://github.com/LTLA/byteme","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/LTLA/byteme","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbyteme","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbyteme/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbyteme/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbyteme/manifests","owner_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/owners/LTLA","download_url":"https://codeload.github.com/LTLA/byteme/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Fbyteme/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31292127,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T21:15:39.731Z","status":"ssl_error","status_checked_at":"2026-04-01T21:15:34.046Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-30T19:08:10.694Z","updated_at":"2026-04-01T21:18:05.055Z","avatar_url":"https://github.com/LTLA.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gimme some bytes \n\n![Unit tests](https://github.com/LTLA/byteme/actions/workflows/run-tests.yaml/badge.svg)\n![Documentation](https://github.com/LTLA/byteme/actions/workflows/doxygenate.yaml/badge.svg)\n[![codecov](https://codecov.io/gh/LTLA/byteme/branch/master/graph/badge.svg?token=7I3UBJLHSO)](https://codecov.io/gh/LTLA/byteme)\n\n## Overview\n\nThis library implements a few functors to read buffered inputs from uncompressed or Gzip-compressed files or buffers.\nClasses can be exchanged at compile- or run-time to easily re-use the same code across different input sources.\nThe aim is to consolidate some common boilerplate across several projects, e.g., 
[**tatami**](https://github.com/LTLA/tatami), [**singlepp**](https://github.com/LTLA/singlepp).\nInterfacing with Zlib is particularly fiddly and I don't want to be forced to remember how to do it in each project.\n\n## Usage\n\nTo read bytes, create an instance of the desired `Reader` class and loop until no bytes remain in the source.\n\n```cpp\n#include \"byteme/byteme.hpp\"\n\nconst char* filepath = \"input.gz\";\nbyteme::GzipFileReader reader(filepath, {}); \n\nstd::vector\u003cunsigned char\u003e buffer(20);\nwhile (1) {\n    // read() returns the number of bytes that were actually read into the buffer.\n    auto num_read = reader.read(buffer.data(), buffer.size());\n\n    /* Do something with the available bytes in the buffer */\n\n    if (num_read \u003c buffer.size()) {\n        // If fewer bytes are read than requested, the input is finished.\n        break;\n    }\n}\n```\n\nTo write bytes, create the desired `Writer` class and supply an array of bytes until completion.\n\n```cpp\n#include \"byteme/byteme.hpp\"\n\nstd::vector\u003cstd::string\u003e lyrics { \n    \"Kimi dake o kimi dake o\", \n    \"Suki de ita yo\",\n    \"Kaze de me ga nijinde\",\n    \"Tooku naru yo\"\n};\n\nbyteme::GzipFileWriter writer(\"something.gz\", {});\nconst char newline = '\\n';\nfor (const auto\u0026 line : lyrics) {\n    writer.write(reinterpret_cast\u003cconst unsigned char*\u003e(line.c_str()), line.size());\n    writer.write(reinterpret_cast\u003cconst unsigned char*\u003e(\u0026newline), 1);\n}\n\nwriter.finish();\n```\n\nMore details can be found in the [reference documentation](https://ltla.github.io/byteme).\n\n## Supported classes\n\nFor the readers:\n\n| Class | Description |\n|-------|-------------|\n|`RawBufferReader`| Read from an uncompressed buffer|\n|`RawFileReader`| Read from an uncompressed file|\n|`ZlibBufferReader`| Read from a Zlib-compressed buffer|\n|`GzipFileReader`| Read from a Gzip-compressed file|\n|`IstreamReader`| Read from a 
`std::istream`|\n\nFor the writers:\n\n| Class | Description |\n|-------|-------------|\n|`RawBufferWriter`| Write to an uncompressed buffer|\n|`RawFileWriter`| Write to an uncompressed file|\n|`ZlibBufferWriter`| Write to a Zlib-compressed buffer|\n|`GzipFileWriter`| Write to a Gzip-compressed file|\n|`OstreamWriter`| Write to a `std::ostream`|\n\nThe different subclasses can be switched at compile time via templating, or at run-time by exploiting the class hierarchy:\n\n```cpp\n#include \"byteme/byteme.hpp\"\n#include \u003cmemory\u003e\n\nstd::vector\u003cunsigned char\u003e input_buffer;\nauto buffer = input_buffer.data();\nsize_t length = input_buffer.size();\n\nstd::unique_ptr\u003cbyteme::Reader\u003e ptr;\nif (some_condition) {\n    ptr.reset(new byteme::ZlibBufferReader(buffer, length, {}));\n} else {\n    ptr.reset(new byteme::RawBufferReader(buffer, length));\n}\n\n// Read bytes into the output buffer from an abstract input source. \nstd::vector\u003cunsigned char\u003e output(123);\nauto available = ptr-\u003eread(output.data(), output.size());\n```\n\nMost of the `Reader` and `Writer` constructors will also accept a matching `Options` instance to fine-tune their behavior.\n\n```cpp\n// For readers.\nbyteme::ZlibBufferReaderOptions zopt;\nzopt.buffer_size = 8096;\nzopt.mode = byteme::ZlibCompressionMode::GZIP;\nbyteme::ZlibBufferReader zreader(buffer, length, zopt);\n\n// For writers.\nbyteme::ZlibBufferWriterOptions zwopt;\nzwopt.buffer_size = 8096;\nzwopt.mode = byteme::ZlibCompressionMode::DEFLATE;\nzwopt.compression_level = 9;\nbyteme::ZlibBufferWriter zwriter(zwopt);\n```\n\n## Buffered reading and writing\n\nSome applications need to access small chunks or individual bytes from the input stream.\nCalling `Reader::read()` for each request could be too expensive, e.g., if each call makes some attempt to access a storage device.\nIn such cases, users can create a `BufferedReader` class to wrap each `Reader`.\nThis will read a large chunk into a buffer from 
which smaller chunks or individual bytes can be extracted.\n\n```cpp\nauto reader = std::make_unique\u003cbyteme::GzipFileReader\u003e(filepath, {});\nbyteme::SerialBufferedReader\u003cchar\u003e pb(std::move(reader), /* buffer_size = */ 65536);\nauto valid = pb.valid();\nwhile (valid) {\n    char x = pb.get();\n    // Do something with 'x'.\n    valid = pb.advance();\n}\n```\n\nWe can also extract a range of bytes:\n\n```cpp\nauto reader = std::make_unique\u003cbyteme::GzipFileReader\u003e(filepath, {});\nbyteme::SerialBufferedReader\u003cunsigned char\u003e pb(std::move(reader), /* buffer_size = */ 65536);\nauto valid = pb.valid();\nwhile (valid) {\n    std::int32_t value;\n    auto outcome = pb.extract(reinterpret_cast\u003cunsigned char*\u003e(\u0026value), sizeof(std::int32_t)); \n    if (outcome.first != sizeof(std::int32_t)) {\n        // uh oh, not enough bytes.\n    } else {\n        // do something with the extracted integer.\n    }\n    valid = outcome.second;\n}\n```\n\nWe can even perform the reading in a separate thread via the `ParallelBufferedReader` class.\nThis allows the (possibly expensive) disk I/O operations to be performed in parallel with the user-level parsing.\n\n```cpp\nauto reader = std::make_unique\u003cbyteme::GzipFileReader\u003e(filepath, {});\nbyteme::ParallelBufferedReader\u003cchar\u003e pb(std::move(reader), /* buffer_size = */ 65536);\nauto valid = pb.valid();\nwhile (valid) {\n    char x = pb.get();\n    // Do something with 'x'.\n    valid = pb.advance();\n}\n```\n\nSimilarly, `BufferedWriter` will cache all write requests into a large buffer,\nintermittently calling `Writer::write()` to push the buffered bytes to the underlying storage.\n\n```cpp\nauto writer = std::make_unique\u003cbyteme::GzipFileWriter\u003e(filepath, {});\nbyteme::SerialBufferedWriter\u003cchar\u003e pb(std::move(writer), /* buffer_size = */ 65536);\n\nstd::string input(\"foobarwhee\");\nfor (auto i : input) { // write individual bytes.\n    pb.write(i);\n}\n\npb.write(input.c_str(), 
input.size()); // or write an array.\n\npb.finish(); // flush everything to file.\n```\n\n## Building projects\n\n### CMake using `FetchContent`\n\nIf you're using CMake, you just need to add something like this to your `CMakeLists.txt`:\n\n```cmake\ninclude(FetchContent)\n\nFetchContent_Declare(\n  byteme \n  GIT_REPOSITORY https://github.com/LTLA/byteme\n  GIT_TAG master # or any version of interest\n)\n\nFetchContent_MakeAvailable(byteme)\n```\n\nThen you can link to **byteme** to make the headers available during compilation:\n\n```cmake\n# For executables:\ntarget_link_libraries(myexe byteme)\n\n# For libraries:\ntarget_link_libraries(mylib INTERFACE byteme)\n```\n\n### CMake using `find_package()`\n\nYou can install the library by cloning a suitable version of this repository and running the following commands:\n\n```sh\nmkdir build \u0026\u0026 cd build\ncmake .. -DBYTEME_TESTS=OFF\ncmake --build . --target install\n```\n\nThen you can use `find_package()` as usual:\n\n```cmake\nfind_package(ltla_byteme CONFIG REQUIRED)\ntarget_link_libraries(mylib INTERFACE ltla::byteme)\n```\n\n### Manual\n\nIf you're not using CMake, the simple approach is to just copy the files in the `include/` subdirectory -\neither directly or with Git submodules - and include their path during compilation with, e.g., GCC's `-I`.\n\n### Adding Zlib support\n\nTo support Gzip-compressed files, we also need to link to Zlib.\nWhen using CMake, **byteme** will automatically attempt to use `find_package()` to find the system Zlib.\nIf no Zlib is found, it is skipped and no Gzip functionality is provided by the library.\nUsers can also set the `BYTEME_FIND_ZLIB` option to `OFF` to provide their own Zlib.\n\n## Further comments\n\nI thought about using C++ streams, much like how the [**zstr**](https://github.com/mateidavid/zstr) library handles Gzip (de)compression.\nHowever, I'm not very knowledgeable about the `std::istream` interface, so I decided to go with something simpler.\nJust in case, I 
did add a `byteme::IstreamReader` class so that **byteme** clients can easily leverage custom streams. \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fltla%2Fbyteme","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fltla%2Fbyteme","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fltla%2Fbyteme/lists"}