{"id":19427127,"url":"https://github.com/anergictcell/s3reader","last_synced_at":"2025-04-24T17:31:21.105Z","repository":{"id":57751553,"uuid":"524770637","full_name":"anergictcell/s3reader","owner":"anergictcell","description":"A Rust library for random access to S3 objects","archived":false,"fork":false,"pushed_at":"2024-05-01T08:17:37.000Z","size":20,"stargazers_count":7,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-17T12:57:08.082Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anergictcell.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-08-14T20:15:17.000Z","updated_at":"2024-05-03T15:15:44.000Z","dependencies_parsed_at":"2024-11-10T14:20:43.212Z","dependency_job_id":null,"html_url":"https://github.com/anergictcell/s3reader","commit_stats":{"total_commits":15,"total_committers":1,"mean_commits":15.0,"dds":0.0,"last_synced_commit":"55b4284b6bb33444085d057ec7e6da2a8c7d81e5"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anergictcell%2Fs3reader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anergictcell%2Fs3reader/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anergictcell%2Fs3reader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anergictcell%2Fs3reader/manifests","owner_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anergictcell","download_url":"https://codeload.github.com/anergictcell/s3reader/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250674301,"owners_count":21469194,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T14:10:30.606Z","updated_at":"2025-04-24T17:31:20.863Z","avatar_url":"https://github.com/anergictcell.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build](https://github.com/anergictcell/s3reader/actions/workflows/build.yml/badge.svg)](https://github.com/anergictcell/s3reader/actions/workflows/build.yml)\n[![crates.io](https://img.shields.io/crates/v/s3reader?color=#3fb911)](https://crates.io/crates/s3reader)\n[![doc-rs](https://img.shields.io/docsrs/s3reader/latest)](https://docs.rs/s3reader/latest/s3reader/)\n\n# S3Reader\n\nA `Rust` library to read from S3 objects as if they were files on a local filesystem (almost). `S3Reader` implements both the `Read` and `Seek` traits, allowing you to place the cursor anywhere within the S3 object and read from any byte offset. 
This allows random access to bytes within S3 objects.\n\n## Usage\nAdd this to your `Cargo.toml`:\n\n```text\n[dependencies]\ns3reader = \"1.0.0\"\n```\n\n### Use `BufRead` to read line by line\n```rust\nuse std::io::{BufRead, BufReader};\n\nuse s3reader::S3Reader;\nuse s3reader::S3ObjectUri;\n\n\nfn read_lines_manually() -\u003e std::io::Result\u003c()\u003e {\n    let uri = S3ObjectUri::new(\"s3://my-bucket/path/to/huge/file\").unwrap();\n    let s3obj = S3Reader::open(uri).unwrap();\n\n    let mut reader = BufReader::new(s3obj);\n\n    let mut line = String::new();\n    let len = reader.read_line(\u0026mut line).unwrap();\n    println!(\"The first line \u003e\u003e{line}\u003c\u003c is {len} bytes long\");\n\n    let mut line2 = String::new();\n    let len = reader.read_line(\u0026mut line2).unwrap();\n    println!(\"The next line \u003e\u003e{line2}\u003c\u003c is {len} bytes long\");\n\n    Ok(())\n}\n\nfn use_line_iterator() -\u003e std::io::Result\u003c()\u003e {\n    let uri = S3ObjectUri::new(\"s3://my-bucket/path/to/huge/file\").unwrap();\n    let s3obj = S3Reader::open(uri).unwrap();\n\n    let reader = BufReader::new(s3obj);\n\n    let mut count = 0;\n    for line in reader.lines() {\n        println!(\"{}\", line.unwrap());\n        count += 1;\n    }\n\n    Ok(())\n}\n```\n\n### Use `Seek` to jump to positions\n```rust\nuse std::io::{Read, Seek, SeekFrom};\n\nuse s3reader::S3Reader;\nuse s3reader::S3ObjectUri;\n\nfn jump_within_file() -\u003e std::io::Result\u003c()\u003e {\n    let uri = S3ObjectUri::new(\"s3://my-bucket/path/to/huge/file\").unwrap();\n    let mut reader = S3Reader::open(uri).unwrap();\n\n    let len = reader.len();\n\n    let cursor_1 = reader.seek(SeekFrom::Start(len as u64)).unwrap();\n    let cursor_2 = reader.seek(SeekFrom::End(0)).unwrap();\n    assert_eq!(cursor_1, cursor_2);\n\n    reader.seek(SeekFrom::Start(10)).unwrap();\n    let mut buf = [0; 100];\n    let bytes = reader.read(\u0026mut buf).unwrap();\n    
assert_eq!(buf.len(), 100);\n    assert_eq!(bytes, 100);\n\n    Ok(())\n}\n```\n\n\n## Q/A\n**Does this library really provide random access to S3 objects?**  \nAccording to this [StackOverflow answer](https://stackoverflow.com/questions/60176997/does-aws-s3-getobject-provide-random-access), yes.\n\n**Are the reads sync or async?**  \nThe S3 SDK uses mostly async operations, but the `Read` and `Seek` traits require synchronous methods. Because of this, I'm using a blocking tokio runtime to wrap the async calls. This might not be the best solution, but it works well for me. Suggestions for improvement are very welcome.\n\n**Why is this useful?**  \nIt depends on your use case. If you need to access bytes at arbitrary offsets in the middle of large files/S3 objects, this library is useful. For example, you can use it to stream mp4 files. It's also quite useful for some bioinformatics applications, where you might have a huge reference genome of several GB but only need to access the data for a few genes, amounting to only a few MB.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanergictcell%2Fs3reader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanergictcell%2Fs3reader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanergictcell%2Fs3reader/lists"}