{"id":48291771,"url":"https://github.com/hirosassa/divsufsort-rs","last_synced_at":"2026-04-04T23:09:42.449Z","repository":{"id":346244442,"uuid":"1189045028","full_name":"hirosassa/divsufsort-rs","owner":"hirosassa","description":"Pure Rust port of https://github.com/y-256/libdivsufsort — a fast suffix array construction library based on the induced-sorting (IS) algorithm.","archived":false,"fork":false,"pushed_at":"2026-03-23T04:27:46.000Z","size":70,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-23T19:47:18.158Z","etag":null,"topics":["algorithms","suffix-array"],"latest_commit_sha":null,"homepage":"https://docs.rs/divsufsort-rs","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hirosassa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-22T23:08:24.000Z","updated_at":"2026-03-23T09:01:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hirosassa/divsufsort-rs","commit_stats":null,"previous_names":["hirosassa/divsufsort-rs"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/hirosassa/divsufsort-rs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hirosassa%2Fdivsufsort-rs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hirosassa%2Fdivsufsort-rs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hirosassa%2Fdivsufsort-rs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hirosassa%2Fdivsufsort-rs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hirosassa","download_url":"https://codeload.github.com/hirosassa/divsufsort-rs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hirosassa%2Fdivsufsort-rs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31418289,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T20:09:54.854Z","status":"ssl_error","status_checked_at":"2026-04-04T20:09:44.350Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithms","suffix-array"],"created_at":"2026-04-04T23:09:41.772Z","updated_at":"2026-04-04T23:09:42.429Z","avatar_url":"https://github.com/hirosassa.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# divsufsort-rs\n\n[![Crates.io](https://img.shields.io/crates/v/divsufsort-rs.svg)](https://crates.io/crates/divsufsort-rs)\n[![Documentation](https://docs.rs/divsufsort-rs/badge.svg)](https://docs.rs/divsufsort-rs)\n[![build](https://github.com/hirosassa/divsufsort-rs/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/hirosassa/divsufsort-rs/actions/workflows/test.yaml)\n[![codecov](https://codecov.io/gh/hirosassa/divsufsort-rs/branch/main/graph/badge.svg?token=gSFzgTfVwv)](https://codecov.io/gh/hirosassa/divsufsort-rs)\n\nPure Rust port of [libdivsufsort](https://github.com/y-256/libdivsufsort) — a fast suffix array construction library based on the induced-sorting (IS) algorithm.\n\n## What it does\n\nConstructs the **suffix array** of a byte string in O(n log n) time and 5n + O(1) memory space. A suffix array is a sorted array of all suffixes of a string, and is a fundamental data structure for string search, data compression (BWT), and bioinformatics.\n\nThe implementation closely follows [the original C library](https://github.com/y-256/libdivsufsort) by Yuta Mori.\n\nThe B\\*-bucket sorting step is parallelised with [rayon](https://github.com/rayon-rs/rayon) when the `std` feature is enabled (default).\n\n### `no_std` support\n\nThis crate supports `no_std` environments (with `alloc`). Disable the default `std` feature:\n\n```toml\n[dependencies]\ndivsufsort-rs = { version = \"...\", default-features = false }\n```\n\nWhen `std` is disabled, rayon parallelism is unavailable and B\\*-bucket sorting runs sequentially.\n\n\u003e [!IMPORTANT]\n\u003e This crate uses `unsafe` Rust internally for performance. Specifically, raw pointer aliasing is used to allow the suffix array and its read-only PA view to share the same allocation (mirroring the original C code), and bounds checks are elided in hot inner loops where invariants can be proven statically. The public API is fully safe.\n\n## Usage\n\n```toml\n[dependencies]\ndivsufsort-rs = \"0.4\"\n```\n\n```rust\nuse divsufsort_rs::divsufsort;\n\nfn main() {\n    let text = b\"banana\";\n    let mut sa = vec![0i32; text.len()];\n    divsufsort(text, \u0026mut sa).unwrap();\n    // sa == [5, 3, 1, 0, 4, 2]  (indices of sorted suffixes)\n    println!(\"{:?}\", sa);\n}\n```\n\nBWT construction is also available:\n\n```rust\nuse divsufsort_rs::divbwt;\n\nlet text = b\"banana\";\nlet mut bwt = vec![0u8; text.len()];\nlet primary_index = divbwt(text, \u0026mut bwt, None).unwrap();\n// bwt == b\"nnbaaa\", primary_index == 3\n```\n\n## Benchmark\n\nBenchmarks run with [criterion](https://github.com/bheisler/criterion.rs) (`sample_size = 10`).\n\n### Environment\n\n| Item | Value |\n|---|---|\n| CPU | Apple M4 Max (16 logical cores) |\n| OS | macOS 15.5 |\n| Rust | 1.92.0 |\n\n### Corpora\n\n| Name | Description |\n|---|---|\n| `random_binary` | LCG pseudo-random bytes (alphabet size 256) |\n| `text_26` | LCG pseudo-random lowercase ASCII (alphabet size 26) |\n| `fibonacci` | Fibonacci string over `{a, b}` (highly repetitive) |\n\n### Results — Rust vs C libdivsufsort (1,000,000 bytes)\n\nCompared against the original **C libdivsufsort** compiled at `-O3`.\n\n| Corpus | Rust (this crate) | C libdivsufsort | Ratio |\n|---|---|---|---|\n| random_binary | 11.2 ms (84.9 MiB/s) | 13.7 ms (69.4 MiB/s) | **1.22× faster** |\n| text_26 | 13.2 ms (72.4 MiB/s) | 23.8 ms (40.1 MiB/s) | **1.80× faster** |\n| fibonacci | 30.1 ms (31.7 MiB/s) | 27.4 ms (34.8 MiB/s) | 0.91× |\n\nThe parallel B\\*-bucket sort drives the speedup for `random_binary` and `text_26`. For `fibonacci` the input produces only 1–2 non-trivial buckets, so parallelism provides no benefit and C is slightly faster due to lower single-thread overhead.\n\n### Results — `std` (parallel) vs `no_std` (serial)\n\nShows the effect of rayon parallelism. `std` is the default; `no_std` disables rayon and runs single-threaded.\n\n| Corpus | Size | `std` (rayon) | `no_std` (serial) | Ratio |\n|---|---|---|---|---|\n| random_binary | 100K | 1.29 ms | 1.44 ms | 1.12× |\n| random_binary | 1M | 11.8 ms | 17.8 ms | **1.51×** |\n| text_26 | 100K | 1.29 ms | 2.07 ms | **1.60×** |\n| text_26 | 1M | 14.0 ms | 28.3 ms | **2.02×** |\n| fibonacci | 100K | 2.53 ms | 2.47 ms | 0.98× |\n| fibonacci | 1M | 31.5 ms | 31.0 ms | 0.98× |\n\nFor corpora with many distinct B\\*-buckets (`random_binary`, `text_26`), rayon parallelism provides 1.5–2× speedup at 1M scale. Highly repetitive inputs (`fibonacci`) show no difference as they produce too few buckets to benefit from parallelism.\n\n### Running the benchmarks\n\n```sh\n# Rust vs C comparison (bench_compare)\n# requires the vendored C submodule — initialize it first:\ngit submodule update --init\ncargo bench --bench bench_compare --features c-bench\n\n# Lightweight benchmark — 3 corpora × 2 sizes, completes in ~2–3 minutes (bench_light)\ncargo bench --bench bench_light\n\n# Same benchmark in no_std (serial) mode\ncargo bench --bench bench_light --no-default-features\n\n# Full benchmark — larger sizes and more corpora, takes significantly longer (bench)\ncargo bench --bench bench\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhirosassa%2Fdivsufsort-rs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhirosassa%2Fdivsufsort-rs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhirosassa%2Fdivsufsort-rs/lists"}