{"id":25678292,"url":"https://github.com/digitalcortex/fast_whitespace_collapse","last_synced_at":"2026-05-18T14:04:59.929Z","repository":{"id":278644985,"uuid":"936267098","full_name":"digitalcortex/fast_whitespace_collapse","owner":"digitalcortex","description":"A high-performance Rust library for collapsing consecutive spaces and tabs into a single space.","archived":false,"fork":false,"pushed_at":"2025-02-20T21:45:47.000Z","size":12,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-25T05:57:01.951Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/digitalcortex.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-20T19:54:30.000Z","updated_at":"2025-02-21T16:34:17.000Z","dependencies_parsed_at":"2025-02-20T22:30:46.961Z","dependency_job_id":"8f1ab91b-2d8c-4e8f-acce-890c8ffd24b5","html_url":"https://github.com/digitalcortex/fast_whitespace_collapse","commit_stats":null,"previous_names":["digitalcortex/fast_whitespace_collapse"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/digitalcortex/fast_whitespace_collapse","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digitalcortex%2Ffast_whitespace_collapse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digitalcortex%2Ffast_whitespace_collapse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digitalcortex%2Ff
ast_whitespace_collapse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digitalcortex%2Ffast_whitespace_collapse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/digitalcortex","download_url":"https://codeload.github.com/digitalcortex/fast_whitespace_collapse/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digitalcortex%2Ffast_whitespace_collapse/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":284893576,"owners_count":27080532,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-17T02:00:06.431Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-24T15:39:00.203Z","updated_at":"2025-11-17T14:03:26.401Z","avatar_url":"https://github.com/digitalcortex.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# fast_whitespace_collapse\n\n[fast_whitespace_collapse](https://crates.io/crates/fast_whitespace_collapse) is a high-performance Rust crate for collapsing consecutive spaces and tabs into a single space.\n\nUses **SIMD (`u8x16`) via the [`wide` crate](https://crates.io/crates/wide)** for efficient processing.  
\nAutomatically falls back to a **scalar implementation** if SIMD is unavailable.\n\n## Features\n- Collapses multiple spaces and tabs into a single space.\n- Preserves newlines and non-whitespace characters.\n- Uses **SIMD (`u8x16`) when supported** to process 16 bytes at a time.\n- Falls back to **a fast scalar implementation** if SIMD is unavailable.\n- Ensures valid UTF-8 output.\n- SIMD requires **AVX2, SSE2, or NEON** instruction sets.\n\n## Installation\n\nAdd this to your `Cargo.toml`:\n\n```toml\n[dependencies]\nfast_whitespace_collapse = \"0.1.0\"\n```\n\nOr run the following command:\n\n```bash\ncargo add fast_whitespace_collapse\n```\n\n### **Controlling SIMD Support**\nBy default, SIMD acceleration is **enabled**. You can control it via Cargo features:\n\n#### **🔹 Disable SIMD for Embedded Targets**\n```sh\ncargo build --no-default-features\n```\n\n#### **🔹 Explicitly Enable SIMD**\n```sh\ncargo build --features simd-optimized\n```\n\n## Usage\n\n```rust\nuse fast_whitespace_collapse::collapse_whitespace;\n\nlet input = \"This   is \\t  a   test.\";\nlet output = collapse_whitespace(input);\nassert_eq!(output, \"This is a test.\");\n```\n\n## Performance\n- Processes text using **SIMD (`u8x16`)**, handling **16 bytes in parallel**.\n- Falls back to **scalar processing** when SIMD is unavailable.\n- Handles **large inputs efficiently** while maintaining valid UTF-8 output.\n\n## Benchmark Results\n\n### **Comparison with Other Approaches**\n\n| Method | Time |\n|--------|------|\n| Regex approach | 11.289 µs |\n| [collapse](https://crates.io/crates/collapse) crate | 1.2624 µs |\n| Iterative approach | 629.60 ns |\n| Iterative bytes | 428.00 ns |\n| [fast_whitespace_collapse](https://crates.io/crates/fast_whitespace_collapse) crate | **388.73 ns** |\n\n🚀 **`fast_whitespace_collapse` outperforms other methods, achieving the lowest execution time.**\n\n📌 **Benchmark executed on Apple M1 Pro (NEON SIMD enabled).**\n\n### **🔹 Run Your Own 
Benchmark**\n```sh\ncargo bench\n```\n\n## Compatibility\n\n**`fast_whitespace_collapse`** supports multiple architectures:\n\n- **x86_64**: Uses SIMD (`SSE2`, `AVX2`) for maximum performance.\n- **ARM (aarch64, M1/M2/M3)**: Uses **NEON SIMD**.\n- **Other**: Falls back to **a scalar implementation**.\n\n## Examples\n\n### **Basic Usage**\n```rust\nuse fast_whitespace_collapse::collapse_whitespace;\n\nassert_eq!(collapse_whitespace(\"Hello    world\"), \"Hello world\");\nassert_eq!(collapse_whitespace(\"   Trim   spaces   \"), \"Trim spaces\");\nassert_eq!(collapse_whitespace(\"Tabs\\t\\tconverted\"), \"Tabs converted\");\n```\n\n### **Unicode Support**\n```rust\nuse fast_whitespace_collapse::collapse_whitespace;\nassert_eq!(collapse_whitespace(\"こんにちは  世界\"), \"こんにちは 世界\"); // Japanese\nassert_eq!(collapse_whitespace(\"你好  世界\"), \"你好 世界\"); // Chinese\nassert_eq!(collapse_whitespace(\"😀  😃  😄\"), \"😀 😃 😄\"); // Emojis\n```\n\n### **Handling Newlines**\n```rust\nuse fast_whitespace_collapse::collapse_whitespace;\nassert_eq!(collapse_whitespace(\"Line1\\n   Line2\\nLine3\"), \"Line1\\n Line2\\nLine3\");\n```\n\n## Tests\nRun tests with:\n```sh\ncargo test\n```\n\n## License\nThis project is licensed under the **MIT License**.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigitalcortex%2Ffast_whitespace_collapse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdigitalcortex%2Ffast_whitespace_collapse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigitalcortex%2Ffast_whitespace_collapse/lists"}