{"id":29866592,"url":"https://github.com/theiskaa/slugifier","last_synced_at":"2025-07-31T14:02:48.178Z","repository":{"id":306407008,"uuid":"1026034492","full_name":"theiskaa/slugifier","owner":"theiskaa","description":"A tiny, fast, and Unicode-aware slug generator for Zig","archived":false,"fork":false,"pushed_at":"2025-07-25T11:01:58.000Z","size":10,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-25T15:22:21.897Z","etag":null,"topics":["slugifier","slugify","zig","zig-library"],"latest_commit_sha":null,"homepage":"","language":"Zig","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/theiskaa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-25T07:48:56.000Z","updated_at":"2025-07-25T14:07:03.000Z","dependencies_parsed_at":"2025-07-25T15:23:09.375Z","dependency_job_id":"b90f25aa-6f54-4b27-9b4a-81b2b67e8548","html_url":"https://github.com/theiskaa/slugifier","commit_stats":null,"previous_names":["theiskaa/slugifier"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/theiskaa/slugifier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/theiskaa%2Fslugifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/theiskaa%2Fslugifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/theiskaa%2Fslugifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/theiskaa%2Fslugifier/manifests","owner_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/theiskaa","download_url":"https://codeload.github.com/theiskaa/slugifier/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/theiskaa%2Fslugifier/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267873764,"owners_count":24158690,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-30T02:00:09.044Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["slugifier","slugify","zig","zig-library"],"created_at":"2025-07-30T13:00:53.832Z","updated_at":"2025-07-30T13:02:20.844Z","avatar_url":"https://github.com/theiskaa.png","language":"Zig","funding_links":[],"categories":[],"sub_categories":[],"readme":"# slugifier\n\n\u003cp align=\"center\"\u003e\n\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![Zig](https://img.shields.io/badge/zig-0.13-orange.svg)](https://ziglang.org/)\n\n\u003c/p\u003e\n\nslugifier is a fast and comprehensive slug generation library for Zig that converts text into URL-friendly slugs with exceptional performance and extensive Unicode support. The library provides robust text processing with customizable separators, case formatting, and advanced transliteration capabilities across multiple writing systems.\n\nThe core functionality centers around converting any text input into clean, web-safe slugs. 
The library handles ASCII text with optimal performance while providing comprehensive Unicode support through an advanced transliteration engine. This engine supports over 20 languages across multiple script families including Latin, Cyrillic, CJK (Chinese, Japanese, Korean), and RTL scripts (Arabic, Hebrew, Persian). The transliteration system is culturally aware, applying language-specific conversion rules that preserve linguistic accuracy rather than generic character mappings.\n\nThe library offers three Unicode processing modes to suit different requirements. Strip mode removes Unicode characters entirely for ASCII-only output. Preserve mode maintains Unicode characters as-is for international slug generation. Transliterate mode converts Unicode characters to ASCII equivalents using sophisticated language-specific mappings that understand cultural context. For example, German ü becomes \"ue\" rather than \"u\", while Swedish treats the same character as \"u\" according to local conventions.\n\nThe project provides both a command-line tool for quick slug generation and automation scripts, plus a library interface for programmatic integration. The CLI delivers instant results for one-off conversions and batch processing. The library API offers extensive configuration through struct options supporting custom separators, case formatting modes, language selection, and Unicode processing preferences. 
All operations maintain memory safety and provide error handling for invalid configurations.\n\n## Install\n\nBuild the CLI binary from source using Zig:\n\n```bash\ngit clone https://github.com/theiskaa/slugifier.git\ncd slugifier\nzig build -Doptimize=ReleaseFast\n```\n\n## Install as library\n\nAdd to your `build.zig.zon`:\n\n```zig\n.{\n    .name = \"your-project\",\n    .version = \"0.1.0\",\n    .dependencies = .{\n        .slugifier = .{\n            .url = \"https://github.com/theiskaa/slugifier/archive/main.tar.gz\",\n            .hash = \"1234...\", // zig will provide this\n        },\n    },\n}\n```\n\nOr add to your project as a Git submodule:\n\n```bash\ngit submodule add https://github.com/theiskaa/slugifier.git libs/slugifier\n```\n\n## Usage\nThe library exposes a main `slugify()` function that accepts raw text and configuration options through a `SlugifyOptions` struct. The function handles all text processing internally, including Unicode detection, script classification, language-specific transliteration, case conversion, and separator normalization. The implementation uses a transliteration engine that maps Unicode characters to appropriate ASCII equivalents based on linguistic and cultural context.\n\nThe `slugify()` function returns an allocated string that the caller must free. Memory allocation follows Zig conventions with explicit allocator passing for predictable memory management. The function validates its input and returns meaningful error codes for invalid configurations.\n\nConfiguration options include separator character selection (any non-alphanumeric ASCII character), case formatting (lowercase, uppercase, or preserve original), Unicode processing mode (strip, preserve, or transliterate), and optional language specification for culturally accurate transliteration. 
When a language is specified, the transliterator applies language-specific character mappings while falling back to generic mappings for characters outside that language's scope.\n\nImport the library:\n```zig\nconst slugifier = @import(\"slugifier\");\n```\nBasic usage with default options:\n```zig\nconst result = try slugifier.slugify(\"Hello, World!\", .{}, allocator);\ndefer allocator.free(result); // Result: \"hello-world\"\n```\nAdvanced configuration with language-specific transliteration:\n```zig\nconst options = slugifier.SlugifyOptions{\n    .separator = '_',\n    .format = .uppercase,\n    .unicode_mode = .transliterate,\n    .language = .de, // German language mappings\n};\n\nconst german_result = try slugifier.slugify(\"Müllerstraße\", options, allocator);\ndefer allocator.free(german_result); // Result: \"MUELLERSTRASSE\"\n```\n\nMixed script handling:\n```zig\nconst mixed_result = try slugifier.slugify(\"Hello 你好 Привет\", .{}, allocator);\ndefer allocator.free(mixed_result); // Result: \"hello-nihao-privet\"\n```\n\nThe library automatically detects and processes multiple Unicode scripts within the same input. When language-specific settings are configured, the transliterator prioritizes those mappings while falling back to generic script mappings for characters outside the specified language. This approach ensures comprehensive text processing regardless of input complexity.\n\n## Unicode Support\n\nThe slugifier library provides comprehensive Unicode support through an advanced transliteration engine that handles multiple writing systems with cultural accuracy. 
The system currently supports over 20 languages across four major script families: Latin, Cyrillic, CJK (Chinese, Japanese, Korean), and RTL scripts (Arabic, Hebrew, Persian).\n\nSupported languages include European languages (German, French, Spanish, Italian, Portuguese, Dutch), Slavic languages (Russian, Ukrainian, Polish, Czech, Belarusian, Serbian), Nordic languages (Swedish, Norwegian, Danish, Finnish), East Asian languages (Chinese Simplified/Traditional, Japanese, Korean), and Middle Eastern languages (Arabic, Hebrew, Persian/Farsi). Each language implementation follows proper transliteration standards and cultural conventions rather than generic character substitution.\n\nThe transliteration system operates through a hierarchical mapping approach. When a language is specified, the engine first attempts language-specific character mappings, then falls back to generic script mappings, ensuring comprehensive coverage for mixed-language content. For example, German-specific mappings handle ü as \"ue\" and ß as \"ss\", while generic Latin mappings provide broader coverage for other accented characters.\n\nThe Unicode processing pipeline includes automatic script detection, codepoint classification, and context-aware transliteration. The system can process text containing multiple scripts simultaneously, applying appropriate conversion rules for each script type. This approach enables accurate slug generation for international content while maintaining performance and reliability.\n\nThe transliteration mappings are modular and extensible. Adding support for new languages requires creating mapping functions in the `src/unicode/mappings/` directory that accept Unicode codepoints and return ASCII string replacements. 
The system automatically integrates new mappings into the fallback hierarchy without requiring changes to the core transliteration logic.\n\n## Configuration\n\nThe slugifier library provides extensive customization through the `SlugifyOptions` struct with four primary configuration categories: separator selection, case formatting, Unicode processing mode, and language specification.\n\nThe separator option accepts any non-alphanumeric ASCII character to join words in the generated slug. Common choices include hyphens, underscores, dots, and plus signs. The library validates separator characters at runtime, rejecting alphanumeric characters that would conflict with slug content.\n\nCase formatting operates through three modes: lowercase converts all output to lowercase for standard web URLs, uppercase converts all output to uppercase for specific branding requirements, and default preserves the original character casing for mixed-case applications. Case formatting applies after Unicode transliteration, ensuring consistent output regardless of input script.\n\nUnicode processing mode determines how non-ASCII characters are handled. Strip mode removes Unicode characters entirely, producing ASCII-only output suitable for legacy systems. Preserve mode maintains Unicode characters unchanged, enabling international slugs for modern applications. Transliterate mode converts Unicode characters to ASCII equivalents using sophisticated language-aware mappings, balancing accessibility with linguistic accuracy.\n\nLanguage specification enables culturally accurate transliteration by applying language-specific character mappings. When specified, the system prioritizes language mappings while maintaining fallback support for characters outside that language scope. This approach ensures optimal results for primary content language while handling multilingual input gracefully.\n\nThe configuration system includes comprehensive validation with meaningful error messages. 
Invalid separator characters, conflicting options, or malformed configurations are detected before processing begins, preventing runtime failures and ensuring predictable behavior across all usage scenarios.\n\n## Contributing\nFor information regarding contributions, please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftheiskaa%2Fslugifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftheiskaa%2Fslugifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftheiskaa%2Fslugifier/lists"}