{"id":18301281,"url":"https://github.com/spider-rs/auto-encoder","last_synced_at":"2025-04-05T14:30:50.297Z","repository":{"id":255361398,"uuid":"849344344","full_name":"spider-rs/auto-encoder","owner":"spider-rs","description":"Auto encoding library bytes to strings","archived":false,"fork":false,"pushed_at":"2024-12-01T02:42:09.000Z","size":21,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-30T17:16:46.699Z","etag":null,"topics":["auto-encoder","encoding","string-encoding"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spider-rs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-29T12:36:31.000Z","updated_at":"2024-12-01T02:42:12.000Z","dependencies_parsed_at":"2024-08-29T14:39:49.754Z","dependency_job_id":"1f75aeaf-2e6d-44f2-a8a1-d551eb8c1511","html_url":"https://github.com/spider-rs/auto-encoder","commit_stats":null,"previous_names":["spider-rs/auto-encoder"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spider-rs%2Fauto-encoder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spider-rs%2Fauto-encoder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spider-rs%2Fauto-encoder/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spider-rs%2Fauto-encoder/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/owners/spider-rs","download_url":"https://codeload.github.com/spider-rs/auto-encoder/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247352291,"owners_count":20925245,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["auto-encoder","encoding","string-encoding"],"created_at":"2024-11-05T15:15:01.864Z","updated_at":"2025-04-05T14:30:50.032Z","avatar_url":"https://github.com/spider-rs.png","language":"Rust","readme":"# auto_encoder\n\n`auto_encoder` is a Rust library designed to automatically detect and encode various text and binary file formats, along with specific language encodings.\n\n## Features\n\n- **Automatic Encoding Detection**: Detects text encoding based on locale or content.\n- **Binary Format Detection**: Checks if a given file is a known binary format by inspecting its initial bytes.\n- **HTML Language Detection**: Extracts and detects the language of an HTML document from its content.\n\n## Installation\n\nAdd this to your `Cargo.toml`:\n\n```toml\n[dependencies]\nauto_encoder = \"0.1\"\n```\n\n## Usage\n\n### Encoding Detection\n\nAutomatically detect the encoding for a given locale:\n\n```rust\nuse auto_encoder::encoding_for_locale;\n\nlet encoding = encoding_for_locale(\"ja-jp\").unwrap();\nprintln!(\"Encoding for Japanese locale: {:?}\", encoding);\n```\n\nEncode the bytes of HTML content according to a language code:\n\n```rust\nuse auto_encoder::encode_bytes_from_language;\n\n// Byte string literals (b\"...\") accept only ASCII, so borrow the UTF-8 bytes of a str instead.\nlet html_content = \"こんにちは、世界！\".as_bytes();\nlet encoded = encode_bytes_from_language(html_content, \"ja\");\nprintln!(\"Encoded 
content: {}\", encoded);\n```\n\n### Binary Format Detection\n\nCheck if a given file content is a known binary format:\n\n```rust\nuse auto_encoder::is_binary_file;\n\nlet file_content = \u0026[0xFF, 0xD8, 0xFF]; // JPEG file signature\nlet is_binary = is_binary_file(file_content);\nprintln!(\"Is the file a known binary format? {}\", is_binary);\n```\n\n### HTML Language Detection\n\nDetect the language attribute from an HTML document:\n\n```rust\nuse auto_encoder::detect_language;\n\nlet html_content = br#\"\u003chtml lang=\"en\"\u003e\u003chead\u003e\u003ctitle\u003eTest\u003c/title\u003e\u003c/head\u003e\u003cbody\u003e\u003c/body\u003e\u003c/html\u003e\"#;\nlet language = detect_language(html_content).unwrap();\nprintln!(\"Language detected: {}\", language);\n```\n\n## API Documentation\n\n### Functions\n\n#### `encoding_for_locale`\n\nGet the encoding for a given locale if found.\n\n```rust\npub fn encoding_for_locale(locale: \u0026str) -\u003e Option\u003c\u0026'static encoding_rs::Encoding\u003e;\n```\n\n#### `is_binary_file`\n\nCheck if the file is a known binary format using its initial bytes.\n\n```rust\npub fn is_binary_file(content: \u0026[u8]) -\u003e bool;\n```\n\n#### `detect_language`\n\nDetect the language of an HTML resource based on its content.\n\n```rust\npub fn detect_language(html_content: \u0026[u8]) -\u003e Option\u003cString\u003e;\n```\n\n#### `encode_bytes`\n\nGet the content with proper encoding. 
Pass in a proper encoding label like `SHIFT_JIS`.\n\n```rust\npub fn encode_bytes(html: \u0026[u8], label: \u0026str) -\u003e String;\n```\n\n#### `encode_bytes_from_language`\n\nGet the content with proper encoding based on a language code (e.g., `ja` for Japanese).\n\n```rust\npub fn encode_bytes_from_language(html: \u0026[u8], language: \u0026str) -\u003e String;\n```\n\n### Supported Locales and Encodings\n\nThe library supports a wide range of locales and their corresponding encodings, such as `WINDOWS_1252` for Western European languages, `SHIFT_JIS` for Japanese, `GB18030` for Simplified Chinese, etc.\n\n## Contributing\n\nContributions are welcome! Please feel free to open an issue or submit a pull request on [GitHub](https://github.com/spider-rs/auto-encoder).\n\n## License\n\nThis project is licensed under the MIT License. See the LICENSE file for details.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspider-rs%2Fauto-encoder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspider-rs%2Fauto-encoder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspider-rs%2Fauto-encoder/lists"}