{"id":18486481,"url":"https://github.com/sonicdoe/detect-character-encoding","last_synced_at":"2025-09-20T23:59:12.082Z","repository":{"id":28734568,"uuid":"32255885","full_name":"sonicdoe/detect-character-encoding","owner":"sonicdoe","description":"Detect character encoding using ICU","archived":false,"fork":false,"pushed_at":"2024-01-06T12:33:37.000Z","size":59812,"stargazers_count":83,"open_issues_count":3,"forks_count":15,"subscribers_count":2,"default_branch":"develop","last_synced_at":"2025-04-02T09:07:41.960Z","etag":null,"topics":["c-plus-plus","character-encoding","charset","detect","encoding","icu","javascript","nodejs"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sonicdoe.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-03-15T11:01:13.000Z","updated_at":"2025-03-15T18:50:20.000Z","dependencies_parsed_at":"2024-06-18T14:02:39.749Z","dependency_job_id":"35b68b85-d7c9-4949-b41f-ec6b61d3df2b","html_url":"https://github.com/sonicdoe/detect-character-encoding","commit_stats":{"total_commits":208,"total_committers":5,"mean_commits":41.6,"dds":0.2451923076923077,"last_synced_commit":"27d8f93f9e31032ad14e80632da501cbf9f96363"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sonicdoe%2Fdetect-character-encoding","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sonicdoe%2Fdetect-character-encoding/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sonicdoe%2Fdetect-character-encoding/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sonicdoe%2Fdetect-character-encoding/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sonicdoe","download_url":"https://codeload.github.com/sonicdoe/detect-character-encoding/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248027411,"owners_count":21035594,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-plus-plus","character-encoding","charset","detect","encoding","icu","javascript","nodejs"],"created_at":"2024-11-06T12:49:28.346Z","updated_at":"2025-09-20T23:59:07.061Z","avatar_url":"https://github.com/sonicdoe.png","language":"C++","readme":"# detect-character-encoding\n\n\u003e Detect character encoding using [ICU](http://site.icu-project.org)\n\n**Tip:** If you don’t need ICU in particular, consider using [ced](https://github.com/sonicdoe/ced), which is based on Google’s lighter [compact_enc_det](https://github.com/google/compact_enc_det) library.\n\n## Installation\n\n```console\n$ npm install detect-character-encoding\n```\n\ndetect-character-encoding is a C++ addon. Therefore, you may need to install various build tools. Check [node-gyp’s readme](https://github.com/nodejs/node-gyp#installation) for more information.\n\n## Usage\n\n```js\nconst fs = require('fs');\nconst detectCharacterEncoding = require('detect-character-encoding');\n\nconst fileBuffer = fs.readFileSync('file.txt');\nconst charsetMatch = detectCharacterEncoding(fileBuffer);\n\nconsole.log(charsetMatch);\n// {\n//   encoding: 'UTF-8',\n//   confidence: 60\n// }\n```\n\ndetect-character-encoding may return `null` if no charset matches.\n\n## Supported operating systems\n\n- macOS Sonoma\n- Ubuntu 22.04 and 20.04\n- Debian 12, 11, and 10\n\ndetect-character-encoding does not support 32-bit operating systems.\n\n## Supported character sets\n\nAs listed in [ICU’s user guide](http://userguide.icu-project.org/conversion/detection#TOC-Detected-Encodings):\n\n- UTF-8\n- UTF-16BE\n- UTF-16LE\n- UTF-32BE\n- UTF-32LE\n- Shift_JIS\n- ISO-2022-JP\n- ISO-2022-CN\n- ISO-2022-KR\n- GB18030\n- Big5\n- EUC-JP\n- EUC-KR\n- ISO-8859-1\n- ISO-8859-2\n- ISO-8859-5\n- ISO-8859-6\n- ISO-8859-7\n- ISO-8859-8\n- ISO-8859-9\n- windows-1250\n- windows-1251\n- windows-1252\n- windows-1253\n- windows-1254\n- windows-1255\n- windows-1256\n- KOI8-R\n- IBM420\n- IBM424\n\n## License\n\ndetect-character-encoding is licensed under the BSD 2-clause license but includes third-party software under different licenses. See [`LICENSE.md`](./LICENSE.md) for the full license text.\n","funding_links":[],"categories":["C++"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsonicdoe%2Fdetect-character-encoding","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsonicdoe%2Fdetect-character-encoding","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsonicdoe%2Fdetect-character-encoding/lists"}