Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aadsm/jschardet
Character encoding auto-detection in JavaScript (port of python's chardet)
- Host: GitHub
- URL: https://github.com/aadsm/jschardet
- Owner: aadsm
- License: lgpl-2.1
- Created: 2010-04-17T08:23:09.000Z (almost 15 years ago)
- Default Branch: main
- Last Pushed: 2024-09-30T21:19:08.000Z (5 months ago)
- Last Synced: 2024-10-29T23:13:14.389Z (4 months ago)
- Topics: character-encoding, charset
- Language: JavaScript
- Homepage:
- Size: 1.58 MB
- Stars: 710
- Watchers: 16
- Forks: 97
- Open Issues: 29
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-nodejs - jschardet - Character encoding auto-detection in JavaScript (port of python's chardet) ![](https://img.shields.io/github/stars/aadsm/jschardet.svg?style=social&label=Star) (Repository / Text/String)
README
[![NPM](https://nodei.co/npm/jschardet.png?downloads=true&downloadRank=true)](https://nodei.co/npm/jschardet/)
JsChardet
=========

Port of python's chardet (https://github.com/chardet/chardet).

License
-------

LGPL

How To Use It
-------------

### Node

```
npm install jschardet
```

```javascript
var jschardet = require("jschardet")

// "àíàçã" in UTF-8
jschardet.detect("\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3")
// { encoding: "UTF-8", confidence: 0.9690625 }

// "次常用國字標準字體表" in Big5
jschardet.detect("\xa6\xb8\xb1\x60\xa5\xce\xb0\xea\xa6\x72\xbc\xd0\xb7\xc7\xa6\x72\xc5\xe9\xaa\xed")
// { encoding: "Big5", confidence: 0.99 }

// "<string>Martin Kühl</string>" in windows-1252
jschardet.detectAll("\x3c\x73\x74\x72\x69\x6e\x67\x3e\x4d\x61\x72\x74\x69\x6e\x20\x4b\xfc\x68\x6c\x3c\x2f\x73\x74\x72\x69\x6e\x67\x3e")
// [
//   {encoding: "windows-1252", confidence: 0.95},
//   {encoding: "ISO-8859-2", confidence: 0.8796300205763055},
//   {encoding: "SHIFT_JIS", confidence: 0.01}
// ]
```

### Browser

Copy and include [jschardet.min.js](https://github.com/aadsm/jschardet/tree/master/dist/jschardet.min.js) in your web page.

This library is also available on [cdnjs](https://cdnjs.com) at [https://cdnjs.cloudflare.com/ajax/libs/jschardet/1.4.1/jschardet.min.js](https://cdnjs.cloudflare.com/ajax/libs/jschardet/1.4.1/jschardet.min.js).

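In the browser, file contents usually arrive as an `ArrayBuffer` or `Uint8Array` rather than as a string, while the Node examples above pass strings in which every character stands for one byte (for example `"\xc3\xa0"`). A minimal sketch of producing that form from raw bytes might look like the following. `bytesToBinaryString` is a hypothetical helper, not part of jschardet, and the sketch assumes the browser build accepts the same string input as the Node examples.

```javascript
// Hypothetical helper (not part of jschardet): turn raw bytes into a string
// in which each character's code equals one byte, the form used by the
// detect() examples above.
function bytesToBinaryString(bytes) {
  // Accept a Uint8Array (including a Node Buffer), an ArrayBuffer,
  // or a plain array of byte values.
  var view = bytes instanceof Uint8Array ? bytes : new Uint8Array(bytes);
  var chunkSize = 0x8000; // convert in chunks to avoid huge argument lists
  var parts = [];
  for (var i = 0; i < view.length; i += chunkSize) {
    parts.push(String.fromCharCode.apply(null, view.subarray(i, i + chunkSize)));
  }
  return parts.join("");
}

// Usage sketch, assuming `file` comes from an <input type="file"> element
// and jschardet.min.js has been loaded on the page:
//
// file.arrayBuffer().then(function (buffer) {
//   console.log(jschardet.detect(bytesToBinaryString(buffer)));
// });
```

Converting in chunks keeps the number of arguments passed to `String.fromCharCode` bounded, which matters for large files.
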
Options
-------

```javascript
// See all information related to the confidence levels of each encoding.
// This is useful to see why you're not getting the expected encoding.
jschardet.enableDebug();

// The default minimum accepted confidence level is 0.20, but sometimes this is
// not enough, especially when dealing with files that contain mostly numbers.
// Change this to 0 to always get a result, or to any other value that works
// for you.
jschardet.detect(str, { minimumThreshold: 0 });

// Lock down which encodings to detect. This can be useful in situations where
// jschardet is giving a higher probability to encodings that you never use.
jschardet.detect(str, { detectEncodings: ["UTF-8", "windows-1252"] });
```

Supported Charsets
------------------

* Big5, GB2312/GB18030, EUC-TW, HZ-GB-2312, and ISO-2022-CN (Traditional and Simplified Chinese)
* EUC-JP, SHIFT_JIS, and ISO-2022-JP (Japanese)
* EUC-KR and ISO-2022-KR (Korean)
* KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, and windows-1251 (Russian)
* ISO-8859-2 and windows-1250 (Hungarian)
* ISO-8859-5 and windows-1251 (Bulgarian)
* windows-1252
* ISO-8859-7 and windows-1253 (Greek)
* ISO-8859-8 and windows-1255 (Visual and Logical Hebrew)
* TIS-620 (Thai)
* UTF-32 BE, LE, 3412-ordered, or 2143-ordered (with a BOM)
* UTF-16 BE or LE (with a BOM)
* UTF-8 (with or without a BOM)
* ASCII

Technical Information
---------------------

I haven't been able to create tests to correctly detect:

* ISO-2022-CN
* windows-1250 in Hungarian
* windows-1251 in Bulgarian
* windows-1253 in Greek
* EUC-CN

Development
-----------

Use `npm run dist` to update the distribution files. They're available at https://github.com/aadsm/jschardet/tree/master/dist.

Authors
-------

Ported from python to JavaScript by António Afonso (https://github.com/aadsm/jschardet)

Transformed into an npm package by Markus Ast (https://github.com/brainafk)