https://github.com/douban/pycharlockholmes
Character encoding detecting library for Python using ICU and libmagic.
- Host: GitHub
- URL: https://github.com/douban/pycharlockholmes
- Owner: douban
- License: bsd-3-clause
- Created: 2013-04-15T11:50:41.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2018-05-06T10:21:30.000Z (over 7 years ago)
- Last Synced: 2025-03-14T20:06:17.431Z (7 months ago)
- Language: Python
- Homepage:
- Size: 24.4 KB
- Stars: 49
- Watchers: 14
- Forks: 13
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Charlock Holmes
[Build Status](https://travis-ci.org/douban/PyCharlockHolmes)

Character encoding detection library for Python using [ICU](http://site.icu-project.org/) and libmagic. Inspired by [Charlock Holmes](https://github.com/brianmario/charlock_holmes).
## Dependency

1. icu
2. file (libmagic)

### Gentoo

    emerge -av dev-libs/icu
    emerge -av sys-apps/file

### Ubuntu

    apt-get install libicu-dev
    apt-get install libmagic-dev

### Brew

    brew install icu4c
    brew install libmagic
    export ICUI18N="/usr/local/Cellar/icu4c/xx" # Replace "xx" with your installed icu4c version
    export MAGIC="/usr/local/Cellar/libmagic/xx" # Replace "xx" with your installed libmagic version

## Install

    python setup.py build
    python setup.py install

## Usage

    from charlockholmes import detect

    # Read raw bytes so detection sees the file's undecoded content
    with open('test.txt', 'rb') as f:
        content = f.read()

    print(detect(content))

# CHANGELOG
- 0.0.3
  - Add support for Python 3.x

# License

Modified BSD License
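
# Example: detecting from a file path

Building on the `detect` call shown under Usage, the following is a minimal sketch of a file-level wrapper, not part of the library itself. The names `detect_file`, `detector`, and `sample_size` are hypothetical, and the sketch assumes that `detect` accepts a bytes object. Passing only a leading sample is a common shortcut for large files, though whether a prefix is enough for reliable detection depends on the content. The result of `detect` is returned unchanged, so no assumption is made about its shape.

```python
def detect_file(path, detector=None, sample_size=4096):
    """Read up to sample_size raw bytes from path and pass them to detector.

    detector defaults to charlockholmes.detect; it is a parameter here
    (an assumption of this sketch) so another callable can be substituted.
    """
    if detector is None:
        # Imported lazily so the helper can be defined without the
        # C extension being installed.
        from charlockholmes import detect as detector

    # Binary mode: encoding detection needs the undecoded bytes.
    with open(path, "rb") as fh:
        sample = fh.read(sample_size)

    return detector(sample)
```

With the library installed, `print(detect_file('test.txt'))` would print whatever `detect` returns for the first 4096 bytes of the file.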