https://github.com/buganini/bsdconv
A simple but powerful DSL for charset/encoding conversion and transformation, pure C implementation with no extra dependencies
- Host: GitHub
- URL: https://github.com/buganini/bsdconv
- Owner: buganini
- License: bsd-2-clause
- Created: 2009-04-12T08:10:20.000Z
- Default Branch: master
- Last Pushed: 2023-01-10T16:50:11.000Z
- Last Synced: 2024-10-12T13:58:08.890Z
- Language: C
- Homepage: https://bsdconv.io/bsdconv/
- Size: 10.5 MB
- Stars: 53
- Watchers: 4
- Forks: 6
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Changelog: Changelog
- License: LICENSE
README
# Documentation & Support
http://www.slideshare.net/buganini/bsdconv
http://www.slideshare.net/Buganini/journey-of-bsdconv
API Reference: http://buganini.github.io/bsdconv/
Use bsdconv-man to show the manual page for each module
IRC: irc://irc.freenode.net#bsdconv
# Compilation & Installation
```
make PREFIX=${prefix} # default to /usr/local
sudo make install PREFIX=${prefix} # default to /usr/local
sudo ldconfig ${prefix}/lib # Linux
sudo ldconfig -m ${prefix}/lib # FreeBSD
```
# Add codec alias
```
# First edit modules/{from,inter,to}/alias, then regenerate the aliases:
make alias
```
# Example
Convert Traditional Chinese Big5 to Simplified Chinese UTF-8
```
bsdconv big5:zhcn:utf-8 in.txt > out.txt
bsdconv big5:zhcn:utf-8 -i in.txt # convert in place
```
Convert Traditional Chinese UTF-8 to Simplified Chinese CP936 (GBK), with transliteration
```
bsdconv utf-8:zhcn:cp936,cp936-trans in.txt > out.txt
```
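The conversion strings in these examples share one shape: `:` separates the phases of a conversion (source codec first, target codec last, inter-codecs in between), `,` lists several codecs within one phase, and `|` (used in later examples) joins two conversions into a pipe. The following is a minimal Python sketch of that reading, not bsdconv's own parser. The function name `parse_conversion` is hypothetical, and any further syntax bsdconv may accept is not handled.

```python
def parse_conversion(spec):
    """Split a conversion string into pipelines -> phases -> codec lists.

    Assumption: '|' separates piped conversions, ':' separates phases,
    and ',' separates the codecs listed within one phase.
    """
    pipelines = []
    for chain in spec.split("|"):
        phases = [phase.split(",") for phase in chain.split(":")]
        pipelines.append(phases)
    return pipelines


if __name__ == "__main__":
    # One pipeline with three phases; the last phase lists two codecs.
    print(parse_conversion("utf-8:zhcn:cp936,cp936-trans"))
```

Keeping the result as nested lists keeps the position of each codec visible: the first and last inner lists of a pipeline are its source and target phases.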
Convert Simplified Chinese to Traditional Chinese
```
bsdconv utf-8:zhtw:zhtw-words:utf-8
```
The same conversion, ignoring whitespace mixed into words
```
bsdconv utf-8:whitespace-derail:zhtw:zhtw-words:whitespace-rerail:utf-8
```
Convert Big5 data from Traditional Chinese to Simplified Chinese, normalize
CRLF/CR/LF line endings to CRLF, uppercase the ASCII characters, and write Big5
output, with Simplified Chinese characters that are not in Big5 written as HTML entities.
```
bsdconv big5:zhcn:win:upper:big5,htmlentity in.txt > out.txt
```
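To make the individual steps of that chain concrete, here is a rough Python equivalent of the `win`, `upper` and `big5,htmlentity` parts, using only the standard library. It is an approximation under stated assumptions: the `zhcn` step is left out because the standard library has no Traditional-to-Simplified table, all function names are hypothetical, and Python's `xmlcharrefreplace` writes decimal references, which may differ from the entity format bsdconv emits.

```python
import re
import string

# Translation table that uppercases ASCII letters only.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def normalize_newlines(text):
    """Turn CRLF, lone CR and lone LF into CRLF (the 'win' step)."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


def ascii_upper(text):
    """Uppercase a-z and leave every other character alone (the 'upper' step)."""
    return text.translate(_ASCII_UPPER)


def encode_with_entities(text, encoding="big5"):
    """Encode, falling back to numeric character references for
    characters the target encoding cannot represent."""
    return text.encode(encoding, errors="xmlcharrefreplace")


def convert(data):
    """Big5 bytes in, Big5 bytes out; the zhcn step is omitted here."""
    text = data.decode("big5")
    text = normalize_newlines(text)
    text = ascii_upper(text)
    return encode_with_entities(text)
```

The order of the calls in `convert` mirrors the order of the phases in the conversion string.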
Count character widths
```
echo -n "aa" | bsdconv utf-8:width:null
FULL: 1
HALF: 1
echo -n "aaˇ" | bsdconv utf-8:width:null
FULL: 1
HALF: 1
AMBI: 1
```
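The FULL/HALF/AMBI buckets appear to correspond to Unicode's East Asian Width property. As an illustration, the same kind of tally can be sketched in Python with `unicodedata.east_asian_width`. The mapping below is an assumption, not taken from bsdconv's source: F and W count as FULL, A as AMBI, and everything else (H, Na, N) as HALF. The function name `count_widths` is hypothetical.

```python
import unicodedata
from collections import Counter


def count_widths(text):
    """Tally characters by East Asian Width class.

    Assumed mapping: F/W -> FULL, A -> AMBI, H/Na/N -> HALF.
    """
    counts = Counter()
    for ch in text:
        width = unicodedata.east_asian_width(ch)
        if width in ("F", "W"):
            counts["FULL"] += 1
        elif width == "A":
            counts["AMBI"] += 1
        else:
            counts["HALF"] += 1
    return dict(counts)


if __name__ == "__main__":
    # One Latin letter, one CJK ideograph (U+6E2C), one caron (U+02C7).
    print(count_widths("a\u6e2c\u02c7"))
```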
Migrate a MySQL database dump from Big5 to UTF-8
```
bsdconv htmlentity,big5-5c,big5:utf-8 in.sql > out.sql
```
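Big5 database dumps often contain numeric HTML entities for characters that Big5 cannot represent, which is why the source phase lists `htmlentity` alongside `big5`. The sketch below shows that idea in Python. It is a simplification under stated assumptions: it handles only numeric references (`&#N;` and `&#xH;`), it does not attempt the `big5-5c` handling at all, and the function name `big5_with_entities_to_utf8` is hypothetical.

```python
import re

# Decimal (&#20204;) and hexadecimal (&#x4eec;) numeric character references.
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def big5_with_entities_to_utf8(data):
    """Decode Big5 bytes, expand numeric entities, return UTF-8 bytes."""
    text = data.decode("big5")

    def expand(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            # Not a valid code point: keep the reference as written.
            return match.group(0)

    return _NUMERIC_ENTITY.sub(expand, text).encode("utf-8")
```

Decoding Big5 first and expanding entities afterwards avoids matching an entity pattern across the trail byte of a two-byte character.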
Recover from mis-decoding (Big5 data that was mistaken for ISO-8859-1 and converted to UTF-8)
```
bsdconv 'utf-8:iso-8859-1|big5:utf-8'
```
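The pipe undoes the damage in reverse order: the first conversion turns the UTF-8 text back into the original bytes by encoding it as ISO-8859-1, and the second decodes those bytes as Big5, as they should have been in the first place. The same two steps can be sketched in Python with the standard codecs. The function name `recover_big5` is hypothetical, and the sketch assumes the wrong decoding really was ISO-8859-1 rather than a similar code page such as Windows-1252.

```python
def recover_big5(garbled_utf8):
    """Undo 'Big5 bytes decoded as ISO-8859-1, then saved as UTF-8'."""
    garbled = garbled_utf8.decode("utf-8")
    # ISO-8859-1 maps code points 0-255 straight to bytes, so this
    # restores the original Big5 byte sequence.
    raw = garbled.encode("iso-8859-1")
    return raw.decode("big5").encode("utf-8")


if __name__ == "__main__":
    original = "\u6e2c\u8a66"
    # Reproduce the mistake, then repair it.
    broken = original.encode("big5").decode("iso-8859-1").encode("utf-8")
    print(recover_big5(broken) == original.encode("utf-8"))
```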
Decode escaped data that mixes byte escapes and Unicode escapes, such as %u9644%20
```
bsdconv 'escape,byte:unicode,byte|skip,ascii:utf-8'
```
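For comparison, the mixed escape format can be decoded in a few lines of Python. This is a sketch of the input format only, not of how bsdconv processes it: `%uXXXX` is read as a Unicode code point, `%XX` as a byte that is accepted only when it is ASCII, and anything else is left untouched. The function name `decode_escaped` is hypothetical, and surrogate pairs written as two `%u` escapes are not recombined.

```python
import re

# Either %uXXXX (Unicode code point) or %XX (single byte).
_ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def decode_escaped(text):
    """Expand %uXXXX and ASCII-range %XX escapes in a string."""

    def expand(match):
        if match.group(1) is not None:
            return chr(int(match.group(1), 16))
        value = int(match.group(2), 16)
        # Non-ASCII bytes are ambiguous without a charset: leave them as-is.
        return chr(value) if value < 0x80 else match.group(0)

    return _ESCAPE.sub(expand, text)


if __name__ == "__main__":
    print(ascii(decode_escaped("%u9644%20")))
```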
Generate a string for fuzzy comparison
```
echo ¼ℌăDžⓐ⁹灣湾ド鬒鬒æß | bsdconv UTF-8:ZH-FUZZY-TW:KANA-PHONETIC:NFKD-CASEFOLD:UTF-8
1⁄4hădža9灣灣do鬒鬒æss
```
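The NFKD-CASEFOLD part of that chain has a close counterpart in Python's standard library, which makes it easy to see what it contributes on its own: compatibility decomposition turns ℌ into H and ⁹ into 9, and case folding turns ß into ss. The sketch below covers only this step. ZH-FUZZY-TW and KANA-PHONETIC have no standard-library equivalent and are not reproduced, and the function name `fuzzy_key` is hypothetical.

```python
import unicodedata


def fuzzy_key(text):
    """Compatibility-decompose, then case-fold, for loose comparison."""
    return unicodedata.normalize("NFKD", text).casefold()


if __name__ == "__main__":
    # Script H, circled a, superscript nine, sharp s.
    print(fuzzy_key("\u210c\u24d0\u2079\u00df"))
```

Two strings can then be compared loosely with `fuzzy_key(a) == fuzzy_key(b)`.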
Translate text to HTML
```
bsdconv big5:nl2br:ascii,html-img in.txt > out.htm
```
Use glyph images from http://www.cns11643.gov.tw
```
bsdconv utf-8:ascii,ascii-html-cns11643-img in.txt > out.htm
```
Maintain an inter map:
```
bsdconv bsdconv-keyword,bsdconv:bsdconv-keyword,utf-8 inter/FOO.txt > edit.tmp
vi edit.tmp
bsdconv bsdconv-keyword,utf-8:bsdconv-keyword,bsdconv edit.tmp > inter/FOO.txt
```
# Windows
Use MinGW with Makefile.win to build it, then copy everything in build/ to c:\bsdconv\
The executable will then be at c:\bsdconv\bsdconv.exe
To install to a directory other than the default path,
set the BSDCONV_PATH environment variable to that path.
Running setEnvVar.bat as administrator can set the proper environment variables for you.
# Bindings
[Python](https://pypi.python.org/pypi/bsdconv/ "Python")
[Perl](https://github.com/buganini/perl-bsdconv "Perl")
[PHP](https://github.com/buganini/php-bsdconv "PHP")
[Ruby](https://rubygems.org/gems/ruby-bsdconv/ "Ruby")
[Go](https://github.com/buganini/go-bsdconv "Go")
[Java](https://github.com/buganini/jni-bsdconv "Java")
[Haskell](https://github.com/pkmx/hs-bsdconv "Haskell")
[Elasticsearch](https://github.com/buganini/elasticsearch-bsdconv-plugin "Elasticsearch")
[PostgreSQL](https://github.com/buganini/postgres-bsdconv "PostgreSQL")
[MySQL](https://github.com/buganini/mysql-udf-bsdconv "MySQL")