https://github.com/cloudaper/compact_enc_det
Ruby bindings for Google's Compact Encoding Detection C++ library
charset charset-detection charset-detector encoding encoding-detection encoding-detector ruby
- Host: GitHub
- URL: https://github.com/cloudaper/compact_enc_det
- Owner: cloudaper
- License: mit
- Created: 2024-02-04T16:42:56.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-12T14:40:55.000Z (over 1 year ago)
- Last Synced: 2025-11-30T14:35:30.872Z (4 months ago)
- Topics: charset, charset-detection, charset-detector, encoding, encoding-detection, encoding-detector, ruby
- Language: C++
- Homepage:
- Size: 33.2 KB
- Stars: 8
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Compact Encoding Detection for Ruby
Ruby bindings for [Google's Compact Encoding Detection](https://github.com/google/compact_enc_det) (CED for short) C++ library.
> [!NOTE]
> The bindings temporarily use a [fork of the C++ library](https://github.com/cloudaper/compact_enc_det_fork/commit/e4eda3204bab019564b96c522baae93ee2fffdc8), which fixes the minimum CMake version so that the build passes on modern environments.
## Usage
You will need [curl](https://curl.se) and [CMake](https://cmake.org) to build the C++ native extension.
> macOS
>
> You can use [Homebrew](https://brew.sh) to install CMake:
>
> ```console
> brew install cmake
> ```
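
Before installing the gem, it can help to confirm that both build tools are reachable. The following is only a minimal sketch: `executable_on_path?` is a hypothetical helper, not part of the gem, and it assumes a Unix-like `PATH` (it ignores Windows `PATHEXT` suffixes).

```ruby
# Hypothetical pre-install check (not part of the gem): report whether the
# build prerequisites named above can be found on PATH.
def executable_on_path?(name, path: ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    # Only regular files with the executable bit set count as a match.
    File.file?(candidate) && File.executable?(candidate)
  end
end

missing = %w[curl cmake].reject { |tool| executable_on_path?(tool) }

if missing.empty?
  puts "curl and cmake found"
else
  puts "Missing build tools: #{missing.join(', ')}"
end
```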
Then you can install the gem from [RubyGems.org](https://rubygems.org/gems/compact_enc_det).
> Either add this to your Gemfile:
>
> ```ruby
> gem 'compact_enc_det', '~> 1.0'
> ```
> or run the following command to install it:
>
> ```console
> gem install compact_enc_det
> ```
Now you can detect the encoding with `CompactEncDet.detect_encoding`, a thin wrapper around the `CompactEncDet::DetectEncoding` and `MimeEncodingName` functions from the C++ library.
> ```ruby
> file = File.read("unknown-encoding.txt", mode: "rb")
> result = CompactEncDet.detect_encoding(file)
> result.encoding
> # => #<Encoding:...>
> result.bytes_consumed
> # => 239
> result.is_reliable?
> # => true
> ```
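
A common next step is to transcode the raw bytes to UTF-8 once the encoding is known. The sketch below is one possible approach, not something the gem provides: `transcode_to_utf8` is a hypothetical helper, and it assumes `result.encoding` is a value that `String#force_encoding` accepts (an `Encoding` object or an encoding name). When the detector does not mark the result as reliable, it falls back to treating the input as binary, so non-ASCII bytes become replacement characters.

```ruby
# Hypothetical helper (not part of the gem): transcode raw bytes to UTF-8
# using a result object that responds to #encoding and #is_reliable?,
# such as the one returned by CompactEncDet.detect_encoding.
def transcode_to_utf8(bytes, result, fallback: Encoding::BINARY)
  # Trust the detected encoding only when the detector reports it as reliable.
  source = result.is_reliable? ? result.encoding : fallback

  # Work on a copy so the caller's string keeps its original encoding tag.
  tagged = bytes.dup.force_encoding(source)

  # Replace bytes that cannot be converted instead of raising.
  tagged.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

# Usage, assuming the gem is already loaded:
#   raw  = File.read("unknown-encoding.txt", mode: "rb")
#   text = transcode_to_utf8(raw, CompactEncDet.detect_encoding(raw))
```

Checking `is_reliable?` first is a design choice for this sketch; callers who prefer a best-effort guess could use `result.encoding` unconditionally.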
## Contributing
Any contributions are welcome! Feel free to open an issue or a pull request.
### Repository
The [google/compact_enc_det](https://github.com/google/compact_enc_det) repository is linked as a Git submodule at `ext/compact_enc_det/compact_enc_det`.
> You need to clone the repository with the `--recurse-submodules` flag:
>
> ```console
> git clone --recurse-submodules git@github.com:cloudaper/compact_enc_det.git
> ```
>
> Or initialize and update the submodule after cloning with the following commands:
>
> ```console
> git submodule init && git submodule update
> ```
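
If the build fails right after cloning, one thing worth checking is whether the submodule directory was actually populated. This is a small sketch under that assumption: `submodule_checked_out?` is a hypothetical helper, not part of the repository, and it only tests that the path named above exists and is not empty.

```ruby
# Hypothetical check (not part of the repository): report whether the
# C++ library submodule has been checked out under the given project root.
SUBMODULE_PATH = File.join("ext", "compact_enc_det", "compact_enc_det")

def submodule_checked_out?(root = Dir.pwd)
  dir = File.join(root, SUBMODULE_PATH)
  # An uninitialized submodule is either missing or an empty directory.
  Dir.exist?(dir) && !Dir.empty?(dir)
end

unless submodule_checked_out?
  puts "Submodule not found; run: git submodule init && git submodule update"
end
```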
### Testing
Tests are located in `tests` and use the [minitest](https://github.com/minitest/minitest) framework.
> Run the tests via the `test` Rake task:
>
> ```console
> rake test
> ```
>
> The gem will be compiled to `lib/compact_enc_det/compact_enc_det.bundle` first.
## License
This gem is released under the [MIT license](LICENSE), while Google's original [Compact Encoding Detection library](https://github.com/google/compact_enc_det) source code, located at `ext/compact_enc_det/compact_enc_det`, is under the [Apache-2.0](LICENSE-APACHE) license.