https://github.com/willf/compression_classification
Using compression to classify
https://github.com/willf/compression_classification
Last synced: about 1 year ago
JSON representation
Using compression to classify
- Host: GitHub
- URL: https://github.com/willf/compression_classification
- Owner: willf
- Created: 2023-07-13T23:55:08.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-08-06T22:29:36.000Z (almost 3 years ago)
- Last Synced: 2025-04-10T20:13:00.069Z (about 1 year ago)
- Language: Python
- Size: 2.41 MB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- Contributing: contributing.md
- Code of conduct: code-of-conduct.md
- Security: SECURITY.md
Awesome Lists containing this project
README
# compression_classification
[](https://github.com/willf/compression_classification/actions/workflows/test.yml)
Compression Classification is a Python package for classifying via compression.
It is inspired by my talk on "[Stupid Language Tricks](https://www.entish.org/lang-id-preso/lang-id-preso.pdf)" and [“Low-Resource” Text Classification: A Parameter-Free Classification
Method with Compressors](https://aclanthology.org/2023.findings-acl.426.pdf)
Simple example:
```python
from compression_classification import compression_classification
clr = compression_classification.CompressionClassifier()
clr.train("FilterGenie 的基础设施旨在处理大量数据而不影响性能。 无论您拥有小型项目还是大型企业应用程序,我们 的 API 都可以轻松扩展以满足您的需求。", "zh")
clr.train("FilterGenie's infrastructure is built to handle high volumes of data without compromising performance. Whether you have a small-scale project or a large enterprise application, our API scales effortlessly to meet your needs.", "en")
clr.predict("This is the day they give babies away")
'en'
clr.predict("这一天是他们送孩子的日子")
'zh'
```
In general, you'll want a lot more data, though.
## Contributing
We welcome contributions to compression_classification. Please see our [contributing guidelines](contributing.md) for more information.
To install the package for development, install [poetry](https://python-poetry.org/) and then run:
```bash
gh repo clone willf/compression_classification
cd compression_classification
poetry install
poetry shell
```
## Code of Conduct
We expect project participants to adhere to our [Code of Conduct](code-of-conduct.md).