https://github.com/markivory2973/tg-spam
Chinese Telegram spam messages classifier in PyTorch.
- Host: GitHub
- URL: https://github.com/markivory2973/tg-spam
- Owner: MarkIvory2973
- License: gpl-3.0
- Created: 2025-08-05T06:57:28.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-08-07T04:38:47.000Z (11 months ago)
- Last Synced: 2025-08-18T08:12:42.876Z (11 months ago)
- Topics: lstm, python, pytorch, spam, telegram
- Language: Python
- Homepage:
- Size: 527 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# TG Spam
Chinese Telegram spam messages classifier in PyTorch.
## Installation
Create an environment (Python **3.13**):
```bash
conda create -n tg-spam python=3.13
conda activate tg-spam
```
Install dependencies:
- For CPU:
```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
```
- For NVIDIA GPU (replace `cuxxx` with the tag for your CUDA version):
```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cuxxx
```
Then, for either option, install the remaining dependencies:
```bash
pip install click matplotlib
```
Clone this repository:
```bash
git clone https://github.com/MarkIvory2973/tg-spam.git
```
## Usage
```bash
cd tg-spam
python src/cli.py train --batch-size 32 --learning-rate 0.001 --gamma 0.9 --epochs 25
python src/cli.py result
python src/cli.py prompt --epoch 5 --input "缺几个敢拼的兄弟😊,跟我干通宵,下个月路虎开回家,说到做到,看煮也"
python src/cli.py prompt --epoch 5 --input "网络是个神奇的地方hy2套cdn都出来了"
```
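To classify several messages from a script, the `prompt` command can be invoked once per message. The following is a minimal sketch and not part of this repository: `build_prompt_command` and `classify` are hypothetical helper names, only the flags documented in this README are used, and because the output format of `cli.py` is not described here, the raw stdout is returned unparsed.

```python
import subprocess
import sys


def build_prompt_command(text: str, epoch: int = 5, root: str = "./data/") -> list[str]:
    """Build the argument list for `python src/cli.py prompt ...`.

    Passing arguments as a list avoids shell quoting problems with
    emoji, spaces, and punctuation in the message text.
    """
    return [
        sys.executable, "src/cli.py", "prompt",
        "--root", root,
        "--epoch", str(epoch),
        "--input", text,
    ]


def classify(text: str, epoch: int = 5) -> str:
    """Run the CLI (assumed to be started from the repository root).

    Returns whatever the CLI prints, stripped of surrounding whitespace.
    Raises subprocess.CalledProcessError if the CLI exits with an error.
    """
    result = subprocess.run(
        build_prompt_command(text, epoch),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `classify("网络是个神奇的地方hy2套cdn都出来了")` from inside the `tg-spam` folder should then return the same text that the equivalent shell command prints.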
## Parameters
Train mode:
|Parameter|Required|Default|Description|
|:-|:-:|:-|:-|
|--root|-|./data/|Folder containing the datasets and checkpoints|
|--batch-size|-|32|Batch size|
|--learning-rate|-|0.001|Learning rate for the Adam optimizer|
|--gamma|-|0.9|Decay factor (gamma) for the ExponentialLR scheduler|
|--epochs|-|25|Total number of training epochs|
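For intuition about how `--learning-rate` and `--gamma` interact: PyTorch's `ExponentialLR` multiplies the learning rate by `gamma` each time the scheduler steps. The sketch below is plain Python rather than code from this repository, and it assumes the scheduler steps once per epoch; `lr_schedule` is a hypothetical helper name.

```python
def lr_schedule(initial_lr: float, gamma: float, epochs: int) -> list[float]:
    """Learning rate used in each epoch under exponential decay.

    Assumes one scheduler step per epoch (an assumption, not confirmed
    by this repository), so lr_e = initial_lr * gamma ** e.
    """
    return [initial_lr * gamma ** epoch for epoch in range(epochs)]


if __name__ == "__main__":
    # Default values of --learning-rate, --gamma, and --epochs.
    schedule = lr_schedule(0.001, 0.9, 25)
    print(f"first epoch: {schedule[0]:.6f}")
    print(f"last epoch:  {schedule[-1]:.6f}")
```

Under that assumption, the default settings leave the learning rate in the final epoch at roughly 8% of its initial value, so a smaller `--gamma` or a larger `--epochs` makes late epochs contribute very small updates.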
Result mode:
|Parameter|Required|Default|Description|
|:-|:-:|:-|:-|
|--root|-|./data/|Folder containing the datasets and checkpoints|
Prompt mode:
|Parameter|Required|Default|Description|
|:-|:-:|:-|:-|
|--root|-|./data/|Folder containing the datasets and checkpoints|
|--epoch|-|5|Epoch of the checkpoint to use|
|--input|✓|-|Text to be classified|
## Result
||Accuracy (%)|
|:-:|-:|
|Eval|91.22|
## References
[1] [Long Short-term Memory RNN](https://arxiv.org/abs/2105.06756)
[2] [reatiny/chinese-spam-10000](https://huggingface.co/datasets/reatiny/chinese-spam-10000)
[3] [paulkm/chinese_conversation_and_spam](https://huggingface.co/datasets/paulkm/chinese_conversation_and_spam)