https://github.com/markivory2973/tg-spam

Chinese Telegram spam messages classifier in PyTorch.

lstm python pytorch spam telegram

Host: GitHub
URL: https://github.com/markivory2973/tg-spam
Owner: MarkIvory2973
License: gpl-3.0
Created: 2025-08-05T06:57:28.000Z (11 months ago)
Default Branch: main
Last Pushed: 2025-08-07T04:38:47.000Z (11 months ago)
Last Synced: 2025-08-18T08:12:42.876Z (11 months ago)
Topics: lstm, python, pytorch, spam, telegram
Language: Python
Homepage:
Size: 527 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# TG Spam

Chinese Telegram spam messages classifier in PyTorch.

## Installation

Create an environment (Python **3.13**):

```bash
conda create -n tg-spam python=3.13
conda activate tg-spam
```

Install dependencies:

- For CPU:

  ```bash
  pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
  ```

- For NVIDIA GPU (replace `cuxxx` with the tag for your CUDA version):

  ```bash
  pip install torch torchvision --index-url https://download.pytorch.org/whl/cuxxx
  ```

Then install the remaining packages:

```bash
pip install click matplotlib
```

Clone this repository:

```bash
git clone https://github.com/MarkIvory2973/tg-spam.git
```

## Usage

```bash
cd tg-spam
python src/cli.py train --batch-size 32 --learning-rate 0.001 --gamma 0.9 --epochs 25
python src/cli.py result
python src/cli.py prompt --epoch 5 --input "缺几个敢拼的兄弟😊，跟我干通宵，下个月路虎开回家，说到做到，看煮也"
python src/cli.py prompt --epoch 5 --input "网络是个神奇的地方hy2套cdn都出来了"
```

## Parameters

Train mode:

|Parameter|Required|Default|Description|
|:-|:-:|:-|:-|
|--root|-|./data/|Folder containing the datasets and checkpoints|
|--batch-size|-|32|Batch size used during training|
|--learning-rate|-|0.001|Learning rate for the Adam optimizer|
|--gamma|-|0.9|Decay factor (gamma) for the ExponentialLR scheduler|
|--epochs|-|25|Total number of training epochs|
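
To make the interaction between `--learning-rate`, `--gamma`, and `--epochs` concrete, here is a minimal sketch of the schedule an exponential decay produces with the default values. It assumes the scheduler is stepped once per epoch, which this README does not state. The helper `exponential_lr` is a hypothetical name for illustration and is not part of this repository; it only reproduces the arithmetic `lr * gamma ** epoch` in plain Python.

```python
def exponential_lr(initial_lr: float, gamma: float, epochs: int) -> list[float]:
    """Return the learning rate used in each epoch under exponential decay.

    Assumption: the rate is multiplied by `gamma` once after every epoch.
    """
    return [initial_lr * gamma ** epoch for epoch in range(epochs)]


# Defaults taken from the train-mode parameter table.
schedule = exponential_lr(initial_lr=0.001, gamma=0.9, epochs=25)

for epoch in (0, 1, 5, 24):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

With these defaults the rate falls from 0.001 in the first epoch to roughly 8e-5 in the last one, so later epochs take much smaller steps.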

Result mode:

|Parameter|Required|Default|Description|
|:-|:-:|:-|:-|
|--root|-|./data/|Folder containing the datasets and checkpoints|

Prompt mode:

|Parameter|Required|Default|Description|
|:-|:-:|:-|:-|
|--root|-|./data/|Folder containing the datasets and checkpoints|
|--epoch|-|5|Epoch of the saved model to use|
|--input|✓|-|Text to be classified|
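
The three tables above can also be read as a single command-line interface with three subcommands. The sketch below restates them with the standard library's `argparse` so that it runs without extra packages; the project itself installs `click`, and this is not a copy of `src/cli.py`. The function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the train/result/prompt tables (illustrative only)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    modes = parser.add_subparsers(dest="mode", required=True)

    # --root is shared by all three modes.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--root", default="./data/")

    train = modes.add_parser("train", parents=[common])
    train.add_argument("--batch-size", type=int, default=32)
    train.add_argument("--learning-rate", type=float, default=0.001)
    train.add_argument("--gamma", type=float, default=0.9)
    train.add_argument("--epochs", type=int, default=25)

    modes.add_parser("result", parents=[common])

    prompt = modes.add_parser("prompt", parents=[common])
    prompt.add_argument("--epoch", type=int, default=5)
    prompt.add_argument("--input", required=True)  # the only required option

    return parser


# Omitted options fall back to the defaults listed in the tables.
args = build_parser().parse_args(["prompt", "--input", "hello"])
print(args.mode, args.root, args.epoch, args.input)
```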

## Result

|Split|Accuracy (%)|
|:-:|-:|
|Eval|91.22|
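
For readers unfamiliar with the metric, the following sketch shows how an accuracy percentage like the one above is typically computed, assuming accuracy means the share of evaluation messages whose predicted label matches the true label. The function `accuracy_percent` and the toy labels are hypothetical; they are not the project's evaluation code or data.

```python
def accuracy_percent(predicted: list[int], actual: list[int]) -> float:
    """Return the percentage of positions where predicted == actual."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Toy labels for illustration only (1 = spam, 0 = not spam).
predicted = [1, 0, 1, 1, 0, 0, 1, 0]
actual = [1, 0, 0, 1, 0, 1, 1, 0]

score = accuracy_percent(predicted, actual)
print(f"{score:.2f}%")  # 6 of 8 labels match, so this prints 75.00%
```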

## References

[1] [Long Short-term Memory RNN](https://arxiv.org/abs/2105.06756)

[2] [reatiny/chinese-spam-10000](https://huggingface.co/datasets/reatiny/chinese-spam-10000)

[3] [paulkm/chinese_conversation_and_spam](https://huggingface.co/datasets/paulkm/chinese_conversation_and_spam)
