https://github.com/zain-ul-din/urdu_pleg_checker

The Model is an advanced AI solution that efficiently detects and eliminates plagiarism in Urdu text.

# Urdu Plagiarism Checker

The UrduPlagCheckAI model is intended as a contribution to Urdu literature (Urdu adab). It is an AI model that detects plagiarism, or text reuse, in Urdu text.

## Usage

1. Upload `LSTM.ipynb` to Google Colab.
2. Run all cells.
3. Download the trained model and use it anywhere.

## Examples

*Text reuse is the act of composing new text from a previously published text*

![image](https://github.com/Zain-ul-din/Urdu_Pleg_Checker/assets/78583049/a6d28901-44c7-4dc1-81a3-507a9d4c6a44)
![image](https://github.com/Zain-ul-din/Urdu_Pleg_Checker/assets/78583049/2098c8ad-808a-40b0-8a83-748d8e9a14b0)
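
To make the definition of text reuse above more concrete, here is a minimal sketch that scores reuse as the Jaccard overlap of word trigrams between two texts. This is a simple illustrative baseline, not the LSTM model from this repository. The function names (`word_ngrams`, `reuse_score`) and the sample sentences are assumptions introduced for the example, and whitespace splitting is a simplification of real Urdu tokenization.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in a whitespace-tokenized text."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def reuse_score(source, candidate, n=3):
    """Jaccard overlap of word n-grams, from 0.0 (nothing shared) to 1.0 (identical sets)."""
    a, b = word_ngrams(source, n), word_ngrams(candidate, n)
    if not a or not b:
        # Texts shorter than n words have no n-grams to compare.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample sentences, used only for illustration.
source = "یہ ایک سادہ مثال ہے جو متن کے دوبارہ استعمال کو دکھاتی ہے"
partial = "یہ ایک سادہ مثال ہے"
unrelated = "آج موسم بہت خوشگوار ہے اور ہوا چل رہی ہے"

if __name__ == "__main__":
    print(reuse_score(source, source))     # identical text
    print(reuse_score(source, partial))    # partly reused text
    print(reuse_score(source, unrelated))  # unrelated text
```

A surface-overlap score like this only catches verbatim copying. Paraphrased reuse is the kind of case a learned model is presumably meant to handle.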

## Model Training Results

**Model Accuracy**

![Accuracy](https://github.com/Zain-ul-din/Urdu_Pleg_Checker/assets/78583049/8f0aedc2-fcfc-41b2-901a-b9136036cc58)

**Model Loss**

![Loss](https://github.com/Zain-ul-din/Urdu_Pleg_Checker/assets/78583049/e47937ff-3205-406e-b015-ccf57c586d19)

## Tools

The model relies on the following tools:

- `urduhack`: Python library for Urdu text preprocessing and normalization.
- `nlu`: Python library for natural language understanding.
- `gensim`: Python library for topic modeling and document similarity.
- `Keras`: Deep learning library for neural network models.
- `TensorFlow`: An open-source deep learning framework for building and training neural networks.
- `scikit-learn`: Python library for machine learning.
- `matplotlib` and `seaborn`: Python libraries for data visualization.
- `pandas` and `numpy`: Python libraries for data manipulation and computation.
- `Word2Vec`: Algorithm for generating word embeddings.
- `Tokenizer` and `pad_sequences`: Keras utilities for text tokenization and sequence padding.
- `EarlyStopping` and `ModelCheckpoint`: Keras callbacks for training control.
- `TensorBoard`: Visualization tool for monitoring model training in TensorFlow.
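
The tokenization and padding step in the list above can be sketched in plain Python. This is a rough illustration of the idea, not the Keras API and not necessarily what `LSTM.ipynb` does. The helper names (`build_vocab`, `encode`, `pad`) and the sample sentences are assumptions. Unlike the Keras `Tokenizer`, which orders word ids by frequency, this sketch assigns ids in order of first appearance. Padding and truncation happen on the left, which matches the default behavior of `pad_sequences`.

```python
def build_vocab(texts):
    """Map each distinct word to an integer id starting at 1 (0 is reserved for padding)."""
    vocab = {}
    for text in texts:
        for word in text.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def encode(text, vocab):
    """Convert a text to a list of word ids, skipping words missing from the vocabulary."""
    return [vocab[word] for word in text.split() if word in vocab]


def pad(sequence, max_len, value=0):
    """Left-pad or left-truncate a sequence so it has exactly max_len items."""
    if len(sequence) > max_len:
        # Keep only the last max_len items.
        sequence = sequence[len(sequence) - max_len:]
    return [value] * (max_len - len(sequence)) + sequence


# Hypothetical sample sentences, used only for illustration.
texts = ["یہ ایک مثال ہے", "یہ دوسری مثال ہے"]
vocab = build_vocab(texts)
encoded = [encode(text, vocab) for text in texts]
padded = [pad(seq, 6) for seq in encoded]

if __name__ == "__main__":
    for row in padded:
        print(row)
```

Fixed-length integer sequences like these are the usual input format for an embedding layer followed by an LSTM.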


## Internal Architecture

![image](https://github.com/Zain-ul-din/Urdu_Pleg_Checker/assets/78583049/03787815-e42a-4c0e-b6f8-c2e49badda54)

[Need help? Open an issue.](https://github.com/Zain-ul-din/Urdu_Pleg_Checker/issues)