https://github.com/msikorski93/spam-detection-with-lstm-polish

Detecting spam (a typical binary classification problem) on Polish emails.

emails embeddings lstm-neural-networks nlp polish-language spam-detection tensorflow word2vec

# Spam-Detection-With-LSTM-Polish
![License: MIT](https://img.shields.io/badge/license-MIT-green?style=&logo=)
![Jupyter](https://img.shields.io/badge/-Jupyter-F37626?logo=Jupyter&logoColor=white)
![pandas](https://img.shields.io/badge/-pandas-150458?logo=Pandas&logoColor=white)
![TensorFlow](https://img.shields.io/badge/-TensorFlow-FF6F00?logo=TensorFlow&logoColor=white)
![Keras](https://img.shields.io/badge/-Keras-D00000?logo=Keras&logoColor=white)

This notebook demonstrates a binary classification problem, spam detection, on a dataset of Polish emails. To complete the task we developed an LSTM model, a specific type of recurrent neural network (RNN). The model proved highly effective for this natural language processing (NLP) task, achieving an overall accuracy of 97.06%. Before training the neural network, we used the word2vec technique to convert the email text into embeddings, using a pre-trained model for Polish developed by the Polish Academy of Sciences. The neural network was evaluated with standard plots.
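
The embedding step described above can be illustrated with a minimal, dependency-free sketch. It is not the notebook's actual code: the toy `PRETRAINED` dictionary stands in for the real pre-trained Polish word2vec vectors, and the helper names (`tokenize`, `build_vocab`, `encode`, `build_embedding_matrix`), the naive whitespace tokenizer, and the right-padding choice are all assumptions made for illustration. The sketch shows how emails might be turned into fixed-length integer sequences and how a matching embedding matrix could be assembled from pre-trained vectors.

```python
# Toy stand-in for a pre-trained Polish word2vec model: word -> vector.
# Real vectors would be loaded from the pre-trained model file; these
# values are made up for illustration.
PRETRAINED = {
    "wygraj": [0.9, 0.1, 0.0],
    "nagrodę": [0.8, 0.2, 0.1],
    "spotkanie": [0.1, 0.9, 0.3],
    "jutro": [0.0, 0.7, 0.5],
}
EMBEDDING_DIM = 3
PAD_INDEX = 0  # index 0 is reserved for padding


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocab(texts):
    """Assign each distinct token an integer index, starting at 1."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def encode(text, vocab, max_len):
    """Map tokens to indices, then truncate or right-pad to max_len."""
    ids = [vocab[t] for t in tokenize(text) if t in vocab][:max_len]
    return ids + [PAD_INDEX] * (max_len - len(ids))


def build_embedding_matrix(vocab, pretrained, dim):
    """Row i holds the pre-trained vector of the token with index i.

    Row 0 (padding) and tokens missing from the pre-trained model
    stay all-zero.
    """
    matrix = [[0.0] * dim for _ in range(len(vocab) + 1)]
    for token, index in vocab.items():
        if token in pretrained:
            matrix[index] = list(pretrained[token])
    return matrix


emails = ["Wygraj nagrodę jutro", "Spotkanie jutro rano"]
vocab = build_vocab(emails)
sequences = [encode(e, vocab, max_len=5) for e in emails]
embedding_matrix = build_embedding_matrix(vocab, PRETRAINED, EMBEDDING_DIM)

print(sequences)
print(len(embedding_matrix), "rows x", EMBEDDING_DIM, "columns")
```

In a Keras setup, a matrix like this would typically initialize an `Embedding` layer placed in front of the LSTM layer, followed by a single sigmoid output unit for the spam / not-spam decision. Whether the notebook wires it exactly this way is an assumption.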