https://github.com/tushar50896/cuss_inspect
A simple yet powerful Python library to detect toxicity/profanity in a review or a list of reviews.
- Host: GitHub
- URL: https://github.com/tushar50896/cuss_inspect
- Owner: tushar50896
- License: mit
- Created: 2020-10-12T08:16:52.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2020-12-02T06:46:46.000Z (over 5 years ago)
- Last Synced: 2025-05-28T05:09:19.100Z (about 1 year ago)
- Topics: abusive-language-detection, cusswords, logistic-regression, profanity, profanity-detection, python, review-checks, scikit-learn, swearing-detector, toxic-comment-classification
- Language: Python
- Homepage: https://pypi.org/project/cuss-inspect/
- Size: 1.75 MB
- Stars: 10
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# cuss_inspect


A simple yet powerful library to predict toxicity/profanity of a review/comment or list of reviews/comments.
## How It Works
`cuss_inspect` is a logistic-regression-based model trained on 180K+ reviews and tested on 24K+ reviews. The library does not use any specific word list or swear-word list, yet it is able to detect most swear words.
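
To illustrate how a logistic regression classifier can pick up offensive words without a hand-written word list, here is a minimal from-scratch sketch. It is not the library's code: the bag-of-words features, the training loop, the toy data, and the names `tokenize`, `train`, and `toxicity_prob` are all assumptions made for illustration, and it avoids scikit-learn only to stay dependency-free.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(weights, bias, tokens):
    """Linear score: bias plus the learned weight of every token occurrence."""
    return bias + sum(weights.get(tok, 0.0) for tok in tokens)

def train(texts, labels, epochs=200, lr=0.5):
    """Fit one weight per word with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            error = sigmoid(score(weights, bias, tokens)) - label
            bias -= lr * error
            for tok in tokens:
                weights[tok] = weights.get(tok, 0.0) - lr * error
    return weights, bias

def toxicity_prob(model, text):
    """Probability that `text` belongs to the toxic class (label 1)."""
    weights, bias = model
    return sigmoid(score(weights, bias, tokenize(text)))

# Hypothetical toy data: 1 = toxic, 0 = clean.
texts = [
    "you are a stupid idiot",
    "what a stupid review",
    "i hate this idiot",
    "thank you for the great work",
    "this is a good review",
    "you have done a great job",
]
labels = [1, 1, 1, 0, 0, 0]

model = train(texts, labels)
print(toxicity_prob(model, "stupid idiot"))
print(toxicity_prob(model, "great job"))
```

Words that occur only in toxic examples end up with positive weights, and words that occur only in clean examples end up with negative weights, so the model effectively learns its own word list from the labels.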
### Performance
| Library | 1 Prediction (ms) | 10 Predictions (ms) | 100 Predictions (ms) | 1000 Predictions (ms) | 10000 Predictions (ms) |
| --- | --- | --- | --- | --- | --- |
| cuss_inspect | 0.2 | 0.3 | 0.8 | 4.3 | 24.7 |
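
The README does not say how these figures were measured. As a rough sketch, assuming each column is the wall-clock time of one batched call, timings of this kind could be collected as follows. `fake_predict` is a hypothetical stand-in so the snippet runs without the library installed; passing the real `predict` function instead would time the actual model.

```python
import time

def fake_predict(texts):
    """Hypothetical stand-in for a real classifier: one label per text."""
    return [0 for _ in texts]

def time_batches(predict_fn, sizes=(1, 10, 100, 1000, 10000), repeats=5):
    """Return the best wall-clock time in milliseconds for each batch size."""
    timings = {}
    for size in sizes:
        batch = ["sample review text"] * size
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            predict_fn(batch)
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            best = min(best, elapsed_ms)
        timings[size] = best
    return timings

timings = time_batches(fake_predict)
for size, ms in timings.items():
    print(f"{size:>6} predictions: {ms:.3f} ms")
```

Taking the best of several repeats reduces noise from other processes; the absolute numbers depend on the machine.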
### Accuracy
The accuracy, precision, and recall compare well with other models. For text classification, logistic regression often outperforms other classification algorithms such as SVC, Decision Tree, and Naive Bayes.

| | Precision | Recall | F1 Score |
| --- | --- | --- | --- |
| 0 | 0.83 | 0.94 | 0.88 |
| 1 | 0.99 | 0.96 | 0.97 |
| Accuracy | | | 0.95 |
| macro avg | 0.91 | 0.95 | 0.93 |
| weighted avg | 0.96 | 0.96 | 0.96 |
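
For readers unfamiliar with these metrics, the sketch below shows how precision, recall, and F1 are defined, and checks that the F1 and macro-average figures in the table follow from its per-class precision and recall. The helper names and the small `y_true`/`y_pred` lists are illustrative assumptions. The weighted average is not recomputed because the per-class sample counts are not given.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for one class from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)

# Hypothetical labels, just to exercise the definitions.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
toy_metrics = precision_recall_f1(y_true, y_pred, positive=1)

# Per-class (precision, recall) copied from the table above.
table = {0: (0.83, 0.94), 1: (0.99, 0.96)}
f1_by_class = {cls: f1_from(p, r) for cls, (p, r) in table.items()}

# Macro average: unweighted mean over the two classes.
macro_precision = sum(p for p, _ in table.values()) / len(table)
macro_recall = sum(r for _, r in table.values()) / len(table)
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)

print({cls: round(f1, 2) for cls, f1 in f1_by_class.items()})
print(round(macro_precision, 2), round(macro_recall, 2), round(macro_f1, 2))
```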
### Receiver Operating Characteristics

## Installation
```
$ pip install cuss_inspect
```
## Usage
```python
from cuss_inspect import predict, predict_prob
# for simple string
text_0 = "this is simple review. you have done a good job"
print(predict(text_0))
# [0]
print(predict_prob(text_0))
# [0.05]
text_1 = "son of a bitch"
print(predict(text_1))
# [1]
print(predict_prob(text_1))
# [1.]
# for list of inputs
test = ['who are you?' , 'what do you want?' , 'son of a dog' , 'how the hell can you say that' , 'fuck it']
print(predict(test))
# [0 0 1 1 1]
print(predict_prob(test))
# [0.12 0.22 0.55 0.96 1.]
```
`predict()` and `predict_prob()` return [`numpy`](https://pypi.org/project/numpy/) arrays.
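
The probabilities can also be thresholded manually when a stricter or looser cutoff is wanted. The sketch below reuses the values from the list example above as plain Python floats, so it runs without the library installed. The `flag` helper is hypothetical, and the 0.5 default is an assumption: it is consistent with the outputs shown above, but the README does not state the cutoff `predict()` uses.

```python
# Inputs and probabilities copied from the list example above.
test = ['who are you?', 'what do you want?', 'son of a dog',
        'how the hell can you say that', 'fuck it']
probs = [0.12, 0.22, 0.55, 0.96, 1.0]

def flag(probabilities, threshold=0.5):
    """Mark each probability as toxic (True) when it reaches the threshold."""
    return [p >= threshold for p in probabilities]

default_flags = flag(probs)                 # assumed default cutoff of 0.5
strict_flags = flag(probs, threshold=0.9)   # only flag high-confidence cases

for text, is_toxic in zip(test, strict_flags):
    print(f"{is_toxic!s:>5}  {text}")
```

With the stricter cutoff, the borderline input scored at 0.55 is no longer flagged.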