https://github.com/tiendc/go-profanity-out
Profanity detection library
https://github.com/tiendc/go-profanity-out
badword-censor badword-filter badwords go golang profanity-check profanity-detection profanity-filter profanityfilter swear swear-filter swearing-detector
Last synced: about 2 months ago
JSON representation
Profanity detection library
- Host: GitHub
- URL: https://github.com/tiendc/go-profanity-out
- Owner: tiendc
- License: mit
- Created: 2025-05-29T18:55:56.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-14T11:43:39.000Z (12 months ago)
- Last Synced: 2025-06-14T12:33:35.127Z (12 months ago)
- Topics: badword-censor, badword-filter, badwords, go, golang, profanity-check, profanity-detection, profanity-filter, profanityfilter, swear, swear-filter, swearing-detector
- Language: Go
- Homepage:
- Size: 14.6 KB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
[![Go Version][gover-img]][gover] [![GoDoc][doc-img]][doc] [![Build Status][ci-img]][ci] [![Coverage Status][cov-img]][cov] [![GoReport][rpt-img]][rpt]
# Profanity detection library
This project is inspired by [github.com/TwiN/go-away](https://github.com/TwiN/go-away) and
[github.com/finnbear/moderation](https://github.com/finnbear/moderation). It also uses
language data from the libs (with modifications).
This project is still in development and more tests are needed to ensure the accuracy.
However, you may use it in your work as it can produce good results.
## Highlights
- Fully supports Unicode
- Utilizes radix tree to improve performance
## Installation
```shell
go get github.com/tiendc/go-profanity-out
```
## How to use
```go
import (
profanityout "github.com/tiendc/go-profanity-out"
profanityDataEN "github.com/tiendc/go-profanity-out/data/en"
)
detector := profanityout.NewProfanityDetector().
WithProfaneWords(profanityDataEN.DefaultProfanities). // required
WithFalsePositiveWords(profanityDataEN.DefaultFalsePositives). // required
WithSuspectWords(profanityDataEN.DefaultSuspects). // required
WithLeetSpeakCharacters(profanityDataEN.LeetSpeakCharacters). // required
WithSpecialCharacters(profanityDataEN.SpecialCharacters). // required
WithWildcardCharacters(profanityDataEN.WildcardCharacters). // required
WithSanitizeLeetSpeak(true). // default: true
WithSanitizeSpecialCharacters(true). // default: true
WithSanitizeSpaces(true). // default: true
WithSanitizeRepeatedCharacters(true). // default: true
WithSanitizeWildcardCharacters(true). // default: true
WithSanitizeAccents(true). // default: true
WithProcessInputAsHTML(false). // default: false
WithConfidenceCalculator(calculator). // default: built-in
WithCensorCharacter('*') // default: *
// Scan for at most one profanity (result may contain found suspect words and/or false positives)
matches := detector.ScanProfanity("fuck this $h!!t") // profane: true
// Scan for all profanities
matches := detector.ScanAllProfanities("fuck this $h!!t") // profane: true
// Censor the profanities
res, matches := detector.Censor("fuck this $h!!t") // res == "**** this *****"
// WithSanitizeLeetSpeak: true
ScanProfanity("$h!t") // profane: true
// WithSanitizeLeetSpeak: false
ScanProfanity("$h!t") // profane: false
// WithSanitizeSpecialCharacters: true
ScanProfanity("sh_it") // profane: true
// WithSanitizeSpecialCharacters: false
ScanProfanity("sh_it") // profane: false
// WithSanitizeSpaces: true
ScanProfanity("f u c k") // profane: true
// WithSanitizeSpaces: false
ScanProfanity("f u c k") // profane: false
// WithSanitizeRepeatedCharacters: true
ScanProfanity("fuuuuck") // profane: true
// WithSanitizeRepeatedCharacters: false
ScanProfanity("fuuuuck") // profane: false
// WithSanitizeWildcardCharacters: true
// NOTE: wildcard characters can be in both input and/or profanity dictionary
ScanProfanity("f**k") // profane: true
WithProfaneWords([]string{"f*ck"}).ScanProfanity("fxck") // profane: true
WithProfaneWords([]string{"*fuck*"}).ScanProfanity("xfuckx") // profane: true
// WithSanitizeWildcardCharacters: false
ScanProfanity("f**k") // profane: false
ScanProfanity("fxck") // profane: false
ScanProfanity("xfuckx") // profane: false
// WithSanitizeAccents: true
ScanProfanity("fúck") // profane: true
// WithSanitizeAccents: false
ScanProfanity("fúck") // profane: false
// WithProcessInputAsHTML: true
ScanProfanity("<ock") // profane: true
// WithProcessInputAsHTML: false
ScanProfanity("<ock") // profane: false
```
## Benchmarks
[Benchmark code](https://gist.github.com/tiendc/bd5a0655ad07251f626402d819786d84)
```shell
tiendc/go-profanity-out
tiendc/go-profanity-out-10 9024 129919 ns/op 44038 B/op 306 allocs/op
TwiN/go-away
TwiN/go-away-10 2745 415685 ns/op 444899 B/op 498 allocs/op
finnbear/moderation
finnbear/moderation-10 15432 77601 ns/op 2496 B/op 22 allocs/op
```
## Help wanted
- You are welcome to make pull requests for new functions and bug fixes.
- It's really nice if you can add more input data for English and other languages.
## License
- [MIT License](LICENSE)
[doc-img]: https://pkg.go.dev/badge/github.com/tiendc/go-profanity-out
[doc]: https://pkg.go.dev/github.com/tiendc/go-profanity-out
[gover-img]: https://img.shields.io/badge/Go-%3E%3D%201.20-blue
[gover]: https://img.shields.io/badge/Go-%3E%3D%201.20-blue
[ci-img]: https://github.com/tiendc/go-profanity-out/actions/workflows/go.yml/badge.svg
[ci]: https://github.com/tiendc/go-profanity-out/actions/workflows/go.yml
[cov-img]: https://codecov.io/gh/tiendc/go-profanity-out/branch/main/graph/badge.svg
[cov]: https://codecov.io/gh/tiendc/go-profanity-out
[rpt-img]: https://goreportcard.com/badge/github.com/tiendc/go-profanity-out
[rpt]: https://goreportcard.com/report/github.com/tiendc/go-profanity-out