https://github.com/alanrva/homoglyphic
Homoglyphic makes it easy to find strings in a body of text that use homoglyphs to bypass ordinary string matching. It is useful for spam/phishing detection, content moderation, and scrubbing text used to train ML models.
- Host: GitHub
- URL: https://github.com/alanrva/homoglyphic
- Owner: AlanRVA
- License: apache-2.0
- Created: 2022-03-21T20:42:10.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2022-08-13T16:34:42.000Z (about 3 years ago)
- Last Synced: 2025-03-22T21:48:15.420Z (7 months ago)
- Topics: anti-spam, antispam, homoglyph, homoglyphs, moderation
- Language: C#
- Homepage:
- Size: 68.4 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Homoglyphic
[NuGet package](https://www.nuget.org/packages/homoglyphic/)
A .NET Standard 2 / C# library for working with homoglyphs (characters that look identical or similar but have different Unicode code points).
Homoglyphic makes it easy to find strings in a body of text that use homoglyphs to bypass ordinary string matching. It is useful for spam/phishing detection, content moderation, and scrubbing text used to train ML models.
This project was inspired by [Homoglyph](https://github.com/codebox/homoglyph), and the list of homoglyphs used in this project can be [found here](https://github.com/codebox/homoglyph/tree/master/raw_data/char_codes.txt).
## Installing
Install-Package Homoglyphic
## Usage
Homoglyphic consists of two main classes: `HomoglyphLoader` and `HomoglyphSearch`.
`HomoglyphLoader` accepts the path to a CSV file of homoglyph character sets and returns a list of hash sets, each representing one set of homoglyphs.
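To make the "list of hash sets" concrete, here is a minimal sketch of how such a file could be parsed. It is not the library's loader: `HomoglyphSetSketch` and `ParseSets` are hypothetical names, and the format is an assumption (one set per line, written as comma-separated hexadecimal code points, with `#` lines treated as comments). The CSV that Homoglyphic actually expects may differ.

```cs
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class HomoglyphSetSketch
{
    // Hypothetical helper, not part of Homoglyphic. Assumes each line holds
    // one homoglyph set as comma-separated hexadecimal code points.
    public static List<HashSet<int>> ParseSets(IEnumerable<string> lines)
    {
        var sets = new List<HashSet<int>>();
        foreach (var line in lines)
        {
            var trimmed = line.Trim();

            // Skip blank lines and '#' comment lines.
            if (trimmed.Length == 0 || trimmed[0] == '#') continue;

            // Convert every hex field into its integer code point.
            var codePoints = trimmed
                .Split(',')
                .Select(field => int.Parse(
                    field.Trim(),
                    NumberStyles.HexNumber,
                    CultureInfo.InvariantCulture));

            sets.Add(new HashSet<int>(codePoints));
        }
        return sets;
    }
}
```

For example, `HomoglyphSetSketch.ParseSets(new[] { "# sample", "0031,006c,0049" })` yields a single set holding the code points for `1`, `l` and `I`. Storing code points as `int` rather than `char` is a design choice of this sketch so that characters outside the Basic Multilingual Plane fit in one element.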
Once you have your list of homoglyph sets, use it to create an instance of the `HomoglyphSearch` class, which has a single method: `Search`. `Search` accepts the text to search and a string or list of strings to look for, and returns a `SearchResult` object for each occurrence of a search string found in the text.
```cs
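// List<string> requires `using System.Collections.Generic;`
var sets = HomoglyphLoader.LoadSets("homoglyphs.csv"); // load the homoglyph sets from the CSV file
var search = new HomoglyphSearch(sets);
var result = search.Search("Th1s Is A Test", new List<string> { "This", "Test" });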
```
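The idea behind this kind of search can be illustrated with a small self-contained sketch. It is not how `HomoglyphSearch` is implemented: `HomoglyphMatchSketch`, `Find` and the tuple result shape are hypothetical, and the sketch simply treats two characters as equal when they are identical or share a homoglyph set. It uses `char` for brevity, so characters outside the Basic Multilingual Plane are not handled, and matching is case-sensitive.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

public static class HomoglyphMatchSketch
{
    // True when a and b are the same character or appear together in one set.
    private static bool Equivalent(char a, char b, List<HashSet<char>> sets) =>
        a == b || sets.Any(set => set.Contains(a) && set.Contains(b));

    // Returns every (term, index) pair where the term matches the text at
    // that index once homoglyphs are treated as equal.
    public static List<(string Term, int Index)> Find(
        string text, IEnumerable<string> terms, List<HashSet<char>> sets)
    {
        var hits = new List<(string Term, int Index)>();
        foreach (var term in terms)
        {
            if (term.Length == 0) continue; // an empty term would match everywhere

            // Slide a window of the term's length across the text.
            for (int start = 0; start + term.Length <= text.Length; start++)
            {
                bool match = true;
                for (int i = 0; i < term.Length && match; i++)
                    match = Equivalent(text[start + i], term[i], sets);

                if (match) hits.Add((term, start));
            }
        }
        return hits;
    }
}
```

With a single illustrative set containing `i` and `1`, calling `HomoglyphMatchSketch.Find("Th1s Is A Test", new[] { "This", "Test" }, sets)` returns two hits: `This` at index 0 (where `1` stands in for `i`) and `Test` at index 10. This brute-force scan is quadratic in the worst case; it is meant to show the matching rule, not to be fast.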