{"id":23436635,"url":"https://github.com/uvasoftware/yara-language-nsfw","last_synced_at":"2026-02-01T15:03:48.237Z","repository":{"id":139723098,"uuid":"60058770","full_name":"uvasoftware/yara-language-nsfw","owner":"uvasoftware","description":"Lists of not-suitable-for-work words as YARA rules","archived":false,"fork":false,"pushed_at":"2026-01-31T14:12:34.000Z","size":229,"stargazers_count":29,"open_issues_count":1,"forks_count":6,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-01T00:59:21.119Z","etag":null,"topics":["nsfw","yara","yara-rules"],"latest_commit_sha":null,"homepage":"","language":"YARA","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/uvasoftware.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2016-05-31T04:43:02.000Z","updated_at":"2025-11-19T11:15:18.000Z","dependencies_parsed_at":"2025-04-09T18:02:54.501Z","dependency_job_id":null,"html_url":"https://github.com/uvasoftware/yara-language-nsfw","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/uvasoftware/yara-language-nsfw","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uvasoftware%2Fyara-language-nsfw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uvasoftware%2Fyara-language-nsfw/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uvasoftware%2Fyara-language-nsfw/releases","manifests_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uvasoftware%2Fyara-language-nsfw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/uvasoftware","download_url":"https://codeload.github.com/uvasoftware/yara-language-nsfw/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uvasoftware%2Fyara-language-nsfw/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28980855,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-01T13:38:33.235Z","status":"ssl_error","status_checked_at":"2026-02-01T13:38:32.912Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nsfw","yara","yara-rules"],"created_at":"2024-12-23T13:20:14.928Z","updated_at":"2026-02-01T15:03:48.209Z","avatar_url":"https://github.com/uvasoftware.png","language":"YARA","funding_links":[],"categories":[],"sub_categories":[],"readme":"# YARA NSFW Language Detection Rules\n\nA comprehensive collection of NSFW (not suitable for work) language detection rules in [YARA](http://virustotal.github.io/yara/) pattern-matching format.\n\nThis database powers the NSFW language detection feature of the [Scanii](https://www.scanii.com) content analysis service.\n\n## Supported Languages\n\nThis project includes NSFW language detection rules for **25 
languages**:\n\n| Language | Code | Language | Code |\n|----------|------|----------|------|\n| Arabic | `ar` | Italian | `it` |\n| Bengali | `bn` | Japanese | `ja` |\n| Chinese | `zh` | Korean | `ko` |\n| Czech | `cs` | Dutch | `nl` |\n| Danish | `da` | Norwegian | `no` |\n| English | `en` | Polish | `pl` |\n| English (Racial) | `en-racial` | Portuguese | `pt` |\n| Esperanto | `eo` | Russian | `ru` |\n| Finnish | `fi` | Swedish | `sv` |\n| French | `fr` | Thai | `th` |\n| German | `de` | Turkish | `tr` |\n| Hindi | `hi` | Hungarian | `hu` |\n| Spanish | `es` | | |\n\n## Rule Format\n\nAll rules use **hex-encoded strings** to ensure proper character encoding across different platforms and avoid issues with special characters. This is especially important for languages with non-ASCII characters.\n\n### Example Rule Structure\n\n```yara\nrule content_en_language_nsfw_42 {\n  meta:\n    info = \"badword\"\n  strings:\n    $ascii1 = \"\\x66\\x75\\x63\\x6b\" nocase  // \"badword\" in ASCII/UTF-8\n    $wide1 = \"\\x66\\x00\\x75\\x00\\x63\\x00\\x6b\\x00\" nocase  // \"badword\" in UTF-16LE\n  condition:\n    any of them\n}\n```\n\n### Why Hex Encoding?\n\n1. **Character Encoding Safety**: Hex encoding ensures characters are interpreted correctly regardless of file encoding\n2. **Special Character Support**: Handles accented characters (é, ñ, ü) and non-Latin scripts (Arabic, Chinese, etc.)\n3. 
**Multiple Encodings**: Each rule typically includes patterns for:\n   - UTF-8 (`$utf8`)\n   - Latin-1 (`$latin1`)\n   - Windows CP-1252 (`$cp1252`)\n   - UTF-16LE/Wide strings (`$wide`)\n\n### Creating New Rules\n\nWhen adding new words, convert them to hex:\n\n```bash\n# For ASCII/UTF-8:\necho -n \"word\" | xxd -p | sed 's/../\\\\x\u0026/g'\n\n# For UTF-16LE (wide):\necho -n \"word\" | iconv -t UTF-16LE | xxd -p | sed 's/../\\\\x\u0026/g'\n```\n\n#### Compiling the rules\n\n```\n% make build\nmkdir -p ./dist\nyarac src/entrypoint.yara ./dist/language-nsfw.db\n```\n\n#### Running tests\n\n```\n% make test\nmkdir -p ./dist\nyarac src/entrypoint.yara ./dist/language-nsfw.db\n...\n```\n\n## Credits\n\nThis codebase started as a fork of [List of Dirty, Naughty, Obscene, and Otherwise Bad Words](https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuvasoftware%2Fyara-language-nsfw","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fuvasoftware%2Fyara-language-nsfw","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuvasoftware%2Fyara-language-nsfw/lists"}