{"id":37674554,"url":"https://github.com/viddexa/safetext","last_synced_at":"2026-01-16T12:10:14.270Z","repository":{"id":211901127,"uuid":"585297852","full_name":"viddexa/safetext","owner":"viddexa","description":"Fast profanity word, curse word, swear word, bad word filtering tool for English, Spanish, Chinese, Turkish and more.","archived":false,"fork":false,"pushed_at":"2025-12-27T16:19:18.000Z","size":153,"stargazers_count":44,"open_issues_count":0,"forks_count":7,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-29T13:22:11.021Z","etag":null,"topics":["bad-words","badwords","chinese","context7","english","filter","german","llmstxt","mcp","moderation","portuguese","profanity","profanity-detection","profanity-filter","profanityfilter","russian","safety","spanish","swear-filter","turkish"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/viddexa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"fcakyon"}},"created_at":"2023-01-04T20:21:01.000Z","updated_at":"2025-12-27T16:18:47.000Z","dependencies_parsed_at":"2025-07-05T16:55:31.809Z","dependency_job_id":"f3a4bb75-fa29-4a3d-9c34-9294192a5994","html_url":"https://github.com/viddexa/safetext","commit_stats":null,"previous_names":["safevideo/safetext","deepsafe/safetext","viddexa/safetext"],"tags_count":18,"template":false,"template_full_name":null,"purl":"pkg:github/viddexa/safetext","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viddexa%2Fsafetext","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viddexa%2Fsafetext/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viddexa%2Fsafetext/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viddexa%2Fsafetext/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/viddexa","download_url":"https://codeload.github.com/viddexa/safetext/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viddexa%2Fsafetext/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28478479,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bad-words","badwords","chinese","context7","english","filter","german","llmstxt","mcp","moderation","portuguese","profanity","profanity-detection","profanity-filter","profanityfilter","russian","safety","spanish","swear-filter","turkish"],"created_at":"2026-01-16T12:10:14.186Z","updated_at":"2026-01-16T12:10:14.248Z","avatar_url":"https://github.com/viddexa.png","language":"Python","funding_links":["https://github.com/sponsors/fcakyon"],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cp\u003e\n    \u003ca align=\"center\" href=\"\" target=\"_blank\"\u003e\n      \u003cimg\n        width=\"1280\"\n        src=\"https://github.com/viddexa/safetext/assets/44926076/9af66dde-3a93-4c5b-b802-cb31dffcb2e5\"\n      \u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n\n[![Context7 MCP](https://img.shields.io/badge/Context7%20MCP-Indexed-blue)](https://context7.com/viddexa/safetext)\n[![llms.txt](https://img.shields.io/badge/llms.txt-✓-brightgreen)](https://context7.com/viddexa/safetext/llms.txt)\n[![version](https://badge.fury.io/py/safetext.svg)](https://badge.fury.io/py/safetext)\n[![downloads](https://pepy.tech/badge/safetext)](https://pepy.tech/project/safetext)\n[![license](https://img.shields.io/pypi/l/safetext)](LICENSE)\n\n\u003c/div\u003e\n\n## 🤔 why safetext?\n\n**Fast profanity detection and filtering for 13 languages.**\n\n- **Multi-format Detection**: Single words, phrases, and contextual profanity\n- **Custom Word Lists**: Extend built-in lists with your own profanity words\n- **Whitelisting**: Exclude specific words from detection\n- **Auto Language Detection**: From text or subtitle files\n- **Precise Filtering**: Exact position tracking and custom censoring\n- **Simple Integration**: One-line setup with clean API\n\n## 📦 installation\n\neasily install **safetext** with pip:\n\n```bash\npip install safetext\n```\n\nfor development setup, see our [scripts documentation](scripts/README.md).\n\n## 🎯 quickstart\n\n### check and censor profanity\n\n```python\n\u003e\u003e\u003e from safetext import SafeText\n\n\u003e\u003e\u003e st = SafeText(language='en')\n\n\u003e\u003e\u003e results = st.check_profanity(text='Some text with \u003cprofanity-word\u003e.')\n\u003e\u003e\u003e results\n[{'word': '\u003cprofanity-word\u003e', 'index': 4, 'start': 15, 'end': 31}]\n\n\u003e\u003e\u003e text = st.censor_profanity(text='Some text with \u003cprofanity-word\u003e.')\n\u003e\u003e\u003e text\n\"Some text with ***.\"\n```\n\n### extending profanity lists with custom words\n\nAdd your own profanity words by providing a custom words directory:\n\n```python\n# Directory structure:\n# custom_profanity_words/\n# ├── en.txt              # English custom words\n# ├── tr.txt              # Turkish custom words\n# └── es.txt              # Spanish custom words\n\n\u003e\u003e\u003e st = SafeText(language='en', custom_words_dir='custom_profanity_words')\n\n\u003e\u003e\u003e # Custom words from en.txt are now included\n\u003e\u003e\u003e results = st.check_profanity('This mycustomword is inappropriate')\n\u003e\u003e\u003e results\n[{'word': 'mycustomword', 'index': 2, 'start': 5, 'end': 17}]\n```\n\nCustom word files should contain one word/phrase per line:\n\n```\n# custom_profanity_words/en.txt\nmycustomword\ninappropriate phrase\ncompany specific term\n```\n\n### using whitelist\n\nexclude specific words from profanity detection:\n\n```python\n# Using a list of words\n\u003e\u003e\u003e st = SafeText(language='en', whitelist=['word1', 'word2'])\n\n# Using a file (one word per line)\n\u003e\u003e\u003e st = SafeText(language='en', whitelist='path/to/whitelist.txt')\n\n# Combining custom words with whitelist\n\u003e\u003e\u003e st = SafeText(\n...     language='en', \n...     custom_words_dir='custom_profanity_words',\n...     whitelist=['allowedcustomword']\n... )\n```\n\n### automated language detection\n\n- from text:\n\n```python\n\u003e\u003e\u003e from safetext import SafeText\n\n\u003e\u003e\u003e eng_text = \"This story is about to take a dark turn.\"\n\n\u003e\u003e\u003e st = SafeText(language=None)\n\u003e\u003e\u003e st.set_language_from_text(eng_text)\n\n\u003e\u003e\u003e st.language\n'en'\n```\n\n- from .srt (subtitle) file:\n\n```python\n\u003e\u003e\u003e from safetext import SafeText\n\n\u003e\u003e\u003e turkish_srt_file_path = \"turkish.srt\"\n\n\u003e\u003e\u003e st = SafeText(language=None)\n\u003e\u003e\u003e st.set_language_from_srt(turkish_srt_file_path)\n\n\u003e\u003e\u003e st.language\n'tr'\n```\n\n## 🌍 supported languages\n\n**safetext** currently supports profanity detection in 13 languages:\n\n| Language | ISO 639-1 Code | Language Name |\n|----------|----------------|---------------|\n| 🇸🇦 | `ar` | Arabic |\n| 🇦🇿 | `az` | Azerbaijani |\n| 🇩🇪 | `de` | German |\n| 🇬🇧 | `en` | English |\n| 🇪🇸 | `es` | Spanish |\n| 🇮🇷 | `fa` | Persian (Farsi) |\n| 🇫🇷 | `fr` | French |\n| 🇮🇳 | `hi` | Hindi |\n| 🇯🇵 | `ja` | Japanese |\n| 🇵🇹 | `pt` | Portuguese |\n| 🇷🇺 | `ru` | Russian |\n| 🇹🇷 | `tr` | Turkish |\n| 🇨🇳 | `zh` | Chinese |\n\n## 🤝 contribute to safetext\n\njoin our mission in refining content moderation!\n\ncontribute by:\n\n- **adding new languages**: create a folder with the ISO 639-1 code and include a `words.txt`.\n- **enhancing word lists**: improve detection accuracy.\n- **sharing feedback**: your ideas can shape `safetext`.\n\nsee our [contributing guidelines](CONTRIBUTING.md) for development workflow, [test documentation](tests/README.md) for running tests, and [scripts guide](scripts/README.md) for automation tools.\n\n______________________________________________________________________\n\n## 🏆 contributors\n\nmeet our awesome contributors who make **safetext** better every day!\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/viddexa/safetext/graphs/contributors\"\u003e\n      \u003cimg src=\"https://contrib.rocks/image?repo=viddexa/safetext\" /\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n______________________________________________________________________\n\n\u003cdiv align=\"center\"\u003e\n  \u003cb\u003efollow us for more!\u003c/b\u003e\n  \u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://www.linkedin.com/company/viddexa/\"\u003eLinkedIn\u003c/a\u003e • \n  \u003ca href=\"https://huggingface.co/viddexa\"\u003eHugging Face\u003c/a\u003e • \n  \u003ca href=\"https://x.com/viddexa\"\u003eX\u003c/a\u003e\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fviddexa%2Fsafetext","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fviddexa%2Fsafetext","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fviddexa%2Fsafetext/lists"}