# TextHumanize

**Algorithmic text naturalization library — transforms machine-generated text into natural, human-like prose**

[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![PHP 8.1+](https://img.shields.io/badge/php-8.1+-purple.svg)](https://www.php.net/)
[![Tests](https://img.shields.io/badge/tests-500%20passed-green.svg)]()
[![Coverage](https://img.shields.io/badge/coverage-85%25-brightgreen.svg)]()
[![Ruff](https://img.shields.io/badge/linting-ruff-261230.svg)](https://github.com/astral-sh/ruff)
[![mypy](https://img.shields.io/badge/types-mypy-blue.svg)](https://mypy-lang.org/)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen.svg)](https://pre-commit.com/)
[![Zero Dependencies](https://img.shields.io/badge/dependencies-0-brightgreen.svg)]()
[![License](https://img.shields.io/badge/license-Personal%20Use-orange.svg)](LICENSE)

---

TextHumanize is a pure-Python text processing library that normalizes typography, simplifies bureaucratic language, diversifies sentence structure, increases burstiness and perplexity, and replaces formulaic phrases with natural alternatives. It includes **AI text detection**, **paraphrasing**, **tone analysis**, **watermark cleaning**, **text spinning**, and **coherence analysis**. Available for **Python** and **PHP**.

**Full language support:** Russian · Ukrainian · English · German · French · Spanish · Polish · Portuguese · Italian

**Universal processor:** works with any language using statistical methods (no dictionaries required).

---

## Table of Contents

- [Features](#features)
- [Why TextHumanize?](#why-texthumanize)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Before & After Examples](#before--after-examples)
- [API Reference](#api-reference)
  - [humanize()](#humanizetext-options)
  - [humanize_chunked()](#humanize_chunkedtext-chunk_size5000-options)
  - [analyze()](#analyzetext-lang)
  - [explain()](#explainresult)
  - [detect_ai()](#detect_aitext-lang)
  - [detect_ai_batch()](#detect_ai_batchtexts-lang)
  - [paraphrase()](#paraphrasetext-lang-intensity-seed)
  - [analyze_tone()](#analyze_tonetext-lang)
  - [adjust_tone()](#adjust_tonetext-target-lang-intensity)
  - [detect_watermarks()](#detect_watermarkstext-lang)
  - [clean_watermarks()](#clean_watermarkstext-lang)
  - [spin()](#spintext-lang-intensity-seed)
  - [spin_variants()](#spin_variantstext-count-lang-intensity)
  - [analyze_coherence()](#analyze_coherencetext-lang)
  - [full_readability()](#full_readabilitytext-lang)
- [Profiles](#profiles)
- [Parameters](#parameters)
- [Plugin System](#plugin-system)
- [Chunk Processing](#chunk-processing)
- [CLI Reference](#cli-reference)
- [REST API Server](#rest-api-server)
- [Processing Pipeline](#processing-pipeline)
- [AI Detection — How It Works](#ai-detection--how-it-works)
- [Language Support](#language-support)
- [SEO Mode](#seo-mode)
- [Readability Metrics](#readability-metrics)
- [Paraphrasing Engine](#paraphrasing-engine)
- [Tone Analysis & Adjustment](#tone-analysis--adjustment)
- [Watermark Detection & Cleaning](#watermark-detection--cleaning)
- [Text Spinning](#text-spinning)
- [Coherence Analysis](#coherence-analysis)
- [Morphological Engine](#morphological-engine)
- [Smart Sentence Splitter](#smart-sentence-splitter)
- [Context-Aware Synonyms](#context-aware-synonyms)
- [Using Individual Modules](#using-individual-modules)
- [Performance & Benchmarks](#performance--benchmarks)
- [Testing](#testing)
- [Architecture](#architecture)
- [PHP Library](#php-library)
- [Code Quality & Tooling](#code-quality--tooling)
- [Migration Guide (v0.4 → v0.5)](#migration-guide-v04--v05)
- [FAQ & Troubleshooting](#faq--troubleshooting)
- [Contributing](#contributing)
- [Support the Project](#support-the-project)
- [License](#license)

---

## Features

TextHumanize addresses common patterns found in machine-generated text:

| Pattern | Before | After |
|---------|--------|-------|
| **Em dashes** | `text — example` | `text - example` |
| **Typographic quotes** | `«text»` | `"text"` |
| **Bureaucratic words** | `utilize`, `implement` | `use`, `do` |
| **Formulaic connectors** | `However`, `Furthermore` | `But`, `Also` |
| **Uniform sentences** | All 15-20 words | Varied 5-25 words |
| **Word repetition** | `important... important...` | Synonym substitution |
| **Overly perfect punctuation** | Frequent `;` and `:` | Simplified punctuation |
| **Low perplexity** | Predictable word choice | Natural variation |
| **Boilerplate phrases** | `it is important to note` | `notably`, `by the way` |
| **AI watermarks** | Hidden zero-width chars | Cleaned text |

### Key Advantages

- **Fast** — pure algorithmic processing, zero network requests
- **Private** — all processing happens locally, data never leaves your system
- **Controllable** — fine-tuned via intensity, profiles, and keyword preservation
- **9 languages + universal** — RU, UK, EN, DE, FR, ES, PL, PT, IT + any other
- **Zero dependencies** — Python standard library only
- **Extensible** — plugin system for custom pipeline stages
- **Large text support** — chunk processing for texts of any size
- **AI detection** — 12-metric statistical AI text detector, no ML required
- **Paraphrasing** — algorithmic sentence-level paraphrasing
- **Tone control** — analyze and adjust text formality (7 levels)
- **Watermark cleaning** — detect and remove invisible text watermarks
- **Text spinning** — generate unique content variants with spintax
- **Coherence analysis** — assess text structure and paragraph flow
- **Readability metrics** — Flesch-Kincaid, Coleman-Liau, ARI, SMOG, Gunning Fog, Dale-Chall
- **Morphological engine** — rule-based lemmatization for RU, UK, EN, DE
- **Smart sentence splitting** — handles abbreviations, decimals, initials correctly
- **Context-aware synonyms** — word-sense disambiguation without ML
- **REST API** — built-in HTTP server with 12 JSON endpoints

---

## Why TextHumanize?

| Feature | TextHumanize | Online Tools | GPT Rewrite |
|---------|:------------:|:------------:|:-----------:|
| Works offline | ✅ | ❌ | ❌ |
| Zero dependencies | ✅ | ❌ | ❌ |
| Data never leaves device | ✅ | ❌ | ❌ |
| Reproducible (seed) | ✅ | ❌ | ❌ |
| 9 languages | ✅ | ≈ 1-3 | ✅ |
| Fast (ms, not seconds) | ✅ | ❌ | ❌ |
| Fine control (intensity/profile) | ✅ | ❌ | ~ |
| Built-in AI detector | ✅ | ❌ | ❌ |
| Plugin system | ✅ | ❌ | ❌ |
| Free & open source | ✅ | ❌ | ❌ |
| No API key required | ✅ | ❌ | ❌ |
| PHP port included | ✅ | ❌ | ❌ |

---

## Installation

### pip (recommended)

```bash
pip install texthumanize
```

### From source

```bash
git clone https://github.com/ksanyok/TextHumanize.git
cd TextHumanize
pip install -e .
```

### PHP

```bash
cd php/
composer install
```

### Verify installation

```python
import texthumanize
print(texthumanize.__version__)  # 0.5.0
```

---

## Quick Start

```python
from texthumanize import humanize, analyze, explain

# Basic usage — one line
result = humanize("This text utilizes a comprehensive methodology for implementation.")
print(result.text)
# → "This text uses a complete method for setup."

# With options
result = humanize(
    "Furthermore, it is important to note that the implementation facilitates optimization.",
    lang="en",           # auto-detect or specify
    profile="web",       # chat, web, seo, docs, formal
    intensity=70,        # 0 (mild) to 100 (maximum)
)
print(result.text)
print(f"Changed: {result.change_ratio:.0%}")

# Analyze text metrics
report = analyze("Text to analyze for naturalness.", lang="en")
print(f"Artificiality score: {report.artificiality_score:.1f}/100")
print(f"Flesch-Kincaid grade: {report.flesch_kincaid_grade:.1f}")

# Get a detailed explanation of changes
result = humanize("Furthermore, it is important to utilize this approach.")
print(explain(result))
```

### Quick Examples for Each Feature

```python
from texthumanize import (
    detect_ai, paraphrase, analyze_tone, adjust_tone,
    detect_watermarks, clean_watermarks, spin, spin_variants,
    analyze_coherence, full_readability,
)

# AI Detection
ai = detect_ai("Text to check for AI generation.", lang="en")
print(f"AI probability: {ai['score']:.0%} → {ai['verdict']}")

# Paraphrasing
print(paraphrase("The system works efficiently.", lang="en"))

# Tone Analysis
tone = analyze_tone("Please submit the documentation.", lang="en")
print(f"Tone: {tone['primary_tone']}, formality: {tone['formality']:.2f}")

# Tone Adjustment
casual = adjust_tone("It is imperative to proceed.", target="casual", lang="en")
print(casual)

# Watermark Cleaning
clean = clean_watermarks("Te\u200bxt wi\u200bth hid\u200bden chars")
print(clean)

# Text Spinning
unique = spin("The system provides important data.", lang="en")
print(unique)

# Coherence Analysis
coh = analyze_coherence("First part.\n\nSecond part.\n\nConclusion.", lang="en")
print(f"Coherence: {coh['overall']:.2f}")

# Full Readability
read = full_readability("Your text here.", lang="en")
print(read)
```

---

## Before & After Examples

### English — Blog Post

**Before (AI-generated):**
> Furthermore, it is important to note that the implementation of cloud computing facilitates the optimization of business processes. Additionally, the utilization of microservices constitutes a significant advancement. Nevertheless, considerable challenges remain in the area of security. It is worth mentioning that these challenges necessitate comprehensive solutions.

**After (TextHumanize, profile="web", intensity=70):**
> But cloud computing helps optimize how businesses work. Also, microservices are a big step forward. Still, security is tough. These challenges need thorough solutions.

**Changes:** 4 bureaucratic replacements, 2 connector swaps, sentence structure diversified.

### Russian — Documentation

**Before:**
> Данный документ является руководством по осуществлению настройки программного обеспечения. Необходимо осуществить установку всех компонентов. Кроме того, следует обратить внимание на конфигурационные параметры.

**After (profile="docs", intensity=60):**
> Этот документ - руководство по настройке ПО. Нужно установить все компоненты. Также стоит обратить внимание на параметры конфигурации.

### Ukrainian — Web Content

**Before:**
> Даний матеріал є яскравим прикладом здійснення сучасних підходів. Крім того, необхідно зазначити важливість впровадження інноваційних рішень.

**After (profile="web", intensity=65):**
> Цей матеріал - яскравий приклад сучасних підходів. Також важливо впроваджувати інноваційні рішення.

---

## API Reference

### `humanize(text, **options)`

Main function — transforms text to sound more natural.

```python
from texthumanize import humanize

result = humanize(
    text="Your text here",
    lang="auto",        # auto-detect or specify: en, ru, de, fr, es, etc.
    profile="web",      # chat, web, seo, docs, formal, academic, marketing, social, email
    intensity=60,       # 0 (no changes) to 100 (maximum)
    preserve={          # protect specific elements
        "code_blocks": True,
        "urls": True,
        "emails": True,
        "brand_terms": ["MyBrand"],
    },
    constraints={       # output constraints
        "max_change_ratio": 0.4,
        "keep_keywords": ["SEO", "API"],
    },
    seed=42,            # reproducible results
)

# Result object
print(result.text)           # processed text
print(result.original)       # original text (unchanged)
print(result.lang)           # detected/specified language
print(result.profile)        # profile used
print(result.intensity)      # intensity used
print(result.change_ratio)   # fraction of text changed (0.0-1.0)
print(result.changes)        # list of individual changes [{type, original, replacement}]
print(result.metrics_before) # metrics before processing
print(result.metrics_after)  # metrics after processing
```

**Returns:** `HumanizeResult` dataclass.

### `humanize_chunked(text, chunk_size=5000, **options)`

Process large texts by splitting into chunks at paragraph boundaries. Each chunk is processed independently with its own seed variation, then reassembled.

```python
from texthumanize import humanize_chunked

# Process a 50,000-character document
with open("large_document.txt") as f:
    text = f.read()

result = humanize_chunked(
    text,
    chunk_size=5000,     # characters per chunk (default)
    overlap=200,         # character overlap for context
    lang="en",
    profile="docs",
    intensity=50,
)
print(result.text)
print(f"Total changes: {len(result.changes)}")
```

**Returns:** `HumanizeResult` dataclass.

### `analyze(text, lang)`

Analyze text and return naturalness metrics.

```python
from texthumanize import analyze

report = analyze("Text to analyze.", lang="en")

# All available metrics
print(f"Artificiality:         {report.artificiality_score:.1f}/100")
print(f"Total words:           {report.total_words}")
print(f"Total sentences:       {report.total_sentences}")
print(f"Avg sentence length:   {report.avg_sentence_length:.1f} words")
print(f"Sentence length var:   {report.sentence_length_variance:.2f}")
print(f"Bureaucratic ratio:    {report.bureaucratic_ratio:.3f}")
print(f"Connector ratio:       {report.connector_ratio:.3f}")
print(f"Repetition score:      {report.repetition_score:.3f}")
print(f"Typography score:      {report.typography_score:.3f}")
print(f"Burstiness:            {report.burstiness_score:.3f}")
print(f"Flesch-Kincaid grade:  {report.flesch_kincaid_grade:.1f}")
print(f"Coleman-Liau index:    {report.coleman_liau_index:.1f}")
print(f"Avg word length:       {report.avg_word_length:.1f}")
print(f"Avg syllables/word:    {report.avg_syllables_per_word:.1f}")
```

**Returns:** `AnalysisReport` dataclass.

### `explain(result)`

Generate a human-readable report of all changes made by `humanize()`.

```python
from texthumanize import humanize, explain

result = humanize("Furthermore, it is important to utilize this approach.", lang="en")
report = explain(result)
print(report)
```

**Output:**
```
=== Отчёт TextHumanize ===
Язык: en | Профиль: web | Интенсивность: 60
Доля изменений: 25.3%

--- Метрики ---
  Искусственность: 45.00 → 22.00 ↓
  Канцеляризмы: 0.12 → 0.00 ↓

--- Изменения (3) ---
  [debureaucratization] "utilize" → "use"
  [connector] "Furthermore" → "Also"
  [structure] sentence split applied
```

**Returns:** `str`

### `detect_ai(text, lang)`

Detect AI-generated text using 12 independent statistical metrics without any ML dependencies.

```python
from texthumanize import detect_ai

result = detect_ai("Your text to analyze.", lang="auto")

print(f"AI probability:  {result['score']:.1%}")
print(f"Verdict:         {result['verdict']}")    # "human", "mixed", "ai", or "unknown"
print(f"Confidence:      {result['confidence']:.1%}")
print(f"Language:        {result['lang']}")

# Detailed per-metric scores (0.0 = human-like, 1.0 = AI-like)
metrics = result['metrics']
for name, score in metrics.items():
    print(f"  {name:30s} {score:.3f}")

# Human-readable explanations
for exp in result['explanations']:
    print(f"  → {exp}")
```

**Returns:** `dict` with keys: `score`, `verdict`, `confidence`, `metrics`, `explanations`, `lang`.

### `detect_ai_batch(texts, lang)`

Batch AI detection for multiple texts.

```python
from texthumanize import detect_ai_batch

texts = [
    "First text to check.",
    "Second text to check.",
    "Third text to check.",
]
results = detect_ai_batch(texts, lang="en")
for i, r in enumerate(results):
    print(f"Text {i+1}: {r['verdict']} ({r['score']:.0%})")
```

**Returns:** `list[dict]`

### `paraphrase(text, lang, intensity, seed)`

Paraphrase text while preserving meaning. Uses syntactic transformations: clause swaps, passive↔active, sentence splitting, adverb fronting, nominalization.

```python
from texthumanize import paraphrase

result = paraphrase(
    "Furthermore, it is important to note this fact.",
    lang="en",
    intensity=0.5,   # 0.0-1.0: fraction of sentences to transform
    seed=42,         # optional: reproducible results
)
print(result)
```

**Returns:** `str`

### `analyze_tone(text, lang)`

Analyze text tone, formality level, and subjectivity.

```python
from texthumanize import analyze_tone

tone = analyze_tone("Shall we proceed with the implementation?", lang="en")

print(f"Primary tone:   {tone['primary_tone']}")     # formal, casual, academic, etc.
print(f"Formality:      {tone['formality']:.2f}")     # 0=casual, 1=formal
print(f"Subjectivity:   {tone['subjectivity']:.2f}")  # 0=objective, 1=subjective
print(f"Confidence:     {tone['confidence']:.2f}")
print(f"Scores:         {tone['scores']}")            # dict of all tone scores
print(f"Markers found:  {tone['markers']}")           # detected tone markers
```

**Returns:** `dict`

### `adjust_tone(text, target, lang, intensity)`

Adjust text to a target tone level.

```python
from texthumanize import adjust_tone

# Make formal text casual
casual = adjust_tone(
    "It is imperative to implement this solution immediately.",
    target="casual",     # very_formal, formal, neutral, casual, very_casual
    lang="en",
    intensity=0.5,       # 0.0-1.0: strength of adjustment
)
print(casual)

# Make casual text formal
formal = adjust_tone(
    "Hey, we gotta fix this ASAP!",
    target="formal",
    lang="en",
)
print(formal)
```

Available targets: `very_formal`, `formal`, `neutral`, `casual`, `very_casual`, `friendly`, `academic`, `professional`, `marketing`.

**Returns:** `str`

### `detect_watermarks(text, lang)`

Detect invisible watermarks: zero-width characters, homoglyphs, invisible formatting, statistical AI watermarks.

```python
from texthumanize import detect_watermarks

report = detect_watermarks("Text with\u200bhidden\u200bcharacters")

print(f"Has watermarks:     {report['has_watermarks']}")
print(f"Types found:        {report['watermark_types']}")
print(f"Confidence:         {report['confidence']:.2f}")
print(f"Characters removed: {report['characters_removed']}")
print(f"Cleaned text:       {report['cleaned_text']}")
print(f"Details:            {report['details']}")
```

**Returns:** `dict`

### `clean_watermarks(text, lang)`

Remove all detected watermarks and return clean text.

```python
from texthumanize import clean_watermarks

clean = clean_watermarks("Contaminated\u200b text\u200b here")
print(clean)  # "Contaminated text here"
```

**Returns:** `str`

### `spin(text, lang, intensity, seed)`

Generate a unique version of text using synonym substitution.

```python
from texthumanize import spin

result = spin("The system provides important data for analysis.", lang="en")
print(result)
# → e.g. "The platform offers crucial information for examination."
```

**Returns:** `str`

### `spin_variants(text, count, lang, intensity)`

Generate multiple unique versions of the same text.

```python
from texthumanize import spin_variants

variants = spin_variants(
    "The system provides important data.",
    count=5,
    lang="en",
    intensity=0.5,
)
for i, v in enumerate(variants, 1):
    print(f"  #{i}: {v}")
```

**Returns:** `list[str]`

### `analyze_coherence(text, lang)`

Analyze text coherence — how well sentences and paragraphs flow together.

```python
from texthumanize import analyze_coherence

text = """
Introduction paragraph here.

Main content paragraph with details.

Conclusion summarizing the points.
"""

report = analyze_coherence(text, lang="en")

print(f"Overall coherence:        {report['overall']:.2f}")
print(f"Lexical cohesion:         {report['lexical_cohesion']:.2f}")
print(f"Transition score:         {report['transition_score']:.2f}")
print(f"Topic consistency:        {report['topic_consistency']:.2f}")
print(f"Opening diversity:        {report['sentence_opening_diversity']:.2f}")
print(f"Paragraphs:               {report['paragraph_count']}")
print(f"Avg paragraph length:     {report['avg_paragraph_length']:.1f}")

if report['issues']:
    print("Issues:")
    for issue in report['issues']:
        print(f"  - {issue}")
```

**Returns:** `dict`

### `full_readability(text, lang)`

Compute all readability indices at once.

```python
from texthumanize import full_readability

r = full_readability("Your text here with multiple sentences. Each one helps.", lang="en")

# Available indices
print(f"Flesch-Kincaid Grade: {r.get('flesch_kincaid_grade', 0):.1f}")
print(f"Coleman-Liau:         {r.get('coleman_liau_index', 0):.1f}")
print(f"ARI:                  {r.get('ari', 0):.1f}")
print(f"SMOG:                 {r.get('smog_index', 0):.1f}")
print(f"Gunning Fog:          {r.get('gunning_fog', 0):.1f}")
print(f"Dale-Chall:           {r.get('dale_chall', 0):.1f}")
```

**Returns:** `dict`

---

## Profiles

Nine built-in profiles control the processing style:

| Profile | Use Case | Sentence Length | Colloquialisms | Intensity Default |
|---------|----------|:---------:|:---------:|:---------:|
| `chat` | Messaging, social media | 8-18 words | High | 80 |
| `web` | Blog posts, articles | 10-22 words | Medium | 60 |
| `seo` | SEO content | 12-25 words | None | 40 |
| `docs` | Technical documentation | 12-28 words | None | 50 |
| `formal` | Academic, legal | 15-30 words | None | 30 |
| `academic` | Research papers | 15-30 words | None | 25 |
| `marketing` | Sales, promo copy | 8-20 words | Medium | 70 |
| `social` | Social media posts | 6-15 words | High | 85 |
| `email` | Business emails | 10-22 words | Medium | 50 |

```python
# Conversational style for social media
result = humanize(text, profile="chat", intensity=80)

# SEO-safe mode (preserves keywords, minimal changes)
result = humanize(text, profile="seo", intensity=40,
                  constraints={"keep_keywords": ["API", "cloud"]})

# Academic writing
result = humanize(text, profile="academic", intensity=25)

# Marketing copy — energetic and engaging
result = humanize(text, profile="marketing", intensity=70)
```

### Profile Comparison

Given the input: *"Furthermore, it is important to note that the implementation of this approach facilitates comprehensive optimization."*

| Profile | Output |
|---------|--------|
| `chat` | *"This approach helps optimize things a lot."* |
| `web` | *"Also, this approach helps with thorough optimization."* |
| `seo` | *"This approach facilitates comprehensive optimization."* |
| `formal` | *"Notably, implementing this approach facilitates optimization."* |

---

## Parameters

### Intensity (0-100)

Controls how aggressively text is modified:

| Range | Effect | Best For |
|-------|--------|----------|
| 0-20 | Typography normalization only | Legal, contracts |
| 20-40 | + light debureaucratization | Documentation |
| 40-60 | + structure diversification & connector swaps | Blog posts |
| 60-80 | + synonym replacement, natural phrasing | Web content |
| 80-100 | + maximum variation, colloquial insertions | Chat, social |

```python
# Minimal — only fix typography
result = humanize(text, intensity=10)

# Moderate — safe for most content
result = humanize(text, intensity=50)

# Maximum — full rewrite
result = humanize(text, intensity=95)
```

### Preserve Options

Protect specific elements from modification:

```python
preserve = {
    "code_blocks": True,    # protect ```code``` blocks
    "urls": True,           # protect URLs
    "emails": True,         # protect email addresses
    "hashtags": True,       # protect #hashtags
    "mentions": True,       # protect @mentions
    "markdown": True,       # protect markdown formatting
    "html": True,           # protect HTML tags
    "numbers": False,       # protect numbers (default: False)
    "brand_terms": [        # exact terms to protect (case-sensitive)
        "TextHumanize",
        "MyBrand",
        "ProductName™",
    ],
}
```

### Constraints

Set limits on processing:

```python
constraints = {
    "max_change_ratio": 0.4,            # max 40% of text changed
    "min_sentence_length": 3,           # minimum words per sentence
    "keep_keywords": ["SEO", "API"],    # keywords preserved exactly
}
```

### Seed (Reproducibility)

```python
# Same seed = same result every time
r1 = humanize("Text here.", seed=42)
r2 = humanize("Text here.", seed=42)
assert r1.text == r2.text  # guaranteed
```

---

## Plugin System

Register custom processing stages that run before or after any built-in stage:

```python
from texthumanize import Pipeline, humanize

# Simple hook function
def add_disclaimer(text: str, lang: str) -> str:
    return text + "\n\n---\nProcessed by TextHumanize."

Pipeline.register_hook(add_disclaimer, after="naturalization")

# Plugin class with full context
class BrandEnforcer:
    def __init__(self, brand: str, canonical: str):
        self.brand = brand
        self.canonical = canonical

    def process(self, text: str, lang: str, profile: str, intensity: int) -> str:
        import re
        return re.sub(re.escape(self.brand), self.canonical, text, flags=re.IGNORECASE)

Pipeline.register_plugin(
    BrandEnforcer("texthumanize", "TextHumanize"),
    after="typography",
)

# Process text — plugins run automatically
result = humanize("texthumanize is great.")
print(result.text)  # "TextHumanize is great. ..."

# Clean up when done
Pipeline.clear_plugins()
```

### Available Stage Names

```
segmentation → typography → debureaucratization → structure → repetitions →
liveliness → universal → naturalization → validation → restore
```

You can attach plugins `before` or `after` any of these stages.

---

## Chunk Processing

For large documents (articles, books, reports), use `humanize_chunked` to process text in manageable pieces:

```python
from texthumanize import humanize_chunked

# Automatically splits at paragraph boundaries
result = humanize_chunked(
    very_long_text,
    chunk_size=5000,    # characters per chunk
    overlap=200,        # context overlap
    lang="en",
    profile="docs",
    intensity=50,
    seed=42,            # base seed, each chunk gets seed+i
)
print(f"Processed {len(result.text)} characters")
```

Each chunk is processed independently with its own seed for variation, then reassembled into the final text. The chunk boundary detection preserves paragraph integrity, as sketched below.
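The splitting itself amounts to greedy packing of whole paragraphs. A minimal sketch of that idea (illustrative only; `split_paragraph_chunks` is a hypothetical helper, not the library's internal API):

```python
def split_paragraph_chunks(text: str, chunk_size: int = 5000) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most chunk_size
    characters. A single oversized paragraph passes through unsplit.
    Sketch only -- the library's actual chunker may differ."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) > chunk_size and current:
            chunks.append(current)  # flush the filled chunk
            current = para          # start a new one with this paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

On top of this, `humanize_chunked` carries `overlap` characters of context between chunks and varies the seed per chunk, as described above.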
---

## CLI Reference

### Basic Usage

```bash
# Process a file (output to stdout)
texthumanize input.txt

# Process with options
texthumanize input.txt -l en -p web -i 70

# Save to file
texthumanize input.txt -o output.txt

# Process from stdin
echo "Text to process" | texthumanize - -l en
cat article.txt | texthumanize -
```

### All CLI Options

```bash
texthumanize [input] [options]

Positional:
  input                     Input file path (or '-' for stdin)

Options:
  -o, --output FILE         Output file (default: stdout)
  -l, --lang LANG           Language: auto, en, ru, uk, de, fr, es, pl, pt, it
  -p, --profile PROFILE     Profile: chat, web, seo, docs, formal, academic,
                            marketing, social, email
  -i, --intensity N         Processing intensity 0-100 (default: 60)
  --keep WORD [WORD ...]    Keywords to preserve
  --brand TERM [TERM ...]   Brand terms to protect
  --max-change RATIO        Maximum change ratio 0-1 (default: 0.4)
  --seed N                  Random seed for reproducibility
  --report FILE             Save JSON report to file

Analysis modes:
  --analyze                 Analyze text metrics (no processing)
  --explain                 Show detailed change report
  --detect-ai               Check for AI-generated text
  --tone-analyze            Analyze text tone
  --readability             Full readability analysis
  --coherence               Coherence analysis

Transform modes:
  --paraphrase              Paraphrase the text
  --tone TARGET             Adjust tone (formal, casual, neutral, etc.)
  --watermarks              Detect and clean watermarks
  --spin                    Generate a spun version
  --variants N              Generate N spin variants

Server:
  --api                     Start REST API server
  --port N                  API server port (default: 8080)

Other:
  -v, --version             Show version
```

### CLI Examples

```bash
# Analyze a file
texthumanize article.txt --analyze -l en

# Check for AI generation
texthumanize essay.txt --detect-ai

# Paraphrase with output file
texthumanize input.txt --paraphrase -o paraphrased.txt

# Adjust tone to casual
texthumanize formal_email.txt --tone casual -o casual_email.txt

# Clean watermarks
texthumanize suspect.txt --watermarks -o clean.txt

# Generate 5 spin variants
texthumanize template.txt --variants 5

# Start API server
texthumanize dummy --api --port 9090
```

---

## REST API Server

TextHumanize includes a zero-dependency HTTP server for JSON API access:

```bash
# Start server
python -m texthumanize.api --port 8080

# Or via CLI
texthumanize dummy --api --port 8080
```

### Endpoints

All `POST` endpoints accept a JSON body with `{"text": "..."}` and return JSON.

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/humanize` | Humanize text |
| `POST` | `/analyze` | Analyze text metrics |
| `POST` | `/detect-ai` | AI detection (single or batch) |
| `POST` | `/paraphrase` | Paraphrase text |
| `POST` | `/tone/analyze` | Tone analysis |
| `POST` | `/tone/adjust` | Tone adjustment |
| `POST` | `/watermarks/detect` | Detect watermarks |
| `POST` | `/watermarks/clean` | Clean watermarks |
| `POST` | `/spin` | Spin text (single or multi) |
| `POST` | `/coherence` | Coherence analysis |
| `POST` | `/readability` | Readability metrics |
| `GET` | `/health` | Server health check |
| `GET` | `/` | API info & endpoint list |

### Usage with curl

```bash
# Humanize
curl -X POST http://localhost:8080/humanize \
  -H "Content-Type: application/json" \
  -d '{"text": "Furthermore, it is important to utilize this.", "lang": "en", "profile": "web"}'

# AI Detection
curl -X POST http://localhost:8080/detect-ai \
  -H "Content-Type: application/json" \
  -d '{"text": "Text to check."}'

# Batch AI Detection
curl -X POST http://localhost:8080/detect-ai \
  -H "Content-Type: application/json" \
  -d '{"texts": ["First text.", "Second text."]}'

# Tone Adjustment
curl -X POST http://localhost:8080/tone/adjust \
  -H "Content-Type: application/json" \
  -d '{"text": "Formal text here.", "target": "casual"}'

# Health Check
curl http://localhost:8080/health
```

### Usage with Python requests

```python
import requests

API = "http://localhost:8080"

# Humanize
r = requests.post(f"{API}/humanize", json={
    "text": "Text to process.",
    "lang": "en",
    "profile": "web",
    "intensity": 60,
})
print(r.json()["text"])

# AI Detection
r = requests.post(f"{API}/detect-ai", json={"text": "Check this text."})
print(r.json()["verdict"])
```

All responses include an `_elapsed_ms` field with the processing time in milliseconds.

---

## Processing Pipeline

TextHumanize uses a 10-stage pipeline:

```
Input Text
  │
  ├─ 1. Segmentation         ─ protect code blocks, URLs, emails, brands
  │
  ├─ 2. Typography           ─ normalize dashes, quotes, ellipses, punctuation
  │
  ├─ 3. Debureaucratization  ─ replace bureaucratic/formal words   [dictionary]
  │
  ├─ 4. Structure            ─ diversify sentence openings         [dictionary]
  │
  ├─ 5. Repetitions          ─ reduce word/phrase repetitions      [dictionary + context]
  │
  ├─ 6. Liveliness           ─ inject natural phrasing             [dictionary]
  │
  ├─ 7. Universal            ─ statistical processing              [any language]
  │
  ├─ 8. Naturalization       ─ burstiness, perplexity, rhythm      [KEY STAGE]
  │
  ├─ 9. Validation           ─ quality check, rollback if needed
  │
  └─ 10. Restore             ─ restore protected segments
  │
Output Text
```

**Stages 3-6** require full dictionary support (9 languages).
**Stages 2, 7-8** work for any language, including those without dictionaries.
**Stage 9** rolls back changes if quality degrades (configurable via `max_change_ratio`).

---

## AI Detection — How It Works

The AI detection engine uses **12 independent statistical metrics**, each measuring a different aspect of text naturalness. No machine learning models, neural networks, or external APIs are used.

### Metrics Explained

| # | Metric | What It Measures | Weight |
|---|--------|-----------------|:------:|
| 1 | **AI Patterns** | Formulaic phrases ("it is important to note", "furthermore") | 20% |
| 2 | **Burstiness** | Sentence length variation (humans vary more than AI) | 14% |
| 3 | **Opening Diversity** | How varied sentence beginnings are | 9% |
| 4 | **Entropy** | Word predictability (AI text has lower entropy) | 8% |
| 5 | **Stylometry** | Word length distribution consistency | 8% |
| 6 | **Coherence** | Paragraph transition smoothness | 8% |
| 7 | **Vocabulary** | Type-to-token ratio, lexical richness | 7% |
| 8 | **Grammar Perfection** | Too-perfect grammar is suspicious | 6% |
| 9 | **Punctuation** | Punctuation diversity and distribution | 6% |
| 10 | **Rhythm** | Syllabic rhythm patterns | 6% |
| 11 | **Readability** | Consistency of reading level across paragraphs | 5% |
| 12 | **Zipf** | Word frequency distribution (Zipf's law adherence) | 3% |

### Scoring

Each metric produces a score from 0.0 (human-like) to 1.0 (AI-like). The weighted average is passed through a calibrated sigmoid function (center=0.45, steepness=8.0) to produce the final AI probability.

**Verdicts:**
- `score < 0.35` → **"human"** — text appears naturally written
- `0.35 ≤ score < 0.65` → **"mixed"** — uncertain or partially AI
- `score ≥ 0.65` → **"ai"** — text shows strong AI patterns
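The arithmetic is easy to reproduce from the constants above. A sketch of the weighted-mean-plus-sigmoid scoring with the published verdict thresholds (an illustrative reconstruction; `ai_probability` and `verdict` are hypothetical names, not the library's internals):

```python
import math

def ai_probability(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-metric scores (0=human-like, 1=AI-like),
    squashed by the calibrated sigmoid (center=0.45, steepness=8.0)."""
    raw = sum(scores[m] * weights[m] for m in weights) / sum(weights.values())
    return 1.0 / (1.0 + math.exp(-8.0 * (raw - 0.45)))

def verdict(score: float) -> str:
    """Map the final probability onto the three verdict bands."""
    if score < 0.35:
        return "human"
    if score < 0.65:
        return "mixed"
    return "ai"
```

Note that a raw weighted mean exactly at the 0.45 center maps to a probability of 0.5, squarely inside the "mixed" band.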
### Benchmark Results

Tested on a curated benchmark of 9 samples (4 AI-generated, 5 human-written):

```
┌──────────────────┬─────────────────┐
│ Metric           │ Value           │
├──────────────────┼─────────────────┤
│ Accuracy         │ 100%            │
│ Precision        │ 100%            │
│ Recall           │ 100%            │
│ F1 Score         │ 1.000           │
│ True Positives   │ 4               │
│ False Positives  │ 0               │
│ True Negatives   │ 5               │
│ False Negatives  │ 0               │
└──────────────────┴─────────────────┘
```

### Example: AI vs Human Text

```python
from texthumanize import detect_ai

# AI-generated text (GPT-like)
ai_text = """
Furthermore, it is important to note that the implementation of artificial
intelligence constitutes a significant paradigm shift. Additionally, the
utilization of machine learning facilitates comprehensive optimization
of various processes. Nevertheless, it is worth mentioning that
considerable challenges remain.
"""
result = detect_ai(ai_text, lang="en")
print(f"AI: {result['score']:.0%}")  # ~87-89%

# Human-written casual text
human_text = """
I tried that new coffee shop downtown yesterday. Their espresso was
actually decent - not as burnt as the place on 5th. The barista
was nice too, recommended this Ethiopian blend I'd never heard of.
Might go back this weekend.
"""
result = detect_ai(human_text, lang="en")
print(f"AI: {result['score']:.0%}")  # ~20-27%
```

### Recommendations

- **Best accuracy:** texts of 100+ words
- **Short texts** (< 50 words): results may be less reliable
- **Formal texts:** may score slightly higher even if human-written
- **Multiple metrics** help even when individual ones are uncertain

---

## Language Support

### Full Dictionary Support (9 languages)

Each language pack includes:
- Bureaucratic word → natural replacements
- Formulaic connector alternatives
- Synonym dictionaries (context-aware)
- Sentence starter variations
- Colloquial markers
- Abbreviation lists (for sentence splitting)
- Language-specific trigrams (for detection)
- Stop words
- Profile-specific sentence length targets
- Perplexity boosters

| Language | Code | Bureaucratic | Connectors | Synonyms | Abbreviations |
|----------|:----:|:-----:|:------:|:------:|:------:|
| Russian | `ru` | 70+ | 25+ | 50+ | 15+ |
| Ukrainian | `uk` | 50+ | 24 | 48 | 12+ |
| English | `en` | 40+ | 25 | 35+ | 20+ |
| German | `de` | 22 | 12 | 26 | 10+ |
| French | `fr` | 20 | 12 | 20 | 8+ |
| Spanish | `es` | 18 | 12 | 18 | 8+ |
| Polish | `pl` | 18 | 12 | 18 | 8+ |
| Portuguese | `pt` | 16 | 12 | 17 | 6+ |
| Italian | `it` | 16 | 12 | 17 | 6+ |

### Universal Processor

For any language not in the dictionary list, TextHumanize uses statistical methods:
- Sentence length variation (burstiness injection)
- Punctuation normalization
- Whitespace regularization
- Perplexity boosting
- Fragment insertion

```python
# Works with any language — no dictionaries needed
result = humanize("日本語のテキスト", lang="ja")
result = humanize("Текст на казахском", lang="kk")
result = humanize("متن فارسی", lang="fa")
result = humanize("Đây là văn bản tiếng Việt", lang="vi")
```

### Auto-Detection

```python
# Language is detected automatically
result = humanize("Этот текст автоматически определяется как русский.")
print(result.lang)  # "ru"

result = humanize("This text is automatically detected as English.")
print(result.lang)  # "en"
```

---

## SEO Mode

The `seo` profile is designed for content that must preserve search ranking:

```python
result = humanize(
    text,
    profile="seo",
    intensity=40,            # lower intensity for safety
    constraints={
        "max_change_ratio": 0.3,
        "keep_keywords": ["cloud computing", "API", "microservices"],
    },
)
```

### SEO Mode Features

| Feature | Behavior |
|---------|----------|
| Keyword preservation | All specified keywords kept exactly |
| Intensity cap | Limited to safe levels |
| Colloquialisms | None inserted |
| Structure changes | Minimal |
| Sentence length | Stays within 12-25 words (optimal for SEO) |
| Synonyms | Only for non-keyword terms |
| Readability | Grade 6-8 target maintained |

### SEO Workflow Example

```python
from texthumanize import humanize, analyze, detect_ai

# 1. Analyze original
report = analyze(seo_text, lang="en")
print(f"Artificiality before: {report.artificiality_score:.0f}/100")

# 2. Humanize with SEO protection
result = humanize(seo_text, profile="seo", intensity=35,
                  constraints={"keep_keywords": ["cloud", "scalability"]})

# 3. Verify keywords preserved
for kw in ["cloud", "scalability"]:
    assert kw in result.text, f"Keyword '{kw}' was modified!"

# 4. Check AI detection improvement
ai_before = detect_ai(seo_text, lang="en")
ai_after = detect_ai(result.text, lang="en")
print(f"AI score: {ai_before['score']:.0%} → {ai_after['score']:.0%}")
```

---

## Readability Metrics

TextHumanize includes 6 readability indices:

| Index | Range | Measures |
|-------|-------|----------|
| **Flesch-Kincaid Grade** | 0-18+ | US grade level needed to read |
| **Coleman-Liau** | 0-18+ | Grade level (character-based) |
| **ARI** | 0-14+ | Automated Readability Index |
| **SMOG** | 3-18+ | Complexity from polysyllabic words |
| **Gunning Fog** | 6-20+ | Complexity estimate |
| **Dale-Chall** | 0-10+ | Difficulty using common word list |

```python
from texthumanize import analyze, full_readability

# Quick readability from analyze()
report = analyze("Your text here.", lang="en")
print(f"Flesch-Kincaid: {report.flesch_kincaid_grade:.1f}")
print(f"Coleman-Liau:   {report.coleman_liau_index:.1f}")

# Full readability with all indices
r = full_readability("Your text with multiple sentences. Each one counts.", lang="en")
for metric, value in r.items():
    print(f"  {metric}: {value}")
```

### Readability Grade Interpretation

| Grade | Level | Audience |
|:-----:|-------|----------|
| 5-6 | Easy | General public |
| 7-8 | Standard | Web content, blogs |
| 9-10 | Moderate | Business writing |
| 11-12 | Difficult | Academic papers |
| 13+ | Complex | Technical/legal |
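The grade-level indices follow their standard published formulas. For instance, the Flesch-Kincaid grade combines words-per-sentence and syllables-per-word; below is the textbook formula (the library's own tokenization and syllable counting may differ slightly):

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid grade-level formula."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# A 100-word, 8-sentence text with 130 syllables:
print(flesch_kincaid_grade(100, 8, 130))  # ≈ 4.6 → "Easy" in the table above
```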
---

## Paraphrasing Engine

The paraphrasing engine uses syntactic transformations (no ML):

### Transformations Applied

| Transformation | Example |
|---------------|---------|
| **Clause swap** | "Although X, Y." → "Y, although X." |
| **Passive→Active** | "The report was written by John." → "John wrote the report." |
| **Sentence splitting** | "X, and Y, and Z." → "X. Y. Z." |
| **Adverb fronting** | "He quickly ran." → "Quickly, he ran." |
| **Nominalization** | "He decided to go." → "His decision was to go." |

```python
from texthumanize import paraphrase

original = "Although the study was comprehensive, the results were inconclusive."
result = paraphrase(original, lang="en", intensity=0.8)
print(result)
# → e.g. "The results were inconclusive, although the study was comprehensive."
```

---

## Tone Analysis & Adjustment

### Tone Levels

| Tone | Formality | Example |
|------|:---------:|---------|
| `very_formal` | 0.9+ | "The undersigned hereby acknowledges..." |
| `formal` | 0.7-0.9 | "Please submit the required documentation." |
| `neutral` | 0.4-0.7 | "Send us the documents." |
| `casual` | 0.2-0.4 | "Just send over the docs." |
| `very_casual` | 0.0-0.2 | "Shoot me the docs!" |

### Markers Detected

For English: `hereby`, `pursuant`, `constitutes`, `facilitate`, `implement`, `utilize`, `gonna`, `wanna`, `hey`, `awesome`, etc.

For Russian: `настоящим`, `осуществить`, `однако`, `привет`, `круто`, etc.

```python
from texthumanize import analyze_tone, adjust_tone

# Analyze
tone = analyze_tone("Pursuant to our agreement, please facilitate the transfer.", lang="en")
print(tone['primary_tone'])  # "formal"
print(tone['formality'])     # ~0.85

# Adjust down
casual = adjust_tone("Pursuant to our agreement, please facilitate the transfer.",
                     target="casual", lang="en")
print(casual)  # → "Based on our agreement, go ahead and start the transfer."
```

---

## Watermark Detection & Cleaning

### What It Detects

| Type | Description | Example |
|------|-------------|---------|
| **Zero-width chars** | U+200B, U+200C, U+200D, U+FEFF | Invisible between words |
| **Homoglyphs** | Cyrillic/Latin lookalikes | `а` (Cyrillic) vs `a` (Latin) |
| **Invisible formatting** | Invisible Unicode chars | U+2060, U+2061, etc. |
| **Spacing steganography** | Unusual space patterns | Extra spaces encoding data |
| **Statistical watermarks** | AI watermark patterns | Token probability anomalies |

```python
from texthumanize import detect_watermarks, clean_watermarks

# Full detection
report = detect_watermarks(suspicious_text, lang="en")
if report['has_watermarks']:
    print(f"Found: {report['watermark_types']}")
    print(f"Confidence: {report['confidence']:.0%}")
    print(f"Cleaned: {report['cleaned_text']}")
else:
    print("No watermarks detected")

# Quick clean
clean = clean_watermarks(suspicious_text)
```
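For the zero-width class, cleaning reduces to deleting a handful of code points. A minimal sketch covering only the first row of the table (the real cleaner also handles homoglyphs, spacing patterns, and statistical watermarks):

```python
# Zero-width characters from the table: U+200B, U+200C, U+200D, U+FEFF.
# Sketch only -- not the library's internal implementation.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\ufeff"))

def strip_zero_width(text: str) -> str:
    """Delete zero-width code points via str.translate."""
    return text.translate(ZERO_WIDTH)

print(strip_zero_width("Te\u200bxt wi\u200bth hid\u200bden chars"))
# → "Text with hidden chars"
```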
---

## Text Spinning

Generate unique content variants using dictionary-based synonym replacement.

### Spintax

The spinner can output spintax format for use in other tools:

```python
from texthumanize.spinner import ContentSpinner

spinner = ContentSpinner(lang="en", seed=42)

# Generate spintax
spintax = spinner.generate_spintax("The system provides important data.")
print(spintax)
# → "The {system|platform} {provides|offers} {important|crucial} {data|information}."

# Resolve spintax to one variant
resolved = spinner.resolve_spintax(spintax)
print(resolved)
```

### High-Level API

```python
from texthumanize import spin, spin_variants

# Single variant
unique = spin("Original text here.", lang="en", intensity=0.6, seed=42)

# Multiple variants
variants = spin_variants("Original text.", count=5, lang="en")
for v in variants:
    print(v)
```

---

## Coherence Analysis

Measures how well text flows at the paragraph level.

### Metrics

| Metric | Range | Description |
|--------|:-----:|-------------|
| `overall` | 0-1 | Weighted average of all coherence metrics |
| `lexical_cohesion` | 0-1 | Word overlap between adjacent sentences |
| `transition_score` | 0-1 | Quality of logical transitions |
| `topic_consistency` | 0-1 | How consistent the topic is throughout |
| `sentence_opening_diversity` | 0-1 | Variety in sentence beginnings |

### Issues Detected

The analyzer flags specific problems:
- "Weak transition between paragraph 2 and 3"
- "Topic drift detected at paragraph 4"
- "Repetitive sentence openings in paragraph 1"
- "Paragraph too short (1 sentence)"

```python
from texthumanize import analyze_coherence

report = analyze_coherence(article_text, lang="en")
print(f"Overall: {report['overall']:.2f}")

if report['overall'] < 0.5:
    print("Text coherence is low. Issues:")
    for issue in report['issues']:
        print(f"  - {issue}")
```
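The `lexical_cohesion` metric above measures word overlap between adjacent sentences. One common way to score that is the mean Jaccard overlap of adjacent word sets; the sketch below illustrates the idea and is not necessarily the library's exact formula:

```python
def lexical_cohesion(sentences: list[set[str]]) -> float:
    """Mean Jaccard overlap between adjacent sentences' word sets.
    Illustrative sketch of one standard approach, not the library's code."""
    if len(sentences) < 2:
        return 1.0  # a single sentence is trivially cohesive
    overlaps = [
        len(a & b) / len(a | b) if (a | b) else 0.0
        for a, b in zip(sentences, sentences[1:])
    ]
    return sum(overlaps) / len(overlaps)
```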
**Context window** — examines surrounding words to determine word sense\n\n```python\nfrom texthumanize.context import ContextualSynonyms\n\nctx = ContextualSynonyms(lang=\"en\", seed=42)\nctx.detect_topic(\"The server handles API requests efficiently.\")\n\n# Choose best synonym for \"important\" in tech context\nbest = ctx.choose_synonym(\"important\", [\"significant\", \"crucial\", \"key\", \"vital\"],\n                          \"This is an important update to the system.\")\nprint(best)  # \"key\" or \"crucial\" (tech-appropriate)\n```\n\n---\n\n## Using Individual Modules\n\nEach module can be used independently:\n\n```python\n# Typography normalization only\nfrom texthumanize.normalizer import TypographyNormalizer\nnorm = TypographyNormalizer(profile=\"web\")\nresult = norm.normalize(\"Text — with dashes and «quotes»...\")\n# → 'Text - with dashes and \"quotes\"...'\n\n# Debureaucratization only\nfrom texthumanize.decancel import Debureaucratizer\ndb = Debureaucratizer(lang=\"en\", profile=\"chat\", intensity=80)\nresult = db.process(\"This text utilizes a comprehensive methodology.\")\n# → \"This text uses a complete method.\"\n\n# Structure diversification\nfrom texthumanize.structure import StructureDiversifier\nsd = StructureDiversifier(lang=\"en\", profile=\"web\", intensity=60)\nresult = sd.process(\"Furthermore, X. Additionally, Y. Moreover, Z.\")\n\n# Sentence splitting\nfrom texthumanize.sentence_split import split_sentences\nsents = split_sentences(\"Dr. Smith said hello. She left.\", lang=\"en\")\n\n# AI detection (low-level)\nfrom texthumanize.detectors import detect_ai\nresult = detect_ai(\"Text to check.\", lang=\"en\")\nprint(result.ai_probability, result.verdict)\n\n# Tone analysis (low-level)\nfrom texthumanize.tone import analyze_tone\nreport = analyze_tone(\"Formal text here.\", lang=\"en\")\nprint(report.primary_tone, report.formality)\n\n# Content spinning\nfrom texthumanize.spinner import ContentSpinner\nspinner = ContentSpinner(lang=\"en\", seed=42)\nspintax = spinner.generate_spintax(\"The system works well.\")\n\n# Analysis only\nfrom texthumanize.analyzer import TextAnalyzer\nanalyzer = TextAnalyzer(lang=\"en\")\nreport = analyzer.analyze(\"Text to analyze.\")\n```\n\n---\n\n## Performance \u0026 Benchmarks\n\nAll benchmarks on Apple Silicon (M1 Pro), Python 3.12, single thread.\n\n### Processing Speed\n\n| Text Size | Time | Words/sec |\n|-----------|------|-----------|\n| 100 words | ~3ms | ~33,000 |\n| 500 words | ~8ms | ~62,000 |\n| 1,000 words | ~15ms | ~66,000 |\n| 5,000 words | ~60ms | ~83,000 |\n| 10,000 words | ~120ms | ~83,000 |\n\n### AI Detection Speed\n\n| Text Size | Time |\n|-----------|------|\n| 100 words | ~5ms |\n| 500 words | ~12ms |\n| 1,000 words | ~20ms |\n\n### Memory Usage\n\n- Base import: ~2MB\n- Per text processing: negligible overhead\n- No model files to load\n\n### Test Suite Performance\n\n```\n500 tests in 2.21 seconds\nCoverage: 85%\n```\n\n---\n\n## Testing\n\n```bash\n# Run all tests (500 tests)\npytest\n\n# With coverage report\npytest --cov=texthumanize --cov-report=term-missing\n\n# Quick run (no coverage)\npytest -q\n\n# Verbose\npytest -v\n\n# Lint check\nruff check texthumanize/\n\n# Type check\nmypy texthumanize/\n\n# Pre-commit hooks\npre-commit run --all-files\n\n# Specific test suite\npytest tests/test_core.py             # Core humanize/analyze\npytest tests/test_golden.py            # Golden master tests\npytest tests/test_segmenter.py         # Segmenter protection\npytest tests/test_normalizer.py        # 
Typography normalization\npytest tests/test_decancel.py          # Debureaucratization\npytest tests/test_structure.py         # Structure diversification\npytest tests/test_multilang.py         # Multi-language support\npytest tests/test_naturalizer.py       # Style naturalization\npytest tests/test_detectors.py         # AI detection\npytest tests/test_morphology_ext.py    # Morphological engine (extended)\npytest tests/test_coverage_boost.py    # Coherence/paraphrase/watermark\npytest tests/test_sentence_split.py    # Sentence splitter\npytest tests/test_tone.py              # Tone analysis\npytest tests/test_watermark.py         # Watermark detection\npytest tests/test_spinner.py           # Content spinning\npytest tests/test_coherence.py         # Coherence analysis\npytest tests/test_paraphrase.py        # Paraphrasing\npytest tests/test_context.py           # Context-aware synonyms\npytest tests/test_tokenizer.py         # Tokenizer\npytest tests/test_api_wrappers.py      # API wrapper functions\npytest tests/test_cli.py               # CLI interface\n```\n\n### Coverage Summary\n\n| Module | Coverage |\n|--------|:--------:|\n| core.py | 98% |\n| decancel.py | 97% |\n| segmenter.py | 98% |\n| lang_detect.py | 96% |\n| coherence.py | 96% |\n| tokenizer.py | 95% |\n| spinner.py | 94% |\n| normalizer.py | 94% |\n| tone.py | 94% |\n| morphology.py | 93% |\n| analyzer.py | 93% |\n| detectors.py | 90% |\n| utils.py | 90% |\n| repetitions.py | 88% |\n| structure.py | 88% |\n| paraphrase.py | 87% |\n| watermark.py | 87% |\n| liveliness.py | 86% |\n| validator.py | 86% |\n| cli.py | 85% |\n| lang/ | 100% |\n| **Overall** | **85%** |\n\n---\n\n## Architecture\n\n```\ntexthumanize/\n├── __init__.py          # Public API exports (16 functions + 4 classes)\n├── core.py              # API facade: humanize(), analyze(), detect_ai(), etc.\n├── api.py               # REST API: zero-dependency HTTP server, 12 endpoints\n├── cli.py               # CLI interface with 15+ commands\n├── pipeline.py          # 10-stage pipeline + plugin system\n│\n├── analyzer.py          # Artificiality scoring + 6 readability metrics\n├── tokenizer.py         # Paragraph/sentence/word tokenization\n├── sentence_split.py    # Smart sentence splitter (abbreviations, decimals)\n│\n├── segmenter.py         # Code/URL/email/brand protection\n├── normalizer.py        # Typography normalization\n├── decancel.py          # Debureaucratization\n├── structure.py         # Sentence structure diversification\n├── repetitions.py       # Repetition reduction (context-aware)\n├── liveliness.py        # Natural phrasing injection\n├── universal.py         # Universal processor (any language)\n├── naturalizer.py       # Style naturalization (burstiness, perplexity)\n├── validator.py         # Quality validation + automatic rollback\n│\n├── detectors.py         # AI text detector (12 statistical metrics)\n├── paraphrase.py        # Syntactic paraphrasing engine\n├── tone.py              # Tone analysis \u0026 adjustment (7 levels)\n├── watermark.py         # Watermark detection \u0026 cleaning\n├── spinner.py           # Text spinning \u0026 spintax generation\n├── coherence.py         # Coherence \u0026 paragraph flow analysis\n├── morphology.py        # Morphological engine (RU/UK/EN/DE)\n├── context.py           # Context-aware synonym selection (WSD)\n│\n├── lang_detect.py       # Language detection (9 languages)\n├── utils.py             # Options, profiles, result classes\n├── __main__.py          # python -m texthumanize\n│\n└── 
---

## PHP Library

A full PHP port is available in the `php/` directory with identical functionality.

### PHP Quick Start

```php
<?php
use TextHumanize\TextHumanize;

// Basic usage
$result = TextHumanize::humanize("Text to process", profile: 'web');
echo $result->processed;

// Chunk processing for large texts
$result = TextHumanize::humanizeChunked($longText, chunkSize: 5000);

// Analysis
$report = TextHumanize::analyze("Text to analyze");
echo $report->artificialityScore;

// Explanation
$explanation = TextHumanize::explain("Text to explain");
```

### PHP Modules

The PHP port includes all new v0.4.0 modules:

| Module | PHP Class |
|--------|-----------|
| AI Detection | `AIDetector` |
| Sentence Splitting | `SentenceSplitter` |
| Paraphrasing | `Paraphraser` |
| Tone Analysis | `ToneAnalyzer` |
| Watermark Detection | `WatermarkDetector` |
| Content Spinning | `ContentSpinner` |
| Coherence Analysis | `CoherenceAnalyzer` |

### PHP Installation

```bash
cd php/
composer install
php vendor/bin/phpunit  # run tests
```

See [php/README.md](php/README.md) for full PHP documentation.

---

## Code Quality & Tooling

### Linting

TextHumanize enforces strict code quality with [ruff](https://github.com/astral-sh/ruff):

```bash
# Check all code (0 errors)
ruff check texthumanize/

# Auto-fix safe issues
ruff check --fix texthumanize/
```

Rules enabled: `E` (pycodestyle), `F` (Pyflakes), `W` (warnings), `I` (isort). Maximum line length: 100 characters.

### Type Checking

PEP 561 compliant — ships a `py.typed` marker for downstream type checkers:

```bash
mypy texthumanize/
```

Configuration in `pyproject.toml`:
- `python_version = "3.9"` — minimum supported version
- `check_untyped_defs = true` — checks function bodies even without annotations
- `warn_return_any = true` — warns on `Any` return types

### Pre-commit Hooks

Automatic quality checks on every commit:

```bash
pre-commit install         # one-time setup
pre-commit run --all-files # manual run
```

Hooks configured:
- Trailing whitespace removal
- End-of-file fixer
- YAML/TOML validation
- Large file prevention
- Merge conflict detection
- Ruff lint + format check

### CI/CD Pipeline

GitHub Actions runs on every push/PR:

| Step | Description |
|------|-------------|
| **Lint** | `ruff check` — zero errors enforced |
| **Test** | `pytest` across Python 3.9–3.12 + PHP 8.1–8.3 |
| **Coverage** | `pytest-cov` — 85% minimum |
| **Types** | `mypy` on Python 3.12 (non-blocking) |

---

## Migration Guide (v0.4 → v0.5)

### What's New in v0.5

1. **500 tests** — up from 382, covering 85% of the codebase (was 80%)
2. **Zero lint errors** — `ruff check` passes cleanly (67 errors fixed)
3. **Type checking** — PEP 561 `py.typed` marker, mypy configuration
4. **Pre-commit hooks** — ruff + formatting checks on every commit
5. **Enhanced CI/CD** — ruff lint step + mypy type check + XML coverage output
6. **pytest fixtures** — `conftest.py` with 12 reusable fixtures for all tests
7. **PHP fixes** — type safety improvements in SentenceSplitter and ToneAnalyzer

### Breaking Changes

**None.** v0.5.0 is fully backward-compatible with v0.4.0. All existing code works without changes.

### Developer Tooling Setup

```bash
# Install dev dependencies (new in 0.5)
pip install -e ".[dev]"

# Set up pre-commit hooks
pre-commit install

# Verify everything passes
ruff check texthumanize/   # 0 errors
pytest -q                  # 500 passed
```

---

## FAQ & Troubleshooting

### General

**Q: Does TextHumanize use the internet?**
No. All processing is 100% local. No API calls, no data sent anywhere.

**Q: Does it require a GPU or large models?**
No. Pure algorithmic processing using the Python standard library only.

**Q: Can I use it commercially?**
The current license is Personal Use Only. Contact the author for commercial licensing.

**Q: Which Python versions are supported?**
Python 3.9 through 3.12+ (tested in CI/CD).

### Processing

**Q: My text isn't changing much. Why?**
Increase `intensity` (e.g., 80–100) or use a more aggressive profile like `chat`. The `seo` and `formal` profiles intentionally make fewer changes.

**Q: Can I undo changes?**
The `explain(result)` function shows all changes. The original text is always available in `result.original`.

**Q: How do I protect specific words from changing?**
Use `constraints={"keep_keywords": ["word1", "word2"]}` or `preserve={"brand_terms": ["Brand"]}` (see the sketch below).
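Putting those options together, a minimal keyword-protection sketch (the `constraints`/`preserve` dictionaries are quoted from the answer above; the surrounding call shape is assumed from the earlier examples):

```python
from texthumanize import humanize

# `constraints` and `preserve` come verbatim from this FAQ; `profile` and
# `.processed` are assumed by analogy with the PHP quick start.
text = "BuyReadySite delivers turnkey solutions for the aforementioned use cases."

result = humanize(
    text,
    profile="web",
    constraints={"keep_keywords": ["turnkey solutions"]},  # SEO keywords stay intact
    preserve={"brand_terms": ["BuyReadySite"]},            # brand names are never rewritten
)
print(result.processed)
```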
**Q: The output has too many colloquialisms.**
Switch to `profile="docs"` or `profile="formal"` and lower the intensity.

### AI Detection

**Q: The detector says my text is AI-generated, but it's not.**
Formal, academic, or legal text can score higher due to formulaic patterns. This is expected: the detector works best on general-purpose text (blogs, articles, essays).

**Q: How accurate is the AI detector?**
On our (small) benchmark of 9 samples, all 4 AI texts and all 5 human texts were classified correctly (F1 = 1.0). Real-world accuracy depends on text type and length; results are best with 100+ words.

**Q: Does it detect ChatGPT/GPT-4/Claude specifically?**
No. It detects statistical patterns common to all LLMs rather than any specific model, so it works for GPT-3.5, GPT-4, Claude, Gemini, etc.

### Languages

**Q: My language isn't in the supported list.**
Use `lang="xx"` (your ISO code) — the universal processor will handle typography normalization, sentence variation, and burstiness without language-specific dictionaries.

**Q: Can I add a new language?**
Yes! Create a new file in `texthumanize/lang/` following the existing pattern, using any existing language file (e.g., `en.py`) as a template. A hypothetical skeleton follows below.
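Purely as an illustration of the "data only, no logic" convention, a pack might look like the following. The dictionary names here are invented for this sketch; copy the real structure from an existing file such as `en.py` or `ru.py`, not from here:

```python
# texthumanize/lang/nl.py -- hypothetical Dutch pack.
# The names BUREAUCRATIC and SYNONYMS are INVENTED for illustration;
# mirror the actual keys used by the existing packs.

# Formulaic/bureaucratic phrase -> plain alternative
BUREAUCRATIC = {
    "met betrekking tot": "over",   # "with regard to" -> "about"
    "ten behoeve van": "voor",      # "for the benefit of" -> "for"
}

# Word -> natural substitutes, used for repetition reduction
SYNONYMS = {
    "belangrijk": ["wezenlijk", "cruciaal"],
}
```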
### CLI & API

**Q: How do I start the REST API?**

```bash
python -m texthumanize.api --port 8080
# or
texthumanize dummy --api --port 8080
```
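Once the server is running, you can drive it with nothing but the standard library. A hypothetical client sketch (the route name and JSON fields below are assumptions, not the documented API; the actual 12 endpoints are defined in `texthumanize/api.py`):

```python
import json
import urllib.request

# Hypothetical route and payload shape; verify the path and fields
# against texthumanize/api.py before relying on this.
req = urllib.request.Request(
    "http://localhost:8080/humanize",
    data=json.dumps({"text": "Text to process", "profile": "web"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read().decode("utf-8")))
```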
**Q: Is there WebSocket support?**
Not yet. The current API is HTTP/REST only.

---

## Contributing

Contributions are welcome:

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/my-feature`
3. Write tests for new functionality
4. Ensure all tests pass: `pytest`
5. Commit changes: `git commit -m 'Add my feature'`
6. Push: `git push origin feature/my-feature`
7. Open a Pull Request

### Areas for Improvement

- **Dictionaries** — expand bureaucratic and synonym dictionaries for all languages
- **Languages** — add new language packs (Japanese, Chinese, Arabic, Korean, etc.)
- **Tests** — more edge cases and golden tests, push coverage past 90%
- **Documentation** — tutorials, video walkthroughs, blog posts
- **Ports** — Node.js, Go, Rust implementations
- **API** — WebSocket support, authentication, rate limiting
- **Morphology** — expand to more languages (FR, ES, PL, PT, IT)
- **AI Detector** — larger benchmark suite, more metrics

### Development Setup

```bash
git clone https://github.com/ksanyok/TextHumanize.git
cd TextHumanize
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pre-commit install
ruff check texthumanize/
pytest --cov=texthumanize
```

---

## Support the Project

If you find TextHumanize useful, consider supporting the development:

[![PayPal](https://img.shields.io/badge/PayPal-Donate-blue.svg?logo=paypal)](https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=ksanyok%40me.com&item_name=TextHumanize&currency_code=USD)

- Star the repository
- Report bugs and suggest features
- Improve documentation
- Add language packs

---

## License

TextHumanize Personal Use License. See [LICENSE](LICENSE).

This library is licensed for **personal, non-commercial use only**. Commercial use requires a separate license — contact the author for details.

---

<p align="center">
  <a href="https://github.com/ksanyok/TextHumanize">GitHub</a> ·
  <a href="https://github.com/ksanyok/TextHumanize/issues">Issues</a> ·
  <a href="https://github.com/ksanyok/TextHumanize/discussions">Discussions</a>
</p>