{"id":50348078,"url":"https://github.com/redredchen01/wm-tool-enhanced","last_synced_at":"2026-05-29T20:01:31.996Z","repository":{"id":351523753,"uuid":"1211354999","full_name":"redredchen01/wm-tool-enhanced","owner":"redredchen01","description":null,"archived":false,"fork":false,"pushed_at":"2026-04-15T10:29:55.000Z","size":1528,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"feat/auto-quality-enhancement","last_synced_at":"2026-04-15T12:15:56.801Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/redredchen01.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-04-15T10:03:44.000Z","updated_at":"2026-04-15T10:29:59.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/redredchen01/wm-tool-enhanced","commit_stats":null,"previous_names":["redredchen01/wm-tool-enhanced"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/redredchen01/wm-tool-enhanced","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redredchen01%2Fwm-tool-enhanced","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redredchen01%2Fwm-tool-enhanced/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redredchen01%2Fwm-tool-enhanced/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/redredchen01%2Fwm-tool-enhanced/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/redredchen01","download_url":"https://codeload.github.com/redredchen01/wm-tool-enhanced/tar.gz/refs/heads/feat/auto-quality-enhancement","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redredchen01%2Fwm-tool-enhanced/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33668186,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-29T20:01:31.159Z","updated_at":"2026-05-29T20:01:31.988Z","avatar_url":"https://github.com/redredchen01.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# wm-tool\n\nLocal video floating-text watermark detector and remover — frame-by-frame OCR, temporal tracking, stable mask generation, multi-strategy removal, and optional custom watermark overlay.\n\n## Features\n\n- **EasyOCR / PaddleOCR detection** with confidence filtering, area ratio rejection, and edge-region focus\n- **Temporal tracking** — assigns bounding boxes across keyframes using EMA-smoothed centroids; survives OCR misses via configurable gap tolerance\n- **Geometry filter** — distinguishes flat 2D overlays from angled physical 
logos (clothing, products) via skew / aspect / area coefficient-of-variation tests\n- **ROI refine pass** — second-pass detection inside predicted track regions at a lower confidence threshold to catch semi-transparent text\n- **Stable mask generation** — unions per-track bounding boxes across all frames; supports expand/feather margins, hint regions, and strict-mode (detection-only, no interpolation)\n- **Auto-chunking** — splits long videos into ≤4 GB RAM chunks and concatenates the output seamlessly\n- **Multiple removal strategies** — see table below\n- **Gradio web UI** — full pipeline accessible via browser without writing config files\n- **Apple Silicon GPU support** — MPS auto-detected; CUDA supported on Linux\n- **Phase 2 Performance Optimizations** — 55-60% speedup via dynamic ROI, optical flow skipping, batch OCR, and adaptive morphology\n\n## Installation\n\n```bash\npip install -r requirements.txt\n```\n\n**Choose an OCR engine** (Phase 1 optimization complete):\n- EasyOCR (default): `pip install easyocr` (stable, multilingual, 100-150ms/frame)\n- PaddleOCR (recommended): `pip install \"paddleocr\u003e=2.7\"` (2-3x faster, 80% smaller model, 30-50ms/frame)\n\nSelect `backend: \"easyocr\"` or `backend: \"paddleocr\"` in `configs/demo.yaml`.\n\nRequires `ffmpeg` on `PATH` for accurate duration probing and video concatenation.\n\n## Quick Start\n\n**CLI**\n\n```bash\n# Single file\npython -m src.app --input video.mp4 --output out.mp4 --config configs/demo.yaml\n\n# Batch mode\npython -m src.app --input-dir ./videos/ --output-dir ./output/ --config configs/demo.yaml\n```\n\n**Web UI**\n\n```bash\npython webui.py\n# or\npython webui.py --port 7861 --share\n```\n\n## Configuration\n\nAll options are set via a YAML file or the web UI. 
The full schema is defined in `src/config.py`.\n\n```yaml\ndetection:\n  backend: easyocr           # easyocr | paddleocr\n  detect_interval: 10        # run OCR every N frames\n  confidence_threshold: 0.3\n  max_area_ratio: 0.25       # reject boxes covering \u003e25% of frame (subtitles/logos)\n  detect_max_side: 960       # downscale longer side before OCR (0 = disabled)\n  frame_enhance: false       # CLAHE + unsharp mask — helps semi-transparent text\n  roi_refine: false          # second-pass inside predicted regions\n\n  # Phase 2 Performance Optimizations (experimental, enabled by default)\n  enable_dynamic_roi: true       # learn watermark position, auto-narrow search region\n  roi_history_window: 15         # frames to accumulate before suggesting ROI\n  enable_optical_flow: true      # skip detection on low-motion frames\n  motion_threshold: 5.0          # optical flow magnitude threshold (pixels)\n\ntracking:\n  gap_tolerance: 60          # survive N-frame OCR gaps before closing a track\n  min_track_frames: 2        # drop single-frame noise\n  min_track_ratio: 0.0       # fraction of keyframes a track must appear in\n  geometry_filter: true      # reject physically angled text\n\nmask:\n  expand_px: 20              # grow mask beyond detected bbox\n  cover_expand_px: 15        # extra pixels during removal only\n  feather_radius: 0          # soft edge (pixels)\n  hint_regions:              # force-mask regions regardless of detection\n    - [0.0, 0.0, 0.15, 0.08] # x_pct, y_pct, w_pct, h_pct\n\n  # Phase 2 Performance Optimizations\n  enable_morphological_fastpath: true  # adaptive dilation for fast mask generation\n\nremove:\n  strategy: gaussian_blur        # default strategy when adaptive is disabled\n  blur_ksize: 71\n  \n  # Phase 3.8: Adaptive Watermark Removal Strategy\n  # Routes removal based on background complexity: simple regions → Gaussian blur (fast),\n  # complex regions → LaMa (high quality). 
Improves output quality by 5-20%.\n  enable_adaptive_strategy: true\n  adaptive_complexity_threshold: 0.35  # 0.0 (simple) to 1.0 (complex)\n  simple_region_strategy: gaussian_blur  # Fast strategies: gaussian_blur, smart_cover, solid\n  complex_region_strategy: lama           # Quality strategies: lama, inpaint, temporal_median\n  blur_adaptive_kernel_min: 11   # Small kernel for simple regions (preserve detail)\n  blur_adaptive_kernel_max: 31   # Large kernel for complex regions (coverage)\n  \n  encode_crf: 23             # H.264 quality (18–28 typical)\n  encode_preset: fast\n\ndebug:\n  enabled: false\n  output_dir: debug\n```\n\n## Performance Optimization (Phase 2)\n\nPhase 2 optimizations are **enabled by default** and can significantly reduce processing time:\n\n| Optimization | Speedup | Condition |\n|---|---|---|\n| **Dynamic ROI** | 30-40% | Stable watermark position |\n| **Optical Flow Skip** | 20-30% | Low-motion scenes |\n| **Batch OCR** | 25-30% | Multi-frame detection |\n| **Morphological Fast Path** | 15-25% | Mixed mask sizes |\n| **Detector Cache** | 2-3s/video | Multi-video batch |\n\n**Expected total improvement**: 55-60% for typical 1080p video (130s → 45-55s)\n\nTo disable any optimization:\n\n```yaml\ndetection:\n  enable_dynamic_roi: false\n  enable_optical_flow: false\nmask:\n  enable_morphological_fastpath: false\n```\n\nSee `docs/reports/OPTIMIZATION_COMPLETE_SUMMARY.md` and `docs/reports/AUTO_OPTIMIZATION_GUIDE.md` for detailed information.\n\n## Phase 2 Complexity-Driven Removal (Beta)\n\n**Phase 2 Unit 3+** introduces intelligent watermark removal via background complexity detection and adaptive parameter tuning:\n\n### Background Complexity Detection\n\nAnalyzes the mask region to determine adaptive inpainting parameters:\n\n- **Histogram Variance**: Measures color/brightness variation in masked region\n- **Edge Density**: Computes Sobel gradient magnitude around mask boundary  \n- **Complexity Score** (0.0-1.0): Combined metric 
where 0=simple, 1=complex\n\n### Adaptive LaMa Inpainting\n\nWhen `remove.enable_complexity_detection: true`, the mask dilation kernel used for LaMa inpainting adapts per frame:\n\n```yaml\nremove:\n  strategy: lama                              # High-fidelity learned inpainting\n  enable_lama: true\n  enable_complexity_detection: true           # Complexity-aware adaptation\n  lama_complexity_kernel_min: 3               # Min dilation kernel (simple bkg)\n  lama_complexity_kernel_max: 25              # Max dilation kernel (complex bkg)\n  enable_lama_batch: true                     # Batch frame processing (GPU)\n  lama_batch_size: 4                          # Frames per batch\n  enable_lama_model_cache: true               # Keep model in VRAM\n```\n\n### Phase 2 Presets\n\nThree preconfigured presets optimize for different motion scenarios:\n\n| Preset | detect_interval | kernel_range | Use Case |\n|---|---|---|---|\n| `P2-穩定` | 15 | 3–15px | Stable background, low motion |\n| `P2-快速` | 5 | 9–25px | Fast-moving watermarks, complex scenes |\n| `P2-混合` | 10 | 5–21px | Balanced: variable motion/complexity |\n\nAccess via Web UI: buttons under \"🔮 Phase 2 複雜度感知預設\" (Phase 2 complexity-aware presets), or set the equivalent values in YAML (the example matches `P2-混合`):\n\n```yaml\ndetection:\n  detect_interval: 10\n  frame_enhance: true\nremove:\n  strategy: lama\n  enable_complexity_detection: true\n  lama_complexity_kernel_min: 5\n  lama_complexity_kernel_max: 21\nupscale:\n  enable_upscale: true              # ESRGAN 4x super-resolution\n  target_height: 1080\n```\n\n### Expected Performance\n\nPhase 2 with LaMa + ESRGAN upscaling:\n\n- **Speed**: 3-6s/frame (GPU) → 2-4s/frame with batch processing (35% faster)\n- **Quality**: SSIM ≥0.80, ΔE ≤15 (better texture/edge preservation vs. blur)\n- **Resolution**: Auto-upscale 720p removal output to 1080p\n\n## Phase 3.8: Adaptive Watermark Removal Strategy\n\nAutomatically routes watermark removal based on background complexity, achieving 5-20% quality improvement:\n\n### How It Works\n\n1. 
**Complexity Detection** per watermark region\n   - **Low complexity** (uniform, simple textures) → Gaussian blur (fast, preserves detail)\n   - **High complexity** (fine textures, edges) → LaMa (high-quality inpainting)\n\n2. **Adaptive Kernel Sizing** for Gaussian blur\n   - Simple regions use **small kernels** (11px) for detail preservation\n   - Threshold-based routing ensures smooth transitions\n\n### Configuration\n\n```yaml\nremove:\n  enable_adaptive_strategy: true                  # Enable adaptive routing\n  adaptive_complexity_threshold: 0.35             # Complexity boundary (0.0-1.0)\n  simple_region_strategy: gaussian_blur           # Fast removal for simple regions\n  complex_region_strategy: lama                   # Quality removal for complex regions\n  blur_adaptive_kernel_min: 11                    # Min kernel (simple regions)\n  blur_adaptive_kernel_max: 31                    # Max kernel (complex regions)\n```\n\n### Quality Targets\n\n- **SSIM** ≥ 0.85 (detail preservation)\n- **ΔE** ≤ 10 (color consistency)\n- **Temporal stability** \u003c 1% flicker\n- **Processing** ≤ 2 min per minute of 1080p video\n\n### Tuning\n\nRegions scoring above the threshold are treated as complex and routed to LaMa; regions below it go to Gaussian blur.\n\n- **Threshold too low**: More regions route to LaMa; slower but higher quality\n- **Threshold too high**: More regions route to Gaussian blur; faster, but complex backgrounds may be blurred instead of inpainted\n- **Default (0.35)**: Balanced for editorial workflows\n\n## Strategies\n\n| Strategy | Description |\n|---|---|\n| `gaussian_blur` | Repeated Gaussian blur over the masked region (default) |\n| `smart_cover` | Blur with optional solid-color tint blended on top |\n| `mosaic` | Pixelate the masked region |\n| `solid` | Fill with a solid color |\n| `delogo` | FFmpeg `delogo` filter — interpolates from surrounding pixels |\n| `inpaint` | OpenCV Telea inpainting |\n| `temporal_median` | Sample ±N frames and use the per-pixel median as background |\n\n`temporal_median` produces the cleanest result for stationary watermarks but requires more memory and time.\n\n## Debug Output\n\nEnable 
`debug.enabled: true` to write per-stage artifacts to `debug/`:\n\n- `keyframes/` — sampled frames sent to OCR\n- `detections/` — frames with raw OCR bounding boxes drawn\n- `tracks/` — frames with tracker assignments and IDs\n- `masks/` — binary mask images per frame\n- `removed/` — frames after removal, before re-encoding\n\nUseful for diagnosing missed detections or incorrect mask placement.\n\n## Requirements\n\n- Python 3.9+\n- OpenCV (`opencv-python`)\n- EasyOCR (default) or PaddleOCR\n- NumPy, Pydantic v2, PyYAML, tqdm\n- Gradio (web UI only)\n- `ffmpeg` binary (duration probing, video concatenation)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fredredchen01%2Fwm-tool-enhanced","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fredredchen01%2Fwm-tool-enhanced","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fredredchen01%2Fwm-tool-enhanced/lists"}