{"id":44575899,"url":"https://github.com/zensgit/dedupcad-vision","last_synced_at":"2026-04-01T19:10:53.311Z","repository":{"id":330158189,"uuid":"1099419261","full_name":"zensgit/dedupcad-vision","owner":"zensgit","description":"Graphics-based CAD drawing deduplication using computer vision","archived":false,"fork":false,"pushed_at":"2026-03-30T08:25:53.000Z","size":46858,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-03-30T10:14:42.939Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zensgit.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"docs/security-audit-report.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-11-19T01:04:51.000Z","updated_at":"2026-03-30T08:25:57.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/zensgit/dedupcad-vision","commit_stats":null,"previous_names":["zensgit/dedupcad-vision"],"tags_count":22,"template":false,"template_full_name":null,"purl":"pkg:github/zensgit/dedupcad-vision","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zensgit%2Fdedupcad-vision","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zensgit%2Fdedupcad-vision/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zensgit%2Fdedupcad-vision/releases","manifests_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/zensgit%2Fdedupcad-vision/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zensgit","download_url":"https://codeload.github.com/zensgit/dedupcad-vision/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zensgit%2Fdedupcad-vision/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31291092,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T13:12:26.723Z","status":"ssl_error","status_checked_at":"2026-04-01T13:12:25.102Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-14T05:09:29.298Z","updated_at":"2026-04-01T19:10:53.306Z","avatar_url":"https://github.com/zensgit.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CADDedup Vision\n\n**Graphics-based CAD drawing deduplication using computer vision techniques**\n\n[![CI](https://github.com/zensgit/dedupcad-vision/actions/workflows/ci.yml/badge.svg)](https://github.com/zensgit/dedupcad-vision/actions/workflows/ci.yml)\n[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)\n[![License: 
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Docker](https://img.shields.io/badge/docker-ready-brightgreen.svg)](https://github.com/users/zensgit/packages/container/package/dedupcad-vision)\n\n## Overview\n\nCADDedup Vision is a high-performance, production-ready system for detecting duplicate CAD drawings using computer vision. It features a **progressive 4-layer search architecture** that balances speed and accuracy.\n\n## Documentation Map\n\n- Documentation index: [docs/DOCUMENTATION_INDEX.md](docs/DOCUMENTATION_INDEX.md)\n- Deployment guide: [docs/DEPLOYMENT.md](docs/DEPLOYMENT.md)\n- Windows Server deployment: [docs/WINDOWS_SERVER_DEPLOYMENT.md](docs/WINDOWS_SERVER_DEPLOYMENT.md)\n- Pre-release checklist: [docs/PRE_RELEASE_CHECKLIST.md](docs/PRE_RELEASE_CHECKLIST.md)\n- Operations runbook: [docs/OPERATIONS_RUNBOOK.md](docs/OPERATIONS_RUNBOOK.md)\n- API v2 reference: [docs/API_V2_REFERENCE.md](docs/API_V2_REFERENCE.md)\n- Technical handoff note: [reports/TECHNICAL_SESSION_NOTES_20260310.md](reports/TECHNICAL_SESSION_NOTES_20260310.md)\n\n### Key Features\n\n- **Progressive Search**: L1 (pHash) → L2 (FAISS) → L3 (ML) → L4 (Geometric)\n- **Sub-second Search**: 50-300ms for most queries\n- **Scalable**: Handles 100K+ drawings with FAISS indexing\n- **Production Ready**: Kubernetes Helm chart, monitoring, caching\n- **Extensible**: Plugin architecture for ML Platform and DedupCAD integration\n\n### Architecture\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│                    Progressive Search Engine                     │\n├─────────────────────────────────────────────────────────────────┤\n│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐         │\n│  │ L1: pHash    │ → │ L2: FAISS    │ → │ L3: ML       │ → L4    │\n│  │ (~1ms)       │   │ (~10ms)      │   │ (optional)   │         │\n│  │ Fast filter  │   │ ANN search   │   │ Deep verify  │         │\n│  
└──────────────┘   └──────────────┘   └──────────────┘         │\n├─────────────────────────────────────────────────────────────────┤\n│  Cache Layer (Redis) │ Rate Limiting │ Telemetry (OpenTelemetry)│\n└─────────────────────────────────────────────────────────────────┘\n```\n\n## Quick Start\n\n### Docker (Recommended)\n\n```bash\n# Pull and run (latest)\ndocker run -p 8000:8000 ghcr.io/zensgit/dedupcad-vision:latest\n\n# Or pin a release version\ndocker run -p 8000:8000 ghcr.io/zensgit/dedupcad-vision:1.1.7\n\n# Optional: Docker Hub mirror (if configured for this repo)\ndocker run -p 8000:8000 \u003cDOCKERHUB_IMAGE\u003e:latest\n\n# Or with docker-compose\ndocker-compose up -d\n```\n\nNote: `ghcr.io` container packages may be private. If you see `401 Unauthorized`, either make the\npackage public (GitHub UI -\u003e Packages -\u003e Settings -\u003e Change visibility) or login with a GitHub PAT:\n`docker login ghcr.io` (token scope: `read:packages`).\n\nNote: The root Dockerfile exposes port 8000. The Python entrypoint defaults to 58001.\n\n### Python Installation\n\nTested Python versions: 3.10, 3.11, 3.13 (3.11 recommended). Python 3.13\nuses NumPy 2.x and faiss-cpu\u003e=1.10.0 via dependency markers.\n\n```bash\n# Install from PyPI\npip install caddedup-vision\n\n# Install with all extras\npip install caddedup-vision[all]\n\n# Start the server\ncaddedup-vision\n```\n\nDefault port for the Python entrypoint is 58001. Override with CADDEDUP_VISION_PORT if needed.\n\n### Kubernetes (Helm)\n\n```bash\nhelm install caddedup-vision ./deploy/helm/caddedup-vision \\\n  --set redis.auth.password=your-password \\\n  --set persistence.enabled=true\n```\n\nIf you deploy from `ghcr.io` and the image is private, create an `imagePullSecret` and set\n`imagePullSecrets` in Helm values. 
See `deploy/helm/caddedup-vision/README.md`.\n\nFor detailed deployment instructions, see [Deployment Guide](docs/DEPLOYMENT.md).\nFor a step-by-step development + verification checklist, see `docs/DEV_AND_VERIFY_ZH.md`.\n\n## API Usage\n\n### Search for Duplicates\n\n```bash\n# Upload and search\ncurl -X POST http://localhost:58001/api/v2/search \\\n  -F \"file=@drawing.pdf\" \\\n  -F \"mode=balanced\"\n```\n\n### End-to-End Smoke Check (Search + Visual Diff)\n\nUse the bundled script to verify the full flow:\nupload/index -\u003e search similar drawings -\u003e generate colored visual diff.\n\n```bash\n# 1) start server\npython3 start_server.py --port 58001\n\n# 2) run smoke test in another terminal\nscripts/smoke_search_visual_diff.sh\n```\n\nOptional arguments:\n\n```bash\nscripts/smoke_search_visual_diff.sh \u003csource_image\u003e \u003cpeer_image\u003e\n```\n\nExpected output includes:\n- index response (`success=true`)\n- search response with at least one candidate (`similar` or `duplicates`)\n- visual diff response (`success=true`)\n- generated diff image: `/tmp/visual_diff_stored.png`\n\n### Python Client\n\n```python\nimport httpx\n\nasync with httpx.AsyncClient() as client:\n    with open(\"drawing.pdf\", \"rb\") as f:\n        response = await client.post(\n            \"http://localhost:58001/api/v2/search\",\n            files={\"file\": f},\n            data={\"mode\": \"balanced\"}\n        )\n    result = response.json()\n\n    matches = (result.get(\"duplicates\") or []) + (result.get(\"similar\") or [])\n    for match in matches:\n        print(f\"Match: {match['file_name']} ({match['similarity']:.1%})\")\n```\n\n### Search Modes\n\n| Mode | Layers | Typical Speed | Accuracy | Use Case |\n|------|--------|---------------|----------|----------|\n| `l1` | L1 (pHash) | ~5ms | Coarse | Ultra fast filtering |\n| `fast` | L1 + L2 (FAISS) | ~10-50ms | Good | Quick screening |\n| `balanced` | L1 + L2 (+ optional L3) | ~200-500ms | Better | Recommended 
|\n| `precise` | L1 + L2 (+ optional L3/L4) | ~0.5-10s | Best | Final verification |\n\nSee [API Documentation](docs/API_USAGE.md) for complete reference.\n\n## Web UI\n\nThe system includes a built-in Web UI for management and monitoring.\n\n- **URL**: `http://localhost:8000`\n- **URL (Python entrypoint)**: `http://localhost:58001`\n- **Features**:\n  - **Search**: Drag \u0026 drop file search with visual diff.\n  - **License Manager**: Generate and validate licenses (Requires Auth).\n  - **Update Monitor**: Track plugin update status and errors.\n\n### Authentication\n\nAdmin features (License generation, Update config) are protected by Basic Authentication.\n\n- **Default User**: `admin`\n- **Default Password**: `admin`\n- **Configuration**: Set `ADMIN_USER` and `ADMIN_PASSWORD` environment variables.\n\n## Configuration\n\n### Environment Variables\n\n```bash\n# Server\nCADDEDUP_VISION_PORT=58001\nCADDEDUP_VISION_WORKERS=1\n\n# Search Thresholds\nPHASH_THRESHOLD=10\nFEATURE_SIMILARITY_MIN=0.85\n\n# Redis\nREDIS_URL=redis://localhost:6379/0\n\n# Rate Limiting\nRATE_LIMIT_ENABLED=true\nRATE_LIMIT_SEARCH=100/minute\n\n# Telemetry (optional)\nOTEL_ENABLED=true\nOTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317\n```\n\n### Helm Values (Production)\n\n```yaml\n# High Availability\nreplicaCount: 3\nautoscaling:\n  enabled: true\n  minReplicas: 3\n  maxReplicas: 10\n\n# Monitoring\nmetrics:\n  serviceMonitor:\n    enabled: true\n  prometheusRule:\n    enabled: true\n\ngrafana:\n  dashboard:\n    enabled: true\n\n# Caching\nredis:\n  architecture: replication\n```\n\nSee [Helm Chart README](deploy/helm/caddedup-vision/README.md) for full configuration.\n\n## Operations\n\nFor production deployment and ops checklists, see `docs/OPERATIONS_RUNBOOK.md`.\n\n## Delivery Pack\n\nSee `reports/DELIVERY_SUMMARY.md` for a concise handoff index.\n\n## User Flow Recap\n\n- English: `docs/USER_FLOW_RECAP.md`\n- 中文版: `docs/USER_FLOW_RECAP_ZH.md`\n\n## Project 
Structure\n\n```\ndedupcad-vision/\n├── src/caddedup_vision/\n│   ├── api/              # FastAPI application\n│   ├── core/             # Core algorithms (pHash, features)\n│   ├── search/           # Search engine \u0026 indexes\n│   ├── cache/            # Multi-layer caching\n│   ├── telemetry/        # OpenTelemetry integration\n│   ├── logging/          # Structured logging\n│   └── storage/          # Storage backends (S3, local)\n├── tests/                # 287 tests\n├── deploy/\n│   └── helm/             # Kubernetes Helm chart\n├── docs/                 # Documentation\n└── .github/workflows/    # CI/CD pipelines\n```\n\n## Development\n\n### Setup\n\n```bash\n# Clone and install\ngit clone https://github.com/zensgit/dedupcad-vision.git\ncd dedupcad-vision\n\n# Create a virtual env (Python \u003e= 3.10, tested with 3.11)\npython3.11 -m venv .venv\nsource .venv/bin/activate\npython -m pip install -e \".[dev,test]\"\n\n# Run tests\npytest tests/ -v\n\n# Run with coverage\npytest tests/ --cov=src/caddedup_vision --cov-report=html\n```\n\n### Testing\n\n```bash\n# All tests\npytest tests/ -v\n\n# Specific module\npytest tests/test_search.py -v\n\n# With markers\npytest tests/ -m \"not slow\" -v\n```\n\n## Monitoring\n\n### Metrics (Prometheus)\n\n- `caddedup_vision_search_requests_total` - Search request count\n- `caddedup_vision_search_duration_seconds` - Search latency histogram\n- `caddedup_vision_search_layer_hits_total` - Layer hit distribution\n- `caddedup_vision_cache_hit_rate` - Cache effectiveness\n\n### Grafana Dashboard\n\nPre-built dashboard included in Helm chart:\n- Request overview (QPS, latency, error rate)\n- Progressive search layer analysis\n- Redis \u0026 cache performance\n- Resource utilization\n\n### Alerting\n\nPrometheusRule alerts for:\n- High error rates\n- Latency degradation\n- Circuit breaker trips\n- Resource exhaustion\n\n## Roadmap\n\n- [x] Core algorithms (pHash, FAISS)\n- [x] Progressive 4-layer search\n- [x] FastAPI REST 
API\n- [x] Redis caching\n- [x] Rate limiting\n- [x] Kubernetes Helm chart\n- [x] Prometheus metrics \u0026 Grafana dashboard\n- [x] OpenTelemetry tracing\n- [x] CI/CD pipelines\n- [x] ML Platform integration (L3)\n- [x] DedupCAD integration (L4)\n- [x] Batch processing API\n- [x] Web UI\n\n## License\n\nMIT License - see [LICENSE](LICENSE) for details.\n\n## Acknowledgments\n\n- [OpenCV](https://opencv.org/) - Computer vision\n- [FAISS](https://github.com/facebookresearch/faiss) - Vector similarity search\n- [FastAPI](https://fastapi.tiangolo.com/) - Modern web framework\n- [OpenTelemetry](https://opentelemetry.io/) - Observability\n\n---\n\n**Version**: 1.0.0\n**Status**: Production Ready\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzensgit%2Fdedupcad-vision","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzensgit%2Fdedupcad-vision","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzensgit%2Fdedupcad-vision/lists"}