{"id":45171161,"url":"https://github.com/yasho191/SwiftAnnotate","last_synced_at":"2026-03-05T06:00:35.992Z","repository":{"id":273054421,"uuid":"918474739","full_name":"yasho191/SwiftAnnotate","owner":"yasho191","description":"Auto labelling tool for Text, Image, Video","archived":false,"fork":false,"pushed_at":"2025-03-18T07:04:10.000Z","size":2369,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-11-28T01:34:42.742Z","etag":null,"topics":["automation","computer-vision","data-labeling","llms","nlp","vlms"],"latest_commit_sha":null,"homepage":"https://yasho191.github.io/SwiftAnnotate/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yasho191.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-18T02:59:50.000Z","updated_at":"2025-09-20T12:26:12.000Z","dependencies_parsed_at":"2025-03-19T00:15:16.075Z","dependency_job_id":null,"html_url":"https://github.com/yasho191/SwiftAnnotate","commit_stats":null,"previous_names":["yasho191/swiftannotate"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yasho191/SwiftAnnotate","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yasho191%2FSwiftAnnotate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yasho191%2FSwiftAnnotate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yasho191%2FSwiftAnnotate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/yasho191%2FSwiftAnnotate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yasho191","download_url":"https://codeload.github.com/yasho191/SwiftAnnotate/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yasho191%2FSwiftAnnotate/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30111779,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-05T03:40:26.266Z","status":"ssl_error","status_checked_at":"2026-03-05T03:39:15.902Z","response_time":93,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","computer-vision","data-labeling","llms","nlp","vlms"],"created_at":"2026-02-20T08:05:55.701Z","updated_at":"2026-03-05T06:00:35.974Z","avatar_url":"https://github.com/yasho191.png","language":"Python","funding_links":[],"categories":["public repositories"],"sub_categories":["backend only"],"readme":"# SwiftAnnotate 🚀\n\n![swiftannotate](https://github.com/yasho191/SwiftAnnotate/blob/main/assets/swiftannotate-high-resolution-logo.png?raw=True)\n\nSwiftAnnotate is a comprehensive auto-labeling tool designed for Text, Image, and Video data. 
It leverages state-of-the-art (SOTA) Vision Language Models (VLMs) and Large Language Models (LLMs) through a robust annotator-validator pipeline, ensuring high-quality, grounded annotations while minimizing hallucinations. SwiftAnnotate also supports annotation tasks such as Object Detection and Segmentation through SOTA CV models like `SAM2`, `YOLOWorld`, and `OWL-ViT`.\n\n## Key Features 🎯\n\n1. **Text Processing 📝**  \nPerform **classification**, **summarization**, and **text generation** with state-of-the-art NLP models. Solve real-world problems like spam detection, sentiment analysis, and content creation.\n\n2. **Image Analysis 🖼️**  \nGenerate **captions** for images to provide meaningful descriptions. Classify images into predefined categories with high precision. Detect objects in images using models like **YOLOWorld**. Achieve pixel-perfect segmentation with **SAM2** and **OWL-ViT**.  \n\n3. **Video Processing 🎥**  \nGenerate captions for videos with **frame-level analysis** and **temporal understanding**. Understand video content by detecting scenes and actions effortlessly.  \n\n4. **Quality Assurance ✅**  \nUse a **two-stage pipeline** for annotation and validation to ensure high data quality. Validate outputs rigorously to maintain reliability before deployment.  \n\n5. **Multi-modal Support 🌐**  \nSeamlessly process **text**, **images**, and **videos** within a unified framework. Combine data types for powerful multi-modal insights and applications.  \n\n6. **Customization 🛠️**  \nEasily extend and adapt the framework to suit specific project needs. Integrate new models and tasks seamlessly with the modular architecture.\n\n7. **Developer-Friendly 👩‍💻👨‍💻**  \nEasy-to-use package and detailed documentation to get started quickly.\n\n## Installation Guide  \n\nTo install **SwiftAnnotate** from PyPI and set up the project environment, follow these steps:  \n\n1. 
**Install from PyPI**  \n\n    Run the following command to install the package directly:  \n\n    ```bash\n    pip install swiftannotate\n    ```\n\n2. **For Development (Using Poetry)**  \n\n    If you want to contribute or explore the project codebase, make sure Poetry is installed, then run the following commands:\n\n    ```bash\n    git clone https://github.com/yasho191/SwiftAnnotate\n    cd SwiftAnnotate\n    poetry install\n    ```\n\n    You're now ready to explore and develop SwiftAnnotate!  \n\n## Annotator-Validator Pipeline for LLMs and VLMs\n\n![Annotation Pipeline](https://github.com/yasho191/SwiftAnnotate/blob/main/assets/SwiftAnnotatePipeline.png?raw=True)\n\nThe annotator-validator pipeline ensures high-quality annotations through a two-stage process:\n\n**Stage 1: Annotation**\n\n- Primary LLM/VLM generates initial annotations\n- Configurable model selection (OpenAI, Google Gemini, Anthropic, Mistral, Qwen-VL)\n\n**Stage 2: Validation**\n\n- Secondary model validates initial annotations\n- Cross-checks for hallucinations and factual accuracy\n- Provides confidence scores and correction suggestions\n- Option to regenerate annotations if validation fails\n- Structured output format for consistency\n\n**Benefits**\n\n- Reduced hallucinations through two-stage verification\n- Higher annotation quality and consistency\n- Automated quality control\n- Traceable annotation process\n\nThe pipeline can be customized with different model combinations and validation thresholds based on specific use cases.\n\n## Supported Modalities and Tasks\n\n### Text\n\n### Images\n\n#### Captioning\n\nCurrently, we support OpenAI, Google Gemini, Ollama, and Qwen2-VL for image captioning. Because Qwen2-VL is not yet available on Ollama, it is supported through Hugging Face. 
To get started quickly, refer to the code snippets below.\n\n**OpenAI**\n\n```python\nimport os\nfrom swiftannotate.image import OpenAIForImageCaptioning\n\ncaption_model = \"gpt-4o\"\nvalidation_model = \"gpt-4o-mini\"\napi_key = \"\u003cYOUR_OPENAI_API_KEY\u003e\"\nBASE_DIR = \"\u003cIMAGE_DIR\u003e\"\nimage_paths = [os.path.join(BASE_DIR, image) for image in os.listdir(BASE_DIR)]\n\nimage_captioning_pipeline = OpenAIForImageCaptioning(\n    caption_model=caption_model,\n    validation_model=validation_model,\n    api_key=api_key,\n    output_file=\"image_captioning_output.json\"\n)\n\nresults = image_captioning_pipeline.generate(image_paths=image_paths)\n```\n\n**Qwen2-VL**\n\nYou can use any Qwen2-VL variant (7B, 72B), depending on the available resources. vLLM inference is not currently supported, but it will be available soon.\n\n```python\nimport os\nfrom transformers import AutoProcessor, AutoModelForImageTextToText\nfrom transformers import BitsAndBytesConfig\nfrom swiftannotate.image import Qwen2VLForImageCaptioning\n\n# Load the images\nBASE_DIR = \"\u003cIMAGE_DIR\u003e\"\nimage_paths = [os.path.join(BASE_DIR, image) for image in os.listdir(BASE_DIR)]\n\n# 4-bit NF4 quantization to reduce GPU memory usage\nquantization_config = BitsAndBytesConfig(\n    load_in_4bit=True,\n    bnb_4bit_quant_type=\"nf4\",\n    bnb_4bit_compute_dtype=\"float16\",\n    bnb_4bit_use_double_quant=True\n)\n\n# Load the quantized model and its processor from Hugging Face\nmodel = AutoModelForImageTextToText.from_pretrained(\n    \"Qwen/Qwen2-VL-7B-Instruct\",\n    device_map=\"auto\",\n    torch_dtype=\"auto\",\n    quantization_config=quantization_config\n)\n\nprocessor = AutoProcessor.from_pretrained(\"Qwen/Qwen2-VL-7B-Instruct\")\n\n# Build the captioning pipeline\ncaptioning_pipeline = Qwen2VLForImageCaptioning(\n    model=model,\n    processor=processor,\n    output_file=\"image_captioning_output.json\"\n)\n\nresults = captioning_pipeline.generate(image_paths)\n```\n\n### Videos\n\n## Contributing to SwiftAnnotate 🤝\n\nWe welcome contributions to SwiftAnnotate! 
There are several ways you can help improve the project:\n\n- **Enhanced Prompts**: Contribute better validation and annotation prompts for improved accuracy\n- **File Support**: Add support for additional input/output file formats\n- **Cloud Integration**: Implement AWS S3 storage support and other cloud services\n- **Validation Strategies**: Develop new validation approaches for different annotation tasks\n- **Model Support**: Integrate additional LLMs and VLMs\n- **Documentation**: Improve guides and examples\n\nPlease submit a pull request with your contributions or open an issue to discuss new features.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyasho191%2FSwiftAnnotate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyasho191%2FSwiftAnnotate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyasho191%2FSwiftAnnotate/lists"}