{"id":25165576,"url":"https://github.com/raad-labs/raad-video","last_synced_at":"2025-10-17T07:02:30.753Z","repository":{"id":276352572,"uuid":"927237154","full_name":"Raad-Labs/raad-video","owner":"Raad-Labs","description":"A high-performance video loading library for machine learning, designed for efficient training data preparation.","archived":false,"fork":false,"pushed_at":"2025-02-17T10:25:39.000Z","size":79,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-30T19:23:13.781Z","etag":null,"topics":["cuda","machine-learning","training-data"],"latest_commit_sha":null,"homepage":"https://raad.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Raad-Labs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-04T16:23:37.000Z","updated_at":"2025-03-19T17:57:29.000Z","dependencies_parsed_at":"2025-02-07T18:42:05.611Z","dependency_job_id":null,"html_url":"https://github.com/Raad-Labs/raad-video","commit_stats":null,"previous_names":["raad-labs/raad-video"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Raad-Labs%2Fraad-video","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Raad-Labs%2Fraad-video/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Raad-Labs%2Fraad-video/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Raad-Labs%2Fraad-video/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Raad-Labs","download_url":"https://codeload.github.com/Raad-Labs/raad-video/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251774105,"owners_count":21641719,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","machine-learning","training-data"],"created_at":"2025-02-09T05:26:39.959Z","updated_at":"2025-10-17T07:02:30.664Z","avatar_url":"https://github.com/Raad-Labs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://pbs.twimg.com/profile_banners/5385912/1737222476/1500x500\" alt=\"RAAD Video Banner\" width=\"100%\"\u003e\n\u003c/div\u003e\n\n# RAAD Video\n\nA high-performance video loading library for machine learning, designed for efficient training data preparation.\n\n## Features\n\n- **High-Performance Processing**\n  - Fast video frame extraction and preprocessing\n  - Multi-threaded and distributed processing support\n  - Smart caching with Redis for improved throughput\n  - Memory-efficient streaming with adaptive buffering\n\n- **ML Framework Integration**\n  - Native support for PyTorch, TensorFlow, JAX, and more\n  - Optimized tensor conversions and memory management\n  - GPU acceleration support\n  - Configurable output formats and color spaces\n\n- **Advanced Capabilities**\n  - Sophisticated augmentation pipeline\n  - Real-time performance monitoring\n  - Auto-tuning for optimal performance\n  - Peer-to-peer sharing for distributed setups\n\n## 
Installation\n\nRAAD Video requires Python 3.8 or later. Install via pip:\n\n```bash\npip install raad-video\n```\n\nFor development installation with testing and code quality tools:\n```bash\npip install \"raad-video[dev]\"\n```\n\n### System Requirements\n- Python 3.8+\n- OpenCV dependencies\n- Redis (optional, for distributed caching)\n- CUDA-compatible GPU (optional, for GPU acceleration)\n\n## Quick Start\n\n```python\nfrom raad import VideoDataLoader, VideoCatalog\nfrom raad.config import ProcessingMode, StreamingConfig, FrameFormat\n\n# Create a catalog of your videos\ncatalog = VideoCatalog()\ncatalog.add_video(\"video1.mp4\", categories=[\"training\"])\ncatalog.add_video(\"video2.mp4\", categories=[\"validation\"])\n\n# Initialize the loader with optimal settings\nloader = VideoDataLoader(\n    catalog=catalog,\n    processing_mode=ProcessingMode.MULTI_THREAD,\n    frame_format=FrameFormat.TORCH,  # Output PyTorch tensors\n    streaming_config=StreamingConfig(\n        mode=\"adaptive\",\n        buffer_size=1000\n    ),\n    target_size=(224, 224),  # Resize frames\n    normalize=True,          # Normalize pixel values\n    device=\"cuda\"          # Use GPU if available\n)\n\n# Get frames for training\nfor frames in loader.get_dataset_iterator(\"training\"):\n    # frames will be preprocessed and ready for your model\n    # Shape: (batch_size, channels, height, width)\n    model.train(frames)\n```\n\n## Advanced Usage\n\n### Distributed Processing\n\n```python\nfrom raad.config import DistributedConfig\n\n# Setup distributed processing across multiple nodes\nloader = VideoDataLoader(\n    catalog=catalog,\n    processing_mode=ProcessingMode.DISTRIBUTED,\n    distributed_config=DistributedConfig(\n        enabled=True,\n        num_nodes=4,\n        node_rank=0,\n        master_addr='10.0.0.1',\n        master_port=29500\n    )\n)\n```\n\n### Custom Augmentation Pipeline\n\n```python\nfrom raad.augmentation import (\n    RandomBrightness,\n    
RandomContrast,\n    RandomFlip,\n    RandomRotation,\n    ColorJitter\n)\n\n# Create a sophisticated augmentation pipeline\nloader = VideoDataLoader(\n    catalog=catalog,\n    augmentations=[\n        RandomBrightness(0.2),\n        RandomContrast(0.2),\n        RandomFlip(p=0.5),\n        RandomRotation(degrees=15),\n        ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1)\n    ]\n)\n```\n\n### Performance Optimization\n\n```python\nfrom raad.config import CacheConfig, StreamingConfig\n\n# Configure caching and streaming for optimal performance\nloader = VideoDataLoader(\n    catalog=catalog,\n    cache_config=CacheConfig(\n        enabled=True,\n        policy=\"lru\",\n        max_size_gb=100,\n        persistent=True,\n        compression=\"lz4\"\n    ),\n    streaming_config=StreamingConfig(\n        mode=\"adaptive\",\n        buffer_size=1000,\n        max_latency=0.1,\n        drop_threshold=0.8\n    ),\n    auto_tune=True,  # Enable automatic performance tuning\n    monitoring_interval=1.0,  # Monitor performance every second\n    export_metrics=True\n)\n```\n\n## Troubleshooting\n\n### Common Issues\n\n1. **Memory Usage**\n   - Use `streaming_config` with appropriate `buffer_size`\n   - Enable frame dropping with `drop_threshold` if needed\n   - Consider using persistent caching\n\n2. **Performance**\n   - Enable `auto_tune` for automatic optimization\n   - Use appropriate `processing_mode` for your setup\n   - Monitor performance with `export_metrics=True`\n\n3. **GPU Issues**\n   - Ensure CUDA is properly installed\n   - Set appropriate `device` and `batch_size`\n   - Monitor GPU memory usage\n\n### Getting Help\n\n- Open an issue on GitHub\n- Check the API documentation\n- Join our community on Discord\n\n## Performance Tips\n\n1. **Streaming Mode Selection**:\n   - Use `ADAPTIVE` for general training\n   - Use `REAL_TIME` for time-critical applications\n   - Use `PRIORITY` when certain frames are more important\n\n2. 
**Caching Strategy**:\n   - Enable Redis for distributed setups\n   - Use local caching for single machine training\n   - Configure cache size based on available RAM\n\n3. **Processing Mode**:\n   - `MULTI_THREAD` for I/O-bound workloads\n   - `MULTI_PROCESS` for CPU-bound preprocessing\n   - `DISTRIBUTED` for large-scale training\n\n## Contributing\n\nContributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests.\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraad-labs%2Fraad-video","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fraad-labs%2Fraad-video","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraad-labs%2Fraad-video/lists"}