{"id":18317509,"url":"https://github.com/alladinian/visionaire","last_synced_at":"2025-08-01T10:36:04.941Z","repository":{"id":192765870,"uuid":"662231244","full_name":"alladinian/Visionaire","owner":"alladinian","description":"Streamlined, ergonomic APIs around Apple's Vision framework","archived":false,"fork":false,"pushed_at":"2023-12-22T08:31:01.000Z","size":112,"stargazers_count":60,"open_issues_count":1,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-22T21:45:57.347Z","etag":null,"topics":["computer-vision","ios","macos","swift","vision"],"latest_commit_sha":null,"homepage":"","language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alladinian.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-07-04T16:37:46.000Z","updated_at":"2025-06-27T06:23:15.000Z","dependencies_parsed_at":null,"dependency_job_id":"926aadf4-7387-4a6f-bb91-6830c4311788","html_url":"https://github.com/alladinian/Visionaire","commit_stats":null,"previous_names":["alladinian/visionaire"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/alladinian/Visionaire","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alladinian%2FVisionaire","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alladinian%2FVisionaire/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alladinian%2FVisionaire/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alladinian%2FVisionaire/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/owners/alladinian","download_url":"https://codeload.github.com/alladinian/Visionaire/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alladinian%2FVisionaire/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268207624,"owners_count":24213016,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-01T02:00:08.611Z","response_time":67,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","ios","macos","swift","vision"],"created_at":"2024-11-05T18:06:22.668Z","updated_at":"2025-08-01T10:36:04.899Z","avatar_url":"https://github.com/alladinian.png","language":"Swift","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Visionaire \n\n\u003eStreamlined, ergonomic APIs around Apple's Vision framework\n\n![Swift](https://img.shields.io/badge/Swift-5.8+-ec775c?style=flat)\n![iOS](https://img.shields.io/badge/iOS-13+-549bf5?style=flat)\n![macOS](https://img.shields.io/badge/macOS-10.15+-549bf5?style=flat)\n![macCatalyst](https://img.shields.io/badge/macCatalyst-13.1+-549bf5?style=flat)\n![tvOS](https://img.shields.io/badge/tvOS-13+-549bf5?style=flat)\n![Swift Package Manager](https://img.shields.io/badge/Swift_Package_Manager-Compatible-347d39?style=flat)\n\n\nThe main goal of `Visionaire` is to reduce ceremony and provide a concise set of APIs for Vision tasks.\n\nSome of its features 
include:\n\n- **Centralized list of all tasks**, available via the `VisionTaskType` enum (with platform availability checks).\n- **Automatic image handling** for all supported image sources.\n- **Convenience APIs for all tasks**, along with all available parameters for each task (with platform availability checks).\n- Support for **custom CoreML models** (Classification, Image-To-Image, Object Recognition, Generic `VNCoreMLFeatureValueObservation`s).\n- Support for **multiple task execution**, maintaining task type information in the results.\n- Support for raw `VNRequest`s.\n- All calls are **synchronous** (just like the original calls) - **no extra 'magic', assumptions or hidden juggling**.\n- **SwiftUI extensions** for helping you **rapidly visualize results** (great for evaluation).\n\n## Installation\n`Visionaire` is provided as a Swift Package. You can add it to your project via [this repository's address](https://github.com/alladinian/Visionaire).\n\n## Supported Vision Tasks\n\n**All** Vision tasks are supported (including **iOS 17** \u0026 **macOS 14**, which are the latest production releases).\n\u003cdetails\u003e\n\u003csummary\u003e\nExpand to see a detailed list of all available tasks\n\u003c/summary\u003e\n\n| **Task**                                   | **Vision API**                                | **Visionaire Task**                      | **iOS** | **macOS** | **Mac Catalyst** | **tvOS** |\n| ------------------------------------------ | --------------------------------------------- | ---------------------------------------- | -------:| ---------:| ---------------: | -------: |\n| **Generate Feature Print**                 | VNGenerateImageFeaturePrintRequest            | .featurePrintGeneration                  |    13.0 |     10.15 |             13.1 |     13.0 |\n| **Person Segmentation**                    | VNGeneratePersonSegmentationRequest           | .personSegmentation                      |    15.0 |      12.0 |             15.0 |     
15.0 |\n| **Document Segmentation**                  | VNDetectDocumentSegmentationRequest           | .documentSegmentation                    |    15.0 |      12.0 |             15.0 |     15.0 |\n| **Attention Based Saliency**               | VNGenerateAttentionBasedSaliencyImageRequest  | .attentionSaliency                       |    13.0 |     10.15 |             13.1 |     13.0 |\n| **Objectness Based Saliency**              | VNGenerateObjectnessBasedSaliencyImageRequest | .objectnessSaliency                      |    13.0 |     10.15 |             13.1 |     13.0 |\n| **Track Rectangle**                        | VNTrackRectangleRequest                       | .rectangleTracking                       |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Track Object**                           | VNTrackObjectRequest                          | .objectTracking                          |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Detect Rectangles**                      | VNDetectRectanglesRequest                     | .rectanglesDetection                     |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Detect Face Capture Quality**            | VNDetectFaceCaptureQualityRequest             | .faceCaptureQuality                      |    13.0 |     10.15 |             13.1 |     13.0 |\n| **Detect Face Landmarks**                  | VNDetectFaceLandmarksRequest                  | .faceLandmarkDetection                   |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Detect Face Rectangles**                 | VNDetectFaceRectanglesRequest                 | .faceDetection                           |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Detect Human Rectangles**                | VNDetectHumanRectanglesRequest                | .humanRectanglesDetection                |    13.0 |     10.15 |             13.1 |     13.0 |\n| **Detect Human Body Pose**                 | VNDetectHumanBodyPoseRequest       
           | .humanBodyPoseDetection                  |    14.0 |      11.0 |             14.0 |     14.0 |\n| **Detect Human Hand Pose**                 | VNDetectHumanHandPoseRequest                  | .humanHandPoseDetection                  |    14.0 |      11.0 |             14.0 |     14.0 |\n| **Recognize Animals**                      | VNRecognizeAnimalsRequest                     | .animalDetection                         |    13.0 |     10.15 |             13.1 |     13.0 |\n| **Detect Trajectories**                    | VNDetectTrajectoriesRequest                   | .trajectoriesDetection                   |    14.0 |      11.0 |             14.0 |     14.0 |\n| **Detect Contours**                        | VNDetectContoursRequest                       | .contoursDetection                       |    14.0 |      11.0 |             14.0 |     14.0 |\n| **Generate Optical Flow**                  | VNGenerateOpticalFlowRequest                  | .opticalFlowGeneration                   |    14.0 |      11.0 |             14.0 |     14.0 |\n| **Detect Barcodes**                        | VNDetectBarcodesRequest                       | .barcodeDetection                        |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Detect Text Rectangles**                 | VNDetectTextRectanglesRequest                 | .textRectanglesDetection                 |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Recognize Text**                         | VNRecognizeTextRequest                        | .textRecognition                         |    13.0 |     10.15 |             13.1 |     13.0 |\n| **Detect Horizon**                         | VNDetectHorizonRequest                        | .horizonDetection                        |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Classify Image**                         | VNClassifyImageRequest                        | .imageClassification                     |    13.0 |     10.15 |             
13.1 |     13.0 |\n| **Translational Image Registration**       | VNTranslationalImageRegistrationRequest       | .translationalImageRegistration          |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Homographic Image Registration**         | VNHomographicImageRegistrationRequest         | .homographicImageRegistration            |    11.0 |     10.13 |             13.1 |     11.0 |\n| **Detect Human Body Pose (3D)**            | VNDetectHumanBodyPose3DRequest                | .humanBodyPoseDetection3D                |    17.0 |      14.0 |             17.0 |     17.0 |\n| **Detect Animal Body Pose**                | VNDetectAnimalBodyPoseRequest                 | .animalBodyPoseDetection                 |    17.0 |      14.0 |             17.0 |     17.0 |\n| **Track Optical Flow**                     | VNTrackOpticalFlowRequest                     | .opticalFlowTracking                     |    17.0 |      14.0 |             17.0 |     17.0 |\n| **Track Translational Image Registration** | VNTrackTranslationalImageRegistrationRequest  | .translationalImageRegistrationTracking  |    17.0 |      14.0 |             17.0 |     17.0 |\n| **Track Homographic Image Registration**   | VNTrackHomographicImageRegistrationRequest    | .homographicImageRegistrationTracking    |    17.0 |      14.0 |             17.0 |     17.0 |\n| **Generate Foreground Instance Mask**      | VNGenerateForegroundInstanceMaskRequest       | .foregroundInstanceMaskGeneration        |    17.0 |      14.0 |             17.0 |     17.0 |\n\u003c/details\u003e\n\n## Supported Image Sources\n- `CGImage`\n- `CIImage`\n- `CVPixelBuffer`\n- `CMSampleBuffer`\n- `Data`\n- `URL`\n\n## Examples\n\nThe main class for interfacing is called `Visionaire`. 
\n\nIt's an `ObservableObject` and reports processing through a published property called `isProcessing`.\n\nYou can execute tasks on the `shared` Visionaire singleton or on your own instance (useful if you want to have separate processors reporting on their own).\n\nThere are two sets of apis: convenience methods \u0026 task-based methods.\n\nConvenience methods have the benefit of returning typed results while tasks can be submitted en masse.\n\n### Single task execution (convenience apis):\n\n```swift\nDispatchQueue.global(qos: .userInitiated).async {\n    do {\n        let image   = /* any supported image source, such as CGImage, CIImage, CVPixelBuffer, CMSampleBuffer, Data or URL */\n        let horizon = try Visionaire.shared.horizonDetection(imageSource: image) // The result is a `VNHorizonObservation`\n        let angle   = horizon.angle\n        // Do something with the horizon angle\n    } catch {\n        print(error)\n    }\n}\n```\n\n### Custom CoreML model (convenience apis):\n\n```swift\n// Create an instance of your model\nlet yolo: MLModel = {\n    // Tell Core ML to use the Neural Engine if available.\n    let config = MLModelConfiguration()\n    config.computeUnits = .all\n    // Load your custom model\n    let yolo = try! yolo(configuration: config)\n    return yolo.model\n}()\n    \n// Optionally create a feature provider to setup custom model attributes\nclass YoloFeatureProvider: MLFeatureProvider {\n    var values: [String : MLFeatureValue] {\n        [\n            \"iouThreshold\": MLFeatureValue(double: 0.45),\n            \"confidenceThreshold\": MLFeatureValue(double: 0.25)\n        ]\n    }\n\n    var featureNames: Set\u003cString\u003e {\n        Set(values.keys)\n    }\n\n    func featureValue(for featureName: String) -\u003e MLFeatureValue? 
{\n        values[featureName]\n    }\n}\n\n// Perform the task\nlet detectedObjectObservations = try visionaire.customRecognition(imageSource: image,\n                                                                        model: try! VNCoreMLModel(for: yolo),\n                                                        inputImageFeatureName: \"image\",\n                                                              featureProvider: YoloFeatureProvider(),\n                                                      imageCropAndScaleOption: .scaleFill)\n```\n\n\n### Single task execution (task-based apis):\n\n```swift\nDispatchQueue.global(qos: .userInitiated).async {\n    do {\n        let image       = /* any supported image source, such as CGImage, CIImage, CVPixelBuffer, CMSampleBuffer, Data or URL */\n        let result      = try Visionaire.shared.perform(.horizonDetection, on: image) // The result is a `VisionTaskResult`\n        let observation = result.observations.first as? VNHorizonObservation\n        let angle       = observation?.angle\n        // Do something with the horizon angle\n    } catch {\n        print(error)\n    }\n}\n```\n\n### Multiple task execution (task-based apis):\n\n```swift\nDispatchQueue.global(qos: .userInitiated).async {\n    do {\n        let image   = /* any supported image source, such as CGImage, CIImage, CVPixelBuffer, CMSampleBuffer, Data or URL */\n        let results = try Visionaire.shared.perform([.horizonDetection, .personSegmentation(qualityLevel: .accurate)], on: image)\n        for result in results {\n            switch result.taskType {\n            case .horizonDetection:\n                let horizon = result.observations.first as? VNHorizonObservation\n                // Do something with the observation\n            case .personSegmentation:\n                let segmentationObservations = result.observations as? 
[VNPixelBufferObservation]\n                // Do something with the observations\n            default:\n                break\n            }\n        }   \n    } catch {\n        print(error)\n    }\n}\n```\n\n\n## Task configuration\n\nAll tasks can be configured with \"modifier\" style calls for common options.\n\nAn example using all the available options:\n\n```swift\nlet segmentation = VisionTask.personSegmentation(qualityLevel: .accurate)\n    .preferBackgroundProcessing(true)\n    .usesCPUOnly(false)\n    .regionOfInterest(CGRect(x: 0, y: 0, width: 0.5, height: 0.5))\n    .latestRevision() // You can also use .revision(n)\n\nlet results = try Visionaire.shared.perform([.horizonDetection, segmentation], on: image) // Yields one `VisionTaskResult` per submitted task\n```\n\n## SwiftUI Extensions\n\nSome SwiftUI extensions are also available to help you visualize results for quick evaluation.\n\n**Detected Object Observations**\n\n```swift\nImage(myImage)\n    .resizable()\n    .aspectRatio(contentMode: .fit)\n    .drawObservations(detectedObjectObservations) {\n        Rectangle()\n            .stroke(Color.blue, lineWidth: 2)\n    }\n```\n![image](https://github.com/alladinian/Visionaire/assets/156458/70b4a0dd-dcf7-4c15-8ccb-cd37910e6a35)\n\n**Rectangle Observations**\n\n```swift\nImage(myImage)\n    .resizable()\n    .aspectRatio(contentMode: .fit)\n    .drawQuad(rectangleObservations) { shape in\n        shape\n            .stroke(Color.green, lineWidth: 2)\n    }\n```\n![image](https://github.com/alladinian/Visionaire/assets/156458/9cc38998-e069-414b-8fae-bb5584ee48ec)\n\n**Face Landmarks**\n\nNote: For Face Landmarks you can specify individual characteristics or groups for visualization. 
The options are exposed through the `FaceLandmarks` OptionSet:\n\n`constellation`, `contour`, `leftEye`, `rightEye`, `leftEyebrow`, `rightEyebrow`, `nose`, `noseCrest`, `medianLine`, `outerLips`, `innerLips`, `leftPupil`, `rightPupil`, `eyes`, `pupils`, `eyeBrows`, `lips` and `all`.\n\n```swift\nImage(myImage)\n    .resizable()\n    .aspectRatio(contentMode: .fit)\n    .drawFaceLandmarks(faceObservations, landmarks: .all) { shape in\n        shape\n            .stroke(.red, style: .init(lineWidth: 2, lineJoin: .round))\n    }\n```\n![image](https://github.com/alladinian/Visionaire/assets/156458/f63e6646-a2ce-4f82-bcdd-1ef30160ddb6)\n\n**Person Segmentation Mask**\n\n```swift\nImage(myImage)\n    .resizable()\n    .aspectRatio(contentMode: .fit)\n    .visualizePersonSegmentationMask(pixelBufferObservations)\n```\n![image](https://github.com/alladinian/Visionaire/assets/156458/72536049-3547-4c89-994c-4b46aee4e295)\n\n**Human Body Pose**\n\n```swift\nImage(myImage)\n    .resizable()\n    .aspectRatio(contentMode: .fit)\n    .visualizeHumanBodyPose(humanBodyPoseObservations) { shape in\n        shape\n            .fill(.red)\n    }\n```\n![image](https://github.com/alladinian/Visionaire/assets/156458/dc56da48-ac80-4723-8403-dea660c73c20)\n\n**Contours**\n\n```swift\nImage(myImage)\n    .resizable()\n    .aspectRatio(contentMode: .fit)\n    .visualizeContours(contoursObservations) { shape in\n        shape\n            .stroke(.red, style: .init(lineWidth: 2, lineJoin: .round))\n    }\n```\n![image](https://github.com/alladinian/Visionaire/assets/156458/ee4d9e63-3e37-494e-94d4-63ae2c72dc0a)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falladinian%2Fvisionaire","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falladinian%2Fvisionaire","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falladinian%2Fvisionaire/lists"}