Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/motemen/macos-obs-websocket-ocr
A proxy for obs-websocket that adds Optical Character Recognition (OCR) capabilities.
macos obs vision-framework
- Host: GitHub
- URL: https://github.com/motemen/macos-obs-websocket-ocr
- Owner: motemen
- Created: 2024-06-18T16:05:42.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-06-28T04:01:18.000Z (7 months ago)
- Last Synced: 2024-11-26T18:28:31.439Z (about 2 months ago)
- Topics: macos, obs, vision-framework
- Language: Swift
- Homepage:
- Size: 60.5 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# OBS WebSocket OCR Proxy for macOS
A proxy for obs-websocket that adds Optical Character Recognition (OCR) capabilities.
Utilizes macOS’s [Vision framework](https://developer.apple.com/documentation/vision) to perform OCR on captured screenshots.
Currently, it introduces one special request type: `__GetTextFromLastScreenshot`. See below for details.
## Usage
`obs-websocket-ocr [--upstream-url URL] [--port PORT] [--hostname HOSTNAME]`
- `--upstream-url URL`: The URL of the upstream obs-websocket server. Default: `ws://localhost:4455`.
- `--port PORT`: The port to bind to. Default: `4456`.
- `--hostname HOSTNAME`: The hostname to bind to. Default: `localhost`.

When started with the default options, the proxy listens on `localhost:4456` and forwards all messages to the upstream obs-websocket server.
You can connect to the proxy using the obs-websocket client as you would with a normal obs-websocket server.

## Request types
In addition to the [standard obs-websocket request types](https://github.com/obsproject/obs-websocket/blob/master/docs/generated/protocol.md#getversion), the proxy adds one special request type: `__GetTextFromLastScreenshot`.
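
As a rough illustration of what a client would send, the sketch below serializes a `__GetTextFromLastScreenshot` request. It assumes the proxy speaks the obs-websocket 5.x message framing (opcode 6 for requests, with `requestType` and `requestId` under `d`), which this README does not state explicitly. The helper name `build_request` and the ID `ocr-1` are made up for the example. A real client would also need to complete the usual obs-websocket handshake before sending it.

```python
import json
import uuid

# Default proxy address from the Usage section (--hostname localhost, --port 4456).
PROXY_URL = "ws://localhost:4456"

# Assumption: obs-websocket 5.x framing, where opcode 6 marks a client request.
OP_REQUEST = 6


def build_request(request_type, request_id=None, request_data=None):
    """Serialize a request message (hypothetical helper, not part of the proxy)."""
    payload = {
        "requestType": request_type,
        # A unique ID lets the client match the response to this request.
        "requestId": request_id or str(uuid.uuid4()),
    }
    if request_data is not None:
        payload["requestData"] = request_data
    return json.dumps({"op": OP_REQUEST, "d": payload})


# The special request type takes no parameters according to this README.
message = build_request("__GetTextFromLastScreenshot", request_id="ocr-1")
print(message)
```

The resulting string can be sent over any WebSocket connection opened to `PROXY_URL`; standard request types would be built the same way and forwarded upstream by the proxy.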
### \_\_GetTextFromLastScreenshot
Performs OCR on the last screenshot taken by OBS and returns the recognized text items along with their bounding boxes.
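
To make the shape of the result concrete, here is a minimal sketch of decoding a response on the client side. It assumes the response fields documented in this section arrive under `d.responseData` of an obs-websocket 5.x response message, which is an assumption rather than something this README confirms. The class names, the `parse_text_results` helper, and the sample payload are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class BoundingBox:
    x: float
    y: float
    width: float
    height: float


@dataclass
class TextResult:
    text: str
    bounding_box: List[BoundingBox]


def parse_text_results(raw_message: str) -> List[TextResult]:
    """Extract `text_results` from a response message (hypothetical helper)."""
    message = json.loads(raw_message)
    # Assumption: the payload sits under d.responseData, as in obs-websocket 5.x.
    data = message.get("d", {}).get("responseData") or {}
    results = []
    for item in data.get("text_results", []):
        boxes = [
            BoundingBox(b["x"], b["y"], b["width"], b["height"])
            for b in item.get("bounding_box", [])
        ]
        results.append(TextResult(text=item["text"], bounding_box=boxes))
    return results


# Made-up payload for illustration only; real values come from the proxy.
sample = json.dumps({
    "op": 7,
    "d": {
        "requestType": "__GetTextFromLastScreenshot",
        "requestId": "ocr-1",
        "responseData": {
            "text_results": [
                {
                    "text": "Hello",
                    "bounding_box": [{"x": 10, "y": 20, "width": 100, "height": 30}],
                }
            ]
        },
    },
})

results = parse_text_results(sample)
for result in results:
    print(result.text, result.bounding_box)
```

The parser returns an empty list when no `text_results` field is present, so a caller can treat "no text recognized" and "field missing" the same way if that suits the application.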
#### Response fields:
| Name | Type | Description |
| -------------- | ------------------- | -------------------------- |
| `text_results` | `Array<TextResult>` | The recognized text items. |

TextResult:

| Name | Type | Description |
| -------------- | -------------------- | ----------------------------- |
| `text` | `string` | The recognized text. |
| `bounding_box` | `Array<BoundingBox>` | The bounding box of the text. |

BoundingBox:

| Name | Type | Description |
| -------- | -------- | ---------------------------------------- |
| `x` | `number` | The x-coordinate of the top-left corner. |
| `y` | `number` | The y-coordinate of the top-left corner. |
| `width` | `number` | The width of the bounding box. |
| `height` | `number` | The height of the bounding box. |

## Author
Hironao Otsubo (motemen)