{"id":24759116,"url":"https://github.com/pyreko/kaguyaocr","last_synced_at":"2025-10-11T06:31:27.423Z","repository":{"id":38063403,"uuid":"198933862","full_name":"Pyreko/KaguyaOCR","owner":"Pyreko","description":"A tool for reading in manga/comic pages and generating useful JSON files, using Microsoft's Read API.","archived":false,"fork":false,"pushed_at":"2022-10-29T07:46:12.000Z","size":29,"stargazers_count":33,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-04T10:11:35.954Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Pyreko.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-07-26T02:42:20.000Z","updated_at":"2024-12-21T03:58:54.000Z","dependencies_parsed_at":"2023-01-19T15:02:51.336Z","dependency_job_id":null,"html_url":"https://github.com/Pyreko/KaguyaOCR","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/Pyreko/KaguyaOCR","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Pyreko%2FKaguyaOCR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Pyreko%2FKaguyaOCR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Pyreko%2FKaguyaOCR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Pyreko%2FKaguyaOCR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Pyreko","download_url":"https://codeload.github.com/Pyreko/KaguyaOCR/tar.gz/refs/heads/master","sbom_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/Pyreko%2FKaguyaOCR/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279006456,"owners_count":26084108,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-28T16:50:51.840Z","updated_at":"2025-10-11T06:31:27.127Z","avatar_url":"https://github.com/Pyreko.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# KaguyaOCR\n\n**Note that this tool is likely not getting any more changes and is in a stable state; if I make improvements it will be for a new tool from scratch**\n\nA tool for reading in ~~Kaguya~~manga pages and generating a resulting OCR JSON file\nfor a chapter, in addition to a master dictionary, [using MS's Reader tool from their Cognitive Services](https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/).  In particular, it uses the Read API, not the OCR or Recognize Text API (see [here](https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text) for the differences).\n\nNote this is probably not the cleanest code or anything, and was kinda cobbled together to just work.  Improvements to usability and less jankiness will come in the future (I swear).  
Furthermore, the quality of the OCR is entirely dependent on MS's OCR.  This just formats the results into useful forms.\n\nDespite the name and the fact that it was written for https://guya.moe/, this will work fine for any image/manga/comic.\n\n### What it does\n\nThe main use case is to generate a JSON file containing all words found and their locations for a chapter (I'll refer to this as the \"chapter\" JSON) and a dictionary containing all words for that series and their respective locations, chapter and page wise (henceforth referred to as the \"master\" dictionary).\n\nWhen fed a directory containing image files for a chapter, it'll output a chapter JSON (e.g., chapter 1 gives ``1.json``) and a master dictionary JSON (defaults to ``master_dictionary.json`` in the same directory).  See below for an example of the JSON format.\n\n### To use\nTo build from source, clone and open the solution, then Publish.  If you're using Visual Studio, it should be fairly straightforward.  If not, you can publish from the command line using dotnet:\n\n```bash\ndotnet publish MangaReader.sln\n```\n\nTo just use the latest release, download from Releases.  Note that the most up-to-date version will always be from the repo, so cloning from that is probably preferable.\n\nIn both cases, you'll need to include a config.json file with the following structure in the same directory as the built program (where ``MangaReader.dll`` is located):\n```json\n{\n    \"subscriptionKey\": \"your-microsoft-cognitive-key\",\n    \"personalEndpoint\": \"your-endpoint\"\n}\n```\n\nTo run the program, you will need .NET Core, as well as an internet connection (or if you're running Cognitive off a Docker container, I suppose not).\n\nThen, open a prompt in the release directory.  
To see a list of flags:\n```bash\ndotnet MangaReader.dll --help\n```\n\nTo OCR a folder (representing a chapter), run the following:\n```bash\ndotnet MangaReader.dll -i \"your/input/directory/folder\" -c chapter_number -o \"optional/output/json/file/path.json\" -m \"optional/output/master/dictionary/path.json\"\n```\n\nNote that your files in the directory **must** be in order, namewise!  That is, page 1 should be the first file, page 2 should be the second file, etc.  To be safe, just pad your filenames with 0's if needed (01, 02, 03... 15, for example).\n\nTo just add a chapter JSON file to a master dictionary (or to just create a master dictionary from a chapter JSON file):\n```bash\ndotnet MangaReader.dll -i \"your/input/json/file.json\" -m \"optional/output/master/dictionary/path.json\"\n```\n\nTo add **multiple** chapter JSON files in a directory to a master dictionary:\n```bash\ndotnet MangaReader.dll -b \"chapter/json/directory/path\" -m \"optional/output/master/dictionary/path\"\n```\n\nTo add verbosity or logging (logs to ``diagnostics.log`` in the program directory), use ``-v`` or ``-l`` tags respectively.\n\n### JSON format\n\nTwo JSON files are generated at the end (at most).  
A chapter JSON file and a master dictionary JSON.\n\nThe chapter JSON file has the following structure:\n```json\n{\n  \"Chapter\": 1.0,\n  \"Pages\": [\n    {\n      \"Words\": [\n        {\n          \"BoundingBox\": [\n            {\n              \"X\": 156,\n              \"Y\": 1196\n            },\n            {\n              \"X\": 455,\n              \"Y\": 1194\n            },\n            {\n              \"X\": 456,\n              \"Y\": 1225\n            },\n            {\n              \"X\": 157,\n              \"Y\": 1227\n            }\n          ],\n          \"Text\": \"BOOK DESIGN IN TSUKANO HIROTAKA\"\n        }\n      ],\n      \"Page\": 1,\n      \"Height\": 1250,\n      \"Width\": 3095\n        }\n  ],\n  \"MentionedWordChapterLocation\": {\n    \"CANDIDACY\": [\n      30\n    ],\n    \"HIS\": [\n      13,\n      16,\n      17,\n      23,\n      25\n    ]\n  }\n}\n```\n(note this is a heavily truncated result).  The bounding box coordinates go top left, top right, bottom right, bottom left, in that order.  
Each bounding box represents a *phrase* or a group of words that MS Reader detected.\n\nThe master dictionary has a slightly similar format to the ``MentionedWordChapterLocation`` field, but includes a reference to the chapter:\n```json\n{\n    \"MentionedWordLocation\": {\n      \"KAGUYA\": {\n          \"1\": [\n            11,\n            12,\n            13\n          ],\n          \"10\": [\n            5,\n            11,\n            15,\n            17,\n            20\n          ],\n          \"100\": [\n            1,\n            3,\n            4,\n            5\n          ],\n          \"101\": [\n            6,\n            7,\n            11,\n            13,\n            15,\n            18\n          ],\n          \"101-1\": [\n            7\n          ]\n        }\n    }\n}\n```\nwhere each word maps to a (chapter, page array) dictionary.\n\n### Contributions\n\nWell, this was originally a private repo, but if anyone wants to contribute, feel free to.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyreko%2Fkaguyaocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpyreko%2Fkaguyaocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyreko%2Fkaguyaocr/lists"}