{"id":17294894,"url":"https://github.com/linsamtw/taiwantrainverificationcode2text","last_synced_at":"2025-10-26T07:33:22.722Z","repository":{"id":62584659,"uuid":"171095158","full_name":"linsamtw/TaiwanTrainVerificationCode2text","owner":"linsamtw","description":"台鐵驗證碼辨識/轉文字","archived":false,"fork":false,"pushed_at":"2022-05-06T04:30:50.000Z","size":12074,"stargazers_count":90,"open_issues_count":0,"forks_count":20,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-12-10T04:05:49.415Z","etag":null,"topics":["api","cnn","cnn-classification","cnn-keras","verification-code"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linsamtw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-02-17T07:42:51.000Z","updated_at":"2024-07-05T03:03:48.000Z","dependencies_parsed_at":"2022-11-03T22:00:41.357Z","dependency_job_id":null,"html_url":"https://github.com/linsamtw/TaiwanTrainVerificationCode2text","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linsamtw%2FTaiwanTrainVerificationCode2text","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linsamtw%2FTaiwanTrainVerificationCode2text/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linsamtw%2FTaiwanTrainVerificationCode2text/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linsamtw%2FTaiwanTrainVerificationCode2text/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/o
wners/linsamtw","download_url":"https://codeload.github.com/linsamtw/TaiwanTrainVerificationCode2text/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230438185,"owners_count":18225870,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","cnn","cnn-classification","cnn-keras","verification-code"],"created_at":"2024-10-15T11:08:30.778Z","updated_at":"2025-10-07T04:20:59.891Z","avatar_url":"https://github.com/linsamtw.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Taiwan Train Verification Code 2 text ( 台鐵驗證碼轉文字 )\n\n\n[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/linsamtw/TaiwanTrainVerificationCode2text/blob/master/LICENSE)\n[![PyPI version](https://badge.fury.io/py/TaiwanTrainVerificationCode2text.svg)](https://badge.fury.io/py/TaiwanTrainVerificationCode2text)\n\n-------------------\n## introduction\nThis package handles the verification-code recognition step for developers writing ticket-booking programs for Taiwan Railways (台鐵). It can be used directly, so the image no longer has to be sent back for a person to read.\u003cbr\u003e\nThe package uses keras \u0026 Tensorflow for model building and prediction, so those dependencies must be installed.\n\nThe model reaches about 88% accuracy on test data and was trained on 100,000 images.\n\n-------------------\n\n\tpip3 install TaiwanTrainVerificationCode2text\n    \ncv2 is harder to install; one way to install it is:\n\n    conda install -c menpo opencv\n\t# you also need this\n    pip3 install h5py\n\n---------------------------------\n\n## demo\n\tinput \n![image](https://raw.githubusercontent.com/linsamtw/TaiwanTrainVerificationCode2text/master/WNBA8S.jpg)\n\n\toutput\nWNBA8S\n\n--------------------\n\n## example \n\n\timport os\n\timport platform\n\tfrom 
TaiwanTrainVerificationCode2text import verification_code2text\n\tfrom TaiwanTrainVerificationCode2text import work_vcode \n\tfrom TaiwanTrainVerificationCode2text import download \n\timport TaiwanTrainVerificationCode2text\n\tPATH = TaiwanTrainVerificationCode2text.__path__[0]\n\timport cv2\n\timport matplotlib.pyplot as plt\n\timport random\n\tfrom datetime import datetime\n\t\n\t# download my pre-trained weights; the ttf is the verification-code font, used below to generate simulated verification codes\n\tdownload.weight()\n\tdownload.ttf()\n\t# generate simulated verification codes\n\twork_vcode.work_vcode_fun(10,'test_data',5)\n\twork_vcode.work_vcode_fun(10,'test_data',6)\n\n\t# platform.platform() starts with a capital 'Linux', so compare in lower case\n\tif 'linux' in platform.platform().lower():\n\t    file_path = '{}/{}/'.format(PATH,'test_data')\n\t    train_image_path = [file_path + i for i in os.listdir(file_path+'/')]\n\telse:\n\t    file_path = '{}\\\\{}\\\\'.format(PATH,'test_data')\n\t    train_image_path = [file_path + i for i in os.listdir(file_path+'\\\\')]\n\t# pick one image at random for the demo\n\timage_name = train_image_path[random.sample( range(len(train_image_path)) ,1)[0]]\n\n\t# read the image\n\timage = cv2.imread(image_name)\n\t# show the image\n\tplt.imshow(image)\n\t# recognize: convert the verification code to text\n\ttext = verification_code2text.main(image)\n\t# print the final result\n\tprint(text)\n\nThe final result will look like the demo above.\n\n-------------------------------\n\nIf you want to train the model yourself, you can use\n\n[build_verification_code_cnn_model.py](https://github.com/linsamtw/TaiwanTrainVerificationCode2text/blob/master/build_verification_code_cnn_model.py)\n\nA brief look at the main code:\n\n    def main():\n        import work_vcode \n        #import time\n        # Taiwan Railways verification codes are randomly 5 or 6 characters long, so both 5-character and 6-character codes must be generated\n        # 500 is the number of samples; 30000 is recommended, 500 is only for the demo\n        work_vcode.work_vcode_fun(500,'train_data',5)\n        work_vcode.work_vcode_fun(500,'train_data',6)\n        # generate test data; adjust the number of samples as you like\n        work_vcode.work_vcode_fun(100,'test_data',5)\n        work_vcode.work_vcode_fun(100,'test_data',6)\n        self = build_verification_code_cnn_model()\n        # build the model; the final weights are saved to package_path/cnn_weight/verificatioin_code.h5\n        self.build_model_process()  \n   \n 
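  After training, you can run the example code above again; it will load the weights you trained.\n\n--------------------------\n\n## method\nA verification code is made up of the 26 English letters and 10 digits and is randomly 5 or 6 characters long, so I classify each character into 26 + 10 + null = 37 classes.\u003cbr\u003e\nnull means there is no character at that position. The output of the NN has 6 dimensions: a 5-character code is treated as a 6-character code whose last character is null, so 5- and 6-character codes are handled by the same model.\u003cbr\u003e\n\n----------------\n## future\nThe biggest problem is that null makes up too large a share of the labels. Take the data-generation code above as an example:\n\n        work_vcode.work_vcode_fun(500,'train_data',5)\n        work_vcode.work_vcode_fun(500,'train_data',6)\nSetting aside the letter and digit counts, half of the labels for the sixth position are null, which causes a severe class-imbalance problem for the classifier. Adding more data does not help. In the future I plan to combine this with object detection to improve accuracy.\n\n-----------------------\nIf you have questions, email me or leave a comment in the issues.\n\nemail : samlin266118@gmail.com\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinsamtw%2Ftaiwantrainverificationcode2text","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinsamtw%2Ftaiwantrainverificationcode2text","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinsamtw%2Ftaiwantrainverificationcode2text/lists"}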