{"id":13935892,"url":"https://github.com/philipperemy/keras-seq2seq-example","last_synced_at":"2025-04-30T12:20:39.496Z","repository":{"id":70249811,"uuid":"106375374","full_name":"philipperemy/keras-seq2seq-example","owner":"philipperemy","description":"Toy Keras implementation of a seq2seq model with examples.","archived":false,"fork":false,"pushed_at":"2020-03-30T14:34:44.000Z","size":2925,"stargazers_count":47,"open_issues_count":0,"forks_count":12,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-30T16:05:51.585Z","etag":null,"topics":["keras","keras-neural-networks","keras-tutorials","seq2seq","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/philipperemy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["philipperemy"]}},"created_at":"2017-10-10T06:11:14.000Z","updated_at":"2024-08-12T19:33:09.000Z","dependencies_parsed_at":"2023-03-04T13:00:31.937Z","dependency_job_id":null,"html_url":"https://github.com/philipperemy/keras-seq2seq-example","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philipperemy%2Fkeras-seq2seq-example","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philipperemy%2Fkeras-seq2seq-example/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philipperemy%2Fkeras-seq2seq-example/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/philipperemy%2Fkeras-seq2seq-example/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/philipperemy","download_url":"https://codeload.github.com/philipperemy/keras-seq2seq-example/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251700490,"owners_count":21629837,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["keras","keras-neural-networks","keras-tutorials","seq2seq","tensorflow"],"created_at":"2024-08-07T23:02:10.730Z","updated_at":"2025-04-30T12:20:39.469Z","avatar_url":"https://github.com/philipperemy.png","language":"Python","funding_links":["https://github.com/sponsors/philipperemy"],"categories":["Python"],"sub_categories":[],"readme":"# Keras sequence to sequence example\nVery simple Keras implementation of a sequence to sequence model with several examples.\n\n[![license](https://img.shields.io/badge/License-Apache_2.0-brightgreen.svg)](https://github.com/philipperemy/keras-seq2seq-example/blob/master/LICENSE) [![dep1](https://img.shields.io/badge/Tensorflow-1.2+-blue.svg)](https://www.tensorflow.org/) [![dep2](https://img.shields.io/badge/Keras-2.0+-blue.svg)](https://keras.io/) \n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"http://suriyadeepan.github.io/img/seq2seq/seq2seq2.png\" width=\"700\"\u003e\n  \u003cbr\u003e\u003ci\u003eAn example of a sequence to sequence model: Encoder Decoder\u003c/i\u003e\n\u003c/p\u003e\n\n\n# Japanese postal Addresses ⇄ ZIP Code (seq2seq)\n\n## Problem explained\n\nBased on a Japanese postal 
address, predict the corresponding ZIP Code.\n\nThis address `福島県会津若松市栄町２−４` corresponds to `965-0871`.\n\nThe current data set (~300k samples) is composed of postal addresses scraped from the Japanese yellow pages [itp.ne.jp](itp.ne.jp). One line looks like this:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/IMG_1.png\" width=\"400\"\u003e\n  \u003cbr\u003e\u003ci\u003eRow of the data set\u003c/i\u003e\n\u003c/p\u003e\n\nWe extract the left part (target) and the right part (inputs) to build a supervised learning problem.\n\nWe expect the accuracy to be very high because mapping an address to its ZIP code is a deterministic function (cf. [Zip codes in Japan](http://www.zipcode-jp.com/modules/zipcode/getarea.php?aid=13113)).\n\nNote also that Google has a large database of addresses and supports lookups, so it should give nearly perfect accuracy.\n\n*The question is: why do we bother building this model?*\n\n- For the sake of learning!\n\n- Google does not handle unseen addresses (permute the numbers and see whether Google still recognizes the address).\n- If one or more characters are missing, Google struggles to handle it. A deep learning model can still make a prediction. \n- We can add noise to the addresses (such as dropout or character replacement) and train a model on this augmented data set.\n- It also works fully offline (although this matters less nowadays).\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/IMG_0162.jpg\" width=\"400\"\u003e\n  \u003cbr\u003e\u003ci\u003eScreenshot of Google.\u003c/i\u003e\n\u003c/p\u003e\n\n## Training\n\nWhat you need before executing the scripts:\n\n- Keras and TensorFlow installed\n- One NVIDIA GPU (\u003eGTX1070)\n- A lot of RAM (\u003e32GB). 
The vectorization is highly unoptimized.\n\n```bash\ngit clone https://github.com/philipperemy/keras-seq2seq-example.git\ncd keras-seq2seq-example\nrm -rf *.npz *.pkl nohup.out\npython3 utils.py # build the vocabulary and the characters.\npython3 vectorization.py\nexport CUDA_VISIBLE_DEVICES=0; nohup python3 -u model.py \u0026\n```\n\n## Results\n\nAfter a while, you should see an accuracy very close to 1.0 for both the training and the validation set.\n\nThis is what I have after the first 10 epochs:\n\n```\nIteration 1\nTrain on 382617 samples, validate on 42513 samples\nEpoch 1/10\n382617/382617 [==============================] - 216s - loss: 0.8973 - acc: 0.6880 - val_loss: 0.3011 - val_acc: 0.8997\nEpoch 2/10\n382617/382617 [==============================] - 197s - loss: 0.1868 - acc: 0.9401 - val_loss: 0.1296 - val_acc: 0.9589\nEpoch 3/10\n382617/382617 [==============================] - 196s - loss: 0.0921 - acc: 0.9718 - val_loss: 0.0790 - val_acc: 0.9763\nEpoch 4/10\n382617/382617 [==============================] - 200s - loss: 0.0586 - acc: 0.9825 - val_loss: 0.0562 - val_acc: 0.9839\nEpoch 5/10\n382617/382617 [==============================] - 201s - loss: 0.0440 - acc: 0.9871 - val_loss: 0.0535 - val_acc: 0.9848\nEpoch 6/10\n382617/382617 [==============================] - 197s - loss: 0.0345 - acc: 0.9900 - val_loss: 0.0334 - val_acc: 0.9908\nEpoch 7/10\n382617/382617 [==============================] - 198s - loss: 0.0279 - acc: 0.9920 - val_loss: 0.0305 - val_acc: 0.9918\nEpoch 8/10\n382617/382617 [==============================] - 196s - loss: 0.0239 - acc: 0.9932 - val_loss: 0.0234 - val_acc: 0.9938\nEpoch 9/10\n382617/382617 [==============================] - 199s - loss: 0.0207 - acc: 0.9942 - val_loss: 0.0253 - val_acc: 0.9935\nEpoch 10/10\n382617/382617 [==============================] - 200s - loss: 0.0180 - acc: 0.9950 - val_loss: 0.0263 - val_acc: 0.9933\n```\n\n\u003e You might have to run it a second time if it gets blocked around an 
accuracy of 0.38 after the first epoch. I ran it several times and the accuracy on the testing set was always around 0.90 after the first epoch.\n\nAfter 75 epochs, the accuracy is around 0.9984. That is roughly 16 mistakes per 10,000 calls. Not too bad. And the loss is still decreasing!\n\nAfter 199 epochs, the accuracy is around 0.9986. That is roughly 14 mistakes per 10,000 calls. Almost flawless.\n\nThe script periodically evaluates a few examples, which lets you monitor the training procedure. `-` is the padding character. All addresses are left-padded to the length of the longest address in the data set.\n\n```\nQ -------------------福島県会津若松市栄町２−４\nT 965-0871\n☑ 965-0871\n---\nQ -----------------東京都品川区西品川３丁目５−４\nT 141-0033\n☑ 141-0033\n---\nQ -------------------滋賀県愛知郡愛荘町市１５７\nT 529-1313\n☑ 529-1313\n---\nQ ----------------青森県つがる市木造赤根１３−４０\nT 038-3142\n☑ 038-3142\n---\nQ ---------------大阪府東大阪市中鴻池町１丁目６−６\nT 578-0975\n☑ 578-0975\n---\nQ ------------------東京都千代田区一番町２７−４\nT 102-0082\n☑ 102-0082\n---\nQ ------------神奈川県横須賀市太田和４丁目２５５０−１\nT 238-0311\n☑ 238-0311\n---\nQ ------------鹿児島県南さつま市笠沙町片浦２３４７−６\nT 897-1301\n☑ 897-1301\n---\nQ ---------------千葉県東金市田間１１５−１−１０２\nT 283-0005\n☑ 283-0005\n---\nQ ---------------千葉県匝瑳市八日市場イ２４０４−１\nT 289-2144\n☑ 289-2144\n```\n\n## References\n\n- [https://github.com/fchollet/keras/blob/master/examples/addition_rnn.py](https://github.com/fchollet/keras/blob/master/examples/addition_rnn.py)\n- [https://www.tensorflow.org/tutorials/seq2seq](https://www.tensorflow.org/tutorials/seq2seq)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphilipperemy%2Fkeras-seq2seq-example","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fphilipperemy%2Fkeras-seq2seq-example","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphilipperemy%2Fkeras-seq2seq-example/lists"}