{"id":13535087,"url":"https://github.com/nicolas-ivanov/debug_seq2seq","last_synced_at":"2025-04-02T00:32:22.224Z","repository":{"id":37382089,"uuid":"47339115","full_name":"nicolas-ivanov/debug_seq2seq","owner":"nicolas-ivanov","description":"[unmaintained] Make seq2seq for keras work","archived":false,"fork":false,"pushed_at":"2016-12-19T17:04:05.000Z","size":6458,"stargazers_count":233,"open_issues_count":21,"forks_count":86,"subscribers_count":23,"default_branch":"master","last_synced_at":"2024-11-02T23:32:10.198Z","etag":null,"topics":["chatbot","keras","seq2seq"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nicolas-ivanov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-12-03T15:02:32.000Z","updated_at":"2024-05-09T02:52:09.000Z","dependencies_parsed_at":"2022-09-14T21:20:36.289Z","dependency_job_id":null,"html_url":"https://github.com/nicolas-ivanov/debug_seq2seq","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolas-ivanov%2Fdebug_seq2seq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolas-ivanov%2Fdebug_seq2seq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolas-ivanov%2Fdebug_seq2seq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolas-ivanov%2Fdebug_seq2seq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nicolas-ivanov","download_url":"https://codeload.github.com/nicolas-ivanov/debug_seq2seq/ta
r.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246735286,"owners_count":20825219,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","keras","seq2seq"],"created_at":"2024-08-01T08:00:49.567Z","updated_at":"2025-04-02T00:32:17.706Z","avatar_url":"https://github.com/nicolas-ivanov.png","language":"Python","funding_links":[],"categories":["Codes"],"sub_categories":[],"readme":"# debug seq2seq\n\n\u003e *Note: the repository is not maintained. Feel free to PM me if you'd like to take over the maintenance.*\n\nMake [seq2seq for keras](https://github.com/farizrahman4u/seq2seq) work. And also give a try to some other implementations of [seq2seq](https://github.com/nicolas-ivanov/seq2seq_chatbot_links).\n\nThe code includes:\n\n* a small dataset of movie scripts to train your models on\n* a preprocessor function to properly tokenize the data\n* word2vec helpers to make use of the gensim word2vec lib for extra flexibility\n* train and predict functions to harness the power of seq2seq\n \n**Warning**\n\n* The code has bugs, undoubtedly. Feel free to fix them and send a pull request.\n* No good results were achieved with this architecture yet. See the 'Results' section below for details. 
\n\n**Papers**\n\n* [Sequence to Sequence Learning with Neural Networks](http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf)\n* [A Neural Conversational Model](http://arxiv.org/pdf/1506.05869v1.pdf)\n\n**Nice picture**\n\n[![seq2seq](https://4.bp.blogspot.com/-aArS0l1pjHQ/Vjj71pKAaEI/AAAAAAAAAxE/Nvy1FSbD_Vs/s640/2TFstaticgraphic_alt-01.png)](http://4.bp.blogspot.com/-aArS0l1pjHQ/Vjj71pKAaEI/AAAAAAAAAxE/Nvy1FSbD_Vs/s1600/2TFstaticgraphic_alt-01.png)\n\n**Setup\u0026Run**\n\n    git clone https://github.com/nicolas-ivanov/debug_seq2seq\n    cd debug_seq2seq\n    bash bin/setup.sh\n    python bin/train.py\n\nand then\n\n    python bin/test.py\n\n\n**Results**\n\nNo good results were achieved so far:\n\n    [why ?] -\u003e [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]\n    [who ?] -\u003e [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]\n    [yeah ?] -\u003e [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]\n    [what is it ?] -\u003e [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]\n    [why not ?] -\u003e [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]\n    [really ?] -\u003e [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]\n\nMy guess is that there are some foundational problems in this approach:\n\n* Since word2vec vectors are used as word representations and the model returns only an approximate vector for every next word, the error accumulates from one word to the next, and so starting from the third word the model fails to predict anything meaningful...\nThis problem might be overcome if we replace the approximate word2vec vector at every timestep with a \"correct\" vector, i.e. the one that corresponds to an actual word from the dictionary. 
Does it make sense?\nHowever, you would need to dig into the seq2seq code to do that.\n\n* The second problem relates to word sampling: even if you manage to solve the aforementioned issue, if you stick to using argmax() to pick the most probable word at every timestep, the answers will be too simple and uninteresting, like:\n\n```\nare you a human?\t\t\t-- no .\nare you a robot or human?\t-- no .\nare you a robot?\t\t\t-- no .\nare you better than siri?  \t\t-- yes .\nare you here ?\t\t\t\t-- yes .\nare you human?\t\t\t-- no .\nare you really better than siri?\t-- yes .\nare you there \t\t\t\t-- you ' re not going to be\nare you there?!?!\t\t\t-- yes .\n```\n\nNot to mislead you: these results were achieved on a different seq2seq architecture, based on tensorflow.\n\nSampling with temperature could be used to diversify the output, but again, that would have to be done inside the seq2seq library.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnicolas-ivanov%2Fdebug_seq2seq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnicolas-ivanov%2Fdebug_seq2seq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnicolas-ivanov%2Fdebug_seq2seq/lists"}