{"id":13628015,"url":"https://github.com/magic282/MXNMT","last_synced_at":"2025-04-17T00:33:05.187Z","repository":{"id":74397969,"uuid":"74627075","full_name":"magic282/MXNMT","owner":"magic282","description":"MXNet based Neural Machine Translation","archived":false,"fork":false,"pushed_at":"2018-07-24T01:17:33.000Z","size":56,"stargazers_count":118,"open_issues_count":4,"forks_count":39,"subscribers_count":16,"default_branch":"next","last_synced_at":"2024-08-01T22:41:45.046Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/magic282.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-11-24T01:23:48.000Z","updated_at":"2024-07-25T15:10:58.000Z","dependencies_parsed_at":"2023-05-19T08:31:02.175Z","dependency_job_id":null,"html_url":"https://github.com/magic282/MXNMT","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/magic282%2FMXNMT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/magic282%2FMXNMT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/magic282%2FMXNMT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/magic282%2FMXNMT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/magic282","download_url":"https://codeload.github.com/magic282/MXNMT/tar.gz/refs/heads/next","host":{"name":"GitHub","url":"https://github.com
","kind":"github","repositories_count":223735240,"owners_count":17194068,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T22:00:42.919Z","updated_at":"2024-11-08T18:31:16.981Z","avatar_url":"https://github.com/magic282.png","language":"Python","funding_links":[],"categories":["\u003ca name=\"NLP\"\u003e\u003c/a\u003e3. NLP"],"sub_categories":["2.14 Misc"],"readme":"# MXNMT: MXNet based Neural Machine Translation\n\nThis is an implementation of seq2seq with attention for neural machine translation with MXNet.\n\n## Warning:\nThis repo is no longer maintained.\nI recommend https://github.com/magic282/PyTorch_seq2seq\n\n## Data\n\nThe current code uses IWSLT 2009 Chinese-English corpus as training, development and test data. Please request this data set or **use other available parallel corpus**. Data statistics,\n\n| training | dev | test |\n|----------|-----|------|\n| 81819    | 446 | 504  |\n\n## Attention\n* This code does work with the latest mxnet. I made a new version with improved performance in the [next](https://github.com/magic282/MXNMT/tree/next) branch and it can run with the 0.9.5 mxnet. However, this branch is not complete since it lacks the decode part. **I will really appreciate it if you can contribute to this branch.** Also, I ***strongly*** recommend to use this commit (138344683e65c87af20250e3f4cdcc5a72ac3cc5) of mxnet because of [this issue](https://github.com/dmlc/mxnet/issues/5816).\n* The author cannot distribute this dataset. 
**Emails to the code author requesting this dataset will not be answered.**\n\n### Dev/Test Data Format\nEach source sentence in the IWSLT 2009 Ch-En dev/test data has 7 reference translations, for example:\n```\n在 找 给 家里 人 的 礼物 .\n\ni 'm searching for some gifts for my family .\ni want to find something for my family as presents .\ni 'm about to buy some presents for my family .\ni 'd like to buy my family something as a gift .\ni 'm looking for a gift for my family .\ni 'm looking for a present for my family .\ni need a gift for my family .\n有 $number 块 钱 以下 的 茶 吗 ? |||| {1 ||| 1 ||| one thousand ||| $number ||| 一千}\n\ndo you have any tea under one thousand yen ?\ni 'd like to take a look at some tea cheaper than one thousand yen .\nis there any tea less than one thousand yen here ?\ni 'm looking for some tea under one thousand yen .\ndo you have any tea lower than one thousand yen ?\ndo you have any tea less than one thousand yen ?\ni would like to buy some tea cheaper than one thousand yen .\n```\n\n## Result\n\nIn my test, this code achieves a BLEU score of 44.18 (with beam search) on the IWSLT dev set without post-processing after 53 iterations. Specifically:\n`1gram=72.65%  2gram=49.63%  3gram=37.62%  4gram=28.08%   BP = 1.0000 BLEU = 0.4418`\n\n\n## Known Issues\n*  Compatibility issue. The current version requires Python 3, since handling Chinese encoding problems in Python 2 is cumbersome.\n*  In the attention part, `h.dot(U)` should be pre-computed. However, it does not seem to work properly when I do so.\n*  The BLEU evaluator, which is an exe file and is not included, should be replaced by the nltk evaluator in the future.\n*  The model can be modified to achieve a BLEU score of about 50 on this data set.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmagic282%2FMXNMT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmagic282%2FMXNMT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmagic282%2FMXNMT/lists"}