{"id":13468868,"url":"https://github.com/tensorflow/tensor2tensor","last_synced_at":"2025-10-05T16:30:54.036Z","repository":{"id":37335495,"uuid":"94460704","full_name":"tensorflow/tensor2tensor","owner":"tensorflow","description":"Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.","archived":true,"fork":false,"pushed_at":"2023-06-02T18:55:09.000Z","size":17508,"stargazers_count":16528,"open_issues_count":590,"forks_count":3668,"subscribers_count":473,"default_branch":"master","last_synced_at":"2025-10-04T00:36:14.767Z","etag":null,"topics":["deep-learning","machine-learning","machine-translation","reinforcement-learning","tpu"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tensorflow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-06-15T16:57:39.000Z","updated_at":"2025-10-03T22:03:12.000Z","dependencies_parsed_at":"2023-02-10T20:16:04.144Z","dependency_job_id":"befab3c9-a726-48c3-b745-34c3b2660478","html_url":"https://github.com/tensorflow/tensor2tensor","commit_stats":{"total_commits":4111,"total_committers":255,"mean_commits":16.12156862745098,"dds":0.8139138895645828,"last_synced_commit":"bafdc1b67730430d38d6ab802cbd51f9d053ba2e"},"previous_names":[],"tags_count":76,"template":false,"template_full_name":null,"purl":"pkg:github/tensorflow/tensor2tensor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Ftensor2tensor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Ftensor2tensor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Ftensor2tensor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Ftensor2tensor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tensorflow","download_url":"https://codeload.github.com/tensorflow/tensor2tensor/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Ftensor2tensor/sbom","scorecard":{"id":873972,"data":{"date":"2025-08-11","repo":{"name":"github.com/tensorflow/tensor2tensor","commit":"bafdc1b67730430d38d6ab802cbd51f9d053ba2e"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.9,"checks":[{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Code-Review","score":9,"reason":"Found 29/30 approved changesets -- score normalized to 9","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Maintained","score":0,"reason":"project is archived","details":["Warn: Repository is archived."],"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: pipCommand not pinned by hash: oss_scripts/oss_integration_test.sh:26","Warn: pipCommand not pinned by hash: oss_scripts/oss_integration_test.sh:43","Warn: pipCommand not pinned by hash: oss_scripts/oss_pip_install.sh:9","Warn: pipCommand not pinned by hash: oss_scripts/oss_pip_install.sh:10","Warn: pipCommand not pinned by hash: oss_scripts/oss_pip_install.sh:14","Warn: pipCommand not pinned by hash: oss_scripts/oss_pip_install.sh:15","Warn: pipCommand not pinned by hash: oss_scripts/oss_pip_install.sh:21","Warn: pipCommand not pinned by hash: oss_scripts/oss_pip_install.sh:26","Warn: pipCommand not pinned by hash: oss_scripts/oss_pip_install.sh:28","Warn: pipCommand not pinned by hash: oss_scripts/oss_release.sh:19","Info:   0 out of  10 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 2 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-24T05:14:40.156Z","repository_id":37335495,"created_at":"2025-08-24T05:14:40.156Z","updated_at":"2025-08-24T05:14:40.156Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278251584,"owners_count":25956124,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-03T02:00:06.070Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","machine-learning","machine-translation","reinforcement-learning","tpu"],"created_at":"2024-07-31T15:01:20.832Z","updated_at":"2025-10-05T16:30:53.104Z","avatar_url":"https://github.com/tensorflow.png","language":"Python","funding_links":[],"categories":["Python","Implementations","Implementations \u0026 How to reproduce paper's result","Python (144)","Deep Learning Framework","TensorFlow Tools, Libraries, and Frameworks","Implementations \u0026 How to reproduce paper's result?","Tensorflow实用程序","Optimized Computation","Libraries related to the Natural Language Processing","📦 Legacy \u0026 Inactive Projects","Model 💛💛💛💛💛\u003ca name=\"Model\" /\u003e"],"sub_categories":["Complex, performance-reproducable implementations","High-Level DL APIs","Natural Language Processing","NLP Model 自然语言处理模型"],"readme":"# Tensor2Tensor\n\n[![PyPI\nversion](https://badge.fury.io/py/tensor2tensor.svg)](https://badge.fury.io/py/tensor2tensor)\n[![GitHub\nIssues](https://img.shields.io/github/issues/tensorflow/tensor2tensor.svg)](https://github.com/tensorflow/tensor2tensor/issues)\n[![Contributions\nwelcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CONTRIBUTING.md)\n[![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/tensor2tensor/Lobby)\n[![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Travis](https://img.shields.io/travis/tensorflow/tensor2tensor.svg)](https://travis-ci.org/tensorflow/tensor2tensor)\n[![Run on FH](https://static.floydhub.com/button/button-small.svg)](https://floydhub.com/run)\n\n[Tensor2Tensor](https://github.com/tensorflow/tensor2tensor), or\n[T2T](https://github.com/tensorflow/tensor2tensor) for short, is a library\nof deep learning models and datasets designed to make deep learning more\naccessible and [accelerate ML\nresearch](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).\n\n\nT2T was developed by researchers and engineers in the\n[Google Brain team](https://research.google.com/teams/brain/) and a community\nof users. It is now deprecated \u0026mdash; we keep it running and welcome\nbug-fixes, but encourage users to use the successor library [Trax](https://github.com/google/trax).\n\n### Quick Start\n\n[This iPython notebook](https://colab.research.google.com/github/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/hello_t2t.ipynb)\nexplains T2T and runs in your browser using a free VM from Google,\nno installation needed. Alternatively, here is a one-command version that\ninstalls T2T, downloads MNIST, trains a model and evaluates it:\n\n```\npip install tensor2tensor \u0026\u0026 t2t-trainer \\\n  --generate_data \\\n  --data_dir=~/t2t_data \\\n  --output_dir=~/t2t_train/mnist \\\n  --problem=image_mnist \\\n  --model=shake_shake \\\n  --hparams_set=shake_shake_quick \\\n  --train_steps=1000 \\\n  --eval_steps=100\n```\n\n### Contents\n\n* [Suggested Datasets and Models](#suggested-datasets-and-models)\n  * [Mathematical Language Understanding](#mathematical-language-understanding)\n  * [Story, Question and Answer](#story-question-and-answer)\n  * [Image Classification](#image-classification)\n  * [Image Generation](#image-generation)\n  * [Language Modeling](#language-modeling)\n  * [Sentiment Analysis](#sentiment-analysis)\n  * [Speech Recognition](#speech-recognition)\n  * [Summarization](#summarization)\n  * [Translation](#translation)\n* [Basics](#basics)\n  * [Walkthrough](#walkthrough)\n  * [Installation](#installation)\n  * [Features](#features)\n* [T2T Overview](#t2t-overview)\n  * [Datasets](#datasets)\n  * [Problems and Modalities](#problems-and-modalities)\n  * [Models](#models)\n  * [Hyperparameter Sets](#hyperparameter-sets)\n  * [Trainer](#trainer)\n* [Adding your own components](#adding-your-own-components)\n* [Adding a dataset](#adding-a-dataset)\n* [Papers](#papers)\n* [Run on FloydHub](#run-on-floydhub)\n\n## Suggested Datasets and Models\n\nBelow we list a number of tasks that can be solved with T2T when\nyou train the appropriate model on the appropriate problem.\nWe give the problem and model below and we suggest a setting of\nhyperparameters that we know works well in our setup. We usually\nrun either on Cloud TPUs or on 8-GPU machines; you might need\nto modify the hyperparameters if you run on a different setup.\n\n### Mathematical Language Understanding\n\nFor evaluating mathematical expressions at the character level involving addition, subtraction and multiplication of both positive and negative decimal numbers with variable digits assigned to symbolic variables, use\n\n* the [MLU](https://art.wangperawong.com/mathematical_language_understanding_train.tar.gz) data-set:\n `--problem=algorithmic_math_two_variables`\n\nYou can try solving the problem with different transformer models and hyperparameters as described in the [paper](https://arxiv.org/abs/1812.02825):\n* Standard transformer:\n`--model=transformer`\n`--hparams_set=transformer_tiny`\n* Universal transformer:\n`--model=universal_transformer`\n`--hparams_set=universal_transformer_tiny`\n* Adaptive universal transformer:\n`--model=universal_transformer`\n`--hparams_set=adaptive_universal_transformer_tiny`\n\n### Story, Question and Answer\n\nFor answering questions based on a story, use\n\n* the [bAbi](https://research.fb.com/downloads/babi/) data-set:\n `--problem=babi_qa_concat_task1_1k`\n\nYou can choose the bAbi task from the range [1,20] and the subset from 1k or\n10k. To combine test data from all tasks into a single test set, use\n`--problem=babi_qa_concat_all_tasks_10k`\n\n### Image Classification\n\nFor image classification, we have a number of standard data-sets:\n\n* ImageNet (a large data-set): `--problem=image_imagenet`, or one\n   of the re-scaled versions (`image_imagenet224`, `image_imagenet64`,\n   `image_imagenet32`)\n* CIFAR-10: `--problem=image_cifar10` (or\n    `--problem=image_cifar10_plain` to turn off data augmentation)\n* CIFAR-100: `--problem=image_cifar100`\n* MNIST: `--problem=image_mnist`\n\nFor ImageNet, we suggest to use the ResNet or Xception, i.e.,\nuse `--model=resnet --hparams_set=resnet_50` or\n`--model=xception --hparams_set=xception_base`.\nResnet should get to above 76% top-1 accuracy on ImageNet.\n\nFor CIFAR and MNIST, we suggest to try the shake-shake model:\n`--model=shake_shake --hparams_set=shakeshake_big`.\nThis setting trained for `--train_steps=700000` should yield\nclose to 97% accuracy on CIFAR-10.\n\n### Image Generation\n\nFor (un)conditional image generation, we have a number of standard data-sets:\n\n* CelebA: `--problem=img2img_celeba` for image-to-image translation, namely,\n    superresolution from 8x8 to 32x32.\n* CelebA-HQ: `--problem=image_celeba256_rev` for a downsampled 256x256.\n* CIFAR-10: `--problem=image_cifar10_plain_gen_rev` for class-conditional\n    32x32 generation.\n* LSUN Bedrooms: `--problem=image_lsun_bedrooms_rev`\n* MS-COCO: `--problem=image_text_ms_coco_rev` for text-to-image generation.\n* Small ImageNet (a large data-set): `--problem=image_imagenet32_gen_rev` for\n    32x32 or `--problem=image_imagenet64_gen_rev` for 64x64.\n\nWe suggest to use the Image Transformer, i.e., `--model=imagetransformer`, or\nthe Image Transformer Plus, i.e., `--model=imagetransformerpp` that uses\ndiscretized mixture of logistics, or variational auto-encoder, i.e.,\n`--model=transformer_ae`.\nFor CIFAR-10, using `--hparams_set=imagetransformer_cifar10_base` or\n`--hparams_set=imagetransformer_cifar10_base_dmol` yields 2.90 bits per\ndimension. For Imagenet-32, using\n`--hparams_set=imagetransformer_imagenet32_base` yields 3.77 bits per dimension.\n\n### Language Modeling\n\nFor language modeling, we have these data-sets in T2T:\n\n* PTB (a small data-set): `--problem=languagemodel_ptb10k` for\n    word-level modeling and `--problem=languagemodel_ptb_characters`\n    for character-level modeling.\n* LM1B (a billion-word corpus): `--problem=languagemodel_lm1b32k` for\n    subword-level modeling and `--problem=languagemodel_lm1b_characters`\n    for character-level modeling.\n\nWe suggest to start with `--model=transformer` on this task and use\n`--hparams_set=transformer_small` for PTB and\n`--hparams_set=transformer_base` for LM1B.\n\n### Sentiment Analysis\n\nFor the task of recognizing the sentiment of a sentence, use\n\n* the IMDB data-set: `--problem=sentiment_imdb`\n\nWe suggest to use `--model=transformer_encoder` here and since it is\na small data-set, try `--hparams_set=transformer_tiny` and train for\nfew steps (e.g., `--train_steps=2000`).\n\n### Speech Recognition\n\nFor speech-to-text, we have these data-sets in T2T:\n\n* Librispeech (US English): `--problem=librispeech` for\n    the whole set and `--problem=librispeech_clean` for a smaller\n    but nicely filtered part.\n\n* Mozilla Common Voice (US English): `--problem=common_voice` for the whole set\n    `--problem=common_voice_clean` for a quality-checked subset.\n\n### Summarization\n\nFor summarizing longer text into shorter one we have these data-sets:\n\n* CNN/DailyMail articles summarized into a few sentences:\n  `--problem=summarize_cnn_dailymail32k`\n\nWe suggest to use `--model=transformer` and\n`--hparams_set=transformer_prepend` for this task.\nThis yields good ROUGE scores.\n\n### Translation\n\nThere are a number of translation data-sets in T2T:\n\n* English-German: `--problem=translate_ende_wmt32k`\n* English-French: `--problem=translate_enfr_wmt32k`\n* English-Czech: `--problem=translate_encs_wmt32k`\n* English-Chinese: `--problem=translate_enzh_wmt32k`\n* English-Vietnamese: `--problem=translate_envi_iwslt32k`\n* English-Spanish: `--problem=translate_enes_wmt32k`\n\nYou can get translations in the other direction by appending `_rev` to\nthe problem name, e.g., for German-English use\n`--problem=translate_ende_wmt32k_rev`\n(note that you still need to download the original data with t2t-datagen\n`--problem=translate_ende_wmt32k`).\n\nFor all translation problems, we suggest to try the Transformer model:\n`--model=transformer`. At first it is best to try the base setting,\n`--hparams_set=transformer_base`. When trained on 8 GPUs for 300K steps\nthis should reach a BLEU score of about 28 on the English-German data-set,\nwhich is close to state-of-the art. If training on a single GPU, try the\n`--hparams_set=transformer_base_single_gpu` setting. For very good results\nor larger data-sets (e.g., for English-French), try the big model\nwith `--hparams_set=transformer_big`.\n\nSee this [example](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/Transformer_translate.ipynb) to know how the translation works.\n\n## Basics\n\n### Walkthrough\n\nHere's a walkthrough training a good English-to-German translation\nmodel using the Transformer model from [*Attention Is All You\nNeed*](https://arxiv.org/abs/1706.03762) on WMT data.\n\n```\npip install tensor2tensor\n\n# See what problems, models, and hyperparameter sets are available.\n# You can easily swap between them (and add new ones).\nt2t-trainer --registry_help\n\nPROBLEM=translate_ende_wmt32k\nMODEL=transformer\nHPARAMS=transformer_base_single_gpu\n\nDATA_DIR=$HOME/t2t_data\nTMP_DIR=/tmp/t2t_datagen\nTRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS\n\nmkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR\n\n# Generate data\nt2t-datagen \\\n  --data_dir=$DATA_DIR \\\n  --tmp_dir=$TMP_DIR \\\n  --problem=$PROBLEM\n\n# Train\n# *  If you run out of memory, add --hparams='batch_size=1024'.\nt2t-trainer \\\n  --data_dir=$DATA_DIR \\\n  --problem=$PROBLEM \\\n  --model=$MODEL \\\n  --hparams_set=$HPARAMS \\\n  --output_dir=$TRAIN_DIR\n\n# Decode\n\nDECODE_FILE=$DATA_DIR/decode_this.txt\necho \"Hello world\" \u003e\u003e $DECODE_FILE\necho \"Goodbye world\" \u003e\u003e $DECODE_FILE\necho -e 'Hallo Welt\\nAuf Wiedersehen Welt' \u003e ref-translation.de\n\nBEAM_SIZE=4\nALPHA=0.6\n\nt2t-decoder \\\n  --data_dir=$DATA_DIR \\\n  --problem=$PROBLEM \\\n  --model=$MODEL \\\n  --hparams_set=$HPARAMS \\\n  --output_dir=$TRAIN_DIR \\\n  --decode_hparams=\"beam_size=$BEAM_SIZE,alpha=$ALPHA\" \\\n  --decode_from_file=$DECODE_FILE \\\n  --decode_to_file=translation.en\n\n# See the translations\ncat translation.en\n\n# Evaluate the BLEU score\n# Note: Report this BLEU score in papers, not the internal approx_bleu metric.\nt2t-bleu --translation=translation.en --reference=ref-translation.de\n```\n\n### Installation\n\n\n```\n# Assumes tensorflow or tensorflow-gpu installed\npip install tensor2tensor\n\n# Installs with tensorflow-gpu requirement\npip install tensor2tensor[tensorflow_gpu]\n\n# Installs with tensorflow (cpu) requirement\npip install tensor2tensor[tensorflow]\n```\n\nBinaries:\n\n```\n# Data generator\nt2t-datagen\n\n# Trainer\nt2t-trainer --registry_help\n```\n\nLibrary usage:\n\n```\npython -c \"from tensor2tensor.models.transformer import Transformer\"\n```\n\n### Features\n\n* Many state of the art and baseline models are built-in and new models can be\n  added easily (open an issue or pull request!).\n* Many datasets across modalities - text, audio, image - available for\n  generation and use, and new ones can be added easily (open an issue or pull\n  request for public datasets!).\n* Models can be used with any dataset and input mode (or even multiple); all\n  modality-specific processing (e.g. embedding lookups for text tokens) is done\n  with `bottom` and `top` transformations, which are specified per-feature in the\n  model.\n* Support for multi-GPU machines and synchronous (1 master, many workers) and\n  asynchronous (independent workers synchronizing through a parameter server)\n  [distributed training](https://tensorflow.github.io/tensor2tensor/distributed_training.html).\n* Easily swap amongst datasets and models by command-line flag with the data\n  generation script `t2t-datagen` and the training script `t2t-trainer`.\n* Train on [Google Cloud ML](https://tensorflow.github.io/tensor2tensor/cloud_mlengine.html) and [Cloud TPUs](https://tensorflow.github.io/tensor2tensor/cloud_tpu.html).\n\n## T2T overview\n\n### Problems\n\n**Problems** consist of features such as inputs and targets, and metadata such\nas each feature's modality (e.g. symbol, image, audio) and vocabularies. Problem\nfeatures are given by a dataset, which is stored as a `TFRecord` file with\n`tensorflow.Example` protocol buffers. All\nproblems are imported in\n[`all_problems.py`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/data_generators/all_problems.py)\nor are registered with `@registry.register_problem`. Run\n[`t2t-datagen`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/bin/t2t-datagen)\nto see the list of available problems and download them.\n\n### Models\n\n**`T2TModel`s** define the core tensor-to-tensor computation. They apply a\ndefault transformation to each input and output so that models may deal with\nmodality-independent tensors (e.g. embeddings at the input; and a linear\ntransform at the output to produce logits for a softmax over classes). All\nmodels are imported in the\n[`models` subpackage](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/models/__init__.py),\ninherit from [`T2TModel`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/t2t_model.py),\nand are registered with\n[`@registry.register_model`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/registry.py).\n\n### Hyperparameter Sets\n\n**Hyperparameter sets** are encoded in\n[`HParams`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/hparam.py)\nobjects, and are registered with\n[`@registry.register_hparams`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/registry.py).\nEvery model and problem has a `HParams`. A basic set of hyperparameters are\ndefined in\n[`common_hparams.py`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/layers/common_hparams.py)\nand hyperparameter set functions can compose other hyperparameter set functions.\n\n### Trainer\n\nThe **trainer** binary is the entrypoint for training, evaluation, and\ninference. Users can easily switch between problems, models, and hyperparameter\nsets by using the `--model`, `--problem`, and `--hparams_set` flags. Specific\nhyperparameters can be overridden with the `--hparams` flag. `--schedule` and\nrelated flags control local and distributed training/evaluation\n([distributed training documentation](https://github.com/tensorflow/tensor2tensor/tree/master/docs/distributed_training.md)).\n\n## Adding your own components\n\nT2T's components are registered using a central registration mechanism that\nenables easily adding new ones and easily swapping amongst them by command-line\nflag. You can add your own components without editing the T2T codebase by\nspecifying the `--t2t_usr_dir` flag in `t2t-trainer`.\n\nYou can do so for models, hyperparameter sets, modalities, and problems. Please\ndo submit a pull request if your component might be useful to others.\n\nSee the [`example_usr_dir`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/test_data/example_usr_dir)\nfor an example user directory.\n\n## Adding a dataset\n\nTo add a new dataset, subclass\n[`Problem`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/data_generators/problem.py)\nand register it with `@registry.register_problem`. See\n[`TranslateEndeWmt8k`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/data_generators/translate_ende.py)\nfor an example. Also see the [data generators\nREADME](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/data_generators/README.md).\n\n## Run on FloydHub\n\n[![Run on FloydHub](https://static.floydhub.com/button/button.svg)](https://floydhub.com/run)\n\nClick this button to open a [Workspace](https://blog.floydhub.com/workspaces/) on [FloydHub](https://www.floydhub.com/?utm_medium=readme\u0026utm_source=tensor2tensor\u0026utm_campaign=jul_2018). You can use the workspace to develop and test your code on a fully configured cloud GPU machine.\n\nTensor2Tensor comes preinstalled in the environment, you can simply open a [Terminal](https://docs.floydhub.com/guides/workspace/#using-terminal) and run your code.\n\n```bash\n# Test the quick-start on a Workspace's Terminal with this command\nt2t-trainer \\\n  --generate_data \\\n  --data_dir=./t2t_data \\\n  --output_dir=./t2t_train/mnist \\\n  --problem=image_mnist \\\n  --model=shake_shake \\\n  --hparams_set=shake_shake_quick \\\n  --train_steps=1000 \\\n  --eval_steps=100\n```\n\nNote: Ensure compliance with the FloydHub [Terms of Service](https://www.floydhub.com/about/terms).\n\n## Papers\n\nWhen referencing Tensor2Tensor, please cite [this\npaper](https://arxiv.org/abs/1803.07416).\n\n```\n@article{tensor2tensor,\n  author    = {Ashish Vaswani and Samy Bengio and Eugene Brevdo and\n    Francois Chollet and Aidan N. Gomez and Stephan Gouws and Llion Jones and\n    \\L{}ukasz Kaiser and Nal Kalchbrenner and Niki Parmar and Ryan Sepassi and\n    Noam Shazeer and Jakob Uszkoreit},\n  title     = {Tensor2Tensor for Neural Machine Translation},\n  journal   = {CoRR},\n  volume    = {abs/1803.07416},\n  year      = {2018},\n  url       = {http://arxiv.org/abs/1803.07416},\n}\n```\n\nTensor2Tensor was used to develop a number of state-of-the-art models\nand deep learning methods. Here we list some papers that were based on T2T\nfrom the start and benefited from its features and architecture in ways\ndescribed in the [Google Research Blog post introducing\nT2T](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).\n\n* [Attention Is All You Need](https://arxiv.org/abs/1706.03762)\n* [Depthwise Separable Convolutions for Neural Machine\n   Translation](https://arxiv.org/abs/1706.03059)\n* [One Model To Learn Them All](https://arxiv.org/abs/1706.05137)\n* [Discrete Autoencoders for Sequence Models](https://arxiv.org/abs/1801.09797)\n* [Generating Wikipedia by Summarizing Long\n   Sequences](https://arxiv.org/abs/1801.10198)\n* [Image Transformer](https://arxiv.org/abs/1802.05751)\n* [Training Tips for the Transformer Model](https://arxiv.org/abs/1804.00247)\n* [Self-Attention with Relative Position Representations](https://arxiv.org/abs/1803.02155)\n* [Fast Decoding in Sequence Models using Discrete Latent Variables](https://arxiv.org/abs/1803.03382)\n* [Adafactor: Adaptive Learning Rates with Sublinear Memory Cost](https://arxiv.org/abs/1804.04235)\n* [Universal Transformers](https://arxiv.org/abs/1807.03819)\n* [Attending to Mathematical Language with Transformers](https://arxiv.org/abs/1812.02825)\n* [The Evolved Transformer](https://arxiv.org/abs/1901.11117)\n* [Model-Based Reinforcement Learning for Atari](https://arxiv.org/abs/1903.00374)\n* [VideoFlow: A Flow-Based Generative Model for Video](https://arxiv.org/abs/1903.01434)\n\n*NOTE: This is not an official Google product.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorflow%2Ftensor2tensor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftensorflow%2Ftensor2tensor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorflow%2Ftensor2tensor/lists"}