{"id":13443493,"url":"https://github.com/tensorflow/lingvo","last_synced_at":"2025-05-13T22:11:31.390Z","repository":{"id":37665048,"uuid":"142219189","full_name":"tensorflow/lingvo","owner":"tensorflow","description":"Lingvo","archived":false,"fork":false,"pushed_at":"2025-04-30T08:52:54.000Z","size":149095,"stargazers_count":2838,"open_issues_count":145,"forks_count":451,"subscribers_count":117,"default_branch":"master","last_synced_at":"2025-05-08T00:09:45.230Z","etag":null,"topics":["asr","distributed","gpu-computing","language-model","lm","machine-translation","mnist","nlp","research","seq2seq","speech","speech-recognition","speech-synthesis","speech-to-text","tensorflow","translation","tts"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tensorflow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-07-24T22:30:28.000Z","updated_at":"2025-04-30T08:52:57.000Z","dependencies_parsed_at":"2023-10-02T21:35:09.766Z","dependency_job_id":"44f17f8a-ac30-48ba-a542-4a8cc18d99ce","html_url":"https://github.com/tensorflow/lingvo","commit_stats":{"total_commits":4640,"total_committers":85,"mean_commits":"54.588235294117645","dds":0.5911637931034484,"last_synced_commit":"1f53897c0d265908929c6ccea2ac0e3e05e3ffc7"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Flingvo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/tensorflow%2Flingvo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Flingvo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Flingvo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tensorflow","download_url":"https://codeload.github.com/tensorflow/lingvo/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254036842,"owners_count":22003654,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asr","distributed","gpu-computing","language-model","lm","machine-translation","mnist","nlp","research","seq2seq","speech","speech-recognition","speech-synthesis","speech-to-text","tensorflow","translation","tts"],"created_at":"2024-07-31T03:02:02.057Z","updated_at":"2025-05-13T22:11:26.379Z","avatar_url":"https://github.com/tensorflow.png","language":"Python","readme":"# Lingvo\n\n[![PyPI](https://badge.fury.io/py/lingvo.svg)](https://badge.fury.io/py/lingvo)\n[![Python](https://img.shields.io/pypi/pyversions/lingvo)](https://badge.fury.io/py/tensorflow)\n\n[![Documentation](https://img.shields.io/badge/api-reference-blue.svg)](https://tensorflow.github.io/lingvo)\n\n[![License](https://img.shields.io/github/license/tensorflow/lingvo)](LICENSE)\n\n## What is it?\n\nLingvo is a framework for building neural networks in Tensorflow, particularly\nsequence models.\n\nA list of publications using Lingvo can be found [here](PUBLICATIONS.md).\n\n## Table of Contents\n\n*   [Releases](#releases)\n    *   
    *   [Major breaking changes](#major-breaking-changes)
*   [Quick start](#quick-start)
    *   [Installation](#installation)
    *   [Running the MNIST image model](#running-the-mnist-image-model)
    *   [Running the machine translation model](#running-the-machine-translation-model)
    *   [Running the GShard transformer based giant language model](#running-the-gshard-transformer-based-giant-language-model)
    *   [Running the 3d object detection model](#running-the-3d-object-detection-model)
*   [Models](#models)
    *   [Automatic Speech Recognition](#automatic-speech-recognition)
    *   [Car](#car)
    *   [Image](#image)
    *   [Language Modelling](#language-modelling)
    *   [Machine Translation](#machine-translation)
*   [References](#references)
*   [License](#license)

## Releases

PyPI Version | Commit
------------ | ----------------------------------------
0.12.4       | --
0.11.0       | 6fae10077756f54beacd5c454959f20b33fd65e2
0.10.0       | 075fd1d88fa6f92681f58a2383264337d0e737ee
0.9.1        | c1124c5aa7af13d2dd2b6d43293c8ca6d022b008
0.9.0        | f826e99803d1b51dccbbbed1ef857ba48a2bbefe

<details><summary>
<b>Older releases</b>
</summary><p>

PyPI Version | Commit
------------ | ----------------------------------------
0.8.2        | 93e123c6788e934e6b7b1fd85770371becf1e92e
0.7.2        | b05642fe386ee79e0d88aa083565c9a93428519e

Details for older releases are unavailable.

</p></details>

### Major breaking changes

**NOTE: this is not a comprehensive list.
Lingvo releases do not offer any
guarantees regarding backwards compatibility.**

#### HEAD

Nothing here.

#### 0.12.0

*   **General**
    *   TensorFlow 2.9 is now required.
    *   Python 3.7 support has been removed.
    *   Compatible with (up to) TensorFlow 2.10 and Python 3.10.

#### 0.11.0

*   **General**
    *   TensorFlow 2.7 is now the required version.
    *   Python 3.6 support has been removed.

#### 0.10.0

*   **General**
    *   TensorFlow 2.6 is now the required version.
    *   The `theta_fn` arg to `CreateVariable()` has been removed.

#### 0.9.1

*   **General**
    *   Python 3.9 is now supported.
    *   `ops.beam_search_step` now takes and returns an additional arg
        `beam_done`.
    *   The namedtuple `beam_search_helper.BeamSearchDecodeOutput` no longer
        has the field `done_hyps`.

#### 0.9.0

*   **General**
    *   TensorFlow 2.5 is now the required version.
    *   Python 3.5 support has been removed.
    *   `py_utils.AddGlobalVN` and `py_utils.AddPerStepVN` have been combined
        into `py_utils.AddVN`.
    *   `BaseSchedule().Value()` no longer takes a step arg.
    *   Classes deriving from `BaseSchedule` should implement `Value()`, not
        `FProp()`.
    *   `theta.global_step` has been removed in favor of
        `py_utils.GetGlobalStep()`.
    *   `py_utils.GenerateStepSeedPair()` no longer takes a `global_step` arg.
    *   `PostTrainingStepUpdate()` no longer takes a `global_step` arg.
    *   The `fatal_errors` argument to custom input ops now takes error message
        substrings rather than integer error codes.

<details><summary>
<b>Older releases</b>
</summary><p>

#### 0.8.2

*   **General**
    *   `NestedMap` `Flatten`/`Pack`/`Transform`/`Filter` etc. now expand
        descendent dicts as well.
    *   Subclasses of `BaseLayer` extending from `abc.ABCMeta` should now
        extend `base_layer.ABCLayerMeta` instead.
    *   Trying to call `self.CreateChild` outside of `__init__` now raises an
        error.
    *   `base_layer.initializer` has been removed. Subclasses no longer need to
        decorate their `__init__` function.
    *   Trying to call `self.CreateVariable` outside of `__init__` or
        `_CreateLayerVariables` now raises an error.
    *   It is no longer possible to access `self.vars` or `self.theta` inside
        of `__init__`. Refactor by moving variable creation and access into
        `_CreateLayerVariables`. The variable scope is set automatically
        according to the layer name in `_CreateLayerVariables`.

Details for older releases are unavailable.

</p></details>

## Quick start

### Installation

There are two ways to set up Lingvo: installing a fixed version through pip, or
cloning the repository and building it with bazel. Docker configurations are
provided for each case.

If you just want to use the framework as-is, the easiest option is to install
it through pip. This makes it possible to develop and train custom models using
a frozen version of the Lingvo framework.
However, it is difficult to modify the
framework code or implement new custom ops this way.

If you would like to develop the framework further and potentially contribute
pull requests, you should avoid pip and clone the repository instead.

**pip:**

The [Lingvo pip package](https://pypi.org/project/lingvo) can be installed with
`pip3 install lingvo`.

See the
[codelab](https://colab.research.google.com/github/tensorflow/lingvo/blob/master/codelabs/introduction.ipynb)
for how to get started with the pip package.

**From sources:**

The prerequisites are:

*   a TensorFlow 2.7 [installation](https://www.tensorflow.org/install/),
*   a C++ compiler (only g++ 7.3 is officially supported), and
*   the bazel build system.

Refer to [docker/dev.Dockerfile](docker/dev.Dockerfile) for a working set of
requirements.

`git clone` the repository, then use bazel to build and run targets directly.
The `python -m module` commands in the codelab need to be mapped onto `bazel
run` commands.

**docker:**

Docker configurations are available for both situations.
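As a concrete instance of the `python -m` → `bazel run` mapping mentioned
above, here is the MNIST data-preparation command in both spellings (both
appear verbatim in the MNIST section of this README); the general pattern is
that `.` in the module path becomes `/` and the final component becomes a
`:target`:

```shell
# pip install: run modules directly with the Python interpreter.
python3 -m lingvo.tools.keras2ckpt --dataset=mnist

# bazel checkout: the same module becomes a bazel target.
bazel run -c opt //lingvo/tools:keras2ckpt -- --dataset=mnist
```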
Instructions can be
found in the comments at the top of each file.

*   [lib.dockerfile](docker/lib.dockerfile) has the Lingvo pip package
    preinstalled.
*   [dev.Dockerfile](docker/dev.Dockerfile) can be used to build Lingvo from
    sources.

[How to install docker.](https://docs.docker.com/install/linux/docker-ce/ubuntu/)

### Running the MNIST image model

#### Preparing the input data

**pip:**

```shell
mkdir -p /tmp/mnist
python3 -m lingvo.tools.keras2ckpt --dataset=mnist
```

**bazel:**

```shell
mkdir -p /tmp/mnist
bazel run -c opt //lingvo/tools:keras2ckpt -- --dataset=mnist
```

The following files will be created in `/tmp/mnist`:

*   `mnist.data-00000-of-00001`: 53MB.
*   `mnist.index`: 241 bytes.

#### Running the model

**pip:**

```shell
cd /tmp/mnist
curl -O https://raw.githubusercontent.com/tensorflow/lingvo/master/lingvo/tasks/image/params/mnist.py
python3 -m lingvo.trainer --run_locally=cpu --mode=sync --model=mnist.LeNet5 --logdir=/tmp/mnist/log
```

**bazel:**

```shell
(cpu) bazel build -c opt //lingvo:trainer
(gpu) bazel build -c opt --config=cuda //lingvo:trainer
bazel-bin/lingvo/trainer --run_locally=cpu --mode=sync --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr
```

After about 20 seconds, the loss should drop below 0.3 and a checkpoint will be
saved, as below. Kill the trainer with Ctrl+C.

```
trainer.py:518] step:   205, steps/sec: 11.64 ... loss:0.25747201 ...
checkpointer.py:115] Save checkpoint
checkpointer.py:117] Save checkpoint done: /tmp/mnist/log/train/ckpt-00000205
```

Some artifacts will be produced in `/tmp/mnist/log/control`:

*   `params.txt`: hyper-parameters.
*   `model_analysis.txt`: model sizes for each layer.
*   `train.pbtxt`: the training `tf.GraphDef`.
*   `events.*`: a tensorboard events file.

As well as in `/tmp/mnist/log/train`:

*   `checkpoint`: a text file containing information about the checkpoint files.
*   `ckpt-*`: the checkpoint files.

Now, let's evaluate the model on the "Test" dataset. In the normal training
setup the trainer and evaler should be run at the same time as two separate
processes.

**pip:**

```shell
python3 -m lingvo.trainer --job=evaler_test --run_locally=cpu --mode=sync --model=mnist.LeNet5 --logdir=/tmp/mnist/log
```

**bazel:**

```shell
bazel-bin/lingvo/trainer --job=evaler_test --run_locally=cpu --mode=sync --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr
```

Kill the job with Ctrl+C when it starts waiting for a new checkpoint.

```
base_runner.py:177] No new check point is found: /tmp/mnist/log/train/ckpt-00000205
```

The evaluation accuracy can be found slightly earlier in the logs.

```
base_runner.py:111] eval_test: step:   205, acc5: 0.99775392, accuracy: 0.94150388, ..., loss: 0.20770954, ...
```

### Running the machine translation model

To run a more elaborate model, you'll need a cluster with GPUs.
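As an aside before moving on: the trainer log lines shown in the MNIST run
above are easy to scrape for quick progress checks. A minimal stdlib sketch
(the `parse_step_loss` helper and its regex are mine, not part of Lingvo; the
sample line is taken verbatim from the trainer output above):

```python
import re

# Matches "step:   205, ... loss:0.25747201" in a trainer log line.
LOG_RE = re.compile(r"step:\s*(\d+).*?loss:\s*([0-9.]+)")

def parse_step_loss(line):
    """Returns (step, loss) or None if the line has no step/loss pair."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2))

# Sample line from the trainer output shown above.
sample = "trainer.py:518] step:   205, steps/sec: 11.64 ... loss:0.25747201 ..."
print(parse_step_loss(sample))  # (205, 0.25747201)
```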
Please refer to
[`lingvo/tasks/mt/README.md`](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/README.md)
for more information.

### Running the GShard transformer based giant language model

To train a GShard language model with one trillion parameters on GCP using
Cloud TPU v3-512 with 512-way model parallelism, please refer to
[`lingvo/tasks/lm/README.md`](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/lm/README.md)
for more information.

### Running the 3d object detection model

To run the StarNet model using Cloud TPUs on GCP, please refer to
[`lingvo/tasks/car/README.md`](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/README.md).

## Models

### Automatic Speech Recognition

*   [Listen, Attend and Spell](https://arxiv.org/pdf/1508.01211.pdf).<br/>
    William Chan, Navdeep Jaitly, Quoc V. Le, and Oriol Vinyals. ICASSP 2016.

    [End-to-end Continuous Speech Recognition using Attention-based Recurrent
    NN: First Results](https://arxiv.org/pdf/1412.1602.pdf).<br/>
    Jan Chorowski, Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. arXiv
    2014.

    *   [asr.librispeech.Librispeech960Grapheme](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/asr/params/librispeech.py)
    *   [asr.librispeech.Librispeech960Wpm](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/asr/params/librispeech.py)

### Car

*   [DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection](https://arxiv.org/pdf/2203.08195.pdf).<br/>
    Yingwei Li, Adams Wei Yu, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi
    Peng, Junyang Shen, Bo Wu, Yifeng Lu, Denny Zhou, Quoc V. Le, Alan Yuille,
    and Mingxing Tan.
    CVPR 2022.

    *   [car.waymo_deepfusion.DeepFusionCenterPointPed](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/waymo_deepfusion.py)

*   [StarNet: Targeted Computation for Object Detection in Point Clouds](https://arxiv.org/pdf/1908.11069.pdf).<br/>
    Jiquan Ngiam, Benjamin Caine, Wei Han, Brandon Yang, Yuning Chai, Pei Sun,
    Yin Zhou, Xi Yi, Ouais Alsharif, Patrick Nguyen, Zhifeng Chen, Jonathon
    Shlens, and Vijay Vasudevan. arXiv 2019.

    *   [car.kitti.StarNetCarModel0701](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/kitti.py)
    *   [car.kitti.StarNetPedCycModel0704](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/kitti.py)
    *   [car.waymo.StarNetVehicle](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/waymo.py)
    *   [car.waymo.StarNetPed](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/waymo.py)

### Image

*   [Gradient-based learning applied to document recognition](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf).<br/>
    Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner. IEEE 1998.

    *   [image.mnist.LeNet5](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/image/params/mnist.py)

### Language Modelling

*   [Exploring the Limits of Language Modeling](https://arxiv.org/pdf/1602.02410.pdf).<br/>
    Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui
    Wu.
    arXiv 2016.

    *   [lm.one_billion_wds.WordLevelOneBwdsSimpleSampledSoftmax](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/lm/params/one_billion_wds.py)

*   [GShard: Scaling Giant Models with Conditional Computation and Automatic
    Sharding](https://arxiv.org/pdf/2006.16668.pdf).<br/>
    Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat,
    Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. arXiv 2020.

    *   [lm.synthetic_packed_input.DenseLm1T16x16](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/lm/params/synthetic_packed_input.py)

### Machine Translation

*   [The Best of Both Worlds: Combining Recent Advances in Neural Machine
    Translation](http://aclweb.org/anthology/P18-1008).<br/>
    Mia X. Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey,
    George Foster, Llion Jones, Mike Schuster, Noam Shazeer, Niki Parmar,
    Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Zhifeng Chen, Yonghui Wu,
    and Macduff Hughes. ACL 2018.

    *   [mt.wmt14_en_de.WmtEnDeTransformerBase](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/params/wmt14_en_de.py)
    *   [mt.wmt14_en_de.WmtEnDeRNMT](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/params/wmt14_en_de.py)
    *   [mt.wmtm16_en_de.WmtCaptionEnDeTransformer](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/params/wmtm16_en_de.py)

*   [Self-supervised and Supervised Joint Training for Resource-rich Neural
    Machine Translation](https://arxiv.org/pdf/2106.04060.pdf).<br/>
    Yong Cheng, Wei Wang, Lu Jiang, and Wolfgang Macherey.
    ICML 2021.

    *   [mt.xendec.wmt14_en_de.WmtEnDeXEnDec](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/params/xendec/wmt14_en_de.py)

## References

*   [API Docs](https://tensorflow.github.io/lingvo/)
*   [Codelab](https://colab.research.google.com/github/tensorflow/lingvo/blob/master/codelabs/introduction.ipynb)

Please cite this [paper](https://arxiv.org/abs/1902.08295) when referencing
Lingvo.

```
@misc{shen2019lingvo,
    title={Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling},
    author={Jonathan Shen and Patrick Nguyen and Yonghui Wu and Zhifeng Chen and others},
    year={2019},
    eprint={1902.08295},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

## License

[Apache License 2.0](LICENSE)