{"id":13609857,"url":"https://github.com/pytorch/translate","last_synced_at":"2025-09-30T05:31:21.745Z","repository":{"id":43305559,"uuid":"130885307","full_name":"pytorch/translate","owner":"pytorch","description":"Translate - a PyTorch Language Library","archived":true,"fork":false,"pushed_at":"2023-04-27T20:56:00.000Z","size":2546,"stargazers_count":830,"open_issues_count":28,"forks_count":190,"subscribers_count":43,"default_branch":"master","last_synced_at":"2025-01-16T10:42:15.381Z","etag":null,"topics":["artificial-intelligence","machine-learning","onnx","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pytorch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-04-24T16:44:04.000Z","updated_at":"2025-01-15T14:50:42.000Z","dependencies_parsed_at":"2024-08-01T19:43:36.803Z","dependency_job_id":"88477a1b-16c8-4a2e-80ac-6e3879befa75","html_url":"https://github.com/pytorch/translate","commit_stats":{"total_commits":810,"total_committers":92,"mean_commits":8.804347826086957,"dds":0.8209876543209876,"last_synced_commit":"b89dc35abeb7fe516e3b95ccacdedfc1a92e5626"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pytorch%2Ftranslate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pytorch%2Ftranslate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pytorch%2Ftranslate/re
leases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pytorch%2Ftranslate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pytorch","download_url":"https://codeload.github.com/pytorch/translate/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234707372,"owners_count":18874671,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","machine-learning","onnx","pytorch"],"created_at":"2024-08-01T19:01:38.718Z","updated_at":"2025-09-30T05:31:16.412Z","avatar_url":"https://github.com/pytorch.png","language":"Python","readme":"***\n**NOTE**\n\nPyTorch Translate is now deprecated, please use [fairseq](https://github.com/pytorch/fairseq) instead.\n\n***\n\n\n# Translate - a PyTorch Language Library\n\nTranslate is a library for machine translation written in PyTorch. It provides training for sequence-to-sequence models. Translate relies on [fairseq](https://github.com/pytorch/fairseq), a general sequence-to-sequence library, which means that models implemented in both Translate and Fairseq can be trained. Translate also provides the ability to export some models to Caffe2 graphs via [ONNX](https://onnx.ai/) and to load and run these models from C++ for production purposes. Currently, we export components (encoder, decoder) to Caffe2 separately and beam search is implemented in C++. In the near future, we will be able to export the beam search as well. 
We also plan to add export support to more models.\n\n## Quickstart\n\nIf you are just interested in training/evaluating MT models, and not in exporting the models to Caffe2 via ONNX, you can install Translate for Python 3 by following these few steps:\n\n1. [Install pytorch](https://pytorch.org/)\n2. [Install fairseq](https://github.com/pytorch/fairseq#requirements-and-installation)\n3. Clone this repository: `git clone https://github.com/pytorch/translate.git pytorch-translate \u0026\u0026 cd pytorch-translate`\n4. Run `python setup.py install`\n\nProvided you have CUDA installed, you should be good to go.\n\n## Requirements and Full Installation\n\n### Translate Requires:\n\n* A Linux operating system with a CUDA-compatible card\n* GNU C++ compiler version 4.9.2 or above\n* A [CUDA installation](https://docs.nvidia.com/cuda/). We recommend CUDA 8.0 or CUDA 9.0.\n\n### Use Our Docker Image:\nInstall [Docker](https://docs.docker.com/install/) and\n[nvidia-docker](https://github.com/NVIDIA/nvidia-docker), then run\n\n```\nsudo docker pull pytorch/translate\nsudo nvidia-docker run -i -t --rm pytorch/translate /bin/bash\n. ~/miniconda/bin/activate\ncd ~/translate\n```\n\nYou should now be able to run the sample commands in the\n[Usage Examples](#usage-examples) section below. You can also see the available\nimage versions under https://hub.docker.com/r/pytorch/translate/tags/.\n\n### Install Translate from Source:\nThese instructions were mainly tested on Ubuntu 16.04.5 LTS (Xenial Xerus) with a Tesla M60 card\nand a CUDA 9 installation. 
We highly encourage you to [report an issue](https://github.com/pytorch/translate/issues)\nif you are unable to install this project for your specific configuration.\n\n- If you don't already have an existing [Anaconda](https://www.anaconda.com/download/)\nenvironment with Python 3.6, you can install one via [Miniconda3](https://conda.io/miniconda.html):\n\n  ```\n  wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh\n  chmod +x miniconda.sh\n  ./miniconda.sh -b -p ~/miniconda\n  rm miniconda.sh\n  . ~/miniconda/bin/activate\n  ```\n\n- Clone the Translate repo:\n\n  ```\n  git clone https://github.com/pytorch/translate.git\n  pushd translate\n  ```\n\n- Install the [PyTorch](https://pytorch.org/) conda package:\n\n  ```\n  # Set to 8 or 9 depending on your CUDA version.\n  TMP_CUDA_VERSION=\"9\"\n\n  # Uninstall previous versions of PyTorch. Doing this twice is intentional.\n  # Error messages about torch not being installed are benign.\n  pip uninstall -y torch\n  pip uninstall -y torch\n\n  # This may not be necessary if you already have the latest cuDNN library.\n  conda install -y cudnn\n\n  # Add LAPACK support for the GPU.\n  conda install -y -c pytorch \"magma-cuda${TMP_CUDA_VERSION}0\"\n\n  # Install the combined PyTorch nightly conda package.\n  conda install pytorch-nightly cudatoolkit=${TMP_CUDA_VERSION}.0 -c pytorch\n\n  # Install NCCL2.\n  wget \"https://s3.amazonaws.com/pytorch/nccl_2.1.15-1%2Bcuda${TMP_CUDA_VERSION}.0_x86_64.txz\"\n  TMP_NCCL_VERSION=\"nccl_2.1.15-1+cuda${TMP_CUDA_VERSION}.0_x86_64\"\n  tar -xvf \"${TMP_NCCL_VERSION}.txz\"\n  rm \"${TMP_NCCL_VERSION}.txz\"\n\n  # Set some environment variables needed to link libraries correctly.\n  export CONDA_PATH=\"$(dirname $(which conda))/..\"\n  export NCCL_ROOT_DIR=\"$(pwd)/${TMP_NCCL_VERSION}\"\n  export LD_LIBRARY_PATH=\"${CONDA_PATH}/lib:${NCCL_ROOT_DIR}/lib:${LD_LIBRARY_PATH}\"\n  ```\n\n- Install [ONNX](https://onnx.ai/):\n\n  ```\n  git clone 
--recursive https://github.com/onnx/onnx.git\n  yes | pip install ./onnx 2\u003e\u00261 | tee ONNX_OUT\n  ```\n\nIf you get a `Protobuf compiler not found` error, you need to install it:\n\n  ```\n  conda install -c anaconda protobuf\n  ```\n\nThen, try to install ONNX again:\n\n  ```\n  yes | pip install ./onnx 2\u003e\u00261 | tee ONNX_OUT\n  ```\n\n- Build Translate:\n\n  ```\n  pip uninstall -y pytorch-translate\n  python3 setup.py build develop\n  ```\n\nNow you should be able to run the example scripts below!\n\n## Usage Examples\n\nNote: the example commands given assume that you are at the root of the cloned\nGitHub repository or that you're in the `translate` directory of the Docker or\nAmazon image. You may also need to make sure you have the Anaconda environment\nactivated.\n\n### Training\n\nWe provide an [example script](https://github.com/pytorch/translate/blob/master/pytorch_translate/examples/train_iwslt14.sh) to train a model for the IWSLT 2014 German-English task. We used this command to obtain [a pretrained model](https://download.pytorch.org/models/translate/iwslt14/model.tar.gz):\n\n```\nbash pytorch_translate/examples/train_iwslt14.sh\n```\n\nThe pretrained model actually contains two checkpoints that correspond to training twice with random initialization of the parameters. This is useful for building ensembles. This dataset is relatively small (~160K sentence pairs), so training will complete in a few hours on a single GPU.\n\n#### Training with tensorboard visualization\n\nWe provide support for visualizing training stats with tensorboard. As a dependency, you will need [tensorboard_logger](https://github.com/TeamHG-Memex/tensorboard_logger) installed.\n\n```\npip install tensorboard_logger\n```\n\nPlease also make sure that [tensorboard](https://github.com/tensorflow/tensorboard) is installed. 
It is also included with the `tensorflow` installation.\n\nYou can use the above [example script](https://github.com/pytorch/translate/blob/master/pytorch_translate/examples/train_iwslt14.sh) to train with tensorboard, but you need to change line 10 from:\n\n```\nCUDA_VISIBLE_DEVICES=0 python3 pytorch_translate/train.py\n```\nto\n\n```\nCUDA_VISIBLE_DEVICES=0 python3 pytorch_translate/train_with_tensorboard.py\n```\nThe event log directory for tensorboard can be specified via the `--tensorboard_dir` option, which defaults to `run-1234`. This directory is appended to your `--save_dir` argument.\n\nFor example, in the above script, you can visualize with:\n\n```\ntensorboard --logdir checkpoints/runs/run-1234\n```\n\nMultiple runs can be compared by specifying different `--tensorboard_dir` values, e.g. `run-1234` and `run-2345`. Then\n\n```\ntensorboard --logdir checkpoints/runs\n```\n\ncan visualize stats from both runs.\n\n### Pretrained Model\n\nA pretrained model for IWSLT 2014 can be evaluated by running the [example script](https://github.com/pytorch/translate/blob/master/pytorch_translate/examples/generate_iwslt14.sh):\n\n```\nbash pytorch_translate/examples/generate_iwslt14.sh\n```\n\nNote the improvement in performance when using an ensemble of size 2 instead of a single model.\n\n### Exporting a Model with ONNX\n\nWe provide an [example script](https://github.com/pytorch/translate/blob/master/pytorch_translate/examples/export_iwslt14.sh) to export a PyTorch model to a Caffe2 graph via ONNX:\n\n```\nbash pytorch_translate/examples/export_iwslt14.sh\n```\n\nThis will output two files, `encoder.pb` and `decoder.pb`, that correspond to the computation of the encoder and one step of the decoder. The example exports a single checkpoint (`--checkpoint model/averaged_checkpoint_best_0.pt`), but it is also possible to export an ensemble (`--checkpoint model/averaged_checkpoint_best_0.pt --checkpoint model/averaged_checkpoint_best_1.pt`). 
Note that during export, you can also control a few hyperparameters such as the beam search size and the word and UNK rewards.\n\n### Using the Model\n\nTo use the sample exported Caffe2 model to translate sentences, run:\n\n```\necho \"hallo welt\" | bash pytorch_translate/examples/translate_iwslt14.sh\n```\n\nNote that the model takes in [BPE](https://github.com/rsennrich/subword-nmt)\ninputs, so some input words need to be split into multiple tokens.\nFor instance, \"hineinstopfen\" is represented as \"hinein@@ stop@@ fen\".\n\n### PyTorch Translate Research\n\nWe welcome you to explore the models we have in the `pytorch_translate/research`\nfolder. If you use them and encounter any errors, please paste logs and a\ncommand that we can use to reproduce the error. Feel free to contribute any\nbugfixes or report your experience, but keep in mind that these models are a\nwork in progress and thus are currently unsupported.\n\n## Join the Translate Community\n\nWe welcome contributions! See the `CONTRIBUTING.md` file for how to help out.\n\n## License\nTranslate is BSD-licensed, as found in the `LICENSE` file.\n","funding_links":[],"categories":["Text Data and NLP","Pytorch \u0026 related libraries","Python"],"sub_categories":["NLP \u0026 Speech Processing:"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpytorch%2Ftranslate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpytorch%2Ftranslate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpytorch%2Ftranslate/lists"}