{"id":13472574,"url":"https://github.com/NetEase-FuXi/EET","last_synced_at":"2025-03-26T17:30:54.432Z","repository":{"id":47117533,"uuid":"350553712","full_name":"NetEase-FuXi/EET","owner":"NetEase-FuXi","description":"Easy and Efficient Transformer : Scalable Inference Solution For Large NLP model","archived":false,"fork":false,"pushed_at":"2024-11-30T12:48:47.000Z","size":45581,"stargazers_count":261,"open_issues_count":6,"forks_count":46,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-11-30T13:35:00.476Z","etag":null,"topics":["bert","bert-inference-performance","eet","gpt2","gpt2-inference-performance"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NetEase-FuXi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-23T02:20:56.000Z","updated_at":"2024-11-30T12:48:51.000Z","dependencies_parsed_at":"2023-01-23T12:01:05.758Z","dependency_job_id":"1ae39ac7-c6b4-4470-acef-a54d169d44da","html_url":"https://github.com/NetEase-FuXi/EET","commit_stats":{"total_commits":172,"total_committers":8,"mean_commits":21.5,"dds":0.6686046511627908,"last_synced_commit":"02d0eae3181ada62695cdc0e1273b166e4c0b915"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NetEase-FuXi%2FEET","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NetEase-FuXi%2FEET/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NetEase
-FuXi%2FEET/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NetEase-FuXi%2FEET/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NetEase-FuXi","download_url":"https://codeload.github.com/NetEase-FuXi/EET/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245702146,"owners_count":20658552,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","bert-inference-performance","eet","gpt2","gpt2-inference-performance"],"created_at":"2024-07-31T16:00:55.866Z","updated_at":"2025-03-26T17:30:54.425Z","avatar_url":"https://github.com/NetEase-FuXi.png","language":"Python","funding_links":[],"categories":["Python","Transformer库与优化"],"sub_categories":[],"readme":"## Easy and Efficient Transformer \n\u003cdiv align='right' \u003e\u003cfont size=\"1\"\u003e\u003cb\u003e\u003ca href=\"./README_zh.md\"\u003e中文README\u003c/a\u003e\u003c/b\u003e \u003c/font\u003e\u003c/div\u003e\n\n\n\u003cdiv  align=\"center\"\u003e \u003cimg src=\"./doc/image/EETblueLOGO.png\" width = \"600\" height = \"180\" alt=\"EET\" align=center /\u003e\u003c/div\u003e\n\u003c/br\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/NetEase-FuXi/EET/blob/main/LICENSE\"\u003e\n        \u003cimg alt=\"GitHub license\" src=\"./doc/image/license.svg\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/NetEase-FuXi/EET/tree/main/example/python\"\u003e\n        \u003cimg alt=\"GitHub release\" src=\"./doc/image/example.svg\"\u003e\n    \u003c/a\u003e\n    \u003ca 
href=\"https://github.com/NetEase-FuXi/EET/releases\"\u003e\n        \u003cimg alt=\"release\" src=\"./doc/image/release.svg\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\nEET (Easy and Efficient Transformer) is a friendly PyTorch inference plugin focused on Transformer-based models, designed to make mega-size models affordable.\n\n## Features\n\n- **New**🔥: Support for Baichuan, LLaMA and other LLMs.\n- **New**🔥: Support for int8 quantization.\n- Support for mega-size models on a single GPU. \n- Expertise in inference for multi-modal and NLP tasks (CLIP/GPT-3/Bert/Seq2seq etc.).\n- High performance. Transformer-based models run faster thanks to CUDA kernel optimization and quantization/sparsity algorithms. \n- Out of the box for Transformers and Fairseq. Skip the trivial configuration and get your model working within a few lines.\n----\n\n- [Easy and Efficient Transformer](#easy-and-efficient-transformer)\n- [Features](#features)\n- [Model Matrix](#model-matrix)\n- [Quick Start](#quick-start)\n  - [Environment](#environment)\n  - [Installation](#installation)\n    - [From Source](#from-source)\n    - [From Docker](#from-docker)\n  - [Run](#run)\n    - [Operators APIs](#operators-apis)\n    - [Model APIs](#model-apis)\n    - [Application APIs](#application-apis)\n- [Performance](#performance)\n- [Cite Us](#cite-us)\n- [Video](#video)\n- [Contact us](#contact-us)\n\n\n## Model Matrix\n\n\u003ctable\u003e\n    \u003ctr\u003e\n        \u003cth bgcolor=\"#a9a9a9\" style=\"text-align: center\"\u003e\u003cfont color=\"#00008b\"\u003emodel type\u003c/font\u003e\u003c/th\u003e\n        \u003cth bgcolor=\"#a9a9a9\"\u003e\u003cfont color=\"#00008b\"\u003eTransformers\u003c/font\u003e\u003c/th\u003e\n        \u003cth bgcolor=\"#a9a9a9\"\u003e\u003cfont color=\"#00008b\"\u003eFairseq\u003c/font\u003e\u003c/th\u003e\n        \u003cth bgcolor=\"#a9a9a9\"\u003e\u003cfont color=\"#00008b\"\u003eQuantization\u003c/font\u003e\u003c/th\u003e\n        \u003cth bgcolor=\"#a9a9a9\"\u003e\u003cfont 
color=\"#00008b\"\u003eSpeedUp\u003c/font\u003e\u003c/th\u003e\n        \u003cth bgcolor=\"#a9a9a9\"\u003e\u003cfont color=\"#00008b\"\u003eSince version\u003c/font\u003e\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eGPT-3\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e2~8x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e0.0.1 beta\u003c/font\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eBert\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e1~5x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e0.0.1 beta\u003c/font\u003e\u003c/td\u003e \n    \u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eALBert\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e1~5x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e0.0.1 beta\u003c/font\u003e\u003c/td\u003e \n    
\u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eRoberta\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e1~5x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e0.0.1 beta\u003c/font\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eT5\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e4~8x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e1.0\u003c/font\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n     \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eViT\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e1~5x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e1.0\u003c/font\u003e\u003c/td\u003e \n    \u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eCLIP(GPT+ViT)\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd 
style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e2~4x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e1.0\u003c/font\u003e\u003c/td\u003e \n    \u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eDistillbert\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e1~2x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e1.0\u003c/font\u003e\u003c/td\u003e \n    \u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eBaichuan\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#dc143c\"\u003e1~2x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e2.0\u003c/font\u003e\u003c/td\u003e \n    \u003c/tr\u003e\n    \u003ctr\u003e\n       \u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#1e90ff\"\u003eLLaMA\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003eX\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u0026#x2705;\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont 
color=\"#dc143c\"\u003e1~2x\u003c/font\u003e\u003c/td\u003e\u003ctd style=\"text-align: center\"\u003e\u003cfont color=\"#deb887\"\u003e2.0\u003c/font\u003e\u003c/td\u003e \n    \u003c/tr\u003e\n\u003c/table\u003e\n\n\n## Quick Start\n\n### Environment\n\n* cuda:\u003e=11.4 \n* python:\u003e=3.7 \n* gcc:\u003e=7.4.0 \n* torch:\u003e=1.12.0 \n* numpy:\u003e=1.19.1 \n* fairseq:==0.10.0\n* transformers:\u003e=4.31.0\n\nThe environment above is the minimum configuration; newer versions are recommended.\n\n### Installation\n\nWe recommend using the Docker images.\n\n#### From Source\nIf you are installing from source, you first need to set up the required [environment](#environment). Then proceed as follows: \n\n```bash\n$ git clone https://github.com/NetEase-FuXi/EET.git\n$ cd EET\n$ pip install .\n```\nWe recommend images from the nvcr.io/nvidia/pytorch series, such as nvcr.io/nvidia/pytorch:23.04-py3; you can also use the provided Dockerfile.\n\n#### From Docker\n\n```bash\n$ git clone https://github.com/NetEase-FuXi/EET.git\n$ cd EET\n$ docker build -t eet_docker:0.1 .\n$ nvidia-docker run -it --net=host -v /your/project/directory/:/root/workspace  eet_docker:0.1 bash\n```\nEET and its required environment are already installed in the Docker image.\n\n### Run\n\nWe provide three types of APIs:\n- **Operators APIs**, such as embedding, masked-multi-head-attention and ffn, which let you define your own custom models.\n- **Model APIs**, such as TransformerDecoder and BertEncoder, which let you integrate EET into your PyTorch project.\n- **Application APIs**, such as Transformers Pipeline, which let you run your model in a few lines.\n\n#### Operators APIs\n\nThe Operators APIs are the binding layer between C++/CUDA and Python. We provide almost all the operators required for Transformer models. 
You can combine different OPs to build other model structures.\n- Operators API table\n    |          operators          |       python API       |                  Remarks                  |\n    | :-------------------------: | :--------------------: | :---------------------------------------: |\n    |    multi_head_attention     |    EETSelfAttention    |              self attention               |\n    | masked_multi_head_attention | EETSelfMaskedAttention |             causal attention              |\n    | cross_multi_head_attention  |   EETCrossAttention    |              cross attention              |\n    |             ffn             |     EETFeedforward     |           feed forward network            |\n    |          embedding          |    EETBertEmbedding    |  corresponds to Fairseq and Transformers  |\n    |          layernorm          |      EETLayerNorm      |           same as nn.LayerNorm            |\n\n- How to use\n\n    These OPs are defined in the file [EET/csrc/py11/eet2py.cpp](./csrc/py11/eet2py.cpp), and\n    usage examples are shown in the files under [python/eet](./python/eet), which demonstrate how to combine these OPs into classic models.\n\n#### Model APIs\n\nAs a plugin, EET provides friendly model APIs ([python/eet](./python/eet)) that integrate into Fairseq and Transformers. \n\nAll you need to do is find the corresponding class in the tables below (usually prefixed with 'EET') and initialize an object with the from_torch or from_pretrained function. 
\n\nNote: We now only support **pre-padding** for GPT-3.\n    \n\u003cb\u003eEET and fairseq class comparison table :\u003c/b\u003e\n\n|             EET             |             fairseq              |               Remarks               | \n|:---------------------------:|:--------------------------------:|:-----------------------------------:| \n|    EETTransformerDecoder    |        TransformerDecoder        |                                     |\n| EETTransformerDecoderLayer  |     TransformerDecoderLayer      |                                     |\n|   EETTransformerAttention   |        MultiheadAttention        |                                     |\n|  EETTransformerFeedforward  |     TransformerDecoderLayer      | fusion of multiple small operators  |\n|   EETTransformerEmbedding   | Embedding + PositionalEmbedding  |                                     |\n|   EETTransformerLayerNorm   |           nn.LayerNorm           |                                     |\n\n\n\u003cb\u003eEET and Transformers class comparison table : \u003c/b\u003e\n\n|         EET          |          transformers          |             Remarks             | \n|:--------------------:|:------------------------------:|:-------------------------------:| \n|     EETBertModel     |           BertModel            |                                 |\n|   EETBertEmbedding   |         BertEmbeddings         |                                 |\n|     EETGPT2Model     |           GPT2Model            |                                 |\n|    EETGPT2Decoder    |           GPT2Model            | Transformers has no GPT2Decoder |\n| EETGPT2DecoderLayer  |             Block              |                                 |\n|   EETGPT2Attention   |           Attention            |                                 |\n|  EETGPT2Feedforward  |              MLP               |                                 |\n|   EETGPT2Embedding   |          nn.Embedding          |                                 |\n| 
    EETLayerNorm     |          nn.LayerNorm          |                                 |\n\nIn addition to the basic model types above, we have added task-specific APIs to support different tasks. The table below lists some of our task-specific model APIs:\n\n|                EET                |          transformers          | Remarks | \n|:---------------------------------:|:------------------------------:|:----:| \n|       EETBertForPreTraining       |       BertForPreTraining       |      |\n|        EETBertLMHeadModel         |        BertLMHeadModel         |      |\n|        EETBertForMaskedLM         |        BertForMaskedLM         |      |\n| EETBertForNextSentencePrediction  | BertForNextSentencePrediction  |      |\n| EETBertForSequenceClassification  | BertForSequenceClassification  |      |\n|     EETBertForMultipleChoice      |     BertForMultipleChoice      |      |\n|   EETBertForTokenClassification   |   BertForTokenClassification   |      |\n|    EETBertForQuestionAnswering    |    BertForQuestionAnswering    |      |\n\n- How to use\n\nThis code snippet shows how to use the model APIs:\n\n\u003cdiv  align=\"left\"\u003e \u003cimg src=\"./doc/image/use_bert.png\" width = \"850\" height = \"325\" alt=\"useofbert\"/\u003e\u003c/div\u003e\n\nYou can build your application directly with the task-specific APIs.\nHere is a fill-mask example:\n\n```python\nimport torch\nfrom eet import EETRobertaForMaskedLM\nfrom transformers import RobertaTokenizer\n# batch size and precision (same values as the pipeline example in Application APIs)\nmax_batch_size = 1\ndata_type = torch.float16\ninput = [\"My \u003cmask\u003e is Sarah and I live in London\"]\ntokenizer = RobertaTokenizer.from_pretrained('roberta-base')\neet_roberta_model = EETRobertaForMaskedLM.from_pretrained('roberta-base',max_batch = max_batch_size,data_type = data_type)\n# first step: tokenize\nmodel_inputs = tokenizer(input,return_tensors = 'pt')\nmasked_index = torch.nonzero(model_inputs['input_ids'][0] == tokenizer.mask_token_id, as_tuple=False).squeeze(-1)\n# second step: predict\nprediction_scores = 
eet_roberta_model(model_inputs['input_ids'].cuda(),attention_mask = model_inputs['attention_mask'])\n# third step: argmax\npredicted_index = torch.argmax(prediction_scores.logits[0, masked_index]).item()\npredicted_token = tokenizer.convert_ids_to_tokens(predicted_index)\n```\n\nFor more examples, please refer to [example/python/models](example/python/models/).\n\n#### Application APIs\n\nEET provides ready-made pipelines that simplify building applications for different tasks, without using the model APIs above.\n\nHere is an example:\n\n```python\nimport torch\nfrom eet import pipeline\nmax_batch_size = 1\nmodel_path = 'roberta-base'\ndata_type = torch.float16\ninput = [\"My \u003cmask\u003e is Sarah and I live in London\"]\nnlp = pipeline(\"fill-mask\",model = model_path,data_type = data_type,max_batch_size = max_batch_size)\nout = nlp(input)\n```\n\nThe following tasks are currently supported:\n\n| Task                 | Since version | \n|:-------------------------------|:---:|\n| text-classification            | 1.0 |\n| token-classification           | 1.0 | \n| question-answering             | 1.0 | \n| fill-mask                      | 1.0 |\n| text-generation                | 1.0 |\n| image-classification           | 1.0 |\n| zero_shot_image_classification | 1.0 |\n\nFor more examples, please refer to [example/python/pipelines](./example/python/pipelines).\n\n\n## Performance\n\nDetailed performance data for GPT-3 and Bert model inference is available in the [benchmark document](https://github.com/NetEase-FuXi/EET/blob/main/doc/benchmark.md).\n* GPT-3 on A100\n\n\u003cdiv  align=\"left\"\u003e \u003cimg src=\"./doc/image/a100_prompt.png\" width = \"700\" height = \"387\" alt=\"a100_prompt\"/\u003e\u003c/div\u003e\n\n* Bert on 2080ti\n\u003cdiv  align=\"left\"\u003e \u003cimg src=\"./doc/image/bert_ft.png\" width = \"700\" height = \"386\" alt=\"bert_ft\"/\u003e\u003c/div\u003e\n\n* Llama13B on 3090\n\u003cdiv  align=\"left\"\u003e \u003cimg src=\"./doc/image/llama13B_tps.png\" 
width = \"700\" height = \"386\" alt=\"llama13B_tps\"/\u003e\u003c/div\u003e\n\n## Cite Us\n\nIf you use EET in your research, please cite the following paper.\n\n```\n@misc{https://doi.org/10.48550/arxiv.2104.12470,\n  doi = {10.48550/ARXIV.2104.12470},\n  url = {https://arxiv.org/abs/2104.12470},\n  author = {Li, Gongzheng and Xi, Yadong and Ding, Jingzhen and Wang, Duan and Liu, Bai and Fan, Changjie and Mao, Xiaoxi and Zhao, Zeng},\n  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},\n  title = {Easy and Efficient Transformer : Scalable Inference Solution For large NLP model},\n}\n```\n\n## Video\nWe gave a talk on ZhiYuan LIVE, link: https://event.baai.ac.cn/activities/325.\n\n## Contact us\nYou can report problems through GitHub issues. \n\nYou can also contact us by email:\n\nzhaosida@corp.netease.com, zhuangzhong@corp.netease.com, hzzhaozeng@corp.netease.com\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNetEase-FuXi%2FEET","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FNetEase-FuXi%2FEET","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FNetEase-FuXi%2FEET/lists"}