Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
bert-in-production
A collection of resources on using BERT (https://arxiv.org/abs/1810.04805) and related Language Models in production environments.
https://github.com/DomHudson/bert-in-production
Descriptive Resources
- BERT to the rescue!
- BERT Technology introduced in 3-minutes
- BERT, RoBERTa, DistilBERT, XLNet — which one to use?
- Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
- The Illustrated Transformer
- Sequence-to-Sequence Modeling with `nn.Transformer` and `torchtext`
- Exploring BERT's Vocabulary
- Pre-training BERT from scratch with cloud TPU
- The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
Implementations
- pytorch/fairseq
- google-research/google-research
- hanxiao/bert-as-service
- microsoft/onnxruntime - Open-sourced by Microsoft; it contains several model-specific optimisations, including one for transformer models. A model's architecture is compiled into the Open Neural Network Exchange (ONNX) standard and optionally optimised for a specific platform's hardware (see the export-and-serve sketch after this list).
- google-research/bert
- huggingface/transformers - State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch. The transformers library is focused on using publicly available pretrained models and has wide support for many of the most popular varieties (see the usage sketch after this list).
- huggingface/tokenizers - State-of-the-Art Tokenizers optimized for Research and Production.
- spacy-transformers
- codertimo/BERT-pytorch
- CyberZHG/keras-bert
- kaushaltrivedi/fast-bert
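To make the huggingface/transformers entry above concrete, here is a minimal sketch of loading a pretrained BERT and pooling its hidden states into sentence vectors. The checkpoint name (`bert-base-uncased`) and the mean-pooling step are illustrative assumptions, not recommendations from the list, and the attribute-style outputs assume a recent transformers release.

```python
# Minimal sketch: encode sentences with a pretrained BERT via huggingface/transformers.
# The checkpoint name and the mean-pooling step are illustrative choices, not prescriptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer(["BERT in production"], padding=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the final hidden states over non-padding tokens to get one vector per sentence.
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embeddings.shape)  # torch.Size([1, 768])
```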
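Likewise, the microsoft/onnxruntime entry describes compiling a model into the ONNX standard and serving it on optimised hardware. A hedged sketch of that round trip with `torch.onnx.export` and an `onnxruntime.InferenceSession` follows; the file name, opset version, axis names and CPU execution provider are assumptions for illustration, not the only (or necessarily best) configuration.

```python
# Sketch: export a BERT classifier to ONNX, then run it with onnxruntime.
# File name, opset version and dynamic axis names are illustrative assumptions.
import torch
import onnxruntime
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# return_dict=False makes the traced model emit plain tensors, which simplifies export.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", return_dict=False)
model.eval()

dummy = tokenizer("example input", return_tensors="pt")
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "bert-classifier.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "logits": {0: "batch"},
    },
    opset_version=14,
)

# Inference through onnxruntime on the exported graph.
session = onnxruntime.InferenceSession(
    "bert-classifier.onnx", providers=["CPUExecutionProvider"]
)
feeds = tokenizer("BERT in production", return_tensors="np")
logits = session.run(
    ["logits"],
    {"input_ids": feeds["input_ids"], "attention_mask": feeds["attention_mask"]},
)[0]
print(logits.shape)
```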
Deep Analysis
General Resources
Speed
Compression
- Extreme Language Model Compression with Optimal Subwords and Shared Projections
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
- Compressing BERT for faster prediction
- Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
- PoWER-BERT: Accelerating BERT inference for Classification Tasks - We develop a scheme, called PoWER-BERT, for improving the inference time of the BERT model without significant loss in accuracy. The method works by eliminating word-vectors (intermediate vector outputs) from the encoder pipeline. We design a strategy for measuring the significance of the word-vectors based on the self-attention mechanism of the encoders, which helps us identify the word-vectors to be eliminated. Experimental evaluation on the standard GLUE benchmark shows that PoWER-BERT achieves up to a 4.5x reduction in inference time over BERT with < 1% loss in accuracy. We show that, compared to prior inference-time reduction methods, PoWER-BERT offers a better trade-off between accuracy and inference time. Lastly, we demonstrate that our scheme can also be used in conjunction with ALBERT (a highly compressed version of BERT) and can attain up to a 6.8x reduction in inference time with < 1% loss in accuracy (see the toy scoring sketch after this list).
- Q8BERT: Quantized 8Bit BERT - Pre-trained Transformer-based language models such as BERT and GPT have shown great improvement in many Natural Language Processing (NLP) tasks. However, these models contain a large number of parameters. The emergence of even larger and more accurate models such as GPT2 and Megatron suggests a trend of large pre-trained Transformer models. However, using these large models in production environments is a complex task requiring a large amount of compute, memory and power resources. In this work we show how to perform quantization-aware training during the fine-tuning phase of BERT in order to compress BERT by 4× with minimal accuracy loss. Furthermore, the produced quantized model can accelerate inference speed if it is optimized for 8-bit integer supporting hardware (see the simpler post-training quantization sketch after this list).
- Small and Practical BERT Models for Sequence Labeling - A practical scheme for training a single multilingual sequence-labeling model that is small and fast enough to run on a single CPU while remaining competitive with a state-of-the-art multilingual baseline.
- TinyBERT: Distilling BERT for Natural Language Understanding - Language model pre-training, such as BERT, has significantly improved the performance of many natural language processing tasks. However, pre-trained language models are usually computationally expensive and memory intensive, so it is difficult to execute them effectively on resource-restricted devices. To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel transformer distillation method, a knowledge distillation (KD) method specially designed for transformer-based models. By leveraging this new KD method, the abundant knowledge encoded in a large teacher BERT can be transferred well to a small student TinyBERT. Moreover, we introduce a new two-stage learning framework for TinyBERT, which performs transformer distillation at both the pre-training and task-specific learning stages. This framework ensures that TinyBERT can capture both the general-domain and task-specific knowledge of the teacher BERT. TinyBERT is empirically effective and achieves more than 96% of the performance of its teacher BERT-Base on the GLUE benchmark while being 7.5x smaller and 9.4x faster at inference. TinyBERT is also significantly better than state-of-the-art baselines for BERT distillation, with only about 28% of their parameters and about 31% of their inference time (see the generic distillation-loss sketch after this list).
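To illustrate the idea behind PoWER-BERT, the toy sketch below scores word-vectors by the attention they receive and keeps only the top-k between encoder layers. It is a rough reading of the scoring step only, with made-up tensor shapes and a hypothetical `keep` budget; it is not the paper's training scheme or published implementation.

```python
# Rough illustration of the PoWER-BERT idea: score word-vectors by the attention
# they receive and drop the least significant ones between encoder layers.
# Toy re-implementation of the scoring/pruning step only; not the published method.
import torch

def significance_scores(attention: torch.Tensor) -> torch.Tensor:
    """attention: (batch, heads, seq_len, seq_len) softmax-normalised weights.

    A token's significance here is the total attention mass it receives from all
    query positions, summed over heads (one plausible reading of the paper's metric).
    """
    return attention.sum(dim=1).sum(dim=1)  # -> (batch, seq_len)

def prune_word_vectors(hidden: torch.Tensor, attention: torch.Tensor, keep: int) -> torch.Tensor:
    """hidden: (batch, seq_len, dim). Keep the `keep` most significant word-vectors."""
    scores = significance_scores(attention)
    topk = scores.topk(keep, dim=-1).indices.sort(dim=-1).values  # preserve token order
    batch_idx = torch.arange(hidden.size(0)).unsqueeze(-1)
    return hidden[batch_idx, topk]

# Toy usage with random tensors standing in for one encoder layer's outputs.
hidden = torch.randn(2, 128, 768)
attention = torch.softmax(torch.randn(2, 12, 128, 128), dim=-1)
print(prune_word_vectors(hidden, attention, keep=64).shape)  # torch.Size([2, 64, 768])
```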
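Q-BERT and Q8BERT both rely on quantization; Q8BERT specifically uses quantization-aware training during fine-tuning. As a simpler, related illustration (explicitly not the papers' methods), stock PyTorch post-training dynamic quantization can shrink a BERT's linear layers to int8; the checkpoint name and on-disk size check below are illustrative assumptions.

```python
# Simpler, related technique shown for illustration: post-training dynamic quantization
# of a BERT classifier's Linear layers to int8. This is NOT the quantization-aware
# training described in Q8BERT; it is a lighter-weight alternative.
import os
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

def size_on_disk_mb(m: torch.nn.Module) -> float:
    """Serialise the state dict to measure an approximate on-disk footprint."""
    torch.save(m.state_dict(), "tmp.pt")
    mb = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return mb

print(f"fp32: {size_on_disk_mb(model):.0f} MB, int8: {size_on_disk_mb(quantized):.0f} MB")
```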
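TinyBERT builds on knowledge distillation from a teacher BERT to a small student. The sketch below shows only the generic Hinton-style soft-label loss to make the teacher/student idea concrete; TinyBERT's actual transformer distillation additionally matches embeddings, hidden states and attention maps layer by layer, which is omitted here, and the temperature and weighting values are arbitrary examples.

```python
# Generic soft-label distillation loss, shown only to make the teacher/student idea
# concrete; not TinyBERT's layer-wise transformer distillation.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend a softened KL term against the teacher with the usual hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy check with random logits for a 3-class task.
student = torch.randn(4, 3, requires_grad=True)
teacher = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
print(distillation_loss(student, teacher, labels))
```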
Knowledge Distillation
Other Resources
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
- Deploying BERT in production
- Serving Google BERT in Production using Tensorflow and ZeroMQ
- Pruning BERT to accelerate inference
- Improving Neural Machine Translation with Parent-Scaled Self-Attention
- Reducing Transformer Depth on Demand with Structured Dropout
Keywords
bert (7), language-model (5), pytorch (5), nlp (5), natural-language-processing (4), deep-learning (3), machine-learning (3), tensorflow (3), natural-language-understanding (3), openai (2), transformers (2), google (2), onnx (2), transformer (2), pretrained-models (1), nlp-library (1), model-hub (1), python (1), pytorch-transformers (1), language-models (1), jax (1), flax (1), scikit-learn (1), neural-networks (1), hardware-acceleration (1), ai-framework (1), sentence2vec (1), sentence-encoding (1), neural-search (1), multi-modality (1), image2vec (1), cross-modality (1), cross-modal-retrieval (1), clip-model (1), clip-as-service (1), bert-as-service (1), fastai (1), fast-bert (1), keras (1), xlnet (1), transfer-learning (1), spacy-pipeline (1), spacy-extension (1), spacy (1), pytorch-model (1), huggingface (1), gpt-2 (1), gpt (1), speech-recognition (1), seq2seq (1)