# Keras BERT TPU

[![Travis](https://travis-ci.org/HighCWu/keras-bert-tpu.svg?branch=master)](https://travis-ci.org/HighCWu/keras-bert-tpu)
[![Coverage](https://coveralls.io/repos/github/HighCWu/keras-bert-tpu/badge.svg?branch=master)](https://coveralls.io/github/HighCWu/keras-bert-tpu)

This is a fork of [CyberZHG/keras_bert](https://github.com/CyberZHG/keras-bert) that adds TPU support to Keras BERT.

Implementation of [BERT](https://arxiv.org/pdf/1810.04805.pdf). Official pre-trained models can be loaded for feature extraction and prediction.
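On a TPU runtime (for example in Colab, as in the demo below), the usual TF 1.x-era pattern is to convert a built Keras model with `tf.contrib.tpu`. The sketch below is a minimal illustration of that conversion, not necessarily how this fork wires things up internally; it assumes a TensorFlow 1.x build with `tf.contrib`, a `tf.keras`-compatible model named `model` (e.g. the one built in "Train & Use" below), and a Colab TPU whose address is exposed in the `COLAB_TPU_ADDR` environment variable:

```python
import os
import tensorflow as tf

# Resolve the Colab TPU from its gRPC address
resolver = tf.contrib.cluster_resolver.TPUClusterResolver(
    tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
strategy = tf.contrib.tpu.TPUDistributionStrategy(resolver)

# Convert a regular Keras model into a TPU-compatible one
tpu_model = tf.contrib.tpu.keras_to_tpu_model(model, strategy=strategy)
```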
## Colab Demo

[HighCWu/keras-bert-tpu](https://colab.research.google.com/github/HighCWu/keras-bert-tpu/blob/master/demo/load_model/load_and_predict.ipynb)

## Install

```bash
pip install keras-bert-tpu
```

## Usage

### Load Official Pre-trained Models

In the [feature extraction demo](./demo/load_model/load_and_extract.py), you should get the same extraction results as the official model. In the [prediction demo](./demo/load_model/load_and_predict.py), the missing word in a sentence can be predicted.
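Loading an official checkpoint looks roughly like the sketch below, assuming this fork keeps keras-bert's `load_trained_model_from_checkpoint` loader and that you have already downloaded an official Google BERT checkpoint (the paths are placeholders):

```python
from keras_bert import load_trained_model_from_checkpoint

# Placeholder paths to a downloaded official checkpoint
config_path = 'uncased_L-12_H-768_A-12/bert_config.json'
checkpoint_path = 'uncased_L-12_H-768_A-12/bert_model.ckpt'

# Build the model and load the pre-trained weights
model = load_trained_model_from_checkpoint(config_path, checkpoint_path)
model.summary(line_length=120)
```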

### Train & Use

```python
import keras
from keras_bert import get_base_dict, get_model, gen_batch_inputs

# A toy input example
sentence_pairs = [
    [['all', 'work', 'and', 'no', 'play'], ['makes', 'jack', 'a', 'dull', 'boy']],
    [['from', 'the', 'day', 'forth'], ['my', 'arm', 'changed']],
    [['and', 'a', 'voice', 'echoed'], ['power', 'give', 'me', 'more', 'power']],
]

# Build the token dictionary
token_dict = get_base_dict()  # A dict that contains some special tokens
for pairs in sentence_pairs:
    for token in pairs[0] + pairs[1]:
        if token not in token_dict:
            token_dict[token] = len(token_dict)
token_list = list(token_dict.keys())  # Used for selecting a random word

# Build & train the model
model = get_model(
    token_num=len(token_dict),
    head_num=5,
    transformer_num=12,
    embed_dim=25,
    feed_forward_dim=100,
    seq_len=20,
    pos_num=20,
    dropout_rate=0.05,
)
model.summary()


def _generator():
    while True:
        yield gen_batch_inputs(
            sentence_pairs,
            token_dict,
            token_list,
            seq_len=20,
            mask_rate=0.3,
            swap_sentence_rate=1.0,
        )


model.fit_generator(
    generator=_generator(),
    steps_per_epoch=1000,
    epochs=100,
    validation_data=_generator(),
    validation_steps=100,
    callbacks=[
        keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
    ],
)

# Use the trained model
inputs, output_layer = get_model(
    token_num=len(token_dict),
    head_num=5,
    transformer_num=12,
    embed_dim=25,
    feed_forward_dim=100,
    seq_len=20,
    pos_num=20,
    dropout_rate=0.05,
    training=False,  # The input layers and the output layer are returned when `training` is `False`
)  # `output_layer` is the last feature extraction layer (the last transformer)
```
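To actually extract features from the non-training model, wrap the returned tensors in a plain Keras model. A minimal sketch, continuing from the snippet above and assuming the non-training model takes token-id and segment-id inputs as in keras-bert (the zero arrays are illustrative placeholders, not real tokenized text):

```python
import numpy as np
import keras

# Wrap the returned input tensors and the last transformer output in a model
feature_model = keras.models.Model(inputs=inputs, outputs=output_layer)

# Placeholder inputs: one all-padding sequence of length 20
token_input = np.zeros((1, 20))    # token ids
segment_input = np.zeros((1, 20))  # segment ids (0 = first sentence)

features = feature_model.predict([token_input, segment_input])
print(features.shape)  # (1, 20, 25): one 25-dim vector per position
```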

### Custom Feature Extraction

```python
import keras
from keras_bert import get_model


def _custom_layers(x, trainable=True):
    return keras.layers.LSTM(
        units=768,
        trainable=trainable,
        name='LSTM',
    )(x)


model = get_model(
    token_num=200,
    embed_dim=768,
    custom_layers=_custom_layers,
)
```
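The callback passed as `custom_layers` receives the embedding output and returns the feature tensor, standing in for the default multi-head Transformer blocks; in this example the LSTM's `units` matches `embed_dim` so the feature width stays consistent.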