https://github.com/yong-asial/huggingface

Run huggingFace model locally

docker huggingface nlp pipeline pytorch sentiment-analysis tensorflow transformer

Host: GitHub
URL: https://github.com/yong-asial/huggingface
Owner: yong-asial
Created: 2024-02-19T06:13:41.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-11-29T04:55:56.000Z (7 months ago)
Last Synced: 2025-02-01T22:15:12.034Z (5 months ago)
Topics: docker, huggingface, nlp, pipeline, pytorch, sentiment-analysis, tensorflow, transformer
Language: Python
Homepage:
Size: 76.2 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md

# Sentiment Analysis Application

This application uses a Hugging Face model to perform sentiment analysis on a sentence provided as a command line argument.

## Project Structure

- `apps/`: Contains the Python script for the application.

- `apps/model`: Contains the saved model locally.

- `docker-compose.yml`: Docker Compose configuration file.

- `Dockerfile`: Dockerfile for building the Docker image. The resulting image may take around 9 GB.

- `requirements.txt`: Contains Python dependencies for the application.

  - PyTorch: install `torchvision` and `torchaudio` to use vision and audio models.

  - TensorFlow: install `tensorflow` to use TensorFlow models.

## Setup

### Build

```bash

docker-compose build

```

### Run

```bash

docker-compose up -d

```

### Stop

```bash

docker-compose down

```

## Usage

Open a shell in the container, then run the application with the task name, model name, and sentence as command-line arguments:

```bash

docker exec -it python-server bash

python3 index.py "task_name" "model_name" "sentence"

```
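The contents of `index.py` are not shown in this README. As a rough sketch of what such a script might look like, assuming it simply forwards the three arguments to `pipeline` (the helper names `parse_args` and `run` are hypothetical, not taken from the repository):

```python
import sys

USAGE = 'usage: python3 index.py "task_name" "model_name" "sentence"'


def parse_args(argv):
    """Return (task_name, model_name, sentence) or raise ValueError."""
    if len(argv) != 3:
        raise ValueError(USAGE)
    task_name, model_name, sentence = argv
    return task_name, model_name, sentence


def run(task_name, model_name, sentence):
    """Build a pipeline for the task and model, then apply it to the sentence."""
    # Imported here so that argument errors are reported before the
    # (slow) transformers import happens.
    from transformers import pipeline

    classifier = pipeline(task_name, model=model_name)
    return classifier(sentence)


if __name__ == "__main__":
    try:
        args = parse_args(sys.argv[1:])
    except ValueError as err:
        print(err)
    else:
        print(run(*args))
```

With this sketch, `python3 index.py "sentiment-analysis" "nlptown/bert-base-multilingual-uncased-sentiment" "I like huggingFace."` would print the list of predictions returned by the pipeline.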

## PyTorch vs. TensorFlow vs. Default

A single model (for example, `nlptown/bert-base-multilingual-uncased-sentiment`) can be published with several sets of weights: PyTorch, TensorFlow, JAX, and so on. We choose which weights to load by using the corresponding Tokenizer and Model classes.
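To see which weights a locally saved model directory contains, something like the following sketch can help. It assumes the conventional file names written by `save_pretrained`; the helper `available_frameworks` is hypothetical, and sharded checkpoints (which use other file names) are not covered.

```python
from pathlib import Path

# Conventional weight file names (an assumption of this sketch).
# Note: model.safetensors is usually, but not always, a PyTorch checkpoint.
WEIGHT_FILES = {
    "pytorch": ("model.safetensors", "pytorch_model.bin"),
    "tensorflow": ("tf_model.h5",),
    "flax": ("flax_model.msgpack",),
}


def available_frameworks(model_dir):
    """Return the sorted frameworks whose weight files exist in model_dir."""
    model_dir = Path(model_dir)
    return sorted(
        framework
        for framework, names in WEIGHT_FILES.items()
        if any((model_dir / name).is_file() for name in names)
    )
```

For example, `available_frameworks("./directory")` lists the frameworks found in the save directory used in the snippets below.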

### PyTorch

```python

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# load the PyTorch model from Hugging Face

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

model = AutoModelForSequenceClassification.from_pretrained(model_name)

tokenizer = AutoTokenizer.from_pretrained(model_name)

# save model

save_directory = "./directory"

tokenizer.save_pretrained(save_directory)

model.save_pretrained(save_directory)

# load model from local

tokenizer = AutoTokenizer.from_pretrained(save_directory)

model = AutoModelForSequenceClassification.from_pretrained(save_directory)

# use model

classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

classifier("I like huggingFace.")

```

### TensorFlow

Note: to use the following code, you need to install `tensorflow` in the Docker image.

```python

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline

# load the TensorFlow model from Hugging Face

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

tokenizer = AutoTokenizer.from_pretrained(model_name)

# save tensorflow model

save_directory = "./directory"

tokenizer.save_pretrained(save_directory)

model.save_pretrained(save_directory)

# load model from local

tokenizer = AutoTokenizer.from_pretrained(save_directory)

model = TFAutoModelForSequenceClassification.from_pretrained(save_directory)

# use model

classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

classifier("I like huggingFace.")

```

### Default

Otherwise, we can call the `pipeline` function directly and let it pick the default Tokenizer and Model classes for the given `model_name`.

In that case there is no need to import the Tokenizer and Model classes ourselves.

```python

from transformers import pipeline

# load the default model from Hugging Face

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

classifier = pipeline("sentiment-analysis", model=model_name)

# save model

save_directory = "./directory"

classifier.model.save_pretrained(save_directory)

classifier.tokenizer.save_pretrained(save_directory)

# Load the pipeline with the saved model

classifier = pipeline("sentiment-analysis", model=save_directory)

classifier("I like huggingFace.")

```
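The pipeline returns a list of dictionaries with `label` and `score` keys, and this particular model labels its predictions as star ratings (`"1 star"` to `"5 stars"`). A minimal sketch for turning such a label into a number follows; the helper `star_rating` is hypothetical and the sample values are made up for illustration, not real model output.

```python
def star_rating(prediction):
    """Convert a label such as "4 stars" into the integer 4."""
    return int(prediction["label"].split()[0])


# Illustrative result in the shape returned by the pipeline (made-up values).
sample = [{"label": "5 stars", "score": 0.85}]

print([star_rating(p) for p in sample])  # prints [5]
```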

## Installed Packages

These packages are required for translation models.

```txt

sentencepiece

sacremoses

```
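Before loading a translation model, it can be useful to confirm that both packages are importable inside the container. The following is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# An empty list means both packages are installed.
print(missing_packages(["sentencepiece", "sacremoses"]))
```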

## Use it with JavaScript

If the model repository includes ONNX weights (`model.onnx`), we can use Transformers.js to load the model and run inference.

```javascript

import { pipeline } from '@xenova/transformers';

const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english');

const out = await pipe('I love transformers!');

```

Otherwise, if the model has no ONNX weights, we first need to convert it to ONNX using `onnx-converter`:

```bash

cd onnx-converter

docker-compose build

docker-compose up -d

docker exec -it python-server bash

python convert.py --quantize --model_id google-bert/bert-base-uncased

```

Then copy the converted ONNX model into a local directory (for example `./models/`) and load it from there:

```javascript

import { pipeline, env } from '@xenova/transformers';

env.localModelPath = './models/';

env.allowRemoteModels = false;

const pipe = await pipeline('fill-mask', 'google-bert/bert-base-uncased');

const out = await pipe("Hello I'm a [MASK] model.");

```
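If loading fails, one common cause is a file missing from the local model directory. The sketch below checks for the files Transformers.js is assumed to look for under `env.localModelPath` joined with the model id. The file list is an assumption based on the usual converter output (a quantized model under `onnx/`), and the helper `missing_model_files` is hypothetical; adjust both if your output differs.

```python
from pathlib import Path

# Assumed layout: <localModelPath>/<model_id>/ containing these files.
EXPECTED_FILES = (
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "onnx/model_quantized.onnx",
)


def missing_model_files(local_model_path, model_id):
    """Return the expected files that are absent from the model directory."""
    model_dir = Path(local_model_path) / model_id
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


# An empty list means every expected file is present.
print(missing_model_files("./models", "google-bert/bert-base-uncased"))
```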

## Resources

- [Google Colab](https://colab.research.google.com/drive/1sWXmi8xaBUw6-ZYi3y76ODw50jM1jxJb)

- [Transformers.js](https://huggingface.co/docs/transformers.js/index)
