{"id":19986500,"url":"https://github.com/bentoml/transformers-nlp-service","last_synced_at":"2025-05-04T07:31:25.270Z","repository":{"id":158063656,"uuid":"629306069","full_name":"bentoml/transformers-nlp-service","owner":"bentoml","description":"Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more","archived":false,"fork":false,"pushed_at":"2024-03-16T09:25:55.000Z","size":6605,"stargazers_count":44,"open_issues_count":3,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-25T06:41:07.462Z","etag":null,"topics":["llm","llmops","mlops","model-deployment","model-inference-service","model-serving","nlp","nlp-machine-learning","online-inference","transformer"],"latest_commit_sha":null,"homepage":"https://bentoml.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bentoml.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-18T03:41:46.000Z","updated_at":"2025-03-31T14:07:16.000Z","dependencies_parsed_at":"2024-11-13T04:39:55.048Z","dependency_job_id":null,"html_url":"https://github.com/bentoml/transformers-nlp-service","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bentoml%2Ftransformers-nlp-service","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bentoml%2Ftransformers-nlp-service/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
bentoml%2Ftransformers-nlp-service/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bentoml%2Ftransformers-nlp-service/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bentoml","download_url":"https://codeload.github.com/bentoml/transformers-nlp-service/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252304686,"owners_count":21726610,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llm","llmops","mlops","model-deployment","model-inference-service","model-serving","nlp","nlp-machine-learning","online-inference","transformer"],"created_at":"2024-11-13T04:29:25.289Z","updated_at":"2025-05-04T07:31:24.666Z","avatar_url":"https://github.com/bentoml.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n    \u003ch1 align=\"center\"\u003eTransformers NLP Service\u003c/h1\u003e\n    \u003cbr\u003e\n    \u003cstrong\u003eA modular, composable, and scalable solution for building NLP services with Transformers\u003cbr\u003e\u003c/strong\u003e\n    \u003ci\u003ePowered by BentoML 🍱 + HuggingFace 🤗\u003c/i\u003e\n    \u003cbr\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\n## 📖 Introduction 📖\n- This project showcases how one can serve HuggingFace's transformers models for various NLP tasks with ease.\n- It incorporates BentoML's best practices, from setting up model services and handling pre/post-processing to deployment in production.\n- Users can explore the example endpoints such as summarization and 
categorization via an interactive Swagger UI.\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/bentoml/transformers-nlp-service/blob/main/images/demo.gif\" alt=\"Computer man\" \u003e   \n\u003c/div\u003e\n\n## 🏃‍♂️ Running the Service 🏃‍♂️\nTo take full advantage of this repo, we recommend cloning it and trying out the service locally. \n\n### BentoML CLI\nThis requires Python 3.8+ and `pip`.\n\n```bash\ngit clone https://github.com/bentoml/transformers-nlp-service.git \u0026\u0026 cd transformers-nlp-service\n\npip install -r requirements/tests.txt\n\nbentoml serve\n```\n\nYou can then open your browser at http://127.0.0.1:3000 and interact with the service through Swagger UI.\n\n### Containers\n\nWe also provide two pre-built containers, one for CPU and one for GPU. \nRunning them requires a container engine such as Docker or Podman.\nYou can then quickly try out the service by running the container:\n\n```bash\n# cpu\ndocker run -p 3000:3000 ghcr.io/bentoml/transformers-nlp-service:cpu\n\n# gpu\ndocker run --gpus all -p 3000:3000 ghcr.io/bentoml/transformers-nlp-service:gpu\n```\n\n\u003e Note that to run with GPU, you will need to have [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) set up.\n\n### Python API\nYou can also use the BentoML Python API to serve your models.\n\nRun the following to build a Bento within the Bento Store:\n```bash\nbentoml build\n```\nThen, start a server with `bentoml.HTTPServer`:\n\n```python\nimport bentoml\n\n# Retrieve Bento from Bento Store\nbento = bentoml.get(\"transformers-nlp-service\")\n\nserver = bentoml.HTTPServer(bento, port=3000)\nserver.start(blocking=True)\n```\n\n### gRPC?\nThis project also includes gRPC support. 
Run the following:\n\n```bash\nbentoml serve-grpc\n```\n\nTo run the container with gRPC, do\n\n```bash\ndocker run -p 3000:3000 -p 3001:3001 ghcr.io/bentoml/nlp:cpu serve-grpc\n``` \n\nTo find more information about gRPC with BentoML, refer to [our documentation](https://docs.bentoml.org/en/latest/guides/grpc.html)\n\n## 🌐 Interacting with the Service 🌐\nThe default mode of BentoML's model serving is via HTTP server. Here, we showcase a few examples of how one can interact with the service:\n### cURL\nThe following example shows how to send a request to the service to summarize a text via cURL:\n\n```bash\ncurl -X 'POST' \\\n  'http://0.0.0.0:3000/summarize' \\\n  -H 'accept: text/plain' \\\n  -H 'Content-Type: text/plain' \\\n  -d 'The three words that best describe Hunter Schafer'\\''s Vanity Fair Oscars party look? Less is more.\nDressed in a bias-cut white silk skirt, a single ivory-colored feather and — crucially — nothing else, Schafer was bound to raise a few eyebrows. Google searches for the actor and model skyrocketed on Sunday night as her look hit social media. On Twitter, pictures of Schafer immediately received tens of thousands of likes, while her own Instagram post has now been liked more than 2 million times.\nLook of the Week: Zendaya steals the show at Louis Vuitton in head-to-toe tiger print\nBut more than just creating a headline-grabbing moment, Schafer'\\''s ensemble was clearly considered. Fresh off the Fall-Winter 2023 runway, the look debuted earlier this month at fashion house Ann Demeulemeester'\\''s show in Paris. It was designed by Ludovic de Saint Sernin, the label'\\''s creative director since December.\nCelebrity fashion works best when there'\\''s a story behind a look. 
For example, the plausible Edie Sedgwick reference in Kendall Jenner'\\''s Bottega Veneta tights, or Paul Mescal winking at traditional masculinity in a plain white tank top.\nFor his first Ann Demeulemeester collection, De Saint Sernin was inspired by \"fashion-making as an authentic act of self-involvement.\" It was a love letter — almost literally — to the Belgian label'\\''s founder, with imagery of \"authorship and autobiography\" baked into the clothes (Sernin called his feather bandeaus \"quills\" in the show notes).\nHunter Schafer'\\''s barely-there Oscars after party look was more poetic than it first seemed.\nThese ideas of self-expression, self-love and self-definition took on new meaning when worn by Schafer. As a trans woman whose ascent to fame was inextricably linked to her gender identity — her big break was playing trans teenager Jules in HBO'\\''s \"Euphoria\" — Schafer'\\''s body is subjected to constant scrutiny online. The comment sections on her Instagram posts often descend into open forums, where users feel entitled (and seemingly compelled) to ask intimate questions about the trans experience or challenge Schafer'\\''s womanhood.\nFittingly, there is a long lineage of gender-defying sentiments stitched into Schafer'\\''s outfit. Founded in 1985 by Ann Demeulemeester and her husband Patrick Robyn, the brand boasts a long legacy of gender-non-conforming fashion.\n\"I was interested in the tension between masculine and feminine, but also the tension between masculine and feminine within one person,\" Demeulemeester told Vogue ahead of a retrospective exhibition of her work in Florence, Italy, last year. 
\"That is what makes every person really interesting to me because everybody is unique.\"\nIn his latest co-ed collection, De Saint Sernin — who is renowned in the industry for his eponymous, gender-fluid label — brought his androgynous world view to Ann Demeulemeester with fitted, romantic menswear silhouettes and sensual fabrics for all (think skin-tight mesh tops, leather, and open shirts made from a translucent organza material).\nCelebrity stylist Law Roach on dressing Zendaya and '\\''faking it '\\''till you make it'\\''\nA quill strapped across her chest, Schafer let us know she is still writing her narrative — and defining herself on her own terms. There'\\''s an entire story contained in those two garments. As De Saint Sernin said in the show notes: \"Thirty-six looks, each one a heartfelt sentence.\"\nThe powerful ensemble may become one of Law Roach'\\''s last celebrity styling credits. Roach announced over social media on Tuesday that he would be retiring from the industry after 14 years of creating conversation-driving looks for the likes of Zendaya, Bella Hadid, Anya Taylor-Joy, Ariana Grande and Megan Thee Stallion.'\n```\n### Via BentoClient 🐍\nTo send requests to the service from Python, use ``bentoml.client.Client``:\n\n```python\nif __name__ == \"__main__\":\n    import bentoml\n\n    client = bentoml.client.Client.from_url(f\"http://{host}:3000\")\n\n    print(\"Summarized text from the article:\", client.summarize(SAMPLES))\n    print(\"Categories prediction of the article:\", client.categorize({'text': SAMPLES, 'categories': CATEGORIES}))\n```\n\nRun `python client.py` to see it in action.\n\n\u003e Check out the [`client.py`](./client.py) file for more details.\n\nNote that all API endpoints defined in `service.py` can be accessed from the client through its sync and async methods. 
For example, the [`service.py`](./service.py) contains three endpoints: `/summarize`, `/categorize` and `/make_analysis`, and hence the following\nmethods are available on the client instance:\n\n- `client.async_summarize` | `client.summarize`\n- `client.async_categorize` | `client.categorize`\n- `client.async_make_analysis` | `client.make_analysis`\n\n### Via Javascript\nYou can also send requests to this service with `axios` in JS. \nThe following example sends a request to make analysis on a given text and categories:\n\n```javascript\nimport axios from 'axios'\n\nvar TEXT = `...`\n\nvar CATEGORIES = [ 'world', 'politics', 'technology', 'defence', 'parliament' ]\n\nconst client = axios.create({\n  baseURL: 'http://localhost:3000',\n  timeout: 3000,\n})\n\nclient\n  .post('/make_analysis', {\n    text: TEXT,\n    categories: CATEGORIES.join(', '),\n  })\n  .then(function (r) {\n    console.log('Full analysis:', r.data)\n  })\n```\n\nRun the `client.js` with `yarn run client` or `npm run client`, and it should yield the following result\n\n```prelog\nFull analysis: {\n  summary: \" Actor and model Hunter Schafer wore a barely-there Oscars after party look . The look debuted\n earlier this month at fashion house Ann Demeulemeester's Fall-Winter 2023 runway . 
It was designed by des\nigner Ludovic de Saint Sernin, who is renowned for his eponymous label .\",\n  categories: {\n    entertainment: 0.4694322943687439,\n    healthcare: 0.4245288372039795,\n    defence: 0.42102956771850586,\n    world: 0.416515976190567,\n  }\n}\n```\n\n\u003e Checkout the [`client.js`](./client.js) for more details.\n\n## ⚙️ Customization ⚙️\n### What if I want to add tasks *X*?\n\nThis project is designed to be used with different [NLP tasks](https://huggingface.co/tasks) and its corresponding models:\n\n| Tasks                                                                               \t| Example model                                                                                                               \t|\n|-------------------------------------------------------------------------------------\t|-----------------------------------------------------------------------------------------------------------------------------\t|\n| [Conversational](https://huggingface.co/tasks/conversational)                     \t| [`facebook/blenderbot-400M-distill`](https://huggingface.co/facebook/blenderbot-400M-distill)                               \t|\n| [Fill-Mask](https://huggingface.co/tasks/fill-mask)                               \t| [`distilroberta-base`](https://huggingface.co/distilroberta-base)                                                           \t|\n| [Question Answering](https://huggingface.co/tasks/question-answering)             \t| [`deepset/roberta-base-squad2`](https://huggingface.co/deepset/roberta-base-squad2)                                         \t|\n| [Sentence Similarity](https://huggingface.co/tasks/sentence-similarity)           \t| [`sentence-transformers/all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)                   \t|\n| [Summarisation](https://huggingface.co/tasks/summarization)                       \t| 
[`sshleifer/distilbart-cnn-12-6`](https://huggingface.co/sshleifer/distilbart-cnn-12-6) [included]                          \t|\n| [Table Question Answering](https://huggingface.co/tasks/table-question-answering) \t| [`google/tapas-base-finetuned-wtq`](https://huggingface.co/google/tapas-base-finetuned-wtq)                                 \t|\n| [Text Classification](https://huggingface.co/tasks/text-classification)           \t| [`distilbert-base-uncased-finetuned-sst-2-english`](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) \t|\n| [Text Generation](https://huggingface.co/tasks/text-generation)                   \t| [`bigscience/T0pp`](https://huggingface.co/bigscience/T0pp)                                                                 \t|\n| [Token Classification](https://huggingface.co/tasks/token-classification)         \t| [`dslim/bert-base-NER`](https://huggingface.co/dslim/bert-base-NER)                                                         \t|\n| [Zero-Shot Classification](https://huggingface.co/tasks/zero-shot-classification) \t| [`facebook/bart-large-mnli`](https://huggingface.co/facebook/bart-large-mnli) [included]                                    \t|\n| [Translation](https://huggingface.co/tasks/translation)                           \t| [`Helsinki-NLP/opus-mt-en-fr`](https://huggingface.co/Helsinki-NLP/opus-mt-en-fr)                                           \t|\n\n### Where can I add models?\nYou can add more tasks and models by editing the `download_model.py` file.\n\n### Where can I add API logic?\nPre- and post-processing logic can be set in the `service.py` file.\n\n\n### Where can I find more docs about Transformers and BentoML?\nBentoML supports Transformers models out of the box. 
You can find more details about [BentoML support](https://docs.bentoml.org/en/latest/frameworks/transformers.html) for [Transformers](https://huggingface.co/docs/transformers/index).\n\n## 🚀 Deploying to Production 🚀\nEffortlessly transition your project into a production-ready application using [BentoCloud](https://www.bentoml.com/bento-cloud/), the production-ready platform for managing and deploying machine learning models.\n\nStart by creating a BentoCloud account. Once you've signed up, log in to your BentoCloud account using the command:\n\n```bash\nbentoml cloud login --api-token \u003cyour-api-token\u003e --endpoint \u003cbento-cloud-endpoint\u003e\n```\n\u003e Note: Replace `\u003cyour-api-token\u003e` and `\u003cbento-cloud-endpoint\u003e` with your specific API token and the BentoCloud endpoint respectively.\n\nNext, build your BentoML service using the `build` command:\n\n```bash\nbentoml build\n```\n\nThen, push your freshly built Bento service to BentoCloud using the `push` command:\n\n```bash\nbentoml push \u003cname:version\u003e\n```\n\nLastly, deploy this application to BentoCloud with a single `bentoml deployment create` command following the [deployment instructions](https://docs.bentoml.org/en/latest/reference/cli.html#bentoml-deployment-create).\n\nBentoML offers a number of options for deploying and hosting online ML services in production. Learn more at [Deploying a Bento](https://docs.bentoml.org/en/latest/concepts/deploy.html).\n\n## 👥 Community 👥\nBentoML has a thriving open source community where thousands of ML/AI practitioners are \ncontributing to the project, helping other users and discussing the future of AI. 
👉 [Pop into our Slack community!](https://l.bentoml.com/join-slack)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbentoml%2Ftransformers-nlp-service","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbentoml%2Ftransformers-nlp-service","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbentoml%2Ftransformers-nlp-service/lists"}