{"id":15103586,"url":"https://github.com/samestrin/llm-services-api","last_synced_at":"2026-02-09T16:02:09.646Z","repository":{"id":252002296,"uuid":"839053030","full_name":"samestrin/llm-services-api","owner":"samestrin","description":"A FastAPI-powered REST API offering a comprehensive suite of natural language processing services using  machine learning models with PyTorch and Transformers, packaged in a Docker container to run efficiently. ","archived":false,"fork":false,"pushed_at":"2024-08-13T02:47:42.000Z","size":62,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-10T21:26:26.622Z","etag":null,"topics":["api","docker","fastapi","hugging-face","hugging-face-transformers","huggingface-transformers","keybert","llm","openai-compatible-api","python","python3","pytorch","rest","rest-api","spacy","torch","transformers","uvicorn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/samestrin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-06T21:47:07.000Z","updated_at":"2025-01-16T09:42:15.000Z","dependencies_parsed_at":"2024-09-16T01:06:06.442Z","dependency_job_id":"b07d4c3d-ca1d-41af-a696-3fea77d355af","html_url":"https://github.com/samestrin/llm-services-api","commit_stats":{"total_commits":22,"total_committers":1,"mean_commits":22.0,"dds":0.0,"last_synced_commit":"47378686f7045756493d9923b3d9e9f4b1b8adb5"},"previous_names":["samestrin/llm-services-api"],"tags_count":3,"template":false,"
template_full_name":null,"purl":"pkg:github/samestrin/llm-services-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-services-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-services-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-services-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-services-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/samestrin","download_url":"https://codeload.github.com/samestrin/llm-services-api/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/samestrin%2Fllm-services-api/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29271854,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-09T13:47:44.167Z","status":"ssl_error","status_checked_at":"2026-02-09T13:47:43.721Z","response_time":56,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","docker","fastapi","hugging-face","hugging-face-transformers","huggingface-transformers","keybert","llm","openai-compatible-api","python","python3","pytorch","rest","rest-api","spacy","torch","transformers","uvicorn"],"created_at":"2024-09-25T19:40:40.969Z","updated_at":"2026-02-09T16:02:09.309Z","avatar_url":"https://github.com/samestrin.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LLM Services API\n\n[![Star on GitHub](https://img.shields.io/github/stars/samestrin/llm-services-api?style=social)](https://github.com/samestrin/llm-services-api/stargazers)[![Fork on GitHub](https://img.shields.io/github/forks/samestrin/llm-services-api?style=social)](https://github.com/samestrin/llm-services-api/network/members)[![Watch on GitHub](https://img.shields.io/github/watchers/samestrin/llm-services-api?style=social)](https://github.com/samestrin/llm-services-api/watchers)\n\n![Version 0.0.4](https://img.shields.io/badge/Version-0.0.4-blue) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)[![Built with Python](https://img.shields.io/badge/Built%20with-Python-green)](https://www.python.org/)\n\nLLM Services API is a FastAPI-based application that provides a suite of natural language processing services using various machine learning models from Hugging Face's `transformers` library through a REST API interface. 
The application is designed to run in a Docker container, providing endpoints for text summarization, sentiment analysis, named entity recognition, paraphrasing, keyword extraction, and embedding generation. The entire API is secured using an API key with `Bearer \u003ctoken\u003e` format, ensuring that only authorized users can access the endpoints.\n\nThe service allows flexibility in model selection through command-line arguments and a configuration file, `models_config.json`, enabling users to specify different Hugging Face models for various NLP tasks. This flexibility allows users to select lightweight models for lower-resource environments or more powerful models for advanced tasks.\n\n## Updates\n\n**0.0.4**\n\n- **Tokenization:** Convert input text into a list of token IDs, allowing you to process and manipulate text at the token level, default model `all-MiniLM-L6-v2`.\n- **Detokenization:** Reconstruct original text from a list of token IDs, allowing you to reverse the tokenization process, default model `all-MiniLM-L6-v2`.\n\n**0.0.3**\n\n- **Adaptive Throttling:** Implemented an adaptive throttling mechanism that delays requests using the `Retry-After` header when errors are encountered due to high request frequency or processing failures. The delay is dynamically adjusted based on the client’s request rate and error occurrences.\n\n**0.0.2**\n\n- **OpenAI-Compatible Embeddings:** Provides an endpoint that mimics the OpenAI embedding API, allowing easy integration with existing systems expecting OpenAI-like responses.\n- **Configurable Model Loading:** Customize which Hugging Face NLP models are loaded by providing command-line arguments or configuring the `models_config.json` file. 
This flexibility allows the application to adapt to different resource environments or use cases.\n\n## Features\n\n- **Text Summarization:** Generate concise summaries of long texts, default model `BART`.\n- **Sentiment Analysis:** Determine the sentiment of text inputs, default model `DistilBERT`.\n- **Named Entity Recognition (NER):** Identify entities within text and sort them by frequency, default model `BERT` (dbmdz/bert-large-cased-finetuned-conll03-english).\n- **Paraphrasing:** Rephrase sentences to produce semantically similar outputs, default model `T5`.\n- **Keyword Extraction:** Extract important keywords from text, with customizable output count, default model `KeyBERT`.\n- **Embedding Generation:** Create vector representations of text, default model `SentenceTransformers` (all-MiniLM-L6-v2).\n- **Caching with LRU:** Frequently used computations, such as generating embeddings and tokenizations, are cached using the Least Recently Used (LRU) strategy. This reduces response times for repeated requests and enhances overall performance.\n\n## Dependencies\n\n- Python 3.7+\n- FastAPI\n- Uvicorn\n- spaCy\n- transformers\n- sentence-transformers\n- keybert\n- torch\n- python-dotenv (for environment variable management)\n\n## Installation\n\nTo get started with the LLM Services API, follow these steps:\n\n1. **Clone the Repository:**\n\n```bash\ngit clone https://github.com/samestrin/llm-services-api.git\ncd llm-services-api\n```\n\n2. **Create a Virtual Environment:**\n\n```bash\npython -m venv venv\nsource venv/bin/activate # On Windows use `venv\\Scripts\\activate`\n```\n\n3. **Install the Dependencies:**\n\n```bash\npip install -r requirements.txt\n```\n\n4. **Download SpaCy Model:**\n\n```bash\npython -m spacy download en_core_web_sm\n```\n\n5. **Create Your .env File:**\n\n```bash\necho \"API_KEY=your-key-here\" \u003e .env\n```\n\n6. 
**Run the Application Locally:**\n\nYou can run the application locally in two ways:\n\n- **Using Uvicorn:**\n\nThis is the recommended method for running in a development or production-like environment.\n\n```bash\nuvicorn main:app --reload --port 5000\n```\n\n- **Using Python:**\n\nThis method allows you to pass command-line arguments for customizing models.\n\n```bash\npython main.py --embedding-model all-MiniLM-L6-v2 --summarization-model facebook/bart-large-cnn\n```\n\nReplace the values passed to `--embedding-model` and `--summarization-model` with the models you wish to use. This approach offers flexibility by allowing you to specify different models for various NLP tasks.\n\n### Options\n\n```bash\n  -h, --help                                  Show this help message and exit\n  --embedding-model EMBEDDING_MODEL           Specify embedding model\n  --summarization-model SUMMARIZATION_MODEL   Specify summarization model\n  --sentiment-model SENTIMENT_MODEL           Specify sentiment analysis model\n  --ner-model NER_MODEL                       Specify named entity recognition model\n  --paraphrase-model PARAPHRASE_MODEL         Specify paraphrasing model\n  --keyword-model KEYWORD_MODEL               Specify keyword extraction model\n```\n\n## Running with Docker\n\nTo run the application in a Docker container, follow these steps:\n\n1. **Build the Docker Image:**\n\n```bash\ndocker build -t llm-services-api .\n```\n\n2. **Run the Docker Container:**\n\n```bash\ndocker run -p 5000:5000 llm-services-api\n```\n\nThe application will be accessible at `http://localhost:5000`.\n\n## Usage\n\nThe API provides several endpoints for various NLP tasks. Below is a summary of the available endpoints:\n\n### Endpoints\n\n#### 1. Text Summarization\n\n- **Endpoint:** `/summarize`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"text\": \"Your text here\"\n}\n```\n\n- **Response:**\n\n```json\n{\n  \"summary\": \"The generated summary of the provided text.\"\n}\n```\n\n#### 2. 
Sentiment Analysis\n\n- **Endpoint:** `/sentiment`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"text\": \"Your text here\"\n}\n```\n\n- **Response:**\n\n```json\n{\n    \"sentiment\": [\n        {\n        \"label\": \"POSITIVE\", # or \"NEGATIVE\"\n        \"score\": 0.99\n        }\n    ]\n}\n```\n\n#### 3. Named Entity Recognition\n\n- **Endpoint:** `/entities`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"text\": \"Your text here\"\n}\n```\n\n- **Response:**\n\n```json\n{\n    \"entities\": [\n        {\n        \"entity\": \"PERSON\",\n        \"word\": \"John Doe\",\n        \"frequency\": 3\n        },\n        ...\n    ]\n}\n```\n\n#### 4. Paraphrasing\n\n- **Endpoint:** `/paraphrase`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"text\": \"Your text here\"\n}\n```\n\n- **Response:**\n\n```json\n{\n  \"paraphrased_text\": \"The paraphrased version of the input text.\"\n}\n```\n\n#### 5. Keyword Extraction\n\n- **Endpoint:** `/extract_keywords`\n- **Method:** `POST`\n- **Query Parameters:**\n  - `num_keywords`: Optional, defaults to 5. Specifies the number of keywords to extract.\n- **Request Body:**\n\n```json\n{\n  \"text\": \"Your text here\"\n}\n```\n\n- **Response:**\n\n```json\n{\n\"keywords\": [\n    {\n        \"keyword\": \"important keyword\",\n        \"score\": 0.95\n        },\n        ...\n    ]\n}\n```\n\n#### 6. Embedding Generation\n\n- **Endpoint:** `/embed`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"text\": \"Your text here\"\n}\n```\n\n- **Response:**\n\n```json\n{\n    \"embedding\": [0.1, 0.2, 0.3, ...] # Array of float numbers representing the text embedding\n}\n```\n\n#### 7. 
OpenAI-Compatible Embedding\n\n- **Endpoint:** `/v1/embeddings`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"input\": \"Your text here\",\n  \"model\": \"all-MiniLM-L6-v2\"  # or another supported model\n}\n```\n\n- **Response:**\n\n```json\n{\n  \"object\": \"list\",\n  \"data\": [\n    {\n      \"object\": \"embedding\",\n      \"index\": 0,\n      \"embedding\": [-0.006929283495992422, -0.005336422007530928, ...]  # Embedding array\n    }\n  ],\n  \"model\": \"all-MiniLM-L6-v2\",\n  \"usage\": {\n    \"prompt_tokens\": 5,  # Number of tokens in the input\n    \"total_tokens\": 5    # Total number of tokens processed\n  }\n}\n```\n\n#### 8. Tokenization\n\n- **Endpoint:** `/tokenize`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"text\": \"Your text here\",\n  \"model\": \"all-MiniLM-L6-v2\"  # Optional, specify a model for tokenization\n}\n```\n\n- **Response:**\n\n```json\n{\n  \"tokens\": [101, 7592, 999, ...]  # Array of token IDs representing the text\n}\n```\n\nThis endpoint allows you to tokenize input text using a specified or default model. If the `model` field is not provided, the default embeddings model `all-MiniLM-L6-v2` will be used.\n\n#### 9. Detokenization\n\n- **Endpoint:** `/detokenize`\n- **Method:** `POST`\n- **Request Body:**\n\n```json\n{\n  \"tokens\": [101, 2023, 2003, 2019, 2742, 6251, 2000, 19204, 1012, 102],  # List of token IDs\n  \"model\": \"all-MiniLM-L6-v2\"  # Optional, specify a model for detokenization\n}\n```\n\n- **Response:**\n\n```json\n{\n  \"text\": \"This is an example sentence to tokenize.\"  # The reconstructed text\n}\n```\n\n## Contribute\n\nContributions to this project are welcome. 
Please fork the repository and submit a pull request with your changes or improvements.\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](/LICENSE) file for details.\n\n## Share\n\n[![Twitter](https://img.shields.io/badge/X-Tweet-blue)](https://twitter.com/intent/tweet?text=Check%20out%20this%20awesome%20project!\u0026url=https://github.com/samestrin/llm-services-api) [![Facebook](https://img.shields.io/badge/Facebook-Share-blue)](https://www.facebook.com/sharer/sharer.php?u=https://github.com/samestrin/llm-services-api) [![LinkedIn](https://img.shields.io/badge/LinkedIn-Share-blue)](https://www.linkedin.com/sharing/share-offsite/?url=https://github.com/samestrin/llm-services-api)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamestrin%2Fllm-services-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamestrin%2Fllm-services-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamestrin%2Fllm-services-api/lists"}