Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/alekseyscorpi/vacancies_server
This is a server for vacancy generation using an LLM (Saiga3)
- Host: GitHub
- URL: https://github.com/alekseyscorpi/vacancies_server
- Owner: AlekseyScorpi
- Created: 2024-05-03T10:32:38.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-05-31T10:02:39.000Z (8 months ago)
- Last Synced: 2025-01-20T16:08:48.466Z (10 days ago)
- Topics: code, cuda, cuda-toolkit, docker, dockerfile, flask, llama3, llamacpp, llm, ngrok, pydantic, saiga
- Language: Python
- Homepage: https://vacancies-site.vercel.app/
- Size: 29.3 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# This is the server for the vacancy generation system
## About this server
This server works in tandem with the front-end part: https://github.com/AlekseyScorpi/vacancies_site
It lets users generate vacancy (job posting) texts from the form they submit.
To generate text, the server uses a large language model: a fine-tuned Saiga3 - https://huggingface.co/AlekseyScorpi/saiga_llama3_vacancies_lora
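In essence, the server loads the GGUF model with llama_cpp and prompts it with the submitted form fields. A minimal sketch of that idea (paths, parameters, and prompt wording are assumptions, not the repository's actual code):

```python
from llama_cpp import Llama

# A minimal sketch, assuming the defaults from the Installation section below;
# the repository's actual loading code and prompt format may differ.
llm = Llama(
    model_path="generation_model/model-Q5_K_M.gguf",  # assumed default path
    n_ctx=4096,        # context size; the real value comes from model_config.json
    n_gpu_layers=-1,   # offload all layers to the GPU (requires a CUDA build)
)

# Hypothetical form fields -- the real schema is defined by the front end.
form = {"position": "Python developer", "company": "Acme", "skills": "Flask, Docker"}
prompt = (
    f"Write a job vacancy text for the position '{form['position']}' "
    f"at '{form['company']}'; required skills: {form['skills']}."
)

out = llm.create_chat_completion(messages=[{"role": "user", "content": prompt}])
print(out["choices"][0]["message"]["content"])
```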
## Installation
* You need to download any saiga_llama3_vacancies GGUF model from https://huggingface.co/AlekseyScorpi/saiga_llama3_vacancies_GGUF, create a new folder, and put the model in it
* Next, adjust model_config.json for your setup, or use the default config (the default assumes you have created a generation_model folder and placed the model-Q5_K_M.gguf model in it)
* You need to set several environment variables in a .env file (create it; an example is shown after this list):
* NGROK_AUTH_TOKEN
* NGROK_SERVER_DOMAIN
* FLASK_PORT
* CLIENT_DOMAIN
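For instance, a .env file might look like this (all values are placeholders; the CLIENT_DOMAIN value assumes the front end is deployed at the project's homepage):

```
# Example .env -- every value here is a placeholder, not a real credential
NGROK_AUTH_TOKEN=your-ngrok-auth-token
NGROK_SERVER_DOMAIN=your-reserved-domain.ngrok-free.app
FLASK_PORT=80
CLIENT_DOMAIN=https://vacancies-site.vercel.app
```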
### Docker way
* Make sure you have the NVIDIA Container Toolkit installed so that CUDA works inside containers
* Build the Docker image with ```docker build -t {YOUR_IMAGE_NAME} .```
* Run it 😉, for example: ```docker run --gpus all -p 80:80 vacancies-server-saiga3``` (an illustrative Dockerfile sketch follows)
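The repository ships its own Dockerfile; purely as an illustration of what a CUDA-enabled image for this kind of Flask + llama_cpp_python server involves, a minimal sketch might look like this (base image, Python version, and port are assumptions):

```dockerfile
# Illustrative sketch only -- the repository's actual Dockerfile may differ.
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
# Build llama_cpp_python with CUDA support (mirrors the "Default python way" below)
ENV CMAKE_ARGS="-DLLAMA_CUDA=on"
RUN pip install -r requirements.txt

COPY . .
EXPOSE 80
CMD ["python3", "start.py"]
```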
### Default python way
* First, create a new virtual environment and activate it (this project uses Python 3.11)
* Second, set the environment variable ```CMAKE_ARGS="-DLLAMA_CUDA=on"``` so that llama_cpp_python is built with CUDA support (make sure you have the CUDA Toolkit installed on your device)
* Then run ```pip install -r requirements.txt```
* Now you can run your server with ```python start.py``` (a hypothetical smoke-test request is sketched below)
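Once the server is running, the front end talks to it over HTTP. Purely as a hypothetical smoke test (the real route and payload schema live in the server code, and the port comes from FLASK_PORT), a request might look like:

```python
import requests

# Hypothetical route and field names -- check the Flask app for the real API.
resp = requests.post(
    "http://localhost:80/generate",
    json={"position": "Python developer", "company": "Acme", "skills": "Flask, Docker"},
    timeout=120,  # LLM generation can take a while
)
resp.raise_for_status()
print(resp.text)  # the generated vacancy text
```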