{"id":41346219,"url":"https://github.com/raisultan/pac","last_synced_at":"2026-01-23T06:59:19.014Z","repository":{"id":231854090,"uuid":"766217620","full_name":"raisultan/pac","owner":"raisultan","description":"Automation of Prioritization and Categorization of Support Tickets Using LLMs and Vector DBs","archived":false,"fork":false,"pushed_at":"2024-04-06T09:43:51.000Z","size":999,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-07-25T20:06:48.973Z","etag":null,"topics":["categorization","function-calling","llms","normalization","prioritization","vector-db"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/raisultan.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-03-02T16:52:42.000Z","updated_at":"2024-05-06T15:54:27.000Z","dependencies_parsed_at":"2024-04-06T10:39:12.835Z","dependency_job_id":null,"html_url":"https://github.com/raisultan/pac","commit_stats":null,"previous_names":["raisultan/pac"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/raisultan/pac","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raisultan%2Fpac","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raisultan%2Fpac/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raisultan%2Fpac/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raisultan%2Fpac/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/o
wners/raisultan","download_url":"https://codeload.github.com/raisultan/pac/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raisultan%2Fpac/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28682264,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-23T05:48:07.525Z","status":"ssl_error","status_checked_at":"2026-01-23T05:48:07.129Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["categorization","function-calling","llms","normalization","prioritization","vector-db"],"created_at":"2026-01-23T06:59:18.485Z","updated_at":"2026-01-23T06:59:19.001Z","avatar_url":"https://github.com/raisultan.png","language":"Python","readme":"# PAC\n\nPAC is a tool for Prioritization and Categorization of support tickets. It enables quick and effortless categorization of support tickets for any kind of product. The main goal of the project is to automate the process and remove manual human labor. It does so using vector semantic search and LLM function calling.\n\nA quick explanation of the logic: PAC receives support ticket data via a Kafka topic. It first vectorizes the ticket data and searches the Vector DB for similar vectors using COSINE similarity. 
From the search results it takes the most similar vector and checks the distance between it and the incoming ticket's vector against a configured threshold. If the match is similar enough, the incoming ticket is assigned the same category and priority. If the search returns nothing or the threshold check fails, PAC makes a request to the LLM, which should return a category and priority for the ticket. Once the ticket has a priority and category, it is inserted into the Vector DB. The same process is applied to every incoming ticket. This approach reduces LLM costs by checking the Vector DB first and making an LLM request only when no sufficiently similar ticket exists.\n\nPAC also generates a response event containing the original ticket data along with the assigned priority and category, and sends it to an output topic, so that the event can be forwarded to a data lake or other storage for later BI or other analysis.\n\nIf the priority or category of a ticket was assigned incorrectly, there is an API for assigning the correct priority or category manually. 
If that happens, the app sends a separate correction event to a dedicated topic so that the correction is taken into account during analysis.\n\n### Process Flow\n\n```mermaid\nflowchart TB\n    A[Support Ticket] --\u003e|Received via Kafka Topic| B[Text Normalization]\n    B --\u003e C[Request to LLM for Vectorization]\n    C --\u003e|Vector Embedding| D{Vector DB Search}\n\n    D --\u003e|If match| E[Check Distance]\n    E --\u003e|Below Threshold| F[Assign Category \u0026 Priority]\n    E --\u003e|Above Threshold| G[LLM Function Call for Priority and Category]\n    G --\u003e F\n    D --\u003e|No match| G\n    \n    F --\u003e|Insert into Vector DB| H[Vector DB]\n    F --\u003e I[Generate Response Event]\n    I --\u003e|Send to Output Topic| J[Data Lake / Storage]\n    \n    K[Manual API Correction] -.-\u003e|If needed| F\n    K --\u003e|Correction Event| L[Corrected Tickets Topic]\n\n    style A fill:#4f77f6,stroke:#333,stroke-width:2px\n    style B fill:#ffcf33,stroke:#333,stroke-width:4px\n    style C fill:#7fd3a4,stroke:#333,stroke-width:4px\n    style D fill:#4095c6,stroke:#333,stroke-width:2px\n    style E fill:#f98b88,stroke:#333,stroke-width:2px\n    style F fill:#8bc34a,stroke:#333,stroke-width:2px\n    style G fill:#f06292,stroke:#333,stroke-width:2px\n    style H fill:#795548,stroke:#333,stroke-width:2px\n    style I fill:#64b5f6,stroke:#333,stroke-width:2px\n    style J fill:#ba68c8,stroke:#333,stroke-width:2px\n    style K fill:#ffeb3b,stroke:#333,stroke-width:2px\n    style L fill:#e91e63,stroke:#333,stroke-width:2px\n```\n\n## Tech Stack\n- Python 3.10\n- Milvus\n- Kafka and Zookeeper\n- Docker\n- OpenAI\n\n## Components\n1. Text Normalizer\n    - Removes Noise: Strips out irrelevant characters.\n    - Standardizes: Converts all characters to lowercase to ensure consistency.\n    - Anonymizes: Replaces names, email addresses, phone numbers, and any other user-specific data with generic placeholders.\n    - Normalizes URLs and Paths: Converts URLs, file paths, or specific codes to generic placeholders, or removes them if they are not relevant to understanding the ticket.\n\n2. Vectorizer: creates a vector embedding from the given text.\n\n3. Vector DB Repository\n    - Searches Similar Tickets\n    - Inserts into Vector DB\n    - Updates Record in Vector DB\n    - Removes Record from Vector DB\n    - Gets a Record by ID from Vector DB\n\n4. PAC: given a ticket, prioritizes it and assigns it one of the available categories.\n\n5. Updater: corrects an already PACed ticket with the given priority and category.\n\n## Getting Started\n\nThis section provides instructions on how to set up and run the project using `Poetry` as the package manager.\n\n### Prerequisites\n\nEnsure you have Docker and Poetry installed on your system. 
These tools are required to run the services and the application.\n\n### Setup and Running Services\n\n**Start Milvus**\n\nTo start the Milvus database, run the following command:\n```bash\nmake start-milvus\n```\n\n**Start Kafka**\n\nTo start Kafka for message queuing, execute:\n```bash\nmake start-kafka\n```\n\n**Install Dependencies**\n\nInstall the project dependencies using Poetry:\n```bash\npoetry install\n```\n\n**Create Vector Database Collection**\n\nBefore running the application, create the vector database collection with:\n```bash\nmake create-collection\n```\n\n**Run the Application**\n\nStart the FastAPI application with the following command:\n```bash\nmake run\n```\n\n### Testing Utilities\n\n**Create Input Topic**\n\nYou can create a Kafka topic for input tickets by running:\n```bash\nmake create-input-topic CONTAINER_ID=\u003cyour_kafka_container_id\u003e\n```\n\n**Write to Input Topic**\n\nTo send a test ticket to the input topic, use:\n```bash\nmake write-to-input-topic CONTAINER_ID=\u003cyour_kafka_container_id\u003e\n```\n\nThen, input your test ticket JSON data, for example:\n```json\n{\"id\": 123, \"email\": \"test@test.com\", \"text\": \"peripherals you sent me are not working. 
i wanna return them today\"}\n```\n\n**Monitor Processed Tickets**\n\nTo monitor processed tickets:\n```bash\nmake monitor-processed-tickets CONTAINER_ID=\u003cyour_kafka_container_id\u003e\n```\n\n**Monitor Corrected Tickets**\n\nFor monitoring corrected tickets:\n```bash\nmake monitor-corrected-tickets CONTAINER_ID=\u003cyour_kafka_container_id\u003e\n```\n\n### Stopping Services\n\nTo stop the services, use the following commands:\n\n**Stop Kafka:**\n\n```bash\nmake stop-kafka\n```\n\n**Stop Milvus:**\n\n```bash\nmake stop-milvus\n```\n\n### API Documentation\nFor detailed API documentation, visit the FastAPI generated API documentation once the application is running on http://127.0.0.1:8000/docs#/\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraisultan%2Fpac","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fraisultan%2Fpac","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraisultan%2Fpac/lists"}