{"id":40976898,"url":"https://github.com/e-wave112/klu-be-assessment","last_synced_at":"2026-01-22T06:53:15.102Z","repository":{"id":184535707,"uuid":"671990421","full_name":"E-wave112/klu-be-assessment","owner":"E-wave112","description":"A simple implementation of the OpenAI Chat Completion API built with FastAPI","archived":false,"fork":false,"pushed_at":"2023-12-09T22:54:48.000Z","size":42,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-04-18T11:08:00.002Z","etag":null,"topics":["benchmarking","fastapi","llms","openai","performance-testing","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/E-wave112.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-07-28T16:15:13.000Z","updated_at":"2023-07-30T02:57:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"dd8cff77-929e-4d59-a372-57df7f95b8df","html_url":"https://github.com/E-wave112/klu-be-assessment","commit_stats":{"total_commits":11,"total_committers":2,"mean_commits":5.5,"dds":0.4545454545454546,"last_synced_commit":"e30cdc8417b571dfd304a36a440203c3901da892"},"previous_names":["e-wave112/klu-be-assessment"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/E-wave112/klu-be-assessment","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/E-wave112%2Fklu-be-assessment","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/E-wave112%2Fklu-be-assessment/tags","releases_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/E-wave112%2Fklu-be-assessment/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/E-wave112%2Fklu-be-assessment/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/E-wave112","download_url":"https://codeload.github.com/E-wave112/klu-be-assessment/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/E-wave112%2Fklu-be-assessment/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28657379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T01:17:37.254Z","status":"online","status_checked_at":"2026-01-22T02:00:07.137Z","response_time":144,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarking","fastapi","llms","openai","performance-testing","python"],"created_at":"2026-01-22T06:53:13.434Z","updated_at":"2026-01-22T06:53:15.097Z","avatar_url":"https://github.com/E-wave112.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### A simple implementation of the [OpenAI Chat Completion](https://platform.openai.com/docs/guides/gpt/chat-completions-api) API built with [FastAPI](https://fastapi.tiangolo.com/)\n\n- The API demo can be found [here](https://www.loom.com/share/7b56de39016546cf964e663c99d5006e?sid=5217093a-8a9a-4922-879e-2f264937b419)\n\n- **DATA-SOURCE** : 
[huggingface](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/blob/main/ShareGPT_V3_unfiltered_cleaned_split.json) (Be sure to download the file to the root of the project for the application to work)\n\n### Benchmarks\n\n\u003e As observed, the `/api/v1/chat/completion/v1` endpoint performs better because the API uses Python [generators](https://realpython.com/introduction-to-python-generators/) to load the dataset on demand rather than all at once, which reduces the memory footprint of the application. It also uses [Redis](https://redis.io) as a caching layer to reduce the number of lookups made to the dataset for repeated payloads.\n\n- NB: these benchmarks might vary due to network conditions and the resources available on the machine at the time of testing.\n\n100 Epochs (Requests)\n\n\u003e The sample data in the `benchmark.json` file is extracted from the last 100 dictionaries in the core `ShareGPT_V3_unfiltered_cleaned_split.json` dataset.\n\n`/api/v1/chat/completion` **POST** (with repeated payloads)\n\n```json\n{\n  \"requests_per_minute\": \"343.76\",\n  \"avg_latency_in_seconds\": \"0.174539\"\n}\n```\n\n`/api/v1/chat/completion` **POST** (with non-repeated payloads)\n\n```json\n{\n  \"requests_per_minute\": \"437.89\",\n  \"avg_latency_in_seconds\": \"0.137020\"\n}\n```\n\n`/api/v1/chat/completion/v1` **POST** (with repeated payloads)\n\n```json\n{\n  \"requests_per_minute\": \"8079.54\",\n  \"avg_latency_in_seconds\": \"0.007426\"\n}\n```\n\n`/api/v1/chat/completion/v1` **POST** (with non-repeated payloads)\n\n```json\n{\n  \"requests_per_minute\": \"3483.11\",\n  \"avg_latency_in_seconds\": \"0.017226\"\n}\n```\n\nTest Configuration: MacBook Pro 2018, 2.7GHz Quad-Core Intel Core i7, 16GB RAM 2133 MHz LPDDR3, Python 3.9.6\n\n### Getting Started\n\nTo get started with the project, ensure you have set up and activated a virtual environment; see the guide 
[here](https://realpython.com/python-virtual-environments-a-primer/).\n\nClone the repository with the command below:\n\n```\n$ git clone https://github.com/E-wave112/klu-be-assessment\n```\n\nInstall the dependencies:\n\n```\n$ python3 -m pip install -r requirements.txt\n```\n\n### Running the Development Server\n\nStart the server by running the bash script below:\n\n```\n$ bash start.sh\n```\n\nAlternatively, you can start the server using the command below:\n\n```\n$ uvicorn application:app --port 8000 --reload\n```\n\nThe server will be running at http://localhost:8000; the interactive API docs are served at http://localhost:8000/docs\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fe-wave112%2Fklu-be-assessment","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fe-wave112%2Fklu-be-assessment","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fe-wave112%2Fklu-be-assessment/lists"}