{"id":24167187,"url":"https://github.com/thedeadcoder/tokkhok-backend","last_synced_at":"2026-03-11T10:05:45.190Z","repository":{"id":270996527,"uuid":"911469942","full_name":"TheDeadcoder/Tokkhok-Backend","owner":"TheDeadcoder","description":"KUET-Bitfest-Hackathon Backend","archived":false,"fork":false,"pushed_at":"2025-01-04T16:21:39.000Z","size":436,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-02T09:14:31.792Z","etag":null,"topics":["alembic","authentication","fastapi","google-transliterate","openai","postgresql","qdrant","render","sqlalchemy","supabase","vector-database"],"latest_commit_sha":null,"homepage":"https://buet-genesis.onrender.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TheDeadcoder.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-03T05:08:16.000Z","updated_at":"2025-01-05T05:25:00.000Z","dependencies_parsed_at":"2025-01-04T17:34:45.155Z","dependency_job_id":null,"html_url":"https://github.com/TheDeadcoder/Tokkhok-Backend","commit_stats":null,"previous_names":["thedeadcoder/kuet-bitfest-hackathon-kothok"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheDeadcoder%2FTokkhok-Backend","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheDeadcoder%2FTokkhok-Backend/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheDeadcoder%2FTokkhok-Backend/releases","manifests_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheDeadcoder%2FTokkhok-Backend/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TheDeadcoder","download_url":"https://codeload.github.com/TheDeadcoder/Tokkhok-Backend/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241482109,"owners_count":19969850,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alembic","authentication","fastapi","google-transliterate","openai","postgresql","qdrant","render","sqlalchemy","supabase","vector-database"],"created_at":"2025-01-12T21:12:40.857Z","updated_at":"2026-03-11T10:05:40.158Z","avatar_url":"https://github.com/TheDeadcoder.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Running the Backend\n## FastAPI Backend\n![তক্ষক](https://btrnqywodfpanpiyjjiw.supabase.co/storage/v1/object/public/contents/tokkhok-logo.png)\n## Virtual environment\nMake sure python3-venv is installed on your machine:\n```bash\nsudo apt install python3-venv\n```\nCreate a Python virtual environment with the following command:\n```bash\npython3 -m venv .venv\n```\nActivate the environment:\n```bash\nsource .venv/bin/activate\n```\n\n## Install dependencies\nInstall the required packages with the following command:\n```bash\npip install -r requirements.txt\n```\n## Running the backend\nTo run the backend server, use the following command:\n\n```bash\nuvicorn app.main:app --reload\n```\n\nThe app will start at:\n\n```bash\nhttp://127.0.0.1:8000/\n```\n\nOnce the application is running, you can 
access the API documentation provided by Swagger at:\n\n```bash\nhttp://127.0.0.1:8000/docs\n```\n\nHere, you can explore and interact with the various API endpoints.\n\n\n## Make changes in the database\nExecute the following command:\n```bash\n./push.sh\n```\n\n## Authentication\n- We use Supabase for authentication.\n- Every path other than login and signup requires a bearer token.\n- The token refreshes after 1 hour.\n\n## Database\n- We use PostgreSQL as the database.\n- The database is hosted on Supabase.\n- We use SQLAlchemy as the ORM.\n- We use database connection pooling with a default pool size of 15.\n\n## Database Migration Tool\nWe use Alembic as the database migration tool.\n\n## SMTP\n- We use Gmail as a custom SMTP server (smtp.gmail.com), so confirmation mail is sent from our own Gmail account.\n- It can therefore handle mail authentication for 1100 users in 1 hour.\n\n# Chat with knowledge base\n## Vector Database\nWe use Qdrant as the vector database.\n\n## Embedding model\nWe use the text-embedding-3-large model to generate embeddings.\n\n## PDF Font\nWe use the [Noto Sans font from Google Fonts](https://fonts.google.com/noto/specimen/Noto+Sans+Bengali?query=bangla).\n\n## File-ingestion Pipeline\n- We receive the \"pure bangla text\" from the text editor.\n- We generate a suitable title and caption for the file.\n- We upload the PDF to a Supabase bucket and fetch the link to the file.\n- We generate some metadata for the parsed content.\n- We vectorize the chunks and store them in Qdrant.\n\n## RAG chat pipeline\n- The default knowledge base is the user's uploaded content.\n- The user can also customize a chat by adding some public files for that chat only.\n- The user asks a query (in Bangla or Banglish).\n- With an AI agent, we normalize the user query (for better context ingestion and to make it ready for searching the vector database).\n- We vectorize the normalized prompt and search the vector database.\n- We fetch the k most relevant chunks.\n- We then feed the query and the fetched chunks to the AI agent.\n- The AI agent then generates a Bengali response using 
our custom knowledge base\n\n## Translation Generation:\n\n### For translation, we have tried 2 ways:\n#### Way-1:\n- We used few-shot prompting, a technique that enables in-context learning.\n- Our users contribute by generating learning samples ({banglish, bangla} pairs).\n- Admins approve some of them.\n- The approved pairs are used as few-shot examples at inference time.\n- The future plan is to run a cron job (after 1 week) that collects the approved samples and uses them to train a model with OpenAI's fine-tuning API. This could not be done yet due to cost.\n\n#### Way-2:\n- We used the Google Transliterate API.\n- The transliteration is phonetic, meaning it maps input sounds in one script (e.g., Latin/English) to equivalent sounds in the target script (e.g., Bengali).\n- This engine primarily relies on rule-based linguistic mappings and possibly some statistical or probabilistic enhancements for ambiguity resolution.\n- We chose this option for its lower latency.\n\n## Audio chat Pipeline\n- We used OpenAI's whisper-1 model to generate a transcript of the user's speech.\n- We generated an embedding for the transcribed text.\n- We searched the vector database for relevant chunks.\n- We fed the retrieved knowledge and the query to the AI agent, which responded in text.\n- With the browser's SpeechSynthesis API, we can convert the textual response to speech.\n- After returning the audio response, we stored the results in the database using a FastAPI background task.\n\n## Latency Handling at the time of translating Banglish to Bangla\n- We use FastAPI's background tasks to execute database operations in a separate thread. 
When the thread finishes the database update, we terminate it.\n- We return the translation as soon as we get it.\n\n\n## Deployment\n[deployed-site](https://buet-genesis.onrender.com)\n- We used Render's Docker template for FastAPI for deployment.\n- How did we deal with Render's freezing issue?\n- There is a dummy GET endpoint at `/`, and we run a cron job from [a cron job site](https://cron-job.org/en/) to call it.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthedeadcoder%2Ftokkhok-backend","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthedeadcoder%2Ftokkhok-backend","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthedeadcoder%2Ftokkhok-backend/lists"}