{"id":19977435,"url":"https://github.com/dmickelson/knowledgebaseproject","last_synced_at":"2025-03-01T19:14:27.032Z","repository":{"id":252963772,"uuid":"842038684","full_name":"dmickelson/KnowledgebaseProject","owner":"dmickelson","description":"This project outlines the development of a cloud-based, serverless, scalable AI-driven knowledge base platform.","archived":false,"fork":false,"pushed_at":"2024-08-14T16:23:20.000Z","size":119,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-12T10:09:16.652Z","etag":null,"topics":["aws","aws-bedrock","aws-lambda","aws-s3","chatbot","microservices","python","retrieval-augmented-generation","serverless"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dmickelson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-13T14:45:50.000Z","updated_at":"2024-08-14T16:23:23.000Z","dependencies_parsed_at":"2024-08-15T08:16:26.491Z","dependency_job_id":null,"html_url":"https://github.com/dmickelson/KnowledgebaseProject","commit_stats":null,"previous_names":["dmickelson/knowledgebaseproject"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmickelson%2FKnowledgebaseProject","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmickelson%2FKnowledgebaseProject/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmickelson%2FKnowledgebasePr
oject/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmickelson%2FKnowledgebaseProject/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dmickelson","download_url":"https://codeload.github.com/dmickelson/KnowledgebaseProject/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241411542,"owners_count":19958753,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","aws-bedrock","aws-lambda","aws-s3","chatbot","microservices","python","retrieval-augmented-generation","serverless"],"created_at":"2024-11-13T03:28:03.623Z","updated_at":"2025-03-01T19:14:27.012Z","avatar_url":"https://github.com/dmickelson.png","language":null,"readme":"# AI-Driven Knowledge Base Platform\n\n![Alt text here](knowledgebase.svg)\n\n## Overview\n\nThis project outlines the development of a cloud-based, serverless, scalable, AI-driven knowledge base platform built on a microservice architecture. The platform enables team members to ask questions that span multiple documents and to draw on prompt/chat history for extended conversations. 
The architectural design focuses on being interactive, serverless, modular, and scalable.\n\n## Goal\n\nThe primary goal was to create a platform that is:\n\n- Interactive: Remembers previous sessions to facilitate extended conversations.\n- Serverless: No infrastructure engineering is required.\n- Microservice-based: Components are modular and replaceable.\n- Scalable: Capable of scaling with the growing demand of the team.\n\nInitially, the plan was to generate embedding vectors with a separate third-party service and store them in DynamoDB. However, Amazon Bedrock provided a fully managed solution that better aligned with our needs, leading us to pivot in that direction.\n\n## Architecture Overview\n\n### File Uploading Web Interface\n\n- S3 Buckets: Used to host the static webpage, which serves as a simple UI for uploading documents into S3.\n- REST API Endpoints: Protected behind Amazon API Gateway to facilitate secure document uploads.\n- Amazon CloudFront: Employed to cache the website, ensuring faster load times and content delivery.\n- AWS Certificate Manager: Used to create an SSL/TLS certificate for secure communications.\n- Route 53: For domain registration, pointing to CloudFront endpoints, and integrating with SSL.\n- Security: Handled through a combination of Amazon Cognito, AWS IAM, and AWS KMS.\n\n### Gather Information and Import Content\n\n- S3 Buckets: Serve as the repository for unstructured documents used within the Knowledge Base.\n- Upload Process:\n  - Users can upload PDF, DOC, or TXT files to the S3 upload bucket through the web interface.\n  - Upon upload, an AWS Lambda function is triggered to validate the uploaded documents.\n  - If validation is successful, the file is moved to the designated S3 bucket; otherwise, the file is deleted.\n\n### Index Source and Create Knowledge Base\n\n- Amazon Bedrock RAG Knowledge Base:\n  - Securely connects Foundation Models (FMs) to S3 data sources.\n  - Ingests documents, retrieves relevant passages, and augments prompts with the retrieved context.\n  - Offers 
built-in session context management for multi-turn conversations.\n  - Provides automatic citations with retrievals.\n  - Embedding Model: Amazon Titan Embeddings G1 - Text.\n  - Vector Store: Amazon OpenSearch Serverless and Pinecone.\n- Amazon Kendra:\n  - Considered for future implementation.\n  - Offers better out-of-the-box integration with various other unstructured data sources.\n  - Creates an enterprise search index from S3 content sources and metadata.\n\n### Chat Interface\n\nThe chat interface serves as a generative AI Q\u0026A conversation platform, powered by the knowledge base and built on top of imported content.\n\n- Amazon Lex: Acts as the chat interface for the Q\u0026A knowledge base.\n- Model Utilization: Leverages Anthropic Claude V2 models.\n- Content Integration: Uses Amazon Bedrock as the RAG content foundation.\n\n## Conclusion\n\nThis project showcases the power of combining various AWS services to create a highly scalable, interactive, and serverless knowledge base platform. By leveraging the capabilities of Amazon Bedrock, AWS Lambda, and other cloud services, we have established a robust framework for future enhancements and integrations.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmickelson%2Fknowledgebaseproject","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdmickelson%2Fknowledgebaseproject","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmickelson%2Fknowledgebaseproject/lists"}