https://github.com/arjun-g/kb-bot.core
- Host: GitHub
- URL: https://github.com/arjun-g/kb-bot.core
- Owner: arjun-g
- License: mit
- Created: 2024-08-23T03:31:45.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-08-23T18:35:24.000Z (8 months ago)
- Last Synced: 2024-11-06T03:51:10.263Z (6 months ago)
- Language: Python
- Size: 20.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# kb-bot.core
The core library of kb-bot. It handles content indexing and chatting with the bot.
## Getting Started
```bash
pip install kb-bot.core
```
## Preparing the KB
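The example in this section reads its TiDB connection settings and an OpenAI API key from environment variables. A minimal pre-flight check could look like the sketch below. It assumes only the variable names that appear in the example itself; `REQUIRED_ENV_VARS` and `missing_env_vars` are hypothetical names for illustration and are not part of kb-bot.core.

```python
import os

# Names taken from the comment in the indexing example, plus OPENAI_API_KEY,
# which the example reads explicitly via os.environ.
REQUIRED_ENV_VARS = (
    "TIDB_DATABASE",
    "TIDB_USERNAME",
    "TIDB_PASSWORD",
    "TIDB_HOST",
    "TIDB_PORT",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of kb-bot.core.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Demonstration with a sample mapping instead of the real environment.
print(missing_env_vars({"TIDB_HOST": "127.0.0.1"}))
```

Calling `missing_env_vars()` with no argument checks the real process environment, so an empty result means the indexing example that follows has the settings it expects.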
```python
import os
from kb_bot import KBBot
from kb_bot.scraper import WebScraper
from kb_bot.db import TiDBProvider
from kb_bot.chunkers import SemanticChunker
from kb_bot.embedding import OpenAIEmbedClient
from kb_bot.llm import OpenAI

# Configure env variables TIDB_DATABASE, TIDB_USERNAME, TIDB_PASSWORD, TIDB_HOST, TIDB_PORT
tidb = TiDBProvider()
tidb.connect()

embed_client = OpenAIEmbedClient(
    api_key=os.environ.get('OPENAI_API_KEY')
)

scraper = WebScraper(
    urls=["https://www.pingcap.com/blog/"],
    follow_links=True,
    restrict_navigation_css=".tmpl-archive.tmpl-archive-blog",
    restrict_css=".tmpl-single-post__content",
    ignore_css=".tmpl-archive-sidebar",
    chunker=SemanticChunker(
        embedding_client=embed_client
    ),
    db_provider=tidb,
    group="",
    embedding_client=embed_client
)

scraper.crawl()
```
## Chat with the bot
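This README does not describe how the bot picks context for an answer. A common approach for knowledge-base bots is to embed the question and rank the stored chunks by cosine similarity; that is an assumption here, not a description of kb-bot.core's internals. The toy sketch below uses hand-written vectors in place of real embeddings, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to query_vec.

    `chunks` is a list of (text, vector) pairs.
    """
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up 3-dimensional vectors standing in for real embeddings.
chunks = [
    ("Vector search finds semantically similar text.", [0.9, 0.1, 0.0]),
    ("TiDB is a distributed SQL database.", [0.1, 0.9, 0.0]),
    ("Release notes for an unrelated product.", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.2, 0.0]
print(top_k(query, chunks, k=2))
```

In the library itself, the entry point for asking a question is `KBBot.chat`: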
```python
# Reuses os, KBBot, OpenAI, tidb and embed_client from the previous section
bot = KBBot(
    db_provider=tidb,
    embedding_client=embed_client,
    group="new-user-id",
    llm_client=OpenAI(
        api_key=os.environ.get('OPENAI_API_KEY')
    ),
    history=[],
    tasks_prompt=""
)

response = bot.chat(message="What are the benefits of vector search?")
print(response)
```
## TODO
- Add support for multiple LLMs
- Implement Agentic Chunker
- Add test cases