# AwaDB - AI Native Database for embedding vectors

Easy to Use - No boring database schema definition. No need to pay attention to vector indexing details.

Realtime Search - A lock-free realtime index keeps new data fresh with millisecond-level latency. No waiting, no manual operation.

Stability - AwaDB builds on more than 5 years of experience running production workloads at scale using a system called [Vearch](https://github.com/vearch/vearch), combined with best-of-breed ideas and practices from the community.

## Run AwaDB locally on macOS or Linux

First install awadb:
```bash
pip3 install awadb
```

Then use it as shown below:
```python
import awadb
# 1. Initialize awadb client!
awadb_client = awadb.Client()

# 2. Create table
awadb_client.Create("test_llm1")

# 3. Add sentences; each sentence is embedded with SentenceTransformer by default
# You can also embed the sentences yourself with OpenAI or other LLMs
awadb_client.Add([{'embedding_text': 'The man is happy'}, {'source': 'pic1'}])
awadb_client.Add([{'embedding_text': 'The man is very happy'}, {'source': 'pic2'}])
awadb_client.Add([{'embedding_text': 'The cat is happy'}, {'source': 'pic3'}])
awadb_client.Add([{'embedding_text': 'The man is eating'}, {'source': 'pic4'}])

# 4. Search for the top 3 sentences most similar to the specified query
query = "The man is happy"
results = awadb_client.Search(query, 3)

# Output the results
print(results)
```
Here the text is embedded by SentenceTransformer, which is available through [Hugging Face](https://huggingface.co).
For more detailed usage of the local Python library, see [here](https://ljeagle.github.io/awadb/).

## Run AwaDB as a service
If you are on Windows or want to run awadb as a service, you can download and deploy the awadb Docker image.
For Docker installation instructions, see [here](https://github.com/awa-ai/awadb/tree/main/docs/source/docker_deploy.md).

- Python Usage

First, install gRPC and the awadb service Python client:

```bash
pip3 install grpcio
pip3 install awadb-client
```

A simple example is shown below:

```python
# Import the package and module
from awadb_client import Awa

# Initialize awadb client
client = Awa()

# Add dict with vector to table 'example1'
client.add("example1", {'name':'david', 'feature':[1.3, 2.5, 1.9]})
client.add("example1", {'name':'jim', 'feature':[1.1, 1.4, 2.3]})

# Search
results = client.search("example1", [1.0, 2.0, 3.0])

# Output results
print(results)
```

The call prints a response like the one below. `_id` is the primary key of each document. It can be specified explicitly when adding documents; here no `_id` field is specified, so it is generated by the awadb server.

```text
db_name: "default"
table_name: "example1"
results {
  total: 2
  msg: "Success"
  result_items {
    score: 0.860000074
    fields {
      name: "_id"
      value: "64ddb69d-6038-4311-9118-605686d758d9"
    }
    fields {
      name: "name"
      value: "jim"
    }
  }
  result_items {
    score: 1.55
    fields {
      name: "_id"
      value: "f9f3035b-faaf-48d4-a947-801416c005b3"
    }
    fields {
      name: "name"
      value: "david"
    }
  }
}
result_code: SUCCESS
```
More on the Python SDK for the service is [here](https://ljeagle.github.io/awadb/).
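
The scores in the sample response are consistent with a squared Euclidean (L2) distance between the query and each stored vector, where a smaller score means a closer match. This is an inference from the numbers shown, not a documented guarantee about the metric the server uses. A minimal sketch that reproduces them in plain Python (the helper name `squared_l2` is hypothetical):

```python
# Hypothetical helper: squared Euclidean distance between two vectors.
def squared_l2(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))

query = [1.0, 2.0, 3.0]
docs = {
    "david": [1.3, 2.5, 1.9],
    "jim": [1.1, 1.4, 2.3],
}

# Rank documents by ascending distance (closest first).
ranked = sorted((squared_l2(query, vec), name) for name, vec in docs.items())
for score, name in ranked:
    print(f"{name}: {score:.2f}")
```

This prints `jim: 0.86` followed by `david: 1.55`, in line with the scores in the response; the trailing digits in `0.860000074` are presumably single-precision rounding.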

- RESTful Usage
```bash
# add documents to table 'test' of db 'default', no need to create table first
curl -H "Content-Type: application/json" -X POST -d '{"db":"default", "table":"test", "docs":[{"_id":1, "name":"lj", "age":23, "f":[1,0]},{"_id":2, "name":"david", "age":32, "f":[1,2]}]}' http://localhost:8080/add

# search documents by the vector field 'f' of the value '[1, 1]'
curl -H "Content-Type: application/json" -X POST -d '{"db":"default", "table":"test", "vector_query":{"f":[1, 1]}}' http://localhost:8080/search
```
A more detailed RESTful API tutorial is [here](https://github.com/awa-ai/awadb/tree/main/docs/source/restful_tutorial.md).
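
The same two requests can also be built from Python. The sketch below uses only the standard library and assumes the service listens on `http://localhost:8080`, as in the curl commands. The helper names `build_request` and `post_json` are hypothetical and not part of any awadb package.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed: same address as the curl examples


def build_request(path, payload):
    """Build a JSON POST request for the given endpoint path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_request(path, payload)) as response:
        return response.read().decode("utf-8")


# Same payloads as the curl examples.
add_payload = {
    "db": "default",
    "table": "test",
    "docs": [
        {"_id": 1, "name": "lj", "age": 23, "f": [1, 0]},
        {"_id": 2, "name": "david", "age": 32, "f": [1, 2]},
    ],
}
search_payload = {"db": "default", "table": "test", "vector_query": {"f": [1, 1]}}

# Building the requests does not contact the server.
add_request = build_request("/add", add_payload)
search_request = build_request("/search", search_payload)
```

With the service running, `print(post_json("/add", add_payload))` followed by `print(post_json("/search", search_payload))` sends the requests. The response format is not assumed here, so the body is returned as plain text.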

## What are Embeddings?

Any unstructured data (image, text, audio, or video) can be converted by AI models (LLMs or other deep neural networks) into vectors, a numeric form that computers can process.

For example, the sentence "The man is happy" can be converted to a 384-dimensional vector (a list of numbers such as `[0.23, 1.98, ...]`) by the SentenceTransformer language model. This process is called embedding.
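
To make the idea concrete, here is a deliberately simple sketch. It does not use SentenceTransformer; instead it assumes a toy bag-of-words embedding (one dimension per vocabulary word) and ranks sentences by cosine similarity. The function names are hypothetical, and real embedding models capture meaning far beyond word overlap.

```python
import math

sentences = [
    "The man is happy",
    "The man is very happy",
    "The cat is happy",
    "The man is eating",
]

# Toy vocabulary: every distinct lowercase word gets one vector dimension.
vocabulary = sorted({word for s in sentences for word in s.lower().split()})


def embed(text):
    """Toy embedding: count how often each vocabulary word occurs in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Rank all sentences by similarity to the query, most similar first.
query_vector = embed("The man is happy")
ranked = sorted(
    sentences,
    key=lambda s: cosine_similarity(query_vector, embed(s)),
    reverse=True,
)
print(ranked[:3])
```

The exact match ranks first and "The man is very happy" second, since it shares the most words with the query.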

More detailed information about embeddings is available from [OpenAI](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).

AwaDB uses [Sentence Transformers](https://huggingface.co/sentence-transformers) to embed sentences by default, but you can also use OpenAI or other LLMs to create the embeddings according to your needs.

## Get involved

- [Issues and PR](https://github.com/awa-ai/awadb/issues)
- [Roadmap and Contribution](https://github.com/awa-ai/awadb/blob/main/ROADMAP.md)

## License

[Apache 2.0](./LICENSE)

## Community

Join the AwaDB community to share problems, suggestions, or discussions with us:

- [Discord](https://discord.gg/GP7QxRrDjB)
- [Slack](https://awadbhq.slack.com)
- [Reddit](https://www.reddit.com/r/Awadb/)