https://github.com/IngestAI/embedditor

⚡ GUI for editing LLM vector embeddings. No more blind chunking. Upload content in any file extension, join and split chunks, edit metadata and embedding tokens + remove stop-words and punctuation with one click, add images, and download in .veml to share it with your team.
datapreprocessing datascience embedding-vectors embeddings genai laravel llm markup-language ml nlp nltk php vector-database vector-search vectorization veml

Host: GitHub
URL: https://github.com/IngestAI/embedditor
Owner: IngestAI
License: agpl-3.0
Created: 2023-05-19T13:17:04.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2023-11-21T16:28:56.000Z (over 1 year ago)
Last Synced: 2024-10-31T03:35:34.401Z (8 months ago)
Topics: datapreprocessing, datascience, embedding-vectors, embeddings, genai, laravel, llm, markup-language, ml, nlp, nltk, php, vector-database, vector-search, vectorization, veml
Language: PHP
Homepage: https://embedditor.ai
Size: 1.74 MB
Stars: 221
Watchers: 3
Forks: 15
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

README

Embedditor

Embedditor is the open-source MS Word equivalent for embedding that helps you get the most out of your vector search.

[![PHP version](https://img.shields.io/badge/PHP%208.2-brightgreen)](https://www.php.net)
[![Laravel version](https://img.shields.io/badge/Laravel%2010.x-green.svg)](https://laravel.com)

Website • Discord • Twitter • Documentation • Try demo on IngestAI

# Get the most out of your vector search

Embedditor is an open-source embedding pre-processing editor that lets you edit GPT / LLM embeddings as if they were a Microsoft Word document, so you can get the most out of your vector search while significantly reducing the costs of embedding and vector storage.

# Join Our Community

[![Stargazers repo roster for @embedditor/embedditor](https://reporoster.com/stars/embedditor/embedditor)](https://github.com/embedditor/embedditor/stargazers)

# Features
**Rich editor Interface**

- ⚡ Join and split one or multiple chunks with a few clicks
- ⚡ Edit embedding metadata and tokens
- ⚡ Exclude words, sentences, or even parts of chunks from embedding
- ⚡ Select the parts of a chunk you want to be embedded
- ⚡ Add additional information to your embeddings, like URL links or images
- ⚡ Get nice-looking HTML markup for your AI search results
- ⚡ Save your pre-processed embedding files in .veml or .json formats
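
To make the join and split operations above concrete, here is a minimal Python sketch. It is not Embedditor's own code (the project is written in PHP); the `Chunk` class and the `join_chunks` and `split_chunk` helpers are hypothetical names used only for illustration, and the metadata-merging rule is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical chunk record: text plus free-form metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def join_chunks(chunks, separator="\n"):
    """Merge several chunks into one; later metadata keys win on conflict."""
    merged = {}
    for chunk in chunks:
        merged.update(chunk.metadata)
    return Chunk(separator.join(c.text for c in chunks), merged)


def split_chunk(chunk, position):
    """Split a chunk at a character offset; each half gets a metadata copy."""
    left = Chunk(chunk.text[:position].rstrip(), dict(chunk.metadata))
    right = Chunk(chunk.text[position:].lstrip(), dict(chunk.metadata))
    return left, right


a = Chunk("Embedditor edits embeddings.", {"source": "intro.txt"})
b = Chunk("It joins and splits chunks.", {"url": "https://embedditor.ai"})

joined = join_chunks([a, b])
first, second = split_chunk(joined, len(a.text))
```

Splitting the joined chunk at the original boundary gives back the two original texts, while both halves now carry the merged metadata.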

**Pre-processing automation**
- ⚡ Filter out most of the 'noise', such as punctuation and stop-words, from vectorization
- ⚡ Remove insignificant, frequently used words from embedding with the TF-IDF algorithm
- ⚡ Normalize your embedding tokens before vectorization
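
The pre-processing steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Embedditor's implementation: the stop-word list is a tiny placeholder, and the `normalize` and `tfidf_filter` helpers and the `min_score` threshold are hypothetical names and values chosen for the example.

```python
import math
import string
from collections import Counter

# Tiny illustrative stop-word list; a real one (e.g. from NLTK) is far longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for"}


def normalize(text):
    """Lowercase, strip ASCII punctuation, and drop stop-words."""
    table = str.maketrans("", "", string.punctuation)
    tokens = text.lower().translate(table).split()
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_filter(docs, min_score=0.15):
    """Keep tokens whose TF-IDF score within their chunk reaches min_score."""
    n = len(docs)
    # Document frequency: in how many chunks each token appears.
    df = Counter(token for doc in docs for token in set(doc))
    kept = []
    for doc in docs:
        tf = Counter(doc)
        kept.append([
            t for t in doc
            if (tf[t] / len(doc)) * math.log(n / df[t]) >= min_score
        ])
    return kept


chunks = [
    "The search engine indexes the documents.",
    "The search engine ranks documents, quickly!",
    "Search results include images.",
]
tokenized = [normalize(c) for c in chunks]
filtered = tfidf_filter(tokenized)
```

A token that appears in every chunk (here `search`) gets an inverse document frequency of zero and is dropped, which is how TF-IDF removes frequently used words that carry little signal. The threshold controls how aggressive the filter is and would need tuning on real data.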

# Benefits
**Rich Spreadsheet Interface**

- ⚡ Optimized relevance of the content retrieved from a vector database
- ⚡ Improved efficiency and accuracy in your AI / LLM-related applications
- ⚡ Better-looking search results with images, URL links, etc.
- ⚡ Increased cost-efficiency, with up to 30% cost reduction on embedding and vector storage
- ⚡ Full control over your data: deploy Embedditor locally on your PC or in a dedicated environment
- ⚡ Save your pre-processed or ready embeddings in .json or .veml format to use them in LangChain, Chroma, or any other vector DB

## Quick try
**Sign up for free and try it in [IngestAI](https://ingestai.io/signup).**

# GUI

Access Dashboard using: [http://localhost:8080/](http://localhost:8080/)

# Screenshots

![1](https://embedditor.ai/images/embedditor_ui_01.png)
![2](https://embedditor.ai/images/embedditor_ui_02.png)
![3](https://embedditor.ai/images/embedditor_ui_03.png)
![4](https://embedditor.ai/images/embedditor_ui_04.png)

## Installation

1. Copy `.env.example` to `.env`

2. Set the following setting in `.env`

`OPENAI_API_KEY=`

3. Set up the project

- `php artisan migrate`
- `php artisan db:seed`
- `php artisan storage:link`
