https://github.com/jbellis/coherepedia-jvector



# Coherepedia-JVector

This project indexes the [Cohere v3 Wikipedia dataset](https://huggingface.co/datasets/Cohere/wikipedia-2023-11-embed-multilingual-v3) using [JVector](https://github.com/jbellis/jvector).

# Setup

Edit `download.py` to set the location where you want to save the 180GB dataset.
Then edit `Main.java` to point to the same location.
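
Because the dataset is large, it can be worth checking that the chosen location has enough free space before starting the download. The following is a minimal sketch and is not part of this repository: the helper names and the `COHEREPEDIA_DATA_DIR` environment variable are hypothetical, and the 180GB figure is read as decimal gigabytes.

```python
import os
import shutil

# The "180GB" figure from the setup notes, assumed to mean decimal gigabytes.
DATASET_SIZE_BYTES = 180 * 10**9


def nearest_existing_dir(path):
    """Walk up from path until a directory that already exists is found."""
    path = os.path.abspath(path)
    while not os.path.isdir(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def has_room(target_dir, required_bytes=DATASET_SIZE_BYTES):
    """Return True if the filesystem holding target_dir has enough free space."""
    free = shutil.disk_usage(nearest_existing_dir(target_dir)).free
    return free >= required_bytes


if __name__ == "__main__":
    # Hypothetical environment variable; the repository itself expects the
    # path to be edited directly in download.py and Main.java.
    target = os.environ.get("COHEREPEDIA_DATA_DIR", os.path.expanduser("~/coherepedia"))
    print(target, "has room:", has_room(target))
```

The check only looks at free space at the moment it runs, so it is a sanity check rather than a guarantee that the download will fit.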

# Usage

Run the `Main` class. There are no Maven targets for this, so the easiest approach is to import the project into your IDE and run it from there.