https://github.com/jayantaadhikary/dataset-error-reduction

Reducing Error in NLP Datasets using LLMs
https://github.com/jayantaadhikary/dataset-error-reduction

Last synced: 4 months ago
JSON representation

Reducing Error in NLP Datasets using LLMs

Host: GitHub
URL: https://github.com/jayantaadhikary/dataset-error-reduction
Owner: jayantaadhikary
Created: 2024-01-25T05:14:48.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-06-01T06:22:45.000Z (11 months ago)
Last Synced: 2024-06-02T07:50:27.259Z (11 months ago)
Language: Jupyter Notebook
Size: 6.42 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome_ai_agents - Dataset-Error-Reduction - Reducing Error in NLP Datasets using LLMs (Building / Datasets)
awesome_ai_agents - Dataset-Error-Reduction - Reducing Error in NLP Datasets using LLMs (Building / Datasets)

README

        # Dataset Error Reduction

To reduce error in NLP Datasets using LLMs.

The datasets used currently are [SQuAD](https://rajpurkar.github.io/SQuAD-explorer) and [RACE](https://www.cs.cmu.edu/~glai1/data/race) which are Extractive Question Answering Datasets.

### Use in your system

Fork and Clone this Repository.

1. If you are using a OpenAI API key, create a `.env` file in the same directory and add your OpenAI API key.   

`OPENAI_API_KEY = YOUR_API_KEY`

2. If you are using a LLM model locally using a local server, then use the files inside of the LocalLLM folder instead of the default files. The folder /ollama consists of my notebooks which were using Ollama to run LLMs locally. 

PS - I use [LM Studio](https://lmstudio.ai/) and [Ollama](https://ollama.ai)to run the LLMs locally. 

#### References:

1. [SQuAD: 100,000+ Questions for Machine Comprehension of Text](https://aclanthology.org/D16-1264) (Rajpurkar et al., EMNLP 2016)

2. [RACE: Large-scale ReAding Comprehension Dataset From Examinations](https://aclanthology.org/D17-1082) (Lai et al., EMNLP 2017)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/jayantaadhikary/dataset-error-reduction

Awesome Lists containing this project

README