Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nitin-bommi/qa-evaluation
Last synced: 25 days ago
- Host: GitHub
- URL: https://github.com/nitin-bommi/qa-evaluation
- Owner: nitin-bommi
- Created: 2024-03-24T21:17:11.000Z (8 months ago)
- Default Branch: master
- Last Pushed: 2024-04-23T17:34:47.000Z (7 months ago)
- Last Synced: 2024-04-23T19:21:59.646Z (7 months ago)
- Language: Jupyter Notebook
- Size: 436 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# QA-evaluation
Questions and answers taken from: https://github.com/youssefHosni/Data-Science-Interview-Questions-Answers/blob/main/Deep%20Learning%20Questions%20%26%20Answers%20for%20Data%20Scientists.md
Run the QA.ipynb notebook.
Workflow:
1) The dataset contains the question, the student answer, and the textbook from which the question was taken.
2) We pass the question to the RAG application, which generates an answer from the textbook; we treat this as the teacher answer.
3) We then pass the student answer from the dataset and the teacher answer to the evaluation code.
4) The evaluation code produces a score.

Evaluation Results:
1) We need to show (i) how well RAG performs, that is, whether it produces answers similar to the ground-truth teacher answers, and (ii) how well our evaluation metric scores answers, that is, how close our model's scores are to the ground-truth scores.
(i) RAG performance:
- We compare the answer that RAG generates from the textbook context with the ground-truth answer (the dataset has no ground-truth answers, so we scrape them from the web).
- We use BERTScore to compare them.

(ii) Student Answer Evaluation:
- Pre-processing: acronym expansion, spelling correction, coreference resolution, sentence tokenization.
- Processing: sentence-wise semantic score, immutable phrase extraction.
- Scoring: dynamic weighted scoring (based on correct phrases and semantic scores).

![image](https://github.com/nitin-bommi/QA-evaluation/assets/56592896/56509417-fc90-4a29-a422-aa9e8c386293)
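
The student answer evaluation steps (sentence-wise semantic score, phrase matching, dynamic weighted scoring) can be illustrated with a minimal sketch. It is not the code from QA.ipynb. Everything in it is an assumption made for illustration: the function names, the bag-of-words cosine similarity used as a stand-in for a real semantic model, the verbatim phrase matching, and the rule that sets the phrase weight to 0.1 per key phrase with a cap of 0.5.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens (a stand-in for the real pre-processing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence tokenization on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_score(student, teacher):
    """Average, over teacher sentences, of the best-matching student sentence."""
    student_vecs = [Counter(tokenize(s)) for s in split_sentences(student)]
    teacher_vecs = [Counter(tokenize(s)) for s in split_sentences(teacher)]
    if not student_vecs or not teacher_vecs:
        return 0.0
    best = [max(cosine(t, s) for s in student_vecs) for t in teacher_vecs]
    return sum(best) / len(best)


def phrase_score(student, key_phrases):
    """Fraction of key phrases that appear verbatim in the student answer."""
    if not key_phrases:
        return 0.0
    # Pad with spaces so a phrase only matches on whole-token boundaries.
    text = " " + " ".join(tokenize(student)) + " "
    hits = sum(1 for p in key_phrases if " " + " ".join(tokenize(p)) + " " in text)
    return hits / len(key_phrases)


def dynamic_weighted_score(student, teacher, key_phrases, max_phrase_weight=0.5):
    """Blend both scores; the phrase weight grows with the number of key phrases.

    Assumed rule (hypothetical, not taken from the repository):
    0.1 per key phrase, capped at max_phrase_weight.
    """
    w_phrase = min(max_phrase_weight, 0.1 * len(key_phrases))
    w_semantic = 1.0 - w_phrase
    return (w_semantic * semantic_score(student, teacher)
            + w_phrase * phrase_score(student, key_phrases))
```

Calling `dynamic_weighted_score(student_answer, teacher_answer, key_phrases)` returns a value between 0 and 1. In a real pipeline the cosine function would likely be replaced by a sentence embedding model, and the key phrases would come from the phrase extraction step rather than being passed in by hand.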