https://github.com/wb-az/ai-performance-analytics-llms

This repository was created to compare foundational models at different tasks, using visualisation and statistics.

anova bard-ai contingency-analysis contingency-table foundational-models kruskal-wallis large-language-models likert-data pandas-python plotly-python posthoc-tests prompts scikit-posthocs scipy scipy-optimize scipy-stats statsmodels


# AI Analytics - Comparing Performance of GenAI Models

This repository was created to compare the performance of foundational models at different tasks and levels of complexity, using visualisation and statistics.

**Data**:

Data was provided by [DataAnnotationTech](https://www.dataannotation.tech/).

**Categories**:

1. Adversarial Dishonesty
2. Adversarial Harmfulness
3. Brainstorming
4. Classification
5. Closed QA
6. Creative Writing
7. Extraction
8. Mathematical Reasoning
9. Open QA
10. Rewriting
11. Poetry
12. Summarization

**Likert-type rating scale**:

1. Bard much better
2. Bard better
3. Bard slightly better
4. About the same
5. ChatGPT slightly better
6. ChatGPT better
7. ChatGPT much better
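
One way to work with this scale in `pandas` is to treat the seven labels as an ordered categorical and, where useful, collapse them into a three-way preference. The following is a minimal sketch under stated assumptions: the column names (`category`, `rating`, `label`, `preference`), the `preference` helper, and the sample ratings are all hypothetical and are not taken from the repository's data or notebooks.

```python
import pandas as pd

# The seven scale labels in order (1 = Bard much better ... 7 = ChatGPT much better).
LIKERT_LABELS = [
    "Bard much better",
    "Bard better",
    "Bard slightly better",
    "About the same",
    "ChatGPT slightly better",
    "ChatGPT better",
    "ChatGPT much better",
]

# Hypothetical ratings, invented purely for illustration (not repository data).
ratings = pd.DataFrame(
    {
        "category": ["Poetry", "Poetry", "Extraction", "Extraction", "Open QA"],
        "rating": [2, 5, 4, 7, 6],
    }
)

# Map numeric codes to an ordered categorical so sorting and min/max respect the scale.
likert_dtype = pd.CategoricalDtype(categories=LIKERT_LABELS, ordered=True)
ratings["label"] = (
    ratings["rating"].map(lambda code: LIKERT_LABELS[code - 1]).astype(likert_dtype)
)


def preference(code: int) -> str:
    """Collapse a 1-7 code: below 4 favours Bard, above 4 favours ChatGPT."""
    if code < 4:
        return "Bard"
    if code > 4:
        return "ChatGPT"
    return "Tie"


ratings["preference"] = ratings["rating"].apply(preference)
print(ratings)
```

Keeping the numeric code alongside the label lets rank-based tests use the integers while plots and tables use the readable labels.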

Tools used: ```pandas```, ```plotly```, ```statsmodels```, ```scipy``` and ```scikit-posthocs```
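
As a rough illustration of how these tools can fit together, the sketch below runs a Kruskal-Wallis test across task categories and a chi-square test on a contingency table of category versus preferred model. It is an assumption about the kind of analysis involved, not the repository's actual pipeline: the data frame, its column names, and the ratings are made up for the example.

```python
import pandas as pd
from scipy import stats

# Hypothetical 1-7 ratings per task category, invented for illustration only.
data = pd.DataFrame(
    {
        "category": ["Poetry"] * 6 + ["Extraction"] * 6 + ["Open QA"] * 6,
        "rating": [1, 2, 2, 3, 2, 1, 4, 4, 5, 3, 4, 5, 6, 7, 6, 5, 7, 6],
    }
)

# Kruskal-Wallis H-test: do the rating distributions differ between categories?
groups = [group["rating"].to_numpy() for _, group in data.groupby("category")]
h_stat, p_kw = stats.kruskal(*groups)

# Collapse ratings into a preference (1-3 Bard, 4 Tie, 5-7 ChatGPT).
data["preference"] = data["rating"].apply(
    lambda code: "Bard" if code < 4 else ("ChatGPT" if code > 4 else "Tie")
)

# Contingency table of category versus preference, then a chi-square test of independence.
# With counts this small the chi-square approximation is rough; this only shows the mechanics.
table = pd.crosstab(data["category"], data["preference"])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

print(f"Kruskal-Wallis: H={h_stat:.2f}, p={p_kw:.4f}")
print(table)
print(f"Chi-square: chi2={chi2:.2f}, dof={dof}, p={p_chi2:.4f}")
```

If the Kruskal-Wallis test indicates a difference, a pairwise post hoc comparison such as Dunn's test (available in `scikit-posthocs` as `posthoc_dunn`) is a common next step for finding which categories differ.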