https://github.com/wb-az/ai-performance-analytics-llms

This repository was created to compare foundational models at different tasks, using visualisation and statistics.

anova bard-ai contingency-analysis contingency-table foundational-models kruskal-wallis large-language-models likert-data pandas-python plotly-python posthoc-tests prompts scikit-posthocs scipy scipy-optimize scipy-stats statsmodels

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/wb-az/ai-performance-analytics-llms
Owner: Wb-az
Created: 2025-02-21T16:34:09.000Z (4 months ago)
Default Branch: main
Last Pushed: 2025-03-14T11:32:13.000Z (3 months ago)
Last Synced: 2025-03-14T12:23:11.686Z (3 months ago)
Topics: anova, bard-ai, contingency-analysis, contingency-table, foundational-models, kruskal-wallis, large-language-models, likert-data, pandas-python, plotly-python, posthoc-tests, prompts, scikit-posthocs, scipy, scipy-optimize, scipy-stats, statsmodels
Language: Jupyter Notebook
Homepage:
Size: 1.82 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# AI Analytics - Comparing Performance of GenAI Models

This repository was created to compare the performance of foundational models at different tasks and levels of complexity, using visualisation and statistics.

**Data**:

Data was provided by [DataAnnotationTech](https://www.dataannotation.tech/)

**Categories**:

1. Adversarial Dishonesty
2. Adversarial Harmfulness
3. Brainstorming
4. Classification
5. Closed QA
6. Creative Writing
7. Extraction
8. Mathematical Reasoning
10. Open QA
11. Rewriting
12. Poetry
13. Rewriting
14. Summarization

**Likert-type rating scale**:

1. Bard much better
2. Bard better
3. Bard slightly better
4. About the same
5. ChatGPT slightly better
6. ChatGPT better
7. ChatGPT much better
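
As a rough illustration of how the scale above could be prepared for analysis, the sketch below maps each label to its 1-7 score and builds a category-by-rating contingency table with ```pandas```. The column names (`category`, `rating`, `score`) and the sample rows are assumptions made up for this example; they are not the repository's data or notebook code.

```python
import pandas as pd

# The seven labels of the rating scale, in order (1 = strongest Bard preference).
LIKERT_LABELS = [
    "Bard much better",
    "Bard better",
    "Bard slightly better",
    "About the same",
    "ChatGPT slightly better",
    "ChatGPT better",
    "ChatGPT much better",
]
LABEL_TO_SCORE = {label: i for i, label in enumerate(LIKERT_LABELS, start=1)}

# Hypothetical ratings, invented for illustration only.
ratings = pd.DataFrame({
    "category": ["Poetry", "Poetry", "Open QA", "Open QA", "Extraction"],
    "rating": [
        "Bard better",
        "About the same",
        "ChatGPT much better",
        "ChatGPT slightly better",
        "About the same",
    ],
})

# Numeric score for each label (1-7), suitable for rank-based tests.
ratings["score"] = ratings["rating"].map(LABEL_TO_SCORE)

# Contingency table: task category (rows) by rating label (columns).
# Reindexing keeps all seven labels in scale order, even those never used.
contingency = pd.crosstab(ratings["category"], ratings["rating"]).reindex(
    columns=LIKERT_LABELS, fill_value=0
)

# Median score per category; 4 is the neutral midpoint of the scale.
median_score = ratings.groupby("category")["score"].median()
```

A median below 4 would point towards Bard and a median above 4 towards ChatGPT, under the ordering listed above.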

Tools used: ```pandas```, ```plotly```, ```statsmodels```, ```scipy``` and ```scikit-posthocs```
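
The README does not show the analysis itself, so the following is only a minimal sketch of how a Kruskal-Wallis test across task categories might look with ```scipy```. The scores are invented, and the pairwise Mann-Whitney U tests with a Bonferroni correction are a simple stand-in for a post-hoc step, not necessarily the procedure the notebooks run with ```scikit-posthocs```.

```python
from itertools import combinations

from scipy import stats

# Hypothetical 1-7 Likert scores per task category, invented for illustration.
scores_by_category = {
    "Poetry": [2, 3, 2, 4, 3, 2],
    "Open QA": [6, 7, 5, 6, 7, 6],
    "Extraction": [4, 4, 5, 3, 4, 4],
}

# Kruskal-Wallis H-test: do all categories share the same rating distribution?
h_stat, p_value = stats.kruskal(*scores_by_category.values())

# Stand-in post-hoc step: pairwise Mann-Whitney U tests,
# Bonferroni-corrected by the number of comparisons.
pairs = list(combinations(scores_by_category, 2))
posthoc = {}
for a, b in pairs:
    _, p = stats.mannwhitneyu(
        scores_by_category[a], scores_by_category[b], alternative="two-sided"
    )
    posthoc[(a, b)] = min(1.0, p * len(pairs))

print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_value:.4f}")
for (a, b), p_adj in posthoc.items():
    print(f"{a} vs {b}: adjusted p = {p_adj:.4f}")
```

A rank-based test is used here because Likert ratings are ordinal, so differences between adjacent scale points cannot be assumed equal.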
