{"id":26327236,"url":"https://github.com/wb-az/ai-performance-analytics-llms","last_synced_at":"2025-03-15T20:17:57.321Z","repository":{"id":278781939,"uuid":"936761131","full_name":"Wb-az/AI-Performance-Analytics-LLMs","owner":"Wb-az","description":"This repository is created to compared foundational models at different task, using visualisation and statistics.","archived":false,"fork":false,"pushed_at":"2025-03-14T11:32:13.000Z","size":1912,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-14T12:23:11.686Z","etag":null,"topics":["anova","bard-ai","contingency-analysis","contingency-table","foundational-models","kruskal-wallis","large-language-models","likert-data","pandas-python","plotly-python","posthoc-tests","prompts","scikit-posthocs","scipy","scipy-optimize","scipy-stats","statsmodels"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Wb-az.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-21T16:34:09.000Z","updated_at":"2025-03-14T11:32:16.000Z","dependencies_parsed_at":"2025-02-21T17:34:02.817Z","dependency_job_id":"fff6e89c-86c1-476c-a03c-bea335eecfa5","html_url":"https://github.com/Wb-az/AI-Performance-Analytics-LLMs","commit_stats":null,"previous_names":["wb-az/ai-analytics"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wb-az%2FAI-Performance-Analytics-LLMs","tags_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/repositories/Wb-az%2FAI-Performance-Analytics-LLMs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wb-az%2FAI-Performance-Analytics-LLMs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wb-az%2FAI-Performance-Analytics-LLMs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Wb-az","download_url":"https://codeload.github.com/Wb-az/AI-Performance-Analytics-LLMs/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243785077,"owners_count":20347409,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anova","bard-ai","contingency-analysis","contingency-table","foundational-models","kruskal-wallis","large-language-models","likert-data","pandas-python","plotly-python","posthoc-tests","prompts","scikit-posthocs","scipy","scipy-optimize","scipy-stats","statsmodels"],"created_at":"2025-03-15T20:17:56.722Z","updated_at":"2025-03-15T20:17:57.315Z","avatar_url":"https://github.com/Wb-az.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Analytics - Comparing Performance of GenAI Models\n\nThis repository was created to compare the performance of foundational models at different tasks and levels of complexity, using visualisation and statistics.\n\n**Data**:\n\nData was provided by [DataAnnotationTech](https://www.dataannotation.tech/)\n\n\n**Categories**:\n\n1. Adversarial Dishonesty\n2. Adversarial Harmfulness\n3. Brainstorming\n4. 
Classification\n5. Closed QA\n6. Creative Writing\n7. Extraction\n8. Mathematical Reasoning\n10. Open QA\n11. Rewriting\n12. Poetry\n13. Rewriting\n14. Summarization\n\n\n**Likert-type rating scale**\n\n1. Bard much better\n2. Bard better\n3. Bard slightly better\n4. About the same\n5. ChatGPT slightly better\n6. ChatGPT better\n7. ChatGPT much better\n\nTools used: ```pandas```, ```plotly```, ```statsmodels```, ```scipy``` and ```scikit-posthocs```\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwb-az%2Fai-performance-analytics-llms","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwb-az%2Fai-performance-analytics-llms","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwb-az%2Fai-performance-analytics-llms/lists"}