Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with evaluation-metrics
A curated list of projects in awesome lists tagged with evaluation-metrics.
https://github.com/confident-ai/deepeval
The LLM Evaluation Framework
evaluation-framework evaluation-metrics llm-evaluation llm-evaluation-framework llm-evaluation-metrics
Last synced: 01 Oct 2024
https://github.com/agentops-ai/agentops
Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks such as CrewAI, LangChain, and AutoGen.
agent agentops ai anthropic autogen cost-estimation crewai evals evaluation-metrics groq langchain llm mistral ollama openai
Last synced: 26 Sep 2024
https://github.com/xinshuoweng/ab3dmot
(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
2d-mot-evaluation 3d-mot 3d-multi 3d-multi-object-tracking 3d-tracking computer-vision evaluation evaluation-metrics kitti kitti-3d machine-learning multi-object-tracking real-time robotics tracking
Last synced: 30 Sep 2024
https://github.com/google-research/rliable
[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.
benchmarking evaluation-metrics google machine-learning reinforcement-learning rl
Last synced: 02 Aug 2024
https://github.com/MIND-Lab/OCTIS
OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)
bayesian-optimization evaluation-metrics hyperparameter-optimization hyperparameter-search hyperparameter-tuning latent-dirichlet-allocation latent-semantic-analysis natural-language-processing neural-topic-models nlp nlp-library nlproc non-negative-matrix-factorization topic-modeling topic-models
Last synced: 02 Aug 2024
https://github.com/jitsi/jiwer
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
automatic-speech-recognition evaluation-metrics python3 speech-to-text wer word-error-rate
Last synced: 01 Aug 2024
https://github.com/up42/image-similarity-measures
Implementation of eight evaluation metrics to assess the similarity between two images. The eight metrics are as follows: RMSE, PSNR, SSIM, ISSM, FSIM, SRE, SAM, and UIQ.
evaluation-metrics image machine-learning metrics p1 processing
Last synced: 03 Aug 2024
https://github.com/proycon/pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build a simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
computational-linguistics evaluation-metrics folia language-modelling library linguistics machine-learning natural-language-processing nlp nlp-library python search-algorithms text-processing
Last synced: 02 Oct 2024
https://github.com/huggingface/lighteval
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
evaluation evaluation-framework evaluation-metrics huggingface
Last synced: 01 Aug 2024
https://github.com/Unbabel/COMET
A Neural Framework for MT Evaluation
artificial-intelligence evaluation-metrics machine-learning machine-translation natural-language-processing nlp
Last synced: 07 Aug 2024
https://github.com/AmenRa/ranx
A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion
comparison data-fusion evaluation evaluation-metrics information-retrieval information-retrieval-evaluation information-retrieval-metrics metasearch numba python rank-fusion ranking-metrics recommender-systems score-fusion
Last synced: 01 Aug 2024
https://github.com/v-iashin/SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
audio audio-generation bmvc evaluation-metrics gan melgan multi-modal pytorch transformer vas vggsound video video-features video-understanding vqvae
Last synced: 01 Aug 2024
https://github.com/relari-ai/continuous-eval
Open-Source Evaluation for GenAI Application Pipelines
evaluation-framework evaluation-metrics information-retrieval llm-evaluation llmops rag retrieval-augmented-generation
Last synced: 01 Aug 2024
https://github.com/salesforce/factCC
Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper
evaluation-metrics text-summarization
Last synced: 01 Aug 2024
https://github.com/TonicAI/tonic_validate
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
evaluation-framework evaluation-metrics large-language-models llm llmops llms rag retrieval-augmented-generation
Last synced: 01 Aug 2024
https://github.com/davidsbatista/NER-Evaluation
An implementation of full named-entity evaluation metrics based on SemEval'13 Task 9, computed not at the tag/token level but over all the tokens that are part of the named entity
crfsuite evaluation-metrics named-entity-recognition ner ner-evaluation notebook-jupyter semeval semeval-2013
Last synced: 07 Aug 2024
https://github.com/clovaai/CLEval
CLEval: Character-Level Evaluation for Text Detection and Recognition Tasks
end-to-end-ocr evaluation-metrics text-detection text-detection-recognition text-recognition
Last synced: 07 Aug 2024
https://github.com/MantisAI/nervaluate
Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval'13
evaluation-metrics machine-learning named-entity-recognition natural-language-processing sequence-models
Last synced: 02 Aug 2024
https://github.com/athina-ai/athina-evals
Python SDK for running evaluations on LLM generated responses
evaluation evaluation-framework evaluation-metrics llm-eval llm-evaluation llm-evaluation-toolkit llm-ops llmops
Last synced: 01 Aug 2024
https://github.com/jantrienes/nereval
Evaluation script for named entity recognition (NER) systems based on entity-level F1 score.
evaluation-metrics machine-learning named-entity-recognition nlp
Last synced: 07 Aug 2024
https://github.com/silviatti/topic-model-diversity
A collection of topic diversity measures for topic modeling
evaluation-metrics gensim latent-dirichlet-allocation lda topic-diversity topic-diversity-measures topic-model topic-modeling topic-modeling-analysis topic-models
Last synced: 02 Aug 2024
https://github.com/sharmaroshan/Insurance-Claim-Prediction
Predicts the insurance claim for each user in the data set. Machine learning algorithms for regression analysis are used, and data visualization is performed to support the analysis.
beginner classification data-analysis data-visualization eda evaluation-metrics finance machine-learning radar-chart
Last synced: 08 Aug 2024
https://github.com/orchardbirds/bokbokbok
Custom Loss Functions and Evaluation Metrics for XGBoost and LightGBM
binary-classification custom-loss-functions evaluation-metrics focal-loss lightgbm loss-functions regression rmspe xgboost
Last synced: 02 Oct 2024
https://github.com/aldenhovel/bleu-rouge-meteor-cider-spice-eval4imagecaption
Evaluation tools for image captioning, including BLEU, ROUGE-L, CIDEr, METEOR, and SPICE scores.
bleu cider evaluation-metrics image-captioning meteor rouge-l spice
Last synced: 30 Sep 2024
https://github.com/kenlimmj/fightin-words
A scikit-learn compliant implementation of Monroe et al.'s Fightin' Words analysis method.
bayesian-methods evaluation-metrics nlp scikit-learn
Last synced: 28 Sep 2024
https://github.com/34j/boost-loss
Utilities for easy use of custom losses in CatBoost, LightGBM, XGBoost.
autograd catboost custom-loss custom-loss-functions evaluation-metrics gbdt gradient-boosting hacktoberfest lightgbm pytorch scikit-learn sklearn sklearn-compatible xgboost
Last synced: 30 Sep 2024
https://github.com/eonu/daze
Better multi-class confusion matrix plots for Scikit-Learn, incorporating per-class and overall evaluation measures.
accuracy classification confusion-matrix confusion-matrix-heatmap evaluation evaluation-metrics f1-score measures multi-class plot precision recall scikit-learn
Last synced: 26 Sep 2024
https://github.com/anindyadeep/easy_eval
A lightweight modular lm-evaluation framework to build evaluation workflows for your LLMs
evaluation evaluation-metrics llama llm
Last synced: 27 Sep 2024
https://github.com/ngmarchant/activeeval
A Python package for pool-based active evaluation
active-evaluation crowdsourcing evaluation evaluation-metrics python-package
Last synced: 01 Aug 2024
https://github.com/fracpete/rmspe-weka-package
Weka package that adds the RMSPE (Root Mean Square Percentage Error) as a metric for classifiers.
Last synced: 02 Oct 2024
https://github.com/laoluadewoye/skloverlay
This repository is the official location of the SKLOverlay Project. It holds everything used for the package on PyPI, including source files.
classification classification-algorithm data-science data-wrangling evaluation-metrics excel graphics graphs machine-learning machine-learning-algorithms matplotlib modeling pandas preprocessing scikit-learn
Last synced: 26 Sep 2024
https://github.com/greatepee/book-recommendation-system
Book Recommendation System
collaborative-filtering cosine-similarity evaluation-metrics k-precision kaggle kaggledatasets mae numpy pandas python scikit-learn scipy svd svd-matrix-factorisation
Last synced: 28 Sep 2024