Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with evaluation-metrics

A curated list of projects in awesome lists tagged with evaluation-metrics.

https://github.com/agentops-ai/agentops

Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks like CrewAI, Langchain, and Autogen

agent agentops ai anthropic autogen cost-estimation crewai evals evaluation-metrics groq langchain llm mistral ollama openai

Last synced: 26 Sep 2024

https://github.com/xinshuoweng/ab3dmot

(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"

2d-mot-evaluation 3d-mot 3d-multi 3d-multi-object-tracking 3d-tracking computer-vision evaluation evaluation-metrics kitti kitti-3d machine-learning multi-object-tracking real-time robotics tracking

Last synced: 30 Sep 2024

https://github.com/xinshuoweng/AB3DMOT

(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"

2d-mot-evaluation 3d-mot 3d-multi 3d-multi-object-tracking 3d-tracking computer-vision evaluation evaluation-metrics kitti kitti-3d machine-learning multi-object-tracking real-time robotics tracking

Last synced: 31 Jul 2024

https://github.com/AgentOps-AI/agentops

Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks like CrewAI, Langchain, and Autogen

agent agentops ai anthropic autogen cost-estimation crewai evals evaluation-metrics groq langchain llm mistral ollama openai

Last synced: 31 Jul 2024

https://github.com/google-research/rliable

[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.

benchmarking evaluation-metrics google machine-learning reinforcement-learning rl

Last synced: 02 Aug 2024

https://github.com/jitsi/jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

automatic-speech-recognition evaluation-metrics python3 speech-to-text wer word-error-rate

Last synced: 01 Aug 2024
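
For readers unfamiliar with the metric named in this entry, word error rate is usually defined as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference. The sketch below illustrates that definition in plain Python. It does not use or reproduce the jiwer API; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, the value can exceed 1.0 when the hypothesis contains many inserted words.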

https://github.com/up42/image-similarity-measures

Implementation of eight evaluation metrics to assess the similarity between two images. The eight metrics are as follows: RMSE, PSNR, SSIM, ISSM, FSIM, SRE, SAM, and UIQ.

evaluation-metrics image machine-learning metrics p1 processing

Last synced: 03 Aug 2024

https://github.com/proycon/pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

computational-linguistics evaluation-metrics folia language-modelling library linguistics machine-learning natural-language-processing nlp nlp-library python search-algorithms text-processing

Last synced: 02 Oct 2024

https://github.com/huggingface/lighteval

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.

evaluation evaluation-framework evaluation-metrics huggingface

Last synced: 01 Aug 2024

https://github.com/v-iashin/SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

audio audio-generation bmvc evaluation-metrics gan melgan multi-modal pytorch transformer vas vggsound video video-features video-understanding vqvae

Last synced: 01 Aug 2024

https://github.com/salesforce/factCC

Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper

evaluation-metrics text-summarization

Last synced: 01 Aug 2024

https://github.com/TonicAI/tonic_validate

Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.

evaluation-framework evaluation-metrics large-language-models llm llmops llms rag retrieval-augmented-generation

Last synced: 01 Aug 2024

https://github.com/davidsbatista/NER-Evaluation

An implementation of full named-entity evaluation metrics based on SemEval'13 Task 9, computed not at the tag/token level but over all the tokens that are part of a named entity

crfsuite evaluation-metrics named-entity-recognition ner ner-evaluation notebook-jupyter semeval semeval-2013

Last synced: 07 Aug 2024

https://github.com/clovaai/CLEval

CLEval: Character-Level Evaluation for Text Detection and Recognition Tasks

end-to-end-ocr evaluation-metrics text-detection text-detection-recognition text-recognition

Last synced: 07 Aug 2024

https://github.com/MantisAI/nervaluate

Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13

evaluation-metrics machine-learning named-entity-recognition natural-language-processing sequence-models

Last synced: 02 Aug 2024

https://github.com/jantrienes/nereval

Evaluation script for named entity recognition (NER) systems based on entity-level F1 score.

evaluation-metrics machine-learning named-entity-recognition nlp

Last synced: 07 Aug 2024
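
As a rough illustration of what an entity-level F1 score measures, the sketch below counts a predicted entity as correct only when its type and span both match a gold entity exactly. This is an assumed exact-match scheme, not necessarily the matching rule used by nereval or the other NER projects listed here; the function name `entity_f1` and the `(type, start, end)` tuple layout are hypothetical.

```python
def entity_f1(gold, predicted):
    """Exact-match entity-level F1 over (type, start, end) tuples.

    Hypothetical helper for illustration; real evaluators may also
    credit partial or type-only matches.
    """
    gold_set = set(gold)
    pred_set = set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [("PER", 0, 2), ("LOC", 5, 6)]
    # The second prediction has the right type but the wrong start offset,
    # so under exact matching it counts as an error.
    predicted = [("PER", 0, 2), ("LOC", 4, 6)]
    print(entity_f1(gold, predicted))
```

Scoring whole entities rather than individual tags means a prediction that gets most tokens of an entity right but misses its boundary earns no credit under this scheme.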

https://github.com/sharmaroshan/Insurance-Claim-Prediction

This project predicts the insurance claim of each user in the data set. Machine learning algorithms for regression analysis are used, and data visualization is also performed to support the analysis.

beginner classification data-analysis data-visualization eda evaluation-metrics finance machine-learning radar-chart

Last synced: 08 Aug 2024

https://github.com/orchardbirds/bokbokbok

Custom Loss Functions and Evaluation Metrics for XGBoost and LightGBM

binary-classification custom-loss-functions evaluation-metrics focal-loss lightgbm loss-functions regression rmspe xgboost

Last synced: 02 Oct 2024

https://github.com/aldenhovel/bleu-rouge-meteor-cider-spice-eval4imagecaption

Evaluation tools for image captioning. Including BLEU, ROUGE-L, CIDEr, METEOR, SPICE scores.

bleu cider evaluation-metrics image-captioning meteor rouge-l spice

Last synced: 30 Sep 2024

https://github.com/kenlimmj/fightin-words

A scikit-learn compliant implementation of Monroe et al.'s Fightin' Words analysis method.

bayesian-methods evaluation-metrics nlp scikit-learn

Last synced: 28 Sep 2024

https://github.com/eonu/daze

Better multi-class confusion matrix plots for Scikit-Learn, incorporating per-class and overall evaluation measures.

accuracy classification confusion-matrix confusion-matrix-heatmap evaluation evaluation-metrics f1-score measures multi-class plot precision recall scikit-learn

Last synced: 26 Sep 2024

https://github.com/anindyadeep/easy_eval

A lightweight modular lm-evaluation framework to build evaluation workflows for your LLMs

evaluation evaluation-metrics llama llm

Last synced: 27 Sep 2024

https://github.com/ngmarchant/activeeval

A Python package for pool-based active evaluation

active-evaluation crowdsourcing evaluation evaluation-metrics python-package

Last synced: 01 Aug 2024

https://github.com/fracpete/rmspe-weka-package

Weka package that adds the RMSPE (Root Mean Square Percentage Error) as a metric for classifiers.

evaluation-metrics java weka

Last synced: 02 Oct 2024
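
RMSPE is commonly defined as the square root of the mean of squared relative errors, sqrt(mean(((y - y_hat) / y)^2)). The sketch below follows that definition in Python rather than Java and is not taken from the Weka package. The function name `rmspe` is hypothetical, the result is returned as a fraction (some implementations multiply by 100), and rejecting zero targets is an assumption about how to handle the undefined case.

```python
import math


def rmspe(y_true, y_pred):
    """Root mean square percentage error, returned as a fraction.

    Hypothetical helper for illustration; zero targets are rejected
    because the relative error is undefined for them.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("RMSPE is undefined when a true value is zero")

    squared_relative_errors = [
        ((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_relative_errors) / len(squared_relative_errors))


if __name__ == "__main__":
    # Both predictions are off by 10% of the true value.
    print(rmspe([100.0, 200.0], [110.0, 180.0]))
```

Because each error is divided by its true value, the metric weights a 10-unit miss on a target of 100 the same as a 20-unit miss on a target of 200.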

https://github.com/laoluadewoye/skloverlay

This repository is the official location of the SKLOverlay project. It holds everything used for the package on PyPI, including source files.

classification classification-algorithm data-science data-wrangling evaluation-metrics excel graphics graphs machine-learning machine-learning-algorithms matplotlib modeling pandas preprocessing scikit-learn

Last synced: 26 Sep 2024