An open API service providing repository metadata for many open source software ecosystems.

Topic: "evaluation-metrics"

confident-ai/deepeval

The LLM Evaluation Framework

Language: Python - Size: 78 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 6,185 - Forks: 536

AgentOps-AI/agentops

Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including OpenAI Agents SDK, CrewAI, Langchain, Autogen, AG2, and CamelAI

Language: Python - Size: 144 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4,352 - Forks: 388

datawhalechina/tiny-universe

"White-Box Large Model Construction Guide": a fully hand-built Tiny-Universe

Language: Jupyter Notebook - Size: 19.9 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2,791 - Forks: 288

xinshuoweng/AB3DMOT

(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"

Language: Python - Size: 181 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 1,729 - Forks: 406

huggingface/lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Language: Python - Size: 4.84 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,493 - Forks: 234

huggingface/evaluation-guidebook

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

Language: Jupyter Notebook - Size: 1.01 MB - Last synced at: about 19 hours ago - Pushed at: 4 months ago - Stars: 1,330 - Forks: 81

MIND-Lab/OCTIS

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)

Language: Python - Size: 168 MB - Last synced at: 12 days ago - Pushed at: 10 months ago - Stars: 758 - Forks: 111

jitsi/jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Language: Python - Size: 1.68 MB - Last synced at: about 23 hours ago - Pushed at: 3 months ago - Stars: 722 - Forks: 103

google-research/rliable

[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.

Language: Jupyter Notebook - Size: 1.86 MB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 708 - Forks: 42

nekhtiari/image-similarity-measures

📈 Implementation of eight evaluation metrics to assess the similarity between two images. The eight metrics are as follows: RMSE, PSNR, SSIM, ISSM, FSIM, SRE, SAM, and UIQ.

Language: Python - Size: 1.1 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 610 - Forks: 70

Unbabel/COMET

A Neural Framework for MT Evaluation

Language: Python - Size: 9.72 MB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 585 - Forks: 88

AmenRa/ranx

⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍

Language: Python - Size: 34.6 MB - Last synced at: 1 day ago - Pushed at: 10 months ago - Stars: 546 - Forks: 28

proycon/pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

Language: Python - Size: 12.8 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 477 - Forks: 67

relari-ai/continuous-eval

Data-Driven Evaluation for LLM-Powered Applications

Language: Python - Size: 1.7 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 446 - Forks: 29

v-iashin/SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Language: Jupyter Notebook - Size: 163 MB - Last synced at: 30 days ago - Pushed at: 10 months ago - Stars: 360 - Forks: 39

JokerJohn/Cloud_Map_Evaluation

[RAL' 2025] MapEval: Towards Unified, Robust and Efficient SLAM Map Evaluation Framework.

Language: C++ - Size: 42.9 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 355 - Forks: 27

TonicAI/tonic_validate

Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.

Language: Python - Size: 6.19 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 297 - Forks: 30

salesforce/factCC

Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper

Language: Python - Size: 43.9 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 292 - Forks: 30

ziqihuangg/Awesome-Evaluation-of-Visual-Generation

A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems

Size: 2.78 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 284 - Forks: 17

athina-ai/athina-evals

Python SDK for running evaluations on LLM generated responses

Language: Python - Size: 1.82 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 277 - Forks: 17

FuxiaoLiu/LRV-Instruction

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Language: Python - Size: 23.9 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 276 - Forks: 13

clovaai/generative-evaluation-prdc

Code base for the precision, recall, density, and coverage metrics for generative models. ICML 2020.

Language: Python - Size: 290 KB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 255 - Forks: 28

bheinzerling/pyrouge

A Python wrapper for the ROUGE summarization evaluation package

Language: Python - Size: 120 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 251 - Forks: 71

aws-samples/foundation-model-benchmarking-tool

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.

Language: Jupyter Notebook - Size: 95.9 MB - Last synced at: 9 days ago - Pushed at: 27 days ago - Stars: 239 - Forks: 43

davidsbatista/NER-Evaluation

An implementation of full named-entity evaluation metrics based on SemEval'13 Task 9, operating not at the tag/token level but considering all the tokens that are part of the named entity

Language: Python - Size: 85.9 KB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 217 - Forks: 48

IBM/unitxt

🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data for end-to-end AI benchmarking

Language: Python - Size: 95.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 191 - Forks: 52

clovaai/CLEval

CLEval: Character-Level Evaluation for Text Detection and Recognition Tasks

Language: Python - Size: 2.52 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 185 - Forks: 29

wenhao728/awesome-diffusion-v2v

Awesome diffusion Video-to-Video (V2V). A collection of papers on diffusion model-based video editing, a.k.a. video-to-video (V2V) translation, plus video editing benchmark code.

Language: Python - Size: 443 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 180 - Forks: 7

MantisAI/nervaluate

Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13

Language: Python - Size: 283 KB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 176 - Forks: 22

lartpang/PySODEvalToolkit

PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection

Language: Python - Size: 309 KB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 176 - Forks: 21

sharmaroshan/Twitter-Sentiment-Analysis

A natural language processing problem in which sentiment analysis is performed by separating positive tweets from negative tweets with machine learning classification models, using text mining, text analysis, data analysis, and data visualization

Language: Jupyter Notebook - Size: 2.77 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 168 - Forks: 114

tagucci/pythonrouge

Python wrapper for evaluating summarization quality by ROUGE package

Language: Perl - Size: 350 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 166 - Forks: 33

vectara/open-rag-eval

Open source RAG evaluation package

Language: Python - Size: 1.32 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 144 - Forks: 9

feralvam/easse

Easier Automatic Sentence Simplification Evaluation

Language: Roff - Size: 32.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 143 - Forks: 36

om-ai-lab/VL-CheckList

Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]

Language: Python - Size: 26.6 MB - Last synced at: 19 days ago - Pushed at: 7 months ago - Stars: 129 - Forks: 5

fakufaku/fast_bss_eval

A fast implementation of bss_eval metrics for blind source separation

Language: Python - Size: 1.11 MB - Last synced at: 11 months ago - Pushed at: almost 3 years ago - Stars: 126 - Forks: 8

HKUSTDial/NL2SQL360

Official repository for the paper “The Dawn of Natural Language to SQL: Are We Fully Ready?” (VLDB'24)

Language: Python - Size: 8.61 MB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 114 - Forks: 10

songweige/content-debiased-fvd

[CVPR 2024] On the Content Bias in Fréchet Video Distance

Language: Python - Size: 53.7 KB - Last synced at: 24 days ago - Pushed at: 7 months ago - Stars: 107 - Forks: 8

YuanXinCherry/Person-reID-Evaluation

GOM: New Metric for Re-identification. 👉 GOM explicitly balances the effect of performing retrieval and verification into a single unified metric.

Language: Python - Size: 5.53 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 105 - Forks: 17

tohinz/semantic-object-accuracy-for-generative-text-to-image-synthesis

Code for "Semantic Object Accuracy for Generative Text-to-Image Synthesis" (TPAMI 2020)

Language: Python - Size: 7.06 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 105 - Forks: 23

LAIT-CVLab/TopPR

NeurIPS 2023 - TopP&R: Robust Support Estimation Approach for Evaluating Fidelity and Diversity in Generative Models Official Code

Language: Python - Size: 84.9 MB - Last synced at: 23 days ago - Pushed at: 10 months ago - Stars: 103 - Forks: 5

MiXaiLL76/faster_coco_eval

Continuation of an abandoned project fast-coco-eval

Language: Python - Size: 8.28 MB - Last synced at: 29 days ago - Pushed at: 4 months ago - Stars: 100 - Forks: 7

Muhtasham/summarization-eval

📝 Reference-Free automatic summarization evaluation with potential hallucination detection

Language: Python - Size: 504 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 100 - Forks: 8

tanyuqian/ctc-gen-eval

EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation

Language: Python - Size: 52.9 MB - Last synced at: 20 days ago - Pushed at: about 2 years ago - Stars: 96 - Forks: 11

k4black/codebleu

Pip compatible CodeBLEU metric implementation available for linux/macos/win

Language: Python - Size: 1.27 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 88 - Forks: 19

hpclab/rankeval

Official repository of RankEval: An Evaluation and Analysis Framework for Learning-to-Rank Solutions.

Language: Python - Size: 7.73 MB - Last synced at: 8 months ago - Pushed at: over 4 years ago - Stars: 88 - Forks: 11

msmsajjadi/precision-recall-distributions

Assessing Generative Models via Precision and Recall (official repository)

Language: Python - Size: 31.3 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 86 - Forks: 11

richardaecn/cvpr18-caption-eval

Learning to Evaluate Image Captioning. CVPR 2018

Language: Python - Size: 6.11 MB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 84 - Forks: 11

nick7nlp/Counting-Stars

Counting-Stars (★)

Language: Jupyter Notebook - Size: 120 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 82 - Forks: 2

Coldmist-Lu/ErrorAnalysis_Prompt

🎁 [ChatGPT4MTevaluation] ErrorAnalysis Prompt for MT Evaluation in ChatGPT

Language: Python - Size: 6.41 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 75 - Forks: 3

thieu1995/permetrics

Artificial intelligence (AI, ML, DL) performance metrics implemented in Python

Language: Python - Size: 2.36 MB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 73 - Forks: 18

evalkit/evalkit

The TypeScript LLM Evaluation Library

Language: TypeScript - Size: 544 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 71 - Forks: 1

jantrienes/nereval

Evaluation script for named entity recognition (NER) systems based on entity-level F1 score.

Language: Python - Size: 22.5 KB - Last synced at: 8 days ago - Pushed at: about 4 years ago - Stars: 70 - Forks: 8

hyeonsangjeon/computing-Korean-STT-error-rates

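A Python function package for computing the character error rate (CER) and word error rate (WER) of output transcripts from Korean speech-to-text (STT) sentence recognizers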

Language: Python - Size: 108 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 63 - Forks: 10

microsoft/Data-Discovery-Toolkit 📦

A data discovery and manipulation toolset for unstructured data

Language: Jupyter Notebook - Size: 89.2 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 54 - Forks: 12

kolenaIO/kolena

Python client for Kolena's machine learning testing platform

Language: Python - Size: 75.4 MB - Last synced at: about 20 hours ago - Pushed at: 20 days ago - Stars: 48 - Forks: 5

silviatti/topic-model-diversity

A collection of topic diversity measures for topic modeling

Language: Python - Size: 30.3 KB - Last synced at: 5 days ago - Pushed at: almost 4 years ago - Stars: 45 - Forks: 5

slSeanWU/MusDr

Evaluation metrics for machine-composed symbolic music. Paper: "The Jazz Transformer on the Front Line: Exploring the Shortcomings of AI-Composed Music through Quantitative Measures", ISMIR 2020

Language: Python - Size: 7.12 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 45 - Forks: 6

thu-coai/OpenMEVA

Benchmark for evaluating open-ended generation

Language: Python - Size: 7.59 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 44 - Forks: 7

golsun/NLP-tools

Useful python NLP tools (evaluation, GUI interface, tokenization)

Language: Python - Size: 207 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 44 - Forks: 9

Striveworks/valor

Valor is a lightweight, numpy-based library designed for fast and seamless evaluation of machine learning models.

Language: Python - Size: 33.8 MB - Last synced at: about 19 hours ago - Pushed at: 24 days ago - Stars: 38 - Forks: 4

CMU-TBD/SocNavBench

A Grounded Simulation Testing Framework for Evaluating Social Navigation: https://arxiv.org/abs/2103.00047

Language: Python - Size: 37.3 MB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 38 - Forks: 8

sharmaroshan/Insurance-Claim-Prediction

Predicts the insurance claim made by each user in the data set, using machine learning algorithms for regression analysis, with data visualization to support the analysis.

Language: Jupyter Notebook - Size: 566 KB - Last synced at: 5 months ago - Pushed at: about 6 years ago - Stars: 37 - Forks: 42

orchardbirds/bokbokbok

Custom Loss Functions and Evaluation Metrics for XGBoost and LightGBM

Language: Python - Size: 574 KB - Last synced at: 16 days ago - Pushed at: 24 days ago - Stars: 36 - Forks: 7

cowjen01/repsys

Framework for Interactive Evaluation of Recommender Systems

Language: JavaScript - Size: 12.5 MB - Last synced at: 20 days ago - Pushed at: almost 2 years ago - Stars: 36 - Forks: 5

yg211/summary-reward-no-reference

A reference-free metric for measuring summary quality, learned from human ratings.

Language: Python - Size: 59.1 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 36 - Forks: 4

swapUniba/ClayRS

Complexly represent contents, build recommender systems, evaluate them. All in one place!

Language: Python - Size: 32.5 MB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 35 - Forks: 5

shi-ang/SurvivalEVAL

The most comprehensive Python package for evaluating survival analysis models.

Language: Python - Size: 3.45 MB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 34 - Forks: 5

VinAIResearch/tise-toolbox

TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation (ECCV 2022)

Language: Python - Size: 302 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 34 - Forks: 2

webis-de/summary-workbench

Framework for unified summarisation and evaluation of English documents using state-of-the-art models and measures.

Language: Python - Size: 81.2 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 32 - Forks: 7

SeleenaJM/CapEval

An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019)

Language: Python - Size: 267 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 32 - Forks: 5

huster-wgm/Pytorch-metrics

Implementation of Evaluation Metrics for Pytorch

Language: Python - Size: 67.4 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 31 - Forks: 9

bdusell/rougescore

Python implementation of ROUGE

Language: Python - Size: 5.86 KB - Last synced at: 29 days ago - Pushed at: over 7 years ago - Stars: 31 - Forks: 8

waltonfuture/Diff-eRank

[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models

Language: Python - Size: 39.1 KB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 30 - Forks: 2

encord-team/text-to-image-eval

Evaluate custom and HuggingFace text-to-image/zero-shot-image-classification models like CLIP, SigLIP, DFN5B, and EVA-CLIP. Metrics include Zero-shot accuracy, Linear Probe, Image retrieval, and KNN accuracy.

Language: Jupyter Notebook - Size: 14.2 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 30 - Forks: 0

hpotechius/ColorTransferLib

A collection of algorithms for color transfer, style transfer, and colorization, complemented by objective evaluation metrics for quantitative assessment.

Language: Python - Size: 203 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 29 - Forks: 1

WanzhengZhu/GRUEN

GRUEN for Evaluating Linguistic Quality of Generated Text (EMNLP 2020 Findings)

Language: Python - Size: 126 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 28 - Forks: 13

Aldenhovel/bleu-rouge-meteor-cider-spice-eval4imagecaption

Evaluation tools for image captioning. Including BLEU, ROUGE-L, CIDEr, METEOR, SPICE scores.

Language: Python - Size: 86.8 MB - Last synced at: 27 days ago - Pushed at: about 2 years ago - Stars: 28 - Forks: 2

zyjwuyan/SOD_Evaluation_Metrics

A more complete python version (GPU) of the evaluation for salient object detection (with S-measure, Fbw measure, MAE, max/mean/adaptive F-measure, max/mean/adaptive E-measure, PRcurve and F-measure curve)

Language: Python - Size: 1.33 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 27 - Forks: 7

sharmaroshan/Big-Mart-Sales-Prediction

Uses machine learning algorithms for regression analysis to predict the sales pattern, with data analysis and data visualization to support it.

Language: Jupyter Notebook - Size: 648 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 27 - Forks: 10

pro2nit/STREAM

Official implementation of 'STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models'

Language: Python - Size: 234 KB - Last synced at: 15 days ago - Pushed at: 10 months ago - Stars: 26 - Forks: 2

minar09/bfscore_python

Boundary F1 Score - Python Implementation

Language: Python - Size: 112 KB - Last synced at: 28 days ago - Pushed at: over 4 years ago - Stars: 26 - Forks: 10

ansarifaisal12/Agent_Mont

Comprehensive metrics, insights, and visualization for Phidata and Crew AI applications

Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 25 - Forks: 8

npurson/fid-metrics

A toolkit for computing Fréchet Inception Distance (FID) & Fréchet Video Distance (FVD) metrics.

Language: Python - Size: 23.4 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 25 - Forks: 7

kamo-naoyuki/pySIIB

A python implementation of Speech intelligibility in bits (SIIB)

Language: C - Size: 4.27 MB - Last synced at: 27 days ago - Pushed at: about 3 years ago - Stars: 24 - Forks: 7

adityajn105/MLfromScratch

Library for machine learning where all algorithms are implemented from scratch, using only NumPy.

Language: Python - Size: 127 KB - Last synced at: 26 days ago - Pushed at: 7 months ago - Stars: 23 - Forks: 8

thu-coai/CTRLEval

Codes for our paper "CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation" (ACL 2022)

Language: Python - Size: 937 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 23 - Forks: 2

guap-ml/guap

Open-source evaluation metric for linking Machine Learning model outputs with Business outcomes

Language: Python - Size: 60.5 KB - Last synced at: 14 days ago - Pushed at: almost 4 years ago - Stars: 23 - Forks: 2

vinid/quica

quica is a tool to run inter-coder agreement pipelines in an easy and effective way. Multiple measures are run and results are collected in a single table that can be easily exported to LaTeX

Language: Python - Size: 112 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 23 - Forks: 0

sharmaroshan/Students-Performance-Analytics

Students Performance Evaluation using Feature Engineering, Feature Extraction, Manipulation of Data, Data Analysis, Data Visualization, and, at last, applying Classification Algorithms from Machine Learning to separate students with different grades

Language: Jupyter Notebook - Size: 1.07 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 23 - Forks: 12

WHaverals/CERberus

CERberus -- guardian against character errors 🐶🐶🐶

Language: HTML - Size: 13.3 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 22 - Forks: 0

GiulioRossetti/f1-communities

A novel approach to evaluate community detection algorithms on ground truth

Language: Python - Size: 34.2 KB - Last synced at: 25 days ago - Pushed at: almost 4 years ago - Stars: 22 - Forks: 8

blmoistawinde/fense

Fluency ENhanced Sentence-bert Evaluation (FENSE), a metric for audio caption evaluation, along with the benchmark datasets AudioCaps-Eval and Clotho-Eval.

Language: Python - Size: 103 MB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 21 - Forks: 1

qcraftai/tip

Transcendental Idealism of Planner: Evaluating Perception from Planning Perspective for Autonomous Driving (ICML 2023)

Language: Python - Size: 74.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 1

eXascaleInfolab/clubmark

Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling of Clustering (Community Detection) Algorithms Considering Overlaps (Covers)

Language: Python - Size: 7.66 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 20 - Forks: 2

ma7555/evalify

Evaluate your biometric verification models literally in seconds.

Language: Python - Size: 3.05 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 19 - Forks: 20

nunompmoniz/IRon

R Package for Imbalanced Regression

Language: R - Size: 2.81 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 19 - Forks: 3

hollobit/ML_evaluation_metrics

Landscape of ML/DL performance evaluation metrics

Size: 687 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 18 - Forks: 3

saurf4ng/TaPR

Time-series Aware Precision and Recall for Evaluating Anomaly Detection Methods

Language: Python - Size: 24.4 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 6

SHI-Yu-Zhe/Benchmarks-for-Single-Object-Visual-Tracking

Exploring through 7 popular datasets for visual object tracking, including OTB, UAV, VOT, LaSOT, NFS, TrackingNet and GOT-10k.

Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 3

Related Topics
machine-learning (109), python (68), evaluation (55), classification (47), evaluation-framework (33), nlp (32), logistic-regression (31), deep-learning (31), data-visualization (28), eda (26), feature-engineering (26), data-science (24), data-analysis (21), metrics (21), regression (20), llm (20), exploratory-data-analysis (19), pandas (18), numpy (16), natural-language-processing (16), scikit-learn (15), computer-vision (15), data-preprocessing (14), random-forest (14), decision-trees (14), machine-learning-algorithms (14), linear-regression (14), random-forest-classifier (13), information-retrieval (13), cross-validation (13), hyperparameter-tuning (12), large-language-models (12), python3 (11), supervised-learning (11), rag (11), feature-selection (11), seaborn (11), matplotlib (10), ai (9), data-cleaning (9), pytorch (9), decision-tree-classifier (9), classification-algorithm (9), benchmark (9), artificial-intelligence (9), generative-ai (9), object-detection (8), beginner (8), svm-classifier (8), modeling (8), confusion-matrix (8), summarization (8), modelling (8), time-series (8), feature-extraction (8), recall (7), precision (7), xgboost (7), jupyter-notebook (7), recommendation-system (7), llms (7), datacleaning (7), sklearn (7), generative-model (7), unsupervised-learning (7), recommender-system (6), f1-score (6), dataset (6), text-generation (6), visualization (6), regression-models (6), hyperparameter-optimization (6), sentiment-analysis (6), knn-classification (6), gan (6), tensorflow (6), openai (6), neural-network (5), naive-bayes-classifier (5), decision-tree-regression (5), optimization (5), lstm (5), text-summarization (5), benchmarking (5), javascript (5), text-classification (5), accuracy (5), llm-evaluation (5), model-selection (5), llmops (5), bag-of-words (5), segmentation (5), mlops (5), ml (5), generative-adversarial-network (5), mae (5), retrieval-augmented-generation (4), knn (4), model-evaluation (4), llm-inference (4)