GitHub topics: ai-benchmarks
DanielButler1/AI-Stats
The Most Comprehensive Set of AI Model Benchmark Scores, Prices & Information
Language: TypeScript - Size: 1.69 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

scicode-bench/SciCode
A benchmark that challenges language models to code solutions for scientific problems
Language: Python - Size: 9.83 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 114 - Forks: 16
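
The SciCode description points at the common pattern behind code-generation benchmarks: run the model's proposed solution against a reference test and score pass/fail. A minimal, hypothetical sketch of that pattern follows (not the SciCode harness itself; the problem and test strings here are toy stand-ins):

```python
# Hypothetical code-generation benchmark loop: execute a candidate solution
# string together with a reference test and record whether the test passes.

def evaluate_solution(solution_code: str, test_code: str) -> bool:
    """Run candidate code plus its test in a shared, isolated namespace."""
    namespace: dict = {}
    try:
        exec(solution_code, namespace)   # define the candidate function
        exec(test_code, namespace)       # raises AssertionError on failure
        return True
    except Exception:
        return False

# Toy "scientific" problem (harmonic mean) with a reference assertion.
candidate = "def harmonic_mean(a, b):\n    return 2 * a * b / (a + b)"
test = "assert abs(harmonic_mean(1.0, 3.0) - 1.5) < 1e-9"
print(evaluate_solution(candidate, test))  # True
```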

lechmazur/deception
Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claude, GPT-4, Gemini, Llama, etc.) with standardized evaluation metrics.
Size: 36.1 KB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 26 - Forks: 2

mrowan137/ml-performance-benchmark
Performance benchmarking for ML/AI workloads: ResNet, CosmoFlow, and DeepCam
Language: CWeb - Size: 198 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1
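
For workloads like the ResNet training steps listed above, the usual reported metric is throughput (samples per second). A generic timing sketch of that idea is shown below; it is an assumption for illustration, not this repository's harness:

```python
# Generic throughput measurement: warm up, then time repeated calls to a
# batch-processing step and report samples/second.
import time

def measure_throughput(step_fn, batch_size: int, iters: int = 20, warmup: int = 3) -> float:
    for _ in range(warmup):              # warm up caches before timing
        step_fn()
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return (iters * batch_size) / elapsed

# Toy stand-in for a training step.
fake_step = lambda: sum(i * i for i in range(100_000))
print(f"{measure_throughput(fake_step, batch_size=64):.1f} samples/sec")
```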

Paraskevi-KIvroglou/Hackathon-LlamaEval
LlamaEval is a rapid prototype developed during a hackathon to provide a user-friendly dashboard for evaluating and comparing Llama models using the TogetherAI API.
Language: Python - Size: 66.8 MB - Last synced at: 9 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1
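
A dashboard like LlamaEval ultimately reduces to sending the same prompt to different Llama models and comparing the replies. A hedged sketch of such a call, assuming the `together` Python SDK's chat-completions interface and a TOGETHER_API_KEY environment variable (the model ID is illustrative only):

```python
import os
from together import Together

# Client for the Together API; the key is read from the environment.
client = Together(api_key=os.environ["TOGETHER_API_KEY"])

def ask(model: str, prompt: str) -> str:
    """Send one user prompt to the given model and return its reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("meta-llama/Llama-3-8b-chat-hf",
          "Summarize the Doppler effect in one sentence."))
```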

CAS-CLab/CNN-Inference-Engine-Quick-View
A quick view of high-performance convolutional neural network (CNN) inference engines on mobile devices.
Size: 54.7 KB - Last synced at: 4 months ago - Pushed at: almost 3 years ago - Stars: 150 - Forks: 18
