GitHub / aflah02 / Humans-v-s-LLM-Benchmarks
LLM benchmarks play a crucial role in assessing the performance of Large Language Models (LLMs). However, it is essential to recognize that these benchmarks have their own limitations. This interactive tool engages users in a quiz game built from popular LLM benchmarks, offering an insightful way to explore and understand them.
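Given the repository's Python language and streamlit topic, a quiz of this kind can be sketched with a simple Streamlit app; the question below is hypothetical and not taken from the repository's actual benchmark data.

```python
# Minimal sketch of a benchmark-style quiz in Streamlit.
# The question, choices, and answer here are illustrative placeholders,
# not content from the repository.
import streamlit as st

QUESTION = {
    "prompt": "Which of the following is a prime number?",
    "choices": ["21", "27", "29", "33"],
    "answer": "29",
}

st.title("Humans vs. LLM Benchmarks")
st.write(QUESTION["prompt"])

# Let the user pick an answer, then score it on submission.
choice = st.radio("Your answer:", QUESTION["choices"], index=None)
if st.button("Submit") and choice is not None:
    if choice == QUESTION["answer"]:
        st.success("Correct!")
    else:
        st.error(f"Incorrect. The expected answer is {QUESTION['answer']}.")
```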
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aflah02%2FHumans-v-s-LLM-Benchmarks
PURL: pkg:github/aflah02/Humans-v-s-LLM-Benchmarks
Stars: 1
Forks: 0
Open issues: 0
License: None
Language: Python
Size: 40.9 MB
Dependencies parsed at: Pending
Created at: over 1 year ago
Updated at: over 1 year ago
Pushed at: over 1 year ago
Last synced at: 5 months ago
Topics: llms, llms-benchmarking, streamlit