Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: evaluation
CarlosNZ/fig-tree-evaluator
A highly configurable custom expression tree evaluator
Language: TypeScript - Size: 17.9 MB - Last synced: about 1 hour ago - Pushed: 1 day ago - Stars: 10 - Forks: 1
HowieHwong/TrustLLM
(ICML 2024) TrustLLM: Trustworthiness in Large Language Models
Language: Python - Size: 9.52 MB - Last synced: about 5 hours ago - Pushed: about 5 hours ago - Stars: 309 - Forks: 27
langchain-ai/langsmith-sdk
LangSmith Client SDK Implementations
Language: Python - Size: 2.39 MB - Last synced: about 10 hours ago - Pushed: about 19 hours ago - Stars: 313 - Forks: 46
umk/fngraph
Primitives for composing and evaluating functions based on their inputs and outputs
Language: TypeScript - Size: 640 KB - Last synced: about 7 hours ago - Pushed: about 16 hours ago - Stars: 0 - Forks: 0
onejune2018/Awesome-LLM-Eval
Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs and models, mainly for evaluation of LLMs (e.g. ChatGPT, LLaMA, GLM, Baichuan).
Size: 13.3 MB - Last synced: about 10 hours ago - Pushed: 1 day ago - Stars: 282 - Forks: 32
CLUEbenchmark/SuperCLUE
SuperCLUE: A Comprehensive Benchmark for General-Purpose Foundation Models in Chinese
Size: 24.3 MB - Last synced: about 17 hours ago - Pushed: about 18 hours ago - Stars: 2,640 - Forks: 88
cdaringe/programming-language-selector
Programming Language Selector based on language metadata and user-specified values.
Language: TypeScript - Size: 1.9 MB - Last synced: about 17 hours ago - Pushed: about 19 hours ago - Stars: 2 - Forks: 0
CCAFS/MARLO
Managing Agricultural Research for Learning and Outcomes
Language: Java - Size: 195 MB - Last synced: about 19 hours ago - Pushed: 1 day ago - Stars: 8 - Forks: 8
langwatch/langevals
LangEvals aggregates various language model evaluators into a single platform, providing a standard interface for a multitude of scores and LLM guardrails, so you can protect and benchmark your LLMs and pipelines.
Language: Python - Size: 1.15 MB - Last synced: about 18 hours ago - Pushed: 1 day ago - Stars: 11 - Forks: 3
athina-ai/athina-evals
Python SDK for running evaluations on LLM generated responses
Language: Python - Size: 985 KB - Last synced: about 21 hours ago - Pushed: about 21 hours ago - Stars: 135 - Forks: 11
r-lib/evaluate
A version of eval for R that returns more information about what happened
Language: R - Size: 387 KB - Last synced: about 21 hours ago - Pushed: about 22 hours ago - Stars: 107 - Forks: 33
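The idea behind an eval that reports "more about what happened" (captured output, the resulting environment, and any error, rather than a bare return value) can be sketched in Python with the standard library. This is a loose analogue for illustration, not the R package's interface:

```python
import contextlib
import io
import traceback

def evaluate(code, env=None):
    """Run code and capture stdout, the resulting namespace, and any error,
    instead of letting output and exceptions escape to the caller."""
    env = env if env is not None else {}
    out = io.StringIO()
    error = None
    with contextlib.redirect_stdout(out):
        try:
            exec(code, env)
        except Exception:
            error = traceback.format_exc()
    return {"stdout": out.getvalue(), "env": env, "error": error}

report = evaluate("x = 6 * 7\nprint('x is', x)")
```

A caller can then inspect `report["stdout"]`, `report["env"]["x"]`, and `report["error"]` separately, which is the shape of information `evaluate` exposes for R code.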
sepandhaghighi/pycm
Multi-class confusion matrix library in Python
Language: Python - Size: 11.4 MB - Last synced: 34 minutes ago - Pushed: 12 days ago - Stars: 1,431 - Forks: 121
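The core computation behind a multi-class confusion matrix library (counting actual/predicted pairs, then deriving per-class statistics) can be sketched in plain Python. This is an illustration of the technique, not pycm's own API:

```python
def confusion_matrix(actual, predicted):
    """Build a multi-class confusion matrix as a nested dict:
    cm[actual_label][predicted_label] -> count."""
    labels = sorted(set(actual) | set(predicted))
    cm = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        cm[a][p] += 1
    return cm

def per_class_precision_recall(cm):
    """Derive per-class precision and recall from the matrix."""
    stats = {}
    for label in cm:
        tp = cm[label][label]
        fn = sum(cm[label].values()) - tp
        fp = sum(cm[other][label] for other in cm if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        stats[label] = (precision, recall)
    return stats

actual    = ["cat", "cat", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "bird"]
cm = confusion_matrix(actual, predicted)
```

Libraries like pycm extend this same matrix to dozens of overall and per-class statistics (kappa, MCC, and so on).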
promptfoo/promptfoo
Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
Language: TypeScript - Size: 14 MB - Last synced: about 19 hours ago - Pushed: 1 day ago - Stars: 2,878 - Forks: 182
symflower/eval-dev-quality
DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.
Language: Go - Size: 1.94 MB - Last synced: about 20 hours ago - Pushed: 1 day ago - Stars: 27 - Forks: 1
paul-schuhm/git-branching-model-tp
A project assignment for practicing Git branching models
Language: HTML - Size: 63.5 KB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 0 - Forks: 3
paul-schuhm/git-projet-minitrice
A collaborative project assignment to be version-controlled
Size: 6.84 KB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 0 - Forks: 0
duohongrui/simpipe
The standard pipeline for defining a simulation method, estimating parameters from datasets, and simulating and evaluating datasets.
Language: R - Size: 404 KB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 1 - Forks: 0
alipay/ant-application-security-testing-benchmark
The xAST evaluation benchmark makes security tools no longer a "black box".
Language: Java - Size: 8.68 MB - Last synced: about 8 hours ago - Pushed: 1 day ago - Stars: 235 - Forks: 27
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Language: Python - Size: 2.87 MB - Last synced: 1 day ago - Pushed: 3 days ago - Stars: 2,659 - Forks: 278
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 40+ Hugging Face models, and 20+ benchmarks
Language: Python - Size: 1.48 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 424 - Forks: 46
langchain-ai/langsmith-docs
Documentation for langsmith
Language: MDX - Size: 138 MB - Last synced: about 18 hours ago - Pushed: 1 day ago - Stars: 58 - Forks: 17
ziqihuangg/Awesome-Evaluation-of-Visual-Generation
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
Size: 1.89 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 67 - Forks: 5
uptrain-ai/uptrain
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.
Language: Python - Size: 35.7 MB - Last synced: 1 day ago - Pushed: 2 days ago - Stars: 2,014 - Forks: 169
Sensirion/python-uart-svm4x
Python driver to work with the SVM4x evaluation kit over UART using the SHDLC protocol
Language: Python - Size: 6.66 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 0
gagolews/deepr
Deep R Programming (Open-Access Textbook)
Size: 105 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 85 - Forks: 1
kanekescom/laravel-simonevbang
SIMONEVBANG for Laravel
Size: 1.95 KB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 1
taka8t/lieval
lieval is a lightweight Rust crate for parsing and evaluating mathematical expressions from strings.
Language: Rust - Size: 27.3 KB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 1 - Forks: 0
CS-EVAL/CS-Eval
CS-Eval is a comprehensive evaluation suite for the cybersecurity capabilities of foundation models and large language models.
Size: 1.01 MB - Last synced: 2 days ago - Pushed: 3 days ago - Stars: 2 - Forks: 0
Striveworks/valor
Valor is a centralized evaluation store which makes it easy to measure, explore, and rank model performance.
Language: Python - Size: 27.6 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 34 - Forks: 0
Auto-Playground/ragrank
🎯 A free LLM evaluation toolkit that helps you assess factual accuracy, context understanding, tone, and more, so you can see how well your LLM applications perform.
Language: Python - Size: 569 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 17 - Forks: 4
microsoft/llmops-promptflow-template
LLMOps with Prompt Flow is an LLMOps template and guidance for building LLM-infused apps using Prompt Flow. It offers features including centralized code hosting, lifecycle management, variant and hyperparameter experimentation, A/B deployment, and reporting for all runs and experiments.
Language: Python - Size: 5.54 MB - Last synced: 2 days ago - Pushed: 3 days ago - Stars: 165 - Forks: 152
cbuschka/mockito-python-eval
Evaluation of mockito-python
Language: Python - Size: 1000 Bytes - Last synced: 3 days ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0
cbuschka/k3d-eval
Language: Shell - Size: 19.5 KB - Last synced: 3 days ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
MichaelGrupp/evo
Python package for the evaluation of odometry and SLAM
Language: Python - Size: 6.57 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 3,249 - Forks: 733
BloopAI/COBOLEval
Evaluate LLM-generated COBOL
Language: Python - Size: 140 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 15 - Forks: 2
gereon-t/trajectopy-core
Trajectopy - Trajectory Evaluation in Python
Language: Python - Size: 34.3 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 1 - Forks: 1
mrgloom/awesome-semantic-segmentation
:metal: awesome-semantic-segmentation
Size: 283 KB - Last synced: 3 days ago - Pushed: about 3 years ago - Stars: 10,340 - Forks: 2,489
Cloud-CV/EvalAI
:cloud: :rocket: :bar_chart: :chart_with_upwards_trend: Evaluating state of the art in AI
Language: Python - Size: 63.1 MB - Last synced: about 23 hours ago - Pushed: 7 days ago - Stars: 1,694 - Forks: 765
ContinualAI/avalanche
Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
Language: Python - Size: 14.2 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 1,680 - Forks: 283
symblai/speech-recognition-evaluation
Evaluate results from ASR/Speech-to-Text quickly
Language: JavaScript - Size: 36.1 KB - Last synced: 3 days ago - Pushed: over 2 years ago - Stars: 31 - Forks: 7
jianzfb/antgo
Machine Learning Experiment Management Platform
Language: Python - Size: 17.8 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 186 - Forks: 7
microsoft/promptbench
A unified evaluation framework for large language models
Language: Python - Size: 5.42 MB - Last synced: 3 days ago - Pushed: 9 days ago - Stars: 2,091 - Forks: 162
google/imageinwords
Data release for the ImageInWords (IIW) paper.
Language: JavaScript - Size: 21.3 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 67 - Forks: 0
ianarawjo/ChainForge
An open-source visual programming environment for battle-testing prompts to LLMs.
Language: TypeScript - Size: 76.7 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 2,004 - Forks: 139
xinshuoweng/AB3DMOT
(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
Language: Python - Size: 181 MB - Last synced: 3 days ago - Pushed: about 1 month ago - Stars: 1,631 - Forks: 397
gereon-t/trajectopy
Trajectopy - Trajectory Evaluation in Python
Language: Python - Size: 9.38 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 21 - Forks: 2
aimonlabs/aimon-rely
Aimon Rely is a state-of-the-art, multi-model API that provides detectors for LLM quality metrics in both offline evaluation and online monitoring scenarios.
Language: Python - Size: 1.08 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 6 - Forks: 2
kolenaIO/kolena
Python client for Kolena's machine learning testing platform
Language: Python - Size: 71.4 MB - Last synced: 28 days ago - Pushed: 30 days ago - Stars: 38 - Forks: 4
abo-abo/lispy
Short and sweet LISP editing
Language: Emacs Lisp - Size: 5.07 MB - Last synced: 3 days ago - Pushed: 2 months ago - Stars: 1,187 - Forks: 129
DIAGNijmegen/picai_eval
Evaluation of 3D detection and diagnosis performance, geared towards prostate cancer detection in MRI.
Language: Python - Size: 775 KB - Last synced: 5 days ago - Pushed: 6 days ago - Stars: 15 - Forks: 9
dustalov/llmfao
Large Language Model Feedback Analysis and Optimization (LLMFAO)
Language: Python - Size: 1.08 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 2 - Forks: 0
alopatenko/LLMEvaluation
A comprehensive guide to LLM evaluation methods, designed to help identify the most suitable evaluation techniques for various use cases, promote the adoption of best practices in LLM assessment, and critically examine the effectiveness of these evaluation methods.
Size: 2.74 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 22 - Forks: 0
modelscope/eval-scope
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
Language: Python - Size: 1.39 MB - Last synced: 2 days ago - Pushed: 4 days ago - Stars: 61 - Forks: 8
jannikmi/multivar_horner
Python package implementing a multivariate Horner scheme for efficiently evaluating multivariate polynomials
Language: Python - Size: 4.71 MB - Last synced: 5 days ago - Pushed: 6 days ago - Stars: 26 - Forks: 3
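The univariate Horner scheme that this package generalizes to many variables can be sketched in a few lines; this is an illustration of the idea only, and multivar_horner's own API differs:

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients in decreasing-degree order,
    using one multiply-add per coefficient instead of repeated powers."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

# 2x^3 - 6x^2 + 2x - 1 evaluated at x = 3
value = horner([2.0, -6.0, 2.0, -1.0], 3.0)
```

The multivariate generalization factors the polynomial into nested Horner form across several variables, which is what makes evaluation efficient for large polynomials.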
terryyz/ice-score
[EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code
Language: Python - Size: 21.8 MB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 62 - Forks: 7
microsoft/LMChallenge
A library & tools to evaluate predictive language models.
Language: Python - Size: 122 KB - Last synced: 4 days ago - Pushed: 9 months ago - Stars: 61 - Forks: 18
naotake51/evaluation
A Composer package for building a simple expression evaluation module. It allows you to register your own functions for evaluating expressions (strings).
Language: PHP - Size: 62.5 KB - Last synced: 6 days ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
andreeaiana/newsreclib
PyTorch-Lightning Library for Neural News Recommendation
Language: Python - Size: 617 KB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 30 - Forks: 4
mjalali/renyi-kernel-entropy
[NeurIPS 2023] Code base for the Renyi Kernel Entropy (RKE) metric for generative models.
Language: Python - Size: 508 KB - Last synced: 3 days ago - Pushed: 7 days ago - Stars: 9 - Forks: 0
microsoft/rag-experiment-accelerator
The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate the process of conducting experiments and evaluations using Azure Cognitive Search and RAG pattern.
Language: Python - Size: 3.84 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 76 - Forks: 19
apacha/MusicObjectDetection
Accompanying source code for the journal paper "A Baseline for General Music Object Detection with Deep Learning"
Language: Python - Size: 530 KB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 10 - Forks: 8
google-deepmind/long-form-factuality
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
Language: Python - Size: 799 KB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 444 - Forks: 44
MMMU-Benchmark/MMMU
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Language: Python - Size: 3.32 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 254 - Forks: 16
iamrk04/LLM-Solutions-Playbook
Unlock the potential of AI-driven solutions and delve into the world of Large Language Models. Explore cutting-edge concepts, real-world applications, and best practices to build powerful systems with these state-of-the-art models.
Language: Jupyter Notebook - Size: 5.36 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 2 - Forks: 1
lyy1994/awesome-data-contamination
The Paper List on Data Contamination for Large Language Models Evaluation.
Size: 356 KB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 22 - Forks: 1
alvarorichard/CortexC
A minimalist yet powerful interpreter designed to interpret and execute a subset of the C programming language.
Language: C - Size: 23.2 MB - Last synced: 1 day ago - Pushed: 2 months ago - Stars: 8 - Forks: 0
terryyz/llm-benchmark
A list of LLM benchmark frameworks.
Size: 14.6 KB - Last synced: 5 days ago - Pushed: 3 months ago - Stars: 43 - Forks: 3
OPTML-Group/Unlearn-WorstCase
"Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning" by Chongyu Fan*, Jiancheng Liu*, Alfred Hero, Sijia Liu
Language: Python - Size: 18.4 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 5 - Forks: 0
marcusm117/IdentityChain
[ICLR 2024] Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain
Language: Python - Size: 1.67 MB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 6 - Forks: 0
viebel/klipse
Klipse is a JavaScript plugin for embedding interactive code snippets in tech blogs.
Language: HTML - Size: 89.8 MB - Last synced: 6 days ago - Pushed: over 1 year ago - Stars: 3,093 - Forks: 153
opennms-forge/stack-play
🎢 Just the fun parts! - Some docker-compose container stacks for local labs or playgrounds
Language: Shell - Size: 18.6 MB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 5 - Forks: 6
JuezUN/INGInious Fork of UCL-INGI/INGInious
UNCode is an online platform for frequent practice and automatic evaluation of computer programming, Jupyter Notebook, and hardware description language (VHDL/Verilog) assignments. It also provides a pluggable interface to your existing LMS.
Language: Python - Size: 52.2 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 8 - Forks: 6
Awrsha/Machine-Learning-and-Deep-Learning
Some of the topics, algorithms and projects in Machine Learning & Deep Learning that I have worked on and become familiar with.
Language: Jupyter Notebook - Size: 23.9 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 3 - Forks: 1
lisiarend/PRONE
R Package for preprocessing, normalizing, and analyzing proteomics data
Language: R - Size: 40.4 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 0 - Forks: 0
singh-rajiv/EvalAutoUT-Ang
Evaluation of an automatic AI-based Unit Test generation tool for TypeScript/JavaScript
Language: TypeScript - Size: 422 KB - Last synced: 10 days ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0
AmenRa/ranx
⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
Language: Python - Size: 34.5 MB - Last synced: 10 days ago - Pushed: 13 days ago - Stars: 348 - Forks: 21
danthedeckie/simpleeval
Simple Safe Sandboxed Extensible Expression Evaluator for Python
Language: Python - Size: 201 KB - Last synced: about 7 hours ago - Pushed: about 1 month ago - Stars: 426 - Forks: 83
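The sandboxing approach such evaluators rely on (parse the string to an AST, then walk only a whitelist of node types, rejecting everything else) can be sketched with the standard library. This illustrates the technique and is not simpleeval's implementation:

```python
import ast
import operator

# Whitelisted binary operators; any other node type is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def safe_eval(expr):
    """Evaluate a pure-arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("disallowed syntax: " + type(node).__name__)
    return walk(ast.parse(expr, mode="eval"))
```

Because function calls, attribute access, and names are not in the whitelist, inputs like `__import__('os')` raise `ValueError` instead of executing.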
EXP-Tools/steam-discount
Steam discounted-games leaderboard (auto-refreshing)
Language: Python - Size: 10.6 GB - Last synced: 11 days ago - Pushed: 11 days ago - Stars: 53 - Forks: 26
seungjaeryanlee/Doumi-Chess
C++ UCI Chess Engine
Language: C++ - Size: 4.33 MB - Last synced: 10 days ago - Pushed: over 4 years ago - Stars: 4 - Forks: 0
RecList/reclist
Behavioral "black-box" testing for recommender systems
Language: Python - Size: 3.7 MB - Last synced: 3 days ago - Pushed: 9 months ago - Stars: 451 - Forks: 26
ianwalter/rhino-stock-example
An example of how to use Mozilla Rhino to execute JavaScript within Java
Language: Java - Size: 1.11 MB - Last synced: 11 days ago - Pushed: about 11 years ago - Stars: 1 - Forks: 0
yilunzhu/ontogum
Repository for the OntoGUM Corpus
Language: Python - Size: 8.36 MB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 6 - Forks: 0
hitz-zentroa/latxa
Latxa: An Open Language Model and Evaluation Suite for Basque
Language: Shell - Size: 27.4 MB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 16 - Forks: 0
2KAbhishek/EvalTrivia
Expression Evaluation Trivia 🟰🔢
Language: Kotlin - Size: 189 KB - Last synced: 11 days ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
NullDev/Spendenr-AI-d
AI-powered Spendenraid (donation raid) evaluation.
Language: JavaScript - Size: 76.3 MB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 14 - Forks: 4
RichardObi/frd-score
Compute the Fréchet Radiomics Distance.
Language: Python - Size: 184 KB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 0 - Forks: 0
zzzprojects/Eval-SQL.NET
SQL Eval Function | Dynamically evaluate expressions in SQL Server using C# syntax.
Language: C# - Size: 861 KB - Last synced: 11 days ago - Pushed: 12 days ago - Stars: 95 - Forks: 40
lilakk/BooookScore
A package to generate summaries of long-form text and evaluate the coherence of these summaries. Official package for our ICLR 2024 paper, "BooookScore: A systematic exploration of book-length summarization in the era of LLMs".
Language: Python - Size: 27.1 MB - Last synced: 9 days ago - Pushed: about 1 month ago - Stars: 68 - Forks: 6
Valires/er-evaluation
An End-to-End Evaluation Framework for Entity Resolution Systems
Language: Python - Size: 62.4 MB - Last synced: 11 days ago - Pushed: 5 months ago - Stars: 22 - Forks: 3
mcthouacbb/Sirius
Chess engine
Language: C++ - Size: 36.2 MB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 15 - Forks: 0
gchudnov/bscript
BScript - AST Evaluation & Debugging
Language: Scala - Size: 2 MB - Last synced: 4 days ago - Pushed: 5 days ago - Stars: 3 - Forks: 0
research-outcome/LLM-TicTacToe-Benchmark
Benchmarking Large Language Model (LLM) Performance for Game Playing via Tic-Tac-Toe
Language: Python - Size: 772 KB - Last synced: 12 days ago - Pushed: 13 days ago - Stars: 0 - Forks: 0
MileBench/MileBench
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
Language: Python - Size: 3.51 MB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 7 - Forks: 0
GrumpyZhou/image-matching-toolbox
This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.
Language: Jupyter Notebook - Size: 20 MB - Last synced: 7 days ago - Pushed: 13 days ago - Stars: 494 - Forks: 72
zzzprojects/Eval-Expression.NET
C# Eval Expression | Evaluate, compile, and execute C# code and expressions at runtime.
Language: C# - Size: 904 KB - Last synced: 11 days ago - Pushed: 12 days ago - Stars: 427 - Forks: 84
Baukebrenninkmeijer/table-evaluator
Evaluate real and synthetic datasets against each other
Language: Jupyter Notebook - Size: 6.23 MB - Last synced: 4 days ago - Pushed: 9 months ago - Stars: 76 - Forks: 27
sourceduty/Professional_Value
🧑💼 Measure the value of professional experience.
Size: 1.95 KB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 0 - Forks: 0
char-ptr/crazed
Execute JS, Python, Rust, C++, and C via a Discord bot
Language: Rust - Size: 17.6 KB - Last synced: 13 days ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0
saschaschramm/sc2-evals
Evaluation of GPT-4 on StarCraft II
Language: Python - Size: 18.6 KB - Last synced: 14 days ago - Pushed: 14 days ago - Stars: 0 - Forks: 0
obss/jury
Comprehensive NLP Evaluation System
Language: Python - Size: 284 KB - Last synced: 10 days ago - Pushed: 24 days ago - Stars: 178 - Forks: 20