Topic: "language-model-evaluation"

microsoft/LMChallenge

A library & tools to evaluate predictive language models.

Language: Python - Size: 122 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 63 - Forks: 13

SALT-NLP/PrivacyLens

A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM agents. (NeurIPS 2024 D&B)

Language: Python - Size: 35.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 23 - Forks: 6

sb-jang/kodialogbench

Code and data for "KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark" (LREC-COLING 2024)

Language: Python - Size: 12.7 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 16 - Forks: 0

Curriculum is a new format of NLI benchmark for evaluation of broad-coverage linguistic phenomena. This linguistic-phenomena-driven benchmark can serve as an effective tool for diagnosing model behavior and verifying model learning quality.

Language: Jupyter Notebook - Size: 134 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1

Djasingh/Language-Model

Language Modeling

Language: Jupyter Notebook - Size: 725 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ashioyajotham/lm_finetuning

A from-scratch LM fine-tuning project to understand neural nets, text generation, and evals

Language: Python - Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Impelon/log-summarization

A thesis investigating the use of large language models for summarizing application logs.

Language: TeX - Size: 1.11 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
