Topic: "language-model-evaluation"
microsoft/LMChallenge
A library & tools to evaluate predictive language models.
Language: Python - Size: 122 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 63 - Forks: 13

SALT-NLP/PrivacyLens
A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM agents. (NeurIPS 2024 D&B)
Language: Python - Size: 35.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 23 - Forks: 6

sb-jang/kodialogbench
Code and data for "KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark" (LREC-COLING 2024)
Language: Python - Size: 12.7 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 16 - Forks: 0

eric11eca/curriculum-ling
Curriculum is a new NLI benchmark format for evaluating broad-coverage linguistic phenomena. This linguistic-phenomena-driven benchmark can serve as an effective tool for diagnosing model behavior and verifying model learning quality.
Language: Jupyter Notebook - Size: 134 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1

Djasingh/Language-Model
Language Modeling
Language: Jupyter Notebook - Size: 725 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ashioyajotham/lm_finetuning
A from-scratch LM fine-tuning project to understand neural nets, text generation, and evals
Language: Python - Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Impelon/log-summarization
A thesis investigating the use of large language models for summarizing application logs.
Language: TeX - Size: 1.11 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
