GitHub topics: llm-interpretability

Repositories

Yusen-Peng/CE-Bench

CE-Bench: A Contrastive Evaluation Benchmark of LLM Interpretability with Sparse Autoencoders

Language: Jupyter Notebook - Size: 406 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

Wondermongering/latent-exploration-stack

AI research portfolio bridging technical rigor and humanistic inquiry through the Eigen-Koan Matrix, Codex Illuminata, and specialized metaprompts for diverse interaction styles

Language: Python - Size: 372 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

basics-lab/spectral-explain

Fast XAI with interactions at large scale. SPEX can help you understand the output of your LLM, even if you have a long context!

Language: Jupyter Notebook - Size: 5.38 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

PaulPauls/llama3_interpretability_sae 📦

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.

Language: Python - Size: 61.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 606 - Forks: 36

Related Keywords

llm-interpretability (4), sparse-autoencoder (1), pytorch (1), open-research (1), llama3 (1), feature-steering (1), feature-extraction (1), xai (1), sparse-transformer (1), shap (1), explainable-ai (1), explainability (1), symbolic-interaction (1), python (1), prompt-engineering (1), latent-space-exploration (1), language-models (1), cognitive-frameworks (1), ai-research (1), sparse-autoencoders (1), contrastive-evaluation (1), benchmark (1)

