Topic: "llm-interpretability"
PaulPauls/llama3_interpretability_sae 📦
A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.
Language: Python - Size: 61.3 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 606 - Forks: 36
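
The repo trains sparse autoencoders on Llama 3.2 activations. As a point of reference, below is a minimal PyTorch sketch of the core SAE idea (an overcomplete encoder/decoder with an L1 sparsity penalty on the latent code); the class, dimensions, and hyperparameters here are illustrative assumptions, not the repo's actual implementation.

```python
# Minimal sparse-autoencoder sketch in PyTorch (illustrative only; names and
# hyperparameters are assumptions, not this repo's actual API).
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder with a ReLU latent code trained to be sparse."""

    def __init__(self, d_model: int = 2048, d_hidden: int = 16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, activations: torch.Tensor):
        latent = torch.relu(self.encoder(activations))   # sparse feature code
        reconstruction = self.decoder(latent)
        return reconstruction, latent


def sae_loss(reconstruction, activations, latent, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that encourages few active features.
    mse = torch.mean((reconstruction - activations) ** 2)
    sparsity = l1_coeff * latent.abs().mean()
    return mse + sparsity


# Usage: in practice the inputs would be residual-stream activations captured
# from the LLM; random data stands in for them here.
sae = SparseAutoencoder()
acts = torch.randn(8, 2048)
recon, code = sae(acts)
loss = sae_loss(recon, acts, code)
loss.backward()
```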

basics-lab/spectral-explain
Fast explainable AI (XAI) with feature interactions at large scale. SPEX helps you understand your LLM's output, even with long contexts!
Language: Jupyter Notebook - Size: 5.38 MB - Last synced at: 12 days ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0
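
For context on what interaction-based attribution means, here is a brute-force sketch that estimates pairwise token interactions by masking and re-querying a scoring function. This is a generic illustration only: SPEX's contribution is precisely that it avoids this exhaustive enumeration via sparse spectral methods, and the `score_fn` and mask token here are assumptions.

```python
# Generic masking-based interaction probe (illustrative only; SPEX itself uses
# sparse spectral methods rather than this brute-force enumeration).
from itertools import combinations


def pairwise_interactions(score_fn, tokens, mask_token="[MASK]"):
    """Estimate pairwise interactions between input tokens.

    score_fn: caller-supplied callable that takes a list of tokens and returns a
              scalar score (e.g. the model's log-probability of a target output).
    """

    def masked_score(removed):
        masked = [mask_token if i in removed else t for i, t in enumerate(tokens)]
        return score_fn(masked)

    full = masked_score(set())
    interactions = {}
    for i, j in combinations(range(len(tokens)), 2):
        # Inclusion-exclusion estimate of how tokens i and j interact:
        # nonzero when their joint effect differs from the sum of their solo effects.
        interactions[(i, j)] = (
            full - masked_score({i}) - masked_score({j}) + masked_score({i, j})
        )
    return interactions


# Usage with a toy additive scoring function (interactions come out near zero,
# as expected for a purely additive score):
toy_score = lambda toks: float(sum(len(t) for t in toks if t != "[MASK]"))
print(pairwise_interactions(toy_score, ["The", "cat", "sat"]))
```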

Yusen-Peng/CE-Bench
CE-Bench: A Contrastive Evaluation Benchmark of LLM Interpretability with Sparse Autoencoders
Language: Python - Size: 304 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 1 - Forks: 0
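
A contrastive evaluation of SAE interpretability broadly asks whether individual SAE latents separate prompts that do and do not contain a target concept. The sketch below scores latents by an effect-size-style separation measure; the function, data layout, and metric are hypothetical and may differ from CE-Bench's actual protocol.

```python
# Hypothetical contrastive scoring of SAE latents (a sketch of the general idea;
# CE-Bench's actual metric and data format may differ).
import torch


def contrastive_latent_scores(latents_pos: torch.Tensor,
                              latents_neg: torch.Tensor) -> torch.Tensor:
    """Score each SAE latent by how well it separates two contrastive prompt sets.

    latents_pos / latents_neg: (n_prompts, n_latents) mean SAE latent activations
    on prompts with / without the target concept.
    """
    mean_pos = latents_pos.mean(dim=0)
    mean_neg = latents_neg.mean(dim=0)
    pooled_std = 0.5 * (latents_pos.std(dim=0) + latents_neg.std(dim=0)) + 1e-6
    # Higher score = the latent fires much more on concept prompts (effect size).
    return (mean_pos - mean_neg) / pooled_std


# Usage: rank latents and inspect the top ones as candidate concept features.
scores = contrastive_latent_scores(torch.rand(32, 16384), torch.rand(32, 16384))
top_latents = scores.topk(5).indices
```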

Wondermongering/latent-exploration-stack
An AI research portfolio bridging technical rigor and humanistic inquiry through the Eigen-Koan Matrix, Codex Illuminata, and specialized metaprompts for diverse interaction styles.
Language: Python - Size: 372 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
