GitHub topics: llm-interpretability
Wondermongering/latent-exploration-stack
AI research portfolio bridging technical rigor and humanistic inquiry through the Eigen-Koan Matrix, Codex Illuminata, and specialized metaprompts for diverse interaction styles.
Language: Python - Size: 372 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

Yusen-Peng/KAN-LLaMA
Can KAN-based Sparse Autoencoders Interpret a Large Language Model?
Language: Python - Size: 208 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0

PaulPauls/llama3_interpretability_sae 📦
A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.
Language: Python - Size: 61.3 MB - Last synced at: 21 days ago - Pushed at: about 1 month ago - Stars: 606 - Forks: 36

basics-lab/spectral-explain
Fast XAI with interactions at large scale. SPEX can help you understand the output of your LLM, even if you have a long context!
Language: Jupyter Notebook - Size: 4.73 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0
