An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: sparse-autoencoder

recombee/CompresSAE

Sparse Embedding Compression for Scalable Retrieval in Recommender Systems

Language: Python - Size: 4.38 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 31 - Forks: 2

ruizheliUOA/Awesome-Interpretability-in-Large-Language-Models

This repository collects all relevant resources about interpretability in LLMs

Size: 63.5 KB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 383 - Forks: 26

botosadam/matryoshka

🚀 Build Ruby gems that utilize Rust for enhanced performance through two effective design patterns for seamless collaboration.

Size: 1.34 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

chrisliu298/awesome-sparse-autoencoders

A resource repository of sparse autoencoders for large language models

Size: 8.79 KB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 0

Yusen-Peng/CE-Bench

[BlackboxNLP Workshop @ EMNLP, 2025] CE-Bench: A Contrastive Evaluation Benchmark of LLM Interpretability with Sparse Autoencoders

Language: Jupyter Notebook - Size: 412 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 3 - Forks: 0

explanare/ravel

Evaluate interpretability methods on localizing and disentangling concepts in LLMs.

Language: Jupyter Notebook - Size: 649 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 56 - Forks: 9

Ki-Seki/Awesome-Transformer-Visualization

Explore visualization tools for understanding Transformer-based large language models (LLMs)

Size: 23.4 MB - Last synced at: 7 days ago - Pushed at: 12 months ago - Stars: 20 - Forks: 1

vgel/repeng

A library for making RepE control vectors

Language: Jupyter Notebook - Size: 299 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 649 - Forks: 50

manik-sethi/hallucination-circuits

Tentatively for: Using Monosemantic Features From Sparse Auto-Encoders to Detect Hallucinations

Language: Jupyter Notebook - Size: 270 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

corl-team/flexsae

Official Triton kernels for TopK and HierarchicalTopK Sparse Autoencoder decoders.

Language: Python - Size: 20.5 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 15 - Forks: 0

Faraday-dot-py/MATS-9.0

Do different AIs dream of the same electric sheep?

Language: Shell - Size: 897 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Butanium/tiny-activation-dashboard

A tiny easily hackable implementation of a feature dashboard.

Language: Jupyter Notebook - Size: 127 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 14 - Forks: 2

glami/sansa

SANSA - sparse EASE for millions of items

Language: Python - Size: 1.71 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 44 - Forks: 6

tim-lawson/mlsae

Multi-Layer Sparse Autoencoders (ICLR 2025)

Language: Python - Size: 642 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 24 - Forks: 0

xycoord/Language-Modelling

Implementations and Experiments: Transformers, RoPE, KV cache, SAEs, Tokenisers

Language: Python - Size: 1.52 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

snooky23/K-Sparse-AutoEncoder

Sparse autoencoder and regular MNIST classification with mini-batches

Language: Jupyter Notebook - Size: 4.98 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 23 - Forks: 9

codelion/pts

Pivotal Token Search

Language: Python - Size: 692 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 111 - Forks: 7

PaulPauls/llama3_interpretability_sae 📦

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.

Language: Python - Size: 61.3 MB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 618 - Forks: 38

maxdreyer/attributing-clip

Repository for "From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance"

Language: Python - Size: 3.35 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

aarnphm/morph

exploration WYSIWYG editor

Language: TypeScript - Size: 63.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0

pmcurtin/model-crosscoders

Final project for cs2222

Language: Jupyter Notebook - Size: 33 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

zer0int/CLIP-SAE-finetune

Sparse Autoencoders (SAE) vs CLIP fine-tuning fun.

Language: Python - Size: 10.8 MB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 15 - Forks: 4

DrejcPesjak/scaling-monosemanticity-llama

Reproducing Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet using LLaMA. This project explores monosemantic neurons in large language models, implementing and extending methods to scale and analyze interpretability in LLaMA-based architectures.

Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 4 - Forks: 0

seonglae/emgsd-hermes

Steering GPT2-EMGSD to be less biased and generating stereotyped text with vanilla GPT2, without fine-tuning or prompt engineering

Language: Jupyter Notebook - Size: 505 KB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

graphixxxx/Denoising_AutoEncoder

This project integrates Autoencoders, PCA, and CNNs for efficient image processing, combining dimensionality reduction, denoising, and enhanced feature extraction for image analysis and compression.

Size: 1.95 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

cxcscmu/embedding-scope

Interpret and control dense embedding via sparse autoencoder.

Language: Python - Size: 3.52 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 0

Specoptor/bot-iot

Implement a sparse autoencoder on the bot-iot dataset for dimensionality reduction followed by computation of reconstruction error, F1 score, recall, accuracy, weights, and threshold amongst other metrics

Language: Jupyter Notebook - Size: 62.3 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

MaheepChaudhary/SAE-Ravel

Answers the question "How to do patching on all available SAEs on GPT-2?". Official implementation of the paper "Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small"

Language: Python - Size: 11.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 10 - Forks: 1

mrquincle/keras-adversarial-autoencoders

Experiments with Adversarial Autoencoders using Keras

Language: Jupyter Notebook - Size: 6.84 MB - Last synced at: 5 months ago - Pushed at: almost 6 years ago - Stars: 22 - Forks: 13

SayanChakraborty126/ML-CODES

This repository contains Python code for Autoencoder, Sparse Autoencoder, HMM, Expectation-Maximization, Sum-Product Algorithm, ANN, Disparity Map, and PCA.

Language: Jupyter Notebook - Size: 146 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

wblgers/tensorflow_stacked_denoising_autoencoder

Implementation of the stacked denoising autoencoder in Tensorflow

Language: Python - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 191 - Forks: 85

khoink94/tensorflow-Deep-learning

Tensorflow Examples

Language: Python - Size: 27.2 MB - Last synced at: over 2 years ago - Pushed at: over 8 years ago - Stars: 27 - Forks: 11

syorami/Autoencoders-Variants

Pytorch implementations of various types of autoencoders

Language: Python - Size: 3.86 MB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 54 - Forks: 19

shantanu-ai/DPN-SA

Repository for Deep Propensity Network - Sparse Autoencoder (DPN-SA), which calculates propensity scores using a sparse autoencoder

Language: Python - Size: 238 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 2

sushantMoon/isi-nna

Neural Network Architecture | ISI Kolkata

Language: Jupyter Notebook - Size: 4.3 MB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 0

mcanalesmayo/SparseAutoencoder

Sparse Autoencoder based on the Unsupervised Feature Learning and Deep Learning tutorial from Stanford University

Language: Matlab - Size: 10.7 MB - Last synced at: almost 3 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

vivekamin/semi-supervised-learning

Implemented semi-supervised learning for digit recognition using Sparse Autoencoder

Language: Python - Size: 10.7 KB - Last synced at: almost 3 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

Related Keywords
mechanistic-interpretability (11), sae (8), llm (7), deep-learning (6), machine-learning (5), autoencoder (5), sparse-autoencoders (3), large-language-models (3), tensorflow (3), pytorch (3), steering-vector (2), interpretability (2), transformer (2), variational-autoencoder (2), rnn (2), denoising-autoencoders (2), language-model (2), clip (2), llama3 (2), cnn (2), phi4-mini (1), bias-mitigation (1), emgsd (1), gpt2 (1), stereotype (1), sum-product-algorithm (1), attributions (1), sum-product (1), pca (1), machine-learning-algorithms (1), mechanistic (1), expectation-maximization (1), ann (1), disparity-map (1), autoenncoder (1), feature-visualization (1), large-language-model (1), capstone-project (1), experimental (1), interface (1), anthropic-claude (1), huggingface-transformers (1), interpretable-deep-learning (1), causal-inference (1), causality (1), deep-propensity-network (1), dpn-sa (1), embedding (1), benchmark-datasets (1), convolution (1), lstm (1), neural-network-architectures (1), numpy (1), pcan (1), perceptron (1), bias-correction (1), llava-next (1), matryoshka (1), matryoshka-representation-learning (1), multimodal-large-language-models (1), nested (1), recursion-schemes (1), representation-learning (1), workflow (1), analysis (1), autoencoder-classification (1), autoencoders (1), clustering (1), convolutional-autoencoders (1), denoising (1), gans (1), neural-networks (1), pre-training (1), single-cell-rna-seq (1), unsupervised-learning (1), crosscoders (1), hallucination-detection (1), anomaly-detection (1), iot (1), mnist-handwriting-recognition (1), semi-supervised-learning (1), rotary-positional-embedding (1), tokenizer (1), ai (1), feature-detection (1), python (1), torch (1), cache (1), chatb (1), data-pipeline (1), data-quality-monitoring (1), fold (1), foundation-models (1), go (1), levelcache (1), llama (1), pivotal-token-search (1), pivotal-tokens (1), reasoning-agent (1), reasoning-language-models (1)