An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: sparse-autoencoder

pmcurtin/model-crosscoders

Final project for cs2222

Language: Jupyter Notebook - Size: 33 MB - Last synced at: about 18 hours ago - Pushed at: about 20 hours ago - Stars: 0 - Forks: 0

codelion/pts

Pivotal Token Search

Language: Python - Size: 200 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 6 - Forks: 1

vgel/repeng

A library for making RepE control vectors

Language: Jupyter Notebook - Size: 315 KB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 587 - Forks: 46

ruizheliUOA/Awesome-Interpretability-in-Large-Language-Models

This repository collects all relevant resources about interpretability in LLMs

Size: 63.5 KB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 343 - Forks: 24

aarnphm/morph

exploration WYSIWYG editor

Language: TypeScript - Size: 63.6 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6 - Forks: 0

Butanium/tiny-activation-dashboard

A tiny, easily hackable implementation of a feature dashboard.

Language: Jupyter Notebook - Size: 109 KB - Last synced at: about 4 hours ago - Pushed at: 3 months ago - Stars: 10 - Forks: 2

Ki-Seki/Awesome-Transformer-Visualization

Explore visualization tools for understanding Transformer-based large language models (LLMs)

Size: 23.4 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 12 - Forks: 2

DrejcPesjak/scaling-monosemanticity-llama

Reproducing Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet using LLaMA. This project explores monosemantic neurons in large language models, implementing and extending methods to scale and analyze interpretability in LLaMA-based architectures.

Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 4 - Forks: 0

explanare/ravel

Evaluate interpretability methods on localizing and disentangling concepts in LLMs.

Language: Python - Size: 661 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 43 - Forks: 7

glami/sansa

SANSA - sparse EASE for millions of items

Language: Python - Size: 1.71 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 42 - Forks: 6

PaulPauls/llama3_interpretability_sae 📦

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.

Language: Python - Size: 61.3 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 606 - Forks: 36

zer0int/CLIP-SAE-finetune

Sparse Autoencoders (SAE) vs CLIP fine-tuning fun.

Language: Python - Size: 10.8 MB - Last synced at: 15 days ago - Pushed at: 5 months ago - Stars: 14 - Forks: 3

seonglae/emgsd-hermes

Steering GPT2-EMGSD to be less biased and generating stereotyped text with vanilla GPT2, without fine-tuning or prompt engineering

Language: Jupyter Notebook - Size: 505 KB - Last synced at: about 3 hours ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

graphixxxx/Denoising_AutoEncoder

This project integrates Autoencoders, PCA, and CNNs for efficient image processing, combining dimensionality reduction, denoising, and enhanced feature extraction for image analysis and compression.

Size: 1.95 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

cxcscmu/embedding-scope

Interpret and control dense embeddings via sparse autoencoders.

Language: Python - Size: 3.52 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

tim-lawson/mlsae

Multi-Layer Sparse Autoencoders (ICLR 2025)

Language: Python - Size: 642 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 17 - Forks: 0

MaheepChaudhary/SAE-Ravel

Answers the question "How to do patching on all available SAEs on GPT-2?". Official implementation of the paper "Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small"

Language: Python - Size: 11.9 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 10 - Forks: 1

chrisliu298/awesome-sparse-autoencoders

A resource repository of sparse autoencoders for large language models

Size: 8.79 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

snooky23/K-Sparse-AutoEncoder

Sparse autoencoder and regular MNIST classification with mini-batches

Language: Jupyter Notebook - Size: 4.62 MB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 19 - Forks: 9

mrquincle/keras-adversarial-autoencoders

Experiments with Adversarial Autoencoders using Keras

Language: Jupyter Notebook - Size: 6.84 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 22 - Forks: 13

SayanChakraborty126/ML-CODES

This repository contains Python code for autoencoders, sparse autoencoders, HMM, expectation-maximization, the sum-product algorithm, ANN, disparity maps, and PCA.

Language: Jupyter Notebook - Size: 146 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

wblgers/tensorflow_stacked_denoising_autoencoder

Implementation of the stacked denoising autoencoder in TensorFlow

Language: Python - Size: 11 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 191 - Forks: 85

khoink94/tensorflow-Deep-learning

TensorFlow examples

Language: Python - Size: 27.2 MB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 27 - Forks: 11

Specoptor/bot-iot

Implements a sparse autoencoder on the bot-iot dataset for dimensionality reduction, followed by computation of reconstruction error, F1 score, recall, accuracy, weights, and threshold, among other metrics

Language: Jupyter Notebook - Size: 62.3 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

syorami/Autoencoders-Variants

PyTorch implementations of various types of autoencoders

Language: Python - Size: 3.86 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 54 - Forks: 19

shantanu-ai/DPN-SA

Repository for Deep Propensity Network - Sparse Autoencoder (DPN-SA), which calculates propensity scores using a sparse autoencoder

Language: Python - Size: 238 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 2

sushantMoon/isi-nna

Neural Network Architecture | ISI Kolkata

Language: Jupyter Notebook - Size: 4.3 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

mcanalesmayo/SparseAutoencoder

Sparse autoencoder based on the Unsupervised Feature Learning and Deep Learning tutorial from Stanford University

Language: Matlab - Size: 10.7 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

vivekamin/semi-supervised-learning

Implemented semi-supervised learning for digit recognition using Sparse Autoencoder

Language: Python - Size: 10.7 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

Related Keywords
29: sparse-autoencoder
6: mechanistic-interpretability, deep-learning, sae
5: autoencoder
4: llm, machine-learning
3: tensorflow, sparse-autoencoders
2: transformer, llama3, cnn, variational-autoencoder, rnn, steering-vector, pytorch, large-language-models, denoising-autoencoders
1: stereotype, machine-learning-algorithms, expectation-maximization, disparity-map, autoenncoder, analysis, autoencoder-classification, ann, autoencoders, keras, jupyter, clustering, adversarial-autoencoder, python3, convolutional-autoencoders, pure-python, mnist-dataset, denoising, deep-neural-networks, gans, neural-networks, large-language-model, embedding, single-cell-rna-seq, pre-training, semi-supervised-learning, mnist-handwriting-recognition, unsupervised-learning, perceptron, pcan, numpy, neural-network-architectures, lstm, convolution, dpn-sa, deep-propensity-network, causality, causal-inference, iot, anomaly-detection, tf, sparseae, rbm, nearest-neighbors, multilayer-perceptron-network, mnist, mlp, logistic-regression, linear-regression, grbm, drnn, brnn, ae, stacked-autoencoder, sum-product-algorithm, sum-product, pca, explainable-ai, bert, awesome, attention-mechanism, feature-visualization, feature-dashboard, interface, experimental, capstone-project, interpretability-and-explainability, dictionary-learning, transformers, saes, representation-engineering, language-model, tokens, reasoning-models, reasoning-language-models, reasoning-agent, pivotal-tokens, pivotal-token-search, phi4-mini, phi4, phi-4-mini, phi-4