GitHub / DrejcPesjak / scaling-monosemanticity-llama
Reproducing "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet" with LLaMA. This project explores monosemantic features in large language models, training sparse autoencoders on LLaMA activations to extract and analyze interpretable features (a minimal sketch of the approach follows the metadata below).
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DrejcPesjak%2Fscaling-monosemanticity-llama
PURL: pkg:github/DrejcPesjak/scaling-monosemanticity-llama
Stars: 4
Forks: 0
Open issues: 0
License: MIT
Language: Jupyter Notebook
Size: 14.7 MB
Dependencies parsed at: Pending
Created at: 10 months ago
Updated at: 4 months ago
Pushed at: 4 months ago
Last synced at: 4 months ago
Topics: anthropic-claude, huggingface-transformers, interpretable-deep-learning, llama3, llm, sparse-autoencoder
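To make the topic tags concrete, below is a minimal sketch of the core technique the repository's name and topics point to: a sparse autoencoder trained to reconstruct model activations with an L1 sparsity penalty, so that each learned feature tends to fire on one interpretable pattern. The dimensions, coefficients, and synthetic "activations" here are illustrative assumptions, not values taken from the repository.

```python
# Minimal sparse-autoencoder sketch (PyTorch). Sizes and hyperparameters are
# assumptions for illustration; the repo's notebooks may differ.
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    """One-hidden-layer autoencoder with a ReLU bottleneck, trained to
    reconstruct residual-stream activations under an L1 sparsity penalty."""

    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))  # sparse feature activations
        recon = self.decoder(features)          # reconstruction of the input
        return recon, features


def sae_loss(x, recon, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 term that pushes most features to zero.
    mse = (recon - x).pow(2).mean()
    sparsity = features.abs().sum(dim=-1).mean()
    return mse + l1_coeff * sparsity


if __name__ == "__main__":
    d_model, d_features = 512, 4096  # assumed sizes; real SAEs use the LLM's hidden dim
    sae = SparseAutoencoder(d_model, d_features)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)

    # Stand-in for a batch of LLaMA residual-stream activations.
    x = torch.randn(64, d_model)
    for _ in range(10):
        recon, feats = sae(x)
        loss = sae_loss(x, recon, feats)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"loss: {loss.item():.4f}")
```

In practice the input batch would be activations captured from a LLaMA 3 layer (e.g. via Hugging Face Transformers hooks), and the learned feature directions would then be inspected on the texts that activate them most strongly.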