An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: interpretability

SeldonIO/alibi

Algorithms for explaining machine learning models

Language: Python - Size: 30.3 MB - Last synced at: 38 minutes ago - Pushed at: 5 days ago - Stars: 2,500 - Forks: 257

jphall663/awesome-machine-learning-interpretability

A curated list of awesome responsible machine learning resources.

Size: 4.45 MB - Last synced at: 44 minutes ago - Pushed at: 27 days ago - Stars: 3,783 - Forks: 599

Dependable-Intelligent-Systems-Lab/xwhy

Explaining black boxes with a SMILE: Statistical Model-agnostic Interpretability with Local Explanations

Language: JavaScript - Size: 24.9 MB - Last synced at: about 9 hours ago - Pushed at: about 10 hours ago - Stars: 10 - Forks: 2

csinva/imodels

Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).

Language: Jupyter Notebook - Size: 162 MB - Last synced at: about 9 hours ago - Pushed at: 2 months ago - Stars: 1,454 - Forks: 124

AdamCoscia/KnowledgeVIS

Visually compare fill-in-the-blank LLM prompts to uncover learned biases and associations!

Language: JavaScript - Size: 6.59 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 8 - Forks: 1

boniolp/kGraph

Graph Embedding for Interpretable Time Series Clustering

Language: Python - Size: 49.4 MB - Last synced at: about 17 hours ago - Pushed at: about 18 hours ago - Stars: 28 - Forks: 1

frgfm/torch-cam

Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)

Language: Python - Size: 10.1 MB - Last synced at: about 11 hours ago - Pushed at: 2 days ago - Stars: 2,189 - Forks: 220

hijohnnylin/neuronpedia

open source interpretability platform 🧠

Language: TypeScript - Size: 10.3 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 114 - Forks: 11

stellargraph/stellargraph

StellarGraph - Machine Learning on Graphs

Language: Python - Size: 92.5 MB - Last synced at: about 2 hours ago - Pushed at: about 1 year ago - Stars: 3,004 - Forks: 434

shap/shap

A game theoretic approach to explain the output of any machine learning model.

Language: Jupyter Notebook - Size: 301 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 23,858 - Forks: 3,359

ndif-team/nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.

Language: Jupyter Notebook - Size: 49.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 562 - Forks: 50

jacobgil/pytorch-grad-cam

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

Language: Python - Size: 134 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 11,610 - Forks: 1,635

pytorch/captum

Model interpretability and understanding for PyTorch

Language: Python - Size: 306 MB - Last synced at: about 15 hours ago - Pushed at: 6 days ago - Stars: 5,209 - Forks: 517

sicara/tf-explain

Interpretability Methods for tf.keras models with Tensorflow 2.x

Language: Python - Size: 931 KB - Last synced at: 1 day ago - Pushed at: 11 months ago - Stars: 1,026 - Forks: 110

poloclub/webshap

JavaScript library to explain any machine learning models anywhere!

Language: TypeScript - Size: 35.5 MB - Last synced at: about 11 hours ago - Pushed at: about 2 years ago - Stars: 56 - Forks: 11

chaoyanghe/Awesome-Federated-Learning

FedML - The Research and Production Integrated Federated Learning Library: https://fedml.ai

Size: 210 KB - Last synced at: about 1 hour ago - Pushed at: over 2 years ago - Stars: 1,957 - Forks: 329

stanfordnlp/axbench

Stanford NLP Python library for benchmarking the utility of LLM interpretability methods

Language: Python - Size: 617 MB - Last synced at: about 2 hours ago - Pushed at: about 2 months ago - Stars: 78 - Forks: 6

boniolp/graphit

Graph-based Time Series Clustering Visualisation Tools

Language: Python - Size: 5.75 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

faded53222/NSWord

RNA Modification Detection using Nanopore Direct RNA Sequencing via improved Transformer

Language: Jupyter Notebook - Size: 19.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

stanfordnlp/pyreft

Stanford NLP Python library for Representation Finetuning (ReFT)

Language: Python - Size: 104 MB - Last synced at: about 14 hours ago - Pushed at: 3 months ago - Stars: 1,466 - Forks: 125

alvinwan/neural-backed-decision-trees

Making decision trees competitive with neural networks on CIFAR10, CIFAR100, TinyImagenet200, Imagenet

Language: Python - Size: 2.57 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 621 - Forks: 132

iancovert/sage

For calculating global feature importance using Shapley values.

Language: Python - Size: 7.93 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 270 - Forks: 35

vanderschaarlab/autoprognosis

A system for automating the design of predictive modeling pipelines tailored for clinical prognosis.

Language: Python - Size: 960 KB - Last synced at: 1 day ago - Pushed at: about 2 months ago - Stars: 147 - Forks: 28

google-deepmind/penzai

A JAX research toolkit for building, editing, and visualizing neural networks.

Language: Python - Size: 484 MB - Last synced at: about 7 hours ago - Pushed at: 18 days ago - Stars: 1,779 - Forks: 62

g8a9/ferret

A python package for benchmarking interpretability techniques on Transformers.

Language: Python - Size: 1.52 MB - Last synced at: about 5 hours ago - Pushed at: 8 months ago - Stars: 211 - Forks: 15

SteveKGYang/MentalLLaMA

This repository introduces MentaLLaMA, the first open-source instruction-following large language model for interpretable mental health analysis.

Language: Python - Size: 13.2 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 260 - Forks: 27

MAIF/shapash

🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models

Language: Jupyter Notebook - Size: 61.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,872 - Forks: 346

IAAR-Shanghai/Awesome-Attention-Heads

An awesome repository & A comprehensive survey on interpretability of LLM attention heads.

Language: TeX - Size: 6.07 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 348 - Forks: 12

ZFancy/awesome-activation-engineering

A curated list of resources for activation engineering

Size: 174 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 69 - Forks: 1

ML4BM-Lab/SENA

Official repository for the SENA-discrepancy-VAE model.

Language: Python - Size: 27.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 4 - Forks: 1

interpretml/interpret

Fit interpretable models. Explain blackbox machine learning.

Language: C++ - Size: 14.7 MB - Last synced at: 2 days ago - Pushed at: 19 days ago - Stars: 6,486 - Forks: 746

bartbussmann/BatchTopK

Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)

Language: Python - Size: 22.5 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 38 - Forks: 6

evandez/neuron-descriptions

Natural Language Descriptions of Deep Visual Features, ICLR 2022

Language: Python - Size: 3.04 MB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 65 - Forks: 7

DavidUdell/sparse_circuit_discovery

Circuit discovery in GPT-2 small, using sparse autoencoding

Language: Python - Size: 19.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 7 - Forks: 1

nathanwang16/Fractal

Models learn representations, and world patterns. Fractal is beautiful, but not the key.

Language: Jupyter Notebook - Size: 4.47 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

ndif-team/ndif-website

The website for NDIF, the National Deep Inference Fabric

Language: HTML - Size: 39.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 1

EthicalML/xai

XAI - An eXplainability toolbox for machine learning

Language: Python - Size: 17.8 MB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 1,168 - Forks: 179

ndif-team/ndif

The NDIF server, which performs deep inference and serves nnsight requests remotely

Language: Python - Size: 18.6 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 25 - Forks: 7

microsoft/responsible-ai-toolbox

Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

Language: TypeScript - Size: 111 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 1,529 - Forks: 402

google/yggdrasil-decision-forests

A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.

Language: C++ - Size: 39.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 569 - Forks: 60

Oxid15/xai-benchmark

Open and extensible benchmark for XAI methods

Language: Python - Size: 1.78 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 6 - Forks: 0

wangyongjie-ntu/Awesome-explainable-AI

A collection of research materials on explainable AI/ML

Language: Markdown - Size: 1.93 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 1,494 - Forks: 203

pietrobarbiero/pytorch_explain

PyTorch Explain: Interpretable Deep Learning in Python.

Language: Jupyter Notebook - Size: 42.1 MB - Last synced at: 3 days ago - Pushed at: 12 months ago - Stars: 154 - Forks: 14

EthicalML/awesome-production-machine-learning

A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning

Size: 2.36 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 18,412 - Forks: 2,342

tensorflow/decision-forests

A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.

Language: Python - Size: 5.87 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 680 - Forks: 113

ModelOriented/hstats

Friedman's H-statistics

Language: R - Size: 217 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 30 - Forks: 1

understandable-machine-intelligence-lab/Quantus

Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations

Language: Jupyter Notebook - Size: 147 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 598 - Forks: 77

ModelOriented/DALEX

moDel Agnostic Language for Exploration and eXplanation

Language: Python - Size: 798 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 1,420 - Forks: 168

tensorflow/tcav

Code for the TCAV ML interpretability project

Language: Jupyter Notebook - Size: 625 KB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 640 - Forks: 152

mmschlk/shapiq

Shapley Interactions and Shapley Values for Machine Learning

Language: Python - Size: 309 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 514 - Forks: 34

jorge-martinez-gil/graphcodebert-interpretability

Augmenting the Interpretability of GraphCodeBERT for Code Similarity Tasks

Language: Python - Size: 9.71 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 5 - Forks: 0

chr5tphr/zennit

Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.

Language: Python - Size: 2.28 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 225 - Forks: 34

deel-ai/xplique

👋 Xplique is a Neural Networks Explainability Toolbox

Language: Python - Size: 33.4 MB - Last synced at: about 13 hours ago - Pushed at: 7 months ago - Stars: 688 - Forks: 58

stanfordnlp/pyvene

Stanford NLP Python library for understanding and improving PyTorch models via interventions

Language: Python - Size: 25.4 MB - Last synced at: about 14 hours ago - Pushed at: 13 days ago - Stars: 740 - Forks: 82

jasonjmcghee/livelove

Love2D LSP (VS Code / Neovim / Zed / etc.) extension for live coding and live variable tracking

Language: JavaScript - Size: 5.33 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 130 - Forks: 2

rmovva/HypotheSAEs

Hypothesizing interpretable relationships in text datasets using sparse autoencoders.

Language: Jupyter Notebook - Size: 11.5 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 24 - Forks: 2

bgreenwell/ebm

Explainable Boosting Machines

Language: R - Size: 44.5 MB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 1

KempnerInstitute/overcomplete

👋 Overcomplete is a Vision-based SAE Toolbox

Language: Python - Size: 57.2 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 53 - Forks: 1

rigvedrs/YOLO-V11-CAM

Wanna know what your model sees? Here's a package for applying EigenCAM and generating heatmaps from the new YOLO V11 model

Language: Jupyter Notebook - Size: 40 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 201 - Forks: 42

Sorades/CLAT

[TMI 2024] Code for "Concept-based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis"

Language: Python - Size: 617 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 19 - Forks: 0

ChicagoHAI/hypothesis-generation

This is the official repository for HypoGeniC (Hypothesis Generation in Context) and HypoRefine, which are automated, data-driven tools that leverage large language models to generate hypotheses for open-domain research.

Language: Python - Size: 121 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 65 - Forks: 8

microsoft/automated-brain-explanations

Generating and validating natural-language explanations for the brain.

Language: Jupyter Notebook - Size: 1.06 GB - Last synced at: 1 day ago - Pushed at: about 2 months ago - Stars: 52 - Forks: 6

poloclub/timbertrek

Explore and compare 1K+ accurate decision trees in your browser!

Language: TypeScript - Size: 36.9 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 161 - Forks: 10

tensorflow/lucid 📦

A collection of infrastructure and tools for research in neural network interpretability.

Language: Jupyter Notebook - Size: 141 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 4,694 - Forks: 654

PKU-Alignment/aligner

[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct

Language: Python - Size: 16.3 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 170 - Forks: 8

JoaoLages/diffusers-interpret

Diffusers-Interpret 🤗🧨🕵️‍♀️: Model explainability for 🤗 Diffusers. Get explanations for your generated images.

Language: Jupyter Notebook - Size: 77.5 MB - Last synced at: about 9 hours ago - Pushed at: over 2 years ago - Stars: 276 - Forks: 14

trustyai-explainability/trustyai-explainability

TrustyAI Explainability Toolkit

Language: Java - Size: 19 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 39 - Forks: 42

JHoelli/Awesome-Time-Series-Explainability

A list of (post-hoc) XAI for time series

Size: 424 KB - Last synced at: 13 days ago - Pushed at: 8 months ago - Stars: 134 - Forks: 16

OpenMOSS/Language-Model-SAEs

For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research.

Language: Python - Size: 10.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 113 - Forks: 13

dobriban/Principles-of-AI-LLMs

Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety and robustness (jailbreaking, oversight, uncertainty), representations, interpretability (circuits), etc.

Size: 188 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 31 - Forks: 2

EleutherAI/knowledge-neurons

A library for finding knowledge neurons in pretrained transformer models.

Language: Python - Size: 11.6 MB - Last synced at: 6 days ago - Pushed at: about 3 years ago - Stars: 157 - Forks: 18

dmis-lab/Monet

[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers

Language: Python - Size: 252 KB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 66 - Forks: 3

koulanurag/mmn

Moore Machine Networks (MMN): Learning Finite-State Representations of Recurrent Policy Networks

Language: Python - Size: 115 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 50 - Forks: 13

datamllab/awesome-fairness-in-ai

A curated list of awesome Fairness in AI resources

Size: 33.2 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 320 - Forks: 65

eliaskempf/ideal_words

A PyTorch implementation of ideal word computation.

Language: Python - Size: 48.8 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 4 - Forks: 0

haofanwang/Awesome-Computer-Vision

Awesome Resources for Advanced Computer Vision Topics

Size: 93.8 KB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 230 - Forks: 43

csinva/tree-prompt

Tree prompting: easy-to-use scikit-learn interface for improved prompting.

Language: Jupyter Notebook - Size: 18.2 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 36 - Forks: 4

taufeeque9/codebook-features

Sparse and discrete interpretability tool for neural networks

Language: Python - Size: 3.58 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 61 - Forks: 3

oneTaken/awesome_deep_learning_interpretability

Highly cited and top-conference papers from recent years on the interpretability of deep learning neural network models (with code)

Size: 156 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 747 - Forks: 122

ThyrixYang/awesome-artificial-intelligence-research

A curated list of Artificial Intelligence (AI) research that tracks cutting-edge trends in the field, including recommender systems, computer vision, machine learning, etc.

Size: 44.9 KB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 124 - Forks: 14

ssfgunner/IIS

[ICLR 2025 Spotlight] This is the official repository for our paper: "Enhancing Pre-trained Representation Classifiability can Boost its Interpretability".

Language: Python - Size: 2.89 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 15 - Forks: 0

alanqrwang/keymorph

Robust multimodal image registration via keypoints

Language: Python - Size: 690 MB - Last synced at: 15 days ago - Pushed at: 9 months ago - Stars: 78 - Forks: 17

ALEX-nlp/MUI-Eval

Repository for the paper: Revisiting LLM Evaluation through Mechanism Interpretability: a New Metric and Model Utility Law

Language: Python - Size: 7.36 MB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 0

BirkhoffG/Explainable-ML-Papers

A list of research papers on explainable machine learning.

Size: 13.7 KB - Last synced at: 6 days ago - Pushed at: almost 4 years ago - Stars: 36 - Forks: 3

salvatorecalderaro/geco_explainer

GECo algorithm for Graph Neural Networks Explanation

Language: Jupyter Notebook - Size: 390 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0

zju-vipa/awesome-neural-trees

Introduction, selected papers, and corresponding code where available from our review paper "A Survey of Neural Trees"

Size: 1.66 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 78 - Forks: 9

inseq-team/inseq

Interpretability for sequence generation models 🐛 🔍

Language: Python - Size: 7.64 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 412 - Forks: 37

M-Nauta/ProtoTree

ProtoTrees: Neural Prototype Trees for Interpretable Fine-grained Image Recognition, published at CVPR2021

Language: Python - Size: 870 KB - Last synced at: 6 days ago - Pushed at: almost 3 years ago - Stars: 101 - Forks: 21

linkedin/TE2Rules

Python library to explain Tree Ensemble models (TE) like XGBoost, using a rule list.

Language: Python - Size: 10.9 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 55 - Forks: 6

GnanaPrakashSG2004/Concept_Distillation

Framework to distill concepts of a teacher model to a student model. It aims to understand the impact of this training paradigm on the performance and interpretability of the distilled student model.

Language: Jupyter Notebook - Size: 71.8 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

ModelOriented/kernelshap

Different SHAP algorithms

Language: R - Size: 2.5 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 47 - Forks: 7

DavidF-22/ARI3205-InterpretableAI_Project

This repository contains the entire codebase for the Interpretable AI Group Project. This project focuses on exploring and implementing multiple interpretability techniques in machine learning models to enhance transparency.

Language: Jupyter Notebook - Size: 3.32 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1 - Forks: 0

fzi-forschungszentrum-informatik/TSInterpret

An Open-Source Library for the interpretability of time series classifiers

Language: Python - Size: 200 MB - Last synced at: about 2 hours ago - Pushed at: 6 months ago - Stars: 133 - Forks: 14

pralab/secml

A Python library for Secure and Explainable Machine Learning

Language: Jupyter Notebook - Size: 67.2 MB - Last synced at: 23 days ago - Pushed at: 4 months ago - Stars: 175 - Forks: 26

evan-lloyd/graphpatch

graphpatch is a library for activation patching on PyTorch neural network models.

Language: Python - Size: 4.49 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 14 - Forks: 0

BCG-X-Official/facet

Human-explainable AI.

Language: Jupyter Notebook - Size: 50.5 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 520 - Forks: 47

ModelOriented/modelStudio

📍 Interactive Studio for Explanatory Model Analysis

Language: R - Size: 36.2 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 332 - Forks: 32

ApocryphalEditor/SRM-mapping-framework

A framework for mapping the internal geometry of transformer representations using angular projection, neuron-level modulation, and epistemically grounded prompts. Based on and extending Bird's original Spotlight Resonance Method (SRM).

Language: Python - Size: 1.55 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

zjunlp/DynamicKnowledgeCircuits

How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training

Language: Jupyter Notebook - Size: 39.4 MB - Last synced at: 25 days ago - Pushed at: 26 days ago - Stars: 30 - Forks: 0

Julia-XAI/ExplainableAI.jl

Explainable AI in Julia.

Language: Julia - Size: 41.6 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 112 - Forks: 3

Related Keywords
interpretability (697), machine-learning (236), deep-learning (151), explainable-ai (146), explainability (135), xai (107), interpretable-machine-learning (86), pytorch (80), python (70), explainable-ml (57), artificial-intelligence (49), ai (41), data-science (36), interpretable-ai (35), neural-network (34), nlp (30), computer-vision (30), visualization (30), interpretable-deep-learning (30), lime (27), transformers (27), neural-networks (26), llm (25), interpretable-ml (25), shap (25), natural-language-processing (24), large-language-models (24), iml (24), time-series (23), tensorflow (23), explainable-artificial-intelligence (23), ml (21), language-model (20), fairness (20), transparency (19), classification (17), convolutional-neural-networks (17), scikit-learn (16), robustness (16), mechanistic-interpretability (15), transformer (15), medical-imaging (14), cnn (14), statistics (14), deep-neural-networks (14), random-forest (14), feature-importance (14), machine-learning-interpretability (13), reinforcement-learning (13), graph-neural-networks (12), counterfactual-explanations (12), decision-trees (11), feature-attribution (11), python3 (11), keras (11), shapley (10), gradcam (10), data-mining (10), transfer-learning (9), llms (9), bias (9), captum (9), awesome-list (9), attention-mechanism (9), explainable-machine-learning (9), regression (9), xgboost (9), explanation (9), adversarial-attacks (8), accountability (8), explanations (8), ai-safety (8), embeddings (8), text-classification (8), healthcare (8), grad-cam (7), awesome (7), probing (7), concept-based-explanations (7), interpretable (7), r (7), interpretability-methods (7), representation-learning (7), shapley-value (7), generative-adversarial-network (7), generalization (7), jupyter-notebook (7), machine-learning-algorithms (7), gradient-boosting (7), concepts (7), gan (6), paper (6), anomaly-detection (6), privacy (6), survey (6), huggingface (6), concept-based-models (6), responsible-ai (6), autoencoder (6), causal-inference (6)