GitHub topics: vision-transformers
fahadshamshad/awesome-transformers-in-medical-imaging
A collection of resources on applications of Transformers in Medical Imaging.
Size: 3.92 MB - Last synced at: about 19 hours ago - Pushed at: over 1 year ago - Stars: 1,261 - Forks: 194

BaaaanN/Unsupervised-Domain-Adaptation-and-ViTs
🌍 Enhance land cover classification with our Unsupervised Domain Adaptation framework using Vision Transformers for multimodal satellite imagery.
Language: Python - Size: 267 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

lucidrains/metnet3-pytorch
Implementation of MetNet-3, SOTA neural weather model out of Google DeepMind, in PyTorch
Language: Python - Size: 1.06 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 221 - Forks: 28

vishal-n2403/Unsupervised-Domain-Adaptation-and-ViTs
ViT + MAE for UDA on Sentinel-1/2 (SAR/optical) land-cover classification with CORAL & DANN. PyTorch.
Language: Python - Size: 265 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

microsoft/esvit
EsViT: Efficient self-supervised Vision Transformers
Language: Python - Size: 1.88 MB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 412 - Forks: 41

aim-uofa/Poseur
[ECCV 2022] The official repo for the paper "Poseur: Direct Human Pose Regression with Transformers".
Language: Python - Size: 11.7 MB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 181 - Forks: 14

xiaojieli0903/CKPD-FSCIL
Official code of "Continuous Knowledge-Preserving Decomposition with Adaptive Layer Selection for Few-Shot Class-Incremental Learning"
Language: Python - Size: 1.99 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 32 - Forks: 1

udihermawan/EmpathAI-Your-Emotional-Well-being-Companion
EmpathAI: Emotional Well-being Companion "Where AI Meets Heart: Healing Isolation, One Conversation at a Time." EmpathAI uses Generative AI, Computer Vision, and NLP to provide real-time emotion detection, personalized conversations, and mental health support—empowering users with empathy, privacy, and cultural inclusion.
Language: Python - Size: 15.9 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1 - Forks: 0

imagine-laboratory/squeeze_every_bit
This is the official code for the 'Squeeze Every Bit of Insight: Leveraging Few-shot Models with a Compact Support Set for Domain Transfer in Object Detection from Pineapple Fields' and 'Simple Object Detection Framework without Training' projects.
Language: Python - Size: 13.5 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 2 - Forks: 0

sniperbroco/bookfusion-classification-app
a classification app using fine-tuned DL models
Language: Python - Size: 74.3 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

UdbhavPrasad072300/Transformer-Implementations
Library - Vanilla, ViT, DeiT, BERT, GPT
Language: Jupyter Notebook - Size: 3.29 MB - Last synced at: 13 days ago - Pushed at: almost 4 years ago - Stars: 67 - Forks: 18

uncbiag/SegNext
Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts (CVPR 2024)
Language: Python - Size: 88.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 95 - Forks: 13

ian-chuang/gaze-av-aloha
Code for paper: "Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers"
Language: Jupyter Notebook - Size: 72.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 12 - Forks: 0

Imageomics/Finer-CAM
This is an official implementation for Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation. [CVPR'25]
Language: Jupyter Notebook - Size: 5.52 MB - Last synced at: 1 day ago - Pushed at: 6 months ago - Stars: 37 - Forks: 4

mehmetkahya0/RealVision-ObjectUnderstandingAI
RealVision: A powerful, real-time object detection and understanding application using Python, OpenCV, and state-of-the-art AI models. Features dual model support (YOLO v8 + MobileNet-SSD), object tracking, performance monitoring, and modern GUI interface.
Language: HTML - Size: 115 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

theSohamTUmbare/CLIP-model
Reimplementation of the CLIP model
Language: Jupyter Notebook - Size: 1.29 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

hamidhosen42/Enhancing-Glaucoma-Diagnosis-with-Explainable-AI-Using-Vision-Transformers-Deep-Learning-Techniques
This project presents an explainable AI-based glaucoma diagnosis system using deep learning and Vision Transformers (ViTs). Retinal fundus images are preprocessed with techniques like CLAHE and edge detection to enhance feature extraction. Multiple models, including CNN, VGG16/19, InceptionResNetV2, Xception, and ViTs, were evaluated, with ViTs ach
Language: Jupyter Notebook - Size: 25.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

najmulmowla1/Earthquake-Building-Damage-Detection
Earthquake building damage detection using UAV-based image datasets.
Language: Python - Size: 14.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

anas-zafar/LLM-Survey
The official GitHub page for the survey paper "A Survey on Large Language Models: Applications, Challenges, Limitations, and Practical Usage"
Size: 29.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 33 - Forks: 6

uncbiag/SimpleClick
SimpleClick: Interactive Image Segmentation with Simple Vision Transformers (ICCV 2023)
Language: Python - Size: 40.2 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 239 - Forks: 40

kyegomez/SSM-As-VLM-Bridge
An exploration into leveraging SSMs as bridge/adapter layers for VLMs
Language: Python - Size: 2.19 MB - Last synced at: 11 days ago - Pushed at: 25 days ago - Stars: 2 - Forks: 1

billpsomas/efficient-probing
This repo contains the official implementation of the paper "Attention, Please! Revisiting Attentive Probing for Masked Image Modeling"
Language: Python - Size: 123 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

nateraw/huggingpics
🤗🖼️ HuggingPics: Fine-tune Vision Transformers for anything using images found on the web.
Language: Jupyter Notebook - Size: 972 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 303 - Forks: 28

itsDaiton/masters-thesis
Exploration and Comparison of Transformers for Image Classification.
Language: Jupyter Notebook - Size: 41.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

chikap421/videosam
[IEEE SSD 2025] This repository accompanies the paper "VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation"
Language: Jupyter Notebook - Size: 163 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 6 - Forks: 1

Hadi-M-Ibrahim/Beyond-Conventional-Transformers
Beyond Conventional Transformers: The Medical X-ray Attention (MXA) Block for Improved Multi-Label Diagnosis Using Knowledge Distillation
Language: Python - Size: 3.1 GB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

NVlabs/FAN
Official PyTorch implementation of Fully Attentional Networks
Language: Python - Size: 8.6 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 478 - Forks: 28

baaivision/Uni3D
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
Language: Python - Size: 6.05 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 569 - Forks: 37

jacobgil/pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Language: Python - Size: 134 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 11,641 - Forks: 1,636

NERSC/sc23-dl-tutorial
SC23 Deep Learning at Scale Tutorial Material
Language: Python - Size: 15.7 MB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 45 - Forks: 10

cosmoimd/feature-selection-gates
Feature Selection Gates with Gradient Routing
Language: Python - Size: 25.5 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

zubair-irshad/NeRF-MAE
[ECCV 2024] PyTorch code for our ECCV'24 paper NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields
Language: Python - Size: 4.47 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 101 - Forks: 4

raoyongming/DynamicViT
[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Language: Jupyter Notebook - Size: 8.78 MB - Last synced at: 4 months ago - Pushed at: about 2 years ago - Stars: 603 - Forks: 75

yessasvini23/EmpathAI-Your-Emotional-Well-being-Companion
EmpathAI: Emotional Well-being Companion "Where AI Meets Heart: Healing Isolation, One Conversation at a Time." EmpathAI uses Generative AI, Computer Vision, and NLP to provide real-time emotion detection, personalized conversations, and mental health support—empowering users with empathy, privacy, and cultural inclusion.
Language: Python - Size: 15.9 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

sayakpaul/deit-tf
Includes PyTorch -> Keras model porting code for DeiT models with fine-tuning and inference notebooks.
Language: Jupyter Notebook - Size: 40.4 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 41 - Forks: 7

autodistill/autodistill-owl-vit
OWL-ViT module for Autodistill.
Language: Python - Size: 13.7 KB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 7 - Forks: 3

murufeng/Awesome_vision_transformer
Implementation of vision transformer. ⭐⭐⭐
Language: Python - Size: 198 KB - Last synced at: 8 days ago - Pushed at: almost 4 years ago - Stars: 33 - Forks: 7

wangkai930418/attndistill
Code for our paper "Attention Distillation: self-supervised vision transformer students need more guidance" in BMVC 2022
Language: Python - Size: 29.3 KB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 17 - Forks: 0

davide-coccomini/Adversarial-Magnification-to-Deceive-Deepfake-Detection-through-Super-Resolution
Official code for the paper "Adversarial Magnification to Deceive Deepfake Detection through Super Resolution"
Language: Python - Size: 20.5 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 3

marziehoghbaie/VLFAT
"Transformer-based end-to-end classification of variable-length volumetric data" that will appear in MICCAI 2023.
Language: Python - Size: 375 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 5 - Forks: 0

ShirAmir/dino-vit-features
Official implementation for the paper "Deep ViT Features as Dense Visual Descriptors".
Language: Python - Size: 5.85 MB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 422 - Forks: 51

Picsart-AI-Research/SeMask-Segmentation
[NIVT Workshop @ ICCV 2023] SeMask: Semantically Masked Transformers for Semantic Segmentation
Language: Python - Size: 2.17 MB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 253 - Forks: 37

georgosgeorgos/few-shot-diffusion-models
Few-Shot Diffusion Models
Language: Python - Size: 1.51 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 105 - Forks: 5

yuxumin/PoinTr
[ICCV 2021 Oral] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers
Language: Python - Size: 25.6 MB - Last synced at: 6 months ago - Pushed at: 11 months ago - Stars: 666 - Forks: 113

DirtyHarryLYL/Transformer-in-Vision
Recent Transformer-based CV and related works.
Size: 1.84 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 1,332 - Forks: 143

emnzn/DINO
Self-distillation with no labels
Language: Jupyter Notebook - Size: 15.8 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

chinmaynehate/DFSpot-Deepfake-Recognition
Determine whether a given video sequence has been manipulated or synthetically generated
Language: Python - Size: 25.9 MB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 98 - Forks: 19

sayakpaul/deploy-hf-tf-vision-models
This repository shows various ways of deploying a vision model (TensorFlow) from 🤗 Transformers.
Language: Jupyter Notebook - Size: 867 KB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 30 - Forks: 2

struggling-student/Tiny-ViT
🤖 Tiny Vision Transformer (Tiny-ViT): Transformers for Image Recognition
Language: Python - Size: 35.4 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

VITA-Group/SViTE
[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
Language: Python - Size: 615 KB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 89 - Forks: 12

raj-tyagi/4CLIP-Image-Captioning
This repository presents 4CLIP, a novel approach to image captioning that enhances traditional models by dividing images into four quadrants and processing them individually. By leveraging a pretrained ViT-GPT2 model from Hugging Face, 4CLIP generates more detailed and comprehensive captions, making it suitable for fine-grained visual tasks.
Language: Python - Size: 288 KB - Last synced at: about 23 hours ago - Pushed at: 7 months ago - Stars: 0 - Forks: 1

sayakpaul/ViT-jax2tf
This repository hosts code for converting the original Vision Transformer models (JAX) to TensorFlow.
Language: Jupyter Notebook - Size: 651 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 33 - Forks: 6

YifanXu74/Evo-ViT
Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Language: Python - Size: 1.86 MB - Last synced at: 8 months ago - Pushed at: about 3 years ago - Stars: 72 - Forks: 5

Faiga91/ViT-FlexibleHeads
Vision Transformers with Flexible Heads
Language: Python - Size: 135 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

xapaxca/swiftdepth
SwiftDepth: An Efficient Hybrid CNN-Transformer Model for Self-Supervised Monocular Depth Estimation on Mobile Devices
Language: Python - Size: 122 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 4 - Forks: 0

Mr-TalhaIlyas/Segmentation-Transformer-Object-Contextual-Representations-for-Semantic-Segmentation-OCR
PyTorch Implementation of OCR (Object-Contextual Representations)
Language: Python - Size: 5.86 KB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

kyegomez/VisionLLaMA
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
Language: Python - Size: 2.19 MB - Last synced at: 14 days ago - Pushed at: 10 months ago - Stars: 16 - Forks: 0

protyayofficial/Vision-Architectures
A repository containing implementations of famous Vision Architectures over the years
Language: Python - Size: 191 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

nachiket273/VisTrans
Implementations of transformers based models for different vision tasks
Language: Python - Size: 112 KB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

rprkh/Gravitational-Lensing
Streamlit app that performs binary and multiclass classification of gravitational lensing images along with dark matter halo mass prediction.
Language: Jupyter Notebook - Size: 3.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

suryansh-sinha/ViT-From-Scratch
Implemented a Vision Transformer from the famous paper 'An Image is Worth 16x16 Words'. Implemented the attention and multi-head attention mechanisms from scratch in PyTorch.
Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Marklong7/cats-and-dogs-classification (fork of kayyyywu/cats-and-dogs-classification)
Deep learning pet breed recognition app
Language: Jupyter Notebook - Size: 16.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

shizhouxing/ViT_vnncomp2023
Benchmark for formally verifying ViTs
Language: Python - Size: 19.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Seeker38/image_abstract_generating
Image captioning using a ViT-PhoBERT model
Language: Jupyter Notebook - Size: 1.76 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

kahnchana/svt
Official repository for "Self-Supervised Video Transformer" (CVPR'22)
Language: Python - Size: 682 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 99 - Forks: 21

uncbiag/iSegFormer
iSegFormer: Interactive Image/Volume Segmentation using Vision Transformers (MICCAI 2022)
Language: Python - Size: 40.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 3

antonio-f/Moondream
Testing the Moondream tiny vision model
Language: Jupyter Notebook - Size: 19.5 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

rishikksh20/CrossViT-pytorch
Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Language: Python - Size: 229 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 171 - Forks: 18

antocad/FocusOnDepth
Monocular depth estimation for an in-the-wild autofocus application.
Language: Python - Size: 7.91 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 130 - Forks: 32

hadar-hai/vit-vs-cnn-on-elephants
This project focuses on evaluating Convolutional Neural Networks (CNN) and Vision Transformers (ViT) for image classification tasks, specifically distinguishing between Asian elephants and African elephants.
Language: Jupyter Notebook - Size: 259 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

evachi27/Automated_Oral_Cancer_Classification
This repository accompanies the article entitled "Automated Classification of Oral Cancer Lesions: Vision Transformer vs Radiomics."
Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

guglielmocamporese/relvit
Official code of "Where are my Neighbors? Exploiting Patches Relations in Self-Supervised Vision Transformer", Guglielmo Camporese, Elena Izzo, Lamberto Ballan. BMVC, 2022.
Language: Python - Size: 55.7 KB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 21 - Forks: 2

JayaswalVivek/Transformer_For_Image_Classification
Vision Transformer for classifying images
Language: Jupyter Notebook - Size: 41 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dedeswim/vits-robustness-torch
Code for the paper "A Light Recipe to Train Robust Vision Transformers" [SaTML 2023]
Language: Jupyter Notebook - Size: 348 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 49 - Forks: 2

kayyyywu/cats-and-dogs-classification
Language: Jupyter Notebook - Size: 17 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

all-things-vits/code-samples
Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and Interpreting Attention in Vision.
Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 145 - Forks: 9

andreped/INF1600-ai-workshop
🔥 Workshop in AI Deployment (INF-1600, UiT)
Language: Python - Size: 45.9 KB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 7

sayakpaul/cait-tf
Implementation of CaiT models in TensorFlow and ImageNet-1k checkpoints. Includes code for inference and fine-tuning.
Language: Jupyter Notebook - Size: 2.49 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 3

nicholas-dinicola/nanoViT
Implementation of ViT with PyTorch
Language: Python - Size: 229 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

PieroRendina/multidisciplinary-project-2023-INDYcs
Final project of the Multidisciplinary course offered at Politecnico di Milano A.Y. 2022/2023
Language: Jupyter Notebook - Size: 30.5 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

Osamah-ElRadaideh/okr
Simple python package containing backbone architectures used in various computer vision tasks
Language: Python - Size: 51.8 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

MohsenAmiri79/PASTormer
An image restoration framework (image deraining code has been implemented) based on the Restormer model as a backbone. This is an early idea in my "Attending to the past" research project. With roughly the same number of learnable parameters, this model shows better performance under the same training methods.
Language: Python - Size: 27.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

FalsoMoralista/plantTL2023 (fork of rtcalumby/plantTL2023)
Image-based plant identification at global scale.
Language: Python - Size: 134 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

nbasyl/OFQ
The official implementation of the ICML 2023 paper OFQ-ViT
Language: Python - Size: 640 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 16 - Forks: 0

jaehyunnn/ViTPose_pytorch
An unofficial implementation of ViTPose [Y. Xu et al., 2022]
Language: Jupyter Notebook - Size: 2.11 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 56 - Forks: 10

devanshkhare1705/Music-Genre-Recognition
Classifying musical pieces into appropriate genres using CNNs and Vision Transformers
Language: Jupyter Notebook - Size: 8.81 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

jannik-brinkmann/social_biases_in_vision_transformers
ICCV'23. Code associated with "A Multidimensional Analysis of Social Biases in Vision Transformers" (Brinkmann et al.)
Language: Python - Size: 39.1 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

MJAHMADEE/Vision_Transformers
Vision Transformers
Language: Jupyter Notebook - Size: 905 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

mnguyen0226/multitask_learning_vit
Multitask-Learning (hard-parameter sharing) with Vision Transformers on Cifar10 & Cifar100
Language: Jupyter Notebook - Size: 19.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mikewheel/transformers_image_classification
Based on Dosovitskiy et al., 2020. Final project for DS4440: Practical Neural Networks, Fall 2020.
Language: Python - Size: 508 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 2

jhgan00/image-retrieval-transformers
(Unofficial) PyTorch implementation of Training Vision Transformers for Image Retrieval (El-Nouby, Alaaeldin, et al. 2021).
Language: Python - Size: 538 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 3

ckorgial/ViT-for-Cancer-Skin-Classification-TensorFlow
Skin cancer classification (HAM10000) using Vision Transformer (ViT).
Language: Jupyter Notebook - Size: 2.07 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 3

shashvatshah9/3dprinteranomaly
Language: Jupyter Notebook - Size: 127 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ZhouYuxuanYX/SP-ViT-Learning-2D-Spatial-Priors-for-Vision-Transformers
This is the official implementation of our BMVC 2022 paper "SP-ViT: Learning 2D Spatial Priors for Vision Transformers"
Language: Python - Size: 4.86 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

HowieMa/PPT
[ECCV 2022] "PPT: token-Pruned Pose Transformer for monocular and multi-view human pose estimation"
Language: Python - Size: 1000 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 48 - Forks: 1

MingSun-Tse/Awesome-Efficient-ViT
Recent Advances on Efficient Vision Transformers
Size: 11.7 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 0

SharifAmit/VTGAN
[ICCV'21] [Tensorflow] Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers
Language: Python - Size: 733 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 26 - Forks: 5

davide-coccomini/Cross-Forgery-Analysis-of-Vision-Transformers-and-CNNs-for-Deepfake-Image-Detection
Code for the paper Cross Forgery Analysis of Vision Transformers and CNNs for Deepfake Image Detection
Language: Python - Size: 25.4 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

happy-hsy/BCNet
[AAAI 2022] Temporal Action Proposal Generation with Background Constraint
Language: Python - Size: 5.8 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 0

tamasino52/UNETR-Pose
3D Multi-person Pose Estimation in Multi-view Environment using 3D U-Net Transformer Networks
Language: Python - Size: 7.37 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 17 - Forks: 1
