Topic: "multimodality"
lucidrains/big-sleep
A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun
Language: Python - Size: 6.89 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 2,570 - Forks: 306

BAAI-Agents/Cradle
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.
Language: Python - Size: 433 MB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 2,088 - Forks: 185

hymie122/RAG-Survey
Collecting awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
Size: 6.49 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1,601 - Forks: 110

PreferredAI/cornac
A Comparative Framework for Multimodal Recommender Systems
Language: Python - Size: 24.3 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 949 - Forks: 152

ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Language: Python - Size: 1.61 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 929 - Forks: 126

AIDC-AI/Ovis
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
Language: Python - Size: 5.56 MB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 904 - Forks: 56

fnzhan/Generative-AI
[TPAMI 2023] Multimodal Image Synthesis and Editing: The Generative AI Era
Language: TeX - Size: 121 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 753 - Forks: 57

aimclub/FEDOT
Automated modeling and machine learning framework FEDOT
Language: Python - Size: 225 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 667 - Forks: 87

VITA-MLLM/Woodpecker
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
Language: Python - Size: 21.2 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 636 - Forks: 30

jshilong/GPT4RoI
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Language: Python - Size: 15.1 MB - Last synced at: 1 day ago - Pushed at: 11 months ago - Stars: 528 - Forks: 28

microsoft/LLM2CLIP
LLM2CLIP makes the SOTA pretrained CLIP model more SOTA than ever.
Language: Python - Size: 2.93 MB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 513 - Forks: 24

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
Language: HTML - Size: 12.7 MB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 472 - Forks: 26

zengyan-97/X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
Language: Python - Size: 13.5 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 462 - Forks: 51

afiaka87/clip-guided-diffusion
A CLI tool/python module for generating images from text using guided diffusion and CLIP from OpenAI.
Language: Python - Size: 51.2 MB - Last synced at: 9 days ago - Pushed at: over 3 years ago - Stars: 462 - Forks: 60

MMMU-Benchmark/MMMU
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Language: Python - Size: 186 MB - Last synced at: about 16 hours ago - Pushed at: about 17 hours ago - Stars: 431 - Forks: 35

HazyResearch/fonduer
A knowledge base construction engine for richly formatted data
Language: Python - Size: 11.5 MB - Last synced at: 3 days ago - Pushed at: almost 4 years ago - Stars: 410 - Forks: 77

lium-lst/nmtpytorch 📦
Sequence-to-Sequence Framework in PyTorch
Language: Jupyter Notebook - Size: 7.49 MB - Last synced at: 13 days ago - Pushed at: over 2 years ago - Stars: 391 - Forks: 51

kyegomez/Med-PaLM
Towards Generalist Biomedical AI
Language: Python - Size: 850 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 381 - Forks: 53

kyegomez/CM3Leon
An open source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi modal AI that uses just a decoder to generate both text and images
Language: Python - Size: 754 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 360 - Forks: 18

OmicsML/dance
DANCE: a deep learning library and benchmark platform for single-cell analysis
Language: Python - Size: 17.3 MB - Last synced at: 44 minutes ago - Pushed at: about 2 hours ago - Stars: 359 - Forks: 38

microsoft/UniVL
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Language: Python - Size: 219 KB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 354 - Forks: 56

soujanyaporia/multimodal-sentiment-analysis
Attention-based multimodal fusion for sentiment analysis
Language: Python - Size: 87.3 MB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 351 - Forks: 74

Yutong-Zhou-cv/Awesome-Multimodality
A Survey on multimodal learning research.
Size: 1.76 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 324 - Forks: 22

kyegomez/NaViT
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
Language: Python - Size: 210 KB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 231 - Forks: 11

Liang-ZX/VectorNet
Pytorch implementation of CVPR2020 paper “VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation”
Language: Jupyter Notebook - Size: 174 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 200 - Forks: 43

srvk/how2-dataset
This repository contains code and metadata of the How2 dataset
Language: Python - Size: 24.4 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 172 - Forks: 18

FoundationVision/GenerateU
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
Language: Python - Size: 14.4 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 168 - Forks: 7

BiomedSciAI/fuse-med-ml
A python framework accelerating ML based discovery in the medical field by encouraging code reuse. Batteries included :)
Language: Python - Size: 104 MB - Last synced at: 4 days ago - Pushed at: 7 days ago - Stars: 146 - Forks: 36

kyegomez/PALI3
Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"
Language: Python - Size: 2.61 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 146 - Forks: 4

florencejt/fusilli
A Python package housing a collection of deep-learning multi-modal data fusion method pipelines! From data loading, to training, to evaluation - fusilli's got you covered 🌸
Language: Python - Size: 987 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 146 - Forks: 12

kyegomez/swarms-pytorch
Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊
Language: Python - Size: 58.2 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 122 - Forks: 10

senwu/emmental
A deep learning framework for building multimodal multi-task learning systems.
Language: Python - Size: 891 KB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 110 - Forks: 18

kyegomez/PALI
Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"
Language: Python - Size: 624 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 89 - Forks: 8

lucidrains/mirasol-pytorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch
Language: Python - Size: 1.01 MB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 88 - Forks: 2

MMStar-Benchmark/MMStar
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
Language: Python - Size: 3.41 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 84 - Forks: 1

ForestsKing/Awesome-Multimodal-Time-Series
A curated list of papers, code, data, and other resources focused on multimodal time series analysis.
Size: 9.77 KB - Last synced at: 13 days ago - Pushed at: 20 days ago - Stars: 71 - Forks: 4

akashe/Multimodal-action-recognition
Code for selecting an action based on multimodal inputs. In this case, the inputs are voice and text.
Language: Python - Size: 64.7 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 69 - Forks: 11

ForestsKing/ChatTime
PyTorch implementation of "ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data" (AAAI 2025 [oral])
Language: Jupyter Notebook - Size: 2.07 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 65 - Forks: 10

songqiang321/Awesome-AI-Papers
This repository is used to collect papers and code in the field of AI.
Size: 4.08 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 60 - Forks: 6

mims-harvard/Clinical-knowledge-embeddings
Unified Clinical Vocabulary Embeddings for Advancing Precision Medicine
Language: Jupyter Notebook - Size: 16.4 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 53 - Forks: 5

firojalam/multimodal_social_media
multimodal social media content (text, image) classification
Language: Python - Size: 3.54 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 50 - Forks: 14

amazon-science/gluonmm
A library of transformer models for computer vision and multi-modality research
Language: Python - Size: 65.4 KB - Last synced at: 17 days ago - Pushed at: over 3 years ago - Stars: 49 - Forks: 2

firojalam/harmful-memes-detection-resources
Resources (conference/journal publications, references to dataset) for harmful memes detection.
Language: TeX - Size: 3.85 MB - Last synced at: about 17 hours ago - Pushed at: about 3 years ago - Stars: 47 - Forks: 5

YeonwooSung/LIMoE-pytorch
PyTorch implementation of LIMoE
Language: Python - Size: 4.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 1

kyegomez/EXA-1 Fork of pliang279/awesome-multimodal-ml
An EXA-Scale repository of Multi-Modality AI resources from papers and models, to foundational libraries!
Language: Jupyter Notebook - Size: 1.15 GB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 2

Luka0612/ChineseVLBert
Multimodal BERT for the Chinese-language domain
Size: 1.95 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 42 - Forks: 5

kunzhan/MVGL
TCyb 2018: Graph learning for multiview clustering
Language: Matlab - Size: 217 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 39 - Forks: 12

UKPLab/5pils
Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformation.
Language: Python - Size: 3.38 MB - Last synced at: 30 days ago - Pushed at: about 1 month ago - Stars: 38 - Forks: 4

chalk-lab/MCMCTempering.jl
Implementations of parallel tempering algorithms to augment samplers with tempering capabilities
Language: Julia - Size: 565 KB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 37 - Forks: 5

piomin/spring-ai-showcase
Sample Spring AI Application with several use cases
Language: Java - Size: 3.94 MB - Last synced at: 1 day ago - Pushed at: 7 days ago - Stars: 32 - Forks: 16

trislett/TFCE_mediation
Fast regression and mediation analysis of vertex or voxel MRI data with TFCE
Language: Python - Size: 110 MB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 30 - Forks: 9

MileBench/MileBench
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
Language: Python - Size: 3.52 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 29 - Forks: 1

xability/maidr-legacy
[DEPRECATED prototype] Multimodal Access and Interactive Data Representation
Language: HTML - Size: 9.5 MB - Last synced at: 3 days ago - Pushed at: 19 days ago - Stars: 28 - Forks: 5

TheChymera/behaviopy
Behavioral data analysis and plotting in Python.
Language: Python - Size: 144 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 26 - Forks: 14

awslabs/guidance-for-multi-omics-and-multi-modal-data-integration-and-analysis-on-aws
This guidance creates a scalable environment in AWS to prepare genomic, clinical, mutation, expression and imaging data for large-scale analysis and perform interactive queries against a data lake. The solution also demonstrates the use of Amazon Omics for multi-modal analysis.
Language: Jupyter Notebook - Size: 179 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 24 - Forks: 8

xf-zhao/Matcha-agent
Official implementation of Matcha-agent, https://arxiv.org/abs/2303.08268
Language: Python - Size: 22.8 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 22 - Forks: 2

aws-samples/deploy-stable-diffusion-model-on-amazon-sagemaker-endpoint
Deploy Stable Diffusion Model on Amazon SageMaker Endpoint
Language: Jupyter Notebook - Size: 11.9 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 20 - Forks: 6

multimodal-ai-lab/DEFAME
Fact-checking system for textual and visual inputs.
Language: Python - Size: 29.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 19 - Forks: 2

prml615/prml
Multimodal Fully Convolutional Neural networks for Semantic Segmentation.
Language: Python - Size: 1.62 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 10

rezacsedu/Multimodal-autoencoder-for-breast-cancer
Prognostically Relevant Subtypes and Survival Prediction for Breast Cancer Based on Multimodal Genomics Data
Language: Python - Size: 23.3 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 19 - Forks: 9

SiyuanYan1/PanDerm
PanDerm: A General-Purpose Multimodal Foundation Model for Dermatology
Language: Jupyter Notebook - Size: 3 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 18 - Forks: 1

dicomtools/TriDFusion
TriDFusion (3DF) Medical Imaging Viewer
Language: MATLAB - Size: 18.1 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 18 - Forks: 2

FuxiaoLiu/DocumentCLIP
[ICPRAI 2024] DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents
Language: Python - Size: 2.49 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 0

AmbiTyga/MemSem
A Multi-modal Framework for Sentiment Analysis of Memes
Language: Python - Size: 4.59 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 16 - Forks: 5

ldeecke/mn-torch
Mode normalization (ICLR 2019).
Language: Python - Size: 9.77 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 16 - Forks: 1

ahq1993/Multimodal-Deep-Q-Network-for-Social-Human-Robot-Interaction
Multimodal Deep Q-Network (MDQN) for modelling human-like social intelligence.
Language: Lua - Size: 1.74 MB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 14 - Forks: 10

MIMBCD-UI/meta
📎 About MIMBCD-UI Project
Size: 1.04 GB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 12 - Forks: 4

kyegomez/MMCA
The open source community's implementation of the all-new Multi-Modal Causal Attention from "DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention"
Language: Python - Size: 230 KB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 12 - Forks: 0

Agora-X/DailyPaperClub
The repository for the exclusive Daily Paper Club hosted at Agora every 10pm NYC time at this discord: https://discord.gg/Gnzh6dnzyz
Size: 14.6 KB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0

declare-lab/Sealing
[NAACL 2024] Official Implementation of paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image--Text Models"
Language: Python - Size: 8.92 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 11 - Forks: 3

thiippal/AI2D-RST
A repository for the AI2D-RST corpus.
Language: Python - Size: 18.2 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 11 - Forks: 3

ChongKaKam/TAMA
Code for TAMA: See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers.
Language: Python - Size: 4.82 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 10 - Forks: 1

tianleimin/ACL2018-MultimodalMultitaskSentimentAnalysis
Codes for ACL2018 Multimodal Language Workshop paper
Language: Python - Size: 234 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 10 - Forks: 1

OlehOnyshchak/pyWikiMM
Collects a multimodal dataset of Wikipedia articles and their images
Language: Python - Size: 7.78 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

kyegomez/swarmalators
Pytorch Implementation of the Swarmalators algorithm from "Exotic swarming dynamics of high-dimensional swarmalators"
Language: Python - Size: 2.16 MB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 8 - Forks: 0

kyegomez/Gen2
Implementation of "Text driven video generation" in pytorch
Language: Python - Size: 222 KB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 0

fuyahuii/ConSK-GCN
The PyTorch code for paper: "CONSK-GCN: Conversational Semantic- and Knowledge-Oriented Graph Convolutional Network for Multimodal Emotion Recognition."
Language: Python - Size: 117 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 2

kyegomez/ConvNet
Implementation of the NFNets from the paper: "ConvNets Match Vision Transformers at Scale" by Google Research
Language: Python - Size: 2.16 MB - Last synced at: 13 days ago - Pushed at: 8 months ago - Stars: 7 - Forks: 0

kyegomez/CELESTIAL-1
Omni-Modality Processing, Understanding, and Generation
Language: Python - Size: 2.49 MB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 1

cleopatra-itn/fair_multimodal_sentiment
Code and Splits for the paper "A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment Analysis Methods", in Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding (MMPT ’21), August 21, 2021, Taipei, Taiwan
Language: Python - Size: 628 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 2

gullalc/multimodal_r1_papers
Deepseek RL (GRPO)-Inspired Research for Vision & Multimodal Reasoning
Size: 37.1 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 6 - Forks: 1

mjunaidca/upwork-leads-gpt
Upwork Leads GPT is an AI-powered Job Finder tool for freelancers. It's built using OpenAI’s CustomGPT. It searches for the most relevant job postings based on provided keywords and can generate proposals.
Language: Python - Size: 207 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 6 - Forks: 0

kyegomez/MMCA-MGQA
Experiments around using Multi-Modal Causal Attention with Multi-Grouped Query Attention
Language: Python - Size: 210 KB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

soraxas/Occ-Traj120
A trajectories dataset with associated occupancy maps
Size: 14.5 MB - Last synced at: 8 days ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 0

helenetran3/MER-Databases-and-Emotion-Ambiguity
The most popular databases used in multimodal emotion recognition with a focus on the representation of emotion ambiguity.
Size: 252 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 5 - Forks: 1

Droliven/diverse_sampling
Official project of DiverseSampling (ACMMM2022 Paper)
Language: Python - Size: 98 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

Warvito/integrating-multi-modal-neuroimaging
Integrating machine learning and multi-modal neuroimaging to detect schizophrenia at the level of the individual
Language: Python - Size: 65.4 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 5 - Forks: 2

Spider101/Visual-Semantic-Alignments
An exploration into the possibility of generating multi-sentence image descriptions by leveraging the latent dependencies between visual concepts in an image with their textual counterparts
Language: Python - Size: 149 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

cleopatra-itn/image_text_claim_detection
Code and Dataset for paper "On the Role of Images for Analyzing Claims in Social Media" @2nd International Workshop on Cross-lingual Event-centric Open Analytics (CLEOPATRA) co-located with The Web Conf 2021
Language: Python - Size: 104 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 1

gangeshwark/multimodal_feature_extractors
[IN PROGRESS] Multimodal feature extraction modules for ease of doing research and reproducibility.
Language: Python - Size: 45.9 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 1

e1four15f/ClipSeek
A Text-to-Clip Retrieval System
Language: Python - Size: 114 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

shahariar-shibli/Adversarial-Attack-on-POS-Tags
Adversarial Attacks on Parts of Speech: An Empirical Study in Text-to-Image Generation
Language: Jupyter Notebook - Size: 101 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

peterlipan/FoF
The official implementations of our BIBM'24 paper: Focus on Focus: Focus-oriented Representation Learning and Multi-view Cross-modal Alignment for Glioma Grading
Language: Python - Size: 22.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

Clealiya/Multimodal-model
[FR|EN - Trio] 2023 - 2024 Centrale Méditerranée AI Master | Multimodal retranscription with text, audio and video
Language: Python - Size: 15.5 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

diaoenmao/Multimodal-Controller-for-Generative-Models
[CVMI 2022] Multimodal Controller for Generative Models
Language: Python - Size: 282 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

ArashVahabpour/SOG
Self-Organizing Generator
Language: Jupyter Notebook - Size: 82.9 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

QJYBall/MyoPS-Net
MyoPS-Net: Myocardial Pathology Segmentation with Flexible Combination of Multi-Sequence CMR images
Language: Python - Size: 614 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 2

sutdcv/multi-modal-video-reasoning
[ICCV2021 Workshop] Multi-Modal Video Reasoning and Analyzing Competition
Language: JavaScript - Size: 8.77 MB - Last synced at: 10 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

vita-epfl/AdversarialLoss-SGAN Fork of agrimgupta92/sgan
Analysing Adversarial Loss of Social GAN
Language: Python - Size: 378 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

MichiganNLP/deceptiondetection
Deception Detection project website
Language: JavaScript - Size: 8.99 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1
