GitHub topics: image-understanding
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Language: Python - Size: 38.2 MB - Last synced at: about 17 hours ago - Pushed at: 7 months ago - Stars: 939 - Forks: 46

The-Martyr/Awesome-Multimodal-Reasoning
Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models
Size: 133 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 23 - Forks: 0

Pfilipeferreira2004/DynamicVis
This is the implementation of the paper "DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding"
Language: Jupyter Notebook - Size: 42.4 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

yohasebe/openai-chat-api-workflow
🎩 An Alfred 5 Workflow for using the OpenAI Chat API to interact with GPT models 🤖💬. It also allows image generation/editing/understanding 🖼️, speech-to-text conversion 🎤, and text-to-speech synthesis 🔈
Size: 113 MB - Last synced at: about 15 hours ago - Pushed at: 20 days ago - Stars: 315 - Forks: 9

KyanChen/DynamicVis
This is the implementation of the paper "DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding"
Language: Python - Size: 55.5 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 45 - Forks: 1

DmitryRyumin/WACV-2024-Papers
WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
Language: Python - Size: 7.31 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 96 - Forks: 13

KleinYuan/image2text
A deep learning project to tell a story with an image or a video.
Language: Python - Size: 37.1 KB - Last synced at: about 1 month ago - Pushed at: almost 8 years ago - Stars: 42 - Forks: 10

sopermanspace/Unity_OpenAI
This GitHub repository shows how to integrate the OpenAI GPT-3 language model and the ChatGPT API into a Unity project. It can be a useful way to add natural language processing capabilities to your application.
Language: C# - Size: 2.72 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 38 - Forks: 5

suprosanna/relationformer
Size: 31.5 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 45 - Forks: 2

diviz-mit/visuallydata (fork of cvzoya/visuallydata)
A large-scale curated dataset of Visual.ly infographics with metadata and additional crowdsourced annotations for research applications in computer vision and natural language processing.
Language: Jupyter Notebook - Size: 75.7 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 21 - Forks: 10

Serin-Yoon/CS472-Image-Understanding
2022-1 Image Understanding Assignments & Projects
Language: MATLAB - Size: 14.4 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Dulyaaa/IUP_Labs
🏷 This repository contains the lab sheets for the Image Understanding & Processing (SE4130) module in Year 4, Semester 1.
Language: Jupyter Notebook - Size: 7.86 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
