An open API service providing repository metadata for many open source software ecosystems.

Topic: "captioning"

facebookresearch/mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Language: Python - Size: 17.4 MB - Last synced at: 6 days ago - Pushed at: 20 days ago - Stars: 5,558 - Forks: 939

roboflow/maestro

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

Language: Python - Size: 10.6 MB - Last synced at: about 3 hours ago - Pushed at: 6 days ago - Stars: 2,551 - Forks: 203

ltguo19/VSUA-Captioning

Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019

Language: Python - Size: 205 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 264 - Forks: 24

DavidHuji/CapDec

CapDec: SOTA Zero-Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (Findings)

Language: Python - Size: 35.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 158 - Forks: 17

fpgaminer/joycaption

JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.

Language: Python - Size: 290 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 153 - Forks: 2

Labbeti/aac-datasets

Audio Captioning datasets for PyTorch.

Language: Python - Size: 2.68 MB - Last synced at: 17 days ago - Pushed at: about 1 month ago - Stars: 115 - Forks: 6

drethage/fully-convolutional-point-network

Fully-Convolutional Point Networks for Large-Scale Point Clouds

Language: Python - Size: 1.53 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 87 - Forks: 22

audio-captioning/clotho-dataset

Python code for handling the Clotho dataset.

Language: Python - Size: 99.6 KB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 74 - Forks: 15

wangleihitcs/MedicalReportGeneration

A Base TensorFlow Project for Medical Report Generation

Language: Python - Size: 69.7 MB - Last synced at: 21 days ago - Pushed at: almost 6 years ago - Stars: 71 - Forks: 18

ParitoshParmar/MTL-AQA

What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]

Language: Python - Size: 27.7 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 68 - Forks: 15

mitvis/vistext

VisText is a benchmark dataset for semantically rich chart captioning.

Language: Jupyter Notebook - Size: 2.77 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 66 - Forks: 3

aimagelab/pacscore

Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. (CVPR 2023)

Language: Python - Size: 7.15 MB - Last synced at: 22 days ago - Pushed at: about 2 months ago - Stars: 61 - Forks: 5

TheShadow29/VidSitu

[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)

Language: Python - Size: 928 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 50 - Forks: 7

Labbeti/aac-metrics

Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.

Language: Python - Size: 856 KB - Last synced at: 17 days ago - Pushed at: 3 months ago - Stars: 45 - Forks: 3

DavidMChan/caption-by-committee

Using LLMs and pre-trained caption models for super-human performance on image captioning.

Language: Python - Size: 7.43 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 41 - Forks: 4

lucidrains/AoA-pytorch

A PyTorch implementation of the Attention on Attention module (both self and guided variants), for Visual Question Answering

Language: Python - Size: 39.1 KB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 41 - Forks: 5

deepgram-devs/video-chat

Sample app to display live captioning to a WebRTC video session with the Deepgram API.

Language: JavaScript - Size: 392 KB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 37 - Forks: 14

audio-captioning/dcase-2020-baseline

Audio captioning baseline system for DCASE 2020 challenge.

Language: Python - Size: 92.5 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 36 - Forks: 11

HaydenFaulkner/Tennis

A Tennis dataset and models for event detection & commentary generation

Language: Python - Size: 30.3 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 35 - Forks: 11

aimagelab/camel

CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022

Language: Python - Size: 8.46 MB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 29 - Forks: 12

CurryYuan/X-Trans2Cap

[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

Language: Python - Size: 64 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 29 - Forks: 3

RyanLiut/awesome-diverse-captioning

Some papers about *diverse* image (and a few video) captioning

Size: 124 KB - Last synced at: about 16 hours ago - Pushed at: about 2 years ago - Stars: 26 - Forks: 3

ebu/ebu-tt-live-toolkit

Toolkit for supporting the EBU-TT Live specification

Language: Python - Size: 112 MB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 25 - Forks: 10

elbayadm/PaperNotes

My notes on some Deep Learning papers

Language: HTML - Size: 1.57 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 25 - Forks: 4

alecwangcq/show-attend-and-tell

Language: Jupyter Notebook - Size: 3.06 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 25 - Forks: 11

FeiElysia/awesome-zero-shot-captioning

A curated list of zero-shot captioning papers

Size: 15.6 KB - Last synced at: about 5 hours ago - Pushed at: over 1 year ago - Stars: 22 - Forks: 1

AdrianHsu/S2VT-seq2seq-video-captioning-attention

S2VT (seq2seq) video captioning with Bahdanau & Luong attention, implemented in TensorFlow

Language: Python - Size: 52.6 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 19 - Forks: 10

aimagelab/PMA-Net

With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023

Language: Python - Size: 5.34 MB - Last synced at: 17 days ago - Pushed at: 11 months ago - Stars: 17 - Forks: 2

hassanhub/R3Transformer

Official Python implementation of R3-Transformer

Language: Python - Size: 54.7 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 16 - Forks: 1

audio-captioning/caption-evaluation-tools

Tools for the evaluation of audio captioning.

Language: Jupyter Notebook - Size: 98.7 MB - Last synced at: 4 months ago - Pushed at: almost 5 years ago - Stars: 15 - Forks: 2

rayandrew/indonesian-image-captioning

Indonesian Image Captioning using Attention-based Semantic Compositional Networks

Language: Jupyter Notebook - Size: 13.2 MB - Last synced at: 15 days ago - Pushed at: over 5 years ago - Stars: 14 - Forks: 5

ZhaoPeiduo/BLIP2-Japanese

Modifying LAVIS' BLIP2 Q-former with models pretrained on Japanese datasets.

Language: Python - Size: 75.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 12 - Forks: 1

nssharmaofficial/reddit-hole

Automated Reddit scraper and video creator

Language: Python - Size: 384 KB - Last synced at: 22 days ago - Pushed at: 7 months ago - Stars: 12 - Forks: 2

ImKeTT/ZeroGen

[NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles (PyTorch implementation)

Language: Python - Size: 2.94 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0

2dameneko/ide-cap-chan

ide-cap-chan is a utility for batch image captioning with natural language using various VL models

Language: Python - Size: 1.82 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 11 - Forks: 0

naiveHobo/Smart-I

Smart-I is an Android application aimed at helping the visually impaired using artificial intelligence and cloud computing.

Language: Python - Size: 2.05 MB - Last synced at: 28 days ago - Pushed at: about 3 years ago - Stars: 10 - Forks: 0

jamesruan/SimpleSubtitleEditor

SimpleSubtitleEditor for Blender

Language: Python - Size: 20.5 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 10 - Forks: 2

fofr/cog-batch-image-captioning

Caption images for LoRA training

Language: Python - Size: 17.6 KB - Last synced at: 18 days ago - Pushed at: 9 months ago - Stars: 8 - Forks: 5

nikhilkumarsingh/MemeGenerator

Python program to generate memes.

Language: Jupyter Notebook - Size: 274 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 6

Mauville/MedCLIP

Medical image captioning using OpenAI's CLIP

Language: Jupyter Notebook - Size: 3.11 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 4

oshtz/tagmeister-pc

Efficient image captioning using OpenAI API

Language: TypeScript - Size: 14.3 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 6 - Forks: 0

ArchAngelAries/TagScribeR

A tool to streamline AI image captioning

Language: Python - Size: 190 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 6 - Forks: 0

deepgram-devs/twilio-live-captions

Sample app demonstrating adding live captions to Twilio Video rooms

Language: JavaScript - Size: 23.4 KB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 0

congphase/img-captioning-in-vietnamese

An attempt at image captioning in Vietnamese for ball-sports contexts.

Language: Python - Size: 8.75 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 1

Andrew-Ng-s-number-one-fan/Readings

Size: 322 MB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

cd2bit/awesome-list-of-captioned-courses

Online professional courses that are captioned and/or subtitled

Size: 10.7 KB - Last synced at: 5 days ago - Pushed at: about 6 years ago - Stars: 5 - Forks: 0

Hyeongkeun/LAVCap

Official PyTorch Implementation of 'LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport' (ICASSP 2025)

Language: Python - Size: 3.58 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 3 - Forks: 0

mrazhou/SEN

Single-stream Extractor Network with Contrastive Pre-training for Remote Sensing Change Captioning

Language: Python - Size: 64.2 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1

ebu/ebu-tt

A public repository with key information about the EBU Timed Text (EBU-TT) format.

Size: 7.81 KB - Last synced at: 5 months ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

brayevalerien/ReCap

An image (re)captioning GUI for image generation models dataset preparation, made for easy caption editing.

Language: Python - Size: 2.45 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

KennethWussmann/caption.now

Quickly and efficiently caption your image dataset for AI training

Language: TypeScript - Size: 3.76 MB - Last synced at: 19 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 1

J0SAL/Aide

An App with Voice Assisted Image Captioning and VQA For Visually Challenged Individuals

Language: Dart - Size: 18.2 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 1

wangleihitcs/ImageCaptions

A base model for image captions.

Language: Python - Size: 96.3 MB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 1

dragonfruit-ai/launchpad

🚀 Documentation and component library for Dragonfruit AI's Launchpad

Size: 91.8 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1 - Forks: 0

Wylgrif/Captioninghelper

A small tool to help caption a dataset, written in Python

Language: Python - Size: 650 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

Aavtic/parashu

A video subtitle editor written in Rust.

Language: Rust - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

Anshler/ICG_sd_extension

Image caption extension for A1111 Webui 👁️📜🖋️

Language: Python - Size: 181 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

stevennyman/yt-transcript

JavaScript bookmarklet for viewing YouTube video transcripts in a popout window.

Language: JavaScript - Size: 16.6 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Sharif-SLPL/image-captioning

Automatically describing the content of an image in Persian

Language: Jupyter Notebook - Size: 1.21 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

Dong-JinKim/DRCaptioning

Language: Jupyter Notebook - Size: 8.41 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

elbayadm/captioning

Captioning code in PyTorch

Language: Jupyter Notebook - Size: 763 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

jyotishp/neural-captioning (fork of Saiteja-Reddy/Show-and-Tell)

A neural network combining a CNN and an LSTM to generate captions for an input image.

Language: Jupyter Notebook - Size: 139 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

kozhemyak/joycaption-alpha2-runpod-captioner (fork of brendanmckeag/gemma-captioner-images)

This project provides a serverless image captioning service using RunPod and Hugging Face's JoyCaption Alpha Two model. The service processes images and generates descriptive captions or tags based on a customizable prompt.

Language: Python - Size: 76.2 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

trucaption/trucaption

A real-time captioning system with support for large and small screen display.

Language: JavaScript - Size: 2.68 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

AMfeta99/NLP_LLM

This repository is dedicated to small projects and some theoretical material that I used to get into NLP and LLM in a practical and efficient way.

Language: Jupyter Notebook - Size: 77.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

git-khandelwal/CNN-to-GPT2

Image Captioning using CNNs and Transformers

Language: Python - Size: 15.6 KB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

ssube/label-prompt-caption

Language: Python - Size: 127 KB - Last synced at: 19 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

basedrhys/text-od-robustness

Evaluating the robustness of text-conditioned OD models such as MDETR

Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: 18 days ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1

petercorke/vtt-clean

Python script to clean VTT files generated by Microsoft Stream

Language: Python - Size: 1.95 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

AmbiTyga/GifCaptioner

A Deep Neural Network for GIF captioning

Language: Python - Size: 9.55 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0