captioning
Here are 112 public repositories matching this topic...
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Updated Jun 17, 2026 - Python
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
Updated Jun 29, 2026 - Python
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training diffusion models.
Updated Feb 24, 2026 - Jupyter Notebook
Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019
Updated Oct 18, 2019 - Python
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
Updated Jan 28, 2024 - Python
A tennis dataset and models for event detection and commentary generation
Updated Jun 20, 2025 - Python
Audio Captioning datasets for PyTorch.
Updated Mar 25, 2026 - Python
Medical image captioning using OpenAI's CLIP
Updated Mar 7, 2023 - Jupyter Notebook
VisText is a benchmark dataset for semantically rich chart captioning.
Updated Aug 10, 2025 - Jupyter Notebook
Automated image & video captioning using Qwen-VL, Gemma4 and SAM3.
Updated Apr 27, 2026 - Python
Fully-Convolutional Point Networks for Large-Scale Point Clouds
Updated Mar 22, 2019 - Python
Python code for handling the Clotho dataset.
Updated Nov 24, 2020 - Python
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Updated May 5, 2025 - Python
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
Updated Mar 22, 2026 - Python
A base TensorFlow project for medical report generation
Updated Jun 16, 2019 - Python
[CVPR 2023 & IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Updated Jul 29, 2025 - Python
[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
Updated Aug 17, 2021 - Python
A PyTorch implementation of the Attention on Attention module (both self and guided variants) for Visual Question Answering
Updated Nov 8, 2020 - Python
Using LLMs and pre-trained caption models for super-human performance on image captioning.
Updated Oct 13, 2023 - Python
Audio captioning baseline system for DCASE 2020 challenge.
Updated Aug 22, 2023 - Python