Hello. Introduce a paper #6

wjun0830 · 2023-03-29T06:42:48Z

Hello. We'd like to introduce our CVPR 2023 paper, "Query-Dependent Video Representation for Moment Retrieval and Highlight Detection", on cross-modal moment retrieval.

Code: https://github.com/wjun0830/QD-DETR
arXiv: https://arxiv.org/abs/2303.13874

nguyentthong · 2024-08-26T03:43:47Z

Hi,

Thank you for your wonderful survey!

Would you mind adding two papers about video-text retrieval?

Paper 1: Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning

Accepted at ECCV 2024.

It leverages LLaVA to increase the scale of training data for video-text retrieval. The approach is to concatenate the frames of a video and pass them to LLaVA, which generates a caption for the video.

Paper link: https://arxiv.org/abs/2407.03788

Code link: https://github.com/nguyentthong/meta_optimized_angular_margin_contrastive_lvlm

Paper 2: Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

Accepted to Findings of ACL 2024.

This paper summarizes video-text retrieval methods from model architecture, model training, and data perspectives.

Paper link: https://arxiv.org/abs/2406.05615

Code link: https://github.com/nguyentthong/video-language-understanding

Thanks a lot!