Contextual AD narration with interleaved multimodal sequence

Jan 1, 2025 · Hanlin Wang, Zhan Tong, Kecheng Zheng, Yujun Shen, Limin Wang

Type: Conference paper
Publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Last updated on Jan 1, 2025

Authors: Limin Wang, Nanjing University