arxiv:2606.24636
Xinyu Mao
hector-mao
AI & ML interests
Multimodal Large Language Models, Vision Language Models
Recent Activity
authored a paper 1 day ago
CineCap: Structured Reasoning with Spatio-Temporal Anchors for Cinematographic Video Captioning
updated a model 3 days ago
hector-mao/CineCap-GRPO-8B
Organizations
None yet