We are pleased to announce the VATEX Captioning Challenge 2020! This year, we are releasing an additional private test set of 6,278 new videos for evaluation. The challenge will be hosted at the Workshop on Language & Vision with applications to Video Understanding, CVPR 2020.
Please stay tuned for more information!
To be eligible for the result archives and for award consideration, please send the following information to firstname.lastname@example.org from your main contact email:
The VATEX dataset is a new large-scale multilingual video description dataset containing over 41,250 videos and 825,000 captions in both English and Chinese. Among the captions, there are over 206,000 English-Chinese parallel translation pairs. Compared to the widely used MSRVTT dataset, VATEX is multilingual, larger, more linguistically complex, and more diverse in both its videos and its natural language descriptions. Please refer to our ICCV paper for more details. The VATEX Captioning Challenge aims to benchmark progress towards models that can describe videos in multiple languages, such as English and Chinese.
Please refer to the Download page for details. You can download the English/Chinese captions and video features from that page.
The challenge is hosted on CodaLab. Please go to the Challenge page to submit your models.