Source: Panasonic
Headline: Panasonic Connect Wins First Place at CVPR 2025 VidLLMs Competition, the World’s Premier Image Recognition Conference
The VidLLMs Workshop, held for the first time at CVPR 2025, was a competition to test the performance of video large language models (VidLLMs). Panasonic Connect entered the “Complex Video Reasoning & Robustness Evaluation” category. (For details, please check the VidLLMs Workshop – CVPR 2025 website.)
In the “Complex Video Understanding” task, video recognition AI is evaluated on how well it handles varied and difficult situations, using 214 third-person-perspective videos with complex contexts and 2,400 sets of free-form descriptive questions.
The videos cover 11 complex categories, including grasping temporal order, understanding emotions and social backgrounds, and reasoning based on common sense, and they demand understanding of situations close to real life. The question set also includes questions that deliberately ask about objects or events not shown in the video, as well as misleading questions, to test the AI’s ability to avoid hallucinations (misidentification of facts). Answers must be free-form descriptions in natural language, which tests the ability to express a response that fits the context.
Conventional AI models answer about 75% of the questions correctly, while humans reach 97%, a gap of roughly 22 percentage points that shows AI still falls well short of human performance in this field.