VLM

A Culturally-diverse Multilingual Multimodal Video Benchmark & Model
Large multimodal models (LMMs) have recently gained attention due to their effectiveness in understanding and generating descriptions of …
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
Existing Large Multimodal Models (LMMs) generally focus on only a few regions and languages. As LMMs continue to improve, it is …