Research · Apr 16

Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions

arXiv:2503.23137v2. Abstract: Understanding humor, particularly when it involves complex, contradictory narratives that require comparative reasoning, remains a significant challenge for large vision-language models (VLMs). This limitation hinders AI's ability to engage in human-like reasoning and cultural expression. In this paper, we investigate this challenge through an in-depth analysis of comics that juxtapose panels to create humor through contradictions. We introduce YesBut (V2), a novel benchmark with 1,262 comic images from diverse multilingual and multicultural contexts, featuring comprehensive annotations that capture various aspects of narrative understanding. Using this benchmark, we systematically evaluate a wide range of VLMs through four complementary tasks spanning from surface content comprehension to deep narrative reasoning, with particular emphasis on comparative reasoning between contradictory elements. Our extensive experiments reveal that even the most advanced models significantly underperform compared to humans, with common failures in visual perception, key element identification, comparative analysis, and hallucinations. We further…

Covered by 4 sources

  • arXiv CS.AI · Tuo Liang, Zhe Hu, Jing Li, Hao Zhang, Yiren Lu, Yunlai Zhou, Yiran Qiao, Disheng Liu, Jeirui Peng, Jing Ma, Yu Yin · Apr 16
  • arXiv CS.AI · Zhe Hu, Tuo Liang, Jing Li, Yiren Lu, Yunlai Zhou, Yiran Qiao, Jing Ma, Yu Yin · Apr 16
  • arXiv CS.AI · Hatice Merve Vural, Doga Kukul, Ege Erdem Ozlu, Demir Ekin Arikan, Bob Mankoff, Erkut Erdem, Aykut Erdem · Apr 17
  • arXiv CS.AI · Venkata S Govindarajan, Laura Biester · Apr 17

Related stories

  • Research · MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining · Apr 16 · 2 sources
  • Research · Automated Alignment Researchers: Using large language models to scale scalable oversight - Anthropic · Apr 14 · 2 sources
  • Research · AI as scientist? Machine-written papers clear academic reviews, raise questions - MSN · Apr 13 · 2 sources
  • Research · Nvidia wants to scale robot simulation training with Lyra 2.0 · Apr 16