Research · 4d ago

On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs

Researchers have introduced a reinforcement learning framework called Be Faithful When Responding, designed to make vision-language models more reliable on multimodal reasoning tasks. The approach targets instability that arises when models exploit language patterns instead of accurately interpreting visual information.

Covered by 13 sources

  • Apple Machine Learning Blog · 1d ago
  • arXiv CS.AI · Xin Zou, Haolin Deng, Yibo Yan, Shuliang Liu, Kening Zheng, Zhiwei Jin, Chen Chen, Haonan Lu, Xuming Hu · 3d ago
  • arXiv CS.AI · Eric Peh, Debaditya Roy, Basura Fernando · 3d ago
  • arXiv CS.AI · Shuimu Chen, Yuteng Chen, Yuanshen Guan, Zebang Cheng, Zeyu Zhang, Shengqian Qin, Bin Xia, Jiaran Li, Wenming Yang, Fei Ma · 4d ago
  • arXiv CS.AI · Tao Cheng, Shi-Zhe Chen, Hao Zhang, Yixin Qin, Jinwen Luo, Zheng Wei · 4d ago
  • arXiv CS.AI · Kaitao Chen, Weiqian Zhao, Jiamin Wu, Qihao Zheng, Shangquan Sun, Chunfeng Song, Xiaosong Wang, Mu Zhou, Mianxin Liu · 2d ago
  • arXiv CS.AI · Junha Jung, Minbyul Jeong, Suhyeon Lim, Sungwook Jung, Jaehoon Yun, Taeyun Roh, Mujeen Sung, Jaewoo Kang · 2d ago
  • arXiv CS.AI · Weixin Chen, Antonio Vergari, Han Zhao · 2d ago
  • arXiv CS.AI · Yutao Sun, Yanting Miao, Hao-Xuan Ma, Mengyu Zhou, Mingshuai Chen, Tiancheng Zhao, Dexin Wang, Lei Lv, Li Xu, Xiaoxi Jiang, Guanjun Jiang · 3d ago
  • arXiv CS.AI · Peng, Lee, Yin Zhang, Yanglin Zhang, Haonan Wu, Zishan Liu, Ruoxi Zang, Xin Zhu, Jiayin Zheng, Jian Yao, Zefeng Ji, Fei Ma · 3d ago
  • arXiv CS.AI · Wenhao Zhang, Kuanwei Lin, Xuyi Yang, Wei Gao, Ge Li · 1d ago
  • arXiv CS.AI · Lingxiao Li, Yifan Wang, Xinyan Gao, Chen Tang, Xiangyu Yue, Chenyu You · 1d ago
  • arXiv CS.AI · Hongxing Li, Xiufeng Huang, Dingming Li, Wenjing Jiang, Zixuan Wang, Haolei Xu, Hanrong Zhang, Haiwen Hong, Longtao Huang, Hui Xue, Weiming Lu, Jun Xiao, Yueting Zhuang, Yongliang Shen · 1d ago
