Humans assess image quality through a perception-reasoning cascade, integrating sensory cues with implicit reasoning to form self-consistent judgments. In this work, we investigate how a model can acquire both human-like and self-consistent reasoning capabilities for blind image quality assessment (BIQA). We first collect human evaluation data that capture several aspects of the human perception-reasoning pipeline. We then adopt reinforcement learning, using the human annotations as reward signals to guide the model toward human-like perception and reasoning. To enable the model to internalize a self-consistent reasoning capability, we design a reward that drives the model to infer image quality purely from its self-generated descriptions. Empirically, our approach achieves score prediction performance comparable to state-of-the-art BIQA systems on standard metrics, including the Pearson and Spearman correlation coefficients. Beyond the rating score, we assess human-model alignment using ROUGE-1 to measure the similarity between model-generated and human perception-reasoning chains. On over 1,000 human-annotated samples, our model reaches a ROUGE-1 score of 0.512 (cf. 0.443 for the baseline), indicating substantial coverage of human explanations and marking a step toward human-like, interpretable reasoning in BIQA.
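To make the ROUGE-1 alignment metric concrete, the following is a minimal sketch of the unigram-recall variant: the fraction of reference (human) unigrams covered by the candidate (model) text, with counts clipped. This assumes simple whitespace tokenization and the recall variant; the paper's evaluation may use a different tokenizer or the F-measure.

```python
from collections import Counter

def rouge_1(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: fraction of reference unigrams covered by the candidate,
    with per-word counts clipped to the reference counts."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

# Hypothetical human vs. model quality descriptions (illustrative only):
human = "the image is blurry with low contrast"
model = "the image looks blurry and contrast is low"
print(rouge_1(human, model))  # 6 of 7 reference unigrams matched, ~0.857
```

Higher scores indicate that the model's perception-reasoning chain covers more of the wording a human annotator used, which is how the reported 0.512 vs. 0.443 comparison should be read.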