We present LIWhiz, a non-intrusive lyric intelligibility prediction system submitted to the ICASSP 2026 Cadenza Challenge. LIWhiz uses Whisper for robust feature extraction and a trainable back-end for score prediction. On the Cadenza Lyric Intelligibility Prediction (CLIP) evaluation set, LIWhiz achieves a 22.4% relative reduction in root mean squared error over the STOI-based baseline, along with a substantial improvement in normalized cross-correlation.
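For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a relative RMSE reduction and a normalized cross-correlation could be computed from predicted and reference intelligibility scores. It is an illustration under stated assumptions, not the challenge's official scoring code: the function names are hypothetical, normalized cross-correlation is assumed to be the zero-lag, mean-removed form (equivalent to the Pearson correlation), and the numbers in the usage lines are made-up values chosen only to show the arithmetic.

```python
import math


def rmse(pred, target):
    """Root mean squared error between two equal-length score lists."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def relative_reduction(baseline_err, system_err):
    """Relative error reduction of a system over a baseline, in percent."""
    return 100.0 * (baseline_err - system_err) / baseline_err


def ncc(pred, target):
    """Zero-lag, mean-removed normalized cross-correlation.

    Assumption: this form (identical to the Pearson correlation) is used
    here for illustration; the challenge's exact definition may differ.
    """
    mean_p = sum(pred) / len(pred)
    mean_t = sum(target) / len(target)
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    den = math.sqrt(
        sum((p - mean_p) ** 2 for p in pred) * sum((t - mean_t) ** 2 for t in target)
    )
    return num / den


# Hypothetical scores, for illustration only (not challenge data).
target = [0.10, 0.40, 0.75, 0.90]
baseline_pred = [0.40, 0.30, 0.50, 0.60]
system_pred = [0.20, 0.35, 0.70, 0.80]

baseline_rmse = rmse(baseline_pred, target)
system_rmse = rmse(system_pred, target)
print(relative_reduction(baseline_rmse, system_rmse))
print(ncc(baseline_pred, target), ncc(system_pred, target))
```

A relative reduction of 22.4% means the system's RMSE is 77.6% of the baseline's RMSE; for example, a hypothetical baseline RMSE of 10.0 and a system RMSE of 7.76 would give that figure.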