Weakly supervised data are widespread and have attracted much attention. However, because label quality is often difficult to guarantee, using weakly supervised data can lead to unsatisfactory results, i.e., performance degradation or only marginal gains. Moreover, manually improving label quality is usually infeasible, which makes weakly supervised learning difficult to rely on. In view of this crucial issue, this paper proposes a simple and novel weakly supervised learning framework: we guide the optimization of label quality with a small amount of validation data, ensuring safe performance while maximizing the performance gain. Since the validation set is a good approximation of the generalization risk, it effectively avoids the unsatisfactory performance caused by incorrect assumptions about the data distribution. We formalize this underlying consideration as a novel bi-level optimization problem and give an effective solution. Extensive experimental results verify that the new framework achieves impressive performance on weakly supervised learning with a small amount of validation data.
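The abstract does not give the bi-level formulation itself, but the underlying idea, an outer loop that tunes per-example weights on the weakly labeled data to minimize validation loss, while an inner loop fits the model on the weighted weak labels, can be sketched as follows. Everything here (the logistic model, the `sample_w` weight variables, and the gradient-alignment outer update) is an illustrative assumption, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: the true label is sign(x[0]); weak labels have 30% flip noise.
X_train = rng.normal(size=(200, 2))
y_true = (X_train[:, 0] > 0).astype(float)
flip = rng.random(200) < 0.3
y_weak = np.where(flip, 1.0 - y_true, y_true)

# A small clean validation set approximates the generalization risk.
X_val = rng.normal(size=(40, 2))
y_val = (X_val[:, 0] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(2)                        # inner variable: model parameters
sample_w = np.ones(len(X_train))       # outer variable: per-example weights
lr = 0.5

for step in range(100):
    # Inner step: gradient descent on the weighted weak-label loss.
    p = sigmoid(X_train @ w)
    grad = X_train.T @ (sample_w * (p - y_weak)) / sample_w.sum()
    w -= lr * grad

    # Outer step (a crude one-step approximation of the bi-level gradient):
    # an example whose per-example gradient aligns with the validation
    # gradient reduces validation loss when trained on, so it is up-weighted;
    # misaligned examples (likely mislabeled) are down-weighted.
    if step % 10 == 0:
        p = sigmoid(X_train @ w)
        influence = (p - y_weak)[:, None] * X_train          # per-example grads
        val_grad = X_val.T @ (sigmoid(X_val @ w) - y_val) / len(X_val)
        align = influence @ val_grad
        sample_w = np.clip(sample_w + 0.2 * np.sign(align), 0.05, 1.0)
```

In this sketch the noisy (flipped) examples end up with much lower weights than the clean ones, which is the mechanism by which the validation set "guides the optimization of label quality" while guarding against performance degradation.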