The goal of representation learning differs from the ultimate objectives of machine learning, such as decision making, so it is very difficult to establish clear and direct objectives for training representation learning models. It has been argued that a good representation should disentangle the underlying factors of variation, yet how to translate this into training objectives remains unclear. This paper presents an attempt to establish direct training criteria and design principles for developing good representation learning models. We propose that a good representation learning model should be maximally expressive, i.e., capable of distinguishing the maximum number of input configurations. We formally define expressiveness and introduce the maximum expressiveness (MEXS) theorem of a general learning model. We propose to train a model by maximizing its expressiveness while at the same time incorporating general priors such as model smoothness. We present a conscience competitive learning algorithm that encourages the model to reach its MEXS while adhering to the model smoothness prior. We also introduce a label consistent training (LCT) technique that boosts model smoothness by encouraging the model to assign consistent labels to similar samples. We present extensive experimental results showing that our method can indeed produce representation learning models that develop representations as good as or better than the state of the art. We also show that our technique is computationally efficient, robust to different parameter settings, and effective on a variety of datasets. Code is available at https://github.com/qlilx/odgrlm.git
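To make the two ingredients named above more concrete, the following is a minimal, hypothetical sketch (not the authors' released code, which is at the repository linked above) of conscience competitive learning with a label-consistency check. All names and hyper-parameters (`protos`, `win_freq`, `conscience`, the learning rates, and the augmentation noise) are assumptions for illustration: a conscience bias penalizes prototypes that win too often so that all prototypes stay in use (driving the model toward maximal expressiveness), and label consistency is approximated by checking whether two perturbed views of a sample receive the same label.

```python
import numpy as np

# Hypothetical sketch of conscience competitive learning: each prototype keeps a
# running win frequency, and frequently-winning prototypes are penalised so that
# every prototype is used roughly equally often.
rng = np.random.default_rng(0)
n_protos, dim = 16, 32                         # assumed number of prototypes and feature size
protos = rng.normal(size=(n_protos, dim))      # prototype (label) vectors
win_freq = np.full(n_protos, 1.0 / n_protos)   # running win frequencies
lr, freq_lr, conscience = 0.05, 0.01, 10.0     # assumed hyper-parameters

def assign(x):
    """Pick the winning prototype using distances plus a conscience bias."""
    dists = np.linalg.norm(protos - x, axis=1) ** 2
    bias = conscience * (win_freq - 1.0 / n_protos)   # penalise frequent winners
    return int(np.argmin(dists + bias))

def update(x):
    """One conscience-competitive-learning step on a single feature vector x."""
    global win_freq
    k = assign(x)
    protos[k] += lr * (x - protos[k])          # move the winner toward the input
    onehot = np.zeros(n_protos)
    onehot[k] = 1.0
    win_freq = (1.0 - freq_lr) * win_freq + freq_lr * onehot
    return k

# Label-consistency idea: two perturbed views of the same sample should get the
# same label; disagreement is used here only as a rough consistency indicator.
x = rng.normal(size=dim)
view_a = x + 0.01 * rng.normal(size=dim)
view_b = x + 0.01 * rng.normal(size=dim)
lct_disagreement = float(assign(view_a) != assign(view_b))
update(x)
```

In the paper's setting the assignment would be produced by a trained network rather than fixed prototypes, but the sketch illustrates the interplay between the expressiveness-seeking competitive step and the smoothness-seeking consistency constraint.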