Deep networks for image classification often rely more on texture information than on object shape. While efforts have been made to make deep models shape-aware, it is often difficult to make such models simple, interpretable, or rooted in known mathematical definitions of shape. This paper presents a deep-learning model inspired by geometric moments, a classical and well-understood approach to measuring shape-related properties. The proposed method consists of a trainable network that generates coordinate bases and affine parameters, making the features geometrically invariant yet in a task-specific manner. The proposed model improves the interpretability of the final features. We demonstrate the effectiveness of our method on standard image classification datasets. The proposed model achieves higher classification performance than the baseline and standard ResNet models while substantially improving interpretability.
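For readers unfamiliar with the classical notion the model draws on, the following is a minimal sketch of raw and central geometric moments, M_pq = Σ x^p y^q I(x, y), in plain Python. It illustrates only the textbook definition with fixed monomial bases, not the learned coordinate bases or affine parameters of the proposed network. The function names and the toy images are assumptions made for illustration.

```python
def geometric_moment(image, p, q):
    """Raw geometric moment M_pq = sum over pixels of x^p * y^q * I(x, y).

    `image` is a list of rows; x is the column index, y is the row index.
    """
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            total += (x ** p) * (y ** q) * value
    return total


def centroid(image):
    """Intensity centroid (M_10 / M_00, M_01 / M_00)."""
    m00 = geometric_moment(image, 0, 0)
    if m00 == 0:
        raise ValueError("image has zero total intensity")
    return (geometric_moment(image, 1, 0) / m00,
            geometric_moment(image, 0, 1) / m00)


def central_moment(image, p, q):
    """Central moment mu_pq: coordinates are measured from the centroid,
    which makes the value invariant to translation of the shape."""
    cx, cy = centroid(image)
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            total += ((x - cx) ** p) * ((y - cy) ** q) * value
    return total


# Hypothetical toy data: a 2x2 square, and the same square shifted by (1, 1).
square = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
shifted = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
```

In this sketch the raw first-order moments of `square` and `shifted` differ because the shape has moved, while their central moments agree. This is the kind of geometric invariance that the abstract describes obtaining in a learned, task-specific way.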