Compared to facial expression recognition, expression synthesis requires a very high-dimensional mapping. The problem worsens as image size increases, which limits existing expression synthesis approaches to relatively small images. We observe that the change from one facial expression to another is often sparsely distributed and locally correlated. Exploiting this observation significantly reduces the number of parameters in an expression synthesis model. We therefore propose a constrained version of ridge regression that exploits the local and sparse structure of facial expressions. We view this model as masked regression for learning local receptive fields. In contrast to existing approaches, our proposed model can be trained efficiently on larger image sizes. Experiments on three publicly available datasets demonstrate that our model is significantly better than $\ell_0$-, $\ell_1$- and $\ell_2$-regression, SVD-based approaches, and kernelized regression in terms of mean squared error, visual quality, and computational and spatial complexity. The reduction in the number of parameters allows our method to generalize better even when trained on smaller datasets. The proposed algorithm is also compared with state-of-the-art GANs including Pix2Pix, CycleGAN, StarGAN and GANimation. These GANs produce photo-realistic results as long as the testing and training distributions are similar. In contrast, our results demonstrate that the proposed algorithm generalizes well to out-of-dataset human photographs, pencil sketches and even animal faces.
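To make the idea of masked regression concrete, the following is a minimal sketch and not the authors' implementation. It assumes a square local window as the mask, a single regularization weight `lam`, and an independent closed-form ridge solve for each output pixel; the names `local_mask` and `fit_masked_ridge` are hypothetical and chosen only for illustration.

```python
import numpy as np

def local_mask(height, width, radius):
    """Boolean mask of shape (H*W, H*W).

    mask[i, j] is True when input pixel j lies inside the
    (2*radius+1) x (2*radius+1) window centred on output pixel i.
    The square window is an assumption of this sketch.
    """
    ys, xs = np.divmod(np.arange(height * width), width)
    dy = np.abs(ys[:, None] - ys[None, :])
    dx = np.abs(xs[:, None] - xs[None, :])
    return (dy <= radius) & (dx <= radius)

def fit_masked_ridge(X, Y, mask, lam=1.0):
    """Ridge regression with weights constrained to the mask support.

    X: (n_samples, d_in) flattened input images.
    Y: (n_samples, d_out) flattened target images.
    Each output pixel is fitted only on the inputs its mask row allows,
    so all weights outside the mask stay exactly zero.
    """
    d_out, d_in = mask.shape
    W = np.zeros((d_out, d_in))
    for i in range(d_out):
        idx = np.flatnonzero(mask[i])          # local receptive field of pixel i
        Xi = X[:, idx]
        A = Xi.T @ Xi + lam * np.eye(idx.size)  # small (k x k) ridge system
        W[i, idx] = np.linalg.solve(A, Xi.T @ Y[:, i])
    return W

# Toy usage on synthetic 6x6 "images" (random data, for illustration only).
rng = np.random.default_rng(0)
H, Wd, radius = 6, 6, 1
mask = local_mask(H, Wd, radius)
W_true = rng.normal(size=mask.shape) * mask    # ground-truth local weights
X = rng.normal(size=(200, H * Wd))
Y = X @ W_true.T
W_hat = fit_masked_ridge(X, Y, mask, lam=1e-8)
```

In this toy setting the mask keeps 256 of the 1296 possible weights, and each per-pixel solve involves at most a 9 x 9 system instead of a 36 x 36 one, which illustrates why the parameter count and training cost grow far more slowly with image size than in unconstrained ridge regression.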