Font synthesis has been an active research topic in recent years because manual font design requires domain expertise and is labor-intensive and time-consuming. While remarkably successful, existing font synthesis methods have major shortcomings: some require fine-tuning on many reference images for each unseen font style, while recent few-shot methods are either designed for a specific language system or operate on low-resolution images, which limits their applicability. In this paper, we tackle font synthesis by learning font style in an embedding space. To this end, we propose a model, called FontNet, that simultaneously learns to separate font styles in an embedding space, where distances directly correspond to a measure of font similarity, and to translate input images into a given observed or unobserved font style. Additionally, we design a network architecture and training procedure that can be adopted for any language system and can produce high-resolution font images. Thanks to this approach, our method outperforms existing state-of-the-art font generation methods in both qualitative and quantitative experiments.
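The abstract does not spell out how the style embedding is trained; a common way to realize "distances correspond to font similarity" is a metric-learning objective such as a triplet loss over glyphs of the same versus different fonts. The PyTorch sketch below is an illustrative assumption, not FontNet's actual architecture: the `StyleEncoder` layers, the embedding size, the margin, and the 64x64 glyph resolution are all hypothetical.

```python
# A minimal sketch of embedding-space style separation: a style encoder is
# trained so that glyphs of the same font map close together and glyphs of
# different fonts map far apart. All hyperparameters here are assumptions;
# the abstract does not specify FontNet's exact loss or layers.
import torch
import torch.nn as nn

class StyleEncoder(nn.Module):
    """Maps a glyph image to a style embedding vector."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.net(x)
        # L2-normalize so Euclidean distance reflects style similarity.
        return nn.functional.normalize(z, dim=1)

encoder = StyleEncoder()
triplet = nn.TripletMarginLoss(margin=0.2)

# anchor/positive: two glyphs from the same font; negative: a glyph from
# another font (here, random stand-ins for single-channel 64x64 images).
anchor   = torch.randn(8, 1, 64, 64)
positive = torch.randn(8, 1, 64, 64)
negative = torch.randn(8, 1, 64, 64)

loss = triplet(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()  # gradients pull same-font embeddings together
```

In such a setup, the learned embedding of a few reference glyphs from an unseen font can condition the generator directly, which is one plausible reading of how FontNet avoids per-style fine-tuning.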