Topic: Imitation Attacks and Defenses for Black-box Machine Translation Systems
Abstract: We consider an adversary that seeks to steal a black-box machine translation (MT) system, either for financial gain or to exploit model errors. We first show that black-box MT systems can be stolen by querying them with monolingual sentences and training a model to imitate their outputs. Through simulated experiments, we demonstrate that MT model stealing is possible even when the imitation model's input data or architecture differs from the victim's. Applying these ideas, we train imitation models that come within 0.6 BLEU of three production MT systems on both high-resource and low-resource language pairs. We then leverage the similarity of the imitation models to transfer adversarial examples to the production systems. We use gradient-based attacks that expose inputs leading to semantically incorrect translations, dropped content, and vulgar model outputs. To mitigate these vulnerabilities, we propose a defense that modifies translation outputs in order to misdirect the optimization of imitation models. This defense degrades the imitation model's BLEU and the attack transfer rate, at some cost to the defender's BLEU and inference speed.
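The stealing pipeline described above (query the victim with monolingual sentences, then train on the distilled input-output pairs) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `query_blackbox_mt` stands in for a production MT API (here a toy word-substitution "translator" so the example runs self-contained), and the imitation "model" is a lookup table rather than a trained seq2seq network.

```python
# Toy stand-in for the victim system: word-by-word substitution.
SUBS = {"hello": "hallo", "world": "welt", "good": "gut"}

def query_blackbox_mt(sentence: str) -> str:
    # Simulates sending a query to the black-box MT API.
    return " ".join(SUBS.get(w, w) for w in sentence.split())

def collect_imitation_data(monolingual_corpus):
    # Step 1: label monolingual sentences with the victim's outputs.
    return [(src, query_blackbox_mt(src)) for src in monolingual_corpus]

def train_imitation_model(pairs):
    # Step 2: fit an imitation model on the distilled pairs. A real attack
    # trains a seq2seq model; a word-level lookup keeps the sketch runnable.
    model = {}
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            model[s] = t
    return model

def imitate(model, sentence):
    return " ".join(model.get(w, w) for w in sentence.split())

corpus = ["hello world", "good world"]
imitation = train_imitation_model(collect_imitation_data(corpus))
print(imitate(imitation, "hello good world"))  # prints "hallo gut welt"
```

In the paper's setting the distilled pairs train a full neural MT model; the key point the sketch preserves is that only query access to the victim is needed, never its parameters or training data.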
Earnings call (EC), as a periodic teleconference of a publicly-traded company, has been extensively studied as an essential market indicator because of its high analytical value in corporate fundamentals. The recent emergence of deep learning techniques has shown great promise in creating automated pipelines to benefit EC-supported financial applications. However, these methods presume all included content to be informative, without refining valuable semantics from long-text transcripts, and suffer from the EC scarcity issue. Meanwhile, these black-box methods have inherent difficulty in providing human-understandable explanations. To this end, in this paper, we propose a Multi-Domain Transformer-Based Counterfactual Augmentation, named MTCA, to address the above problems. Specifically, we first propose a transformer-based EC encoder to attentively quantify the task-inspired significance of critical EC content for market inference. Then, a multi-domain counterfactual learning framework is developed to evaluate the gradient-based variations after we perturb limited EC informative texts with plentiful cross-domain documents, enabling MTCA to perform unsupervised data augmentation. As a bonus, we discover a way to use non-training data as instance-based explanations, which we illustrate with case studies. Extensive experiments on real-world financial datasets demonstrate the effectiveness of interpretable MTCA, improving the volatility evaluation accuracy of the state-of-the-art by 14.2\%.
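The counterfactual idea above (perturb an informative transcript passage with a cross-domain document and measure how the model's prediction shifts) can be sketched in miniature. Everything here is illustrative and not from the paper: the "model" is a toy keyword-based volatility scorer standing in for the transformer encoder, and the prediction delta stands in for the gradient-based variation MTCA evaluates.

```python
# Hypothetical sketch of counterfactual perturbation for transcript text.
RISK_WORDS = {"uncertain", "decline", "risk"}

def volatility_score(sentences):
    # Toy proxy model: fraction of sentences containing a risk keyword.
    hits = sum(any(w in s.lower() for w in RISK_WORDS) for s in sentences)
    return hits / max(len(sentences), 1)

def counterfactual_variation(transcript, idx, cross_domain_sentence):
    # Replace one sentence with cross-domain text and return the score
    # change; a large |delta| marks the original sentence as informative.
    base = volatility_score(transcript)
    perturbed = list(transcript)
    perturbed[idx] = cross_domain_sentence
    return volatility_score(perturbed) - base

ec = ["Guidance remains uncertain for Q3.", "We opened two new stores."]
delta = counterfactual_variation(ec, 0, "The weather was mild this quarter.")
print(delta)  # prints -0.5: removing the risk sentence lowers the score
```

Replacing the risk-laden sentence changes the prediction sharply, while perturbing the uninformative one would not; ranking sentences by this variation is the intuition behind both the augmentation and the instance-based explanations.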