Advanced neural language models (NLMs) are widely used in sequence generation tasks because they can produce fluent and meaningful sentences. They can also be used to generate fake reviews, which can then be used to attack online review systems and influence the buying decisions of online shoppers. Performing such attacks has traditionally required experts to train a tailored LM for a specific topic. In this work, we show that a low-skill threat model can be built simply by combining publicly available LMs, and that the fake reviews it produces can fool both humans and machines. In particular, we use the GPT-2 NLM to generate a large number of high-quality reviews conditioned on a seed review with the desired sentiment, and then use a BERT-based text classifier (with an accuracy of 96%) to filter out reviews with undesired sentiments. Because none of the words in a generated review are modified, the samples drawn from the learned distribution are as fluent as the training data. A subjective evaluation with 80 participants demonstrated that this simple method can produce reviews that are as fluent as those written by people; it also showed that the participants' judgments of which reviews were fake were close to random. Three countermeasures, Grover, GLTR, and the OpenAI GPT-2 detector, were found to have difficulty accurately detecting the fake reviews.
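The following is a minimal sketch of the generate-then-filter pipeline described above, written with the Hugging Face `transformers` library. The checkpoint names, generation parameters, and the off-the-shelf sentiment pipeline (standing in for the paper's fine-tuned BERT classifier) are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: generate candidate reviews with GPT-2, then keep only those whose
# predicted sentiment matches the seed review's desired sentiment.
from transformers import pipeline

# GPT-2 generator; the specific checkpoint/size is an assumption.
generator = pipeline("text-generation", model="gpt2")

# Sentiment classifier; a stand-in for the paper's fine-tuned BERT model.
classifier = pipeline("sentiment-analysis")

seed_review = "The food was wonderful and the staff were friendly."
desired_label = "POSITIVE"  # sentiment of the seed review

# Generate several continuations conditioned on the seed review.
candidates = generator(
    seed_review,
    max_new_tokens=60,
    num_return_sequences=5,
    do_sample=True,
    top_k=50,
)

# Filter step: discard generations whose predicted sentiment differs
# from the desired one.
fake_reviews = []
for cand in candidates:
    text = cand["generated_text"]
    pred = classifier(text)[0]
    if pred["label"] == desired_label:
        fake_reviews.append(text)

for review in fake_reviews:
    print(review, "\n---")
```

Note how the filter requires no modification of the generated text itself: rejected samples are simply dropped, so every surviving review is an unaltered draw from the LM's learned distribution.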