With careful manipulation, malicious agents can reverse-engineer private information encoded in pre-trained language models. Such security concerns motivate the development of quantum pre-training. In this work, we propose a highly portable quantum language model (PQLM) that can be easily transferred to downstream tasks on classical machines. The framework consists of a cloud PQLM built with random Variational Quantum Classifiers (VQCs) and local models for downstream applications. We demonstrate the portability of the quantum model by extracting only its word embeddings and applying them effectively to downstream tasks on classical machines. Our PQLM performs comparably to its classical counterpart on both intrinsic evaluation (loss, perplexity) and extrinsic evaluation (multilingual sentiment analysis accuracy) metrics, reaching an accuracy of 93.4% and slightly outperforming the classical model. We also perform ablation studies on the factors affecting PQLM performance to analyze model stability. Our work establishes a theoretical foundation for a portable quantum pre-trained language model that could be trained on private data and made available for public use with privacy-protection guarantees.
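The extraction step described above (word embeddings produced by a random variational quantum circuit, then reused classically) can be illustrated with a minimal NumPy statevector simulation. This is only a sketch under assumed details not given in the abstract: the encoding scheme (per-word RY rotation angles), the circuit shape (RY layers plus a CNOT chain), and the readout (per-qubit Pauli-Z expectations) are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, qubit, n):
    """Apply a 1-qubit gate to `qubit` of an n-qubit statevector."""
    state = state.reshape([2] * n)
    state = np.moveaxis(state, qubit, 0)
    state = np.tensordot(gate, state, axes=(1, 0))
    return np.moveaxis(state, 0, qubit).reshape(-1)

def apply_cnot(state, control, target, n):
    """Apply CNOT: flip `target` amplitudes where `control` is |1>."""
    state = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    t_axis = target - 1 if target > control else target  # axis shifts after fixing control
    state[tuple(idx)] = np.flip(state[tuple(idx)], axis=t_axis)
    return state.reshape(-1)

def pqlm_embedding(word_id, n_qubits=4, layers=2, seed=0):
    """Illustrative quantum word embedding (not the paper's exact circuit):
    angle-encode a word, apply fixed random variational layers,
    return per-qubit Z expectations as a classical embedding vector."""
    rng = np.random.default_rng(seed)                      # fixed random circuit params
    params = rng.uniform(0, 2 * np.pi, (layers, n_qubits))
    enc = np.random.default_rng(word_id).uniform(0, np.pi, n_qubits)  # word encoding angles

    state = np.zeros(2 ** n_qubits)
    state[0] = 1.0
    for q in range(n_qubits):                              # encode the word
        state = apply_1q(state, ry(enc[q]), q, n_qubits)
    for layer in range(layers):                            # random variational layers
        for q in range(n_qubits):
            state = apply_1q(state, ry(params[layer, q]), q, n_qubits)
        for q in range(n_qubits - 1):
            state = apply_cnot(state, q, q + 1, n_qubits)

    # Embedding = <Z> per qubit; these real numbers transfer to classical models.
    probs = np.abs(state) ** 2
    emb = []
    for q in range(n_qubits):
        z = np.array([1.0 if ((i >> (n_qubits - 1 - q)) & 1) == 0 else -1.0
                      for i in range(2 ** n_qubits)])
        emb.append(probs @ z)
    return np.array(emb)
```

Because the circuit parameters are fixed and random, the embedding of each word is deterministic and can be exported once from the quantum (cloud) side, after which downstream training touches only the classical vectors.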