Language identification (LID) is relevant to many speech processing applications. For the automatic recognition of code-switching speech, conventional approaches often employ an LID system to detect the languages present within an utterance. In existing work, LID on code-switching speech models the underlying languages separately. In this work, we propose an LID system for code-switching speech that models the languages jointly. To this end, we explore an attention-based end-to-end (E2E) network. The proposed approach is developed and evaluated on a recently created Hindi-English code-switching corpus. For comparison, an LID system employing a connectionist temporal classification-based E2E network is also developed. Of the two LID systems, the attention-based approach yields the better LID accuracy. Plots of the attention weights of the E2E network demonstrate that the proposed approach effectively locates code-switching boundaries within an utterance.
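To illustrate how attention weights can point to code-switching boundaries, the following is a minimal sketch and not the implementation used in this work. It assumes the attention weights are available as a matrix with one row per decoder output step and one column per input frame, and that each output step carries a language tag; the tags `"HI"` and `"EN"` and the function names are hypothetical. A boundary is estimated as the midpoint between the attention peaks of two consecutive output steps whose tags differ.

```python
import numpy as np


def attention_peaks(attention):
    """Return, for each output step, the input frame with the largest weight."""
    attention = np.asarray(attention, dtype=float)
    if attention.ndim != 2:
        raise ValueError("attention must be a 2-D (output steps x frames) matrix")
    return attention.argmax(axis=1)


def locate_switch_boundaries(attention, labels):
    """Estimate code-switching boundaries from attention weights.

    attention: (num_output_steps, num_frames) matrix of attention weights
               (assumed layout, not taken from the paper).
    labels:    one language tag per output step, e.g. "HI" or "EN"
               (hypothetical tag names).

    Returns a list of (frame_position, from_language, to_language) tuples.
    """
    peaks = attention_peaks(attention)
    if len(peaks) != len(labels):
        raise ValueError("need exactly one label per output step")

    boundaries = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            # Place the boundary halfway between the two attention peaks.
            frame = (int(peaks[i - 1]) + int(peaks[i])) / 2.0
            boundaries.append((frame, labels[i - 1], labels[i]))
    return boundaries
```

Taking the peak of each attention row is only one possible choice; an attention-weighted mean frame position would be a smoother alternative when the weights are spread over several frames.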