Project title: Learnable Pulse-Coupled Neural Networks and Human-Computer Interaction Methods Based on Visual-Auditory Fusion
Project number: No. 60805028
Project type: Young Scientists Fund
Year of approval: 2009
Project discipline: Metallurgy and Metal Technology
Project author: Zhao Zengshun (赵增顺)
Author affiliation: Shandong University of Science and Technology (山东科技大学)
Project funding: CNY 180,000
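Chinese abstract (translated): For robots to coexist harmoniously with people in a shared environment, they must actively acquire information through vision and hearing and respond to it, so identifying the speaker and recognizing the environment are primary tasks for a service robot. This study proposes a learnable pulse-coupled neural network model, LPCNN, which addresses the lack of a learning mechanism in pulse-coupled neural networks and can set its parameters automatically. Drawing on the fuzziness and nonlinearity of human cognitive psychology, and combining manifold learning methods and the kernel trick from machine learning theory, the study also proposes a new multimodal associative memory network model. Features extracted with the Gabor transform form the observation sequence of an embedded hidden Markov model (EHMM), achieving real-time face recognition. An algorithm combining differential evolution with particle swarm co-evolutionary particle filtering is proposed for speech formant tracking, and a two-layer oscillatory neural network model is used to separate mixed speech. A correction strategy for local feature matching is proposed based on geometric constraints, and a hidden Markov model is then used to solve the problem of unreliable topological localization. The improved PCNN serves as the low-level visual unit, and the proposed multimodal associative memory serves as the high-level intelligent unit. Visual and auditory dual-channel information is fused in this prototype system, and experiments on face recognition, speech recognition, and unstructured environment modeling achieved the expected good results.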
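Chinese keywords (translated): service robots; human-computer interaction; neural networks; associative memory; environment modeling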
English abstract: For robots and humans to coexist harmoniously in the same environment, robots must actively acquire visual and auditory information and respond to it, so speaker identification and environment recognition are primary tasks for service robots. This study proposes a learnable pulse-coupled neural network model, LPCNN, which addresses the lack of a learning mechanism in the traditional pulse-coupled neural network and can set its parameters automatically. Introducing the fuzzy and nonlinear characteristics of human cognitive psychology, and combining manifold learning methods and the kernel trick from machine learning theory, we put forward a multimodal associative memory network model. We use features extracted with the Gabor transform to form the observation sequence of an embedded hidden Markov model (EHMM), achieving real-time face recognition. A particle filter algorithm based on the co-evolution of differential evolution and particle swarm optimization is proposed for speech formant tracking. A two-layer oscillatory neural network model is adopted to separate mixed speech. A geometric constraint on local features is used to refine the matching result, and a hidden Markov model is then used to solve the problem of unreliable topological localization. In our prototype system, the improved PCNN acts as the low-level visual unit, while the proposed multimodal associative memory acts as the high-level intelligent unit. Visual and auditory dual-channel information is fused within the system, whose performance is confirmed by face recognition experiments, speech recognition experiments, and unstructured environment modeling results.
English keywords: Service robots; human-computer interaction; neural network; associative memory; environmental modeling