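Research on a Lexicon-Driven Method for Online Handwritten Uyghur Word Recognition

Project title: Research on a Lexicon-Driven Method for Online Handwritten Uyghur Word Recognition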

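Project number: No.61462081

Project type: Regional Science Fund Project

Year of approval: 2015

Discipline: Other

Project author: 玛依热·依布拉音 (Mayire Ibrayim)

Affiliation: Xinjiang University

Funding: 460,000 yuan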

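Chinese abstract: Online handwriting input is a natural and convenient input method that has received considerable attention and seen wide use. However, research on online handwritten Uyghur recognition remains scarce. By analyzing the structure and writing characteristics of Uyghur letters and words, this project studies a lexicon-driven framework and method for online handwritten Uyghur word recognition that integrates segmentation and recognition. The system casts word recognition as an optimization problem of matching lexicon entries against the handwritten word image. First, after the additional parts of a word are removed, the shape of the main stroke trajectory is analyzed to find potential over-segmentation points, and each resulting basic block is merged with its corresponding additional part, yielding a sequence of basic letter fragments. Adjacent basic fragments are then combined to form a candidate segmentation grid. Next, using a lexicon-driven approach, letter recognition information, geometric information, and lexicon information are incorporated together into the path matching process of the word recognition system. A confidence transformation converts the classifier outputs into probabilities, which makes parameter tuning easier, and a dynamic programming algorithm selects the optimal matching path during word recognition to obtain the best recognition result. The results of this research will help advance informatization in ethnic minority regions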

Chinese keywords: Uyghur words; online handwriting recognition; lexicon-driven; integrated segmentation and recognition; path matching

English abstract: Online handwriting input is a natural and convenient input method that has received great attention and is widely used. However, little work has been done on online handwritten Uyghur recognition. Through analysis of the unique shapes and writing styles of Uyghur characters, this project investigates a lexicon-driven approach to online handwritten Uyghur word recognition that integrates segmentation and recognition. The word recognition problem is transformed into an optimization problem of matching dictionary entries against the handwritten word image. In the first step, after delayed strokes are removed from the handwritten word, potential breakpoints are detected at concavities and ligatures by temporal and shape analysis of the stroke trajectory. The delayed strokes are then reattached, yielding a sequence of primitive segments, and adjacent segments are combined to create a candidate segmentation grid. In the second step, a lexicon-driven approach combines character recognition information, geometric information, and dictionary information in the path matching procedure of the word recognition system. A confidence transformation converts the similarity scores into probabilities, so that tuning the weighting parameters becomes easier. Dynamic matching between the characters of a lexicon entry and the segment(s) of the input word image is used to rank the lexicon entries and obtain the best match. Research on recognition techniques for online handwritten Uyghur has far-reaching significance for the development of information technology and culture in ethnic minority communities.

English keywords: Uyghur word; Online handwriting recognition; Lexicon-driven; Integrated segmentation and recognition; Path matching
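
The matching step outlined in the abstract (combine adjacent primitive segments into a candidate segmentation grid, convert classifier scores to probabilities, then pick the best path for each lexicon entry by dynamic programming) can be illustrated with a minimal Python sketch. This is not the project's implementation. The sigmoid form of the confidence transformation, its parameters `a` and `b`, the `max_merge` limit, the length normalization, and the toy scorer with its `TEMPLATES` table are all assumptions made for illustration, and geometric information is left out.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical classifier interface: raw similarity score for a group of
# adjacent primitive segments interpreted as one character.
Scorer = Callable[[Sequence[str], str], float]


def log_confidence(score: float, a: float = 1.0, b: float = 0.0) -> float:
    """Log of an assumed sigmoid confidence 1 / (1 + exp(-(a*score + b)))."""
    z = a * score + b
    # Numerically stable log-sigmoid.
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def match_word(segments: Sequence[str], word: str, scorer: Scorer,
               max_merge: int = 3) -> float:
    """Best log-probability of aligning `word` with all of `segments`.

    best[k][j] is the best score for matching the first k characters to the
    first j primitive segments; each character consumes 1..max_merge
    adjacent segments. Returns -inf when no alignment exists.
    """
    n, m = len(segments), len(word)
    neg_inf = float("-inf")
    best = [[neg_inf] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(1, n + 1):
            for width in range(1, min(max_merge, j) + 1):
                prev = best[k - 1][j - width]
                if prev == neg_inf:
                    continue
                group = segments[j - width:j]
                cand = prev + log_confidence(scorer(group, word[k - 1]))
                if cand > best[k][j]:
                    best[k][j] = cand
    return best[m][n]


def recognize(segments: Sequence[str], lexicon: Sequence[str], scorer: Scorer,
              max_merge: int = 3) -> List[Tuple[str, float]]:
    """Rank lexicon entries by length-normalized path score, best first."""
    results = [(w, match_word(segments, w, scorer, max_merge) / len(w))
               for w in lexicon if w]
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy stand-in for a real character classifier: each character has a
# template made of named stroke fragments (purely illustrative).
TEMPLATES = {"a": ("x", "y"), "b": ("z",), "c": ("x",), "d": ("y", "w")}


def toy_scorer(group: Sequence[str], char: str) -> float:
    return 2.0 if tuple(group) == TEMPLATES.get(char) else -2.0


segments = ["x", "y", "z"]
ranking = recognize(segments, ["cd", "ab", "ba"], toy_scorer)
print(ranking[0][0])  # best-matching lexicon entry
```

Because only lexicon entries are scored, the search never proposes character strings outside the dictionary. Dividing by word length is one simple way to keep short entries from being favored; a real system would tune such weights on validation data.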
