Invention Patent Application
US20090028445A1 CHARACTER IMAGE FEATURE DICTIONARY PREPARATION APPARATUS, DOCUMENT IMAGE PROCESSING APPARATUS HAVING THE SAME, CHARACTER IMAGE FEATURE DICTIONARY PREPARATION PROGRAM, RECORDING MEDIUM ON WHICH CHARACTER IMAGE FEATURE DICTIONARY PREPARATION PROGRAM IS RECORDED, DOCUMENT IMAGE PROCESSING PROGRAM, AND RECORDING MEDIUM ON WHICH DOCUMENT IMAGE PROCESSING PROGRAM IS RECORDED (Granted)

  • Title: CHARACTER IMAGE FEATURE DICTIONARY PREPARATION APPARATUS, DOCUMENT IMAGE PROCESSING APPARATUS HAVING THE SAME, CHARACTER IMAGE FEATURE DICTIONARY PREPARATION PROGRAM, RECORDING MEDIUM ON WHICH CHARACTER IMAGE FEATURE DICTIONARY PREPARATION PROGRAM IS RECORDED, DOCUMENT IMAGE PROCESSING PROGRAM, AND RECORDING MEDIUM ON WHICH DOCUMENT IMAGE PROCESSING PROGRAM IS RECORDED
  • Application No.: US11972477
    Filing date: 2008-01-10
  • Publication No.: US20090028445A1
    Publication date: 2009-01-29
  • Inventors: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Applicants: Bo Wu, Jianjun Dou, Ning Le, Yadong Wu, Jing Jia
  • Priority: CN200710129607.X, 2007-07-23
  • Main classification: G06K9/72
  • IPC classification: G06K9/72
Abstract:
An image of a character string composed of M characters is clipped from a document image, the image is divided character by character, and image features of each character image are extracted. On the basis of these image features, N (N>1, integer) character images are selected as candidate characters, in descending order of degree of similarity, from a character image feature dictionary that stores the image features of character images in units of characters, and a first index matrix of M×N cells is prepared. A candidate character string composed of the candidate characters constituting the first column of the first index matrix is subjected to lexical analysis according to a predetermined language model, whereby a second index matrix, adjusted so that the character string makes sense, is prepared to be utilized for searching.
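
The two-stage indexing described in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the application: the 2-D feature vectors, the use of cosine similarity as the "degree of similarity", the function names `first_index_matrix` and `second_index_matrix`, and a small word set standing in for the "predetermined language model".

```python
from itertools import product
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def first_index_matrix(char_features, dictionary, n):
    """Build the M x N matrix: row i lists the n dictionary characters
    most similar to character image i, most similar first."""
    matrix = []
    for feat in char_features:
        ranked = sorted(dictionary,
                        key=lambda ch: cosine(feat, dictionary[ch]),
                        reverse=True)
        matrix.append(ranked[:n])
    return matrix


def second_index_matrix(matrix, lexicon):
    """Toy stand-in for the lexical analysis: pick one candidate per row so
    that the string is in the lexicon, preferring higher-ranked candidates,
    and move the picked candidates into the first column."""
    best = None
    for picks in product(*(range(len(row)) for row in matrix)):
        word = "".join(row[j] for row, j in zip(matrix, picks))
        if word in lexicon and (best is None or sum(picks) < sum(best)):
            best = picks
    if best is None:
        # No sensible string found: leave the matrix unchanged.
        return [row[:] for row in matrix]
    return [[row[j]] + row[:j] + row[j + 1:] for row, j in zip(matrix, best)]


# Hypothetical feature dictionary: one feature vector per character.
dictionary = {
    "l": (1.0, 0.0), "1": (0.9, 0.2),
    "o": (0.0, 1.0), "0": (0.2, 0.9),
    "g": (0.7, 0.7),
}
# Hypothetical features extracted from M = 3 clipped character images.
features = [(1.0, 0.18), (0.03, 1.0), (1.0, 0.9)]

first = first_index_matrix(features, dictionary, n=2)
second = second_index_matrix(first, lexicon={"log", "dog"})
print("".join(row[0] for row in first))   # first column before adjustment
print("".join(row[0] for row in second))  # first column after adjustment
```

With this toy data the first column of the first index matrix reads "1og", because the first character image is slightly closer to "1" than to "l"; the lexicon check then promotes "l", so the first column of the second index matrix reads "log". The exhaustive search over candidate combinations is only practical for small M and N, and a real language model would score candidate strings rather than test set membership.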