Methods and systems for decision-tree-based automated symbol recognition

    Publication number: US10068156B2

    Publication date: 2018-09-04

    Application number: US14662570

    Filing date: 2015-03-19

    Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed create and store a decision tree, the nodes of which include classifiers that each recognize the symbol that corresponds to a symbol image. Input of a symbol image to the decision tree and processing of the symbol image through one or more nodes of the decision tree returns a symbol corresponding to the symbol image.
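
    The abstract above does not say how nodes are structured or how an image moves between them. As a rough illustration only, the following Python sketch assumes that each node holds a classifier returning either a recognized symbol or a branch key for a child node. The names `Node`, `classify`, `children`, and `recognize`, and the toy stroke-count feature, are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Node:
    # Hypothetical classifier: returns (symbol, branch_key).
    # symbol is None when this node cannot decide and defers to a child.
    classify: Callable[[tuple], tuple]
    children: Dict[str, "Node"] = field(default_factory=dict)


def recognize(root, image):
    """Walk the tree until some node's classifier returns a symbol."""
    node = root
    while node is not None:
        symbol, branch = node.classify(image)
        if symbol is not None:
            return symbol
        node = node.children.get(branch)
    return None  # no node recognized the image


# Toy "images": a 1-tuple holding a stroke count (an assumed feature).
leaf_few = Node(lambda img: ("一" if img[0] == 1 else "二", None))
leaf_many = Node(lambda img: ("三", None))
root = Node(
    lambda img: (None, "few" if img[0] <= 2 else "many"),
    {"few": leaf_few, "many": leaf_many},
)

one = recognize(root, (1,))
three = recognize(root, (3,))
```

    Here the root only routes and the leaves recognize; a real system would presumably use trained classifiers at every node.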

    2. METHODS AND SYSTEMS FOR DECISION-TREE-BASED AUTOMATED SYMBOL RECOGNITION
    Type: patent application
    Status: under examination (published)

    Publication number: US20160217123A1

    Publication date: 2016-07-28

    Application number: US14662570

    Filing date: 2015-03-19

    Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed create and store a decision tree, the nodes of which include classifiers that each recognize the symbol that corresponds to a symbol image. Input of a symbol image to the decision tree and processing of the symbol image through one or more nodes of the decision tree returns a symbol corresponding to the symbol image.

    3. Detecting a junction in a text line of CJK characters
    Type: granted patent
    Status: in force

    Publication number: US08989485B2

    Publication date: 2015-03-24

    Application number: US14053208

    Filing date: 2013-10-14

    Abstract: A method for detecting a junction in a received image of the line of text to update a junction list with descriptive data is provided. The method includes creating a color histogram based on a number of color pixels in the received image of the line of text and detecting, based at least in part on the received image of the line of text, a rung within the received image of the line of text. The method also includes identifying a horizontal position of the detected rung in the received image of the line of text and identifying a gateway on the color histogram, wherein the identified gateway is associated with the detected rung. The junction list is updated with data including a description of the identified gateway.

    Methods and systems for efficient automated symbol recognition

    Publication number: US09892114B2

    Publication date: 2018-02-13

    Application number: US14508492

    Filing date: 2014-10-07

    Inventor: Yuri Chulinin

    Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify a subset of the total number of symbols frequently used in the scanned document image or images. One or more lists of graphemes for the language of the text are then ordered in most-likely-occurring to least-likely-occurring order to facilitate a second optical-character-recognition step in which symbol images extracted from the one or more scanned-document images are associated with one or more graphemes most likely to correspond to the scanned symbol image.
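
    The frequency-based ordering described in this abstract can be illustrated with a minimal Python sketch. It assumes that a quick first pass has already produced a list of grapheme labels; the function name `order_graphemes_by_frequency` and the sample data are hypothetical, and the patent's actual first-pass procedure is not reproduced here.

```python
from collections import Counter


def order_graphemes_by_frequency(first_pass_labels, all_graphemes):
    """Order graphemes from most to least frequently seen in a first pass.

    first_pass_labels: labels produced by an assumed, cheap first pass.
    all_graphemes: the full grapheme list for the language of the text.
    """
    counts = Counter(first_pass_labels)
    # sorted() is stable, so graphemes with equal counts keep their
    # original relative order; unseen graphemes count as 0 and sort last.
    return sorted(all_graphemes, key=lambda g: -counts[g])


ordered = order_graphemes_by_frequency(
    ["的", "一", "的", "是", "的", "一"],
    ["一", "是", "的", "不"],
)
print(ordered)
```

    A second recognition pass could then try graphemes in this order, so that the likeliest matches for the document are tested first.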

    Methods and systems for efficient automated symbol recognition using multiple clusters of symbol patterns

    Publication number: US09633256B2

    Publication date: 2017-04-25

    Application number: US14565782

    Filing date: 2014-12-10

    Inventor: Yuri Chulinin

    CPC classification number: G06K9/00456 G06K9/00429 G06K2209/011

    Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify, for each symbol image within a scanned document, a set of graphemes that match, with high frequency, symbol patterns that, in turn, match the symbol image. The set of graphemes identified for a symbol image is associated with the symbol image as a set of candidate graphemes for the symbol image. The set of candidate graphemes is then used, in one or more subsequent steps, to associate each symbol image with a most likely corresponding symbol code.
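
    One way to read the candidate-selection step in this abstract is as a voting scheme: each symbol pattern that matches the image votes for the graphemes linked to it, and graphemes with enough votes become candidates. The Python sketch below assumes that reading. The `pattern_to_graphemes` mapping, the `min_votes` threshold, and the pattern names are hypothetical.

```python
from collections import Counter


def candidate_graphemes(matching_patterns, pattern_to_graphemes, min_votes=2):
    """Collect graphemes linked to many of the patterns one image matched."""
    votes = Counter()
    for pattern in matching_patterns:
        # Each matching pattern casts one vote per grapheme linked to it.
        votes.update(pattern_to_graphemes.get(pattern, ()))
    # Keep graphemes with at least min_votes, most-voted first.
    return [g for g, n in votes.most_common() if n >= min_votes]


pattern_to_graphemes = {
    "p1": ["日", "目"],
    "p2": ["日", "曰"],
    "p3": ["日", "目"],
}
candidates = candidate_graphemes(["p1", "p2", "p3"], pattern_to_graphemes)
print(candidates)
```

    Later steps would then compare the symbol image only against this short candidate list rather than the full grapheme set.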

    6. METHOD AND SYSTEM FOR OPTICAL CHARACTER RECOGNITION THAT SHORT CIRCUIT PROCESSING FOR NON-CHARACTER CONTAINING CANDIDATE SYMBOL IMAGES
    Type: patent application
    Status: under examination (published)

    Publication number: US20160048728A1

    Publication date: 2016-02-18

    Application number: US14568814

    Filing date: 2014-12-12

    Inventor: Yuri Chulinin

    CPC classification number: G06K9/00456 G06K2209/011 G06K2209/013

    Abstract: The current document is directed to methods and systems for identifying Chinese, Japanese, Korean, or similar language symbols that correspond to symbol images in a scanned-document image or other text-containing image. In a first processing phase, each symbol image is associated with a set of candidate graphemes. In a second processing phase, each symbol image is evaluated with respect to the set of candidate graphemes identified for the symbol image during the first phase. As candidate graphemes are processed, the currently described methods and systems monitor progress towards identifying a matching grapheme and, when insufficient progress is observed, terminate processing of the candidate graphemes and identify the symbol image as a non-symbol-containing area of the scanned-document image or other text-containing image.
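
    The early-termination idea in this abstract can be sketched as a loop that gives up when candidates stop improving. The Python example below is an assumption-laden illustration: the `accept` threshold, the `patience` count, and the use of precomputed similarity scores are all hypothetical stand-ins for whatever progress measure the patent actually uses.

```python
def match_with_short_circuit(scores, accept=0.9, patience=3):
    """Return the index of the first candidate scoring at least `accept`.

    scores: similarity of one symbol image to each candidate grapheme, in
    evaluation order (a real system would compute these lazily).
    Returns None, meaning "treat the area as non-symbol", when `patience`
    consecutive candidates fail to beat the best score seen so far, or
    when the candidates run out.
    """
    best = float("-inf")
    stalled = 0
    for i, score in enumerate(scores):
        if score >= accept:
            return i
        if score > best:
            best, stalled = score, 0
        else:
            stalled += 1
            if stalled >= patience:
                return None  # insufficient progress: stop early
    return None


found = match_with_short_circuit([0.2, 0.5, 0.95])
gave_up = match_with_short_circuit([0.3, 0.2, 0.1, 0.1, 0.99])
```

    In the second call the loop stops after the fourth score, so the strong fifth candidate is never evaluated; that trade-off is the cost of short-circuiting.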

    7. METHODS AND SYSTEMS FOR EFFICIENT AUTOMATED SYMBOL RECOGNITION
    Type: patent application
    Status: in force

    Publication number: US20150213330A1

    Publication date: 2015-07-30

    Application number: US14508492

    Filing date: 2014-10-07

    Inventor: Yuri Chulinin

    Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify a subset of the total number of symbols frequently used in the scanned document image or images. One or more lists of graphemes for the language of the text are then ordered in most-likely-occurring to least-likely-occurring order to facilitate a second optical-character-recognition step in which symbol images extracted from the one or more scanned-document images are associated with one or more graphemes most likely to correspond to the scanned symbol image.

    8. METHODS AND SYSTEMS FOR EFFICIENT AUTOMATED SYMBOL RECOGNITION USING MULTIPLE CLUSTERS OF SYMBOL PATTERNS
    Type: patent application
    Status: in force

    Publication number: US20150213313A1

    Publication date: 2015-07-30

    Application number: US14565782

    Filing date: 2014-12-10

    Inventor: Yuri Chulinin

    CPC classification number: G06K9/00456 G06K9/00429 G06K2209/011

    Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify, for each symbol image within a scanned document, a set of graphemes that match, with high frequency, symbol patterns that, in turn, match the symbol image. The set of graphemes identified for a symbol image is associated with the symbol image as a set of candidate graphemes for the symbol image. The set of candidate graphemes is then used, in one or more subsequent steps, to associate each symbol image with a most likely corresponding symbol code.
