Method and system for optical character recognition (OCR) of multi-language content

Invention Grant

US10460192B2 Method and system for optical character recognition (OCR) of multi-language content (Active)

Patent Title: Method and system for optical character recognition (OCR) of multi-language content
Application No.: US15299717

Application Date: 2016-10-21
Publication No.: US10460192B2

Publication Date: 2019-10-29
Inventor: Sainarayanan Gopalakrishnan, Rajasekar Kanagasabai, Sudhagar Subbaian
Applicant: XEROX CORPORATION
Applicant Address: US CT Norwalk
Assignee: XEROX Corporation
Current Assignee: XEROX Corporation
Current Assignee Address: US CT Norwalk
Agency: Jones Robb, PLLC
Main IPC: G06K9/34
IPC: G06K9/34; G06K9/00; H04N1/04

Abstract:

A method and system are provided for optical character recognition (OCR) of multi-language content. The method includes extracting a text portion from an image received from a user-computing device. The text portion comprises a plurality of keywords associated with a plurality of languages. The method further includes segmenting the plurality of keywords into a plurality of layers. Each layer of the plurality of layers comprises one or more keywords which are associated with a language. The method further comprises generating an OCR output of each of the plurality of layers based on the language associated with the one or more keywords in each of the plurality of layers. The method further comprises generating an electronic document of the received image based on the generated OCR output of each of the plurality of layers. The method further includes transmitting the generated electronic document to the user-computing device.
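
The layering step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the keywords have already been extracted as text strings (the patent works on image regions), it uses the Unicode script of each keyword as a stand-in for its language, and `detect_script`, `segment_into_layers`, `ocr_layer`, and `build_document` are hypothetical names introduced only for this example. The OCR pass itself is a placeholder that returns its input unchanged.

```python
import unicodedata
from collections import defaultdict


def detect_script(keyword):
    """Return a coarse script label (e.g. 'LATIN', 'TAMIL') for a keyword.

    Assumption: the first word of the Unicode name of the first alphabetic
    character is a good-enough proxy for the keyword's language.
    """
    for ch in keyword:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                return name.split(" ")[0]
    return "UNKNOWN"


def segment_into_layers(keywords):
    """Group keywords into one layer per script, remembering each position."""
    layers = defaultdict(list)
    for position, keyword in enumerate(keywords):
        layers[detect_script(keyword)].append((position, keyword))
    return dict(layers)


def ocr_layer(script, layer):
    """Hypothetical stand-in for a language-specific OCR pass.

    A real system would run an OCR engine configured for `script` on the
    layer's image regions; here the text is passed through unchanged.
    """
    return list(layer)


def build_document(layers):
    """Merge the per-layer OCR outputs back into the original reading order."""
    recognized = []
    for script, layer in layers.items():
        recognized.extend(ocr_layer(script, layer))
    return " ".join(word for _, word in sorted(recognized))


if __name__ == "__main__":
    keywords = ["Hello", "नमस्ते", "world", "வணக்கம்"]
    layers = segment_into_layers(keywords)
    print(sorted(layers))          # one layer per detected script
    print(build_document(layers))  # keywords reassembled in original order
```

Keeping each keyword's position alongside its layer is one simple way to let the per-language outputs be recombined into a single electronic document; a script-based split cannot distinguish languages that share a script (for example English and French), so a real system would need a stronger language detector.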

Public/Granted literature

US20180114085A1 METHOD AND SYSTEM FOR OPTICAL CHARACTER RECOGNITION (OCR) OF MULTI-LANGUAGE CONTENT Public/Granted day: 2018-04-26
