Invention Grant
- Patent Title: Method and system for optical character recognition (OCR) of multi-language content
- Application No.: US15299717
- Application Date: 2016-10-21
- Publication No.: US10460192B2
- Publication Date: 2019-10-29
- Inventor: Sainarayanan Gopalakrishnan, Rajasekar Kanagasabai, Sudhagar Subbaian
- Applicant: XEROX CORPORATION
- Applicant Address: US CT Norwalk
- Assignee: XEROX Corporation
- Current Assignee: XEROX Corporation
- Current Assignee Address: US CT Norwalk
- Agency: Jones Robb, PLLC
- Main IPC: G06K9/34
- IPC: G06K9/34 ; G06K9/00 ; H04N1/04

Abstract:
A method and system are provided for optical character recognition (OCR) of multi-language content. The method includes extracting a text portion from an image received from a user-computing device. The text portion comprises a plurality of keywords associated with a plurality of languages. The method further includes segmenting the plurality of keywords into a plurality of layers. Each layer of the plurality of layers comprises one or more keywords which are associated with a language. The method further comprises generating an OCR output of each of the plurality of layers based on the language associated with the one or more keywords in each of the plurality of layers. The method further comprises generating an electronic document of the received image based on the generated OCR output of each of the plurality of layers. The method further includes transmitting the generated electronic document to the user-computing device.
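The flow described in the abstract (segment keywords into per-language layers, run OCR on each layer with that layer's language, then merge the outputs into one document) can be illustrated with a minimal Python sketch. This is not the patented implementation. The `Keyword` record, its `box` and `language` fields, the function names, and the `engines` mapping of per-language recognizers are all hypothetical, and the sketch assumes that a language label has already been assigned to each keyword region by some earlier step that the abstract does not detail. The top-to-bottom, left-to-right merge order is likewise an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass(frozen=True)
class Keyword:
    """Hypothetical record for one word-level region of the input image."""
    box: Box
    language: str  # assumed to come from a prior language-identification step


def segment_into_layers(keywords: List[Keyword]) -> Dict[str, List[Keyword]]:
    """Group keywords so that each layer holds the keywords of one language."""
    layers: Dict[str, List[Keyword]] = defaultdict(list)
    for kw in keywords:
        layers[kw.language].append(kw)
    return dict(layers)


def ocr_layers(
    layers: Dict[str, List[Keyword]],
    engines: Dict[str, Callable[[Keyword], str]],
) -> List[Tuple[Box, str]]:
    """Recognize every keyword with the engine registered for its layer's language.

    `engines` is a hypothetical mapping from language label to a recognizer;
    a KeyError is raised if a layer has no matching engine.
    """
    results: List[Tuple[Box, str]] = []
    for language, layer in layers.items():
        recognize = engines[language]
        for kw in layer:
            results.append((kw.box, recognize(kw)))
    return results


def assemble_document(results: List[Tuple[Box, str]]) -> str:
    """Merge per-layer outputs in an assumed reading order: by y, then by x."""
    ordered = sorted(results, key=lambda item: (item[0][1], item[0][0]))
    return " ".join(text for _, text in ordered)
```

Passing the recognizers in as plain callables keeps the sketch independent of any particular OCR library; in practice each callable would wrap an engine configured for one language.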
Public/Granted literature
- US20180114085A1 METHOD AND SYSTEM FOR OPTICAL CHARACTER RECOGNITION (OCR) OF MULTI-LANGUAGE CONTENT Public/Granted day: 2018-04-26