-
Publication number: US09852348B2
Publication date: 2017-12-26
Application number: US14690274
Application date: 2015-04-17
Applicant: Google Inc.
Inventor: Krishnendu Chaudhury, Lu Chen, David Petrou, Blaise Aguera-Arcas
IPC: G06K9/18, G06K9/00, G06K9/32, G06K9/34, G06K9/46, G06K9/62, G06T3/40, G06T11/60, G06K9/36
CPC classification number: G06K9/18, G06K9/00463, G06K9/00483, G06K9/3275, G06K9/342, G06K9/4642, G06K9/6211, G06K2009/363, G06T3/4038, G06T11/60
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, to generate a scannable document. In one aspect, a method includes receiving a scan request, wherein the scan request includes a plurality of text images; for each text image of the plurality of text images: rectifying the text image to generate a text image with parallel image lines, generating a plurality of word bounding boxes that enclose one or more connected components in the text image, wherein each word bounding box is associated with a respective word, and generating, for each respective word in the text image, a plurality of points that represent the respective word; combining the plurality of text images to form a single text document; and providing the combined image as a scannable document.
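The abstract does not say how the word bounding boxes or the per-word points are computed. As a minimal sketch only, assuming an already rectified binary image (1 = ink), 4-connectivity, a simple horizontal-gap rule for grouping components into words, and box corners plus center as the "plurality of points", the per-image steps might look like the following. The function names, the `max_gap` parameter, and the choice of points are hypothetical and are not taken from the patent; rectification and the final combining step are omitted.

```python
from collections import deque


def component_boxes(img):
    """Return (top, left, bottom, right) boxes of 4-connected ink components."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def word_boxes(boxes, max_gap=1):
    """Merge component boxes that overlap vertically and are separated by
    at most max_gap empty columns (an assumed grouping rule)."""
    words = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left to right
        for i, word in enumerate(words):
            overlaps = box[0] <= word[2] and word[0] <= box[2]
            gap = box[1] - word[3] - 1  # empty columns between word and box
            if overlaps and gap <= max_gap:
                words[i] = (min(word[0], box[0]), min(word[1], box[1]),
                            max(word[2], box[2]), max(word[3], box[3]))
                break
        else:
            words.append(box)
    return words


def word_points(box):
    """Represent a word by its four box corners and the box center (assumed)."""
    top, left, bottom, right = box
    return [(top, left), (top, right), (bottom, left), (bottom, right),
            ((top + bottom) / 2, (left + right) / 2)]


# Tiny example: three components, the first two close enough to form one word.
IMAGE = [
    [1, 1, 0, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
]
WORDS = word_boxes(component_boxes(IMAGE))
POINTS = [word_points(b) for b in WORDS]
```

Points of this kind could plausibly serve as features for aligning overlapping text images before they are combined, but that use is an interpretation of the abstract rather than something it states.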
-
Publication number: US20160307059A1
Publication date: 2016-10-20
Application number: US14690274
Application date: 2015-04-17
Applicant: Google Inc.
Inventor: Krishnendu Chaudhury, Lu Chen, David Petrou, Blaise Aguera-Arcas
CPC classification number: G06K9/18, G06K9/00463, G06K9/00483, G06K9/3275, G06K9/342, G06K9/4642, G06K9/6211, G06K2009/363, G06T3/4038, G06T11/60
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, to generate a scannable document. In one aspect, a method includes receiving a scan request, wherein the scan request includes a plurality of text images; for each text image of the plurality of text images: rectifying the text image to generate a text image with parallel image lines, generating a plurality of word bounding boxes that enclose one or more connected components in the text image, wherein each word bounding box is associated with a respective word, and generating, for each respective word in the text image, a plurality of points that represent the respective word; combining the plurality of text images to form a single text document; and providing the combined image as a scannable document.
-