-
Publication No.: US11308354B1
Publication Date: 2022-04-19
Application No.: US16834997
Filing Date: 2020-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Ron Litman, Oron Anschel, Shahar Tsiper, Roee Litman, Shai Mazor, Jonathan Wu, Raghavan Manmatha
Abstract: Techniques for recognizing text in an image are described. An exemplary method may include receiving a request to recognize text in an image; extracting features from the image and generating a visual feature sequence from the extracted features; performing selective contextual refinement using at least one selective contextual refinement block of a stack of selective contextual refinement blocks to generate a text prediction by: generating a contextual feature map and combining the contextual feature map with the visual feature sequence into a visual feature space, and applying a selective decoder that utilizes a two-step attention on the visual feature space to generate a text prediction, wherein the two-step attention includes performing a 1-D self-attention computation to generate attentional features and decoding the attentional features to generate the text prediction; and outputting the generated text prediction.
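Illustrative sketch (not part of the patent text): the two-step attention named in the abstract can be pictured as a 1-D self-attention pass over the feature sequence followed by a per-position decoding step. The Python below is a minimal sketch under stated assumptions: plain scaled dot-product self-attention stands in for the 1-D self-attention computation, a greedy dot-product classifier stands in for the decoder, and the names `self_attention_1d`, `decode`, `class_vectors`, and `alphabet` are hypothetical and do not come from the patent.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_1d(seq):
    """Step 1 (assumed form): scaled dot-product self-attention along the
    sequence axis, with queries = keys = values = seq."""
    d = len(seq[0])
    attended = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)]
        )
    return attended


def decode(attended, class_vectors, alphabet):
    """Step 2 (assumed form): at each position, pick the character whose
    class vector has the highest dot product with the attended feature."""
    chars = []
    for feat in attended:
        scores = [sum(a * b for a, b in zip(feat, c)) for c in class_vectors]
        chars.append(alphabet[scores.index(max(scores))])
    return "".join(chars)
```

For example, `decode(self_attention_1d([[1.0, 0.0], [0.0, 1.0]]), [[1.0, 0.0], [0.0, 1.0]], "ab")` attends over a two-step toy sequence and then decodes one character per position. A trained system would learn the projections and class vectors rather than use fixed toy values.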
-
Publication No.: US11341605B1
Publication Date: 2022-05-24
Application No.: US16588503
Filing Date: 2019-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Kunwar Yashraj Singh, Amit Adam, Shahar Tsiper, Gal Sabina Star, Roee Litman, Hadar Averbuch Elor, Vijay Mahadevan, Rahul Bhotika, Shai Mazor, Mohammed El Hamalawi
Abstract: Techniques for document rectification via homography recovery using machine learning are described. An image rectification system can intelligently make use of multiple pipelines for rectifying document images based on the detected type of device that generated the images. The image rectification system can provide high-quality rectifications without requiring human cooperation or multiple views of the document in multiple images, and/or without being constrained to processing images from only one source context.
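Illustrative sketch (not part of the patent text): once a homography has been recovered, rectification amounts to mapping image coordinates through a 3x3 matrix followed by a projective division. The Python below is a minimal sketch of that mapping only; it assumes the matrix `H` is already known (the abstract does not detail how the machine-learning pipelines estimate it), and the function name `apply_homography` is hypothetical.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography H (row-major nested lists).

    The point is lifted to homogeneous coordinates (x, y, 1), multiplied by H,
    and divided by the resulting third coordinate.
    """
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)


def rectify_corners(H, corners):
    """Apply the same homography to each document corner."""
    return [apply_homography(H, c) for c in corners]
```

Calling `rectify_corners(H, corners)` with the four detected corners of a document gives their positions in the rectified view. A full rectifier would also resample pixel values, which this sketch omits.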
-