Patent search ap:("Amazon Technologies Page Inc.") AND inv:"David Paul Ramos"

1.

发明授权
Language agnostic phonetic entity resolution 有权

公开(公告)号：US11157696B1

公开(公告)日：2021-10-26

申请号：US16017313

申请日：2018-06-25

Applicant: Amazon Technologies, Inc.

Inventor： David Paul Ramos , Ameya Ashok Limaye

IPC: G06F40/295 , G10L15/22 , G10L15/02 , G10L15/18

Abstract: Techniques for performing entity resolution as part of natural language understanding processing are described. During offline operations, a system may convert text (representing entities known to the system) into audio of various languages. The languages into which the text is converted may depend on the location where the entity is likely to be spoken by users of the system. At runtime, the system processes a user input using text-based entity resolution. If text-based entity resolution fails, the system may identify user speech corresponding to an entity to be resolved, and attempt to phonetically match the user speech to the audio of the known entities. Results of the phonetic entity resolution may then used by downstream components, such as skills.

2.

发明授权
Recognizing text from frames of image data using contextual information 有权
Title translation: 使用上下文信息识别来自图像数据帧的文本

公开(公告)号：US09355336B1

公开(公告)日：2016-05-31

申请号：US14259905

申请日：2014-04-23

Applicant: Amazon Technologies, Inc.

Inventor： Sonjeev Jahagirdar , Matthew Joseph Cole , David Paul Ramos , Utkarsh Prateek , Emilie Noelle McConville , Ankur Datta , Laura Varnum Finney , Yue Liu , Bhavesh Anil Doshi , Avnish Sikka , Michael Vanne

IPC: G06K9/00 , G06K9/62

CPC classification number: G06K9/6217 , G06K9/00979 , G06K9/723 , G06K2209/01

Abstract: Disclosed are techniques for recognizing text from one or more frames of image data using contextual information. In some implementations, image data including a captured textual item is processed to identify an entity in the image data. A context can be selected using the entity, where the context corresponds to a dictionary. Text in the captured textual item can be identified using the dictionary. The identified text can be output to a display device.

Abstract translation: 公开了使用上下文信息从一个或多个图像数据帧识别文本的技术。在一些实现中，处理包括捕获的文本项的图像数据以识别图像数据中的实体。可以使用实体选择上下文，其中上下文对应于字典。捕获的文本项目中的文本可以使用字典来识别。识别的文本可以输出到显示设备。

3.

发明授权
Text recognition near an edge 有权
Title translation: 靠近边缘的文本识别

公开(公告)号：US09239961B1

公开(公告)日：2016-01-19

申请号：US14495589

申请日：2014-09-24

Applicant: Amazon Technologies, Inc.

Inventor： Matthew Joseph Cole , Yue Liu , David Paul Ramos , Avnish Sikka

IPC: G06K9/00 , G06K9/18 , G06K9/32

CPC classification number: G06K9/00456 , G06K9/2081 , G06K9/325

Abstract: The recognition of text in an acquired image is improved by using general and type-specific heuristics that can determine the likelihood that a portion of the text is truncated at an edge of an image, frame, or screen. Truncated text can be filtered such that the user is not provided with an option to perform an undesirable task, such as to dial an incorrect number or connect to an incorrect Web address, based on recognizing an incomplete text string. The general and type-specific heuristics can be combined to improve confidence, and the image data can be pre-processed on the device before processing with an optical character recognition (OCR) engine. Multiple frames can be analyzed to attempt to recognize words or characters that might have been truncated in one or more of the frames.

Abstract translation: 通过使用可以确定文本的一部分在图像，帧或屏幕的边缘被截断的可能性的一般和类型特定的启发式算法来改进获取的图像中的文本的识别。截断的文本可以被过滤，以便基于识别不完整的文本字符串，用户未被提供执行不期望的任务的选项，例如拨打不正确的号码或连接到不正确的Web地址。一般和类型特定的启发式可以组合以提高置信度，并且可以在使用光学字符识别（OCR）引擎处理之前在设备上预处理图像数据。可以分析多个帧以尝试识别可能在一个或多个帧中被截断的字或字符。

4.

发明授权
Multi-stage entity resolution 有权

公开(公告)号：US11947913B1

公开(公告)日：2024-04-02

申请号：US17356885

申请日：2021-06-24

Applicant: Amazon Technologies, Inc.

Inventor： David Paul Ramos , Tonytip Ketudat , Vikas Chawla , Lukas Leon Brower

IPC: G06F40/295 , G10L13/08 , G10L15/183 , G10L15/22

CPC classification number: G06F40/295 , G10L13/08 , G10L15/183 , G10L15/22 , G10L2015/223

Abstract: Techniques for performing multi-stage entity resolution (ER) processing are described. A system may determine a portion of a user input corresponding to an entity name, and may request an entity provider component to perform a search to determine one or more entities corresponding to the entity name. The preliminary search results may be sent to a skill selection component for processing, while the entity provider component performs a complete search to determine entities corresponding to the entity name. A selected skill component may request the complete search results to perform its processing, including determining an output responsive to the user input.

5.

发明授权
Providing additional information for text in an image 有权

公开(公告)号：US10216989B1

公开(公告)日：2019-02-26

申请号：US14884068

申请日：2015-10-15

Applicant: Amazon Technologies, Inc.

Inventor： David Paul Ramos , Matthew Joseph Cole , Matthew Daniel Hart

IPC: G06K9/72 , G06K9/00 , G06F17/30

Abstract: Disclosed are techniques for providing additional information for text in an image. In some implementations, a computing device receives an image including text. Optical character recognition (OCR) is performed on the image to produce recognized text. A word or a phrase is selected from the recognized text for providing additional information. One or more potential meanings of the selected word or phrase are determined. One of the potential meanings is selected based on other text in the image. A source of additional information corresponding to the selected meaning is selected for providing the additional information to a user's device.

6.

发明授权
Text detection using features associated with neighboring glyph pairs 有权
Title translation: 使用与相邻字形对相关联的功能的文本检测

公开(公告)号：US09367736B1

公开(公告)日：2016-06-14

申请号：US14842125

申请日：2015-09-01

Applicant: Amazon Technologies, Inc.

Inventor： Thibaud Senechal , Quan Wang , Daniel Makoto Willenson , Shuang Wu , Yue Liu , Shiv Naga Prasad Vitaladevuni , David Paul Ramos , Qingfeng Yu

IPC: G06K9/46 , G06K9/00 , G06K9/34

CPC classification number: G06K9/00463 , G06K9/00442 , G06K9/00456 , G06K9/344 , G06K9/348 , G06K9/4638 , G06K9/4652 , G06K2209/01

Abstract: A multi-orientation text detection method and associated system is disclosed that utilizes orientation-variant glyph features to determine a text line in an image regardless of an orientation of the text line. Glyph features are determined for each glyph in an image with respect to a neighboring glyph. The glyph features are provided to a learned classifier that outputs a glyph pair score for each neighboring glyph pair. Each glyph pair score indicates a likelihood that the corresponding pair of neighboring glyphs form part of a same text line. The glyph pair scores are used to identify candidate text lines, which are then ranked to select a final set of text lines in the image.

Abstract translation: 公开了一种多方向文本检测方法和相关系统，其利用取向变体字形特征来确定图像中的文本行，而不管文本行的取向如何。为相对于相邻字形的图像中的每个字形确定字形特征。字形特征被提供给学习的分类器，其为每个相邻字形对输出字形对分数。每个字形对得分表示对应的相邻字形对形成相同文本行的一部分的可能性。字形对分数用于识别候选文本行，然后将其排序以选择图像中的最后一组文本行。

7.

发明授权
Optimizing pre-processing times for faster response 有权
Title translation: 优化预处理时间以加快响应速度

公开(公告)号：US09262689B1

公开(公告)日：2016-02-16

申请号：US14133347

申请日：2013-12-18

Applicant: Amazon Technologies, Inc.

Inventor： Avnish Sikka , David Paul Ramos , Matthew Daniel Hart , Yue Liu , Emilie Noelle McConville

IPC: G06K9/00 , G06K9/46 , G06K9/34

CPC classification number: G06K9/34 , G06K9/325 , G06K2209/01

Abstract: Embodiments of the subject technology provide for determining a region of a first acquired image based at least on a viewing mode and a set of respective positions of graphical elements to decrease the pre-processing time and perceived latency for the first image. One or more regions of text in the first image are detected, and a set of regions of text that overlap with the region of the image is determined and pre-processed. The subject technology may then pre-process an entirety of a subsequent image (e.g., to pick up missing text from the region of the first image). Thus, additional OCR results may be provided to the user by using the subsequent image(s) and merging subsequent results with previous results from the first image.

Abstract translation: 本技术的实施例提供了至少基于观看模式和图形元素的各个位置的集合来确定第一获取图像的区域，以减少第一图像的预处理时间和感知等待时间。检测第一图像中的一个或多个文本区域，并且确定并预处理与图像的区域重叠的一组文本区域。主题技术可以预处理后续图像的整体（例如，从第一图像的区域拾取丢失的文本）。因此，可以通过使用后续图像向用户提供附加的OCR结果，并将后续结果与来自第一图像的先前结果合并。

8.

发明授权
Merging optical character recognized text from frames of image data 有权

公开(公告)号：US09659224B1

公开(公告)日：2017-05-23

申请号：US14230471

申请日：2014-03-31

Applicant: Amazon Technologies, Inc.

Inventor： Matthew Joseph Cole , Sonjeev Jahagirdar , Matthew Daniel Hart , David Paul Ramos , Ankur Datta , Utkarsh Prateek , Emilie Noelle McConville , Prashant Hegde , Avnish Sikka

IPC: G06K9/18 , G06K9/00

CPC classification number: G06K9/18 , G06K9/00979 , G06K9/6292 , G06K9/72 , G06K2209/01 , G06K9/00449 , G06K9/00463 , G06K9/00442

Abstract: Disclosed are techniques for merging optical character recognized (OCR'd) text from frames of image data. In some implementations, a device sends frames of image data to a server, where each frame includes at least a portion of a captured textual item. The server performs optical character recognition (OCR) on the image data of each frame. When OCR'd text from respective frames is returned to the device from the server, the device can perform matching operations on the text, for instance, using bounding boxes and/or edit distance processing. The device can merge any identified matches of OCR'd text from different frames. The device can then display the merged text with any corrections.

9.

发明授权
Hybrid optical character recognition 有权
Title translation: 混合光学字符识别

公开(公告)号：US09305227B1

公开(公告)日：2016-04-05

申请号：US14139752

申请日：2013-12-23

Applicant: Amazon Technologies, Inc.

Inventor： Rakesh Madhavan Nambiar , Sonjeev Jahagirdar , Matthew Joseph Cole , Matias Omar Gregorio Benitez , Junxiong Jia , David Paul Ramos

IPC: G06K9/18

CPC classification number: G06K9/18 , G06K9/00979 , G06K9/6292 , G06K2209/01

Abstract: Embodiments of the subject technology provide for a hybrid OCR approach which combines server and device side processing that can offset disadvantages of performing OCR solely on the server side or the device side. More specifically, the subject technology utilizes image characteristics such as glyph details and image quality measurements to opportunistically schedule OCR processing on the mobile device and/or server. In this regard, text extracted by a “faster” OCR engine (e.g., one with less latency) is displayed to a user, which is then updated by the result of a more accurate OCR engine (e.g., an OCR engine provided by the server). This approach allows factoring in additional parameters such as network latency and user preference for making scheduling decisions. Thus, the subject technology may provide significant gains in terms of reduced latency and increased accuracy by implementing one or more techniques associated with this hybrid OCR approach.

Abstract translation: 本技术的实施例提供了一种组合服务器和设备侧处理的混合OCR方法，其可以抵消仅在服务器侧或设备侧执行OCR的缺点。更具体地，本主题技术利用诸如字形细节和图像质量测量的图像特征来机会地在移动设备和/或服务器上调度OCR处理。在这方面，由“更快的”OCR引擎提取的文本（例如，具有较小延迟的引擎）被显示给用户，然后由更准确的OCR引擎（例如，由服务器提供的OCR引擎）的结果来更新）。这种方法允许考虑附加参数，例如网络延迟和用户偏好以进行调度决策。因此，本技术可以通过实施与该混合OCR方法相关联的一种或多种技术在减少的延迟和增加的准确性方面提供显着的增益。

10.

发明授权
Text detection near display screen edge 有权
Title translation: 文本检测附近显示屏边缘

公开(公告)号：US09286683B1

公开(公告)日：2016-03-15

申请号：US13865114

申请日：2013-04-17

Applicant: Amazon Technologies, Inc.

Inventor： David Paul Ramos

IPC: G06T7/00

CPC classification number: G06T7/004 , G06K9/00979 , G06K9/03 , G06K9/228 , G06K9/325

Abstract: Approaches to enable a computing device, such as a phone or tablet computer, to detect when text contained in an image captured by the camera is sufficiently close to the edge of the screen and to infer whether the text is likely to be cut off by the edge of the screen such that the text contained in the image is incomplete. If the incomplete text corresponds to actionable text associated with a function that can be invoked on the computing device, the computing device may wait until the remaining portion of the actionable text is captured by the camera and made available for processing before invoking the corresponding function on the computing device.

Abstract translation: 使诸如手机或平板电脑之类的计算设备能够检测包含在由相机拍摄的图像中的文本何时足够接近屏幕边缘并且推断文本是否可能被截断的方法屏幕边缘使图像中包含的文本不完整。如果不完整的文本对应于与可以在计算设备上调用的功能相关联的可操作的文本，则计算设备可以等待直到可执行文本的剩余部分被相机捕获并且在调用相应的功能之前可用于处理计算设备。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification