Patent search ap:("AMAZON TECHNOLOGIES Page INC.") AND inv:"Bjorn Hoffmeister"

41.

发明申请
DEEP MULTI-CHANNEL ACOUSTIC MODELING 审中-公开

公开(公告)号：US20200349928A1

公开(公告)日：2020-11-05

申请号：US16932049

申请日：2020-07-17

Applicant: Amazon Technologies, Inc.

Inventor： Arindam Mandal , Kenichi Kumatani , Nikko Strom , Minhua Wu , Shiva Sundaram , Bjorn Hoffmeister , Jeremie Lecomte

IPC: G10L15/16 , G10L15/22 , G10L15/30 , G06N3/08 , H04R3/00 , G10L15/06 , H04R1/40

Abstract: Techniques for speech processing using a deep neural network (DNN) based acoustic model front-end are described. A new modeling approach directly models multi-channel audio data received from a microphone array using a first model (e.g., multi-channel DNN) that takes in raw signals and produces a first feature vector that may be used similarly to beamformed features generated by an acoustic beamformer. A second model (e.g., feature extraction DNN) processes the first feature vector and transforms it to a second feature vector having a lower dimensional representation. A third model (e.g., classification DNN) processes the second feature vector to perform acoustic unit classification and generate text data. These three models may be jointly optimized for speech processing (as opposed to individually optimized for signal enhancement), enabling improved performance despite a reduction in microphones and a reduction in bandwidth consumption during real-time processing.

42.

发明授权
Stochastic modeling of user interactions with a detection system 有权

公开(公告)号：US09899021B1

公开(公告)日：2018-02-20

申请号：US14136712

申请日：2013-12-20

Applicant: Amazon Technologies, Inc.

Inventor： Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister , Rohit Prasad

IPC: G10L15/00 , G10L15/14

CPC classification number: G10L15/142 , G10L15/14 , G10L15/22 , G10L2015/088

Abstract: Features are disclosed for modeling user interaction with a detection system using a stochastic dynamical model in order to determine or adjust detection thresholds. The model may incorporate numerous features, such as the probability of false rejection and false acceptance of a user utterance and the cost associated with each potential action. The model may determine or adjust detection thresholds so as to minimize the occurrence of false acceptances and false rejections while preserving other desirable characteristics. The model may further incorporate background and speaker statistics. Adjustments to the model or other operation parameters can be implemented based on the model, user statistics, and/or additional data.

43.

发明授权
Low latency and memory efficient keyword spotting 有权

公开(公告)号：US09852729B2

公开(公告)日：2017-12-26

申请号：US15207183

申请日：2016-07-11

Applicant: Amazon Technologies, Inc.

Inventor： Bjorn Hoffmeister

IPC: G10L15/00 , G10L15/04 , G10L15/28 , G10L21/00 , G10L25/00 , G10L21/06 , G10L15/14 , G10L15/02 , G10L15/22 , G10L15/08

CPC classification number: G10L15/02 , G10L15/08 , G10L15/142 , G10L15/22 , G10L2015/088 , G10L2015/223

Abstract: Features are disclosed for spotting keywords in utterance audio data without requiring the entire utterance to first be processed. Likelihoods that a portion of the utterance audio data corresponds to the keyword may be compared to likelihoods that the portion corresponds to background audio (e.g., general speech and/or non-speech sounds). The difference in the likelihoods may be determined, and keyword may be triggered when the difference exceeds a threshold, or shortly thereafter. Traceback information and other data may be stored during the process so that a second speech processing pass may be performed. For efficient management of system memory, traceback information may only be stored for those frames that may encompass a keyword; the traceback information for older frames may be overwritten by traceback information for newer frames.

44.

发明授权
Security measures for an electronic device 有权

公开(公告)号：US09706406B1

公开(公告)日：2017-07-11

申请号：US13747245

申请日：2013-01-22

Applicant: Amazon Technologies, Inc.

Inventor： Jeffrey Penrod Adams , Ryan Paul Thomas , Bjorn Hoffmeister

IPC: H04W12/06 , H04W12/08

CPC classification number: H04W12/08 , G06F21/32 , H04L63/0861 , H04L63/102 , H04W12/06

Abstract: Approaches are described for detecting when an electronic device (such as a mobile phone) has been stolen or is otherwise being used by someone other than an authorized user of the device. At least one sensor of the device can obtain data during a current use of the device, and the device can determine from the data a set of available features. The features can be compared to a corresponding model associated with an owner (or other authorized user) of the device to generate a confidence value indicative of whether the current user operating the device is likely the owner of the device. The confidence value can be compared to at least one confidence threshold, for example, and based on the comparison, the current user can be provided access to at least a portion of functionality of the device and/or a security action can be performed when the confidence value does not at least meet at least one confidence threshold.

45.

发明授权
Keyword detection modeling using contextual and environmental information 有权

公开(公告)号：US09697828B1

公开(公告)日：2017-07-04

申请号：US14311163

申请日：2014-06-20

Applicant: Amazon Technologies, Inc.

Inventor： Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister

IPC: G10L15/18

CPC classification number: G10L15/18 , G10L15/08 , G10L15/30 , G10L2015/088

Abstract: Features are disclosed for detecting words in audio using environmental information and/or contextual information in addition to acoustic features associated with the words to be detected. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based usage patterns associated with the users.

46.

发明授权
Qualifying trigger expressions in speech-based systems 有权

公开(公告)号：US09672812B1

公开(公告)日：2017-06-06

申请号：US14030917

申请日：2013-09-18

Applicant: Amazon Technologies, Inc.

Inventor： Yuzo Watanabe , Paul Joseph Schaffert , Bjorn Hoffmeister , Stan Weidner Salvador

IPC: G10L15/00 , G10L15/04

CPC classification number: G10L15/22 , G10L2015/223

Abstract: A speech-based audio device may be configured to detect a user-uttered trigger expression and to respond by interpreting subsequent words or phrases as commands. In order to distinguish between utterance of the trigger expression by the user and generation of the trigger expression by the device itself, output signals used as speaker inputs are analyzed to detect whether the trigger expression has been generated by the speaker. If a detected trigger expression has been generated by the speaker, it is disqualified. Disqualified trigger expressions are not acted upon the by the audio device.

47.

发明申请
LOW LATENCY AND MEMORY EFFICIENT KEYWORK SPOTTING 审中-公开

公开(公告)号：US20170098442A1

公开(公告)日：2017-04-06

申请号：US15207183

申请日：2016-07-11

Applicant: Amazon Technologies, Inc.

Inventor： Bjorn Hoffmeister

IPC: G10L15/02 , G10L15/22 , G10L15/14

CPC classification number: G10L15/02 , G10L15/08 , G10L15/142 , G10L15/22 , G10L2015/088 , G10L2015/223

Abstract: Features are disclosed for spotting keywords in utterance audio data without requiring the entire utterance to first be processed. Likelihoods that a portion of the utterance audio data corresponds to the keyword may be compared to likelihoods that the portion corresponds to background audio (e.g., general speech and/or non-speech sounds). The difference in the likelihoods may be determined, and keyword may be triggered when the difference exceeds a threshold, or shortly thereafter. Traceback information and other data may be stored during the process so that a second speech processing pass may be performed. For efficient management of system memory, traceback information may only be stored for those frames that may encompass a keyword; the traceback information for older frames may be overwritten by traceback information for newer frames.

48.

发明申请
LANGUAGE MODEL SPEECH ENDPOINTING 审中-公开
Title translation: 语言模式语音终止

公开(公告)号：US20160379632A1

公开(公告)日：2016-12-29

申请号：US14753811

申请日：2015-06-29

Applicant: Amazon Technologies, Inc.

Inventor： Bjorn Hoffmeister , Ariya Rastrow , Baiyang Liu

IPC: G10L15/22 , G10L15/26 , G10L15/18 , G10L25/93

CPC classification number: G10L15/22 , G10L15/18 , G10L15/183 , G10L15/26 , G10L25/87 , G10L25/93 , G10L2025/783

Abstract: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder. The ASR system calculates the amount of non-speech detected by a plurality of hypotheses and weights the non-speech duration by the probability of each hypotheses. When the aggregate weighted non-speech exceeds a threshold, an endpoint may be declared.

Abstract translation: 自动语音识别（ASR）系统使用解码器考虑的活动假设来检测话音的端点。 ASR系统计算由多个假设检测到的非语音量，并以每个假设的概率对非语音持续时间加权。当聚合加权非语音超过阈值时，可以声明端点。

49.

发明授权
Speech recognizer with multi-directional decoding 有权
Title translation: 语音识别器，具有多向解码功能

公开(公告)号：US09286897B2

公开(公告)日：2016-03-15

申请号：US14039383

申请日：2013-09-27

Applicant: Amazon Technologies, Inc.

Inventor： Michael Maximilian Emanuel Bisani , Nikko Strom , Bjorn Hoffmeister , Ryan Paul Thomas

IPC: G10L15/32 , G10L15/16 , G10L21/0272 , G10L25/78 , G10L15/08 , G10L21/0216

CPC classification number: G10L15/32 , G10L15/01 , G10L15/08 , G10L15/16 , G10L21/0272 , G10L25/78 , G10L2021/02166 , H04R1/406 , H04R3/005 , H04R2201/401 , H04R2410/01 , H04R2430/21

Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to process speech based on multiple channels of audio received from a beamformer. The ASR processing system may include a microphone array and the beamformer to output multiple channels of audio such that each channel isolates audio in a particular direction. The multichannel audio signals may include spoken utterances/speech from one or more speakers as well as undesired audio, such as noise from a household appliance. The ASR device may simultaneously perform speech recognition on the multi-channel audio to provide more accurate speech recognition results.

Abstract translation: 在自动语音识别（ASR）处理系统中，ASR处理可以被配置为基于从波束形成器接收的多个音频信道来处理语音。 ASR处理系统可以包括麦克风阵列和波束形成器以输出多个音频通道，使得每个通道在特定方向上隔离音频。多声道音频信号可以包括来自一个或多个扬声器的说话话音/语音以及不期望的音频，例如来自家用电器的噪声。 ASR设备可以同时对多声道音频执行语音识别，以提供更准确的语音识别结果。

50.

发明申请
SPEECH MODEL RETRIEVAL IN DISTRIBUTED SPEECH RECOGNITION SYSTEMS 有权
Title translation: 分布式语音识别系统中的语音模型检索

公开(公告)号：US20140163977A1

公开(公告)日：2014-06-12

申请号：US13712891

申请日：2012-12-12

Applicant: AMAZON TECHNOLOGIES, INC.

Inventor： Bjorn Hoffmeister , Hugh Evan Secker-Walker , Jeffrey Cornelius O'Neill

IPC: G10L15/22

CPC classification number: G10L15/32 , G10L15/22 , G10L15/30

Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that it may be used to update the models and data as it becomes available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.

Abstract translation: 公开了用于管理语音识别模型和自动语音识别系统中的数据的使用的特征。可以异步检索模型和数据，并在收到文字或使用更为一般或不同的模型对话语进行初始处理之后进行使用。一旦收到，模型和统计信息可以被缓存。还可以异步检索更新模型和数据所需的统计数据，以便可以在模型和数据可用时更新模型和数据。可以立即使用更新的模型和数据来重新处理话语，或者保存用于处理随后接收的话语。可以跟踪与自动语音识别系统的用户交互，以便预测用户什么时候可能利用该系统。基于这样的预测，模型和数据可以被预先缓存。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification