Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Spyridon Matsoukas"

21.

Granted invention patent
Speech processing with learned representation of user interaction history (in force)

Publication number: US10032463B1

Publication date: 2018-07-24

Application number: US14982587

Application date: 2015-12-29

Applicant: Amazon Technologies, Inc.

Inventor: Ariya Rastrow, Nikko Ström, Spyridon Matsoukas, Markus Dreyer, Ankur Gandhe, Denis Sergeyevich Filimonov, Julian Chan, Rohit Prasad

IPC: G10L15/183, G10L15/197, G10L15/16, G10L25/30, G10L15/26, G10L15/06, G10L15/22

Abstract: An automatic speech recognition (“ASR”) system produces, for particular users, customized speech recognition results by using data regarding prior interactions of the users with the system. A portion of the ASR system (e.g., a neural-network-based language model) can be trained to produce an encoded representation of a user's interactions with the system based on, e.g., transcriptions of prior utterances made by the user. This user-specific encoded representation of interaction history is then used by the language model to customize ASR processing for the user.

22.

Invention patent application
KEYWORD DETECTION MODELING USING CONTEXTUAL INFORMATION (under examination, published)

Publication number: US20180012593A1

Publication date: 2018-01-11

Application number: US15641169

Application date: 2017-07-03

Applicant: Amazon Technologies, Inc.

Inventor: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister

IPC: G10L15/18

CPC classification number: G10L15/18, G10L15/08, G10L15/30, G10L2015/088

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.

23.

Granted invention patent
Generative modeling of speech using neural networks (in force)

Publication number: US09653093B1

Publication date: 2017-05-16

Application number: US14463411

Application date: 2014-08-19

Applicant: Amazon Technologies, Inc.

Inventor: Spyridon Matsoukas, Nikko Ström, Ariya Rastrow, Sri Venkata Surya Siva Rama Krishna Garimella

IPC: G10L15/16, G10L25/30, G10L15/07, G10L15/08, G10L15/14

CPC classification number: G10L15/16, G10L15/08, G10L15/142, G10L15/144

Abstract: Features are disclosed for using an artificial neural network to generate customized speech recognition models during the speech recognition process. By dynamically generating the speech recognition models during the speech recognition process, the models can be customized based on the specific context of individual frames within the audio data currently being processed. In this way, dependencies between frames in the current sequence can form the basis of the models used to score individual frames of the current sequence. Thus, each frame of the current sequence (or some subset thereof) may be scored using one or more models customized for the particular frame in context.

24.

Granted invention patent
Speech based user recognition (in force)

Publication number: US11270685B2

Publication date: 2022-03-08

Application number: US16726051

Application date: 2019-12-23

Applicant: Amazon Technologies, Inc.

Inventor: Spyridon Matsoukas, Aparna Khare, Vishwanathan Krishnamoorthy, Shamitha Somashekar, Arindam Mandal

IPC: G10L15/01, G10L15/25, G10L15/30

Abstract: Systems, methods, and devices for verifying a user are disclosed. A speech-controlled device captures a spoken command, and sends audio data corresponding thereto to a server. The server performs ASR on the audio data to determine ASR confidence data. The server, in parallel, performs user verification on the audio data to determine user verification confidence data. The server may modify the user verification confidence data using the ASR confidence data. In addition or alternatively, the server may modify the user verification confidence data using at least one of a location of the speech-controlled device within a building, a type of the speech-controlled device, or a geographic location of the speech-controlled device.

25.

Granted invention patent
Contextual natural language processing (in force)

Publication number: US11081104B1

Publication date: 2021-08-03

Application number: US15838917

Application date: 2017-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Chengwei Su, Sankaranarayanan Ananthakrishnan, Spyridon Matsoukas, Shirin Saleem, Rahul Gupta, Kavya Ravikumar, John Will Crimmins, Kelly James Vanee, John Pelak, Melanie Chie Bomke Gens

IPC: G10L15/18, G10L15/22, G10L15/06, G10L15/183, H04L29/08, G10L15/32, G06K9/00, H04W4/02, G10L15/26, G06F16/31, G06F40/295

Abstract: A natural language understanding system that can determine an overall score for a natural language hypothesis using hypothesis-specific component scores from different aspects of NLU processing as well as context data describing the context surrounding the utterance corresponding to the natural language hypotheses. The individual component scores may be input into a feature vector at a location corresponding to the type of the device that captured the utterance. Other locations in the feature vector corresponding to other device types may be populated with zero values. The feature vector may also be populated with other values representing other context data. The feature vector may then be multiplied by a weight vector comprising trained weights corresponding to the feature vector positions to determine a new overall score for each hypothesis, where the overall score incorporates the impact of the context data. Natural language hypotheses can be ranked using their respective new overall scores.
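
The scoring scheme in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the device type list, the function names, and all numeric values are hypothetical, and the weights are set by hand rather than trained.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical device types; the abstract does not enumerate any.
DEVICE_TYPES = ["speaker", "tablet", "tv"]


def build_feature_vector(component_scores: Sequence[float],
                         device_type: str,
                         context_values: Sequence[float]) -> List[float]:
    """Place the component scores in the slot for the capturing device type.

    Slots for all other device types stay at zero; extra context values
    are appended at the end of the vector.
    """
    n = len(component_scores)
    features = [0.0] * (n * len(DEVICE_TYPES))
    start = DEVICE_TYPES.index(device_type) * n
    features[start:start + n] = component_scores
    return features + list(context_values)


def overall_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Dot product of the feature vector with a same-length weight vector."""
    if len(features) != len(weights):
        raise ValueError("feature and weight vectors must have equal length")
    return sum(f * w for f, w in zip(features, weights))


def rank_hypotheses(hypotheses: Dict[str, Sequence[float]],
                    device_type: str,
                    context_values: Sequence[float],
                    weights: Sequence[float]) -> List[Tuple[str, float]]:
    """Score every hypothesis and sort from highest to lowest overall score."""
    scored = [
        (name, overall_score(
            build_feature_vector(scores, device_type, context_values), weights))
        for name, scores in hypotheses.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the slots for other device types are zero, only the weights aligned with the capturing device type (plus the context weights) contribute to a hypothesis's score, which is how a single weight vector can encode device-specific behavior.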

26.

Granted invention patent
Wakeword and acoustic event detection (in force)

Publication number: US11043218B1

Publication date: 2021-06-22

Application number: US16452964

Application date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22, G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
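
The two post-processing chains named in this abstract (softmax, smoothing, and spike detection for the wakeword; sigmoid plus a classifier for the acoustic event) could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the moving-average smoother, the rising-edge spike rule, the thresholds, and the mean-probability threshold standing in for the unspecified classifier. The detection models themselves are not modeled; the code starts from their raw per-frame outputs.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one frame's raw model outputs."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Smooth a sequence with a trailing window (assumed smoothing method)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def detect_spikes(values: Sequence[float], threshold: float) -> List[int]:
    """Return the indices where the sequence rises to or above the threshold."""
    spikes = []
    above = False
    for i, v in enumerate(values):
        if v >= threshold and not above:
            spikes.append(i)
        above = v >= threshold
    return spikes


def wakeword_frames(frame_logits: Sequence[Sequence[float]],
                    wake_index: int = 1,
                    window: int = 3,
                    threshold: float = 0.6) -> List[int]:
    """Softmax each frame, smooth the wakeword posterior, then find spikes."""
    posteriors = [softmax(logits)[wake_index] for logits in frame_logits]
    return detect_spikes(moving_average(posteriors, window), threshold)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def event_detected(frame_logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Stand-in classifier: threshold the mean per-frame sigmoid probability."""
    probs = [sigmoid(x) for x in frame_logits]
    return sum(probs) / len(probs) >= threshold
```

Smoothing before spike detection means a single noisy frame does not trigger a detection; with a window of 3, the wakeword posterior has to stay high for roughly two frames before the smoothed value crosses the threshold.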

27.

Granted invention patent
Scoring of natural language processing hypotheses (in force)

Publication number: US11043205B1

Publication date: 2021-06-22

Application number: US15838974

Application date: 2017-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Chengwei Su, Sankaranarayanan Ananthakrishnan, Spyridon Matsoukas, Rahul Gupta, Kelly James Vanee

IPC: G10L15/22, G10L15/18, G10L15/06, G10L15/16, G10L15/183, G06N3/02, G06N20/00, G06F16/31, G06F40/295

Abstract: A natural language processing system that can determine an overall score for a natural language hypothesis using hypothesis-specific component scores from different aspects of NLU processing. The individual component scores may be weighted by weights trained to optimize the overall scores relative to each other. Each domain of the system may be configured with a separate component that determines the overall score with respect to the domain. Natural language hypotheses can be ranked using the overall score either within a specific domain or on a cross-domain basis.

28.

Granted invention patent
Voice profile updating (in force)

Publication number: US11004454B1

Publication date: 2021-05-11

Application number: US16182021

Application date: 2018-11-06

Applicant: Amazon Technologies, Inc.

Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad

IPC: G10L17/04, G06F3/16, G10L17/00, G10L15/06

Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
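
As a rough illustration of the update loop in this abstract, the sketch below clusters stored inputs and merges a matching cluster into an existing profile. It assumes each stored audio input has already been reduced to a fixed-length speaker embedding, which the abstract does not state; the greedy cosine-similarity clustering, the 0.9 threshold, the averaging update, and all names are likewise hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cluster(embeddings: Sequence[Vector], threshold: float = 0.9) -> List[List[Vector]]:
    """Greedy clustering: join the first cluster whose centroid is similar enough."""
    clusters: List[List[Vector]] = []
    for embedding in embeddings:
        for members in clusters:
            if cosine(embedding, mean(members)) >= threshold:
                members.append(embedding)
                break
        else:
            clusters.append([embedding])
    return clusters


def update_profile(profile: Vector,
                   embeddings: Sequence[Vector],
                   threshold: float = 0.9) -> List[float]:
    """Merge the first cluster that matches the profile; otherwise keep it unchanged."""
    for members in cluster(embeddings, threshold):
        centroid = mean(members)
        if cosine(centroid, profile) >= threshold:
            return mean([profile, centroid])
    return list(profile)
```

Averaging the old profile with the cluster centroid is only one possible update rule; the abstract says just that the updated profile is created from the original profile and the cluster of audio data.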

29.

Granted invention patent
Keyword detection modeling using contextual information (in force)

Publication number: US10832662B2

Publication date: 2020-11-10

Application number: US15641169

Application date: 2017-07-03

Applicant: Amazon Technologies, Inc.

Inventor: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister

IPC: G10L15/18, G10L15/08, G10L15/30

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.

30.

Invention patent application
NATURAL LANGUAGE SPEECH PROCESSING APPLICATION SELECTION (under examination, published)

Publication number: US20200152195A1

Publication date: 2020-05-14

Application number: US16693826

Application date: 2019-11-25

Applicant: Amazon Technologies, Inc.

Inventor: Ruhi Sarikaya, Rohit Prasad, Kerry Hammil, Spyridon Matsoukas, Nikko Strom, Frédéric Johan Georges Deramat, Stephen Frederick Potter, Young-Bum Kim

IPC: G10L15/22, G06F40/295, G10L15/26, G10L15/08

Abstract: Techniques for limiting natural language processing performed on input data are described. A system receives input data from a device. The input data corresponds to a command to be executed by the system. The system determines applications likely configured to execute the command. The system performs named entity recognition and intent classification with respect to only the applications likely configured to execute the command.
