Patent search ap:("Amazon Technologies Inc.") AND inv:"Spyridon Matsoukas"

11.

Granted invention patent
Acoustic event detection (in force)

Publication (announcement) number: US11302329B1

Publication (announcement) date: 2022-04-12

Application number: US16914589

Application date: 2020-06-29

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun, Spyridon Matsoukas, Venkata Naga Krishna Chaitanya Puvvada, Chao Wang, Chieh-Chi Kao

IPC: G10L15/22, G10L15/08

Abstract: A system may include an acoustic event detection component for detecting acoustic events, which may be non-speech sounds. Upon detection of a command to detect a new sound, a device may prompt a user to cause occurrence of the sound one or more times. The acoustic event detection component may then be reconfigured, using audio data corresponding to the occurrences, to detect future occurrences of the event.

12.

Granted invention patent
Intent re-ranker (in force)

Publication (announcement) number: US11227585B2

Publication (announcement) date: 2022-01-18

Application number: US16815188

Application date: 2020-03-11

Applicant: AMAZON TECHNOLOGIES, INC.

Inventor: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel

IPC: G10L15/18, G10L15/22, G10L15/08

Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having attributes similar to those of an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top-ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.

13.

Granted invention patent
Voice profile updating (in force)

Publication (announcement) number: US11200884B1

Publication (announcement) date: 2021-12-14

Application number: US16181925

Application date: 2018-11-06

Applicant: Amazon Technologies, Inc.

Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad

IPC: G10L15/06, G06F3/16, G10L15/18, G10L15/22

Abstract: Techniques for labeling user inputs for updating user recognition voice profiles are described. A system may leverage various signals, generated during or after processing of a user input, to retroactively determine which user spoke the user input. For example, after the system receives the user input, the user may provide the system with non-spoken user verification information. Based on such user verification information, the system may label the previously spoken user input as originating from the particular user. The system may also or alternatively use system usage history to retroactively label user inputs.

14.

Granted invention patent
Keyword detection modeling using contextual and environmental information (in force)

Publication (announcement) number: US09697828B1

Publication (announcement) date: 2017-07-04

Application number: US14311163

Application date: 2014-06-20

Applicant: Amazon Technologies, Inc.

Inventor: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister

IPC: G10L15/18

CPC classification number: G10L15/18, G10L15/08, G10L15/30, G10L2015/088

Abstract: Features are disclosed for detecting words in audio using environmental information and/or contextual information in addition to acoustic features associated with the words to be detected. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.

15.

Invention patent application
VOICE PROFILE UPDATING (in force)

Publication (announcement) number: US20210304774A1

Publication (announcement) date: 2021-09-30

Application number: US17228950

Application date: 2021-04-13

Applicant: Amazon Technologies, Inc.

Inventor: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad

IPC: G10L17/04, G06F3/16, G10L17/00, G10L15/06

Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.

16.

Granted invention patent
Wakeword and acoustic event detection (in force)

Publication (announcement) number: US11132990B1

Publication (announcement) date: 2021-09-28

Application number: US16453063

Application date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/08, G10L25/87, G06F3/16, G10L25/21

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

17.

Invention patent application
KEYWORD DETECTION MODELING USING CONTEXTUAL INFORMATION (in force)

Publication (announcement) number: US20210134276A1

Publication (announcement) date: 2021-05-06

Application number: US17090716

Application date: 2020-11-05

Applicant: Amazon Technologies, Inc.

Inventor: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister

IPC: G10L15/18, G10L15/08

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.

18.

Invention patent application
INTENT RE-RANKER (under examination, published)

Publication (announcement) number: US20200279555A1

Publication (announcement) date: 2020-09-03

Application number: US16815188

Application date: 2020-03-11

Applicant: AMAZON TECHNOLOGIES, INC.

Inventor: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel

IPC: G10L15/18, G10L15/22

Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having attributes similar to those of an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top-ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.

19.

Granted invention patent
Intent re-ranker (in force)

Publication (announcement) number: US10600406B1

Publication (announcement) date: 2020-03-24

Application number: US15463339

Application date: 2017-03-20

Applicant: Amazon Technologies, Inc.

Inventor: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel

IPC: G10L15/18, G10L15/22, G10L15/08

Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having attributes similar to those of an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top-ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.

20.

Granted invention patent
User presence detection (in force)

Publication (announcement) number: US10121494B1

Publication (announcement) date: 2018-11-06

Application number: US15474603

Application date: 2017-03-30

Applicant: Amazon Technologies, Inc.

Inventor: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal

IPC: G10L15/00, G10L25/78, G10L15/16, G10L15/30, G10L15/02, G10L15/22, G10L15/08

Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present near the device, even if no wakeword is spoken. Audio such as speech, human-originating sounds (e.g., coughing, sneezing), or other human-related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
