-
Publication No.: US11037572B1
Publication date: 2021-06-15
Application No.: US16684843
Filing date: 2019-11-15
Applicant: Amazon Technologies, Inc.
Inventor: Jeff Bradley Beal , Kevin Robert Charter , Ajay Gopalakrishnan , Sumedha Arvind Kshirsagar , Nishant Kumar
Abstract: A speech recognition platform configured to receive an audio signal that includes speech from a user and perform automatic speech recognition (ASR) on the audio signal to identify ASR results. The platform may identify: (i) a domain of a voice command within the speech based on the ASR results and based on context information associated with the speech or the user, and (ii) an intent of the voice command. In response to identifying the intent, the platform may perform multiple actions corresponding to this intent. The platform may select a target action to perform, and may engage in a back-and-forth dialog to obtain information for completing the target action. The action may include streaming audio to the device, setting a reminder for the user, purchasing an item on behalf of the user, making a reservation for the user or launching an application for the user.
-
Publication No.: US10853031B2
Publication date: 2020-12-01
Application No.: US16222751
Filing date: 2018-12-17
Applicant: Amazon Technologies, Inc.
Inventor: Gautham Kumar Jayakumar , Nishant Kumar , Steven Michael Saxon , Frederic Johan Georges Deramat
Abstract: Systems and methods for audio output control are disclosed. Audio may be output via a speaker of a communal device associated with a first portion of an environment. A user may provide a user utterance indicating an intent to add another device in a second portion of the environment to the audio-output session, and/or an intent to move the audio-output session from the first device to the second device, and/or an intent to remove a device from an audio-output session. Based on this determined intent, audio-session queues may be associated and dissociated from devices and device states may be altered to effectuate the intent of the user utterance.
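The add, move, and remove intents described in this abstract can be pictured as bookkeeping over which devices are attached to which audio-session queue. The sketch below is illustrative only: the class name `AudioSessions`, its methods, and the device and queue identifiers are hypothetical and do not come from the patent.

```python
from typing import Dict, Set


class AudioSessions:
    """Hypothetical registry mapping an audio-session queue to its devices."""

    def __init__(self) -> None:
        self.queue_devices: Dict[str, Set[str]] = {}

    def add(self, queue: str, device: str) -> None:
        # Associate the queue with one more device.
        self.queue_devices.setdefault(queue, set()).add(device)

    def remove(self, queue: str, device: str) -> None:
        # Dissociate the queue from a device; unknown devices are ignored.
        self.queue_devices.get(queue, set()).discard(device)

    def move(self, queue: str, source: str, target: str) -> None:
        # A move is modeled here as a remove followed by an add.
        self.remove(queue, source)
        self.add(queue, target)


sessions = AudioSessions()
sessions.add("queue-1", "kitchen")        # audio starts in the kitchen
sessions.add("queue-1", "living-room")    # "add the living room"
sessions.move("queue-1", "kitchen", "bedroom")  # "move it to the bedroom"
```

Modeling a move as remove-then-add is a simplification chosen for this sketch; the abstract only states that queues are associated with and dissociated from devices and that device states are altered.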
-
Publication No.: US10755709B1
Publication date: 2020-08-25
Application No.: US16020603
Filing date: 2018-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Natalia Vladimirovna Mamkina , Naomi Bancroft , Nishant Kumar , Shamitha Somashekar
Abstract: Systems, methods, and devices for recognizing a user are disclosed. A speech-controlled device captures a spoken utterance, and sends audio data corresponding thereto to a server. The server determines content sources storing or having access to content responsive to the spoken utterance. The server also determines multiple users associated with a profile of the speech-controlled device. Using the audio data, the server may determine user recognition data with respect to each user indicated in the speech-controlled device's profile. The server may also receive user recognition confidence threshold data from each of the content sources. The server may determine user recognition data associated that satisfies (i.e., meets or exceeds) a most stringent (i.e., highest) of the user recognition confidence threshold data. Thereafter, the server may send data indicating a user associated with the user recognition data to all of the content sources.
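The threshold rule in this abstract (recognition data must meet or exceed the highest threshold reported by any content source) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `pick_user`, the use of a single float confidence per user, and the sample names and values are all hypothetical, not taken from the patent.

```python
from typing import Dict, Optional


def pick_user(
    user_scores: Dict[str, float],
    source_thresholds: Dict[str, float],
) -> Optional[str]:
    """Return the best-scoring user if the score satisfies the strictest threshold.

    user_scores: hypothetical recognition confidence per user on the device profile.
    source_thresholds: hypothetical confidence threshold per content source.
    """
    if not user_scores or not source_thresholds:
        return None
    # The most stringent threshold is the highest one across all content sources.
    strictest = max(source_thresholds.values())
    best_user = max(user_scores, key=lambda user: user_scores[user])
    # "Satisfies" means meets or exceeds, as the abstract states.
    if user_scores[best_user] >= strictest:
        return best_user
    return None


scores = {"alice": 0.92, "bob": 0.40}
accepted = pick_user(scores, {"music": 0.60, "shopping": 0.90})
rejected = pick_user(scores, {"music": 0.60, "shopping": 0.95})
```

Because the winning score clears the strictest threshold, the same user identity can be sent to every content source without checking each source separately.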
-
Publication No.: US10032451B1
Publication date: 2018-07-24
Application No.: US15385138
Filing date: 2016-12-20
Applicant: Amazon Technologies, Inc.
Inventor: Natalia Vladimirovna Mamkina , Naomi Bancroft , Nishant Kumar , Shamitha Somashekar
Abstract: Systems, methods, and devices for recognizing a user are disclosed. A speech-controlled device captures a spoken utterance, and sends audio data corresponding thereto to a server. The server determines content sources storing or having access to content responsive to the spoken utterance. The server also determines multiple users associated with a profile of the speech-controlled device. Using the audio data, the server may determine user recognition data with respect to each user indicated in the speech-controlled device's profile. The server may also receive user recognition confidence threshold data from each of the content sources. The server may determine user recognition data associated that satisfies (i.e., meets or exceeds) a most stringent (i.e., highest) of the user recognition confidence threshold data. Thereafter, the server may send data indicating a user associated with the user recognition data to all of the content sources.
-
Publication No.: US09721570B1
Publication date: 2017-08-01
Application No.: US14109738
Filing date: 2013-12-17
Applicant: Amazon Technologies, Inc.
Inventor: Jeff Bradley Beal , Sumedha Arvind Kshirsagar , Nishant Kumar , Ajay Gopalakrishnan , Kevin Robert Charter
CPC classification number: G10L17/005 , G10L15/22 , G10L2015/223
Abstract: A speech recognition platform configured to receive an audio signal that includes speech from a user and perform automatic speech recognition (ASR) on the audio signal to identify ASR results. The platform may identify: (i) a domain of a voice command within the speech based on the ASR results and based on context information associated with the speech or the user, and (ii) an intent of the voice command. In response to identifying the intent, the platform may perform multiple actions corresponding to this intent. The platform may select a target action to perform, and may engage in a back-and-forth dialog to obtain information for completing the target action. The action may include streaming audio to the device, setting a reminder for the user, purchasing an item on behalf of the user, making a reservation for the user or launching an application for the user.
-
Publication No.: US11990127B2
Publication date: 2024-05-21
Application No.: US17946203
Filing date: 2022-09-16
Applicant: Amazon Technologies, Inc.
Inventor: Natalia Vladimirovna Mamkina , Naomi Bancroft , Nishant Kumar , Shamitha Somashekar
CPC classification number: G10L15/22 , G06F3/167 , G06F21/32 , G10L15/01 , G10L15/18 , G10L15/26 , G10L17/06
Abstract: Systems, methods, and devices for recognizing a user are disclosed. A speech-controlled device captures a spoken utterance, and sends audio data corresponding thereto to a server. The server determines content sources storing or having access to content responsive to the spoken utterance. The server also determines multiple users associated with a profile of the speech-controlled device. Using the audio data, the server may determine user recognition data with respect to each user indicated in the speech-controlled device's profile. The server may also receive user recognition confidence threshold data from each of the content sources. The server may determine user recognition data associated that satisfies (i.e., meets or exceeds) a most stringent (i.e., highest) of the user recognition confidence threshold data. Thereafter, the server may send data indicating a user associated with the user recognition data to all of the content sources.
-
Publication No.: US11688402B2
Publication date: 2023-06-27
Application No.: US16919745
Filing date: 2020-07-02
Applicant: Amazon Technologies, Inc.
Inventor: Nishant Kumar , David Robert Thomas , Sumedha Arvind Kshirsagar , Vikas Jain , Jeff Bradley Beal , Ajay Gopalakrishnan , Shishir Sridhar Bharathi
IPC: G10L15/22 , G10L17/00 , G10L15/183 , G10L15/18
CPC classification number: G10L17/00 , G10L15/18 , G10L15/183 , G10L15/22 , G10L2015/223 , G10L2015/228
Abstract: Features are disclosed for performing functions in response to user requests. Natural Language Understanding (“NLU”) processing may be performed to generate command data that represents a subject of an utterance. The command data may be sent to an application that causes presentation of first output content in a first modality at a first time in response to receiving the command data, and generates second output content in a second modality different from the first modality, wherein the second output content is associated with the first output content. The second output content may be presented in the second modality at a second time subsequent to the first time.
-
Publication No.: US10482884B1
Publication date: 2019-11-19
Application No.: US15663514
Filing date: 2017-07-28
Applicant: Amazon Technologies, Inc.
Inventor: Jeff Bradley Beal , Kevin Robert Charter , Ajay Gopalakrishnan , Sumedha Arvind Kshirsagar , Nishant Kumar
Abstract: A speech recognition platform configured to receive an audio signal that includes speech from a user and perform automatic speech recognition (ASR) on the audio signal to identify ASR results. The platform may identify: (i) a domain of a voice command within the speech based on the ASR results and based on context information associated with the speech or the user, and (ii) an intent of the voice command. In response to identifying the intent, the platform may perform multiple actions corresponding to this intent. The platform may select a target action to perform, and may engage in a back-and-forth dialog to obtain information for completing the target action. The action may include streaming audio to the device, setting a reminder for the user, purchasing an item on behalf of the user, making a reservation for the user or launching an application for the user.
-
Publication No.: US11915707B1
Publication date: 2024-02-27
Application No.: US17346916
Filing date: 2021-06-14
Applicant: Amazon Technologies, Inc.
Inventor: Jeff Bradley Beal , Kevin Robert Charter , Ajay Gopalakrishnan , Sumedha Arvind Kshirsagar , Nishant Kumar
CPC classification number: G10L17/00 , G06F3/167 , G10L15/22 , G10L2015/223
Abstract: A speech recognition platform configured to receive an audio signal that includes speech from a user and perform automatic speech recognition (ASR) on the audio signal to identify ASR results. The platform may identify: (i) a domain of a voice command within the speech based on the ASR results and based on context information associated with the speech or the user, and (ii) an intent of the voice command. In response to identifying the intent, the platform may perform multiple actions corresponding to this intent. The platform may select a target action to perform, and may engage in a back-and-forth dialog to obtain information for completing the target action. The action may include streaming audio to the device, setting a reminder for the user, purchasing an item on behalf of the user, making a reservation for the user or launching an application for the user.
-
Publication No.: US20230139140A1
Publication date: 2023-05-04
Application No.: US17946203
Filing date: 2022-09-16
Applicant: Amazon Technologies, Inc.
Inventor: Natalia Vladimirovna Mamkina , Naomi Bancroft , Nishant Kumar , Shamitha Somashekar
Abstract: Systems, methods, and devices for recognizing a user are disclosed. A speech-controlled device captures a spoken utterance, and sends audio data corresponding thereto to a server. The server determines content sources storing or having access to content responsive to the spoken utterance. The server also determines multiple users associated with a profile of the speech-controlled device. Using the audio data, the server may determine user recognition data with respect to each user indicated in the speech-controlled device's profile. The server may also receive user recognition confidence threshold data from each of the content sources. The server may determine user recognition data associated that satisfies (i.e., meets or exceeds) a most stringent (i.e., highest) of the user recognition confidence threshold data. Thereafter, the server may send data indicating a user associated with the user recognition data to all of the content sources.
-