-
Publication No.: US20180210703A1
Publication date: 2018-07-26
Application No.: US15876858
Filing date: 2018-01-22
Applicant: Amazon Technologies, Inc.
Inventor: James David Meyers , Shah Samir Pravinchandra , Yue Liu , Arlen Dean , Daniel Miller , Arindam Mandal
Abstract: A system may use multiple speech interface devices to interact with a user by speech. All or a portion of the speech interface devices may detect a user utterance and may initiate speech processing to determine a meaning or intent of the utterance. Within the speech processing, arbitration is employed to select one of the multiple speech interface devices to respond to the user utterance. Arbitration may be based in part on metadata that directly or indirectly indicates the proximity of the user to the devices, and the device that is deemed to be nearest the user may be selected to respond to the user utterance.
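The arbitration step described in this abstract can be sketched as follows. This is a minimal illustration only, not the patented method: the `DeviceReport` record, its `signal_energy` and `wakeword_confidence` fields, and the tie-breaking rule are all assumptions standing in for the unspecified "metadata that directly or indirectly indicates the proximity of the user".

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DeviceReport:
    """Hypothetical metadata one device sends along with a detected utterance."""
    device_id: str
    signal_energy: float        # assumed proximity proxy: louder capture, closer user
    wakeword_confidence: float  # assumed secondary proxy, used only to break ties


def arbitrate(reports: Sequence[DeviceReport]) -> str:
    """Pick the device deemed nearest to the user and return its id."""
    if not reports:
        raise ValueError("at least one device report is required")
    # Compare on signal energy first, then on wake-word confidence.
    best = max(reports, key=lambda r: (r.signal_energy, r.wakeword_confidence))
    return best.device_id


reports = [
    DeviceReport("kitchen", signal_energy=0.4, wakeword_confidence=0.9),
    DeviceReport("living-room", signal_energy=0.7, wakeword_confidence=0.8),
]
selected = arbitrate(reports)
```

Here `selected` names the one device that should respond; the others would stay silent. A real system would likely combine several signals rather than rank on a single field.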
-
Publication No.: US12039975B2
Publication date: 2024-07-16
Application No.: US17112512
Filing date: 2020-12-04
Applicant: Amazon Technologies, Inc.
Inventor: Prakash Krishnan , Arindam Mandal , Siddhartha Reddy Jonnalagadda , Nikko Strom , Ariya Rastrow , Shiv Naga Prasad Vitaladevuni , Angeliki Metallinou , Vincent Auvray , Minmin Shen , Josey Diego Sandoval , Rohit Prasad , Thomas Taylor , Amotz Maimon
IPC: G10L15/22 , G06F3/16 , G06F18/24 , G06V10/40 , G06V40/10 , G06V40/20 , G10L13/08 , G10L15/02 , G10L15/06 , G10L15/08 , G10L15/20 , G10L15/24
CPC classification number: G10L15/22 , G06F3/167 , G06F18/24 , G06V10/40 , G06V40/10 , G06V40/20 , G10L13/08 , G10L15/02 , G10L15/063 , G10L15/08 , G10L15/20 , G10L15/222 , G10L15/24 , G10L2015/0635 , G10L2015/088 , G10L2015/223 , G10L2015/227
Abstract: A natural language system may be configured to act as a participant in a conversation between two users. The system may determine when a user expression such as speech, a gesture, or the like is directed from one user to the other. The system may process input data related to the expression (such as audio data, input data, language processing result data, conversation context data, etc.) to determine if the system should interject a response to the user-to-user expression. If so, the system may process the input data to determine a response and output it. The system may track that response as part of the data related to the ongoing conversation.
-
Publication No.: US20240185846A1
Publication date: 2024-06-06
Application No.: US18439166
Filing date: 2024-02-12
Applicant: Amazon Technologies, Inc.
Inventor: Arjit Biswas , Shishir Bharathi , Anushree Venkatesh , Yun Lei , Ashish Kumar Agrawal , Siddhartha Reddy Jonnalagadda , Prakash Krishnan , Arindam Mandal , Raefer Christopher Gabriel , Abhay Kumar Jha , David Chi-Wai Tang , Savas Parastatidis
IPC: G10L15/183 , G06F40/279 , G06F40/295 , G06F40/30 , G06F40/35 , G10L15/18 , G10L15/19 , G10L15/22
CPC classification number: G10L15/183 , G06F40/279 , G10L15/1815 , G10L15/22 , G06F40/295 , G06F40/30 , G06F40/35 , G10L15/1822 , G10L15/19 , G10L2015/228
Abstract: Techniques for storing and using multi-session context are described. A system may store context data corresponding to a first interaction, where the context data may include action data, entity data and a profile identifier for a user. Later the stored context data may be retrieved during a second interaction corresponding to the entity of the second interaction. The second interaction may take place at a system different than the first interaction. The system may generate a response during the second interaction using the stored context data of the prior interaction.
-
Publication No.: US11749282B1
Publication date: 2023-09-05
Application No.: US16866903
Filing date: 2020-05-05
Applicant: Amazon Technologies, Inc.
Inventor: Arindam Mandal , Devesh Mohan Pandey , Kjel Larsen , Prakash Krishnan , Raefer Christopher Gabriel
Abstract: A dialog system receives a user request corresponding to a dialog with a user. The dialog system processes the user request to determine multiple service providers capable of responding to the user request. The dialog system selects one service provider based on a request-to-handle score, and selects another service provider based on a satisfaction rating. The dialog system updates the dialog state based on further input provided by the user to determine an output responsive to the user request.
-
Publication No.: US11270685B2
Publication date: 2022-03-08
Application No.: US16726051
Filing date: 2019-12-23
Applicant: Amazon Technologies, Inc.
Inventor: Spyridon Matsoukas , Aparna Khare , Vishwanathan Krishnamoorthy , Shamitha Somashekar , Arindam Mandal
Abstract: Systems, methods, and devices for verifying a user are disclosed. A speech-controlled device captures a spoken command, and sends audio data corresponding thereto to a server. The server performs ASR on the audio data to determine ASR confidence data. The server, in parallel, performs user verification on the audio data to determine user verification confidence data. The server may modify the user verification confidence data using the ASR confidence data. In addition or alternatively, the server may modify the user verification confidence data using at least one of a location of the speech-controlled device within a building, a type of the speech-controlled device, or a geographic location of the speech-controlled device.
-
Publication No.: US20210312914A1
Publication date: 2021-10-07
Application No.: US17340378
Filing date: 2021-06-07
Applicant: Amazon Technologies, Inc.
Inventor: Behnam Hedayatnia , Anirudh Raju , Ankur Gandhe , Chandra Prakash Khatri , Ariya Rastrow , Anushree Venkatesh , Arindam Mandal , Raefer Christopher Gabriel , Ahmad Shikib Mehri
Abstract: Described herein is a system for rescoring automatic speech recognition hypotheses for conversational devices that have multi-turn dialogs with a user. The system leverages dialog context by incorporating data related to past user utterances and data related to the system generated response corresponding to the past user utterance. Incorporation of this data improves recognition of a particular user utterance within the dialog.
-
Publication No.: US11004454B1
Publication date: 2021-05-11
Application No.: US16182021
Filing date: 2018-11-06
Applicant: Amazon Technologies, Inc.
Inventor: Sundararajan Srinivasan , Arindam Mandal , Krishna Subramanian , Spyridon Matsoukas , Aparna Khare , Rohit Prasad
Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
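The profile-update step in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that a voice profile and each stored utterance are fixed-length embedding vectors, that "substantially similar" means cosine similarity above a threshold, and that the update is a weighted average; the abstract specifies none of these, and the names `maybe_update_profile`, `threshold`, and `weight` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def centroid(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of a non-empty cluster of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def maybe_update_profile(profile: Vector, cluster: Sequence[Vector],
                         threshold: float = 0.8, weight: float = 0.5) -> Vector:
    """Blend the cluster centroid into the profile only if the two are similar."""
    center = centroid(cluster)
    if cosine(profile, center) < threshold:
        return list(profile)  # cluster likely belongs to another speaker
    return [(1.0 - weight) * p + weight * c for p, c in zip(profile, center)]


updated = maybe_update_profile([1.0, 0.0], [[1.0, 0.2]])
unchanged = maybe_update_profile([1.0, 0.0], [[0.0, 1.0]])
```

The first call blends the cluster into the profile because the two vectors point in nearly the same direction; the second leaves the profile untouched because the cluster is orthogonal to it.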
-
Publication No.: US10522134B1
Publication date: 2019-12-31
Application No.: US15388458
Filing date: 2016-12-22
Applicant: Amazon Technologies, Inc.
Inventor: Spyridon Matsoukas , Aparna Khare , Vishwanathan Krishnamoorthy , Shamitha Somashekar , Arindam Mandal
Abstract: Systems, methods, and devices for verifying a user are disclosed. A speech-controlled device captures a spoken command, and sends audio data corresponding thereto to a server. The server performs ASR on the audio data to determine ASR confidence data. The server, in parallel, performs user verification on the audio data to determine user verification confidence data. The server may modify the user verification confidence data using the ASR confidence data. In addition or alternatively, the server may modify the user verification confidence data using at least one of a location of the speech-controlled device within a building, a type of the speech-controlled device, or a geographic location of the speech-controlled device.
-
Publication No.: US10147442B1
Publication date: 2018-12-04
Application No.: US14869803
Filing date: 2015-09-29
Applicant: Amazon Technologies, Inc.
Inventor: Sankaran Panchapagesan , Shiva Kumar Sundaram , Arindam Mandal
Abstract: A neural network acoustic model is trained to be robust and produce accurate output when used to process speech signals having acoustic interference. The neural network acoustic model can be trained using a source-separation process by which, in addition to producing the main acoustic model output for a given input, the neural network generates predictions of the separate speech and interference portions of the input. The parameters of the neural network can be adjusted to jointly optimize all three outputs (e.g., the main acoustic model output, the speech signal prediction, and the interference signal prediction), rather than only optimizing the main acoustic model output. Once trained, output layers for the speech and interference signal predictions can be removed from the neural network or otherwise disabled.
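One common way to write a joint objective over the three outputs named in this abstract is a weighted sum of per-output losses. The form below is an illustrative assumption, not taken from the patent: the symbols $\mathcal{L}_{\text{am}}$, $\mathcal{L}_{\text{speech}}$, $\mathcal{L}_{\text{interf}}$ and the weights $\lambda_s$, $\lambda_i$ are hypothetical names, and the abstract does not state how the three terms are combined.

```latex
\mathcal{L}_{\text{joint}}(\theta)
  = \mathcal{L}_{\text{am}}(\theta)
  + \lambda_s \, \mathcal{L}_{\text{speech}}(\theta)
  + \lambda_i \, \mathcal{L}_{\text{interf}}(\theta),
\qquad \lambda_s, \lambda_i \ge 0
```

Under this reading, training adjusts the shared parameters $\theta$ against all three terms, and removing the speech and interference output layers after training leaves only the branch that produces the main acoustic model output.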
-
Publication No.: US20250149028A1
Publication date: 2025-05-08
Application No.: US18923949
Filing date: 2024-10-23
Applicant: Amazon Technologies, Inc.
Inventor: Amitabh Saikia , Devesh Mohan Pandey , Tagyoung Chung , Shanchan Wu , Chien-Wei Lin , Govindarajan Sundaram Thattai , Aishwarya Naresh Reganti , Arindam Mandal , Prakash Krishnan , Raefer Christopher Gabriel , Meyyappan Sundaram
IPC: G10L15/183 , G10L13/027 , G10L15/08
Abstract: Techniques for facilitating natural language interactions with visual interactive content are described. During a build time, a system analyzes various websites and applications relating to a particular user goal to understand website and application navigation and information relating to the user goal. The learned information is used to store configuration data. During runtime, when a user requests performance of an action, the system engages in a dialog with the user to complete the user's goal. The system uses the stored configuration data to determine actions to be performed at a website or application to complete the user's goal, and determines system responses to present to the user to facilitate completion of the goal. Such system responses may request information from the user, may inform the user of information displayed at the website or application, etc.