-
Publication No.: US11942084B2
Publication date: 2024-03-26
Application No.: US17828483
Filing date: 2022-05-31
Applicant: Amazon Technologies, Inc.
Inventor: Colin Wills Wightman, Naresh Narayanan, Daniel Robert Rashid
IPC: G10L15/00, G06F16/31, G06F40/289, G10L15/10, G10L15/20, G10L15/26, G10L15/28, G10L17/04, G10L15/22
CPC classification number: G10L15/20, G06F16/316, G06F40/289, G10L15/10, G10L15/26, G10L15/285, G10L17/04, G10L2015/223
Abstract: Systems and methods for determining that artificial commands, in excess of a threshold value, are detected by multiple voice activated electronic devices are described herein. In some embodiments, numerous voice activated electronic devices may send audio data representing a phrase to a backend system at substantially the same time. Text data representing the phrase, and counts for instances of that text data, may be generated. If the number of counts exceeds a predefined threshold, the backend system may cause any remaining response generation functionality for that particular command that is in excess of the predefined threshold to be stopped, and those devices returned to a sleep state. In some embodiments, a sound profile unique to the phrase that caused the excess of the predefined threshold may be generated such that future instances of the same phrase may be recognized prior to text data being generated, conserving the backend system's resources.
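Illustrative sketch (not taken from the patent): the counting step in the abstract can be pictured as a sliding-window counter keyed by the transcribed phrase. The class name `PhraseFloodDetector`, the window length, and the text normalization are all assumptions made for this example; the patent does not specify them.

```python
from collections import defaultdict, deque


class PhraseFloodDetector:
    """Hypothetical counter for identical phrases arriving from many devices."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold      # assumed "predefined threshold"
        self.window = window_seconds    # assumed meaning of "substantially same time"
        self._arrivals = defaultdict(deque)  # normalized text -> arrival timestamps

    def observe(self, text, timestamp):
        """Record one utterance; return True if its response should be stopped."""
        key = " ".join(text.lower().split())  # assumed normalization
        times = self._arrivals[key]
        times.append(timestamp)
        # Drop arrivals that fell out of the time window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        # Only instances in excess of the threshold are suppressed.
        return len(times) > self.threshold


detector = PhraseFloodDetector(threshold=3, window_seconds=1.0)
decisions = [detector.observe("Order a pizza", 0.1 * i) for i in range(5)]
```

In this sketch the first three instances of a phrase are processed normally and only later ones within the window are flagged, which mirrors the abstract's wording about commands "in excess of the predefined threshold".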
-
Publication No.: US20230019649A1
Publication date: 2023-01-19
Application No.: US17828483
Filing date: 2022-05-31
Applicant: Amazon Technologies, Inc.
Inventor: Colin Wills Wightman, Naresh Narayanan, Daniel Robert Rashid
Abstract: Systems and methods for determining that artificial commands, in excess of a threshold value, are detected by multiple voice activated electronic devices are described herein. In some embodiments, numerous voice activated electronic devices may send audio data representing a phrase to a backend system at substantially the same time. Text data representing the phrase, and counts for instances of that text data, may be generated. If the number of counts exceeds a predefined threshold, the backend system may cause any remaining response generation functionality for that particular command that is in excess of the predefined threshold to be stopped, and those devices returned to a sleep state. In some embodiments, a sound profile unique to the phrase that caused the excess of the predefined threshold may be generated such that future instances of the same phrase may be recognized prior to text data being generated, conserving the backend system's resources.
-
Publication No.: US09269355B1
Publication date: 2016-02-23
Application No.: US13831286
Filing date: 2013-03-14
Applicant: Amazon Technologies, Inc.
Inventor: Hugh Evan Secker-Walker, Naresh Narayanan
CPC classification number: G10L15/30
Abstract: Features are disclosed for transferring speech recognition workloads between pooled execution resources. For example, various parts of an automatic speech recognition engine may be implemented by various pools of servers. Servers in a speech recognition pool may explore a plurality of paths in a graph to find the path that best matches an utterance. A set of active nodes comprising the last node explored in each path may be transferred between servers in the pool depending on resource availability at each server. A history of nodes or arcs traversed in each path may be maintained by a separate pool of history servers, and used to generate text corresponding to the path identified as the best match by the speech recognition servers.
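Illustrative sketch (not taken from the patent): one way to picture the split between active nodes and path history. A small beam search keeps only the last node, score, and a history id per path, so the active set is compact enough to serialize and hand to another server, while a separate store holds the arcs traversed. `HistoryStore`, `step`, the toy graph, and the scores are all hypothetical names and values chosen for this example.

```python
import json


class HistoryStore:
    """Stands in for the separate pool of history servers (assumption)."""

    def __init__(self):
        self._records = [(None, None)]  # id 0 means "empty history"

    def extend(self, parent_id, word):
        self._records.append((parent_id, word))
        return len(self._records) - 1

    def text(self, history_id):
        # Follow back-pointers to recover the words along one path.
        words = []
        while history_id:
            history_id, word = self._records[history_id]
            words.append(word)
        return " ".join(reversed(words))


def step(active, graph, word_scores, history, beam=2):
    """Advance each active path by one arc and keep the best `beam` paths."""
    expanded = []
    for node, score, hid in active:
        for next_node, word in graph.get(node, []):
            new_score = score + word_scores.get(word, -10.0)
            expanded.append((next_node, new_score, history.extend(hid, word)))
    expanded.sort(key=lambda path: path[1], reverse=True)
    return expanded[:beam]


graph = {"start": [("a", "turn"), ("b", "burn")],
         "a": [("end", "on")], "b": [("end", "off")]}
history = HistoryStore()
active = step([("start", 0.0, 0)], graph, {"turn": -1.0, "burn": -3.0}, history)
# Simulated hand-off: the active set is plain data, so it survives serialization.
active = json.loads(json.dumps(active))
active = step(active, graph, {"on": -0.5, "off": -2.0}, history)
best = max(active, key=lambda path: path[1])
best_text = history.text(best[2])
```

The design point the sketch tries to show is that the receiving server needs only the active set to continue the search; the text is reconstructed from the history store once the best path is known.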
-
Publication No.: US12080282B2
Publication date: 2024-09-03
Application No.: US17848901
Filing date: 2022-06-24
Applicant: Amazon Technologies, Inc.
Inventor: Da Teng, Adrian Evans, Naresh Narayanan
IPC: G10L15/183, G10L15/22, G10L17/22
CPC classification number: G10L15/183, G10L15/22, G10L17/22, G10L2015/223, G10L2015/227, G10L2015/228
Abstract: This disclosure proposes systems and methods for processing natural language inputs using data associated with multiple language recognition contexts (LRC). A system using multiple LRCs can receive input data from a device, identify a first identifier associated with the device, and further identify second identifiers associated with the first identifier and representing candidate users of the device. The system can access language processing data used for natural language processing for the LRCs corresponding to each of the first and second identifiers, and process the input data using the language processing data at one or more stages of automatic speech recognition, natural language understanding, entity resolution, and/or command execution. User recognition can reduce the number of candidate users, and thus the amount of data used to process the input data. Dynamic arbitration can select from between competing hypotheses representing the first identifier and a second identifier, respectively.
-
Publication No.: US20230042420A1
Publication date: 2023-02-09
Application No.: US17848901
Filing date: 2022-06-24
Applicant: Amazon Technologies, Inc.
Inventor: Da Teng, Adrian Evans, Naresh Narayanan
IPC: G10L15/183, G10L17/22, G10L15/22
Abstract: This disclosure proposes systems and methods for processing natural language inputs using data associated with multiple language recognition contexts (LRC). A system using multiple LRCs can receive input data from a device, identify a first identifier associated with the device, and further identify second identifiers associated with the first identifier and representing candidate users of the device. The system can access language processing data used for natural language processing for the LRCs corresponding to each of the first and second identifiers, and process the input data using the language processing data at one or more stages of automatic speech recognition, natural language understanding, entity resolution, and/or command execution. User recognition can reduce the number of candidate users, and thus the amount of data used to process the input data. Dynamic arbitration can select from between competing hypotheses representing the first identifier and a second identifier, respectively.
-
Publication No.: US11908480B1
Publication date: 2024-02-20
Application No.: US16826950
Filing date: 2020-03-23
Applicant: Amazon Technologies, Inc.
Inventor: Da Teng, Adrian Evans, Naresh Narayanan
Abstract: This disclosure proposes systems and methods for processing natural language inputs using data associated with multiple language recognition contexts (LRC). A system using multiple LRCs can receive input data from a device, identify a first identifier associated with the device, and further identify second identifiers associated with the first identifier and representing candidate users of the device. The system can access language processing data used for natural language processing for the LRCs corresponding to each of the first and second identifiers, and process the input data using the language processing data at one or more stages of automatic speech recognition, natural language understanding, entity resolution, and/or command execution. User recognition can reduce the number of candidate users, and thus the amount of data used to process the input data. Dynamic arbitration can select from between competing hypotheses representing the first identifier and a second identifier, respectively.
-
Publication No.: US10074364B1
Publication date: 2018-09-11
Application No.: US15085772
Filing date: 2016-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Colin Wills Wightman, Naresh Narayanan, Alexander David Rosen, Michael James Rodehorst, Daniel Robert Rashid
CPC classification number: G10L15/20, G06F17/2775, G10L15/10, G10L15/26, G10L15/265, G10L17/04, G10L25/51, G10L2015/223
Abstract: Systems and methods for generating sound profiles of artificial commands detected by multiple voice activated electronic devices are described herein. In some embodiments, numerous voice activated electronic devices may send audio data representing a phrase to a backend system at substantially the same time. Text data representing the phrase, and counts for instances of that text data, may be generated. If the number of counts exceeds a predefined threshold, the backend system may cause any remaining response generation functionality for that particular command that is in excess of the predefined threshold to be stopped, and those devices returned to a sleep state. In some embodiments, a sound profile unique to the phrase that caused the excess of the predefined threshold may be generated such that future instances of the same phrase may be recognized prior to text data being generated, conserving the backend system's resources.
-
Publication No.: US20250131921A1
Publication date: 2025-04-24
Application No.: US18817461
Filing date: 2024-08-28
Applicant: Amazon Technologies, Inc.
Inventor: Da Teng, Adrian Evans, Naresh Narayanan
IPC: G10L15/183, G10L15/22, G10L17/22
Abstract: This disclosure proposes systems and methods for processing natural language inputs using data associated with multiple language recognition contexts (LRC). A system using multiple LRCs can receive input data from a device, identify a first identifier associated with the device, and further identify second identifiers associated with the first identifier and representing candidate users of the device. The system can access language processing data used for natural language processing for the LRCs corresponding to each of the first and second identifiers, and process the input data using the language processing data at one or more stages of automatic speech recognition, natural language understanding, entity resolution, and/or command execution. User recognition can reduce the number of candidate users, and thus the amount of data used to process the input data. Dynamic arbitration can select from between competing hypotheses representing the first identifier and a second identifier, respectively.
-
Publication No.: US11790898B1
Publication date: 2023-10-17
Application No.: US17361703
Filing date: 2021-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Da Teng, Muhammad Bilal Khokhar, Naresh Narayanan, Bharath Bhimanaik Kumar
Abstract: Techniques for prioritizing resources of various users, associated with a device, when responding to a user input received from the device are described. When a user input is received from a device, a system may generate a resource list for a group profile (e.g., a household profile) and each user profile (including any guest user profile) associated with the device. Each resource list may include the catalogs of resources (e.g., songs of a playlist, contacts of a contact list, etc.) of the group profile or user profile. The system may also generate a weight matrix including a respective weight for each catalog of each resource list. Various processing components (e.g., an automatic speech recognition component, a natural language understanding component, and an entity resolution component) may process using the resource lists and the weight matrix to determine an output responsive to the user input.
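Illustrative sketch (not taken from the patent): a toy version of resource lists plus a weight matrix. Each profile has a resource list made of catalogs, and a weight per (profile, catalog) pair orders the matching entries. The function `rank_entities`, the substring matching rule, the profile names, and the weight values are assumptions for this example only.

```python
def rank_entities(query, resource_lists, weights):
    """Return matching (profile, catalog, entry) triples, highest weight first."""
    scored = []
    for profile, catalogs in resource_lists.items():
        for catalog, entries in catalogs.items():
            weight = weights[profile][catalog]  # one weight per catalog per list
            for entry in entries:
                # Assumed matching rule: case-insensitive substring match.
                if query.lower() in entry.lower():
                    scored.append((weight, (profile, catalog, entry)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [match for _, match in scored]


# Hypothetical group, user, and guest profiles associated with one device.
resource_lists = {
    "household": {"contacts": ["Sam Lee"]},
    "alice": {"contacts": ["Sam Carter"], "playlist": ["Sam's Song"]},
    "guest": {"contacts": ["Sam Ortiz"]},
}
weights = {
    "household": {"contacts": 0.5},
    "alice": {"contacts": 0.9, "playlist": 0.2},
    "guest": {"contacts": 0.7},
}
ranked = rank_entities("sam", resource_lists, weights)
```

Here the weights alone decide the order; a real system would presumably combine them with recognition and understanding scores, which the abstract mentions but does not detail.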
-
Publication No.: US11386887B1
Publication date: 2022-07-12
Application No.: US16827025
Filing date: 2020-03-23
Applicant: Amazon Technologies, Inc.
Inventor: Da Teng, Adrian Evans, Naresh Narayanan
IPC: G10L15/183, G10L15/22, G10L17/22
Abstract: This disclosure proposes systems and methods for processing natural language inputs using data associated with multiple language recognition contexts (LRC). A system using multiple LRCs can receive input data from a device, identify a first identifier associated with the device, and further identify second identifiers associated with the first identifier and representing candidate users of the device. The system can access language processing data used for natural language processing for the LRCs corresponding to each of the first and second identifiers, and process the input data using the language processing data at one or more stages of automatic speech recognition, natural language understanding, entity resolution, and/or command execution. User recognition can reduce the number of candidate users, and thus the amount of data used to process the input data. Dynamic arbitration can select from between competing hypotheses representing the first identifier and a second identifier, respectively.
-