-
Publication No.: US20250006196A1
Publication date: 2025-01-02
Application No.: US18345455
Filing date: 2023-06-30
Applicant: Amazon Technologies, Inc.
Inventor: Hann Wang , Angeliki Metallinou , Melanie C B Gens , Arijit Biswas , Ying Shi
Abstract: Techniques for generating a prompt for a language model to determine an action responsive to a user input, are described. In some embodiments, the system receives a user input, determines one or more application programming interfaces (APIs) configured to perform actions that are relevant to the user input and exemplars representing examples of using the APIs with respect to user inputs similar to the current user input. The system further determines device states of devices that are determined to be related to the user input and also determines other contextual information (e.g., weather information, time of day, geographic location, etc.). The system generates a prompt including the user input, the APIs, the exemplars, the device states, and the other contextual information. A language model processes the prompt to determine an action responsive to the user input and the system causes performance of the action.
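The prompt assembly described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the `PromptInputs` class, the `build_prompt` function, the section labels, and the sample API strings are all hypothetical and do not come from the patent, which does not specify a prompt layout.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PromptInputs:
    """Hypothetical container for the five prompt ingredients named in the abstract."""
    user_input: str
    apis: list[str]                    # API descriptions judged relevant to the input
    exemplars: list[tuple[str, str]]   # (similar user input, example API usage)
    device_states: dict[str, str]      # states of devices related to the input
    context: dict[str, str]            # other context, e.g. weather or time of day


def build_prompt(p: PromptInputs) -> str:
    """Concatenate the ingredients into one text prompt (the layout is an assumption)."""
    lines = ["Available APIs:"]
    lines += [f"- {api}" for api in p.apis]
    lines.append("Examples:")
    for utterance, call in p.exemplars:
        lines.append(f"User: {utterance}")
        lines.append(f"Action: {call}")
    lines.append("Device states:")
    lines += [f"- {name}: {state}" for name, state in sorted(p.device_states.items())]
    lines.append("Context:")
    lines += [f"- {key}: {value}" for key, value in sorted(p.context.items())]
    # The current user input goes last so the model completes the final "Action:".
    lines.append(f"User: {p.user_input}")
    lines.append("Action:")
    return "\n".join(lines)


example = PromptInputs(
    user_input="it is too dark in here",
    apis=["turn_on(device)", "set_brightness(device, percent)"],
    exemplars=[("make the bedroom brighter", "set_brightness('bedroom lamp', 80)")],
    device_states={"living room lamp": "off"},
    context={"time_of_day": "21:30", "weather": "overcast"},
)
prompt = build_prompt(example)
```

The resulting `prompt` string would then be passed to whatever language model the system uses; the model call itself is omitted here because the abstract does not describe it.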
-
Publication No.: US20220148590A1
Publication date: 2022-05-12
Application No.: US17454716
Filing date: 2021-11-12
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Ying Shi , Imre Attila Kiss , Ryan Paul Thomas , Frederic Johan Georges Deramat
IPC: G10L15/22 , G10L15/26 , G06F40/35 , G06F40/40 , G06F40/56 , G06F40/284 , G06F40/295 , G10L13/08
Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
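A minimal sketch of the multi-domain idea in this abstract: run one transcription through several domain interpreters in parallel, then pick the most likely result, optionally nudged by hints. The domain functions, the keyword scoring, and the additive hint bonus are all hypothetical stand-ins chosen for illustration; the patent does not specify them.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Callable, NamedTuple


class Interpretation(NamedTuple):
    domain: str
    intent: str
    score: float


def music_domain(text: str) -> Interpretation:
    # Toy scorer (assumption): a real domain would run its own NLU models.
    return Interpretation("music", "PlayMusic", 0.9 if "play" in text else 0.1)


def weather_domain(text: str) -> Interpretation:
    return Interpretation("weather", "GetForecast", 0.9 if "weather" in text else 0.1)


def interpret(
    text: str,
    domains: list[Callable[[str], Interpretation]],
    hints: dict[str, float] | None = None,
) -> Interpretation:
    """Process the transcription in every domain concurrently and keep the best result."""
    hints = hints or {}
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda domain: domain(text), domains))
    # Hints (e.g. derived from previous interactions) are modeled as a score bonus per domain.
    return max(results, key=lambda r: r.score + hints.get(r.domain, 0.0))


best = interpret("play some jazz", [music_domain, weather_domain])
hinted = interpret("what is it like outside", [music_domain, weather_domain],
                   hints={"weather": 0.5})
```

In the second call neither toy domain recognizes the text, so the hint alone decides the outcome, which is one simple way hints could make the output more accurate.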
-
Publication No.: US20190325873A1
Publication date: 2019-10-24
Application No.: US16400905
Filing date: 2019-05-01
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Ying Shi , Imre Attila Kiss , Ryan Paul Thomas , Frederic Johan Georges Deramat
Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
-
Publication No.: US09454957B1
Publication date: 2016-09-27
Application No.: US13786237
Filing date: 2013-03-05
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Weam Abu Zaki , Ying Shi
CPC classification number: G10L15/187 , G06F17/278 , G10L15/02 , G10L15/183 , G10L15/22 , G10L15/265 , G10L2015/025
Abstract: Features are disclosed for determining an element of a user utterance or user intent in conjunction with one or more related elements of the user utterance or user intent. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a natural language understanding (“NLU”) module. The NLU module may perform named entity recognition, intent classification, and/or other processes on the ASR results. In addition, the NLU module may determine or verify the values associated with the recognized named entities using a data store of known values. When two or more named entities are related, their values may be determined and/or verified in conjunction with each other in order to preserve the relationship between them.
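To illustrate why related named entities are verified together rather than one at a time, here is a small sketch. It assumes two related slots (artist and song), candidate values with confidence scores, and a data store modeled as a set of known (artist, song) pairs. The names, scores, and product-of-scores ranking are hypothetical and not taken from the patent.

```python
from __future__ import annotations

from itertools import product

# Hypothetical data store of known, related values.
CATALOG = {
    ("the beatles", "yesterday"),
    ("the beatles", "hey jude"),
    ("the beetles", "bug song"),
}


def resolve_jointly(
    artists: list[tuple[str, float]],
    songs: list[tuple[str, float]],
    catalog: set[tuple[str, str]],
) -> tuple[str, str] | None:
    """Return the highest-scoring (artist, song) pair that exists in the data store."""
    best_pair, best_score = None, 0.0
    for (artist, p_artist), (song, p_song) in product(artists, songs):
        score = p_artist * p_song
        # Only combinations that preserve the artist-song relationship are eligible.
        if (artist, song) in catalog and score > best_score:
            best_pair, best_score = (artist, song), score
    return best_pair


artist_candidates = [("the beetles", 0.6), ("the beatles", 0.4)]
song_candidates = [("yesterday", 0.7), ("yes today", 0.3)]
resolved = resolve_jointly(artist_candidates, song_candidates, CATALOG)
```

Resolving each slot independently would pick "the beetles" and "yesterday", a pair the data store does not contain; checking the pair as a unit keeps the relationship between the two values intact.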
-
Publication No.: US11176936B2
Publication date: 2021-11-16
Application No.: US16400905
Filing date: 2019-05-01
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Ying Shi , Imre Attila Kiss , Ryan Paul Thomas , Frederic Johan Georges Deramat
IPC: G10L15/22 , G10L15/26 , G06F40/35 , G06F40/40 , G06F40/56 , G06F40/284 , G06F40/295 , G10L13/08
Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
-
Publication No.: US09436678B2
Publication date: 2016-09-06
Application No.: US14754598
Filing date: 2015-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Ying Shi , Imre Attila Kiss , Ryan Paul Thomas , Frederic Johan Georges Deramat
CPC classification number: G10L15/22 , G06F17/277 , G06F17/278 , G06F17/279 , G06F17/28 , G06F17/2881 , G10L13/08 , G10L15/26
Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
-
Publication No.: US11908468B2
Publication date: 2024-02-20
Application No.: US17112520
Filing date: 2020-12-04
Applicant: Amazon Technologies, Inc.
Inventor: Prakash Krishnan , Arindam Mandal , Siddhartha Reddy Jonnalagadda , Nikko Strom , Ariya Rastrow , Ying Shi , David Chi-Wai Tang , Nishtha Gupta , Aaron Challenner , Bonan Zheng , Angeliki Metallinou , Vincent Auvray , Minmin Shen
IPC: G10L25/78 , G10L15/22 , G10L15/24 , G10L15/08 , G10L15/06 , G06V40/20 , G06F3/16 , G10L13/08 , G10L15/20 , G06V40/10 , G06V10/40 , G10L15/02 , G06F18/24
CPC classification number: G10L15/22 , G06F3/167 , G06F18/24 , G06V10/40 , G06V40/10 , G06V40/20 , G10L13/08 , G10L15/02 , G10L15/063 , G10L15/08 , G10L15/20 , G10L15/222 , G10L15/24 , G10L2015/0635 , G10L2015/088 , G10L2015/223 , G10L2015/227
Abstract: A system that is capable of resolving anaphora using timing data received by a local device. A local device outputs audio representing a list of entries. The audio may represent synthesized speech of the list of entries. A user can interrupt the device to select an entry in the list, such as by saying “that one.” The local device can determine an offset time representing the time between when audio playback began and when the user interrupted. The local device sends the offset time and audio data representing the utterance to a speech processing system which can then use the offset time and stored data to identify which entry on the list was most recently output by the local device when the user interrupted. The system can then resolve anaphora to match that entry and can perform additional processing based on the referred to item.
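The offset-time lookup described in this abstract can be sketched as a search over stored playback start times. The sketch assumes the stored data is a list of entries plus the time, in seconds from the start of playback, at which each entry began to be spoken. The function name, data layout, and sample values are hypothetical.

```python
from __future__ import annotations

from bisect import bisect_right


def entry_at_offset(
    entries: list[str],
    start_times: list[float],
    offset_s: float,
) -> str | None:
    """Return the entry most recently output when the user interrupted.

    `start_times` must be sorted ascending and aligned with `entries`.
    Returns None if the interruption came before the first entry started.
    """
    if not entries or offset_s < start_times[0]:
        return None
    # bisect_right counts the entries that started at or before the offset.
    index = bisect_right(start_times, offset_s) - 1
    return entries[index]


restaurants = ["Pizza Palace", "Burger Barn", "Taco Town"]
starts = [0.0, 2.5, 5.0]  # seconds into the synthesized audio (assumed values)

# The user says "that one" 3.1 seconds after playback began.
selected = entry_at_offset(restaurants, starts, 3.1)
```

A production system might also subtract a reaction-time allowance from the offset before the lookup; that refinement is an assumption and is not stated in the abstract.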
-
Publication No.: US09754589B2
Publication date: 2017-09-05
Application No.: US15256176
Filing date: 2016-09-02
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Ying Shi , Imre Attila Kiss , Ryan Paul Thomas , Frederic Johan Georges Deramat
CPC classification number: G10L15/22 , G06F17/277 , G06F17/278 , G06F17/279 , G06F17/28 , G06F17/2881 , G10L13/08 , G10L15/26
Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
-
Publication No.: US20170116985A1
Publication date: 2017-04-27
Application No.: US15256176
Filing date: 2016-09-02
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Ying Shi , Imre Attila Kiss , Ryan Paul Thomas , Frederic Johan Georges Deramat
CPC classification number: G10L15/22 , G06F17/277 , G06F17/278 , G06F17/279 , G06F17/28 , G06F17/2881 , G10L13/08 , G10L15/26
Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
-
Publication No.: US20150302002A1
Publication date: 2015-10-22
Application No.: US14754598
Filing date: 2015-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Mathias , Ying Shi , Imre Attila Kiss , Ryan Paul Thomas , Frederic Johan Georges Deramat
IPC: G06F17/28
CPC classification number: G10L15/22 , G06F17/277 , G06F17/278 , G06F17/279 , G06F17/28 , G06F17/2881 , G10L13/08 , G10L15/26
Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.