-
Publication number: WO2017053311A1
Publication date: 2017-03-30
Application number: PCT/US2016/052688
Filing date: 2016-09-20
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: MEYERS, James, David; DEAN, Arlen; LIU, Yue; MANDAL, Arindam; MILLER, Daniel; PRAVINCHANDRA, Shah, Samir
CPC classification number: G06F3/167, G10L15/00, G10L15/063, G10L15/1815, G10L15/22, G10L15/222, G10L15/26, G10L15/32, G10L2015/088, G10L2015/223, G10L2015/226
Abstract: A system may use multiple speech interface devices to interact with a user by speech. All or a portion of the speech interface devices may detect a user utterance and may initiate speech processing to determine a meaning or intent of the utterance. Within the speech processing, arbitration is employed to select one of the multiple speech interface devices to respond to the user utterance. Arbitration may be based in part on metadata that directly or indirectly indicates the proximity of the user to the devices, and the device that is deemed to be nearest the user may be selected to respond to the user utterance.
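The arbitration step in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract says only that metadata "directly or indirectly indicates the proximity of the user", so `signal_energy` and `wakeword_confidence` are hypothetical proxies, and `DeviceReport` and `arbitrate` are not names from the publication.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceReport:
    """Metadata one speech interface device sends with a detected utterance."""
    device_id: str
    signal_energy: float        # hypothetical proxy: louder capture suggests a nearer user
    wakeword_confidence: float  # hypothetical tie-breaker


def arbitrate(reports: List[DeviceReport]) -> Optional[str]:
    """Return the id of the device deemed nearest the user, or None if no device reported."""
    if not reports:
        return None
    # Rank by the proximity proxy first, then by the tie-breaker.
    best = max(reports, key=lambda r: (r.signal_energy, r.wakeword_confidence))
    return best.device_id


reports = [
    DeviceReport("kitchen", signal_energy=0.42, wakeword_confidence=0.91),
    DeviceReport("living_room", signal_energy=0.77, wakeword_confidence=0.88),
]
selected = arbitrate(reports)
```

Only the selected device would respond; the others would drop the utterance.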
-
Publication number: WO2022060970A1
Publication date: 2022-03-24
Application number: PCT/US2021/050645
Filing date: 2021-09-16
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: KRISHNAN, Prakash; MANDAL, Arindam; JONNALAGADDA, Siddhartha Reddy; STROM, Nikko; NATARAJAN, Premkumar; PRASAD, Rohit; TAYLOR, Thomas; NATARAJAN, Pradeep; RASTROW, Ariya; VITALADEVUNI, Shiv Naga Prasad; SHI, Ying; TANG, David Chi-Wai; GUPTA, Nishtha; CHALLENNER, Aaron; ZHANG, Xu; ANISETTY, Krishna; ZHENG, Bonan; METALLINOU, Angeliki; AUVRAY, Vincent; SHEN, Minmin; SANDOVAL, Josey Diego; MAIMON, Amotz
Abstract: A system can operate a speech-controlled device in a mode where the speech-controlled device determines that an utterance is directed at the speech-controlled device using image data showing the user speaking the utterance. If the user is directing the user's gaze at the speech-controlled device while speaking, the system may determine the utterance is system directed and thus may perform further speech processing based on the utterance. If the user's gaze is directed elsewhere, the system may determine the utterance is not system directed (for example, directed at another user) and thus the system may not perform further speech processing based on the utterance and may take other actions, for example discarding audio data of the utterance.
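The gating decision described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed method: `gaze_at_device` stands in for the output of an image-based gaze detector that is not modeled here, and `Utterance` and `handle_utterance` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    audio: Optional[bytes]
    gaze_at_device: bool  # hypothetical: result of analyzing image data of the speaker


def handle_utterance(utt: Utterance) -> str:
    """Decide whether an utterance is system directed, based on the speaker's gaze."""
    if utt.gaze_at_device:
        # System directed: hand the audio on for further speech processing.
        return "process"
    # Not system directed (for example, spoken to another person): discard the audio.
    utt.audio = None
    return "discard"


directed = Utterance(audio=b"\x00\x01", gaze_at_device=True)
undirected = Utterance(audio=b"\x00\x01", gaze_at_device=False)
results = (handle_utterance(directed), handle_utterance(undirected))
```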
-
Publication number: WO2017165040A1
Publication date: 2017-09-28
Application number: PCT/US2017/018655
Filing date: 2017-02-21
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: MATHIAS, Lambert; KOLLAR, Thomas; MANDAL, Arindam; METALLINOU, Angeliki
Abstract: A system capable of performing natural language understanding (NLU) without the concept of a domain that influences NLU results. The present system uses a hierarchical organization of intents/commands and entity types, and trained models associated with those hierarchies, so that commands and entity types may be determined for incoming text queries without necessarily determining a domain for the incoming text. The system thus operates in a domain-agnostic manner, in a departure from multi-domain architecture NLU processing where a system determines NLU results for multiple domains simultaneously and then ranks them to determine which to select as the result.
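A toy sketch may help make the hierarchical, domain-agnostic idea concrete. It is illustrative only: the abstract does not say how the hierarchy is traversed, so the greedy top-down descent is an assumption, the intent names are invented, and the keyword-overlap `score` function is a stand-in for the trained models the abstract mentions.

```python
# Hypothetical intent hierarchy: each parent maps to its child intents.
HIERARCHY = {
    "root": ["play", "get_info"],
    "play": ["play_music", "play_video"],
    "get_info": ["get_weather", "get_time"],
}

# Stand-in for per-node trained models: a keyword set per intent node.
KEYWORDS = {
    "play": {"play", "listen", "watch"},
    "get_info": {"what", "tell", "weather", "time"},
    "play_music": {"song", "music", "listen"},
    "play_video": {"video", "movie", "watch"},
    "get_weather": {"weather", "rain", "forecast"},
    "get_time": {"time", "clock"},
}


def score(node, tokens):
    """Toy scorer: number of query tokens that match the node's keywords."""
    return len(KEYWORDS[node] & tokens)


def classify(text):
    """Descend the hierarchy greedily and return the path of chosen intents."""
    tokens = set(text.lower().split())
    node, path = "root", []
    while node in HIERARCHY:
        # Pick the best-scoring child at this level; no domain is chosen first.
        node = max(HIERARCHY[node], key=lambda child: score(child, tokens))
        path.append(node)
    return path


music_path = classify("play a song")
weather_path = classify("what is the weather forecast")
```

The query is resolved directly against the intent hierarchy, rather than being scored per domain and then ranked across domains.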
-
-