-
Publication No.: US11579841B1
Publication date: 2023-02-14
Application No.: US17547802
Filing date: 2021-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Monty Eich, Clare Elizabeth Veladanda, Shiladitya Roy, Rohit Bhattacharjee, Prashant Jayaram Thakare, Nikhil Gupta, Xu Zhang
Abstract: A speech-processing system may provide access to one or more skills via spoken commands and/or responses in the form of synthesized speech. The system may be capable of keeping one or more skills active in the background while a user interacts with (e.g., provides inputs to and/or receives outputs from) a skill running in the foreground. A background skill may receive trigger data and determine to request that the system return the background skill to the foreground to, for example, request a user input regarding an action previously requested by the user. In some cases, the user may invoke a background skill to continue a previous interaction. The system may return the background skill to the foreground. The resumed skill may continue a previous interaction to, for example, query the user for instructions, provide an update or alert, or continue a previous output.
-
Publication No.: US11461779B1
Publication date: 2022-10-04
Application No.: US15934391
Filing date: 2018-03-23
Applicant: Amazon Technologies, Inc.
Inventor: Rohin Dabas, Troy Dean Schuring, Xu Zhang, Maksym Kolodeznyi, Andres Felipe Borja Jaramillo, Nnenna Eleanya Okwara, Alberto Milan Gutierrez, Rashmi Tonge
IPC: G06Q20/40, G10L17/24, G10L15/187, G10L15/26
Abstract: Techniques for transferring control of a system-user dialog session are described. A first speechlet component may interact with a user until the first speechlet component receives user input that the first speechlet component cannot handle. The first speechlet component may output an action representing the user input. A system may determine a second speechlet component configured to execute the action. The system may send the second speechlet component a navigator object that results in the second speechlet component handling the user interaction that the first speechlet component could not handle. Once the second speechlet component is finished processing, the second speechlet component may output an updated navigator object, which causes the first speechlet component to either further interact with a user or cause a current dialog session to be closed. The system may additionally maintain a data structure representing calling speechlet components and called speechlet components associated with the session.
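Illustrative note: the hand-off described in this abstract can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Navigator` fields, the `DialogSession` class, the `registry` mapping from action names to speechlets, and the example speechlet names are all hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Navigator:
    """Hypothetical navigator object handed from one speechlet to another."""
    action: str                   # action the calling speechlet could not handle
    caller: str                   # name of the calling speechlet
    result: Optional[str] = None  # filled in by the called speechlet
    close_session: bool = False   # True if the dialog session should end


Handler = Callable[[Navigator], Navigator]


@dataclass
class DialogSession:
    # Assumed registry: action name -> (speechlet name, handler function).
    registry: Dict[str, Tuple[str, Handler]]
    # Record of (calling speechlet, called speechlet) pairs for this session.
    calls: List[Tuple[str, str]] = field(default_factory=list)

    def transfer(self, caller: str, action: str) -> Navigator:
        """Route an unhandled action to a speechlet that can execute it."""
        if action not in self.registry:
            raise LookupError(f"no speechlet registered for action {action!r}")
        callee, handler = self.registry[action]
        self.calls.append((caller, callee))
        # The called speechlet returns an updated navigator to the caller.
        return handler(Navigator(action=action, caller=caller))


def payment_speechlet(nav: Navigator) -> Navigator:
    """Hypothetical second speechlet that completes the requested action."""
    return replace(nav, result="payment authorized")


session = DialogSession(registry={"PAY": ("payments", payment_speechlet)})
updated = session.transfer(caller="ride_booking", action="PAY")
```

In this sketch the first speechlet would inspect `updated.close_session` to decide whether to keep interacting with the user or to end the session.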
-
Publication No.: US20220093093A1
Publication date: 2022-03-24
Application No.: US17112227
Filing date: 2020-12-04
Applicant: Amazon Technologies, Inc.
Inventor: Prakash Krishnan, Arindam Mandal, Nikko Strom, Pradeep Natarajan, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, David Chi-Wai Tang, Aaron Challenner, Xu Zhang, Krishna Anisetty, Josey Diego Sandoval, Rohit Prasad, Premkumar Natarajan
Abstract: A system can operate a speech-controlled device in a mode where the speech-controlled device determines that an utterance is directed at the speech-controlled device using image data showing the user speaking the utterance. If the user is directing the user's gaze at the speech-controlled device while speaking, the system may determine the utterance is system directed and thus may perform further speech processing based on the utterance. If the user's gaze is directed elsewhere, the system may determine the utterance is not system directed (for example, directed at another user) and thus the system may not perform further speech processing based on the utterance and may take other actions, for example, discarding audio data of the utterance.
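Illustrative note: the gating decision in this abstract can be sketched in a few lines of Python. The sketch assumes a gaze estimate is already available as yaw and pitch angles relative to the device. The `GazeEstimate` type, the 15-degree threshold, and the function names are hypothetical and do not come from the patent, which does not state how the gaze determination is made.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GazeEstimate:
    """Hypothetical gaze angles relative to the device, in degrees."""
    yaw_deg: float
    pitch_deg: float


def is_system_directed(gaze: GazeEstimate, max_offset_deg: float = 15.0) -> bool:
    """Treat the utterance as system directed if the gaze is near the device.

    The threshold is an assumed value chosen for illustration only.
    """
    return (abs(gaze.yaw_deg) <= max_offset_deg
            and abs(gaze.pitch_deg) <= max_offset_deg)


def handle_utterance(audio: bytes, gaze: GazeEstimate) -> Optional[bytes]:
    """Return audio for further speech processing, or None to discard it."""
    if is_system_directed(gaze):
        return audio  # forward to further speech processing
    return None       # not system directed: drop the audio data


kept = handle_utterance(b"turn on the lights", GazeEstimate(3.0, -2.0))
dropped = handle_utterance(b"what do you think?", GazeEstimate(40.0, 0.0))
```

Here `kept` holds the audio for an utterance spoken while looking at the device, and `dropped` is `None` because the gaze was directed elsewhere.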
-
Publication No.: US11681364B1
Publication date: 2023-06-20
Application No.: US17361939
Filing date: 2021-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Xu Zhang, Yue Wu, Varsha Hedau, Shih-Fu Chang, Pradeep Natarajan
Abstract: An image processing system may receive image data from a camera of a user device and perform gaze prediction processing of the image data to predict one or more gaze patterns. The gaze prediction processing may include processing the image data using a neural network to detect faces and/or objects and generate an image feature map. The gaze prediction processing may include performing gaze direction prediction operations using the feature map and detected faces and/or objects to determine gaze direction probability data. The gaze prediction processing may include predicting a gaze pattern based on the gaze direction probability data and the image feature map. The gaze pattern may be short-term (e.g., atomic-level) or long-term (e.g., event-level).
-