-
Publication number: US12205577B1
Publication date: 2025-01-21
Application number: US17217031
Filing date: 2021-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Taehwan Kim, Sanqiang Zhao, Robinson Piramuthu, Seokhwan Kim, Yang Liu, Gokhan Tur, Eshan Bhatnagar
Abstract: Techniques for rendering visual content, in response to one or more utterances, are described. A device receives one or more utterances that define a parameter(s) for desired output content. A system (or the device) identifies natural language data corresponding to the desired content, and uses natural language generation processes to update the natural language data based on the parameter(s). The system (or the device) then generates an image based on the updated natural language data. The system (or the device) also generates video data of an avatar. The device displays the image and the avatar, and synchronizes movements of the avatar with output of synthesized speech of the updated natural language data. The device may also display subtitles of the updated natural language data, and cause a word of the subtitles to be emphasized when synthesized speech of the word is being output.
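The subtitle emphasis step in the abstract above can be illustrated with a minimal sketch. It assumes the speech synthesizer reports a start and end time for each word; the function names, the timing format, and the use of upper case as the "emphasis" are hypothetical choices for illustration and are not taken from the patent.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def active_word_index(timings: List[Tuple[float, float]], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t, or None.

    timings holds (start_s, end_s) per word, sorted by start time
    (assumed format, e.g. from TTS word-boundary events).
    """
    starts = [start for start, _ in timings]
    i = bisect_right(starts, t) - 1  # last word that started at or before t
    if i >= 0 and t < timings[i][1]:
        return i
    return None  # silence between words, or before/after the utterance


def render_subtitle(words: List[str], timings: List[Tuple[float, float]], t: float) -> str:
    """Render the subtitle line with the currently spoken word in upper case."""
    idx = active_word_index(timings, t)
    return " ".join(w.upper() if i == idx else w for i, w in enumerate(words))


words = ["the", "cat", "sat"]
timings = [(0.0, 0.3), (0.35, 0.7), (0.7, 1.1)]
line_at_half_second = render_subtitle(words, timings, 0.5)
```

A display loop would call `render_subtitle` with the current audio playback position on each frame, so that the emphasized word tracks the synthesized speech.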
-
Publication number: US12087320B1
Publication date: 2024-09-10
Application number: US17671194
Filing date: 2022-02-14
Applicant: Amazon Technologies, Inc.
Inventor: Qin Zhang, Qingming Tang, Ming Sun, Chao Wang, Steve Mark Lorusso, Andrew Thomas Bydlon, James Garnet Droppo, Viktor Rozgic, Sripal Mehta, Yang Liu
CPC classification number: G10L25/51, G10L15/1815, G10L15/22, G10L15/30
Abstract: A system may be configured to detect custom acoustic events, where the system generates an acoustic event profile for the custom acoustic event based on a natural language description provided by a user and using an audio sample of the described acoustic event. For example, the user may describe the custom acoustic event as “dog bark.” The system may ask the user questions to refine the description (e.g., dog breed, dog gender, age, etc.). Using an audio sample of the refined description, the system may then determine that audio captured in the user's environment is a potential sample of the custom acoustic event. Such captured audio may be presented to the user for confirmation, and then may be used to detect future occurrences of the custom acoustic event in the user's environment.
-
Publication number: US20240428787A1
Publication date: 2024-12-26
Application number: US18340342
Filing date: 2023-06-23
Applicant: Amazon Technologies, Inc.
Inventor: Mahdi Namazifar, Di Jin, Yang Liu, Devamanyu Hazarika, Dilek Hakkani-Tur, Yubin Ge
IPC: G10L15/22, G06F40/295, G10L15/183
Abstract: Techniques for constraining the results of a generative language model to valid information using knowledge-grounded documentation are described. A generative language model may generate invalid results, including compound entities and incorrect entity relations. The techniques include, for a given user inquiry, determining a set of documented information, from a particular knowledge base, that corresponds to the user inquiry. The techniques further include determining a subgraph from a knowledge graph representing the knowledge base, as well as determining a trie data structure representation of the set of documented information. The user inquiry and subgraph are provided as input to a trained generative language model for generating a response to the user inquiry. The techniques include using the trie data structure to validate that the generated response corresponds to real information from the set of documented information.
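The trie-based validation step can be sketched as follows. This is a simplified illustration, assuming whitespace tokenization and that the entity mentions have already been extracted from the generated response; `TrieNode`, `build_trie`, `is_documented`, and `invalid_entities` are hypothetical names, and the example phrases are invented.

```python
from typing import Dict, Iterable, List


class TrieNode:
    """One node of a token-level trie over documented phrases."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.terminal = False  # True if a documented phrase ends here


def tokenize(text: str) -> List[str]:
    # Assumption: lower-cased whitespace tokens are sufficient for the sketch.
    return text.lower().split()


def build_trie(phrases: Iterable[str]) -> TrieNode:
    root = TrieNode()
    for phrase in phrases:
        node = root
        for token in tokenize(phrase):
            node = node.children.setdefault(token, TrieNode())
        node.terminal = True
    return root


def is_documented(root: TrieNode, phrase: str) -> bool:
    """True only if the whole phrase is a complete entry in the trie."""
    node = root
    for token in tokenize(phrase):
        node = node.children.get(token)
        if node is None:
            return False
    return node.terminal


def invalid_entities(root: TrieNode, entities: Iterable[str]) -> List[str]:
    """Return the generated entity mentions that are not documented."""
    return [e for e in entities if not is_documented(root, e)]


trie = build_trie(["Grand Hotel", "Royal Palace Cafe"])
rejected = invalid_entities(trie, ["Grand Hotel", "Grand Palace Cafe"])
```

Here "Grand Palace Cafe" mixes tokens from two documented names, which is the kind of compound entity the abstract describes, so it is rejected. A prefix such as "Royal Palace" is also rejected because no documented phrase ends at that node.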
-
Publication number: US12205483B1
Publication date: 2025-01-21
Application number: US18341312
Filing date: 2023-06-26
Applicant: Amazon Technologies, Inc.
Inventor: Yibo Cao, Chong Huang, Dawei Li, Yang Liu, Kah Kuen Fu
IPC: G08G5/00, B64U10/13, B64U101/70
Abstract: An aerial vehicle is configured to calculate ranges to objects around the aerial vehicle when operating within indoor spaces, using a LIDAR sensor or another range sensor. The aerial vehicle calculates ranges within a plurality of sectors around the aerial vehicle and identifies a minimum distance measurement for each of the sectors. Sets of adjacent sectors having distance measurements above a threshold are identified, and bearings and minimum distance measurements of the sets of adjacent sectors are determined. When the aerial vehicle detects an object within a flight path, the aerial vehicle selects one of the sets of adjacent sectors based on the minimum distance measurements and executes a braking maneuver in a direction of the selected one of the sets of adjacent sectors.
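The sector logic in the abstract above can be sketched roughly as follows. The abstract does not say how a set is chosen "based on the minimum distance measurements", so this sketch assumes the set with the largest minimum distance wins and that the bearing of a set is the center of its angular span. All function names, the input format of (bearing in degrees, range in meters), and the sample values are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def sector_minimums(returns: List[Tuple[float, float]], num_sectors: int) -> List[float]:
    """Minimum range per sector; math.inf where a sector has no returns."""
    width = 360.0 / num_sectors
    mins = [math.inf] * num_sectors
    for bearing, rng in returns:
        idx = int((bearing % 360.0) // width) % num_sectors
        mins[idx] = min(mins[idx], rng)
    return mins


def open_sector_sets(mins: List[float], threshold: float) -> List[List[int]]:
    """Group adjacent sectors whose minimum exceeds the threshold (with wrap-around)."""
    n = len(mins)
    is_open = [m > threshold for m in mins]
    if all(is_open):
        return [list(range(n))]
    # Start just after a blocked sector so a run crossing 360/0 stays in one set.
    start = next(i for i in range(n) if not is_open[i])
    sets: List[List[int]] = []
    current: List[int] = []
    for k in range(1, n + 1):
        i = (start + k) % n
        if is_open[i]:
            current.append(i)
        elif current:
            sets.append(current)
            current = []
    return sets


def choose_braking_bearing(mins: List[float], threshold: float) -> Optional[Tuple[float, float]]:
    """Return (bearing_deg, min_distance) of the chosen set, or None if none is open."""
    sets = open_sector_sets(mins, threshold)
    if not sets:
        return None
    width = 360.0 / len(mins)
    # Assumed selection rule: prefer the set with the largest minimum distance.
    best = max(sets, key=lambda s: min(mins[i] for i in s))
    bearing = (best[0] * width + len(best) * width / 2.0) % 360.0
    return bearing, min(mins[i] for i in best)


scan = [(10, 1.0), (50, 5.0), (100, 6.0), (140, 0.5),
        (190, 3.0), (230, 3.5), (280, 0.8), (320, 4.0)]
mins = sector_minimums(scan, num_sectors=8)
choice = choose_braking_bearing(mins, threshold=2.0)
```

With eight 45-degree sectors and a 2.0 m threshold, the open sets are sectors 1 to 2, sectors 4 to 5, and sector 7. The first set has the largest minimum distance (5.0 m), so the sketch returns a bearing of 90 degrees.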
-
Publication number: US11501794B1
Publication date: 2022-11-15
Application number: US16875425
Filing date: 2020-05-15
Applicant: Amazon Technologies, Inc.
Inventor: Yelin Kim, Yang Liu, Dilek Hakkani-Tur, Thomas Nelson, Anna Chen Santos, Joshua Levy, Saurabh Gupta
IPC: G10L25/63, G10L15/26, G10L15/18, H04N5/247, H04N5/232, G05D1/00, G05D1/02, G06T7/70, G06V20/10, G06V40/10, G06V40/16
Abstract: Described herein is a system for improving sentiment detection and/or recognition using multiple inputs. For example, an autonomously motile device is configured to generate audio data and/or image data and perform sentiment detection processing. The device may process the audio data and the image data using a multimodal temporal attention model to generate sentiment data that estimates a sentiment score and/or a sentiment category. In some examples, the device may also process language data (e.g., lexical information) using the multimodal temporal attention model. The device can adjust its operations based on the sentiment data. For example, the device may improve an interaction with the user by estimating the user's current emotional state, or can change a position of the device and/or sensor(s) of the device relative to the user to improve an accuracy of the sentiment data.