Publication No.: US12205577B1
Publication Date: 2025-01-21
Application No.: US17217031
Filing Date: 2021-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Taehwan Kim, Sanqiang Zhao, Robinson Piramuthu, Seokhwan Kim, Yang Liu, Gokhan Tur, Eshan Bhatnagar
Abstract: Techniques for rendering visual content in response to one or more utterances are described. A device receives one or more utterances that define one or more parameters for desired output content. A system (or the device) identifies natural language data corresponding to the desired content, and uses natural language generation processes to update the natural language data based on the parameters. The system (or the device) then generates an image based on the updated natural language data. The system (or the device) also generates video data of an avatar. The device displays the image and the avatar, and synchronizes movements of the avatar with output of synthesized speech of the updated natural language data. The device may also display subtitles of the updated natural language data, and cause a word of the subtitles to be emphasized while synthesized speech of that word is being output.
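
The subtitle emphasis described at the end of the abstract can be illustrated with a minimal sketch. It is not taken from the patent: it assumes the speech synthesizer supplies a start and end time for each word, and the names `WordTiming`, `active_word_index`, and `render_subtitle` are hypothetical. Given the current playback time, the sketch finds the word being spoken and marks it with asterisks.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordTiming:
    """Hypothetical per-word timing, assumed to come from the TTS engine."""
    word: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def active_word_index(timings: List[WordTiming], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t, or None in a pause."""
    starts = [w.start for w in timings]
    # Last word whose start time is <= t (timings are assumed sorted by start).
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < timings[i].end:
        return i
    return None


def render_subtitle(timings: List[WordTiming], t: float) -> str:
    """Render the subtitle line, wrapping the currently spoken word in asterisks."""
    idx = active_word_index(timings, t)
    return " ".join(
        f"*{w.word}*" if i == idx else w.word
        for i, w in enumerate(timings)
    )


timings = [
    WordTiming("add", 0.0, 0.3),
    WordTiming("two", 0.3, 0.6),
    WordTiming("eggs", 0.7, 1.1),
]
print(render_subtitle(timings, 0.4))   # emphasizes "two"
print(render_subtitle(timings, 0.65))  # pause between words, no emphasis
```

A display loop would call `render_subtitle` with the audio player's current position on each frame. The same timing table could, under the same assumption, drive the avatar's mouth movements so that both stay aligned with the speech output.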