-
Publication number: US20240323459A1
Publication date: 2024-09-26
Application number: US18419123
Application date: 2024-01-22
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Anthony Sylvain Jean-Yves Liot , Anil Unnikrishnan , Sajid Sadi , Sandipan Banerjee , Vignesh Gokul , Janvi Chetan Palan , Hyun Jae Kang , Ondrej Texler
IPC: H04N21/231 , H04N21/266 , H04N21/845
CPC classification number: H04N21/23106 , H04N21/266 , H04N21/8455
Abstract: Computer-implemented content delivery includes caching a source content having a plurality of bridge points. A user event indicating an interaction with a user is received. The user event is received during playback of the source content. In response to the user event, a template is selected from a plurality of cached templates. The template corresponds to a bridge point selected from the plurality of bridge points of the source content as an exit point from the source content. In response to the user event, a bridge is dynamically generated. The bridge links the bridge point with the template. In response to the user event, a target content is selected from a plurality of cached target contents. In response to the user event, the bridge, the template, and the target content are conveyed to a device of the user for playback following the bridge point of the source content.
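Illustrative sketch (not part of the patent record): the flow in this abstract can be pictured as a small selection routine. Everything below is an assumption made for illustration, including the names BridgePoint, UserEvent, select_bridge_point, and respond, the policy of exiting at the earliest bridge point at or after the event, the lookup of target content by event kind, and the string that stands in for the dynamically generated bridge. The abstract does not specify any of these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgePoint:
    id: str
    time_s: float  # position in the source content where playback may exit


@dataclass(frozen=True)
class UserEvent:
    kind: str      # hypothetical event label, e.g. "question"
    time_s: float  # playback position when the event arrived


def select_bridge_point(points, event):
    """Pick the earliest bridge point at or after the event time (assumed policy)."""
    upcoming = [p for p in points if p.time_s >= event.time_s]
    if not upcoming:
        raise ValueError("no bridge point remains after the event")
    return min(upcoming, key=lambda p: p.time_s)


def respond(points, templates, targets, event):
    """Return the chosen exit point and the items to play after it."""
    exit_point = select_bridge_point(points, event)
    template = templates[exit_point.id]             # cached template for this exit point
    bridge = f"bridge:{exit_point.id}->{template}"  # placeholder for generated bridge media
    target = targets[event.kind]                    # cached target content (assumed keyed by event kind)
    return exit_point, [bridge, template, target]
```

In this sketch the caches are plain dictionaries, so only the bridge is produced at event time; the template and target content are looked up rather than generated, which mirrors the split between cached and dynamically generated items described in the abstract.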
-
Publication number: US20240013464A1
Publication date: 2024-01-11
Application number: US18296202
Application date: 2023-04-05
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Siddarth Ravichandran , Dimitar Petkov Dinev , Ondrej Texler , Ankur Gupta , Janvi Chetan Palan , Hyun Jae Kang , Anthony Sylvain Jean-Yves Liot , Sajid Sadi
CPC classification number: G06T13/40 , G06T13/205 , G06T5/50 , G06T7/13 , G06T7/73 , G06V10/761 , G06T2207/20081 , G06T2207/20221 , G06T2207/30201
Abstract: Multimodal disentanglement can include generating a set of silhouette images corresponding to a human face, the generating undoing a correlation between an upper portion and a lower portion of the human face depicted by each silhouette image. A unimodal machine learning model can be trained with the set of silhouette images. As trained, the unimodal machine learning model can generate synthetic images of the human face. The synthetic images generated by the unimodal machine learning model once trained can be used to train a multimodal rendering network. The multimodal rendering network can be trained to generate a voice-animated digital human. Training the multimodal rendering network can be based on minimizing differences between the synthetic images and images generated by the multimodal rendering network.
-
Publication number: US20240221260A1
Publication date: 2024-07-04
Application number: US18342721
Application date: 2023-06-27
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Dimitar Petkov Dinev , Ondrej Texler , Siddarth Ravichandran , Janvi Chetan Palan , Hyun Jae Kang , Ankur Gupta , Anil Unnikrishnan , Anthony Sylvain Jean-Yves Liot , Sajid Sadi
CPC classification number: G06T13/40 , G06T13/205 , G06T19/20 , G06V40/174 , G06V40/20 , G06T2219/2004
Abstract: Synthesizing speech and movement of a virtual human includes capturing supplemental data generated by a transducer. The supplemental data specifies one or more attributes of a user. The capturing is performed in substantially real-time with the user providing input to a conversational platform. A behavior determiner generates behavioral data based on the supplemental data and an audio response generated by the conversational platform in response to the input to the conversational platform. Based on the behavioral data and the audio response, a rendering network generates a video rendering of a virtual human engaging in a conversation with the user, the video rendering synchronized with the audio response.
-
Publication number: US20230394732A1
Publication date: 2023-12-07
Application number: US17967872
Application date: 2022-10-17
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Siddarth Ravichandran , Anthony Sylvain Jean-Yves Liot , Dimitar Petkov Dinev , Ondrej Texler , Hyun Jae Kang , Janvi Chetan Palan , Sajid Sadi
CPC classification number: G06T13/40 , G06T17/20 , G06T7/70 , G10L15/25 , G06T2207/30201
Abstract: Creating images and animations of lip motion from mouth shape data includes providing, as one or more input features to a neural network model, a vector of a plurality of coefficients. Each vector of the plurality of coefficients corresponds to a different mouth shape. Using the neural network model, a data structure output specifying a visual representation of a mouth including lips having a shape corresponding to the vector is generated.
-