-
1.
Publication No.: US20240203014A1
Publication Date: 2024-06-20
Application No.: US18299248
Filing Date: 2023-04-12
Applicant: Samsung Electronics Co., Ltd.
Inventor: Liang Zhao, Siva Penke
CPC classification number: G06T13/205, G06T13/40, G10L17/02, G10L17/04, G10L17/18
Abstract: A method includes obtaining, using at least one processing device of an electronic device, an audio input associated with a speaker. The method also includes extracting, using a feature extractor of a trained machine learning model, audio features from the audio input. The method further includes generating (i) one or more content parameter predictions using content embeddings extracted by a content encoder and decoded by a content decoder of the trained machine learning model and (ii) one or more style parameter predictions using style embeddings extracted by a style encoder and decoded by a style decoder of the trained machine learning model. The content embeddings and the style embeddings are based on the audio features of the audio input. The trained machine learning model is trained to generate the one or more content parameter predictions and the one or more style parameter predictions using disentangled content and style embeddings.
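The abstract describes a two-branch architecture: a shared feature extractor feeding separate content and style encoder/decoder pairs. Below is a minimal PyTorch sketch of that shape; the class name, layer types, and all dimensions are illustrative assumptions, not the patented design, and the disentanglement itself would come from training losses the abstract does not detail.
```python
# Hypothetical sketch of the disentangled content/style pipeline described
# in the abstract. Layer choices, sizes, and names are assumptions.
import torch
import torch.nn as nn

class ContentStylePredictor(nn.Module):
    def __init__(self, n_audio_feats=80, emb_dim=128,
                 n_content_params=52, n_style_params=16):
        super().__init__()
        # Shared feature extractor over per-frame audio features.
        self.feature_extractor = nn.Sequential(
            nn.Linear(n_audio_feats, 256), nn.ReLU())
        # Separate encoders produce (ideally disentangled) embeddings.
        self.content_encoder = nn.GRU(256, emb_dim, batch_first=True)
        self.style_encoder = nn.GRU(256, emb_dim, batch_first=True)
        # Separate decoders map embeddings to parameter predictions, e.g.
        # per-frame animation parameters (content) and an utterance-level
        # speaking-style vector (style).
        self.content_decoder = nn.Linear(emb_dim, n_content_params)
        self.style_decoder = nn.Linear(emb_dim, n_style_params)

    def forward(self, audio_feats):  # (batch, frames, n_audio_feats)
        h = self.feature_extractor(audio_feats)
        content_emb, _ = self.content_encoder(h)  # per-frame embeddings
        _, style_state = self.style_encoder(h)    # utterance-level state
        content_params = self.content_decoder(content_emb)
        style_params = self.style_decoder(style_state[-1])
        return content_params, style_params

model = ContentStylePredictor()
feats = torch.randn(1, 200, 80)  # e.g. 200 frames of 80-dim mel features
content, style = model(feats)
```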
-
2.
Publication No.: US12154204B2
Publication Date: 2024-11-26
Application No.: US17673645
Filing Date: 2022-02-16
Applicant: Samsung Electronics Co., Ltd.
Inventor: Liang Zhao, Siva Penke
Abstract: A method includes obtaining a speech segment. The method also includes generating, using at least one processing device of an electronic device, context-independent features and context-dependent features of the speech segment. The method further includes decoding, using the at least one processing device of the electronic device, a first viseme based on the context-independent features. The method also includes decoding, using the at least one processing device of the electronic device, a second viseme based on the context-dependent features and the first viseme. In addition, the method includes generating, using the at least one processing device of the electronic device, an output viseme based on the first and second visemes, where the output viseme is associated with a visual animation of the speech segment.
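A minimal PyTorch sketch of the two-stage decoding the abstract describes: the first decoder sees only context-independent features, the second sees context-dependent features together with the first viseme, and the two are fused into the output viseme. All names, dimensions, the viseme vocabulary size, and the fusion layer are assumptions, not the patented implementation.
```python
# Hypothetical sketch of the two-stage viseme decoding in the abstract.
import torch
import torch.nn as nn

N_VISEMES = 20  # assumed viseme vocabulary size

class TwoStageVisemeDecoder(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # First decoder: context-independent features -> first viseme logits.
        self.decoder1 = nn.Linear(feat_dim, N_VISEMES)
        # Second decoder: context-dependent features plus the first
        # viseme prediction -> second viseme logits.
        self.decoder2 = nn.Linear(feat_dim + N_VISEMES, N_VISEMES)
        # Fusion producing the output viseme logits from both stages.
        self.fuse = nn.Linear(2 * N_VISEMES, N_VISEMES)

    def forward(self, ci_feats, cd_feats):
        v1 = self.decoder1(ci_feats)                       # first viseme
        v2 = self.decoder2(torch.cat([cd_feats, v1], -1))  # second viseme
        out = self.fuse(torch.cat([v1, v2], -1))           # output viseme
        return out.argmax(-1), v1, v2

decoder = TwoStageVisemeDecoder()
ci = torch.randn(1, 128)  # context-independent features of a segment
cd = torch.randn(1, 128)  # context-dependent features of the same segment
viseme_id, _, _ = decoder(ci, cd)
```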
-
3.
Publication No.: US20230130287A1
Publication Date: 2023-04-27
Application No.: US17673645
Filing Date: 2022-02-16
Applicant: Samsung Electronics Co., Ltd.
Inventor: Liang Zhao, Siva Penke
Abstract: A method includes obtaining a speech segment. The method also includes generating, using at least one processing device of an electronic device, context-independent features and context-dependent features of the speech segment. The method further includes decoding, using the at least one processing device of the electronic device, a first viseme based on the context-independent features. The method also includes decoding, using the at least one processing device of the electronic device, a second viseme based on the context-dependent features and the first viseme. In addition, the method includes generating, using the at least one processing device of the electronic device, an output viseme based on the first and second visemes, where the output viseme is associated with a visual animation of the speech segment.
-
4.
Publication No.: US20250104318A1
Publication Date: 2025-03-27
Application No.: US18601097
Filing Date: 2024-03-11
Applicant: Samsung Electronics Co., Ltd.
Inventor: Liang Zhao, Siva Penke, Christopher Peri, Byeonghee Yu, Jisun Park
Abstract: In one embodiment, a method includes accessing an audio input that includes a mixture of vocal sounds and non-vocal sounds and separating, by a trained audio source separation model, the audio input into a first audio output representing the vocal sounds and a second audio output representing the non-vocal sounds. The method further includes determining, by one or more trained avatar animation models and by separately encoding the first audio output representing the vocal sounds and the second audio output representing the non-vocal sounds, an avatar animation temporally corresponding to the audio input; and rendering, in real time and temporally coincident with the audio input, the determined avatar animation.
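A minimal PyTorch sketch of the pipeline shape this abstract describes: split the mixture into vocal and non-vocal streams, encode each stream separately, and decode temporally aligned per-frame animation parameters. The separator below is a trivial stand-in for a trained source separation model, and every layer choice and dimension is an assumption rather than the patented design.
```python
# Hypothetical sketch of the separate-then-encode avatar animation pipeline.
import torch
import torch.nn as nn

class AvatarAnimationPipeline(nn.Module):
    def __init__(self, n_feats=80, emb_dim=128, n_anim_params=52):
        super().__init__()
        # Stand-in for a trained audio source separation model that splits
        # a mixture into vocal and non-vocal outputs.
        self.separator = nn.Linear(n_feats, 2 * n_feats)
        # Separate encoders for the vocal and non-vocal streams.
        self.vocal_encoder = nn.GRU(n_feats, emb_dim, batch_first=True)
        self.nonvocal_encoder = nn.GRU(n_feats, emb_dim, batch_first=True)
        # Animation head combining both embeddings per frame.
        self.head = nn.Linear(2 * emb_dim, n_anim_params)

    def forward(self, mixture_feats):  # (batch, frames, n_feats)
        vocal, nonvocal = self.separator(mixture_feats).chunk(2, dim=-1)
        v_emb, _ = self.vocal_encoder(vocal)
        n_emb, _ = self.nonvocal_encoder(nonvocal)
        # Per-frame animation parameters, temporally aligned with the audio.
        return self.head(torch.cat([v_emb, n_emb], dim=-1))

pipeline = AvatarAnimationPipeline()
frames = pipeline(torch.randn(1, 300, 80))  # 300 audio frames -> 300 poses
```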