FACE-SPEECH BRIDGING BY CYCLE VIDEO/AUDIO RECONSTRUCTION

    Publication (Announcement) No.: WO2021076266A1

    Publication (Announcement) Date: 2021-04-22

    Application No.: PCT/US2020/051371

    Filing Date: 2020-09-18

    Abstract: In an embodiment, a method for face-speech bridging by cycle video/audio reconstruction is described. The method comprises encoding audio data and video data via mutual autoencoders that comprise an audio autoencoder and a video autoencoder, wherein the mutual autoencoders share a common space with corresponding embeddings derived by each of the audio autoencoder and the video autoencoder. Additionally, the method comprises substituting, in real time, embeddings from a non-corrupted modality for corresponding corrupted embeddings in a corrupted modality, based at least in part on corrupted audio data or corrupted video data. The method also comprises synthesizing reconstructed audio data and reconstructed video data based, at least in part, on the substituted embeddings.
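
    The substitution step in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the function name `substitute_embeddings`, the per-time-step corruption flags, and the representation of embeddings as plain lists of floats are all assumptions made for illustration. The sketch only presumes what the abstract states, namely that audio and video embeddings live in a common space, so an embedding from the non-corrupted modality can stand in for the corresponding corrupted one before decoding.

```python
from typing import List, Sequence, Tuple

Embedding = List[float]  # assumption: one shared-space vector per time step


def substitute_embeddings(
    audio_emb: Sequence[Embedding],
    video_emb: Sequence[Embedding],
    audio_corrupted: Sequence[bool],
    video_corrupted: Sequence[bool],
) -> Tuple[List[Embedding], List[Embedding]]:
    """Hypothetical helper: per time step, replace the embedding of a
    corrupted modality with the corresponding embedding of the other,
    non-corrupted modality. If both or neither are corrupted, both
    embeddings are left unchanged."""
    fused_audio: List[Embedding] = []
    fused_video: List[Embedding] = []
    for a, v, a_bad, v_bad in zip(
        audio_emb, video_emb, audio_corrupted, video_corrupted
    ):
        if a_bad and not v_bad:
            a = v  # audio is corrupted: borrow the video embedding
        elif v_bad and not a_bad:
            v = a  # video is corrupted: borrow the audio embedding
        fused_audio.append(list(a))
        fused_video.append(list(v))
    return fused_audio, fused_video


# Toy usage: step 1 has corrupted audio, step 2 has corrupted video.
audio = [[1.0, 2.0], [0.0, 0.0], [5.0, 6.0]]
video = [[1.1, 2.1], [3.0, 4.0], [9.9, 9.9]]
fused_a, fused_v = substitute_embeddings(
    audio, video, [False, True, False], [False, False, True]
)
```

    In a full system the fused embeddings would then be passed to the audio and video decoders to synthesize the reconstructed streams; the decoders and the corruption detector are omitted here because the abstract does not specify them.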
