-
Publication No.: WO2022271291A1
Publication Date: 2022-12-29
Application No.: PCT/US2022/029083
Application Date: 2022-05-13
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: JOZE, Hamidreza Vaezi
IPC: G06T7/90 , G06T5/00 , G06N20/00 , G06T2207/10024 , G06T2207/20081 , G06T2207/20084 , G06T5/001 , G06T5/007 , G06T5/50 , G06T7/11 , G06T7/174 , G06V10/60
Abstract: An image processor receives first image data representing an image. The first image data comprises a plurality of color values corresponding to a plurality of pixels in the image. The image processor determines, using a trained machine learning model, second image data based on the first image data. The second image data comprises surface spectral reflection values corresponding to the plurality of pixels in the image, where the surface spectral reflection values are distributed across a plurality of wavelengths of visible light in the image. The image processor then performs at least one image processing operation with respect to the image using the second image data.
-
Publication No.: WO2021076266A1
Publication Date: 2021-04-22
Application No.: PCT/US2020/051371
Application Date: 2020-09-18
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: JOZE, Hamidreza Vaezi , AKBARI, Hassan
IPC: G11B27/036 , G06K9/00
Abstract: In an embodiment described herein, a method for face-speech bridging by cycle video/audio reconstruction is described. The method comprises encoding audio data and video data via mutual autoencoders that comprise an audio autoencoder and a video autoencoder, wherein the mutual autoencoders share a common space with corresponding embeddings derived by each of the audio autoencoder and the video autoencoder. Additionally, the method comprises substituting embeddings from a non-corrupted modality for corresponding corrupted embeddings in a corrupted modality in real time based at least in part on corrupted audio data or corrupted video data. The method also comprises synthesizing reconstructed audio data and reconstructed video data based at least in part on the substituted embeddings.
-