Systems and Methods for Upmixing Audiovisual Data

    Publication number: US20230308823A1

    Publication date: 2023-09-28

    Application number: US18042258

    Filing date: 2020-08-26

    CPC classification number: H04S7/301 H04S2400/01

    Abstract: A computer-implemented method for upmixing audiovisual data can include obtaining audiovisual data including input audio data and video data accompanying the input audio data. Each frame of the video data can depict only a portion of a larger scene. The input audio data can have a first number of audio channels. The computer-implemented method can include providing the audiovisual data as input to a machine-learned audiovisual upmixing model. The audiovisual upmixing model can include a sequence-to-sequence model configured to model a respective location of one or more audio sources within the larger scene over multiple frames of the video data. The computer-implemented method can include receiving upmixed audio data from the audiovisual upmixing model. The upmixed audio data can have a second number of audio channels. The second number of audio channels can be greater than the first number of audio channels.
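The data flow in this abstract (fewer input channels in, per-frame source location tracked over time, more channels out) can be illustrated with a toy sketch. It is not the claimed machine-learned model: the sequence-to-sequence model is replaced by simple exponential smoothing, the upmix is fixed at mono to stereo with constant-power panning, and the function name `upmix_mono_to_stereo`, the `positions` hints (horizontal source position in [-1, 1], or `None` when the source is outside the visible frame), and the `smoothing` parameter are all hypothetical.

```python
import math
from typing import List, Optional


def upmix_mono_to_stereo(
    mono: List[float],
    positions: List[Optional[float]],
    smoothing: float = 0.5,
) -> List[List[float]]:
    """Toy stand-in for an audiovisual upmixing model (assumption, not the patent's model).

    mono:      one audio value per video frame (first number of channels = 1).
    positions: hypothetical per-frame horizontal source position in [-1, 1],
               or None when the source is outside the depicted part of the scene.
    Returns one [left, right] pair per frame (second number of channels = 2).
    """
    if len(mono) != len(positions):
        raise ValueError("mono and positions must have the same length")

    estimate = 0.0  # running location estimate, starts centred
    stereo: List[List[float]] = []
    for sample, pos in zip(mono, positions):
        if pos is not None:
            # Blend the new observation into the state carried across frames.
            estimate = smoothing * estimate + (1.0 - smoothing) * pos
        # When pos is None the previous estimate is kept, so a source that
        # leaves the visible frame still has a location in the larger scene.
        angle = (estimate + 1.0) * math.pi / 4.0  # map [-1, 1] to [0, pi/2]
        stereo.append([sample * math.cos(angle), sample * math.sin(angle)])
    return stereo
```

Calling `upmix_mono_to_stereo([1.0, 1.0], [1.0, None], smoothing=0.0)` pans both frames toward the right channel, because the second frame reuses the location estimated from the first. A learned sequence model would replace the smoothing step and could target any output channel count greater than the input count.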

    Systems and methods that leverage deep learning to selectively store audiovisual content

    Publication number: US10372991B1

    Publication date: 2019-08-06

    Application number: US15944415

    Filing date: 2018-04-03

    Applicant: Google LLC

    Abstract: Systems, methods, and devices for curating audiovisual content are provided. A mobile image capture device can be operable to capture one or more images; receive an audio signal; analyze at least a portion of the audio signal with a first machine-learned model to determine a first audio classifier label descriptive of an audio event; identify a first image associated with the first audio classifier label; analyze the first image with a second machine-learned model to determine a desirability of a scene depicted by the first image; and determine, based at least in part on the desirability of the scene depicted by the first image, whether to store a copy of the first image associated with the first audio classifier label in the non-volatile memory of the mobile image capture device or to discard the first image without storing a copy of the first image.
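The store-or-discard decision described in this abstract can be sketched as a short control flow. Both machine-learned models are passed in as plain callables, and the names `Image`, `curate`, `classify_audio`, `score_scene`, and `threshold` are hypothetical. Matching the audio event to the image with the nearest timestamp is also an assumption; the abstract only says the image is "associated with" the audio classifier label.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Image:
    timestamp: float  # capture time in seconds (hypothetical field)
    pixels: bytes     # raw image payload (hypothetical field)


def curate(
    images: List[Image],
    audio: bytes,
    classify_audio: Callable[[bytes], Optional[Tuple[str, float]]],
    score_scene: Callable[[Image], float],
    threshold: float = 0.5,
) -> Optional[Image]:
    """Return the image to store, or None if the candidate should be discarded.

    classify_audio stands in for the first model: it returns (label, event_time)
    or None when no audio event is detected.
    score_scene stands in for the second model: it returns a desirability score.
    """
    event = classify_audio(audio)
    if event is None or not images:
        return None
    _label, event_time = event

    # Assumption: the associated image is the one captured closest in time
    # to the detected audio event.
    candidate = min(images, key=lambda im: abs(im.timestamp - event_time))

    # Keep the image only if the scene is judged desirable enough; the caller
    # would then write it to non-volatile memory.
    if score_scene(candidate) >= threshold:
        return candidate
    return None
```

Returning `None` for the discard case keeps the storage write outside the function, so the decision logic can be exercised with stub models.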

    Systems and methods for upmixing audiovisual data

    Publication number: US12273697B2

    Publication date: 2025-04-08

    Application number: US18042258

    Filing date: 2020-08-26

    Applicant: Google LLC

    Abstract: A computer-implemented method for upmixing audiovisual data can include obtaining audiovisual data including input audio data and video data accompanying the input audio data. Each frame of the video data can depict only a portion of a larger scene. The input audio data can have a first number of audio channels. The computer-implemented method can include providing the audiovisual data as input to a machine-learned audiovisual upmixing model. The audiovisual upmixing model can include a sequence-to-sequence model configured to model a respective location of one or more audio sources within the larger scene over multiple frames of the video data. The computer-implemented method can include receiving upmixed audio data from the audiovisual upmixing model. The upmixed audio data can have a second number of audio channels. The second number of audio channels can be greater than the first number of audio channels.
