Video-informed Spatial Audio Expansion
    Invention publication

    Publication number: US20230305800A1

    Publication date: 2023-09-28

    Application number: US18327134

    Filing date: 2023-06-01

    Applicant: GOOGLE LLC

    CPC classification number: G06F3/165 G10L25/51 G06V20/41 H04S5/005

    Abstract: First video frames that include a visual object and a non-spatialized first audio segment that includes an auditory event are received. If second video frames do not include the visual object and a first time difference between the first video frames and the second video frames does not exceed a certain time, a motion vector of the visual object is used to assign a spatial location to the auditory event in at least one of the second video frames. A second audio segment that includes the auditory event and third video frames are received. If the third video frames do not include the visual object and a second time difference between the first video frames and the third video frames exceeds the certain time, the auditory event is assigned to a diffuse sound field. An audio output that conveys spatial locations of the visual object is output.
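
The decision rule in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the names `TrackedObject`, `assign_location`, and `max_gap` are hypothetical, positions are assumed to be normalized (x, y) frame coordinates, and linear extrapolation along the motion vector is an assumption about how the motion vector might be used.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class TrackedObject:
    # Position in the last frames where the object was visible (assumed normalized).
    last_position: Point
    # Displacement per second, estimated while the object was visible (assumed units).
    motion_vector: Point


def assign_location(obj: TrackedObject,
                    visible_position: Optional[Point],
                    elapsed: float,
                    max_gap: float) -> Optional[Point]:
    """Return a spatial location for the auditory event, or None for the diffuse field."""
    if visible_position is not None:
        # The object is on screen: use its current location directly.
        return visible_position
    if elapsed <= max_gap:
        # Off screen, but the time difference does not exceed the limit:
        # extrapolate from the last known position (assumed linear motion).
        x, y = obj.last_position
        vx, vy = obj.motion_vector
        return (x + vx * elapsed, y + vy * elapsed)
    # Off screen for longer than the limit: assign to the diffuse sound field.
    return None


obj = TrackedObject(last_position=(0.5, 0.5), motion_vector=(0.25, 0.0))
print(assign_location(obj, None, elapsed=1.0, max_gap=3.0))
print(assign_location(obj, None, elapsed=5.0, max_gap=3.0))
```

Returning `None` for the diffuse case is only a convenience of this sketch; a renderer would need its own representation of a diffuse sound field.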

    Video-informed spatial audio expansion

    Publication number: US11704087B2

    Publication date: 2023-07-18

    Application number: US16779921

    Filing date: 2020-02-03

    Applicant: GOOGLE LLC

    CPC classification number: G06F3/165 G06V20/41 G10L25/51 H04S5/005

    Abstract: Assigning spatial information to audio segments is disclosed. A method includes receiving a first audio segment that is non-spatialized and is associated with first video frames; identifying visual objects in the first video frames; identifying auditory events in the first audio segment; identifying a match between a visual object of the visual objects and an auditory event of the auditory events; and assigning a spatial location to the auditory event based on a location of the visual object.
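
The matching step described in this abstract might look like the sketch below. The abstract does not say how a match is identified, so matching on an equal class label is purely an assumption here, and `VisualObject`, `AuditoryEvent`, and `spatialize` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class VisualObject:
    label: str       # e.g. the output of an image classifier (assumed)
    location: Point  # position in the frame, assumed normalized


@dataclass
class AuditoryEvent:
    label: str                        # e.g. the output of an audio classifier (assumed)
    location: Optional[Point] = None  # None until a spatial location is assigned


def spatialize(events: List[AuditoryEvent],
               objects: List[VisualObject]) -> List[AuditoryEvent]:
    """Copy the location of a matching visual object onto each auditory event."""
    by_label: Dict[str, VisualObject] = {}
    for obj in objects:
        # If several objects share a label, this sketch keeps the first one.
        by_label.setdefault(obj.label, obj)
    for event in events:
        match = by_label.get(event.label)
        if match is not None:
            event.location = match.location
    return events


events = spatialize(
    [AuditoryEvent("dog"), AuditoryEvent("car")],
    [VisualObject("dog", (0.25, 0.75))],
)
for event in events:
    print(event.label, event.location)
```

Events without a matching object keep `location=None` in this sketch; how such events are rendered is left open.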

    Video-Informed Spatial Audio Expansion

    Publication number: US20210240431A1

    Publication date: 2021-08-05

    Application number: US16779921

    Filing date: 2020-02-03

    Applicant: GOOGLE LLC

    Abstract: Assigning spatial information to audio segments is disclosed. A method includes receiving a first audio segment that is non-spatialized and is associated with first video frames; identifying visual objects in the first video frames; identifying auditory events in the first audio segment; identifying a match between a visual object of the visual objects and an auditory event of the auditory events; and assigning a spatial location to the auditory event based on a location of the visual object.

    Fast and memory efficient encoding of sound objects using spherical harmonic symmetries

    Publication number: US10674301B2

    Publication date: 2020-06-02

    Application number: US16108385

    Filing date: 2018-08-22

    Applicant: GOOGLE LLC

    Abstract: A method of encoding sound objects includes receiving a set of monophonic sound inputs. Each of the set of monophonic sound inputs includes position and orientation information of a sound object relative to a source position. The set of monophonic sound inputs are encoded into a higher order ambisonic (HOA) sound field in a spherical harmonics domain based on a spherical harmonics dataset including a subset of all spherical harmonic coefficients for a given subset of azimuth and elevation angles. Some embodiments include decoding the HOA sound field to generate a set of loudspeaker signals.
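
To illustrate the idea of encoding a monophonic source into an ambisonic sound field and reusing coefficients through symmetry, here is a minimal first-order sketch. It is not the patented method or its dataset layout: it assumes ACN channel order (W, Y, Z, X), SN3D normalization, angles in radians with azimuth measured counterclockwise, and it stops at first order for brevity where HOA would use higher orders. The function names are hypothetical.

```python
import math
from typing import List


def encode_first_order(sample: float, azimuth: float, elevation: float) -> List[float]:
    """Encode one mono sample into first-order ambisonics as [W, Y, Z, X]."""
    cos_el = math.cos(elevation)
    return [
        sample,                                # W: omnidirectional component
        sample * math.sin(azimuth) * cos_el,   # Y: left-right component
        sample * math.sin(elevation),          # Z: up-down component
        sample * math.cos(azimuth) * cos_el,   # X: front-back component
    ]


def encode_mirrored(sample: float, azimuth: float, elevation: float) -> List[float]:
    """Encode a source at -azimuth by reusing the coefficients computed for +azimuth.

    Mirroring across the front-back plane only flips the sign of Y, so the
    coefficients for one half of the azimuth range are enough to derive the other.
    """
    w, y, z, x = encode_first_order(sample, azimuth, elevation)
    return [w, -y, z, x]


direct = encode_first_order(1.0, -math.pi / 3, 0.2)
mirrored = encode_mirrored(1.0, math.pi / 3, 0.2)
print(all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(direct, mirrored)))
```

The same kind of sign relationship holds for other reflections of the sphere, which is what makes it possible to store coefficients for only a subset of azimuth and elevation angles.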
