Voice activity detection using audio and visual analysis

    Publication No.: US11232796B2

    Publication date: 2022-01-25

    Application No.: US16601482

    Filing date: 2019-10-14

    Applicant: Facebook, Inc.

    Abstract: A method of detecting voice activity includes performing a video analysis on a frame of video signal to determine a position of a user in the frame and to identify one or more beams of a corresponding audio signal associated with a region including the position of the user. The identified one or more beams of audio signal are analyzed to determine whether voice is present in the frame. When a user is not identified during the video analysis of the frame of video signal, audio analysis is not performed on the corresponding frame of audio signal.
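The gating logic described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Beam` type, the angular beam-selection rule, the `half_width_deg` parameter, and the energy threshold standing in for "audio analysis" are all assumptions introduced here, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Beam:
    """One beamformed audio channel for a single frame (hypothetical type)."""
    center_deg: float          # assumed steering direction of the beam
    samples: Sequence[float]   # audio samples of the beam for this frame


def select_beams(beams: List[Beam], user_angle_deg: float,
                 half_width_deg: float = 20.0) -> List[Beam]:
    """Keep only beams whose direction falls in a region around the user."""
    return [b for b in beams
            if abs(b.center_deg - user_angle_deg) <= half_width_deg]


def frame_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude, used here as a stand-in voice test."""
    return sum(s * s for s in samples) / max(len(samples), 1)


def detect_voice(user_angle_deg: Optional[float], beams: List[Beam],
                 energy_threshold: float = 0.01) -> bool:
    """Return True if voice is judged present in this frame.

    user_angle_deg is the user position from video analysis,
    or None when no user was found in the video frame.
    """
    if user_angle_deg is None:
        # No user in the video frame: skip audio analysis entirely.
        return False
    selected = select_beams(beams, user_angle_deg)
    # Assumption: a simple energy threshold replaces the real audio analysis.
    return any(frame_energy(b.samples) > energy_threshold for b in selected)
```

The early return is the point of the sketch: when video analysis finds no user, none of the beams are examined, so the cost of audio analysis is avoided for that frame.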

    Detection and removal of wind noise

    Publication No.: US11217264B1

    Publication date: 2022-01-04

    Application No.: US16815664

    Filing date: 2020-03-11

    Applicant: Facebook, Inc.

    Abstract: An electronic device includes one or more microphones that generate audio signals and a wind noise detection subsystem. The electronic device may also include a wind noise reduction subsystem. The wind noise detection subsystem applies multiple wind noise detection techniques to the set of audio signals to generate corresponding indications of whether wind noise is present. The wind noise detection subsystem determines whether wind noise is present based on the indications generated by each detection technique and generates an overall indication of whether wind noise is present. The wind noise reduction subsystem applies one or more wind noise reduction techniques to the audio signal if wind noise is detected. The wind noise detection and reduction techniques may work in multiple domains (e.g., the time, spatial, and frequency domains).
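One way to picture the detection subsystem in this abstract is as several independent detectors whose indications are combined into one overall decision. The sketch below is an assumption-laden illustration: the three detectors (a time-domain zero-crossing test, a crude low-frequency energy test, and a spatial inter-microphone correlation test), their thresholds, and the majority-vote combination rule are all chosen here for illustration and are not taken from the patent.

```python
import math
from typing import Callable, List, Sequence

Signals = Sequence[Sequence[float]]  # one list of samples per microphone


def zero_crossing_vote(signals: Signals, max_rate: float = 0.1) -> bool:
    """Time domain: wind is mostly low frequency, so few sign changes."""
    x = signals[0]
    crossings = sum(1 for a, b in zip(x, x[1:]) if a * b < 0)
    return crossings / max(len(x) - 1, 1) < max_rate


def low_frequency_vote(signals: Signals, alpha: float = 0.2,
                       min_ratio: float = 0.5) -> bool:
    """Frequency-style test: share of energy passing a one-pole low-pass."""
    x = signals[0]
    y, low_energy = 0.0, 0.0
    for s in x:
        y += alpha * (s - y)      # one-pole low-pass filter
        low_energy += y * y
    total = sum(s * s for s in x)
    return total > 0 and low_energy / total > min_ratio


def decorrelation_vote(signals: Signals, max_corr: float = 0.5) -> bool:
    """Spatial domain: wind noise tends to be uncorrelated between mics."""
    a, b = signals[0], signals[1]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    norm = math.sqrt(sum((p - ma) ** 2 for p in a) *
                     sum((q - mb) ** 2 for q in b))
    return norm > 0 and cov / norm < max_corr


def detect_wind(signals: Signals,
                detectors: List[Callable[[Signals], bool]]) -> bool:
    """Overall indication: assumed here to be a simple majority vote."""
    votes = [d(signals) for d in detectors]
    return sum(votes) > len(votes) / 2


DETECTORS = [zero_crossing_vote, low_frequency_vote, decorrelation_vote]
```

A majority vote is only one possible combination rule; weighted voting or a logical OR would fit the abstract's wording equally well. A wind noise reduction step would run only when `detect_wind` returns `True`.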

    VOICE ACTIVITY DETECTION USING AUDIO AND VISUAL ANALYSIS

    Publication No.: US20210110830A1

    Publication date: 2021-04-15

    Application No.: US16601482

    Filing date: 2019-10-14

    Applicant: Facebook, Inc.

    Abstract: A method of detecting voice activity includes performing a video analysis on a frame of video signal to determine a position of a user in the frame and to identify one or more beams of a corresponding audio signal associated with a region including the position of the user. The identified one or more beams of audio signal are analyzed to determine whether voice is present in the frame. When a user is not identified during the video analysis of the frame of video signal, audio analysis is not performed on the corresponding frame of audio signal.
