AURALIZATION FOR MULTI-MICROPHONE DEVICES
    Invention application

    Publication No.: US20190387315A1

    Publication date: 2019-12-19

    Application No.: US16555118

    Filing date: 2019-08-29

    Applicant: Google LLC

    Abstract: A method for auralizing a multi-microphone device. Path information for one or more sound paths is determined, using dimensions and room reflection coefficients of a simulated room, for one of a plurality of microphones included in a multi-microphone device. Array-related transfer functions (ARTFs) for the one of the plurality of microphones are retrieved. An auralized impulse response for the one of the plurality of microphones is generated based at least on the retrieved ARTFs and the determined path information.
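
The last step of the abstract above, combining path information with ARTFs into an impulse response, can be illustrated with a minimal sketch. It is not the patented method: it assumes each sound path has already been reduced to a hypothetical `(delay_in_samples, gain, direction)` tuple, and that the ARTFs for one microphone are stored as short time-domain filters keyed by arrival direction. The function name, the tuple layout, and the direction keys are all illustrative assumptions.

```python
def auralized_impulse_response(paths, artfs, length):
    """Sum delayed, scaled ARTF filters into one impulse response.

    paths  -- list of (delay_in_samples, gain, direction) tuples (assumed layout)
    artfs  -- dict mapping direction -> list of filter taps for one microphone
              (assumed time-domain representation of the ARTFs)
    length -- number of samples in the output impulse response
    """
    ir = [0.0] * length
    for delay, gain, direction in paths:
        # Place the direction-dependent filter at the path's arrival time.
        for k, tap in enumerate(artfs[direction]):
            if delay + k < length:
                ir[delay + k] += gain * tap
    return ir


# Hypothetical data: a direct path and one weaker reflection.
example_artfs = {"front": [1.0, 0.5], "left": [0.0, 1.0]}
example_paths = [(0, 1.0, "front"), (3, 0.5, "left")]
example_ir = auralized_impulse_response(example_paths, example_artfs, 6)
```

In a fuller simulation the delays and gains would themselves come from the room dimensions and reflection coefficients; that derivation is left out here because the abstract does not describe it.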

    Sound model localization within an environment

    Publication No.: US12073319B2

    Publication date: 2024-08-27

    Application No.: US16940294

    Filing date: 2020-07-27

    Applicant: GOOGLE LLC

    CPC classification number: G06N3/08 G06N3/047 G10L25/51

    Abstract: Systems and techniques are provided for sound model localization within an environment. Sound recordings of sounds in the environment may be received from devices in the environment. Preliminary labels for the sound recordings may be determined using pre-trained sound models. The preliminary labels may have associated probabilities. Sound clips with preliminary labels may be generated based on sound recordings that have preliminary labels whose probability is over a high-recall threshold for the pre-trained sound model that determined the preliminary label. The sound clips with preliminary labels may be sent to a user device. Labeled sound clips may be received from the user device. The labeled sound clips may be based on the sound clips with preliminary labels. Training data sets may be generated for the pre-trained sound models using the labeled sound clips. The pre-trained sound models may be trained using the training data sets to generate localized sound models.
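
The clip-selection step in the abstract above can be sketched as a simple filter. This is an illustration under stated assumptions, not the patented system: the `(clip_id, model_name, label, probability)` tuple layout, the function name, and the threshold values are hypothetical. A high-recall threshold is typically set low, so that few true occurrences are missed at the cost of more false positives, which the user-labeling step can then correct.

```python
def select_clips_for_labeling(predictions, high_recall_thresholds):
    """Keep clips whose preliminary label clears that model's threshold.

    predictions -- list of (clip_id, model_name, label, probability) tuples
                   (assumed layout)
    high_recall_thresholds -- dict mapping model_name -> threshold
                              (hypothetical values)
    """
    selected = []
    for clip_id, model_name, label, probability in predictions:
        # Each pre-trained model has its own threshold.
        if probability > high_recall_thresholds[model_name]:
            selected.append((clip_id, label))
    return selected


# Hypothetical predictions from two pre-trained sound models.
example_predictions = [
    ("clip-a", "doorbell_model", "doorbell", 0.35),
    ("clip-b", "doorbell_model", "doorbell", 0.10),
    ("clip-c", "dog_model", "bark", 0.50),
]
example_thresholds = {"doorbell_model": 0.3, "dog_model": 0.6}
example_selected = select_clips_for_labeling(example_predictions, example_thresholds)
```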

    Auralization for multi-microphone devices

    Publication No.: US11470419B2

    Publication date: 2022-10-11

    Application No.: US16555118

    Filing date: 2019-08-29

    Applicant: Google LLC

    Abstract: A method for auralizing a multi-microphone device. Path information for one or more sound paths is determined, using dimensions and room reflection coefficients of a simulated room, for one of a plurality of microphones included in a multi-microphone device. Array-related transfer functions (ARTFs) for the one of the plurality of microphones are retrieved. An auralized impulse response for the one of the plurality of microphones is generated based at least on the retrieved ARTFs and the determined path information.

    QUERY ENDPOINTING BASED ON LIP DETECTION

    Publication No.: US20220238112A1

    Publication date: 2022-07-28

    Application No.: US17722960

    Filing date: 2022-04-18

    Applicant: Google LLC

    Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, synchronized video data and audio data are received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.
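
The endpointing step in the abstract above can be sketched as mapping video frame indices to audio sample offsets. This is a minimal illustration, not the patented implementation: it assumes a hypothetical per-frame boolean `lip_flags` list produced by some lip-movement detector, constant frame and sample rates, and audio and video that start at the same instant. It also treats everything between the first and last flagged frame as one sequence, which is a simplification.

```python
def endpoint_audio(samples, lip_flags, sample_rate, frame_rate):
    """Trim audio to the span covered by frames showing lip movement.

    samples     -- list of audio samples, synchronized with the video
    lip_flags   -- one boolean per video frame (assumed detector output)
    sample_rate -- audio samples per second
    frame_rate  -- video frames per second
    """
    moving = [i for i, flag in enumerate(lip_flags) if flag]
    if not moving:
        return []  # no lip movement detected, so nothing to transcribe
    first, last = moving[0], moving[-1]
    # Audio aligned with the start of the first frame and the end of the last.
    start = first * sample_rate // frame_rate
    end = (last + 1) * sample_rate // frame_rate
    return samples[start:min(end, len(samples))]


# Hypothetical input: 1 second of audio at 100 Hz, video at 10 frames/s.
example_samples = list(range(100))
example_flags = [False, False, True, True, False, True, False, False, False, False]
example_clip = endpoint_audio(example_samples, example_flags, 100, 10)
```

The trimmed samples would then be handed to a speech recognizer; that call is omitted because the abstract does not name a specific recognizer.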

    SOUND MODEL LOCALIZATION WITHIN AN ENVIRONMENT

    Publication No.: US20220027725A1

    Publication date: 2022-01-27

    Application No.: US16940294

    Filing date: 2020-07-27

    Applicant: GOOGLE LLC

    Abstract: Systems and techniques are provided for sound model localization within an environment. Sound recordings of sounds in the environment may be received from devices in the environment. Preliminary labels for the sound recordings may be determined using pre-trained sound models. The preliminary labels may have associated probabilities. Sound clips with preliminary labels may be generated based on sound recordings that have preliminary labels whose probability is over a high-recall threshold for the pre-trained sound model that determined the preliminary label. The sound clips with preliminary labels may be sent to a user device. Labeled sound clips may be received from the user device. The labeled sound clips may be based on the sound clips with preliminary labels. Training data sets may be generated for the pre-trained sound models using the labeled sound clips. The pre-trained sound models may be trained using the training data sets to generate localized sound models.

    Query endpointing based on lip detection

    Publication No.: US10755714B2

    Publication date: 2020-08-25

    Application No.: US16412677

    Filing date: 2019-05-15

    Applicant: Google LLC

    Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, synchronized video data and audio data are received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.
