Instantaneous Learning in Text-To-Speech During Dialog

    Publication No.: US20220284882A1

    Publication date: 2022-09-08

    Application No.: US17190456

    Filing date: 2021-03-03

    Applicant: Google LLC

    Abstract: A method for instantaneous learning in text-to-speech (TTS) during dialog includes receiving a user pronunciation of a particular word present in a query spoken by a user. The method also includes receiving a TTS pronunciation of the same particular word that is present in a TTS input, where the TTS pronunciation of the particular word is different from the user pronunciation of the particular word. The method also includes obtaining user pronunciation-related features and TTS pronunciation-related features associated with the particular word. The method also includes generating a pronunciation decision selecting the one of the user pronunciation or the TTS pronunciation of the particular word that is associated with the highest confidence. The method also includes providing TTS audio that includes a synthesized speech representation of the response to the query using the user pronunciation or the TTS pronunciation for the particular word.
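
    The selection step in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `PronunciationCandidate` type, the phoneme strings, the confidence values, and the tie-breaking rule (ties keep the TTS pronunciation) are all hypothetical, and the abstract does not say how confidence is derived from the pronunciation-related features.

```python
from dataclasses import dataclass


@dataclass
class PronunciationCandidate:
    source: str        # "user" or "tts"
    phonemes: str      # hypothetical phoneme string for the word
    confidence: float  # hypothetical score derived from pronunciation-related features


def decide_pronunciation(user: PronunciationCandidate,
                         tts: PronunciationCandidate) -> PronunciationCandidate:
    """Return the candidate with the higher confidence.

    Assumption: on a tie, the TTS pronunciation is kept.
    """
    return user if user.confidence > tts.confidence else tts


def render_response(template: str, word: str,
                    chosen: PronunciationCandidate) -> str:
    """Annotate the word in the response text with the chosen pronunciation."""
    return template.replace(word, f"{word}[{chosen.phonemes}]")


# Illustrative values only.
user = PronunciationCandidate("user", "b ae th", 0.82)
tts = PronunciationCandidate("tts", "b aa th", 0.61)
chosen = decide_pronunciation(user, tts)
print(render_response("Here are directions to Bath.", "Bath", chosen))
```

    With these made-up scores the user's pronunciation wins, so the response text is annotated with the user's phoneme string before synthesis.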

    Instantaneous learning in text-to-speech during dialog

    Publication No.: US11676572B2

    Publication date: 2023-06-13

    Application No.: US17190456

    Filing date: 2021-03-03

    Applicant: Google LLC

    CPC classification number: G10L13/08 G10L15/187

    Abstract: A method for instantaneous learning in text-to-speech (TTS) during dialog includes receiving a user pronunciation of a particular word present in a query spoken by a user. The method also includes receiving a TTS pronunciation of the same particular word that is present in a TTS input, where the TTS pronunciation of the particular word is different from the user pronunciation of the particular word. The method also includes obtaining user pronunciation-related features and TTS pronunciation-related features associated with the particular word. The method also includes generating a pronunciation decision selecting the one of the user pronunciation or the TTS pronunciation of the particular word that is associated with the highest confidence. The method also includes providing TTS audio that includes a synthesized speech representation of the response to the query using the user pronunciation or the TTS pronunciation for the particular word.

    HOTWORD SUPPRESSION
    Invention application (status: under examination, published)

    Publication No.: US20200279562A1

    Publication date: 2020-09-03

    Application No.: US16874646

    Filing date: 2020-05-14

    Applicant: Google LLC

    Abstract: A method includes obtaining, by data processing hardware, a plurality of non-watermarked speech samples. Each non-watermarked speech sample does not include an audio watermark. The method includes, from each non-watermarked speech sample of the plurality of non-watermarked speech samples, generating one or more corresponding watermarked speech samples that each include at least one audio watermark. The method includes training, using the plurality of non-watermarked speech samples and corresponding watermarked speech samples, a model to determine whether a given audio data sample includes an audio watermark, and after training the model, transmitting the trained model to a user computing device.
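
    The data-preparation step described here (deriving watermarked copies from each clean sample and labeling both) might look like the Python sketch below. Everything in it is an assumption for illustration: the abstract does not specify the watermark's form, so a faint sinusoid stands in for it, and `add_watermark`, `build_training_set`, and all parameter values are hypothetical names. Model training and transmission to the device are not shown.

```python
import math
import random


def add_watermark(samples, freq_hz=18000.0, rate_hz=48000, amplitude=0.01):
    """Return a copy of `samples` with a low-amplitude sinusoid mixed in.

    The sinusoid is a stand-in watermark chosen for this sketch only.
    """
    return [
        s + amplitude * math.sin(2 * math.pi * freq_hz * n / rate_hz)
        for n, s in enumerate(samples)
    ]


def build_training_set(clean_samples, variants_per_sample=2):
    """Pair each non-watermarked sample with one or more watermarked copies."""
    dataset = []
    for clean in clean_samples:
        dataset.append((clean, 0))  # label 0: no watermark
        for k in range(variants_per_sample):
            amplitude = 0.01 * (k + 1)  # vary watermark strength per variant
            dataset.append((add_watermark(clean, amplitude=amplitude), 1))
    return dataset


# Three short synthetic "speech" samples (random noise, illustrative only).
rng = random.Random(0)
clean_samples = [[rng.uniform(-1, 1) for _ in range(480)] for _ in range(3)]
dataset = build_training_set(clean_samples)
print(len(dataset), sum(label for _, label in dataset))  # prints "9 6"
```

    Each clean sample contributes one negative example and two positive examples, which is the kind of labeled set a watermark classifier could then be trained on.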

    Hotword suppression
    Granted invention patent

    Publication No.: US10692496B2

    Publication date: 2020-06-23

    Application No.: US16418415

    Filing date: 2019-05-21

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotwords are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to playback of an utterance. The actions further include providing the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample. The actions further include receiving, from the model, data indicating whether the audio data includes the audio watermark. The actions further include, based on the data indicating whether the audio data includes the audio watermark, determining to continue or cease processing of the audio data.
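
    The final decision step can be sketched in Python as follows. This is a hedged illustration, not the claimed method: the abstract says only that processing continues or ceases based on the model's output, so mapping "watermark present" to "cease" is an assumption, and `handle_audio` and `stub_model` are hypothetical names. The stub stands in for the trained model.

```python
from typing import Callable, Sequence


def handle_audio(audio: Sequence[float],
                 detects_watermark: Callable[[Sequence[float]], bool]) -> str:
    """Decide whether to continue or cease processing the audio.

    Assumption: a detected watermark means the audio is played-back media,
    so processing ceases; otherwise it continues.
    """
    if detects_watermark(audio):
        return "cease"
    return "continue"


def stub_model(audio: Sequence[float]) -> bool:
    """Placeholder for the trained model: flags audio with a large peak."""
    return max((abs(x) for x in audio), default=0.0) > 0.9


print(handle_audio([0.1, 0.95, -0.2], stub_model))  # prints "cease"
print(handle_audio([0.1, 0.2, -0.2], stub_model))   # prints "continue"
```

    Passing the model in as a callable keeps the decision logic separate from whatever classifier is actually deployed on the device.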

    Hotword suppression
    Granted invention patent

    Publication No.: US11373652B2

    Publication date: 2022-06-28

    Application No.: US16874646

    Filing date: 2020-05-14

    Applicant: Google LLC

    Abstract: A method includes obtaining, by data processing hardware, a plurality of non-watermarked speech samples. Each non-watermarked speech sample does not include an audio watermark. The method includes, from each non-watermarked speech sample of the plurality of non-watermarked speech samples, generating one or more corresponding watermarked speech samples that each include at least one audio watermark. The method includes training, using the plurality of non-watermarked speech samples and corresponding watermarked speech samples, a model to determine whether a given audio data sample includes an audio watermark, and after training the model, transmitting the trained model to a user computing device.

    INSTANTANEOUS LEARNING IN TEXT-TO-SPEECH DURING DIALOG

    Publication No.: US20230274727A1

    Publication date: 2023-08-31

    Application No.: US18312576

    Filing date: 2023-05-04

    Applicant: Google LLC

    CPC classification number: G10L13/08 G10L15/187

    Abstract: A method for instantaneous learning in text-to-speech (TTS) during dialog includes receiving a user pronunciation of a particular word present in a query spoken by a user. The method also includes receiving a TTS pronunciation of the same particular word that is present in a TTS input, where the TTS pronunciation of the particular word is different from the user pronunciation of the particular word. The method also includes obtaining user pronunciation-related features and TTS pronunciation-related features associated with the particular word. The method also includes generating a pronunciation decision selecting the one of the user pronunciation or the TTS pronunciation of the particular word that is associated with the highest confidence. The method also includes providing TTS audio that includes a synthesized speech representation of the response to the query using the user pronunciation or the TTS pronunciation for the particular word.

    Hotword Suppression
    Invention application (status: under examination, published)

    Publication No.: US20190362719A1

    Publication date: 2019-11-28

    Application No.: US16418415

    Filing date: 2019-05-21

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotwords are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to playback of an utterance. The actions further include providing the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample. The actions further include receiving, from the model, data indicating whether the audio data includes the audio watermark. The actions further include, based on the data indicating whether the audio data includes the audio watermark, determining to continue or cease processing of the audio data.
