PRESERVING SPEECH HYPOTHESES ACROSS COMPUTING DEVICES AND/OR DIALOG SESSIONS

    Publication No.: US20220122589A1

    Publication Date: 2022-04-21

    Application No.: US16949151

    Filing Date: 2020-10-15

    Applicant: Google LLC

    Abstract: Implementations can receive, at a computing device, audio data corresponding to a spoken utterance of a user, process the audio data to generate, for one or more parts of the spoken utterance, a plurality of speech hypotheses, select a given one of the speech hypotheses, cause the given one of the speech hypotheses to be incorporated as a portion of a transcription associated with the software application, and store the plurality of speech hypotheses. In some implementations, the plurality of speech hypotheses can be loaded at an additional computing device when the transcription is accessed at the additional computing device. In additional or alternative implementations, the plurality of speech hypotheses can be loaded into memory of the computing device when the software application is reactivated and/or when a subsequent dialog session associated with the transcription is initiated.
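
    The storage and reload steps in this abstract can be pictured with a minimal sketch. It assumes the hypotheses are serialized as JSON next to the visible text; the `Segment` and `Transcription` names and the JSON layout are hypothetical and do not come from the application.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Segment:
    """One part of an utterance: every candidate text plus the chosen one."""
    hypotheses: list[str]   # all speech hypotheses for this part, best first
    selected: int = 0       # index of the hypothesis shown in the transcription

    @property
    def text(self) -> str:
        return self.hypotheses[self.selected]


@dataclass
class Transcription:
    segments: list[Segment] = field(default_factory=list)

    def text(self) -> str:
        return " ".join(s.text for s in self.segments)

    def dumps(self) -> str:
        # Persist the alternatives together with the selection (assumed format).
        return json.dumps(asdict(self))

    @classmethod
    def loads(cls, payload: str) -> "Transcription":
        # Reload on another device or in a later dialog session.
        raw = json.loads(payload)
        return cls([Segment(**s) for s in raw["segments"]])


original = Transcription([Segment(["I saw", "eye saw"]),
                          Segment(["the sea", "the C"])])
restored = Transcription.loads(original.dumps())
text_before = restored.text()
restored.segments[1].selected = 1   # swap in a preserved alternative
text_after = restored.text()
```

    Because the alternatives travel with the transcription, a later session can correct a segment without reprocessing the audio.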

    Contextually prompting users to switch communication modes

    Publication No.: US11304041B2

    Publication Date: 2022-04-12

    Application No.: US16814551

    Filing Date: 2020-03-10

    Applicant: GOOGLE LLC

    Abstract: A computer-implemented technique can include detecting, by a first computing device, a set of user communications at least one of transmitted to and received from a second computing device via a first communication mode, identifying a second communication mode that is available for communication between the first and second computing devices, and obtaining an appropriateness score for the first and second communication modes based on a contextual feature of the set of user communications, wherein the contextual feature relates an appropriateness of a particular communication mode for the set of user communications, and wherein each appropriateness score is indicative of a level of the appropriateness of a particular communication mode for the set of user communications. The technique can also include selectively outputting a suggestion to switch from the first communication mode to the second communication mode.

    Providing Additional Instructions for Difficult Maneuvers During Navigation

    Publication No.: US20210364307A1

    Publication Date: 2021-11-25

    Application No.: US17252260

    Filing Date: 2019-12-17

    Applicant: GOOGLE LLC

    Abstract: A dataset descriptive of multiple locations and one or more maneuvers attempted by vehicles at these locations is received. A machine-learning model is trained using this dataset, so that the machine-learning model is configured to generate metrics of difficulty for the set of maneuvers. Query data including indications of a location and a maneuver to be executed by a vehicle at the location is received. The query data is applied to the machine-learning model to generate a metric of difficulty for the maneuver, and a navigation instruction for the maneuver is provided via a user interface, such that at least one parameter of the navigation instruction is selected based on the generated metric of difficulty.

    PROVIDING PRE-COMPUTED HOTWORD MODELS

    Publication No.: US20210312921A1

    Publication Date: 2021-10-07

    Application No.: US17304459

    Filing Date: 2021-06-21

    Applicant: Google LLC

    Inventor: Matthew Sharifi

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, for each of multiple words or sub-words, audio data corresponding to multiple users speaking the word or sub-word; training, for each of the multiple words or sub-words, a pre-computed hotword model for the word or sub-word based on the audio data for the word or sub-word; receiving a candidate hotword from a computing device; identifying one or more pre-computed hotword models that correspond to the candidate hotword; and providing the identified, pre-computed hotword models to the computing device.
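
    The lookup step in this abstract can be sketched as follows. It assumes the trained models are held in a dictionary keyed by word and treats each model as an opaque byte string; the function name `identify_models` and the store layout are hypothetical, and training itself is not shown.

```python
def identify_models(candidate: str, precomputed: dict[str, bytes]) -> list[bytes]:
    """Return the pre-computed model for each word of a candidate hotword."""
    words = candidate.lower().split()
    missing = [w for w in words if w not in precomputed]
    if missing:
        # No model was trained ahead of time for these words.
        raise KeyError(f"no pre-computed model for: {missing}")
    return [precomputed[w] for w in words]


# Placeholder byte strings stand in for trained per-word models (assumption).
store = {"ok": b"model-ok", "hello": b"model-hello", "computer": b"model-computer"}
models = identify_models("Hello Computer", store)
```

    The returned list is what would be sent back to the requesting device, so that device does not need to train a model for its custom hotword.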

    SEGMENT-BASED SPEAKER VERIFICATION USING DYNAMICALLY GENERATED PHRASES

    Publication No.: US20210295850A1

    Publication Date: 2021-09-23

    Application No.: US17303928

    Filing Date: 2021-06-10

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.

    Systems and Methods for Providing a Machine-Learned Model with Adjustable Computational Demand

    Publication No.: US20210232912A1

    Publication Date: 2021-07-29

    Application No.: US16972429

    Filing Date: 2019-09-19

    Applicant: Google LLC

    Abstract: A computing device is disclosed that includes at least one processor and a machine-learned model. The machine-learned model can include a plurality of blocks and one or more residual connections between two or more of the plurality of blocks. The machine-learned model can be configured to receive a model input and, in response to receipt of the model input, output a model output. The machine-learned model can be configured to perform operations including determining a resource allocation parameter that corresponds to a desired allocation of system resources to the machine-learned model at an inference time; deactivating a subset of the plurality of blocks of the machine-learned model based on the resource allocation parameter; inputting the model input into the machine-learned model with the subset of the plurality of blocks deactivated; and receiving, as an output of the machine-learned model, the model output.
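
    A minimal sketch of the block-deactivation idea described in this abstract. It assumes scalar inputs, a resource allocation parameter in [0, 1], and a policy that keeps the leading fraction of blocks active; the names `active_mask` and `run` and that policy are hypothetical, since the abstract does not say how the subset is chosen.

```python
from typing import Callable, Sequence

Block = Callable[[float], float]


def active_mask(num_blocks: int, resource_allocation: float) -> list[bool]:
    """Mark the first int(r * n) blocks active and the rest deactivated."""
    if not 0.0 <= resource_allocation <= 1.0:
        raise ValueError("resource_allocation must be in [0, 1]")
    keep = int(resource_allocation * num_blocks)
    return [i < keep for i in range(num_blocks)]


def run(blocks: Sequence[Block], x: float, resource_allocation: float) -> float:
    """Apply the active blocks in order, each wrapped in a residual connection."""
    mask = active_mask(len(blocks), resource_allocation)
    for block, active in zip(blocks, mask):
        if active:
            x = x + block(x)
        # A deactivated block is skipped; the residual path carries x through.
    return x


# Toy blocks that return constants, so the effect of each one is visible.
blocks = [lambda v: 1.0, lambda v: 2.0, lambda v: 4.0, lambda v: 8.0]
full = run(blocks, 0.0, 1.0)   # all four blocks active
half = run(blocks, 0.0, 0.5)   # only the first two blocks active
```

    The residual connections are what make skipping safe in this sketch: a deactivated block reduces to the identity, so the output keeps the same shape at a lower compute cost.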

    Training Keyword Spotters
    Patent application

    Publication No.: US20210183367A1

    Publication Date: 2021-06-17

    Application No.: US16717518

    Filing Date: 2019-12-17

    Applicant: Google LLC

    Abstract: A method of training a custom hotword model includes receiving a first set of training audio samples. The method also includes generating, using a speech embedding model configured to receive the first set of training audio samples as input, a corresponding hotword embedding representative of a custom hotword for each training audio sample of the first set of training audio samples. The speech embedding model is pre-trained on a different set of training audio samples with a greater number of training audio samples than the first set of training audio samples. The method further includes training the custom hotword model to detect a presence of the custom hotword in audio data. The custom hotword model is configured to receive, as input, each corresponding hotword embedding and to classify, as output, each corresponding hotword embedding as corresponding to the custom hotword.

    ADAPTIVE TEXT-TO-SPEECH OUTPUTS
    Patent application

    Publication No.: US20210142779A1

    Publication Date: 2021-05-13

    Application No.: US17153463

    Filing Date: 2021-01-20

    Applicant: Google LLC

    Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
