-
Publication No.: US20220368343A1
Publication date: 2022-11-17
Application No.: US17620448
Filing date: 2019-09-09
Applicant: Giovanni MOTTA , Francoise BEAUFAYS , Petr ZADRAZIL , Google LLC
Inventor: Giovanni Motta , Francoise Beaufays , Petr Zadrazil
Abstract: Systems and methods for compression of data that exhibits mixed compressibility, such as floating-point data, are provided. As one example, aspects of the present disclosure can be used to compress floating-point data that represents the values of parameters of a machine-learned model. Therefore, aspects of the present disclosure can be used to compress machine-learned models (e.g., for reducing storage requirements associated with the model, reducing the bandwidth expended to transmit the model, etc.).
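The abstract does not say how the mixed compressibility is exploited, so the following is only a minimal sketch of one well-known approach (byte-plane splitting), not the claimed method. It assumes little-endian float32 values, uses zlib as a stand-in entropy coder, and the function names `compress_floats` and `decompress_floats` are hypothetical. The idea is that the sign and exponent bytes of similar-magnitude parameters tend to repeat, while the low mantissa bytes look close to random, so each byte plane is deflated only when that actually saves space.

```python
import struct
import zlib


def compress_floats(values):
    """Split float32 values into 4 byte planes; deflate only planes that shrink.

    Assumption: this plane-splitting scheme is illustrative and is not taken
    from the patent text.
    """
    raw = struct.pack(f"<{len(values)}f", *values)
    # In little-endian float32, byte 3 holds the sign and upper exponent bits,
    # byte 0 holds the lowest mantissa bits.
    planes = [raw[i::4] for i in range(4)]
    encoded = []
    for plane in planes:
        packed = zlib.compress(plane, 9)
        if len(packed) < len(plane):
            encoded.append((True, packed))   # compressible plane
        else:
            encoded.append((False, plane))   # store poorly compressible plane raw
    return len(values), encoded


def decompress_floats(count, encoded):
    """Invert compress_floats: restore each plane and re-interleave the bytes."""
    planes = [zlib.decompress(data) if packed else data for packed, data in encoded]
    raw = bytearray(4 * count)
    for i, plane in enumerate(planes):
        raw[i::4] = plane
    return list(struct.unpack(f"<{count}f", bytes(raw)))
```

The round trip is lossless for any value that is exactly representable as float32; values that are not are rounded once by `struct.pack`, before compression.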
-
Publication No.: US11327652B2
Publication date: 2022-05-10
Application No.: US16989420
Filing date: 2020-08-10
Applicant: Google LLC
Inventor: Ouais Alsharif , Peter Ciccotto , Francoise Beaufays , Dragan Zivkovic
IPC: G06F3/04886 , G06F3/023 , G06F40/263 , G06F40/274
Abstract: A keyboard is described that determines, using a first decoder and based on a selection of keys of a graphical keyboard, text. Responsive to determining that a characteristic of the text satisfies a threshold, a model of the keyboard identifies the target language of the text, and determines whether the target language is different than a language associated with the first decoder. If the target language of the text is not different than the language associated with the first decoder, the keyboard outputs, for display, an indication of first candidate words determined by the first decoder from the text. If the target language of the text is different: the keyboard enables a second decoder, where a language associated with the second decoder matches the target language of the text, and outputs, for display, an indication of second candidate words determined by the second decoder from the text.
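The control flow in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the "characteristic of the text" is taken to be its length, `identify_language` stands in for the keyboard's language-identification model, and `Decoder`, `candidates`, and `min_chars` are hypothetical names that do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Decoder:
    language: str
    suggest: Callable[[str], List[str]]  # hypothetical: text -> candidate words


def candidates(
    text: str,
    active: Decoder,
    decoders: Dict[str, Decoder],
    identify_language: Callable[[str], str],
    min_chars: int = 4,
) -> Tuple[Decoder, List[str]]:
    """Return the decoder that was used and its candidate words."""
    # Run language identification only once the text satisfies the threshold
    # (assumed here to be a minimum character count).
    if len(text) >= min_chars:
        target = identify_language(text)
        # Enable a second decoder only if the target language differs from
        # the active decoder's language and a matching decoder exists.
        if target != active.language and target in decoders:
            active = decoders[target]
    return active, active.suggest(text)
```

Returning the decoder alongside the candidates lets the caller keep the newly enabled decoder active for subsequent input.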
-
Publication No.: US20200257447A1
Publication date: 2020-08-13
Application No.: US16862628
Filing date: 2020-04-30
Applicant: Google LLC
Inventor: Shumin Zhai , Thomas Breuel , Ouais Alsharif , Yu Ouyang , Francoise Beaufays , Johan Schalkwyk
IPC: G06F3/0488 , G06N3/04 , G06F40/232 , G06F40/274 , G06F40/279 , G06F3/02 , G06F3/023 , G06F3/0489 , G06F3/0482 , G06N3/08
Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
-
Publication No.: US20190155504A1
Publication date: 2019-05-23
Application No.: US16261640
Filing date: 2019-01-30
Applicant: Google LLC
Inventor: Shumin Zhai , Thomas Breuel , Ouais Alsharif , Yu Ouyang , Francoise Beaufays , Johan Schalkwyk
IPC: G06F3/0488 , G06N3/04 , G06F17/27 , G06F3/023 , G06F3/0489 , G06F3/02 , G06N3/08 , G06F3/0482
Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
-
Publication No.: US10148609B2
Publication date: 2018-12-04
Application No.: US14932233
Filing date: 2015-11-04
Applicant: Google LLC
Inventor: Brian Patrick Strope , Francoise Beaufays , Hy Murveit
IPC: H04M11/00 , H04L29/12 , H04L12/66 , H04L12/58 , H04M3/42 , H04M1/725 , H04W4/16 , H04W8/18 , H04M1/64
Abstract: In one implementation a computer-implemented method includes generating a group of telephone contacts for a first user, wherein the generating includes identifying a second user as a contact of the first user based upon a determination that the second user has at least a threshold email-based association with the first user; and adding the identified second user to the group of telephone contacts for the first user. The method further includes receiving a first request to connect a first telephone device associated with the first user to a second telephone device associated with the second user. The method also includes identifying a contact identifier of the second telephone device using the generated group of telephone contacts for the first user, and initiating a connection between the first telephone device and the second telephone device using the identified contact identifier.
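A minimal sketch of the two steps in this abstract, building the contact group and then resolving a call request against it. The abstract does not define the "threshold email-based association", so this sketch assumes it is a count of emails exchanged in either direction; `build_contact_group`, `resolve_call`, and the data shapes are hypothetical.

```python
from collections import Counter


def build_contact_group(user, emails, threshold=3):
    """Return the users who exchanged at least `threshold` emails with `user`.

    `emails` is an iterable of (sender, recipient) pairs. Counting messages is
    an assumed measure of association, not one stated in the patent.
    """
    counts = Counter()
    for sender, recipient in emails:
        if sender == user:
            counts[recipient] += 1
        elif recipient == user:
            counts[sender] += 1
    return {other for other, n in counts.items() if n >= threshold}


def resolve_call(user, callee, emails, phone_numbers, threshold=3):
    """Return the callee's contact identifier if they are in the user's group."""
    group = build_contact_group(user, emails, threshold)
    if callee not in group:
        return None
    return phone_numbers.get(callee)
```

In this sketch the group is rebuilt on every request for brevity; a real system would presumably generate it ahead of time, as the abstract's ordering suggests.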
-
Publication No.: US10049672B2
Publication date: 2018-08-14
Application No.: US15171374
Filing date: 2016-06-02
Applicant: Google LLC
Inventor: Brian Patrick Strope , Francoise Beaufays , Olivier Siohan
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
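The early-abort pattern described here can be sketched with `asyncio`. This is an illustration only: each recognizer is assumed to be a coroutine function returning a `(transcript, confidence)` tuple, the threshold value is arbitrary, and choosing the highest-confidence result as the final output is an assumption, since the abstract only says the final result is based on at least one of the generated results.

```python
import asyncio


async def recognize_first_confident(recognizers, audio, threshold=0.8):
    """Run all recognizers concurrently; abort the rest once one is confident."""
    tasks = [asyncio.create_task(recognize(audio)) for recognize in recognizers]
    results = []
    pending = set(tasks)
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            results.extend(task.result() for task in done)
            # Stop waiting as soon as any completed result meets the threshold.
            if any(confidence >= threshold for _, confidence in results):
                break
    finally:
        # Abort recognizers that have not produced a result yet.
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
    # Assumed selection rule: the most confident completed result wins.
    return max(results, key=lambda r: r[1]) if results else None
```

If no result meets the threshold, the loop simply runs until every recognizer has finished, and the best available result is returned.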
-
Publication No.: US20250094491A1
Publication date: 2025-03-20
Application No.: US18961038
Filing date: 2024-11-26
Applicant: Google LLC
Inventor: Johan Schalkwyk , Francoise Beaufays
IPC: G06F16/783 , G06F16/738 , G06F40/169 , G06F40/30
Abstract: A method includes receiving a content feed that includes audio data corresponding to speech utterances and processing the content feed to generate a semantically-rich, structured document. The structured document includes a transcription of the speech utterances and includes a plurality of words each aligned with a corresponding audio segment of the audio data that indicates a time when the word was recognized in the audio data. During playback of the content feed, the method also includes receiving a query from a user requesting information contained in the content feed and processing, by a large language model, the query and the structured document to generate a response to the query. The response conveys the requested information contained in the content feed. The method also includes providing, for output from a user device associated with the user, the response to the query.
-
Publication No.: US20240144917A1
Publication date: 2024-05-02
Application No.: US18494763
Filing date: 2023-10-25
Applicant: Google LLC
Inventor: Rami Magdi Fahmi Botros , Rohit Prakash Prabhavalkar , Johan Schalkwyk , Tara N. Sainath , Ciprian Ioan Chelba , Francoise Beaufays
IPC: G10L15/16
CPC classification number: G10L15/16
Abstract: A method includes obtaining a base encoder from a pre-trained model, and receiving training data comprising a sequence of acoustic frames characterizing an utterance paired with a ground-truth transcription of the utterance. At each of a plurality of output steps, the method includes: generating, by the base encoder, a first encoded representation for a corresponding acoustic frame; generating, by an exporter network configured to receive a continuous sequence of first encoded representations generated by the base encoder, a second encoded representation for a corresponding acoustic frame; generating, by an exporter decoder, a probability distribution over possible logits; and determining an exporter decoder loss based on the probability distribution over possible logits generated by the exporter decoder at the corresponding output step and the ground-truth transcription. The method also includes training the exporter network based on the exporter decoder losses while parameters of the base encoder are frozen.
-
Publication No.: US20240086063A1
Publication date: 2024-03-14
Application No.: US18517825
Filing date: 2023-11-22
Applicant: Google LLC
Inventor: Yu Ouyang , Diego Melendo Casado , Mohammadinamul Hasan Sheik , Francoise Beaufays , Dragan Zivkovic , Meltem Oktem
IPC: G06F3/04886 , G06F1/16 , G06F3/023 , G06F3/04883 , G06F3/16 , G06F40/166 , G06F40/289
CPC classification number: G06F3/04886 , G06F1/1626 , G06F3/0233 , G06F3/04883 , G06F3/167 , G06F40/166 , G06F40/289 , G06F2203/0381 , G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-
Publication No.: US11843397B2
Publication date: 2023-12-12
Application No.: US17620448
Filing date: 2019-09-09
Applicant: Google LLC
Inventor: Giovanni Motta , Francoise Beaufays , Petr Zadrazil
Abstract: Systems and methods for compression of data that exhibits mixed compressibility, such as floating-point data, are provided. As one example, aspects of the present disclosure can be used to compress floating-point data that represents the values of parameters of a machine-learned model. Therefore, aspects of the present disclosure can be used to compress machine-learned models (e.g., for reducing storage requirements associated with the model, reducing the bandwidth expended to transmit the model, etc.).