SPEECH TRANSLATION DEVICE AND ASSOCIATED METHOD

    Publication Number: US20190095430A1

    Publication Date: 2019-03-28

    Application Number: US15714548

    Filing Date: 2017-09-25

    Applicant: Google LLC

    Abstract: A computer-implemented method and associated computing device for translating speech can include receiving, at a microphone of a computing device, an audio signal representing speech of a user in a first language or in a second language at a first time. A positional relationship between the user and the computing device at the first time can be determined and utilized to determine whether the speech is in the first language or the second language. The method can further include obtaining, at the computing device, a machine translation of the speech represented by the audio signal based on the determined language, wherein the machine translation is: (i) in the second language when the determined language is the first language, or (ii) in the first language when the determined language is the second language. An audio representation of the machine translation can be output from a speaker of the computing device.
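
    A minimal, self-contained sketch of the translation flow this abstract describes, written in Python. The stub functions (detect_speaker_position, transcribe, translate, synthesize_speech) and the language pair are illustrative assumptions, not the disclosed implementation.

```python
# Illustrative sketch only: all helpers below are stubs standing in for
# real sensing, speech recognition, translation, and text-to-speech.

FIRST_LANGUAGE = "en"
SECOND_LANGUAGE = "es"


def detect_speaker_position(sensor_reading: float) -> str:
    """Stub: classify the user's position relative to the device."""
    return "near_side" if sensor_reading >= 0 else "far_side"


def transcribe(audio: bytes, language: str) -> str:
    """Stub: speech-to-text in the given language."""
    return f"<transcript in {language}>"


def translate(text: str, source: str, target: str) -> str:
    """Stub: machine translation from source to target language."""
    return f"<{text} translated {source}->{target}>"


def synthesize_speech(text: str, language: str) -> bytes:
    """Stub: text-to-speech audio for the translated text."""
    return text.encode()


def translate_utterance(audio: bytes, sensor_reading: float) -> bytes:
    # Determine the positional relationship between the user and the
    # device at the time the audio was captured, and use it to decide
    # which language the speech is in.
    position = detect_speaker_position(sensor_reading)
    source = FIRST_LANGUAGE if position == "near_side" else SECOND_LANGUAGE
    target = SECOND_LANGUAGE if source == FIRST_LANGUAGE else FIRST_LANGUAGE

    # Obtain a machine translation of the recognized speech and return
    # an audio representation for output from the device's speaker.
    text = transcribe(audio, language=source)
    translated = translate(text, source=source, target=target)
    return synthesize_speech(translated, language=target)
```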

    Integrated Development Environments for Generating Machine Learning Models

    Publication Number: US20240311100A1

    Publication Date: 2024-09-19

    Application Number: US18604444

    Filing Date: 2024-03-13

    Applicant: Google LLC

    Abstract: A method includes receiving a request indication indicating a GUI interaction by a user on a user device, and in response, providing to the device a response configured to cause the device to display, within a GUI, a structured prompt including a plurality of user input fields, each user input field representing a corresponding training sample and including a first corresponding text input field for capturing input text to be provided to an ML model, and a second corresponding text input field for capturing ground-truth output text. The method also includes receiving, from the device, the training samples, and, in response, adjusting the ML model using the training samples. The method further includes receiving, from the device, test input text, generating, using the adjusted ML model, test output text based on the test input text, and providing, to the user device, the test output text for display within the GUI.
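
    A minimal sketch of the structured-prompt data flow this abstract describes. The TrainingSample and StructuredPrompt shapes and the stub model are assumed for illustration; they are not the disclosed GUI or API.

```python
# Illustrative sketch only: the data classes mirror the two text input
# fields per training sample, and StubModel stands in for the ML model.

from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainingSample:
    # First text input field: input text to be provided to the ML model.
    input_text: str = ""
    # Second text input field: the ground-truth output text.
    ground_truth_output: str = ""


@dataclass
class StructuredPrompt:
    # The structured prompt displayed in the GUI holds several user
    # input fields, one per training sample.
    samples: List[TrainingSample] = field(default_factory=list)


class StubModel:
    """Stand-in for the ML model being adjusted (fine-tuned)."""

    def __init__(self) -> None:
        self.examples: List[TrainingSample] = []

    def adjust(self, samples: List[TrainingSample]) -> None:
        # Placeholder for adjusting the model on the captured samples.
        self.examples.extend(samples)

    def generate(self, test_input: str) -> str:
        # Placeholder for generating test output text from test input.
        return f"<output for {test_input!r} given {len(self.examples)} samples>"


def handle_submission(prompt: StructuredPrompt, test_input: str) -> str:
    # Adjust the model using the training samples captured in the GUI,
    # then generate test output text for display back in the GUI.
    model = StubModel()
    model.adjust(prompt.samples)
    return model.generate(test_input)
```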

    Adaptive diarization model and user interface

    Publication Number: US11710496B2

    Publication Date: 2023-07-25

    Application Number: US17596861

    Filing Date: 2019-07-01

    Applicant: Google LLC

    Abstract: A computing device receives a first audio waveform representing a first utterance and a second utterance. The computing device receives identity data indicating that the first utterance corresponds to a first speaker and the second utterance corresponds to a second speaker. The computing device determines, based on the first utterance, the second utterance, and the identity data, a diarization model configured to distinguish between utterances by the first speaker and utterances by the second speaker. The computing device receives, exclusively of receiving further identity data indicating a source speaker of a third utterance, a second audio waveform representing the third utterance. The computing device determines, by way of the diarization model and independently of the further identity data of the first type, the source speaker of the third utterance. The computing device updates the diarization model based on the third utterance and the determined source speaker.
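
    A minimal sketch of the enroll-then-adapt loop this abstract describes, using per-speaker embedding centroids as a stand-in for the diarization model. The embedding function and distance-based classification are assumptions for illustration, not the disclosed model.

```python
# Illustrative sketch only: embed() is a toy feature extractor; a real
# system would use a learned speaker embedding.

from typing import Dict, List, Tuple
import math


def embed(utterance: List[float]) -> List[float]:
    """Stub embedding: mean and energy of the waveform samples."""
    n = max(len(utterance), 1)
    mean = sum(utterance) / n
    energy = math.sqrt(sum(x * x for x in utterance) / n)
    return [mean, energy]


class DiarizationModel:
    def __init__(self) -> None:
        # Running centroid and count of embeddings per speaker.
        self.centroids: Dict[str, Tuple[List[float], int]] = {}

    def enroll(self, speaker: str, utterance: List[float]) -> None:
        # Build the model from utterances whose speaker identity is known.
        self._update(speaker, embed(utterance))

    def classify(self, utterance: List[float]) -> str:
        # Determine the source speaker without further identity data,
        # by nearest centroid, then update the model with the result.
        e = embed(utterance)
        speaker = min(self.centroids,
                      key=lambda s: math.dist(e, self.centroids[s][0]))
        self._update(speaker, e)
        return speaker

    def _update(self, speaker: str, e: List[float]) -> None:
        centroid, count = self.centroids.get(speaker, ([0.0] * len(e), 0))
        count += 1
        centroid = [c + (x - c) / count for c, x in zip(centroid, e)]
        self.centroids[speaker] = (centroid, count)


# Usage: enroll two labeled utterances, then classify an unlabeled one.
# model = DiarizationModel()
# model.enroll("speaker_a", [0.1, 0.2, 0.3])
# model.enroll("speaker_b", [0.9, 0.8, 0.7])
# model.classify([0.15, 0.25, 0.2])  # returns a speaker and adapts the model
```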

    Adaptive Diarization Model and User Interface

    Publication Number: US20220310109A1

    Publication Date: 2022-09-29

    Application Number: US17596861

    Filing Date: 2019-07-01

    Applicant: Google LLC

    Abstract: A computing device receives a first audio waveform representing a first utterance and a second utterance. The computing device receives identity data indicating that the first utterance corresponds to a first speaker and the second utterance corresponds to a second speaker. The computing device determines, based on the first utterance, the second utterance, and the identity data, a diarization model configured to distinguish between utterances by the first speaker and utterances by the second speaker. The computing device receives, exclusively of receiving further identity data indicating a source speaker of a third utterance, a second audio waveform representing the third utterance. The computing device determines, by way of the diarization model and independently of the further identity data of the first type, the source speaker of the third utterance. The computing device updates the diarization model based on the third utterance and the determined source speaker.
