-
Publication number: US20240185841A1
Publication date: 2024-06-06
Application number: US18490808
Filing date: 2023-10-20
Applicant: Google LLC
Inventors: Bo Li, Yu Zhang, Nanxin Chen, Rohit Prakash Prabhavalkar, Chao-Han Huck Yang, Tara N. Sainath, Trevor Strohman
IPC classification: G10L15/065, G10L15/00
CPC classification: G10L15/065, G10L15/005
Abstract: A method includes obtaining an ASR model trained to recognize speech in a first language and receiving transcribed training utterances in a second language. The method also includes integrating the ASR model with an input reprogramming module and a latent reprogramming module. The method also includes adapting the ASR model to learn how to recognize speech in the second language by training the input reprogramming module and the latent reprogramming module while parameters of the ASR model are frozen.
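The idea of training only the reprogramming modules while the base model stays frozen can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two-layer linear "model", the additive offsets standing in for the input and latent reprogramming modules, the squared-error loss, and all names. The abstract does not specify any of these details.

```python
# Toy sketch: adapt a frozen model by training only two additive offsets.
# The linear "model" and every name below are hypothetical stand-ins.

W1 = [[0.5, 0.1], [-0.2, 0.4]]   # frozen first layer
W2 = [[0.3, -0.1], [0.2, 0.6]]   # frozen second layer


def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def matvec_t(m, v):
    """Multiply the transpose of matrix m by vector v."""
    return [sum(m[i][j] * v[i] for i in range(len(m))) for j in range(len(m[0]))]


def forward(x, input_offset, latent_offset):
    """Frozen model with an offset added at the input and at the hidden layer."""
    hidden = matvec(W1, [xi + a for xi, a in zip(x, input_offset)])
    hidden = [h + b for h, b in zip(hidden, latent_offset)]
    return matvec(W2, hidden)


def mean_squared_error(data, input_offset, latent_offset):
    total = 0.0
    for x, y in data:
        out = forward(x, input_offset, latent_offset)
        total += sum((o - t) ** 2 for o, t in zip(out, y))
    return total / len(data)


def train(data, steps=200, lr=0.1):
    """Gradient descent on the two offsets only; W1 and W2 are never updated."""
    input_offset = [0.0, 0.0]
    latent_offset = [0.0, 0.0]
    for _ in range(steps):
        grad_in = [0.0, 0.0]
        grad_lat = [0.0, 0.0]
        for x, y in data:
            out = forward(x, input_offset, latent_offset)
            d_out = [2.0 * (o - t) / len(data) for o, t in zip(out, y)]
            d_hidden = matvec_t(W2, d_out)    # gradient w.r.t. the latent offset
            d_input = matvec_t(W1, d_hidden)  # gradient w.r.t. the input offset
            grad_lat = [g + d for g, d in zip(grad_lat, d_hidden)]
            grad_in = [g + d for g, d in zip(grad_in, d_input)]
        latent_offset = [b - lr * g for b, g in zip(latent_offset, grad_lat)]
        input_offset = [a - lr * g for a, g in zip(input_offset, grad_in)]
    return input_offset, latent_offset


# Hypothetical "second language" targets: the frozen outputs plus a constant shift.
SHIFT = [0.5, -0.3]
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
zero = [0.0, 0.0]
data = [(x, [o + s for o, s in zip(forward(x, zero, zero), SHIFT)]) for x in inputs]

initial_loss = mean_squared_error(data, zero, zero)
learned_input, learned_latent = train(data)
final_loss = mean_squared_error(data, learned_input, learned_latent)
print(initial_loss, final_loss)
```

In this toy setup the loss falls even though `W1` and `W2` are never modified, which is the property the abstract relies on. A real system would presumably use learned modules and a sequence loss rather than constant offsets.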
-
Publication number: US12002450B2
Publication date: 2024-06-04
Application number: US17183743
Filing date: 2021-02-24
Abstract: A computer-implemented method for speech recognition, comprising receiving a frame of speech audio; encoding the frame of speech audio; calculating a halting probability based on the frame of speech audio; adding the halting probability to a first accumulator variable; in response to the first accumulator variable exceeding or reaching a first threshold, calculating a context vector based on the halting probability and the encoding of the frame of speech audio; performing a decoding step using the context vector to derive a token; and executing a function based on the derived token, wherein the executed function comprises at least one of text output or command performance.
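The accumulate-then-decode loop in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the frame encodings and halting probabilities are supplied as plain lists instead of being produced by a network, the context vector is taken to be a halting-probability-weighted sum of the encodings seen since the last decoding step, and the accumulator is reset after each step. The function name and these choices are hypothetical; the abstract only says the context vector is based on the halting probability and the encoding.

```python
def segment_contexts(encodings, halting_probs, threshold=1.0):
    """Return one context vector each time the accumulator reaches the threshold.

    encodings: list of equal-length float lists, one per audio frame.
    halting_probs: list of floats, one per audio frame.
    """
    contexts = []
    accumulator = 0.0
    context = None
    for enc, p in zip(encodings, halting_probs):
        if context is None:
            context = [0.0] * len(enc)
        accumulator += p
        # Weight each frame encoding by its halting probability (an assumption).
        context = [c + p * e for c, e in zip(context, enc)]
        if accumulator >= threshold:
            contexts.append(context)  # a decoder would consume this to emit a token
            accumulator = 0.0
            context = None
    return contexts


frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
probs = [0.5, 0.5, 0.25, 0.75]
print(segment_contexts(frames, probs))  # [[0.5, 0.5], [1.75, 1.75]]
```

Each returned vector would feed one decoding step; frames left over when the input ends without reaching the threshold are simply dropped in this sketch.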
-
Publication number: US11996103B2
Publication date: 2024-05-28
Application number: US17811605
Filing date: 2022-07-11
Applicant: Google LLC
IPC classification: G10L15/00, G06F16/632, G10L15/04, G10L15/19, G10L15/197, G10L15/22, G10L15/26, G10L15/08, G10L15/183
CPC classification: G10L15/26, G06F16/632, G10L15/04, G10L15/19, G10L15/197, G10L2015/085, G10L15/183, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for voice recognition. In one aspect, a method includes the actions of receiving a voice input; determining a transcription for the voice input, wherein determining the transcription for the voice input includes, for a plurality of segments of the voice input: obtaining a first candidate transcription for a first segment of the voice input; determining one or more contexts associated with the first candidate transcription; adjusting a respective weight for each of the one or more contexts; and determining a second candidate transcription for a second segment of the voice input based in part on the adjusted weights; and providing the transcription of the plurality of segments of the voice input for output.
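One way to picture the context-weighting step is the sketch below. It is illustrative only: the keyword table, the additive boost, the candidate tuples, and the function names are all assumptions, and the abstract does not say how contexts are detected or how weights enter the scoring.

```python
# Hypothetical mapping from a context to words that signal it.
CONTEXT_KEYWORDS = {
    "music": {"play", "song"},
    "navigation": {"navigate", "route"},
}


def adjust_weights(weights, transcript, boost=1.0):
    """Raise the weight of every context signalled by the first segment."""
    words = set(transcript.lower().split())
    updated = dict(weights)
    for context, keywords in CONTEXT_KEYWORDS.items():
        if words & keywords:
            updated[context] = updated.get(context, 0.0) + boost
    return updated


def pick_candidate(candidates, weights):
    """Choose the second-segment candidate with the best context-adjusted score.

    candidates: list of (text, base_score, context) tuples.
    """
    best = max(candidates, key=lambda c: c[1] + weights.get(c[2], 0.0))
    return best[0]


second_segment = [
    ("by the beetles", 0.6, "other"),
    ("by the Beatles", 0.5, "music"),
]
weights = adjust_weights({}, "play a song")
print(pick_candidate(second_segment, weights))  # by the Beatles
```

Without the adjusted weights the higher base score wins and the sketch returns "by the beetles", which shows how context from the first segment can change the result for the second.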
-
Publication number: US11996095B2
Publication date: 2024-05-28
Application number: US16991363
Filing date: 2020-08-12
Applicant: KYNDRYL, INC.
IPC classification: G10L15/00, G06F3/01, G06F18/214, G10L15/22, G10L15/30, H04L12/28, G10L15/02, G10L15/16, G10L25/30
CPC classification: G10L15/22, G06F3/017, G06F18/2155, G10L15/30, H04L12/282, H04L12/2829, G10L15/02, G10L15/16, G10L2015/223, G10L2015/225, G10L25/30
Abstract: The exemplary embodiments disclose a method, a computer program product, and a computer system for managing user commands. The exemplary embodiments may include a user giving one or more commands to one or more devices, collecting data of the one or more commands, extracting one or more features from the collected data, and determining which one or more of the commands should be executed on which one or more of the devices based on the extracted one or more features and one or more models.
-
Publication number: US11972771B2
Publication date: 2024-04-30
Application number: US17131989
Filing date: 2020-12-23
IPC classification: G10L15/00, G06F3/16, G06F40/279, G06Q30/0201, G06Q50/26, G10L15/08, G10L15/26, G10L15/30, G10L21/10, G10L21/18, G10L25/51, H04W4/90, H04W88/02
CPC classification: G10L21/10, G06F3/165, G06F40/279, G06Q30/0201, G06Q50/26, G10L15/08, G10L15/26, G10L15/30, G10L21/18, G10L25/51, G10L2015/088, H04W4/90, H04W88/02
Abstract: Systems and methods of monitoring radio channels and automatically providing selective notifications through a network that messages containing useful information, transmitted in the form of voice content, have been received. Keywords are compared with textual data transcribed from voice messages received on a radio channel. Upon identifying a correlation between the textual data and the keywords, a notification is automatically generated that indicates receipt of a given message, the existence of the correlation with the keywords, and an identity of the channel, so that client terminals can receive the message and also receive subsequent or related messages.
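The keyword comparison and notification step might look like the sketch below. It assumes that "correlation" means a case-insensitive whole-word match and that a notification is a plain dictionary; the function name, the field names, and the channel identifier are hypothetical and do not come from the patent.

```python
import re


def build_notification(channel_id, transcript, keywords):
    """Return a notification dict if any keyword occurs in the transcript, else None."""
    words = set(re.findall(r"[a-z0-9']+", transcript.lower()))
    matched = sorted(k for k in keywords if k.lower() in words)
    if not matched:
        return None
    return {
        "channel": channel_id,        # identity of the radio channel
        "matched_keywords": matched,  # evidence of the correlation
        "message": transcript,        # transcribed voice content
    }


note = build_notification(
    "ch-7", "Structure fire reported on Main Street", ["fire", "flood"]
)
print(note)
```

A transcript with no matching keyword returns `None`, so no notification would be sent for that message.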
-
Publication number: US11972226B2
Publication date: 2024-04-30
Application number: US17269800
Filing date: 2020-03-23
Applicant: Google LLC
Inventor: Dirk Ryan Padfield
IPC classification: G06F40/58, G10L15/00, G10L15/06, G10L15/197, G10L15/22
CPC classification: G06F40/58, G10L15/005, G10L15/063, G10L15/197, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that facilitate generating stable real-time textual translations in a target language of an input audio data stream that is recorded in a source language. An audio stream that is recorded in a first language is obtained. A partial transcription of the audio can be generated at each time interval in a plurality of successive time intervals. Each partial transcription can be translated into a second language that is different from the first language. Each translated partial transcription can be input to a model that determines whether a portion of an input translated partial transcription is stable. Based on the input translated partial transcription, the model identifies a portion of the translated partial transcription that is predicted to be stable. This stable portion of the translated partial transcription is provided for display on a user device.
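The abstract relies on a model to predict which part of a translated partial transcription is stable. As a rough stand-in for that model, the sketch below uses a simple heuristic: a word prefix counts as stable if it is identical across the most recent translated partials. The heuristic, the `window` parameter, and the function name are assumptions chosen for illustration and are not the patented model.

```python
def stable_prefix(partial_translations, window=2):
    """Return the word prefix shared by the last `window` translated partials."""
    recent = [p.split() for p in partial_translations[-window:]]
    if len(recent) < window:
        return ""  # not enough history to call anything stable
    prefix = []
    for words in zip(*recent):
        if all(w == words[0] for w in words):
            prefix.append(words[0])
        else:
            break
    return " ".join(prefix)


partials = ["I am", "I am going", "I am going to"]
print(stable_prefix(partials))  # I am going
```

Only the returned prefix would be shown to the user; the trailing words stay hidden until a later partial confirms them, which reduces on-screen flicker at the cost of some latency.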
-
Publication number: US11968230B2
Publication date: 2024-04-23
Application number: US17205009
Filing date: 2021-03-18
Inventors: Naresh Olladapu, Mudit Mehrotra, Ajay Gupta, Arvind Agarwal
Abstract: A method, system, and computer program product for managing communication privacy in a conversation system are provided. The method detects an utterance on a public channel by a user of a computing device. A privacy nature of the utterance is determined. Based on the privacy nature, a classification confidence is determined for the utterance. The method generates a privacy question to be presented to the user based on the privacy nature and the classification confidence. In response to a confirmation response, a privacy channel is established. The method switches from the public channel to the privacy channel.
-
Publication number: US11967248B2
Publication date: 2024-04-23
Application number: US17425947
Filing date: 2019-12-12
Applicant: Jangho Lee
Inventor: Jangho Lee
CPC classification: G09B19/06, G06F40/263, G09B5/06, G10L13/08, G10L15/005, G10L15/22, G10L15/26, G10L2015/223
Abstract: A method for foreign language learning between a learner and a terminal, based on video or audio containing a foreign language, and in particular a conversation-based foreign language learning method using a speech recognition function and a TTS function of the terminal. The learner learns the foreign language as follows: when speech input by the learner while the terminal is in a speech waiting state is the same as the current learning target sentence, or belongs to the same category as that sentence, the terminal reads the current learning target sentence to the learner so that the learner can speak it after the terminal; and when the speech input by the learner is the same as the sentence following the current learning target sentence, or belongs to the same category as that following sentence, the terminal and the learner alternately speak sentences one by one.
-
Publication number: US11966706B2
Publication date: 2024-04-23
Application number: US17881445
Filing date: 2022-08-04
Applicant: DoorDash, Inc.
IPC classification: G10L15/00, G06F40/253, G06F40/284, G06F40/289, G06F40/35, G10L15/01, H04M3/51
CPC classification: G06F40/35, G06F40/253, G06F40/284, G06F40/289, G10L15/01, H04M3/5175
Abstract: A dialogue complexity assessment method, system, and computer program product including calculating a complexity utilizing domain-dependent terms and domain-independent terms of a dialogue, where the dialogue includes dialogue data from contact centers of service providers.
-
Publication number: US11961506B2
Publication date: 2024-04-16
Application number: US18113284
Filing date: 2023-02-23
Inventors: Chansik Bok, Jihun Park
CPC classification: G10L15/005, G10L13/086, G10L15/04, G10L15/22, G10L15/26, G10L2015/223
Abstract: An electronic apparatus including a memory configured to store first voice recognition information related to a first language and second voice recognition information related to a second language, and a processor configured to obtain, on the basis of the first voice recognition information, a first text corresponding to a received user voice and, based on the obtained first text indicating that an entity name is included in the user voice, identify a segment of the user voice in which the entity name is included. The processor is further configured to obtain a second text corresponding to the identified segment of the user voice on the basis of the second voice recognition information, and to obtain control information corresponding to the user voice on the basis of the first text and the second text.