Patent search ap:("Google LLC") AND inv:"Golan Pundak" Page 3

21.

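Granted invention patent
Enhancing audio using multiple recording devices (Active)

Publication (announcement) number: US10586569B2

Publication (announcement) date: 2020-03-10

Application number: US15954105

Application date: 2018-04-16

Applicant: Google LLC

Inventor: Dimitri Kanevsky , Golan Pundak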

IPC: G06F17/00 , G11B20/10 , G06F3/16 , G10L17/00 , G10L21/0364 , H04M3/56 , G10L25/51 , G10L21/02 , G10L21/0208 , G10L21/028 , G10L25/84

Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for identifying that a first audio stream includes first, second, and third sources of audio. A computing system identifies that a second audio stream includes the first, second, and third sources of audio. The computing system determines that the first and second sources of audio are part of a first conversation. The computing system generates a third audio stream that combines the first source of audio from the first audio stream, the first source of audio from the second audio stream, the second source of audio from the first audio stream, and the second source of audio from the second audio stream, and diminishes the third source of audio from the first audio stream, and the third source of audio from the second audio stream.
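The combining step in this abstract can be illustrated with a minimal sketch. It assumes each audio stream has already been separated into per-source sample lists (the separation and the conversation detection are not shown). The names `Stream`, `enhance`, and `diminish_gain` are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Set

# Hypothetical representation: each stream maps a source label to that
# source's already-separated samples (the separation step is assumed).
Stream = Dict[str, List[float]]


def enhance(first: Stream, second: Stream, conversation: Set[str],
            diminish_gain: float = 0.1) -> List[float]:
    """Mix two streams, keeping conversation sources and attenuating the rest."""
    # Truncate to the shortest source so every index is valid.
    length = min(len(s) for stream in (first, second) for s in stream.values())
    mixed = [0.0] * length
    for stream in (first, second):
        for source, samples in stream.items():
            # Sources in the conversation keep full gain; others are diminished.
            gain = 1.0 if source in conversation else diminish_gain
            for i in range(length):
                mixed[i] += gain * samples[i]
    return mixed
```

Here the third audio stream is the returned list: the first and second sources from both input streams are summed at full gain, while the third source from both streams is scaled down by `diminish_gain`.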

22.

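Invention application
ENHANCING AUDIO USING MULTIPLE RECORDING DEVICES (Under examination, published)

Publication (announcement) number: US20180233173A1

Publication (announcement) date: 2018-08-16

Application number: US15954105

Application date: 2018-04-16

Applicant: Google LLC

Inventor: Dimitri Kanevsky , Golan Pundak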

IPC: G11B20/10 , G10L17/00 , G06F3/16 , H04M3/56

CPC classification number: G11B20/10527 , G06F3/16 , G06F3/165 , G10L17/00 , G10L21/0202 , G10L21/0208 , G10L21/028 , G10L21/0364 , G10L25/51 , G10L25/84 , G11B2020/10546 , H04M3/56 , H04M3/568

Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for identifying that a first audio stream includes first, second, and third sources of audio. A computing system identifies that a second audio stream includes the first, second, and third sources of audio. The computing system determines that the first and second sources of audio are part of a first conversation. The computing system generates a third audio stream that combines the first source of audio from the first audio stream, the first source of audio from the second audio stream, the second source of audio from the first audio stream, and the second source of audio from the second audio stream, and diminishes the third source of audio from the first audio stream, and the third source of audio from the second audio stream.

23.

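Granted invention patent
Enhancing audio using multiple recording devices (Active)

Publication (announcement) number: US12051443B2

Publication (announcement) date: 2024-07-30

Application number: US17891295

Application date: 2022-08-19

Applicant: Google LLC

Inventor: Dimitri Kanevsky , Golan Pundak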

IPC: G11B20/10 , G06F3/16 , G10L17/00 , G10L21/0208 , G10L21/028 , G10L21/0364 , G10L25/51 , G10L25/84 , H04M3/56

CPC classification number: G11B20/10527 , G06F3/16 , G06F3/165 , G10L17/00 , G10L21/0364 , G10L25/51 , H04M3/56 , H04M3/568 , G10L21/0208 , G10L21/028 , G10L25/84 , G11B2020/10546

Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for identifying that a first audio stream includes first, second, and third sources of audio. A computing system identifies that a second audio stream includes the first, second, and third sources of audio. The computing system determines that the first and second sources of audio are part of a first conversation. The computing system generates a third audio stream that combines the first source of audio from the first audio stream, the first source of audio from the second audio stream, the second source of audio from the first audio stream, and the second source of audio from the second audio stream, and diminishes the third source of audio from the first audio stream, and the third source of audio from the second audio stream.

24.

Invention publication
RESOLVING UNIQUE PERSONAL IDENTIFIERS DURING CORRESPONDING CONVERSATIONS BETWEEN A VOICE BOT AND A HUMAN (Under examination, published)

Publication (announcement) number: US20230419964A1

Publication (announcement) date: 2023-12-28

Application number: US18462787

Application date: 2023-09-07

Applicant: GOOGLE LLC

Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni

IPC: G10L15/22 , G10L15/06

CPC classification number: G10L15/22 , G10L15/063 , G10L2015/0635

Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.
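The clarification idea in this abstract can be sketched in a few lines. This is only an illustration under strong assumptions: it swaps the ML layers for a simple rule (ask about any character position where the candidates disagree), makes a single pass rather than prompting repeatedly, and assumes all candidates have the same length. The names `resolve_identifier` and `ask` are hypothetical.

```python
from typing import Callable, List


def resolve_identifier(candidates: List[str],
                       ask: Callable[[int, List[str]], str]) -> str:
    """Resolve an identifier, asking only about positions where candidates disagree."""
    # Assumption: every candidate identifier has the same length.
    resolved = []
    for pos in range(len(candidates[0])):
        # Distinct characters proposed for this position across candidates.
        options = sorted({c[pos] for c in candidates})
        if len(options) == 1:
            resolved.append(options[0])  # all hypotheses agree, no prompt needed
        else:
            resolved.append(ask(pos, options))  # clarification request
    return "".join(resolved)
```

In a real voice bot, `ask` would issue a spoken clarification request; in a test it can be any callable that returns the chosen character.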

25.

Granted invention patent
Resolving unique personal identifiers during corresponding conversations between a voice bot and a human (Active)

Publication (announcement) number: US11790906B2

Publication (announcement) date: 2023-10-17

Application number: US17157207

Application date: 2021-01-25

Applicant: GOOGLE LLC

Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni

IPC: G10L15/22 , G10L15/06

CPC classification number: G10L15/22 , G10L15/063 , G10L2015/0635

Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.

26.

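Invention application
ENHANCING AUDIO USING MULTIPLE RECORDING DEVICES (Active)

Publication (announcement) number: US20220392489A1

Publication (announcement) date: 2022-12-08

Application number: US17891295

Application date: 2022-08-19

Applicant: Google LLC

Inventor: Dimitri Kanevsky , Golan Pundak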

IPC: G11B20/10 , G06F3/16 , G10L17/00 , G10L21/0364 , H04M3/56 , G10L25/51

Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for identifying that a first audio stream includes first, second, and third sources of audio. A computing system identifies that a second audio stream includes the first, second, and third sources of audio. The computing system determines that the first and second sources of audio are part of a first conversation. The computing system generates a third audio stream that combines the first source of audio from the first audio stream, the first source of audio from the second audio stream, the second source of audio from the first audio stream, and the second source of audio from the second audio stream, and diminishes the third source of audio from the first audio stream, and the third source of audio from the second audio stream.

27.

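Granted invention patent
Contextual biasing for speech recognition (Active)

Publication (announcement) number: US11423883B2

Publication (announcement) date: 2022-08-23

Application number: US16836445

Application date: 2020-03-31

Applicant: Google LLC

Inventor: Rohit Prakash Prabhavalkar , Golan Pundak , Tara N. Sainath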

IPC: G10L15/16 , G10L15/26

Abstract: A method includes receiving audio data encoding an utterance and obtaining a set of bias phrases corresponding to a context of the utterance. Each bias phrase includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data to generate an output from the speech recognition model. The speech recognition model includes a first encoder configured to receive the acoustic features, a first attention module, a bias encoder configured to receive data indicating the obtained set of bias phrases, a bias attention module, and a decoder configured to determine likelihoods of sequences of speech elements based on output of the first attention module and output of the bias attention module. The method also includes determining a transcript for the utterance based on the likelihoods of sequences of speech elements.
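To make the role of the bias attention module more concrete, here is a minimal sketch of one common form of attention: dot-product scores followed by a softmax-weighted sum. The abstract does not specify the scoring function, so this choice is an assumption. The name `bias_attention` and the idea that each bias phrase has already been encoded as a fixed-size vector are also assumptions for illustration.

```python
import math
from typing import List


def bias_attention(query: List[float],
                   bias_vectors: List[List[float]]) -> List[float]:
    """Return a softmax-weighted sum of bias-phrase vectors scored against query."""
    # Dot-product score between the decoder-side query and each bias vector.
    scores = [sum(q * b for q, b in zip(query, vec)) for vec in bias_vectors]
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Context vector: weighted sum over bias vectors, one dimension at a time.
    dim = len(bias_vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, bias_vectors))
            for d in range(dim)]
```

The returned context vector would be one of the inputs a decoder could use, alongside the output of the first attention module over the acoustic encoder.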

28.

Granted invention patent
Contextual biasing for speech recognition using grapheme and phoneme data (Active)

Publication (announcement) number: US11217231B2

Publication (announcement) date: 2022-01-04

Application number: US16863766

Application date: 2020-04-30

Applicant: Google LLC

Inventor: Rohit Prakash Prabhavalkar , Golan Pundak , Tara N. Sainath , Antoine Jean Bruguier

IPC: G10L15/187 , G06N20/10 , G10L19/04 , G10L15/08

Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.

29.

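Invention application
ENHANCING AUDIO USING MULTIPLE RECORDING DEVICES (Active)

Publication (announcement) number: US20210193180A1

Publication (announcement) date: 2021-06-24

Application number: US17194827

Application date: 2021-03-08

Applicant: Google LLC

Inventor: Dimitri Kanevsky , Golan Pundak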

IPC: G11B20/10 , G06F3/16 , G10L17/00 , G10L21/0364 , H04M3/56 , G10L25/51

Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for identifying that a first audio stream includes first, second, and third sources of audio. A computing system identifies that a second audio stream includes the first, second, and third sources of audio. The computing system determines that the first and second sources of audio are part of a first conversation. The computing system generates a third audio stream that combines the first source of audio from the first audio stream, the first source of audio from the second audio stream, the second source of audio from the first audio stream, and the second source of audio from the second audio stream, and diminishes the third source of audio from the first audio stream, and the third source of audio from the second audio stream.
