Patent search ap:("Google LLC") AND inv:"Izhak Shafran" Page 1

1.

Invention application
LEARNING TO EXTRACT ENTITIES FROM CONVERSATIONS WITH NEURAL NETWORKS (Active)

Publication (announcement) number: US20220075944A1

Publication (announcement) date: 2022-03-10

Application number: US17432259

Application date: 2020-02-19

Applicant: Google LLC

Inventor: Nan Du , Linh Trans , Yu-hui Chen , Izhak Shafran

IPC: G06F40/284 , G06F40/295 , G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for extracting entities from conversation transcript data. One of the methods includes obtaining a conversation transcript sequence, processing the conversation transcript sequence using a span detection neural network configured to generate a set of text token spans; and for each text token span: processing a span representation using an entity name neural network to generate an entity name probability distribution over a set of entity names, each probability in the entity name probability distribution representing a likelihood that a corresponding entity name is a name of the entity referenced by the text token span; and processing the span representation using an entity status neural network to generate an entity status probability distribution over a set of entity statuses.
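
The two per-span classifiers described in this abstract can be illustrated with a minimal sketch. It assumes that each head is a single dense layer followed by a softmax, which the abstract does not state; the label sets, the weights, and the names `classify_span`, `name_head`, and `status_head` are hypothetical and chosen only for illustration. The span detection network itself is not modelled here.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """One dense layer: weights is a list of rows, one row per output class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

# Hypothetical label sets; the abstract does not enumerate them.
ENTITY_NAMES = ["headache", "fever", "cough"]
ENTITY_STATUSES = ["experienced", "not experienced", "unknown"]

def classify_span(span_repr, name_head, status_head):
    """Apply both heads to the same span representation."""
    name_probs = softmax(linear(name_head["W"], name_head["b"], span_repr))
    status_probs = softmax(linear(status_head["W"], status_head["b"], span_repr))
    return (dict(zip(ENTITY_NAMES, name_probs)),
            dict(zip(ENTITY_STATUSES, status_probs)))

# Toy 2-dimensional span representation and hand-picked (not learned) weights.
span_repr = [1.0, -0.5]
name_head = {"W": [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.5]], "b": [0.0, 0.0, 0.0]}
status_head = {"W": [[1.0, 1.0], [0.0, -2.0], [0.0, 0.0]], "b": [0.0, 0.0, 0.0]}

name_dist, status_dist = classify_span(span_repr, name_head, status_head)
best_name = max(name_dist, key=name_dist.get)
best_status = max(status_dist, key=status_dist.get)
```

Both heads read the same span representation, so one detected span yields one distribution over names and one over statuses.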

2.

Invention application
COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING (Under examination, published)

Publication (announcement) number: US20180174575A1

Publication (announcement) date: 2018-06-21

Application number: US15386979

Application date: 2016-12-21

Applicant: Google LLC

Inventor: Samuel Bengio , Mirko Visontai , Christopher Walter George Thornton , Michiel A.U. Bacchiani , Tara N. Sainath , Ehsan Variani , Izhak Shafran

IPC: G10L15/16 , G10L19/02 , G10L15/02

CPC classification number: G10L15/16 , G10H1/00 , G10H2210/036 , G10H2210/046 , G10H2250/235 , G10H2250/311 , G10L15/02 , G10L17/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.
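
The front-end steps in this abstract (frequency domain data, then a complex linear projection, then input to an acoustic model) can be sketched as follows. This is an illustration under assumptions, not the patented method: the naive DFT, the log-magnitude step, the toy weights, and the function names are all hypothetical, and the acoustic model itself is omitted.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def complex_linear_projection(weights, spectrum):
    """Multiply a complex weight matrix (one row per output) by the spectrum."""
    return [sum(w * x for w, x in zip(row, spectrum)) for row in weights]

def log_magnitude(projected, eps=1e-6):
    """Assumed post-processing: real-valued features for the acoustic model."""
    return [math.log(abs(z) + eps) for z in projected]

# Toy frame: one cycle of a cosine over 4 samples.
frame = [1.0, 0.0, -1.0, 0.0]
spectrum = dft(frame)  # the energy lands in bins 1 and 3

# Hypothetical 2 x 4 complex weights; in a real system these would be learned.
weights = [
    [0j, 1 + 0j, 0j, 0j],   # picks out bin 1
    [0.5j, 0j, 0.5j, 0j],   # mixes bins 0 and 2, which are empty for this frame
]
projected = complex_linear_projection(weights, spectrum)
features = log_magnitude(projected)
```

Here `features` is the real-valued vector that would be passed to the neural network acting as the acoustic model.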

3.

Invention application
Joint Speech and Language Model Using Large Language Models (Active)

Publication (announcement) number: US20240386881A1

Publication (announcement) date: 2024-11-21

Application number: US18667763

Application date: 2024-05-17

Applicant: Google LLC

Inventor: Mingqiu Wang , Hagen Soltau , Izhak Shafran

IPC: G10L15/16

Abstract: Methods and systems for recognizing speech are disclosed herein. A method can include performing blank filtering on a received speech input to generate a plurality of filtered encodings and processing the plurality of filtered encodings to generate a plurality of audio embeddings. The method can also include mapping each audio embedding of the plurality of audio embeddings to a textual embedding using a speech adapter to generate a plurality of combined embeddings and receiving one or more specific textual embeddings from a domain-specific entity retriever based on the plurality of filtered encodings. The method can further include providing the plurality of combined embeddings and the one or more specific textual embeddings to a machine-trained model and receiving a textual output representing speech from the speech input from the machine-trained model.

4.

Granted invention patent
Adaptive multichannel dereverberation for automatic speech recognition (Active)

Publication (announcement) number: US11699453B2

Publication (announcement) date: 2023-07-11

Application number: US17005823

Application date: 2020-08-28

Applicant: Google LLC

Inventor: Joseph Caroselli , Arun Narayanan , Izhak Shafran , Richard Rose

IPC: G10L21/00 , G10L21/0208 , G10L15/20 , G10L15/22 , G10L15/065 , G06F3/16 , G06N3/02 , G06F17/14 , G10L15/06 , G10L21/0216

CPC classification number: G10L21/0208 , G06F3/167 , G06F17/142 , G06N3/02 , G10L15/063 , G10L15/065 , G10L15/20 , G10L15/22 , G10L2015/223 , G10L2021/02082 , G10L2021/02166

Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).

5.

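Invention application
COMPLEX EVOLUTION RECURRENT NEURAL NETWORKS (Under examination, published)

Publication (announcement) number: US20190156819A1

Publication (announcement) date: 2019-05-23

Application number: US16251430

Application date: 2019-01-18

Applicant: Google LLC

Inventor: Izhak Shafran , Thomas E. Bagby , Russell John Wyatt Skerry-Ryan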

IPC: G10L15/16 , G10L19/02 , G10L15/02 , G10H1/00

CPC classification number: G10L15/16 , G06N3/02 , G10H1/00 , G10H2210/036 , G10H2210/046 , G10H2250/235 , G10H2250/311 , G10L15/02 , G10L17/18 , G10L19/0212 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex evolution recurrent neural networks. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A first vector sequence comprising audio features determined from the audio data is generated. A second vector sequence is generated, as output of a first recurrent neural network in response to receiving the first vector sequence as input, where the first recurrent neural network has a transition matrix that implements a cascade of linear operators comprising (i) first linear operators that are complex-valued and unitary, and (ii) one or more second linear operators that are non-unitary. An output vector sequence of a second recurrent neural network is generated. A transcription for the utterance is generated based on the output vector sequence generated by the second recurrent neural network. The transcription for the utterance is provided.
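
The transition matrix described in this abstract, a cascade of complex-valued unitary operators and non-unitary operators, can be illustrated with a small sketch. The specific operators chosen here (a diagonal phase rotation, a permutation, and a real diagonal scaling) are assumptions for illustration only; the abstract does not name them. The input term and nonlinearity of a full recurrent step are left out.

```python
import cmath
import math

def phase_rotation(thetas, h):
    """Unitary diagonal operator: multiply each entry by exp(i * theta)."""
    return [cmath.exp(1j * t) * x for t, x in zip(thetas, h)]

def permute(order, h):
    """Unitary permutation operator: reorder the entries."""
    return [h[i] for i in order]

def scale(gains, h):
    """Non-unitary diagonal operator: real gains that can shrink or grow entries."""
    return [g * x for g, x in zip(gains, h)]

def transition(h, thetas, order, gains):
    """Cascade: unitary rotation, unitary permutation, then non-unitary scaling."""
    return scale(gains, permute(order, phase_rotation(thetas, h)))

def norm(h):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in h))

# Toy hidden state and hand-picked (not learned) parameters.
h = [1 + 0j, 0 + 2j, -1 - 1j]
thetas = [0.3, -1.2, 2.0]
order = [2, 0, 1]
gains = [0.5, 1.0, 0.9]

unitary_only = permute(order, phase_rotation(thetas, h))
h_next = transition(h, thetas, order, gains)
```

The unitary part leaves the norm of the hidden state unchanged, while the scaling step is what allows the state to decay or grow.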

6.

Granted invention patent
Complex linear projection for acoustic modeling (Active)

Publication (announcement) number: US10140980B2

Publication (announcement) date: 2018-11-27

Application number: US15386979

Application date: 2016-12-21

Applicant: Google LLC

Inventor: Samuel Bengio , Mirko Visontai , Christopher Walter George Thornton , Michiel A. U. Bacchiani , Tara N. Sainath , Ehsan Variani , Izhak Shafran

IPC: G10L15/16 , G10L19/02 , G10L15/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

7.

Granted invention patent
Learning to extract entities from conversations with neural networks (Active)

Publication (announcement) number: US12216999B2

Publication (announcement) date: 2025-02-04

Application number: US17432259

Application date: 2020-02-19

Applicant: Google LLC

Inventor: Nan Du , Linh Mai Tran , Yu-Hui Chen , Izhak Shafran

IPC: G06F40/279 , G06F40/284 , G06F40/295 , G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for extracting entities from conversation transcript data. One of the methods includes obtaining a conversation transcript sequence, processing the conversation transcript sequence using a span detection neural network configured to generate a set of text token spans; and for each text token span: processing a span representation using an entity name neural network to generate an entity name probability distribution over a set of entity names, each probability in the entity name probability distribution representing a likelihood that a corresponding entity name is a name of the entity referenced by the text token span; and processing the span representation using an entity status neural network to generate an entity status probability distribution over a set of entity statuses.

8.

Granted invention patent
Joint automatic speech recognition and speaker diarization (Active)

Publication (announcement) number: US12039982B2

Publication (announcement) date: 2024-07-16

Application number: US17601662

Application date: 2020-04-06

Applicant: Google LLC

Inventor: Laurent El Shafey , Hagen Soltau , Izhak Shafran

IPC: G10L15/22 , G10L15/26 , G10L15/30 , G10L17/18 , G10L15/06

CPC classification number: G10L17/18 , G10L15/22 , G10L15/26 , G10L15/30 , G10L15/063

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing audio data using neural networks.

9.

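Invention application
COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING (Under examination, published)

Publication (announcement) number: US20200286468A1

Publication (announcement) date: 2020-09-10

Application number: US16879322

Application date: 2020-05-20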

Applicant: Google LLC

Inventor: Samuel Bengio , Mirko Visontai , Christopher Walter George Thornton , Tara N. Sainath , Ehsan Variani , Izhak Shafran , Michiel A.U. Bacchiani

IPC: G10L15/16 , G10L15/02 , G10H1/00 , G10L19/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

10.

Granted invention patent
Complex evolution recurrent neural networks (Active)

Publication (announcement) number: US10529320B2

Publication (announcement) date: 2020-01-07

Application number: US16251430

Application date: 2019-01-18

Applicant: Google LLC

Inventor: Izhak Shafran , Thomas E. Bagby , Russell John Wyatt Skerry-Ryan

IPC: G10L15/16 , G10L19/02 , G10L15/02 , G10H1/00 , G06N3/02 , G10L17/18 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex evolution recurrent neural networks. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A first vector sequence comprising audio features determined from the audio data is generated. A second vector sequence is generated, as output of a first recurrent neural network in response to receiving the first vector sequence as input, where the first recurrent neural network has a transition matrix that implements a cascade of linear operators comprising (i) first linear operators that are complex-valued and unitary, and (ii) one or more second linear operators that are non-unitary. An output vector sequence of a second recurrent neural network is generated. A transcription for the utterance is generated based on the output vector sequence generated by the second recurrent neural network. The transcription for the utterance is provided.
