Patent search ap:("International Business Machines Corporation") AND inv:"George A. Saon" Page 1

1.

发明授权
Method and system for order-free spoken term detection 有权

公开(公告)号：US09704482B2

公开(公告)日：2017-07-11

申请号：US14644817

申请日：2015-03-11

Applicant: International Business Machines Corporation

Inventor： Brian E. D. Kingsbury , Lidia Mangu , Michael A. Picheny , George A. Saon

IPC: G10L15/00 , G10L15/193 , G10L15/08

CPC classification number: G10L15/193 , G10L2015/085 , G10L2015/088

Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.

2.

发明授权
Diarization driven by meta-information identified in discussion content 有权

公开(公告)号：US10468031B2

公开(公告)日：2019-11-05

申请号：US15819158

申请日：2017-11-21

Applicant: International Business Machines Corporation

Inventor： Kenneth W. Church , Dimitrios B. Dimitriadis , Petr Fousek , Miroslav Novak , George A. Saon

IPC: G10L17/00 , G10L15/22 , G10L15/30 , G10L15/183 , G10L25/51 , G10L25/78

Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. The STD process analyzes a number of speaker segments using a language model that determines when speaker changes occur. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.

3.

发明申请
METHOD AND SYSTEM FOR JOINT TRAINING OF HYBRID NEURAL NETWORKS FOR ACOUSTIC MODELING IN AUTOMATIC SPEECH RECOGNITION 有权
Title translation: 混合神经网络在自动语音识别中进行声学建模的联合训练方法与系统

公开(公告)号：US20150161522A1

公开(公告)日：2015-06-11

申请号：US14313554

申请日：2014-06-24

Applicant: International Business Machines Corporation

Inventor： George A. Saon , Hagen Soltau

IPC: G06N99/00

CPC classification number: G06N3/08 , G06N3/0454

Abstract: Systems and methods for training networks are provided. A method for training networks comprises receiving an input from each of a plurality of neural networks differing from each other in at least one of architecture, input modality, and feature type, connecting the plurality of neural networks through a common output layer, or through one or more common hidden layers and a common output layer to result in a joint network, and training the joint network.

Abstract translation: 提供了训练网络的系统和方法。一种用于训练网络的方法包括：在架构，输入模态和特征类型中的至少一个中接收彼此不同的多个神经网络中的每一个的输入，通过公共输出层连接所述多个神经网络，或者通过一个或更常见的隐藏层和公共输出层，以形成联合网络，并训练联合网络。

4.

发明申请
Speaker Adaptation of Neural Network Acoustic Models Using I-Vectors 有权
Title translation: 使用I向量的神经网络声学模型的演讲人适应

公开(公告)号：US20150149165A1

公开(公告)日：2015-05-28

申请号：US14500042

申请日：2014-09-29

Applicant: International Business Machines Corporation

Inventor： George A. Saon

IPC: G10L15/06 , G10L17/18

CPC classification number: G10L15/063 , G10L15/16 , G10L17/18

Abstract: A method includes providing a deep neural network acoustic model, receiving audio data including one or more utterances of a speaker, extracting a plurality of speech recognition features from the one or more utterances of the speaker, creating a speaker identity vector for the speaker based on the extracted speech recognition features, and adapting the deep neural network acoustic model for automatic speech recognition using the extracted speech recognition features and the speaker identity vector.

Abstract translation: 一种方法包括提供深层神经网络声学模型，接收包括扬声器的一个或多个话音的音频数据，从扬声器的一个或多个话语中提取多个语音识别特征，基于提取的语音识别特征，并使用提取的语音识别特征和扬声器身份向量来适应用于自动语音识别的深层神经网络声学模型。

5.

发明授权
Vocal recognition using generally available speech-to-text systems and user-defined vocal training 有权

公开(公告)号：US11151996B2

公开(公告)日：2021-10-19

申请号：US16385630

申请日：2019-04-16

Applicant: International Business Machines Corporation

Inventor： George A. Saon , Nicolò Sgobba , Antonello Izzi , Erik Rueger

IPC: G10L15/20 , G10L15/22 , G10L15/07 , G10L15/06 , G10L15/26

Abstract: Techniques for augmenting the output of generally available speech-to-text systems using local profiles are presented. An example method includes receiving an audio recording of a natural language command. The received audio recording of the natural language command is transmitted to a speech-to-text system, and a text string generated from the audio recording is received from the speech-to-text system. The text string is corrected based on a local profile mapping incorrectly transcribed words from the speech-to-text system to corrected words. A function in a software application is invoked based on the corrected text string.

6.

发明授权
Diarization driven by the ASR based segmentation 有权

公开(公告)号：US11120802B2

公开(公告)日：2021-09-14

申请号：US15819127

申请日：2017-11-21

Applicant: International Business Machines Corporation

Inventor： Kenneth W. Church , Dimitrios B. Dimitriadis , Petr Fousek , Miroslav Novak , George A. Saon

IPC: G10L15/26 , G10L25/78 , G10L15/08 , G10L25/51 , G10L17/04 , G10L17/00 , G10L15/06

Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.

7.

发明申请
Diarization Driven by Meta-Information Identified in Discussion Content 审中-公开

公开(公告)号：US20190156835A1

公开(公告)日：2019-05-23

申请号：US15819158

申请日：2017-11-21

Applicant: International Business Machines Corporation

Inventor： Kenneth W. Church , Dimitrios B. Dimitriadis , Petr Fousek , Miroslav Novak , George A. Saon

IPC: G10L17/00 , G10L15/22 , G10L25/78 , G10L15/183 , G10L25/51 , G10L15/30

CPC classification number: G10L17/005 , G06F17/279 , G10L15/183 , G10L15/22 , G10L15/26 , G10L15/30 , G10L17/04 , G10L17/06 , G10L25/51 , G10L25/78

Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. The STD process analyzes a number of speaker segments using a language model that determines when speaker changes occur. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.

8.

发明申请
Diarization Driven by the ASR Based Segmentation 审中-公开

公开(公告)号：US20190156832A1

公开(公告)日：2019-05-23

申请号：US15819127

申请日：2017-11-21

Applicant: International Business Machines Corporation

Inventor： Kenneth W. Church , Dimitrios B. Dimitriadis , Petr Fousek , Miroslav Novak , George A. Saon

IPC: G10L15/26 , G10L25/78 , G10L15/08

Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.

9.

发明授权
Method and system for joint training of hybrid neural networks for acoustic modeling in automatic speech recognition 有权

公开(公告)号：US10262260B2

公开(公告)日：2019-04-16

申请号：US15434643

申请日：2017-02-16

Applicant: International Business Machines Corporation

Inventor： George A. Saon , Hagen Soltau

IPC: G06N3/08 , G06N3/04

Abstract: Systems and methods for training networks are provided. A method for training networks comprises receiving an input from each of a plurality of neural networks differing from each other in at least one of architecture, input modality, and feature type, connecting the plurality of neural networks through a common output layer, or through one or more common hidden layers and a common output layer to result in a joint network, and training the joint network.

10.

发明授权
Method and system for joint training of hybrid neural networks for acoustic modeling in automatic speech recognition 有权

公开(公告)号：US09665823B2

公开(公告)日：2017-05-30

申请号：US14313554

申请日：2014-06-24

Applicant: International Business Machines Corporation

Inventor： George A. Saon , Hagen Soltau

IPC: G06N3/08 , G06N3/04

CPC classification number: G06N3/08 , G06N3/0454

Abstract: Systems and methods for training networks are provided. A method for training networks comprises receiving an input from each of a plurality of neural networks differing from each other in at least one of architecture, input modality, and feature type, connecting the plurality of neural networks through a common output layer, or through one or more common hidden layers and a common output layer to result in a joint network, and training the joint network.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification