Patent search: ap:("International Business Machines Corporation") AND inv:"Hong-Kwang Kuo" (page 1)

1.

Granted invention patent
End-to-end integration of dialog history for spoken language understanding (status: in force)

Publication No.: US12119008B2

Publication date: 2024-10-15

Application No.: US17655441

Filing date: 2022-03-18

Applicants: International Business Machines Corporation, The Ohio State University

Inventors: Samuel Thomas, Vishal Sunder, Hong-Kwang Kuo, Jatin Ganhotra, Brian E. D. Kingsbury, Eric Fosler-Lussier

IPC classification: G10L19/00, G06F40/126, G06N3/045, G10L15/00

CPC classification: G10L19/00, G06F40/126, G06N3/045, G10L15/00

Abstract: Systems, computer-implemented methods, and computer program products to facilitate end-to-end integration of dialogue history for spoken language understanding are provided. According to an embodiment, a system can comprise a processor that executes components stored in memory. The computer executable components comprise a conversation component that encodes speech-based content of an utterance and text-based content of the utterance into a uniform representation.

2.

Granted invention patent
Training end-to-end spoken language understanding systems with unordered entities (status: in force)

Publication No.: US12046236B2

Publication date: 2024-07-23

Application No.: US17458772

Filing date: 2021-08-27

Applicant: International Business Machines Corporation

Inventors: Hong-Kwang Kuo, Zoltan Tueske, Samuel Thomas, Brian E. D. Kingsbury, George Andrei Saon

IPC classification: G10L15/22, G06N3/08, G10L15/16, G10L15/08

CPC classification: G10L15/22, G06N3/08, G10L15/16, G10L2015/088

Abstract: Training data can be received, which can include pairs of speech and meaning representation associated with the speech as ground truth data. The meaning representation includes at least semantic entities associated with the speech, where the spoken order of the semantic entities is unknown. The semantic entities of the meaning representation in the training data can be reordered into spoken order of the associated speech using an alignment technique. A spoken language understanding machine learning model can be trained using the pairs of speech and meaning representation having the reordered semantic entities. The meaning representation, e.g., semantic entities, in the received training data can be perturbed to create random order sequence variations of the semantic entities associated with speech. Perturbed meaning representation with associated speech can augment the training data.
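
Illustrative sketch (not taken from the patent): the perturbation step described in the abstract above could look roughly like the following, where each training pair keeps its speech reference and gains extra copies with the semantic entities in random order. The function names, the `speech_id` strings, and the `slot=value` entity format are assumptions made only for this example; the alignment-based reordering step is not shown.

```python
import random


def perturb_entity_order(entities, num_variants, seed=0):
    """Return up to num_variants distinct random orderings of the entities."""
    rng = random.Random(seed)
    variants = set()
    attempts = 0
    # Cap the attempts so short lists with few permutations cannot loop forever.
    while len(variants) < num_variants and attempts < 20 * num_variants:
        shuffled = list(entities)
        rng.shuffle(shuffled)
        variants.add(tuple(shuffled))
        attempts += 1
    return [list(v) for v in sorted(variants)]


def augment(training_pairs, num_variants=2, seed=0):
    """Keep every original (speech, entities) pair and add reordered copies."""
    augmented = []
    for i, (speech_id, entities) in enumerate(training_pairs):
        augmented.append((speech_id, list(entities)))
        for variant in perturb_entity_order(entities, num_variants, seed + i):
            # Skip a shuffle that happens to reproduce the original order.
            if variant != list(entities):
                augmented.append((speech_id, variant))
    return augmented
```

Each added pair points at the same speech as its source pair, so only the target-side entity order varies.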

3.

Published invention application
END-TO-END INTEGRATION OF DIALOG HISTORY FOR SPOKEN LANGUAGE UNDERSTANDING (status: under examination, published)

Publication No.: US20230298596A1

Publication date: 2023-09-21

Application No.: US17655441

Filing date: 2022-03-18

Applicants: International Business Machines Corporation, The Ohio State University

Inventors: Samuel Thomas, Vishal Sunder, Hong-Kwang Kuo, Jatin Ganhotra, Brian E. D. Kingsbury, Eric Fosler-Lussier

IPC classification: G10L19/00, G10L15/00, G06F40/126, G06N3/04

CPC classification: G10L19/00, G10L15/00, G06F40/126, G06N3/0454

Abstract: Systems, computer-implemented methods, and computer program products to facilitate end-to-end integration of dialogue history for spoken language understanding are provided. According to an embodiment, a system can comprise a processor that executes components stored in memory. The computer executable components comprise a conversation component that encodes speech-based content of an utterance and text-based content of the utterance into a uniform representation.

4.

Invention application
INTEGRATING DIALOG HISTORY INTO END-TO-END SPOKEN LANGUAGE UNDERSTANDING SYSTEMS (status: in force)

Publication No.: US20230056680A1

Publication date: 2023-02-23

Application No.: US17405532

Filing date: 2021-08-18

Applicant: International Business Machines Corporation

Inventors: Samuel Thomas, Jatin Ganhotra, Hong-Kwang Kuo, Sachindra Joshi, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury

IPC classification: G10L15/16, G10L15/18, G10L15/183, G10L15/065

Abstract: Audio signals representing a current utterance in a conversation and a dialog history including at least information associated with past utterances corresponding to the current utterance in the conversation can be received. The dialog history can be encoded into an embedding. A spoken language understanding neural network model can be trained to perform a spoken language understanding task based on input features including at least speech features associated with the received audio signals and the embedding. An encoder can also be trained to encode a given dialog history into an embedding. The spoken language understanding task can include predicting a dialog action of an utterance. The spoken language understanding task can include predicting a dialog intent or overall topic of the conversation.
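
Illustrative sketch (not taken from the patent): one simple way to picture "input features including at least speech features ... and the embedding" is to append a fixed-size dialog-history vector to every speech frame. The toy hashing encoder below stands in for the trained encoder the abstract mentions; `encode_history`, `build_inputs`, and the 4-dimensional embedding size are assumptions made only for this example.

```python
def encode_history(history, dim=4):
    """Toy history encoder: hash tokens into a fixed-size vector and average."""
    vec = [0.0] * dim
    tokens = [tok for utt in history for tok in utt.lower().split()]
    for tok in tokens:
        # Summing character codes keeps the toy hash deterministic across runs.
        vec[sum(ord(c) for c in tok) % dim] += 1.0
    n = max(len(tokens), 1)
    return [v / n for v in vec]


def build_inputs(speech_frames, history, dim=4):
    """Append the same dialog-history embedding to every speech feature frame."""
    emb = encode_history(history, dim)
    return [list(frame) + emb for frame in speech_frames]
```

A model consuming these rows would see the per-frame acoustics plus one shared summary of the past utterances; how the real system combines the two is not specified in the abstract.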

5.

Invention application
METHOD AND SYSTEM FOR EFFICIENT SPOKEN TERM DETECTION USING CONFUSION NETWORKS (status: in force)

Publication No.: US20150279358A1

Publication date: 2015-10-01

Application No.: US14230790

Filing date: 2014-03-31

Applicant: International Business Machines Corporation

Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau

IPC classification: G10L15/08

CPC classification: G10L15/083, G10L13/08, G10L15/02, G10L2015/025, G10L2015/085

Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
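
Illustrative sketch (not taken from the patent): a confusion network can be viewed as a sequence of bins, each holding competing words with posterior probabilities. Under that simplified view, a word-level keyword index and lookup might be written as below. The dictionary-per-bin layout and the product-of-posteriors score are assumptions made only for this example, and the conversion of phone-level OOV queries to words is not shown.

```python
from collections import defaultdict


def build_index(confusion_network):
    """Map each word to the (bin position, posterior) pairs where it appears."""
    index = defaultdict(list)
    for pos, word_bin in enumerate(confusion_network):
        for word, posterior in word_bin.items():
            index[word].append((pos, posterior))
    return index


def search(index, query_words):
    """Return (start bin, score) hits where the query fills consecutive bins."""
    if not query_words:
        return []
    hits = []
    for start, score in index.get(query_words[0], []):
        for offset, word in enumerate(query_words[1:], start=1):
            posterior = dict(index.get(word, [])).get(start + offset)
            if posterior is None:
                break  # the next query word is missing from the next bin
            score *= posterior
        else:
            hits.append((start, score))
    return hits
```

Once an OOV query has been mapped to in-vocabulary words, the same `search` call serves both query types, which is the point the abstract makes about sharing one index.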


6.

Published invention application
MULTI-SPEAKER DATA AUGMENTATION FOR IMPROVED END-TO-END AUTOMATIC SPEECH RECOGNITION (status: under examination, published)

Publication No.: US20240331684A1

Publication date: 2024-10-03

Application No.: US18129328

Filing date: 2023-03-31

Applicant: International Business Machines Corporation

Inventors: Samuel Thomas, Hong-Kwang Kuo, George Andrei Saon, Brian E. D. Kingsbury

IPC classification: G10L15/16, G10L15/02, G10L15/06

CPC classification: G10L15/16, G10L15/02, G10L15/063

Abstract: Features of two or more single speaker utterances are concatenated together and corresponding labels of the two or more single speaker utterances are concatenated together. Single speaker acoustic embeddings for each of the single speaker utterances of the concatenated single speaker utterances are generated using a single speaker teacher encoder network. An enhanced model is trained on the concatenated single speaker utterances using a classification loss L_CLASS and a representation similarity loss L_REP, the representation similarity loss L_REP defined to influence an embedding derived from the concatenated single speaker utterances, the influence being based on the single speaker acoustic embeddings derived from the single speaker teacher encoder network.
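
Illustrative sketch (not taken from the patent): the first step in the abstract above, joining the features and labels of several single speaker utterances, could be pictured as follows. The `(frames, labels)` tuple layout and the returned `boundaries` list are assumptions made only for this example; the teacher encoder and the two losses are not shown.

```python
def concatenate_utterances(utterances):
    """Join (frames, labels) pairs into one multi-speaker training example."""
    features, labels, boundaries = [], [], []
    for frames, utt_labels in utterances:
        features.extend(frames)
        labels.extend(utt_labels)
        # Record the frame index where this utterance ends.
        boundaries.append(len(features))
    return features, labels, boundaries
```

Keeping the end-of-utterance frame indices would let a training loop line up each segment with the embedding produced for the matching single speaker utterance.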

7.

Granted invention patent
Multilingual intent recognition (status: in force)

Publication No.: US11900922B2

Publication date: 2024-02-13

Application No.: US17093673

Filing date: 2020-11-10

Applicant: International Business Machines Corporation

Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny

IPC classification: G10L15/16, G10L15/08, G06F40/295, G06N3/04, G06F18/214

CPC classification: G10L15/16, G06F18/2148, G06N3/04, G06F40/295, G10L2015/088

Abstract: Embodiments of the present invention provide computer implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from a limited amount of speech to text training data in a single language. Embodiments of the present invention can locate speech to text training data in one or more other languages using the accessed one or more intents and associated entities to locate speech to text training data in the one or more other languages different than the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech to text training data in the single language and the located speech to text training data in the one or more other languages.

8.

Invention application
END TO END SPOKEN LANGUAGE UNDERSTANDING MODEL (status: in force)

Publication No.: US20220319494A1

Publication date: 2022-10-06

Application No.: US17218618

Filing date: 2021-03-31

Applicant: International Business Machines Corporation

Inventors: Samuel Thomas, Hong-Kwang Kuo, George Andrei Saon, Zoltan Tueske, Brian E. D. Kingsbury

IPC classification: G10L15/06, G06K9/62, G10L13/02

Abstract: An approach to training an end-to-end spoken language understanding model may be provided. A pre-trained general automatic speech recognition model may be adapted to a domain specific spoken language understanding model. The pre-trained general automatic speech recognition model may be a recurrent neural network transducer model. The adaptation may provide transcription data annotated with spoken language understanding labels. For adaptation, audio data may also be provided in addition to verbatim transcripts annotated with spoken language understanding labels. The spoken language understanding labels may be entity and/or intent based with values associated with each label.

9.

Invention application
MULTILINGUAL INTENT RECOGNITION (status: in force)

Publication No.: US20220148581A1

Publication date: 2022-05-12

Application No.: US17093673

Filing date: 2020-11-10

Applicant: International Business Machines Corporation

Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny

IPC classification: G10L15/16, G06N3/04, G06K9/62

Abstract: Embodiments of the present invention provide computer implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from a limited amount of speech to text training data in a single language. Embodiments of the present invention can locate speech to text training data in one or more other languages using the accessed one or more intents and associated entities to locate speech to text training data in the one or more other languages different than the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech to text training data in the single language and the located speech to text training data in the one or more other languages.

10.

Granted invention patent
Method and system for efficient spoken term detection using confusion networks (status: in force)

Publication No.: US09734823B2

Publication date: 2017-08-15

Application No.: US14837876

Filing date: 2015-08-27

Applicant: International Business Machines Corporation

Inventors: Brian E. D. Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Hagen Soltau

IPC classification: G10L15/05, G10L15/08, G10L15/02, G10L13/08

CPC classification: G10L15/083, G10L13/08, G10L15/02, G10L2015/025, G10L2015/085

Abstract: Systems and methods for spoken term detection are provided. A method for spoken term detection comprises receiving phone level out-of-vocabulary (OOV) keyword queries, converting the phone level OOV keyword queries to words, generating a confusion network (CN) based keyword searching (KWS) index, and using the CN based KWS index for both in-vocabulary (IV) keyword queries and the OOV keyword queries.
