Patent search ap:("Electronics AND Telecommunications Research Institute") AND inv:"Hoon CHUNG" Page 1

1.

发明申请
APPARATUS AND METHOD FOR SELF-SUPERVISED TRAINING OF END-TO-END SPEECH RECOGNITION MODEL 有权

公开(公告)号：US20230134942A1

公开(公告)日：2023-05-04

申请号：US17961830

申请日：2022-10-07

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor： Hoon CHUNG , Byung-Ok KANG , Jeom-Ja KANG , Yun-Kyung LEE , Hyung-Bae JEON

IPC: G10L15/06 , G10L15/187

Abstract: Disclosed herein are an apparatus and method for self-supervised training of an end-to-end speech recognition model. The apparatus includes memory in which at least one program is recorded and a processor for executing the program. The program trains an end-to-end speech recognition model, including an encoder and a decoder, using untranscribed speech data. The program may add predetermined noise to the input signal of the end-to-end speech recognition model, and may calculate loss by reflecting a predetermined constraint based on the output of the encoder of the end-to-end speech recognition model.

2.

发明申请
APPARATUS AND METHOD FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION 有权
Title translation: 大容量连续语音识别的装置和方法

公开(公告)号：US20160240190A1

公开(公告)日：2016-08-18

申请号：US15042309

申请日：2016-02-12

Applicant: Electronics and Telecommunications Research Institute

Inventor： Sung Joo LEE , Byung Ok KANG , Jeon Gue PARK , Yun Keun LEE , Hoon CHUNG

IPC: G10L15/14 , G10L15/16 , G10L15/20

CPC classification number: G10L15/142 , G10L15/063 , G10L15/16 , G10L21/02

Abstract: Provided is an apparatus for large vocabulary continuous speech recognition (LVCSR) based on a context-dependent deep neural network hidden Markov model (CD-DNN-HMM) algorithm. The apparatus may include an extractor configured to extract acoustic model-state level information corresponding to an input speech signal from a training data model set using at least one of a first feature vector based on a gammatone filterbank signal analysis algorithm and a second feature vector based on a bottleneck algorithm, and a speech recognizer configured to provide a result of recognizing the input speech signal based on the extracted acoustic model-state level information.

Abstract translation: 提供了一种基于上下文相关深度神经网络隐马尔可夫模型（CD-DNN-HMM）算法的大词汇连续语音识别（LVCSR）装置。该装置可以包括提取器，其被配置为使用基于伽马一滤波器组信号分析算法和基于第二特征向量的第一特征向量中的至少一个从训练数据模型集中提取与输入语音信号相对应的声学模型状态级别信息以及语音识别器，被配置为基于所提取的声学模型状态级别信息来提供识别输入语音信号的结果。

3.

发明申请
METHOD AND SYSTEM FOR GRADING FOREIGN LANGUAGE FLUENCY ON THE BASIS OF END-TO-END TECHNIQUE 审中-公开

公开(公告)号：US20180268739A1

公开(公告)日：2018-09-20

申请号：US15709686

申请日：2017-09-20

Applicant: Electronics and Telecommunications Research Institute

Inventor： Hoon CHUNG , Jeon Gue PARK , Yoo Rhee OH , Yun Kyung LEE , Yun Keun LEE

IPC: G09B19/06 , G10L15/22 , G10L15/02 , G10L15/16 , G09B5/00

CPC classification number: G09B19/06 , G06N3/0454 , G06N3/084 , G09B5/00 , G09B7/00 , G10L15/02 , G10L15/16 , G10L15/22 , G10L25/30 , G10L25/60

Abstract: Provided are end-to-end method and system for grading foreign language fluency, in which a multi-step intermediate process of grading foreign language fluency in the related art is omitted. The method provides an end-to-end foreign language fluency grading method of grading a foreign language fluency of a non-native speaker from a non-native raw speech signal, and includes inputting the raw speech to a convolution neural network (CNN), training a filter coefficient of the CNN based on a fluency grading score calculated by a human rater for the raw signal so as to generate a foreign language fluency grading model, and grading foreign language fluency for a non-native speech signal newly input to the trained CNN by using the foreign language fluency grading model to output a grading result.

4.

发明申请
METHOD OF AUTOMATICALLY CLASSIFYING SPEAKING RATE AND SPEECH RECOGNITION SYSTEM USING THE SAME 审中-公开

公开(公告)号：US20180166071A1

公开(公告)日：2018-06-14

申请号：US15607880

申请日：2017-05-30

Applicant: Electronics and Telecommunications Research Institute

Inventor： Sung Joo LEE , Jeon Gue PARK , Yun Keun LEE , Hoon CHUNG

IPC: G10L15/08 , G10L15/02 , G10L17/00

CPC classification number: G10L15/07 , G10L15/12 , G10L25/48

Abstract: Provided are a method of automatically classifying a speaking rate and a speech recognition system using the method. The speech recognition system using automatic speaking rate classification includes a speech recognizer configured to extract word lattice information by performing speech recognition on an input speech signal, a speaking rate estimator configured to estimate word-specific speaking rates using the word lattice information, a speaking rate normalizer configured to normalize a word-specific speaking rate into a normal speaking rate when the word-specific speaking rate deviates from a preset range, and a rescoring section configured to rescore the speech signal whose speaking rate has been normalized.

5.

发明公开
APPARATUS AND METHOD FOR AUDIO-VIDEO SAMPLING FREQUENCY RATIO UNIFICATION 审中-公开

公开(公告)号：US20240282345A1

公开(公告)日：2024-08-22

申请号：US18413991

申请日：2024-01-16

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor： Yoonhyung KIM , Byung Ok KANG , Hoon CHUNG

IPC: G11B27/10 , G06V10/20 , G06V20/40 , G10L25/30

CPC classification number: G11B27/10 , G06V10/20 , G06V20/40 , G10L25/30

Abstract: Disclosed herein are an apparatus and method for audio-video sampling frequency ratio unification, including memory configured to store at least one program, and a processor configured to execute the program, wherein the program is configured to perform receiving an audio signal and a video signal, adjusting a ratio of a sampling frequency of the audio signal to a sampling frequency of the video signal so that the sampling frequency ratio is constant based on a deep learning network, and outputting an adjusted audio signal and the video signal.

6.

发明公开
SELF-SUPERVISED LEARNING METHOD BASED ON PERMUTATION INVARIANT CROSS ENTROPY AND ELECTRONIC DEVICE THEREOF 审中-公开

公开(公告)号：US20240105166A1

公开(公告)日：2024-03-28

申请号：US18350111

申请日：2023-07-11

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor： Hoon CHUNG , Byung Ok KANG , Yoonhyung KIM

IPC: G10L15/16 , G10L15/06 , G10L15/065

CPC classification number: G10L15/16 , G10L15/063 , G10L15/065

Abstract: Provided is a self-supervised learning method based on permutation invariant cross entropy. A self-supervised learning method based on permutation invariant cross entropy performed by an electronic device includes: defining a cross entropy loss function for pre-training of an end-to-end speech recognition model; configuring non-transcription speech corpus data composed only of speech as input data of the cross entropy loss function; setting all permutations of classes included in the non-transcription speech corpus data as an output target and calculating cross entropy losses for each class; and determining a minimum cross entropy loss among the calculated cross entropy losses for each class as a final loss.

7.

发明申请
APPARATUS AND METHOD FOR VERIFYING UTTERANCE IN SPEECH RECOGNITION SYSTEM 有权

公开(公告)号：US20170200458A1

公开(公告)日：2017-07-13

申请号：US15186286

申请日：2016-06-17

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor： Jeom Ja KANG , Hwa Jeon SONG , Jeon Gue PARK , Hoon CHUNG

IPC: G10L25/87 , G10L15/02 , G10L15/18 , G10L15/197

CPC classification number: G10L25/87 , G10L15/02 , G10L15/1815 , G10L15/197 , G10L15/22 , G10L25/48

Abstract: An apparatus and method for verifying an utterance based on multi-event detection information in a natural language speech recognition system. The apparatus includes a noise processor configured to process noise of an input speech signal, a feature extractor configured to extract features of speech data obtained through the noise processing, an event detector configured to detect events of the plurality of speech features occurring in the speech data using the noise-processed data and data of the extracted features, a decoder configured to perform speech recognition using a plurality of preset speech recognition models for the extracted feature data, and an utterance verifier configured to calculate confidence measurement values in units of words and sentences using information on the plurality of events detected by the event detector and a preset utterance verification model and perform utterance verification according to the calculated confidence measurement values.

8.

发明申请
SIGNAL PROCESSING ALGORITHM-INTEGRATED DEEP NEURAL NETWORK-BASED SPEECH RECOGNITION APPARATUS AND LEARNING METHOD THEREOF 审中-公开
Title translation: 信号处理算法综合深度基于神经网络的语音识别装置及其学习方法

公开(公告)号：US20160078863A1

公开(公告)日：2016-03-17

申请号：US14737907

申请日：2015-06-12

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor： Hoon CHUNG , Jeon Gue PARK , Sung Joo LEE , Yun Keun LEE

IPC: G10L15/16

CPC classification number: G10L15/16

Abstract: Provided are a signal processing algorithm-integrated deep neural network (DNN)-based speech recognition apparatus and a learning method thereof. A model parameter learning method in a deep neural network (DNN)-based speech recognition apparatus implementable by a computer includes converting a signal processing algorithm for extracting a feature parameter from a speech input signal of a time domain into signal processing deep neural network (DNN), fusing the signal processing DNN and a classification DNN, and learning a model parameter in a deep learning model in which the signal processing DNN and the classification DNN are fused.

Abstract translation: 提供了一种基于信号处理算法的深度神经网络（DNN）语音识别装置及其学习方法。由计算机实现的基于深神经网络（DNN）的语音识别装置中的模型参数学习方法包括：将来自时域的语音输入信号的特征参数的信号处理算法转换为信号处理深层神经网络（DNN ），融合信号处理DNN和分类DNN，并在信号处理DNN和分类DNN融合的深度学习模型中学习模型参数。

9.

发明申请
APPARATUS AND METHOD FOR RECOGNIZING CONTINUOUS SPEECH 审中-公开
Title translation: 用于识别连续语音的装置和方法

公开(公告)号：US20150006175A1

公开(公告)日：2015-01-01

申请号：US14304104

申请日：2014-06-13

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor： Ki-Young PARK , Yun-Keun LEE , Hoon CHUNG

IPC: G10L15/18

CPC classification number: G10L15/18 , G10L15/04 , G10L15/32

Abstract: The present invention relates to an apparatus and a method for recognizing continuous speech having large vocabulary. In the present invention, large vocabulary in large vocabulary continuous speech having a lot of same kinds of vocabulary is divided to a reasonable number of clusters, then representative vocabulary for pertinent clusters is selected and first recognition is performed with the representative vocabulary, then if the representative vocabulary is recognized by use of the result of first recognition, re-recognition is performed against all words in the cluster where the recognized representative vocabulary belongs.

Abstract translation: 本发明涉及用于识别具有较大词汇量的连续语音的装置和方法。在本发明中，具有大量相同种类词汇的大词汇连续语音中的大词汇被划分为合理数量的群集，然后选择相关群集的代表性词汇，并用代表性词汇表进行首次识别，通过使用第一识别结果来识别代表性词汇，对所识别的代表词汇所属的群集中的所有单词进行重新识别。

10.

发明申请
APPARATUS AND METHOD FOR DEEP NEURAL NETWORK MODEL PARAMETER REDUCTION USING SPARSITY REGULARIZED FACTORIZED MATRIX 审中-公开

公开(公告)号：US20200184310A1

公开(公告)日：2020-06-11

申请号：US16711317

申请日：2019-12-11

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventor： Hoon CHUNG , Jeon Gue PARK , Yun Keun LEE

IPC: G06N3/04 , G06N3/08

Abstract: Provided is an apparatus and method for reducing the number of deep neural network model parameters, the apparatus including a memory in which a program for DNN model parameter reduction is stored, and a processor configured to execute the program, wherein the processor represents hidden layers of the model of the DNN using a full-rank decomposed matrix, uses training that is employed with a sparsity constraint for converting a diagonal matrix value to zero, and determines a rank of each of the hidden layers of the model of the DNN according to a degree of the sparsity constraint.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification