-
Publication No.: US20170278509A1
Publication date: 2017-09-28
Application No.: US15621778
Filing date: 2017-06-13
Inventors: Takashi Fukuda, Osamu Ichikawa, Futoshi Iwama
IPC classification: G10L15/01
CPC classification: G10L15/01, G10L13/08, G10L15/193
Abstract: A method is provided for testing words defined in a pronunciation lexicon used in an automatic speech recognition (ASR) system. The method includes obtaining test sentences that can be accepted by a language model used in the ASR system, where the test sentences cover the words defined in the pronunciation lexicon. The method further includes obtaining variations of speech data corresponding to each test sentence, and obtaining a plurality of texts, either by recognizing the variations of speech data or as texts generated by recognizing the variations of speech data. The method also includes constructing a word graph for each test sentence from the plurality of texts, where each word in the word graph corresponds to a word defined in the pronunciation lexicon, and determining whether all or some of the words in a test sentence are present in a path of the word graph derived from that test sentence.
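The checking step described in this abstract can be illustrated with a minimal sketch. It assumes a much simpler word graph than a real ASR lattice: a set of word-to-word transitions merged from the recognized texts. The function names, the `<s>` and `</s>` boundary tokens, and the sample sentences are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def build_word_graph(texts):
    """Merge recognized texts into a directed graph of word-to-word transitions."""
    graph = defaultdict(set)
    for text in texts:
        # Hypothetical boundary tokens mark the start and end of each text.
        words = ["<s>"] + text.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            graph[prev].add(nxt)
    return graph


def words_missing_from_graph(test_sentence, graph):
    """Return the test-sentence words that never appear as a node in the graph."""
    nodes = set(graph) | {w for successors in graph.values() for w in successors}
    return [w for w in test_sentence.lower().split() if w not in nodes]


def sentence_is_path(test_sentence, graph):
    """Check whether the whole test sentence can be traced as a path in the graph."""
    words = ["<s>"] + test_sentence.lower().split() + ["</s>"]
    return all(nxt in graph.get(prev, ()) for prev, nxt in zip(words, words[1:]))


# Recognition results for several spoken variations of "call doctor smith".
graph_ok = build_word_graph(["call doctor smyth", "all doctor smith"])
# Recognition results in which the word "jazz" is never produced.
graph_bad = build_word_graph(["play chas", "play jas"])

print(words_missing_from_graph("call doctor smith", graph_ok))
print(sentence_is_path("call doctor smith", graph_ok))
print(words_missing_from_graph("play jazz", graph_bad))
```

In this sketch the first test sentence is found as a path even though no single recognized text matches it exactly, because the merged graph combines transitions from different texts. The second sentence fails, which would flag "jazz" as a lexicon entry worth inspecting.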
-
Publication No.: US20170243113A1
Publication date: 2017-08-24
Application No.: US15052431
Filing date: 2016-02-24
Inventors: Takashi Fukuda, Osamu Ichikawa
CPC classification: G06N3/0472, G06N3/0454
Abstract: A method, performed by a computing device, for training a neural network having a plurality of filters for extracting local features is disclosed. The computing device calculates a plurality of projection parameter sets by analyzing one or more items of training data. The projection parameter sets define a projection of each item of training data into a new space, and each projection parameter set has the same size as the filters in the neural network. At least some of the projection parameter sets are set as initial parameters of at least some of the filters in the neural network for training.
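The abstract does not name the analysis that produces the projection parameter sets. As one possible reading, the sketch below assumes principal component analysis (PCA) over filter-sized patches, computed with power iteration and deflation in plain Python. All function names and the toy patches are hypothetical.

```python
import math
import random


def covariance(patches):
    """Sample covariance matrix of equally sized, flattened patches."""
    n, d = len(patches), len(patches[0])
    mean = [sum(p[j] for p in patches) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in patches]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: dominant unit eigenvector and eigenvalue of a symmetric matrix."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


def pca_filters(patches, num_filters):
    """Return the leading principal directions, each the same size as a patch."""
    cov = covariance(patches)
    d = len(cov)
    filters = []
    for _ in range(num_filters):
        v, lam = top_eigenvector(cov)
        filters.append(v)
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return filters


# Toy training patches of length 3 (assumed to match a filter of width 3).
patches = [[a + b, a, a - b] for a in (-2, -1, 0, 1, 2) for b in (-1, 0, 1)]
filters = pca_filters(patches, num_filters=2)
print(filters)
```

Each returned vector has the same length as a patch, so it could be copied into a filter of that size as an initial value before training. The sign of each vector is arbitrary.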
-
Publication No.: US09640197B1
Publication date: 2017-05-02
Application No.: US15077523
Filing date: 2016-03-22
Inventors: Takashi Fukuda, Osamu Ichikawa
IPC classification: G10L21/00, G10L21/028, G10L21/0264, G10L25/21, G10L21/0216
CPC classification: G10L21/028, G10L15/14, G10L2021/02166
Abstract: Methods and systems are provided for separating a target speech from a plurality of other speeches having different directions of arrival. One of the methods includes: obtaining speech signals from speech input devices spaced apart from one another at predetermined distances; calculating, for each of at least one pair of speech input devices, a direction of arrival of the target speeches and directions of arrival of speeches other than the target speeches; calculating an aliasing metric that indicates which frequency band of speeches is susceptible to spatial aliasing; enhancing the speech signals arriving from the direction of arrival of the target speech signals, based on the speech signals and the direction of arrival of the target speeches, to generate enhanced speech signals; reading a probability model; and inputting the enhanced speech signals and the aliasing metric to the probability model to output the target speeches.
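The abstract does not define the aliasing metric. As a rough illustration only, the sketch below assumes a binary per-band flag based on the standard two-microphone limit: for spacing d and speed of sound c, spatial aliasing can occur above c / (2d). The function names, the 343 m/s constant, and the sample band centers are assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def aliasing_threshold_hz(mic_spacing_m, speed_of_sound=SPEED_OF_SOUND):
    """Lowest frequency at which a two-microphone pair can alias spatially."""
    return speed_of_sound / (2.0 * mic_spacing_m)


def aliasing_metric(band_centers_hz, mic_spacing_m):
    """Flag each band: 1 if it lies above the aliasing threshold, else 0."""
    threshold = aliasing_threshold_hz(mic_spacing_m)
    return [1 if f > threshold else 0 for f in band_centers_hz]


# Hypothetical example: microphones 5 cm apart, five band centers in Hz.
bands = [500.0, 1000.0, 2000.0, 4000.0, 8000.0]
print(aliasing_threshold_hz(0.05))
print(aliasing_metric(bands, 0.05))
```

With 5 cm spacing the threshold is about 3430 Hz, so only the 4 kHz and 8 kHz bands are flagged. A flag vector like this is one form of side information that could be passed to a model together with the enhanced signal.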
-
Publication No.: US20150112669A1
Publication date: 2015-04-23
Application No.: US14060972
Filing date: 2013-10-23
Inventors: Takashi Fukuda, Vaibhava Goel, Steven J. Rennie
IPC classification: G10L15/065
CPC classification: G10L15/065, G10L15/02
Abstract: A method and apparatus are provided for training a transformation matrix of a feature vector for an acoustic model. The method includes training the transformation matrix of the feature vector, where the transformation matrix maximizes an objective function having a regularization term. The method further includes transforming the feature vector using the transformation matrix, and updating the acoustic model stored in a memory device using the transformed feature vector.
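The training step in this abstract can be written schematically as follows. The abstract gives neither the objective nor the regularizer, so every symbol here is an assumption: A is the transformation matrix, x_t the feature vector at frame t, F a generic training objective, R a generic regularization term, and \lambda \ge 0 its weight. The minus sign reflects the assumed convention that the regularizer acts as a penalty.

```latex
\hat{A} = \arg\max_{A} \Big[ F\big(A;\, \{x_t\}\big) - \lambda\, R(A) \Big],
\qquad
y_t = \hat{A}\, x_t
```

Here y_t denotes the transformed feature vector that would then be used to update the acoustic model.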