-
Publication No.: US20180075839A1
Publication Date: 2018-03-15
Application No.: US15440497
Filing Date: 2017-02-23
IPC Classification: G10L15/01, G06F3/16, G06F3/0484
CPC Classification: G10L15/01, G06F3/04842, G06F3/167, G10L15/22
Abstract: A correction system of the embodiment includes an interface system, a calculator, a generator, and a display controller. The interface system receives correction information for correcting a voice recognition result. The calculator estimates a part of the voice recognition result to be corrected and calculates a degree of association between the part to be corrected and the correction information. The generator generates corrected display information comprising at least one of the correction information and the part to be corrected using a display format corresponding to the degree of association. The display controller outputs the corrected display information on a display.
-
Publication No.: US20210280168A1
Publication Date: 2021-09-09
Application No.: US17012372
Filing Date: 2020-09-04
Inventors: Taira ASHIKAWA, Hiroshi FUJIMURA, Kenji IWATA
Abstract: According to one embodiment, a speech recognition error correction apparatus includes a correction network memory and an error correction circuitry. The error correction circuitry calculates a difference between a speech recognition result string of an error correction target, which is a result of performing speech recognition on a new series of speech data, and a correction network, where a speech recognition result string and a correction result by a user for the speech recognition result string are associated, and, when a value indicating the difference is equal to or less than a threshold, performs error correction on a speech recognition error portion in the speech recognition result string of the error correction target by using the correction network to generate a speech recognition error correction result string.
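The thresholded correction step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the "correction network" is replaced here by a plain dictionary mapping past recognition results to user corrections, the "difference" is assumed to be Levenshtein edit distance, and the whole string is replaced rather than only the erroneous portion. The names `edit_distance`, `correct`, and `corrections` are hypothetical.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct(result: str, corrections: Dict[str, str], threshold: int) -> str:
    """Apply a stored user correction if the closest past result is near enough.

    Assumption: `corrections` stands in for the correction network and maps
    a past recognition result string to the user's corrected string.
    """
    best_key: Optional[str] = None
    best_dist: Optional[int] = None
    for past in corrections:
        d = edit_distance(result, past)
        if best_dist is None or d < best_dist:
            best_key, best_dist = past, d
    # Correct only when the difference is equal to or less than the threshold.
    if best_key is not None and best_dist is not None and best_dist <= threshold:
        return corrections[best_key]
    return result


corrections = {"wreck a nice beach": "recognize speech"}
near = correct("wreck a nice beech", corrections, threshold=2)
far = correct("hello world", corrections, threshold=2)
```

In this sketch a new result that differs from a stored one by a single character is replaced with the stored correction, while an unrelated result is returned unchanged.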
-
13.
Publication No.: US20200066260A1
Publication Date: 2020-02-27
Application No.: US16296282
Filing Date: 2019-03-08
Abstract: According to one embodiment, a signal generation device includes one or more processors. The processors convert an acoustic signal and output amplitude and phase at a plurality of frequencies. The processors, for each of a plurality of nodes of a hidden layer included in a neural network that treats the amplitude and the phase as input, obtain a frequency based on a plurality of weights used in arithmetic operation of the node. The processors generate an acoustic signal based on the plurality of obtained frequencies and based on amplitude and phase corresponding to each of the plurality of nodes.
-
14.
Publication No.: US20190392839A1
Publication Date: 2019-12-26
Application No.: US16296410
Filing Date: 2019-03-08
Inventors: Hiroshi FUJIMURA
Abstract: According to one embodiment, a system for creating a speaker model includes one or more processors. The processors change a part of network parameters from an input layer to a predetermined intermediate layer based on a plurality of patterns and input a piece of speech into each of the neural networks so as to obtain a plurality of outputs from the intermediate layer. The part of network parameters of each of the neural networks is changed based on one of the plurality of patterns. The processors create a speaker model with respect to one or more words detected from the speech based on the outputs.
-
Publication No.: US20190266997A1
Publication Date: 2019-08-29
Application No.: US16130538
Filing Date: 2018-09-13
Inventors: Hiroshi FUJIMURA
IPC Classification: G10L15/14, G10L15/05, G10L15/02, G10L15/187
Abstract: According to one embodiment, a word detection system acquires speech data including a plurality of frames, generates the speech characteristic amount, calculates a frame score by matching a reference model based on the speech characteristic amount associated with a target word with the frames in the speech data, calculates a first score of the word from the frame score, detects the word from the speech data based on the first score, calculates a second score of the word based on time information on the start and the end of the detected word and the frame score, compares the value of the second score with the second scores of a plurality of words, and determines a word to be output based on the comparison result.
-
Publication No.: US20180277106A1
Publication Date: 2018-09-27
Application No.: US15688591
Filing Date: 2017-08-28
Inventors: Takami YOSHIDA, Kenji IWATA, Hiroshi FUJIMURA
CPC Classification: G10L15/1822, G06F16/685, G06Q30/06, G10L15/01, G10L2015/223, H04M3/493, H04M2203/355
Abstract: According to an embodiment, a verification system includes a storage controller, first and second receivers, a comparator, a response constructor, a response generator, and an output controller. The storage controller stores, in a storage, first response data and first situation data associated with the first response data. The first receiver receives second response data. The comparator determines a similarity between second situation data indicating a second context for using the second response data and the first situation data. The response constructor constructs response content information comprising the second response data and the first response data associated with the first situation data having the similarity equal to or greater than a threshold. The second receiver receives speech data. The response generator generates a response sentence corresponding to the speech data using the response content information. The output controller outputs for display one or more response sentences.
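The comparator and response constructor described in this abstract can be sketched roughly as follows. The abstract does not say how similarity is computed; this sketch assumes situation data is a set of tags and uses Jaccard similarity as a stand-in. The names `jaccard`, `build_response_content`, and the sample data are hypothetical.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two tag sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def build_response_content(
    stored: List[Tuple[str, Set[str]]],
    new_response: str,
    new_situation: Set[str],
    threshold: float,
) -> List[str]:
    """Combine the new response with stored responses from similar situations.

    `stored` holds (first response data, first situation data) pairs.
    A stored response is included only when its situation's similarity to
    `new_situation` is equal to or greater than `threshold`.
    """
    content = [new_response]
    for response, situation in stored:
        if jaccard(situation, new_situation) >= threshold:
            content.append(response)
    return content


stored = [
    ("Your order has shipped.", {"order", "shipping"}),
    ("Store hours are 9 to 5.", {"hours"}),
]
content = build_response_content(
    stored,
    new_response="Your order is delayed.",
    new_situation={"order", "shipping", "delay"},
    threshold=0.5,
)
```

Here the first stored situation shares two of three tags with the new one (similarity 2/3) and is kept, while the second shares none and is dropped.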
-
17.
Publication No.: US20180137863A1
Publication Date: 2018-05-17
Application No.: US15686410
Filing Date: 2017-08-25
Inventors: Manabu NAGAO, Hiroshi FUJIMURA
Abstract: According to an embodiment, a speech recognition apparatus includes a calculation unit that calculates, based on a speech signal, a score vector sequence including score vectors including an acoustic score for each of input symbols, a search unit that generates an input symbol string by searching for a path of the input symbol tracing the acoustic score having a high likelihood in the score vector sequence and that generates an output symbol representing a recognition result of the speech signal based on a recognition target symbol representing linguistic information as a recognition target among the input symbols, an additional symbol acquisition unit that obtains an additional symbol representing paralinguistic information and/or non-linguistic information from among the input symbols included in a range corresponding to the output symbol, and an output unit that outputs the output symbol and the obtained additional symbol in association with each other.
-
18.
Publication No.: US20210065684A1
Publication Date: 2021-03-04
Application No.: US16804388
Filing Date: 2020-02-28
Inventors: Ning DING, Hiroshi FUJIMURA
IPC Classification: G10L15/07, G10L15/02, G10L15/06, G10L15/187, G10L15/22
Abstract: According to one embodiment, an information processing apparatus includes the following units. The acquisition unit acquires first training data including a combination of a voice feature quantity and a correct phoneme label of the voice feature quantity. The training unit trains an acoustic model using the first training data in a manner to output the correct phoneme label in response to input of the voice feature quantity. The extraction unit extracts, from the first training data, second training data including voice feature quantities of at least one of a keyword, a sub-word, a syllable, or a phoneme included in the keyword. The adaptation processing unit adapts the trained acoustic model using the second training data to a keyword detection model.
-
19.
Publication No.: US20180279010A1
Publication Date: 2018-09-27
Application No.: US15683543
Filing Date: 2017-08-22
Inventors: Nayuko WATANABE, Kosei FUME, Hiroshi FUJIMURA
IPC Classification: H04N21/488
CPC Classification: H04N21/4884, G06F17/24, G10L15/26, G10L21/10
Abstract: According to an embodiment, an information processing apparatus includes one or more processors. The one or more processors are configured to acquire target sentence data including a plurality of morphemes obtained by speech recognition and speech generation time of each morpheme from the plurality of morphemes; and assign display time according to a difference between a confirmed sentence of which a user's correction for the target sentence data is confirmed and a second confirmed sentence of a previous speech generation time.
-
Publication No.: US20180082688A1
Publication Date: 2018-03-22
Application No.: US15440550
Filing Date: 2017-02-23
CPC Classification: G10L15/265, G10L15/08, G10L15/26, G10L25/21, G10L25/51, G10L25/78, H04M3/567, H04M2201/40
Abstract: According to an embodiment, a conference support system includes a recognizer, a classifier, a first caption controller, a second caption controller, and a display controller. The recognizer is configured to recognize text data corresponding to speech from a speech section and configured to distinguish between the speech section and a non-speech section in speech data. The classifier is configured to classify the text data into first utterance data representing a principal utterance and second utterance data representing another utterance. The first caption controller is configured to generate first caption data for displaying the first utterance data without waiting for identification of the first utterance data to finish. The second caption controller is configured to generate second caption data for displaying the second utterance data after identification of the second utterance data finishes. The display controller is configured to control a display of the first caption data and the second caption data.