Patent search ap:("NIPPON TELEGRAPH AND TELEPHONE CORPORATION") AND inv:"Takashi NAKAMURA" Page 1

1.

Invention application
KEYWORD DETECTION APPARATUS, KEYWORD DETECTION METHOD, AND PROGRAM [In force]

Publication (announcement) No.: US20220005466A1

Publication (announcement) date: 2022-01-06

Application No.: US17298368

Application date: 2019-11-19

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Takashi NAKAMURA, Tomohiro TANAKA

IPC: G10L15/18, G10L15/16

Abstract: A keyword is extracted robustly despite a voice recognition result including an error. A model storage unit 10 stores a keyword extraction model that accepts word vector representations of a plurality of words as an input and extracts and outputs a word vector representation of a word to be extracted as a keyword. A speech detection unit 11 detects a speech part from a voice signal. A voice recognition unit 12 executes voice recognition on the speech part of the voice signal and outputs a confusion network which is a voice recognition result. A word vector representation generating unit 13 generates a word vector representation including reliability of voice recognition with regard to each candidate word for each confusion set. A keyword extraction unit 14 inputs the word vector representation of the candidate word to the keyword extraction model in descending order of the reliability and obtains the word vector representation of the keyword.
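The ordering step in this abstract can be illustrated with a minimal sketch. It assumes a confusion network is a list of confusion sets, each holding (candidate word, reliability) pairs, and that "including reliability" means appending the score as one extra vector dimension. The `EMBEDDINGS` table and the `build_inputs` helper are hypothetical and are not taken from the patent; the keyword extraction model itself is not modelled.

```python
from typing import Dict, List, Tuple

# Hypothetical toy embeddings; a real system would use trained word vectors.
EMBEDDINGS: Dict[str, List[float]] = {
    "weather": [0.9, 0.1],
    "whether": [0.2, 0.8],
    "today": [0.5, 0.5],
    "to": [0.1, 0.2],
}

# One confusion set: competing candidate words with their recognition reliability.
ConfusionSet = List[Tuple[str, float]]


def build_inputs(confusion_network: List[ConfusionSet]) -> List[List[List[float]]]:
    """Return, per confusion set, candidate vectors in descending reliability order.

    Each vector is the word embedding with the reliability appended as a final
    dimension (an assumed encoding, chosen only for illustration).
    """
    result = []
    for confusion_set in confusion_network:
        ordered = sorted(confusion_set, key=lambda cand: cand[1], reverse=True)
        result.append([EMBEDDINGS[word] + [score] for word, score in ordered])
    return result


network = [
    [("whether", 0.3), ("weather", 0.7)],
    [("to", 0.1), ("today", 0.9)],
]
inputs = build_inputs(network)
```

The vectors in `inputs` would then be fed to the keyword extraction model in the order shown, most reliable candidate first.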

2.

Invention application
NON-VERBAL UTTERANCE DETECTION APPARATUS, NON-VERBAL UTTERANCE DETECTION METHOD, AND PROGRAM [In force]

Publication (announcement) No.: US20210272587A1

Publication (announcement) date: 2021-09-02

Application No.: US17293021

Application date: 2019-10-31

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Takashi NAKAMURA, Takaaki FUKUTOMI, Kiyoaki MATSUI

IPC: G10L25/93, G10L25/51, G10L25/30, G10L25/24, G06N3/02

Abstract: Detection precision of a non-verbal sound is improved. An acoustic model storage unit 10A stores an acoustic model that is configured by a deep neural network with a bottleneck structure, and estimates a phoneme state from a sound feature value. A non-verbal sound model storage unit 10B stores a non-verbal sound model that estimates a posterior probability of a non-verbal sound likeliness from the sound feature value and a bottleneck feature value. A sound feature value extraction unit 11 extracts a sound feature value from an input sound signal. A bottleneck feature value estimation unit 12 inputs the sound feature value to the acoustic model and obtains an output of a bottleneck layer of the acoustic model as a bottleneck feature value. A non-verbal sound detection unit 13 inputs the sound feature value and the bottleneck feature value to the non-verbal sound model and obtains the posterior probability of the non-verbal sound likeliness output by the non-verbal sound model.
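The data flow in this abstract (sound feature, then bottleneck feature, then posterior from both) can be sketched as follows. This is a toy illustration under stated assumptions: the layer sizes, the tanh activations, the logistic output model, and every weight value are hypothetical, and the function names are introduced here only for illustration.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dense_tanh(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer with tanh; one weight row per output unit."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def bottleneck_feature(x: Vector, w_hidden: Matrix, b_hidden: Vector,
                       w_bn: Matrix, b_bn: Vector) -> Vector:
    """Run the acoustic model up to its narrow bottleneck layer and return that output."""
    hidden = dense_tanh(x, w_hidden, b_hidden)
    return dense_tanh(hidden, w_bn, b_bn)


def nonverbal_posterior(x: Vector, bn: Vector, w_out: Vector, b_out: float) -> float:
    """Logistic model over the concatenated sound and bottleneck features."""
    joint = x + bn
    logit = sum(w * v for w, v in zip(w_out, joint)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy parameters: 3-dim sound feature, 4 hidden units, 2-unit bottleneck.
W_HIDDEN = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.5, 0.1, 0.1], [-0.3, 0.2, 0.0]]
B_HIDDEN = [0.0, 0.1, -0.1, 0.0]
W_BN = [[0.1, 0.2, -0.3, 0.4], [-0.2, 0.1, 0.3, 0.0]]
B_BN = [0.0, 0.0]
W_OUT = [0.3, -0.2, 0.1, 0.8, -0.5]
B_OUT = 0.05

sound_feature = [0.6, -0.4, 1.2]
bn = bottleneck_feature(sound_feature, W_HIDDEN, B_HIDDEN, W_BN, B_BN)
posterior = nonverbal_posterior(sound_feature, bn, W_OUT, B_OUT)
```

The point of the sketch is the wiring: the non-verbal sound model sees the raw sound feature together with the bottleneck output of the acoustic model, not the acoustic model's phoneme-state output.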

3.

Invention application
LEARNING DATA ACQUISITION APPARATUS, MODEL LEARNING APPARATUS, METHODS AND PROGRAMS FOR THE SAME [In force]

Publication (announcement) No.: US20220101828A1

Publication (announcement) date: 2022-03-31

Application No.: US17429737

Application date: 2020-01-29

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Takaaki FUKUTOMI, Takashi NAKAMURA, Kiyoaki MATSUI

IPC: G10L15/06, G10L25/78, G10L21/0208, G06N20/00

Abstract: A learning data acquisition device or the like, capable of acquiring learning data by superimposing noise data on clean voice data at an appropriate SN ratio, is provided. The learning data acquisition device includes a voice recognition influence degree calculation unit and a learning data acquisition unit. The voice recognition influence degree calculation unit calculates an influence degree on voice recognition accuracy caused by a change of a signal-to-noise ratio, based on a result of voice recognition on the kth noise superimposed voice data and a result of voice recognition on the (k−1)th noise superimposed voice data, where K is an integer of 2 or larger, k=2, 3, ..., K, and a signal-to-noise ratio of the kth noise superimposed voice data is smaller than a signal-to-noise ratio of the (k−1)th noise superimposed voice data, and obtains a largest signal-to-noise ratio SNRapply among signal-to-noise ratios of the (k−1)th noise superimposed voice data when the influence degree meets a given threshold condition. The learning data acquisition unit acquires noise superimposed voice data having a signal-to-noise ratio that is equal to or larger than the signal-to-noise ratio SNRapply, as learning data.
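A minimal sketch of the SNRapply selection described in this abstract follows. It assumes the influence degree is the drop in recognition accuracy between consecutive SNR levels and that the threshold condition is "drop at least `threshold`"; the abstract fixes neither, so both are illustrative choices, and `select_snr_apply` is a hypothetical helper name.

```python
from typing import List, Optional


def select_snr_apply(snrs: List[float], accuracies: List[float],
                     threshold: float) -> Optional[float]:
    """Pick SNRapply from SNR levels listed in descending order.

    snrs[k] < snrs[k-1] is expected, and accuracies[k] is the recognition
    accuracy measured on the data superimposed at snrs[k].
    """
    if len(snrs) != len(accuracies):
        raise ValueError("snrs and accuracies must have the same length")
    candidates = []
    for k in range(1, len(snrs)):
        # Assumed influence degree: accuracy lost when moving from level k-1 to k.
        influence = accuracies[k - 1] - accuracies[k]
        if influence >= threshold:
            candidates.append(snrs[k - 1])
    # The largest (k-1)th SNR at which the condition was met, if any.
    return max(candidates) if candidates else None


snrs = [20.0, 15.0, 10.0, 5.0, 0.0]
accuracies = [0.95, 0.94, 0.92, 0.80, 0.55]
snr_apply = select_snr_apply(snrs, accuracies, threshold=0.10)

# Keep only data whose SNR is at or above SNRapply as learning data.
usable = [s for s in snrs if snr_apply is not None and s >= snr_apply]
```

With these toy numbers, accuracy first drops sharply between 10 dB and 5 dB, so data mixed at 10 dB or higher is retained.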

4.

Invention application
APPROPRIATE UTTERANCE ESTIMATE MODEL LEARNING APPARATUS, APPROPRIATE UTTERANCE JUDGEMENT APPARATUS, APPROPRIATE UTTERANCE ESTIMATE MODEL LEARNING METHOD, APPROPRIATE UTTERANCE JUDGEMENT METHOD, AND PROGRAM [In force]

Publication (announcement) No.: US20210035558A1

Publication (announcement) date: 2021-02-04

Application No.: US16968126

Application date: 2019-02-07

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Takashi NAKAMURA, Takaaki FUKUTOMI

IPC: G10L15/06, G10L15/22, G10L15/16, G10L15/28, G10L15/05

Abstract: Provided is technology for assessing whether uttered speech detected from input speech is speech suited to a prescribed purpose. A method comprises detecting, from input speech including speech uttered by a speaker and noise, the uttered speech corresponding to the speech uttered by the speaker, extracting an acoustic feature of the uttered speech, generating, from the uttered speech, a speech recognition result set with a recognition score, generating, from the speech recognition result set with the recognition score, a speech recognition result word vector expression set and a speech recognition result part-of-speech vector expression set, generating a target utterance estimation model, providing, using the target utterance estimation model, a probability of the uttered speech being suited to the prescribed purpose, and outputting the uttered speech and the speech recognition result set with the recognition score, the uttered speech suitable to the prescribed purpose.

5.

Invention application
LEARNING SPEECH DATA GENERATING APPARATUS, LEARNING SPEECH DATA GENERATING METHOD, AND PROGRAM [In force]

Publication (announcement) No.: US20210005215A1

Publication (announcement) date: 2021-01-07

Application No.: US16979393

Application date: 2019-03-11

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Takaaki FUKUTOMI, Manabu OKAMOTO, Takashi NAKAMURA, Kiyoaki MATSUI

IPC: G10L21/0208, G10L21/013, G10L15/06

Abstract: A training speech data generating apparatus includes: a voice conversion unit that converts, using fourth noise data, which is noise data based on third noise data, and speech data, the speech data so as to make the speech data clearly audible under a noise environment corresponding to the fourth noise data; and a noise superimposition unit that obtains training speech data by superimposing the third noise data and the converted speech data.

6.

Invention publication
CONVERSION DEVICE, CONVERSION METHOD, AND CONVERSION PROGRAM [Under examination, published]

Publication (announcement) No.: US20240013798A1

Publication (announcement) date: 2024-01-11

Application No.: US18036598

Application date: 2020-11-13

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Kazunori YAMADA, Ko MITSUDA, Tetsuya KINEBUCHI, Yushi AONO, Hiroko YABUSHITA, Akihiko TAKASHIMA, Takashi NAKAMURA

IPC: G10L21/02, G10L25/60

CPC classification number: G10L21/02, G10L25/60

Abstract: A conversion device (10) includes: an evaluation unit (11) that estimates which one of subjective evaluation values obtained by quantifying easiness of transmission of a content of a voice felt by a person is to be taken from an input voice signal; and a conversion unit (12) that converts the input voice signal so as to obtain a subjective evaluation value of a predetermined value on the basis of the subjective evaluation value estimated by the evaluation unit (11).

7.

Invention application
MODEL LEARNING DEVICE, METHOD THEREFOR, AND PROGRAM [Under examination, published]

Publication (announcement) No.: US20190244604A1

Publication (announcement) date: 2019-08-08

Application No.: US16333156

Application date: 2017-09-05

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Hirokazu MASATAKI, Taichi ASAMI, Takashi NAKAMURA, Ryo MASUMURA

IPC: G10L15/16, G06N3/04, G10L15/06, G06F17/27

CPC classification number: G10L15/16, G06F17/2715, G06N3/0454, G06N3/049, G06N3/08, G06N99/00, G10L15/06, G10L15/063, G10L15/065, G10L2015/0635

Abstract: A model learning device comprises: an initial value setting part that uses a parameter of a learned first model including a neural network to set a parameter of a second model including a neural network having a same network structure as the first model; a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of an output probability of each unit on an output layer, using learning features and the first model; a second output probability distribution calculating part that calculates a second output probability distribution including a distribution of an output probability of each unit on the output layer, using learning features and the second model; and a modified model update part that obtains a weighted sum of a second loss function calculated from correct information and from the second output probability distribution, and a cross entropy between the first output probability distribution and the second output probability distribution, and updates the parameter of the second model so as to reduce the weighted sum.
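The weighted-sum objective in this abstract can be sketched in a few lines. The sketch assumes the "second loss function" is a cross entropy against a one-hot correct label and that a single weight `alpha` balances the two terms; neither detail is stated in the abstract, and `distillation_loss` is a hypothetical name.

```python
import math
from typing import List


def cross_entropy(target: List[float], predicted: List[float]) -> float:
    """Cross entropy H(target, predicted); predicted entries must be positive."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)


def distillation_loss(one_hot: List[float], first_probs: List[float],
                      second_probs: List[float], alpha: float) -> float:
    """Weighted sum of the label loss and the first-to-second model cross entropy.

    first_probs: output distribution of the already learned first model.
    second_probs: output distribution of the second model being updated.
    alpha: assumed interpolation weight in [0, 1].
    """
    label_loss = cross_entropy(one_hot, second_probs)
    model_loss = cross_entropy(first_probs, second_probs)
    return (1.0 - alpha) * label_loss + alpha * model_loss


one_hot = [0.0, 1.0, 0.0]
first_probs = [0.1, 0.8, 0.1]
second_probs = [0.2, 0.6, 0.2]
loss = distillation_loss(one_hot, first_probs, second_probs, alpha=0.5)
```

Training would update the second model's parameters to reduce `loss`; the gradient step is omitted here.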

8.

Invention application
DIALOGUE APPARATUS, METHOD AND PROGRAM [In force]

Publication (announcement) No.: US20230005467A1

Publication (announcement) date: 2023-01-05

Application No.: US17779528

Application date: 2019-11-26

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Kazunori YAMADA, Takashi NAKAMURA

IPC: G10L13/08, G10L13/10, G10L15/08, G10L15/22

Abstract: A dialogue apparatus includes a speech recognition unit (1) configured to perform speech recognition on utterance input to generate a text corresponding to the utterance, a speech waveform corresponding to the utterance, and information regarding a length of sound of the utterance; a language understanding unit (2) configured to grasp contents of the utterance by using the text corresponding to the utterance; a dialogue management unit (3) configured to determine contents of a response corresponding to the utterance by using the content of the utterance; an utterance state extraction unit (4) configured to extract a state of the utterance by using the text corresponding to the utterance, the speech waveform corresponding to the utterance, and the information regarding the length of the sound of the utterance; a response state determination unit (5) configured to determine a state of the response according to the state of the utterance; a response sentence generation unit (6) configured to generate a response sentence by using the content of the response; and a speech synthesis unit (7) configured to synthesize speech corresponding to the response sentence with the state of the response taken into account.

9.

Invention application
SPEECH RECOGNITION ACCURACY DETERIORATION FACTOR ESTIMATION DEVICE, SPEECH RECOGNITION ACCURACY DETERIORATION FACTOR ESTIMATION METHOD, AND PROGRAM [In force]

Publication (announcement) No.: US20210035553A1

Publication (announcement) date: 2021-02-04

Application No.: US16968120

Application date: 2019-02-06

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Takashi NAKAMURA, Takaaki FUKUTOMI

IPC: G10L15/02, G10L15/01, G10L15/22

Abstract: The present invention provides a device for estimating the deterioration factor of speech recognition accuracy by estimating an acoustic factor that leads to a speech recognition error. The device extracts an acoustic feature amount for each frame from an input speech, calculates a posterior probability for each acoustic event for the acoustic feature amount for each frame, corrects the posterior probability by filtering the posterior probability for each acoustic event using a time-series filter with weighting coefficients developed in the time axis, outputs a set of speech recognition results with a recognition score, outputs a feature amount for the speech recognition results for each frame, calculates and outputs a principal deterioration factor class for the speech recognition accuracy for each frame on the basis of the corrected posterior probability, the feature amount for speech recognition results for each frame, and the acoustic feature amount for each frame.
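The posterior-correction step in this abstract (filtering per-event posteriors with weights spread along the time axis) can be sketched as a weighted moving average. The centered odd-length kernel, the renormalization at the sequence edges, and the specific weights are assumptions made for illustration; the abstract does not specify the filter, and `smooth_posteriors` is a hypothetical name.

```python
from typing import List


def smooth_posteriors(posteriors: List[List[float]],
                      weights: List[float]) -> List[List[float]]:
    """Filter per-frame acoustic-event posteriors along the time axis.

    posteriors[t][e] is the posterior of event e at frame t. weights is a
    centered, odd-length kernel of positive values. Near the edges only the
    available frames are used and the weights are renormalized.
    """
    half = len(weights) // 2
    n_frames = len(posteriors)
    smoothed = []
    for t in range(n_frames):
        acc = [0.0] * len(posteriors[t])
        total = 0.0
        for offset, w in enumerate(weights):
            src = t + offset - half
            if 0 <= src < n_frames:
                total += w
                for e, p in enumerate(posteriors[src]):
                    acc[e] += w * p
        smoothed.append([a / total for a in acc])
    return smoothed


# Three frames, two acoustic events; the middle frame disagrees with its neighbours.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
corrected = smooth_posteriors(frames, weights=[0.25, 0.5, 0.25])
```

Because each output frame is a normalized weighted average of probability vectors, every corrected frame still sums to one.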

10.

Invention application
ACOUSTIC MODEL LEARNING APPARATUS, METHOD OF THE SAME AND PROGRAM [Under examination, published]

Publication (announcement) No.: US20200035223A1

Publication (announcement) date: 2020-01-30

Application No.: US16337081

Application date: 2017-09-27

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor: Taichi ASAMI, Takashi NAKAMURA

IPC: G10L15/16, G10L15/06, G06N3/04, G06F17/18

Abstract: An acoustic model learning apparatus includes a first output probability distribution calculating part that calculates a first output probability distribution including a distribution of output probabilities of respective units of an output layer using a feature amount obtained from an acoustic signal for learning and a learned first acoustic model including a neural network, and the first output probability distribution calculating part obtains the first output probability distribution using a smoothing parameter made up of a real value greater than 0 as input so that the first output probability distribution approaches a uniform distribution as the smoothing parameter is greater, and calculates the first output probability distribution by obtaining logits of respective units of an output layer using the feature amount obtained from the acoustic signal for learning and the first acoustic model and setting a value of the smoothing parameter greater in the case where an output unit number with the greatest logit value is different from a correct unit number than in the case where the output unit number with the greatest logit value matches the correct unit number.
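The smoothing rule in this abstract can be sketched as a softmax whose logits are divided by the smoothing parameter, which makes the output flatter as the parameter grows. Dividing the logits is one common way to get that behaviour and is an assumption here, as are the two parameter values and the function names.

```python
import math
from typing import List


def smoothed_softmax(logits: List[float], smoothing: float) -> List[float]:
    """Softmax over logits / smoothing; larger smoothing gives a flatter output."""
    if smoothing <= 0.0:
        raise ValueError("smoothing parameter must be greater than 0")
    scaled = [z / smoothing for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def first_output_distribution(logits: List[float], correct_unit: int,
                              smoothing_match: float = 1.0,
                              smoothing_mismatch: float = 4.0) -> List[float]:
    """Use a larger smoothing parameter when the top-logit unit is not the correct unit."""
    top_unit = max(range(len(logits)), key=lambda i: logits[i])
    smoothing = smoothing_match if top_unit == correct_unit else smoothing_mismatch
    return smoothed_softmax(logits, smoothing)


logits = [2.0, 1.0, 0.0]
when_right = first_output_distribution(logits, correct_unit=0)
when_wrong = first_output_distribution(logits, correct_unit=1)
```

When the first model's top unit is wrong, its distribution is flattened, so its mistaken preference carries less weight as a training target.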
