-
1.
Publication No.: US20230317085A1
Publication date: 2023-10-05
Application No.: US18019126
Filing date: 2020-08-11
Applicant: NEC Corporation
Inventor: Hitoshi YAMAMOTO
Abstract: An acoustic feature extraction unit (130) extracts acoustic features indicative of a feature related to speech from audio data. A phoneme classification unit (110) classifies phonemes included in the audio data on the basis of the acoustic features. A first speaker feature calculation unit (140) generates first speaker features indicative of a feature of speech of each phoneme on the basis of acoustic features and phoneme classification information indicative of classification results of the phonemes included in the audio data. A second speaker feature calculation unit (150) generates a second speaker feature indicative of a feature of overall speech by merging first speaker features regarding two or more phonemes.
-
Publication No.: US20230109867A1
Publication date: 2023-04-13
Application No.: US17908292
Filing date: 2020-03-09
Applicant: NEC Corporation
Inventor: Shuji KOMEIJI, Hitoshi YAMAMOTO
IPC: G10L15/26, G06F40/166
Abstract: A speech recognition apparatus (2000) acquires source data (10) representing an audio signal including an utterance. The speech recognition apparatus (2000) converts the source data (10) into a text string (30). The speech recognition apparatus (2000) generates a concatenated text (40) representing a content of an utterance by concatenating a text (32) included in the text string (30). Herein, texts (32) adjacent to each other in the text string (30) are such that parts of associated audio signals overlap each other on a time axis. At a time of concatenating texts (32) adjacent to each other, the speech recognition apparatus (2000) eliminates a trailing portion of a preceding text (32) and a leading portion of a succeeding text (32).
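Illustrative note: the concatenation step described in this abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the method claimed in the publication: each text (32) is modeled as a list of words, and the fixed word counts `trim_tail` and `trim_head` are hypothetical parameters, since the abstract does not say how the length of the eliminated portions is determined.

```python
def concatenate_overlapping(texts, trim_tail=1, trim_head=1):
    """Join word lists recognized from overlapping audio windows.

    At every boundary between adjacent texts, drop `trim_tail` words from
    the end of the preceding text and `trim_head` words from the start of
    the succeeding text. Both counts are assumptions for illustration.
    """
    merged = []
    last = len(texts) - 1
    for i, words in enumerate(texts):
        # The first text keeps its leading words; the last keeps its trailing words.
        start = trim_head if i > 0 else 0
        end = len(words) - trim_tail if i < last else len(words)
        merged.extend(words[start:end])
    return merged


# Hypothetical input: adjacent windows overlap by two words.
windows = [
    ["the", "quick", "brown", "fox"],
    ["brown", "fox", "jumps", "over"],
    ["jumps", "over", "the", "dog"],
]
concatenated = " ".join(concatenate_overlapping(windows))
```

With a two-word overlap, trimming one word on each side of every boundary leaves each overlapping word exactly once in the concatenated text.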
-
Publication No.: US20190279644A1
Publication date: 2019-09-12
Application No.: US16333008
Filing date: 2017-09-11
Applicant: NEC CORPORATION
Inventor: Hitoshi YAMAMOTO, Takafumi KOSHINAKA, Takayuki SUZUKI
Abstract: A speech processing device includes at least one memory configured to store instructions and at least one processor configured to execute the instructions to: store one or more acoustic models; calculate an acoustic feature from a received speech signal, and by using the acoustic feature calculated and the acoustic model stored, calculate an acoustic diversity that is a vector representing a degree of variations of types of sounds; by using the calculated acoustic diversity and a selection coefficient, calculate a weighted acoustic diversity, and by using the weighted acoustic diversity calculated and the acoustic feature, calculate a recognition feature for recognizing identity of a speaker that concerns the speech signal; and calculate a feature vector by using the recognition feature calculated.
-
4.
Publication No.: US20220238119A1
Publication date: 2022-07-28
Application No.: US17612736
Filing date: 2019-05-28
Applicant: NEC Corporation
Inventor: Takafumi KOSHINAKA, Hitoshi YAMAMOTO, Kaoru KOIDA, Takayuki SUZUKI
Abstract: A neural network input unit 81 inputs a neural network in which a first network having a layer for inputting an anchor signal belonging to a predetermined class and a mixed signal including a target signal belonging to the class and a layer for outputting, as an estimation result, a reconstruction mask indicating a time-frequency domain in which the target signal is present in the mixed signal, and a second network having a layer for inputting the target signal extracted by applying the mixed signal to the reconstruction mask and a layer for outputting a result obtained by classifying the input target signal into a predetermined class are combined. A reconstruction mask estimation unit 82 applies the anchor signal and mixed signal to the first network to estimate the reconstruction mask of the class to which the anchor signal belongs. A signal classification unit 83 applies the mixed signal to the estimated reconstruction mask to extract the target signal, and applies the extracted target signal to the second network to classify the target signal into the class.
-
Publication No.: US20220101859A1
Publication date: 2022-03-31
Application No.: US17545107
Filing date: 2021-12-08
Applicant: NEC Corporation
Inventor: Hitoshi YAMAMOTO, Takafumi KOSHINAKA
Abstract: This speech processing device is provided with: a contribution degree estimation means which calculates a contribution degree representing a quality of a segment of the speech signal; and a speaker feature calculation means which calculates a feature from the speech signal, for recognizing attribute information of the speech signal, using the contribution degree as a weight of the segment of the speech signal.
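Illustrative note: one simple way to use a contribution degree "as a weight of the segment" is a weighted average of per-segment feature vectors. The sketch below assumes that reading; the abstract does not state the pooling formula, and the function name and inputs are hypothetical.

```python
def weighted_speaker_feature(segment_features, contributions):
    """Pool per-segment feature vectors into a single feature vector.

    Each segment's vector is weighted by its contribution degree, so
    low-quality segments influence the result less. A weighted mean is
    an assumption made for illustration only.
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("at least one segment needs a positive contribution")
    dim = len(segment_features[0])
    pooled = [0.0] * dim
    for feature, weight in zip(segment_features, contributions):
        for d in range(dim):
            pooled[d] += weight * feature[d]
    return [value / total for value in pooled]


# Hypothetical input: the third segment has contribution 0 and is ignored.
features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
weights = [1.0, 1.0, 0.0]
pooled = weighted_speaker_feature(features, weights)
```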
-
6.
Publication No.: US20220005482A1
Publication date: 2022-01-06
Application No.: US17288154
Filing date: 2018-10-25
Applicant: NEC Corporation
Inventor: Hitoshi YAMAMOTO, Takafumi KOSHINAKA
Abstract: An audio processing apparatus 100 is an apparatus for generating training data for speaker recognition. The audio processing apparatus 100 includes a data acquisition unit configured to acquire, as sample data, an audio signal that is a source of the training data, and a data generation unit configured to execute signal processing on the acquired sample data and to generate, as the training data, a new audio signal whose similarity with the sample data is within a set range.
-
Publication No.: US20210287682A1
Publication date: 2021-09-16
Application No.: US17255511
Filing date: 2018-06-27
Applicant: NEC Corporation
Inventor: Ling GUO, Hitoshi YAMAMOTO, Takafumi KOSHINAKA
Abstract: The information processing apparatus (2000) computes a first score representing a degree of similarity between the input sound data (10) and the registrant sound data (22) of the registrant (20). The information processing apparatus (2000) obtains a plurality of pieces of segmented sound data (12) by segmenting the input sound data (10) in the time direction. The information processing apparatus (2000) computes, for each piece of segmented sound data (12), a second score representing the degree of similarity between the segmented sound data (12) and the registrant sound data (22). The information processing apparatus (2000) makes a first determination to determine whether the number of speakers of sound included in the input sound data (10) is one or multiple, using at least the second scores. The information processing apparatus (2000) makes a second determination to determine whether the input sound data (10) includes the sound of the registrant (20), based on the first score, the second scores, and a result of the first determination.
-
Publication No.: US20210264939A1
Publication date: 2021-08-26
Application No.: US17253763
Filing date: 2018-06-21
Applicant: NEC CORPORATION
Inventor: Hitoshi YAMAMOTO, Takafumi KOSHINAKA
Abstract: To provide an attribute identifying device, an attribute identifying method, and a program storage medium in which the accuracy of attribute identification of a person is further enhanced.
An attribute identifying device 100 includes a first attribute identifying unit 130 that identifies, from a biological signal, first attribute information, which is a range of specific attribute values, and a second attribute identifying unit 140 that identifies second attribute information, which is specific attribute information, from the biological signal and the first attribute information.
-
Publication No.: US20210134300A1
Publication date: 2021-05-06
Application No.: US16475743
Filing date: 2017-03-07
Applicant: NEC Corporation
Inventor: Hitoshi YAMAMOTO, Takafumi KOSHINAKA
Abstract: This speech processing device is provided with: a contribution degree estimation means which calculates a contribution degree representing a quality of a segment of the speech signal; and a speaker feature calculation means which calculates a feature from the speech signal, for recognizing attribute information of the speech signal, using the contribution degree as a weight of the segment of the speech signal.
-
Publication No.: US20210319087A1
Publication date: 2021-10-14
Application No.: US17355480
Filing date: 2021-06-23
Applicant: NEC Corporation
Inventor: Koji OKABE, Hitoshi YAMAMOTO, Takafumi KOSHINAKA
Abstract: An authentication device is provided with: a plurality of attribute-dependent score calculation units each calculating an attribute-dependent score dependent on a prescribed attribute for input data; an attribute-independent score calculation unit for calculating an attribute-independent score independent of the attribute for the input data; an attribute estimation unit for performing attribute estimation for the input data; and a score integration unit for determining a score weight of each of a plurality of attribute-dependent scores and of the attribute-independent score using the result of the attribute estimation and calculating an output score using the attribute-dependent scores, the attribute-independent score, and the determined score weights.
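Illustrative note: the score integration described in this abstract can be sketched as a weighted sum. The weighting rule below is an assumption made for illustration (the attribute-independent score gets a fixed share, and the remainder is split across attribute-dependent scores in proportion to the estimated attribute probabilities); the abstract does not specify how the score weights are derived. The attribute labels "A" and "B" and the parameter `independent_weight` are hypothetical.

```python
def integrate_scores(dependent_scores, independent_score,
                     attribute_posteriors, independent_weight=0.5):
    """Combine attribute-dependent and attribute-independent scores.

    dependent_scores:     {attribute: score from that attribute's scorer}
    attribute_posteriors: {attribute: estimated probability of the attribute}
    independent_weight:   share given to the attribute-independent score
                          (hypothetical parameter).
    """
    output = independent_weight * independent_score
    for attribute, score in dependent_scores.items():
        # Attributes missing from the estimation result get zero weight.
        posterior = attribute_posteriors.get(attribute, 0.0)
        output += (1.0 - independent_weight) * posterior * score
    return output


# Hypothetical input: the estimator favors attribute "A" at 75%.
output_score = integrate_scores(
    dependent_scores={"A": 2.0, "B": -1.0},
    independent_score=1.0,
    attribute_posteriors={"A": 0.75, "B": 0.25},
)
```

Here the output score is 0.5 * 1.0 + 0.5 * (0.75 * 2.0 + 0.25 * (-1.0)) = 1.125, so the scorer matching the more likely attribute dominates the attribute-dependent part.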