-
1.
Publication No.: US20240273883A1
Publication date: 2024-08-15
Application No.: US18644689
Application date: 2024-04-24
Inventor: Shintaro OKADA, Masanari MIYAMOTO, Kousuke ITAKURA
CPC classification number: G06V10/803, G06V10/761, G06V40/168, G06V40/172, G10L17/02, G10L17/10
Abstract: An information processing device performs: acquiring a face similarity indicating a similarity between a face of a first person and a face of a second person; acquiring a voice similarity indicating a similarity between a voice of the first person and a voice of the second person; calculating an integrated similarity by integrating the face similarity and the voice similarity, and determining the integrated similarity as a final similarity when the face similarity falls within an integrated range including a threshold which is used to determine whether the first person and the second person are identical to each other, and calculating the face similarity as a final similarity when the face similarity is out of the integrated range; and outputting the final similarity.
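The decision rule in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the two scores are integrated or how the integrated range is defined, so the sketch assumes scores in [0, 1], a weighted average as the integration, and a symmetric margin around the threshold. The function name and all parameters are hypothetical.

```python
def final_similarity(face_sim, voice_sim, threshold=0.5, margin=0.1, face_weight=0.5):
    """Return the final similarity following the rule described in the abstract.

    Assumptions (not taken from the abstract): the integrated range is
    [threshold - margin, threshold + margin], and integration is a
    weighted average of the face and voice scores.
    """
    low, high = threshold - margin, threshold + margin
    if low <= face_sim <= high:
        # The face score is close to the decision threshold, so it is
        # fused with the voice score.
        return face_weight * face_sim + (1.0 - face_weight) * voice_sim
    # The face score is clearly above or below the threshold, so it is
    # used unchanged as the final similarity.
    return face_sim
```

With these assumed defaults, a face score of 0.9 is returned unchanged, while a face score of 0.5 is averaged with the voice score.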
-
2.
Publication No.: US20240282313A1
Publication date: 2024-08-22
Application No.: US18653631
Application date: 2024-05-02
Inventor: Kousuke ITAKURA
Abstract: A speaker recognition device acquires a registered voice, converts the acquired registered voice to a plurality of property converted voices having respective acoustic properties different from each other, extracts a speaker feature indicative of a characteristic of a speaker from the registered voice, extracts a speaker feature from each of the property converted voices, compares all pairs of speaker features of a part or all of the speaker feature extracted from the registered voice and the speaker features extracted from the property converted voices, and calculates a threshold used for recognition of a speaker of an input voice on the basis of a result of the comparison.
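The abstract does not state how the threshold is derived from the pairwise comparison. The sketch below assumes cosine similarity as the comparison and takes the lowest pairwise score, minus an optional margin, as the threshold. Property conversion and feature extraction are out of scope here, so the speaker features are passed in as plain vectors. `cosine` and `threshold_from_features` are hypothetical names.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def threshold_from_features(features, margin=0.0):
    """Derive a recognition threshold from same-speaker features.

    `features` holds the feature of the registered voice and the features
    of its property-converted variants. Assumption: the threshold is the
    lowest pairwise similarity minus `margin`.
    """
    if len(features) < 2:
        raise ValueError("at least two speaker features are required")
    # Compare every pair of features exactly once.
    scores = [cosine(a, b) for a, b in combinations(features, 2)]
    return min(scores) - margin
```

The intuition behind this choice is that all features come from the same speaker, so the lowest score among them indicates how far a genuine score can drop when acoustic conditions change.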
-
3.
Publication No.: US20200160846A1
Publication date: 2020-05-21
Application No.: US16682661
Application date: 2019-11-13
Inventor: Kousuke ITAKURA, Ko MIZUNO, Misaki DOI
Abstract: A speaker recognition device according to the present disclosure includes: an acoustic feature calculator that calculates, from utterance data indicating a voice of an obtained utterance, acoustic feature of the voice of the utterance; a statistic calculator that calculates an utterance data statistic from the calculated acoustic feature; a speaker feature extractor that extracts speaker feature of a speaker of the utterance data from the calculated utterance data statistic using a deep neural network (DNN); a similarity calculator that calculates a similarity between the extracted speaker feature and pre-stored speaker feature of at least one registered speaker; and a speaker recognizer that recognizes the speaker of the utterance data based on the calculated similarity.
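The last two stages of this pipeline, the similarity calculator and the speaker recognizer, can be sketched as follows. The acoustic feature, statistic, and DNN stages are not reproduced; the speaker feature is assumed to be available as a plain vector. Cosine similarity and the rejection threshold `min_similarity` are assumptions, since the abstract does not name a similarity measure or a decision rule. All names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def recognize_speaker(feature, registered, min_similarity=0.6):
    """Return (speaker name or None, best similarity).

    `registered` maps each registered speaker's name to a pre-stored
    speaker feature. Assumption: the speaker with the highest similarity
    is accepted only if the score reaches `min_similarity`.
    """
    best_name, best_score = None, -1.0
    for name, stored in registered.items():
        score = cosine_similarity(feature, stored)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_similarity:
        # No registered speaker is similar enough to the utterance.
        return None, best_score
    return best_name, best_score
```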
-
4.
Publication No.: US20240105174A1
Publication date: 2024-03-28
Application No.: US18527928
Application date: 2023-12-04
Inventor: Takahiro KAMAI, Katsunori DAIMO, Misaki DOI, Kousuke ITAKURA
CPC classification number: G10L15/22, B60R16/0373, G10L15/01, G10L15/02, G10L15/10, G10L15/30, G10L2015/223
Abstract: A voice recognition device includes an estimation unit that compares a plurality of pieces of registration voice data stored in a database with input voice data uttered by a speaker who gets on a mobile body to estimate a registration command corresponding to the input command, a presentation unit that presents an estimation result, a second acquisition unit that acquires an error instruction indicating that the estimation result is an error, a determination unit that, in a case where the error instruction is acquired, determines a correct command corresponding to the input command based on an operation by the speaker, and a database management unit that stores the correct command and the input voice data in the database in association with each other.
-
5.
Publication No.: US20240112682A1
Publication date: 2024-04-04
Application No.: US18532054
Application date: 2023-12-07
Inventor: Takahiro KAMAI, Misaki DOI, Katsunori DAIMO, Kousuke ITAKURA
CPC classification number: G10L17/06, G06F16/68, G10L15/02, G10L2015/025
Abstract: An utterer identification device executes: performing voice recognition from input utterance data; selecting, from among a plurality of registered utterance contents set in advance, a registered utterance content closest to a recognized utterance content indicated by a result of the voice recognition as a selected utterance content; selecting, from among a plurality of databases respectively associated with the registered utterance contents, a database associated with the selected utterance content; calculating a similarity between a feature quantity of the input utterance data and a feature quantity stored in the selected database; and identifying a certain utterer on the basis of the similarity, and outputting a result of the identification.
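The steps after voice recognition can be sketched as follows. The recognized text is taken as an input, so the voice recognition step itself is not shown. Two choices here are assumptions, because the abstract does not specify them: string closeness is measured with `difflib.SequenceMatcher`, and feature similarity is `1 / (1 + Euclidean distance)`. The function names and the nested-dictionary layout of the databases are hypothetical.

```python
import math
from difflib import SequenceMatcher


def closest_content(recognized, registered_contents):
    """Pick the registered utterance content closest to the recognized text."""
    return max(
        registered_contents,
        key=lambda content: SequenceMatcher(None, recognized, content).ratio(),
    )


def identify_utterer(recognized_text, input_feature, databases):
    """Return (selected content, identified utterer, similarity).

    `databases` maps each registered utterance content to its own
    database, which maps utterer names to stored feature vectors.
    """
    # Select the database associated with the closest registered content.
    content = closest_content(recognized_text, databases.keys())
    selected = databases[content]
    # Score the input feature against every feature in that database only.
    scores = {
        name: 1.0 / (1.0 + math.dist(input_feature, stored))
        for name, stored in selected.items()
    }
    best = max(scores, key=scores.get)
    return content, best, scores[best]
```

Keeping one database per utterance content means the input is compared only against features recorded for the same phrase, which is the point of the selection step.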
-
6.
Publication No.: US20240087570A1
Publication date: 2024-03-14
Application No.: US18517229
Application date: 2023-11-22
Inventor: Takahiro KAMAI, Kousuke ITAKURA, Misaki DOI, Katsunori DAIMO
IPC: G10L15/22
CPC classification number: G10L15/22, G10L2015/223
Abstract: A voice recognition device includes: a calculation unit that calculates a first feature amount that is a feature amount of input voice data acquired by a first acquisition unit; an estimation unit that estimates a driving situation of a mobile object on the basis of operation information acquired by a second acquisition unit; an extraction unit that extracts, from a feature amount database, a second feature amount corresponding to the driving situation; a recognition unit that recognizes an input command on the basis of similarity between the first feature amount and the second feature amount; and an output unit that outputs a recognition result.
-
7.
Publication No.: US20230016655A1
Publication date: 2023-01-19
Application No.: US17949682
Application date: 2022-09-21
Inventor: Kousuke ITAKURA
Abstract: A speaker identification device acquires identification target voice data; acquires registered voice data; selects a first speaker identification model machine-learned using male voice data to identify a male speaker in a case where one of a sex of a speaker of the identification target voice data and a sex of a speaker of the registered voice data is male, and selects a second speaker identification model machine-learned using female voice data to identify a female speaker in a case where one of a sex of the speaker of the identification target voice data and a sex of the speaker of the registered voice data is female; and inputs a feature amount of the identification target voice data and a feature amount of the registered voice data to one of the selected first speaker identification model and second speaker identification model to identify the speaker of the identification target voice data.
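The model selection step can be sketched as follows. The abstract selects a model from the sex of one of the two speakers but does not say which one, or how the sex is obtained, so the sketch takes a single `reference_sex` label as a given input. The two models are assumed to be callables that take both feature amounts; all names are hypothetical.

```python
def select_model(reference_sex, male_model, female_model):
    """Pick the sex-specific speaker identification model.

    Assumption: `reference_sex` is the already-known sex label of either
    the identification target speaker or the registered speaker.
    """
    if reference_sex == "male":
        return male_model
    if reference_sex == "female":
        return female_model
    raise ValueError(f"unsupported sex label: {reference_sex!r}")


def identify(target_feature, registered_feature, reference_sex, male_model, female_model):
    """Run the selected model on the two feature amounts and return its output."""
    model = select_model(reference_sex, male_model, female_model)
    return model(target_feature, registered_feature)
```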
-
8.
Publication No.: US20210056955A1
Publication date: 2021-02-25
Application No.: US16996408
Application date: 2020-08-18
Inventor: Misaki DOI, Takahiro KAMAI, Kousuke ITAKURA
Abstract: A training method of training a speaker identification model which receives voice data as an input and outputs speaker identification information for identifying a speaker of an utterance included in the voice data is provided. The training method includes: performing voice quality conversion of first voice data of a first speaker to generate second voice data of a second speaker; and performing training of the speaker identification model using, as training data, the first voice data and the second voice data.
-
9.
Publication No.: US20200160218A1
Publication date: 2020-05-21
Application No.: US16680656
Application date: 2019-11-12
Inventor: Kousuke ITAKURA, Ko MIZUNO
Abstract: In a behavior identification method, surrounding sound is acquired, a feature value that is specified by a spectrum pattern included in spectrum information generated from sound made by a person performing a predetermined behavior is extracted from the sound acquired, the predetermined behavior is identified by the feature value, and information indicating the predetermined behavior identified is output.