-
Publication No.: EP4330965A1
Publication date: 2024-03-06
Application No.: EP22724184.1
Filing date: 2022-04-27
-
Publication No.: EP4323988A1
Publication date: 2024-02-21
Application No.: EP21742664.2
Filing date: 2021-06-22
Applicant: GOOGLE LLC
Inventors: GRANGIER, David; ZEGHIDOUR, Neil; TEBOUL, Oliver
-
Publication No.: EP4082008B1
Publication date: 2024-01-31
Application No.: EP20842838.3
Filing date: 2020-12-21
Inventors: MUHAMED, Aashiq; GHOSE, Susmita
-
Publication No.: EP3791393B1
Publication date: 2023-01-25
Application No.: EP19723576.5
Filing date: 2019-04-30
Inventors: ZHANG, Shixiong; XIAO, Xiong
IPC classification: G10L17/18, G01N21/898
-
Publication No.: EP4086904A1
Publication date: 2022-11-09
Application No.: EP22181074.0
Filing date: 2019-12-04
Applicant: Google LLC
Inventors: MORENO, Ignacio Lopez; WANG, Quan; PELECANOS, Jason; WAN, Li; GRUENSTEIN, Alexander; ERDOGAN, Hakan
Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.
-
Publication No.: EP4084000A3
Publication date: 2022-11-09
Application No.: EP22179382.1
Filing date: 2016-07-27
Applicant: Google LLC
Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
-
Publication No.: EP4040435A1
Publication date: 2022-08-10
Application No.: EP21305164.2
Filing date: 2021-02-05
Applicant: ALE International
IPC classification: G10L17/18
Abstract: A method and system for detecting and recognizing an actual participant (607) as an active speaker in a teleconference (606) by training an encoder before the teleconference takes place, and creating a database (602) of reference vectors representing the voice of candidate participants, then, while the teleconference takes place, comparing reference vectors with vectors representing the voice stream of actual participants. The encoder may for example be a Convolutional Neural Network (CNN).
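Illustrative note (not part of the patent record): the comparison step described in the abstract above, matching a vector from the live voice stream against a database of reference vectors, could be sketched as follows. The abstract does not name a similarity measure or a decision rule, so cosine similarity, the `threshold` value, and the names `cosine_similarity` and `identify_speaker` are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def identify_speaker(stream_vector, reference_db, threshold=0.7):
    """Return (name, score) of the closest reference vector.

    reference_db maps a candidate participant's name to a reference vector.
    If no candidate reaches the (hypothetical) threshold, name is None.
    """
    best_name, best_score = None, -1.0
    for name, reference in reference_db.items():
        score = cosine_similarity(stream_vector, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score
```

In this sketch the reference database would be filled before the teleconference by running the trained encoder on each candidate participant's voice; the encoder itself is out of scope here.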
-
Publication No.: EP3706117B1
Publication date: 2022-05-11
Application No.: EP20161160.5
Filing date: 2020-03-05
Inventors: PARK, Sung-Un; KIM, Kyuhong
IPC classification: G10L17/04, G10L17/20, G10L17/02, G10L17/18, G10L21/0216
-
Publication No.: EP3979240A1
Publication date: 2022-04-06
Application No.: EP19930251.4
Filing date: 2019-05-28
Applicant: NEC Corporation
Abstract: A neural network input unit 81 inputs a neural network in which a first network having a layer for inputting an anchor signal belonging to a predetermined class and a mixed signal including a target signal belonging to the class and a layer for outputting, as an estimation result, a reconstruction mask indicating a time-frequency domain in which the target signal is present in the mixed signal, and a second network having a layer for inputting the target signal extracted by applying the mixed signal to the reconstruction mask and a layer for outputting a result obtained by classifying the input target signal into a predetermined class are combined. A reconstruction mask estimation unit 82 applies the anchor signal and mixed signal to the first network to estimate the reconstruction mask of the class to which the anchor signal belongs. A signal classification unit 83 applies the mixed signal to the estimated reconstruction mask to extract the target signal, and applies the extracted target signal to the second network to classify the target signal into the class.
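Illustrative note (not part of the patent record): the extraction step in the abstract above, applying the mixed signal to the estimated reconstruction mask, is commonly realized as an element-wise product over time-frequency bins. The abstract does not specify the operation, so the element-wise product, the list-of-lists spectrogram layout, and the name `apply_mask` are assumptions. Both networks (mask estimation and classification) are out of scope for this sketch.

```python
def apply_mask(mixed_spectrogram, mask):
    """Element-wise product of a magnitude spectrogram and a mask.

    Both arguments are lists of rows (time frames), each row a list of
    frequency-bin values. Mask values near 1 keep a bin, values near 0
    suppress it. This is an assumed reading of "applying the mask".
    """
    if len(mixed_spectrogram) != len(mask):
        raise ValueError("spectrogram and mask must have the same number of frames")
    extracted = []
    for mix_row, mask_row in zip(mixed_spectrogram, mask):
        if len(mix_row) != len(mask_row):
            raise ValueError("spectrogram and mask must have the same number of bins")
        extracted.append([m * x for m, x in zip(mask_row, mix_row)])
    return extracted
```

The extracted spectrogram would then be passed to the second (classification) network described in the abstract.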