-
1.
Publication No.: US20230169988A1
Publication Date: 2023-06-01
Application No.: US17538336
Filing Date: 2021-11-30
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Myungjong KIM , Vijendra Raj APSINGEKAR , Divya NEELAGIRI , Taeyeon KI
IPC: G10L21/0272 , G10L15/00 , G10L25/30 , G06N3/04
CPC classification number: G10L21/0272 , G10L15/005 , G10L25/30 , G06N3/0454 , G06N3/0481
Abstract: An apparatus for processing speech data may include a processor configured to: separate speech signals from an input speech; identify a language of each of the speech signals that are separated from the input speech; extract speaker embeddings from the speech signals based on the language of each of the speech signals, using at least one neural network configured to receive the speech signals and output the speaker embeddings; and identify a speaker of each of the speech signals by iteratively clustering the speaker embeddings.
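The last two steps of this abstract (language-dependent embedding extraction, then iterative clustering) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not taken from the application: `embed_by_language`, `iterative_cluster`, the `extractors` mapping of language code to embedding function, the cosine-similarity criterion, and the 0.8 threshold. Greedy agglomerative merging stands in for whatever iterative clustering the application actually claims, and plain callables stand in for the neural networks.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def embed_by_language(segments, extractors):
    """Pick an embedding extractor per segment based on its language label.

    segments:   list of (language, signal) pairs (hypothetical layout)
    extractors: dict mapping language -> callable(signal) -> embedding
    """
    return [extractors[lang](signal) for lang, signal in segments]


def iterative_cluster(embeddings, threshold=0.8):
    """Repeatedly merge the two most similar clusters (by centroid cosine
    similarity) until no pair exceeds `threshold`. Returns one label per
    embedding; the threshold value is an assumption."""
    clusters = [[i] for i in range(len(embeddings))]
    while len(clusters) > 1:
        best, best_sim = None, threshold
        cents = [centroid([embeddings[k] for k in c]) for c in clusters]
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(cents[i], cents[j])
                if sim > best_sim:
                    best, best_sim = (i, j), sim
        if best is None:
            break  # no remaining pair is similar enough to merge
        i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for k in members:
            labels[k] = label
    return labels
```

In this sketch each resulting label plays the role of a speaker identity; a real system would replace the stand-in extractors with trained per-language models.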
-
2.
Publication No.: US20230169981A1
Publication Date: 2023-06-01
Application No.: US17538604
Filing Date: 2021-11-30
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Myungjong KIM , Vijendra Raj APSINGEKAR , Aviral ANSHU , Taeyeon KI
IPC: G10L17/06 , G10L21/0308 , G10L17/02 , G10L17/18 , G06N3/04
CPC classification number: G10L17/06 , G10L21/0308 , G10L17/02 , G10L17/18 , G06N3/04
Abstract: An apparatus for processing speech data may include a processor configured to: separate an input speech into speech signals; identify a bandwidth of each of the speech signals; extract speaker embeddings from the speech signals based on the bandwidth of each of the speech signals, using at least one neural network configured to receive the speech signals and output the speaker embeddings; and cluster the speaker embeddings into one or more speaker clusters, each speaker cluster corresponding to a speaker identity.
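The abstract does not say how the bandwidth of a speech signal is identified. One plausible heuristic, shown here purely as an assumption, is to measure how much spectral energy lies above 4 kHz: telephone-style narrowband speech has almost none, while wideband speech does. The function names, the 4 kHz cutoff, and the 1% energy threshold are all hypothetical, and the naive DFT is only suitable for short frames.

```python
import cmath
import math


def high_band_energy_ratio(samples, sample_rate, cutoff_hz=4000.0):
    """Fraction of non-DC spectral energy at or above `cutoff_hz`,
    computed with a naive DFT over the positive-frequency bins."""
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n >= cutoff_hz:
            high += power
    return high / total if total else 0.0


def identify_bandwidth(samples, sample_rate, cutoff_hz=4000.0, min_ratio=0.01):
    """Label a frame 'narrowband' or 'wideband' (assumed two-class scheme)."""
    if sample_rate <= 2 * cutoff_hz:
        # Nothing above the cutoff can be represented at this sample rate.
        return "narrowband"
    ratio = high_band_energy_ratio(samples, sample_rate, cutoff_hz)
    return "wideband" if ratio >= min_ratio else "narrowband"
```

The returned label could then select a bandwidth-specific embedding network before clustering, which is the routing step the abstract describes.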
-