-
Publication number: US20150127342A1
Publication date: 2015-05-07
Application number: US14523198
Filing date: 2014-10-24
Applicant: Google Inc.
Inventor: Matthew Sharifi, Ignacio Lopez Moreno, Ludwig Schmidt
CPC classification number: G10L17/02, G10L17/005, G10L17/08, G10L17/18, G10L25/51
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, an utterance vector that is derived from an utterance is obtained. Hash values are determined for the utterance vector according to multiple different hash functions. A set of speaker vectors from a plurality of hash tables is determined using the hash values, where each speaker vector was derived from one or more utterances of a respective speaker. The speaker vectors in the set are compared with the utterance vector. A speaker vector is selected based on comparing the speaker vectors in the set with the utterance vector.
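The lookup described in the abstract above (hash the utterance vector with several hash functions, gather candidate speaker vectors from several hash tables, compare, select) can be sketched roughly as follows. This is only an illustrative sketch, not the method claimed in the patent: the abstract does not name a hash family or a comparison measure, so random-hyperplane sign hashing and cosine similarity are assumptions here, and every name (`SpeakerIndex`, `make_hash_function`, `n_tables`, `n_bits`) is hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hash_function(dim, n_bits, rng):
    """Build one hash function: the sign pattern of n_bits random projections.

    Assumption: random-hyperplane hashing; the abstract only says
    "multiple different hash functions".
    """
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_fn(vec):
        return tuple(
            int(sum(p * v for p, v in zip(plane, vec)) >= 0.0) for plane in planes
        )

    return hash_fn


def cosine_similarity(a, b):
    """Assumed comparison measure between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerIndex:
    """Hypothetical index: one hash table per hash function."""

    def __init__(self, dim, n_tables=8, n_bits=6, seed=0):
        rng = random.Random(seed)
        self.hash_fns = [make_hash_function(dim, n_bits, rng) for _ in range(n_tables)]
        self.tables = [defaultdict(set) for _ in range(n_tables)]
        self.speaker_vectors = {}

    def add_speaker(self, speaker_id, vector):
        # Store the speaker vector in every table under its hash value.
        self.speaker_vectors[speaker_id] = vector
        for fn, table in zip(self.hash_fns, self.tables):
            table[fn(vector)].add(speaker_id)

    def identify(self, utterance_vector):
        # Collect candidates whose hash matches the utterance in any table.
        candidates = set()
        for fn, table in zip(self.hash_fns, self.tables):
            candidates |= table.get(fn(utterance_vector), set())
        if not candidates:
            return None
        # Compare only the candidates with the utterance vector; pick the closest.
        return max(
            candidates,
            key=lambda sid: cosine_similarity(self.speaker_vectors[sid], utterance_vector),
        )


index = SpeakerIndex(dim=3)
index.add_speaker("alice", [1.0, 0.0, 0.0])
index.add_speaker("bob", [0.0, 1.0, 0.0])
index.add_speaker("carol", [0.0, 0.0, 1.0])
best_match = index.identify([2.0, 0.0, 0.0])
```

Using several tables raises the chance that a true match shares a bucket with the utterance in at least one of them, while the final comparison runs only over the candidate set rather than over every enrolled speaker.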
-
Publication number: US09514753B2
Publication date: 2016-12-06
Application number: US14523198
Filing date: 2014-10-24
Applicant: Google Inc.
Inventor: Matthew Sharifi, Ignacio Lopez Moreno, Ludwig Schmidt
CPC classification number: G10L17/02, G10L17/005, G10L17/08, G10L17/18, G10L25/51
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, an utterance vector that is derived from an utterance is obtained. Hash values are determined for the utterance vector according to multiple different hash functions. A set of speaker vectors from a plurality of hash tables is determined using the hash values, where each speaker vector was derived from one or more utterances of a respective speaker. The speaker vectors in the set are compared with the utterance vector. A speaker vector is selected based on comparing the speaker vectors in the set with the utterance vector.
-
Publication number: US20170287487A1
Publication date: 2017-10-05
Application number: US15624760
Filing date: 2017-06-16
Applicant: Google Inc.
Inventor: Matthew Sharifi, Ignacio Lopez Moreno, Ludwig Schmidt
CPC classification number: G10L17/02, G10L17/005, G10L17/08, G10L17/18, G10L25/51
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, data identifying a media item including speech of a speaker is received. Based on the received data, one or more other media items that include speech of the speaker are identified. One or more search results are generated that each reference a respective media item of the one or more other media items that include speech of the speaker. The one or more search results are provided for display.
-
公开(公告)号:US20160275953A1
公开(公告)日:2016-09-22
申请号:US15170264
申请日:2016-06-01
Applicant: Google Inc.
Inventor: Matthew Sharifi , Ignacio Lopez Moreno , Ludwig Schmidt
CPC classification number: G10L17/02 , G10L17/005 , G10L17/08 , G10L17/18 , G10L25/51
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, data identifying a media item including speech of a speaker is received. Based on the received data, one or more other media items that include speech of the speaker are identified. One or more search results are generated that each reference a respective media item of the one or more other media items that include speech of the speaker. The one or more search results are provided for display.
-