-
Publication number: US09679573B1
Publication date: 2017-06-13
Application number: US14842565
Filing date: 2015-09-01
Applicant: Google Inc.
Inventor: Gheorghe Postelnicu, Matthew Sharifi, Yaniv Bernstein
IPC: G06F17/00, G10L19/018, G06F17/30, H04N21/233
CPC classification number: G10L19/018, G06F17/30017, G06F17/30743, G06F17/30758, G10L25/54, H04N21/233
Abstract: Systems and techniques for adding pitch shift resistance to an audio fingerprint are presented. In particular, an audio track for a media file is received. A first audio fingerprint for the audio track with a first pitch shift and an Nth audio fingerprint for the audio track with an Mth pitch shift are generated, where N is an integer greater than or equal to two and M is an integer greater than or equal to two. A combined audio fingerprint is generated from at least the first audio fingerprint and the Nth audio fingerprint.
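One way to picture the combined fingerprint in this abstract is as the union of several per-shift fingerprints. The sketch below is a toy illustration under stated assumptions, not the patented method: it assumes the track has already been reduced to one dominant log-frequency bin per frame (so a pitch shift becomes an additive bin offset), and the names `fingerprint`, `combined_fingerprint`, and `similarity` are hypothetical.

```python
from typing import FrozenSet, Iterable, Sequence, Tuple

Feature = Tuple[int, int]


def fingerprint(peaks: Sequence[int], shift_bins: int = 0) -> FrozenSet[Feature]:
    """Toy fingerprint: the set of consecutive peak-bin pairs.

    Assumption: `peaks` holds one log-frequency bin index per frame, so a
    pitch shift is modeled by adding `shift_bins` to every bin.
    """
    shifted = [p + shift_bins for p in peaks]
    return frozenset(zip(shifted, shifted[1:]))


def combined_fingerprint(peaks: Sequence[int], shifts: Iterable[int]) -> FrozenSet[Feature]:
    """Union of the fingerprints generated at each pitch shift."""
    combined = set()
    for shift in shifts:
        combined |= fingerprint(peaks, shift)
    return frozenset(combined)


def similarity(query: FrozenSet[Feature], reference: FrozenSet[Feature]) -> float:
    """Fraction of query features that also appear in the reference."""
    if not query:
        return 0.0
    return len(query & reference) / len(query)


# Hypothetical reference track and a copy shifted up by one bin.
reference_peaks = [10, 14, 9, 20, 17]
query = fingerprint(reference_peaks, shift_bins=1)
plain = fingerprint(reference_peaks)
resistant = combined_fingerprint(reference_peaks, shifts=[-1, 0, 1])
```

In this toy setup the shifted query shares no features with the plain fingerprint but is fully covered by the combined one, which is the kind of pitch shift resistance the abstract describes. The cost is a larger reference fingerprint.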
-
Publication number: US09508023B1
Publication date: 2016-11-29
Application number: US14257683
Filing date: 2014-04-21
Applicant: Google Inc.
Inventor: Matthew Sharifi, Sergey Ioffe, Jay Yagnik, Gheorghe Postelnicu, Dominik Roblek, George Tzanetakis
CPC classification number: G06K9/6267, G06F17/3002, G06F17/30244, G06K9/00013
Abstract: This disclosure relates to transformation invariant media matching. A fingerprinting component can generate a transformation invariant identifier for media content by adaptively encoding the relative ordering of interest points in media content. The interest points can be grouped into subsets, and stretch invariant descriptors can be generated for the subsets based on ratios of coordinates of interest points included in the subsets. The stretch invariant descriptors can be aggregated into a transformation invariant identifier. An identification component compares the identifier against a set of identifiers for known media content, and the media content can be matched or identified as a function of the comparison.
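The ratio-based descriptors mentioned in this abstract can be illustrated with a small sketch. It is an assumption-laden example rather than the patented encoding: interest points are reduced to sorted time coordinates, subsets are consecutive triples, and the ratio is taken over coordinate differences so that it survives both a uniform stretch and a constant offset. The function name and the `levels` quantization parameter are hypothetical.

```python
from typing import Sequence, Tuple


def stretch_invariant_descriptors(times: Sequence[float], levels: int = 16) -> Tuple[int, ...]:
    """Quantized ratios for each consecutive triple of interest-point times.

    For a triple (t1, t2, t3) the ratio (t2 - t1) / (t3 - t1) is unchanged
    when every time is mapped to a * t + b with a > 0, i.e. under a uniform
    time stretch plus an offset.
    """
    descriptors = []
    for t1, t2, t3 in zip(times, times[1:], times[2:]):
        span = t3 - t1
        if span <= 0:
            continue  # skip degenerate triples
        ratio = (t2 - t1) / span  # lies in [0, 1] for sorted times
        descriptors.append(min(int(ratio * levels), levels - 1))
    # Aggregate the per-subset descriptors into a single identifier.
    return tuple(descriptors)


# Hypothetical interest-point times, and the same points after the media
# has been slowed to half speed and delayed by 3 seconds.
original_times = [0.0, 1.0, 4.0, 6.0, 7.0]
stretched_times = [2.0 * t + 3.0 for t in original_times]

original_id = stretch_invariant_descriptors(original_times)
stretched_id = stretch_invariant_descriptors(stretched_times)
```

Both identifiers come out identical here, so a lookup keyed on the identifier would match the stretched copy against the original. Coarser `levels` values tolerate more jitter in the interest points at the expense of discriminative power.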
-
Publication number: US09236056B1
Publication date: 2016-01-12
Application number: US13966200
Filing date: 2013-08-13
Applicant: Google Inc.
Inventor: Boris Nikolaev Daskalov, Gheorghe Postelnicu
IPC: G06F17/00, G10L19/018
CPC classification number: G06F17/3002
Abstract: Implementations are provided herein relating to audio matching. A variable length local sensitivity hash (“LSH”) index can be created through a careful examination of existing LSH bands in the LSH index. LSH bands with offset lists that meet a band size threshold can be lengthened repeatedly until a maximum length threshold is reached or an offset list associated with a lengthened LSH band fails to meet the band size threshold. The LSH index can be further tuned by down-sampling or discarding LSH bands that reach a maximum length threshold and still lack discriminate properties.
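The band-lengthening loop in this abstract can be sketched roughly as follows. This is a simplified reading, not the patented implementation: a band is taken to be a tuple of consecutive hash values, a band "meets the band size threshold" when its offset list has at least `size_threshold` entries, and bands that are still crowded at `max_len` are discarded (the abstract also allows down-sampling). All names and parameters are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Band = Tuple[int, ...]


def build_variable_length_index(
    hashes: Sequence[int],
    base_len: int = 1,
    max_len: int = 3,
    size_threshold: int = 2,
) -> Dict[Band, List[int]]:
    """Map each sufficiently rare band to the offsets where it occurs."""
    index: Dict[Band, List[int]] = {}
    pending = list(range(len(hashes) - base_len + 1))
    length = base_len
    while pending:
        # Group the pending offsets by the band of the current length.
        buckets: Dict[Band, List[int]] = defaultdict(list)
        for offset in pending:
            buckets[tuple(hashes[offset:offset + length])].append(offset)
        pending = []
        for band, offsets in buckets.items():
            crowded = len(offsets) >= size_threshold
            if crowded and length < max_len:
                # Lengthen the band; offsets too close to the end of the
                # sequence to be extended are dropped in this sketch.
                pending.extend(o for o in offsets if o + length < len(hashes))
            elif not crowded:
                index[band] = offsets
            # Bands still crowded at max_len are discarded.
        length += 1
    return index


# Hypothetical hash sequence in which 7 and 1 occur too often to be useful alone.
example_index = build_variable_length_index(
    [7, 1, 7, 2, 7, 1, 5], base_len=1, max_len=2, size_threshold=2
)
```

In the example, the single-value bands `(7,)` and `(1,)` are lengthened to two values; `(7, 1)` still occurs twice at the maximum length and is dropped, while the rarer bands are kept with their offsets.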
-
Publication number: US09031840B2
Publication date: 2015-05-12
Application number: US14142042
Filing date: 2013-12-27
Applicant: Google Inc.
Inventor: Matthew Sharifi, Gheorghe Postelnicu
CPC classification number: G10L15/265, G06F17/30746, G10H2210/031, G10H2240/141, G10L15/00, G10L19/00, G10L25/54
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving (i) audio data that encodes a spoken natural language query, and (ii) environmental audio data, obtaining a transcription of the spoken natural language query, determining a particular content type associated with one or more keywords in the transcription, providing at least a portion of the environmental audio data to a content recognition engine, and identifying a content item that has been output by the content recognition engine, and that matches the particular content type.
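The flow in this abstract (transcribe the query, map keywords to a content type, then keep only recognition results of that type) can be sketched as below. This is a minimal illustration under stated assumptions: speech recognition and the content recognition engine are outside the sketch, so the transcription and the candidate items are passed in directly, and the keyword table, field names, and function names are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical keyword table; a real system would use a richer mapping.
KEYWORD_TO_CONTENT_TYPE = {
    "song": "music",
    "singing": "music",
    "movie": "film",
    "show": "tv",
}


def content_type_for(transcription: str) -> Optional[str]:
    """Return the content type implied by the first known keyword, if any."""
    for word in transcription.lower().split():
        content_type = KEYWORD_TO_CONTENT_TYPE.get(word.strip("?.,!"))
        if content_type is not None:
            return content_type
    return None


def identify_content(
    transcription: str, recognized_items: List[Dict[str, str]]
) -> Optional[Dict[str, str]]:
    """Pick the first recognized item whose type matches the spoken query.

    `recognized_items` stands in for the output of a content recognition
    engine that was given the environmental audio.
    """
    wanted = content_type_for(transcription)
    if wanted is None:
        return None
    for item in recognized_items:
        if item.get("type") == wanted:
            return item
    return None


# Hypothetical engine output for audio containing both a TV show and a song.
candidates = [
    {"title": "Example Show", "type": "tv"},
    {"title": "Example Song", "type": "music"},
]
match = identify_content("What song is this?", candidates)
```

With these inputs, the keyword "song" selects the music candidate even though the TV show was listed first, which shows how the spoken query disambiguates among several recognized items.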
-
Publication number: US20140074474A1
Publication date: 2014-03-13
Application number: US13768232
Filing date: 2013-02-15
Applicant: Google Inc.
Inventor: Matthew Sharifi, Gheorghe Postelnicu
IPC: G10L15/00
CPC classification number: G10L15/265, G06F17/30746, G10H2210/031, G10H2240/141, G10L15/00, G10L19/00, G10L25/54
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving (i) audio data that encodes a spoken natural language query, and (ii) environmental audio data, obtaining a transcription of the spoken natural language query, determining a particular content type associated with one or more keywords in the transcription, providing at least a portion of the environmental audio data to a content recognition engine, and identifying a content item that has been output by the content recognition engine, and that matches the particular content type.
-
Publication number: US08655657B1
Publication date: 2014-02-18
Application number: US13768232
Filing date: 2013-02-15
Applicant: Google Inc.
Inventor: Matthew Sharifi, Gheorghe Postelnicu
IPC: G10L15/04
CPC classification number: G10L15/265, G06F17/30746, G10H2210/031, G10H2240/141, G10L15/00, G10L19/00, G10L25/54
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving (i) audio data that encodes a spoken natural language query, and (ii) environmental audio data, obtaining a transcription of the spoken natural language query, determining a particular content type associated with one or more keywords in the transcription, providing at least a portion of the environmental audio data to a content recognition engine, and identifying a content item that has been output by the content recognition engine, and that matches the particular content type.