HOTWORD RECOGNITION
    22.
    Patent application

    Publication number: US20170110144A1

    Publication date: 2017-04-20

    Application number: US14943287

    Filing date: 2015-11-17

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data corresponding to an utterance, determining that the audio data corresponds to a hotword, generating a hotword audio fingerprint of the audio data that is determined to correspond to the hotword, comparing the hotword audio fingerprint to one or more stored audio fingerprints of audio data that was previously determined to correspond to the hotword, detecting whether the hotword audio fingerprint matches a stored audio fingerprint of audio data that was previously determined to correspond to the hotword based on whether the comparison indicates a similarity between the hotword audio fingerprint and one of the one or more stored audio fingerprints that satisfies a predetermined threshold, and in response to detecting that the hotword audio fingerprint matches a stored audio fingerprint, disabling access to a computing device into which the utterance was spoken.
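
The replay check described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes a fingerprint is a plain list of floats and uses cosine similarity as the comparison, neither of which the abstract specifies. The names `cosine_similarity`, `is_replay`, and `handle_hotword`, and the 0.99 threshold, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed comparison metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_replay(fingerprint, stored_fingerprints, threshold=0.99):
    """True if the fingerprint matches any stored one at or above the threshold."""
    return any(
        cosine_similarity(fingerprint, stored) >= threshold
        for stored in stored_fingerprints
    )


def handle_hotword(fingerprint, stored_fingerprints, threshold=0.99):
    """Disable access on a near-identical repeat; otherwise remember the fingerprint."""
    if is_replay(fingerprint, stored_fingerprints, threshold):
        return "access_disabled"
    # Hypothetical bookkeeping: keep the fingerprint for future comparisons.
    stored_fingerprints.append(list(fingerprint))
    return "access_allowed"
```

The idea is that a live utterance rarely reproduces an earlier one almost exactly, so a near-perfect match suggests a recording is being played back.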

    HOTWORD RECOGNITION
    23.
    Patent application

    Publication number: US20170110130A1

    Publication date: 2017-04-20

    Application number: US15176830

    Filing date: 2016-06-08

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data corresponding to an utterance, determining that the audio data corresponds to a hotword, generating a hotword audio fingerprint of the audio data that is determined to correspond to the hotword, comparing the hotword audio fingerprint to one or more stored audio fingerprints of audio data that was previously determined to correspond to the hotword, detecting whether the hotword audio fingerprint matches a stored audio fingerprint of audio data that was previously determined to correspond to the hotword based on whether the comparison indicates a similarity between the hotword audio fingerprint and one of the one or more stored audio fingerprints that satisfies a predetermined threshold, and in response to detecting that the hotword audio fingerprint matches a stored audio fingerprint, disabling access to a computing device into which the utterance was spoken.

    Multiple speech locale-specific hotword classifiers for selection of a speech locale
    24.
    Granted patent
    Legal status: In force

    Publication number: US09589564B2

    Publication date: 2017-03-07

    Application number: US14173264

    Filing date: 2014-02-05

    Applicant: Google Inc.

    Inventor: Matthew Sharifi

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech in an utterance. The methods, systems, and apparatus include actions of receiving an utterance and obtaining acoustic features from the utterance. Further actions include providing the acoustic features from the utterance to multiple speech locale-specific hotword classifiers. Each speech locale-specific hotword classifier (i) may be associated with a respective speech locale, and (ii) may be configured to classify audio features as corresponding to, or as not corresponding to, a respective predefined term. Additional actions may include selecting a speech locale for use in transcribing the utterance based on one or more results from the multiple speech locale-specific hotword classifiers in response to providing the acoustic features from the utterance to the multiple speech locale-specific hotword classifiers. Further actions may include selecting parameters for automated speech recognition based on the selected speech locale.
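
A rough sketch of the selection step, under stated assumptions: each locale-specific classifier is modeled as a callable that returns a score in [0, 1], the locale with the highest score is chosen, and the recognizer parameters come from a lookup table. The names `select_locale` and `ASR_PARAMS`, the parameter values, and the `min_score` cutoff are all hypothetical.

```python
# Hypothetical per-locale recognizer parameters (illustrative values only).
ASR_PARAMS = {
    "en-US": {"language_model": "en_us_lm", "acoustic_model": "en_us_am"},
    "fr-FR": {"language_model": "fr_fr_lm", "acoustic_model": "fr_fr_am"},
}


def select_locale(features, classifiers, min_score=0.5):
    """Run every locale-specific hotword classifier and pick the best locale.

    `classifiers` maps a locale name to a callable returning a score for
    "these features correspond to this locale's hotword". Returns None when
    no classifier is confident enough.
    """
    scores = {locale: clf(features) for locale, clf in classifiers.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None


def asr_parameters_for(features, classifiers):
    """Select recognizer parameters based on the selected speech locale."""
    locale = select_locale(features, classifiers)
    return ASR_PARAMS.get(locale)
```

Because the features go to all classifiers at once, the locale is inferred from how the hotword itself was spoken rather than from a device setting.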

    Melody recognition systems
    25.
    Granted patent
    Legal status: In force

    Publication number: US09569532B1

    Publication date: 2017-02-14

    Application number: US14300600

    Filing date: 2014-06-10

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting, from among a collection of videos, a set of candidate videos that (i) are identified as being associated with a particular song, and (ii) are classified as a cappella video recordings; extracting, from each of the candidate videos of the set, a monophonic melody line from an audio channel of the candidate video; selecting, from among the set of candidate videos, a subset of the candidate videos based on a similarity of the monophonic melody line of the candidate videos of the subset with each other; and providing, to a recognizer that recognizes songs from sounds produced by a human voice, (i) an identifier of the particular song, and (ii) one or more of the monophonic melody lines of the candidate videos of the subset.
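
The subset-selection step can be illustrated with a toy sketch. It assumes each monophonic melody line has already been extracted as a list of pitch numbers, and it scores similarity as the fraction of positions with the same pitch. Both are simplifications chosen for illustration; the abstract does not name a similarity measure. `melody_similarity`, `select_consistent_subset`, and the 0.4 cutoff are hypothetical.

```python
def melody_similarity(a, b):
    """Fraction of positions where two pitch sequences agree (assumed metric)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / longest


def select_consistent_subset(melodies, min_mean_similarity=0.4):
    """Keep the candidate videos whose melody agrees with the other candidates.

    `melodies` maps a video identifier to its extracted pitch sequence.
    """
    selected = []
    for video_id, melody in melodies.items():
        others = [m for other_id, m in melodies.items() if other_id != video_id]
        if not others:
            continue
        mean = sum(melody_similarity(melody, m) for m in others) / len(others)
        if mean >= min_mean_similarity:
            selected.append(video_id)
    return selected
```

Videos whose extracted melody disagrees with most other candidates (wrong song, poor extraction) drop out before the melodies are passed to the recognizer.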

    COLLABORATIVE LANGUAGE MODEL BIASING
    26.
    Patent application
    Legal status: In force

    Publication number: US20170032781A1

    Publication date: 2017-02-02

    Application number: US14811190

    Filing date: 2015-07-28

    Applicant: Google Inc.

    CPC classification number: G10L15/197 G06F17/277 G06F17/2785 G10L2015/227

    Abstract: Methods, including computer programs encoded on a computer storage medium, for collaborative language model biasing. In one aspect, a method includes receiving (i) data including a set of terms associated with a target user, and, (ii) from each of multiple other users, data including a set of terms associated with the other user, selecting a particular other user based at least on comparing the set of terms associated with the target user to the sets of terms associated with the other users, selecting one or more terms from the set of terms that is associated with the particular other user, obtaining, based on the selected terms that are associated with the particular other user, a biased language model, and providing the biased language model to an automated speech recognizer.
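
A minimal sketch of the flow in this abstract, under assumptions the abstract does not state: term sets are compared with Jaccard similarity, the language model is a unigram probability table, and biasing multiplies the probability of selected terms by a fixed boost before renormalizing. All function names and the boost value are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two term sets (assumed comparison)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_similar_user(target_terms, other_users):
    """Pick the other user whose term set is most similar to the target's."""
    return max(other_users, key=lambda user: jaccard(target_terms, other_users[user]))


def bias_language_model(base_probs, bias_terms, boost=2.0):
    """Boost the selected terms in a unigram model, then renormalize to sum to 1."""
    weighted = {
        word: prob * (boost if word in bias_terms else 1.0)
        for word, prob in base_probs.items()
    }
    total = sum(weighted.values())
    return {word: prob / total for word, prob in weighted.items()}


def collaborative_bias(target_terms, other_users, base_probs):
    """Bias toward terms the most similar user has that the target lacks."""
    similar = select_similar_user(target_terms, other_users)
    new_terms = other_users[similar] - target_terms
    return bias_language_model(base_probs, new_terms)
```

The biased table would then be handed to the speech recognizer, making terms borrowed from the similar user more likely to be recognized.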

    Associating audio tracks with video content
    27.
    Granted patent
    Legal status: In force

    Publication number: US09542488B2

    Publication date: 2017-01-10

    Application number: US13957944

    Filing date: 2013-08-02

    Applicant: Google Inc.

    Inventor: Matthew Sharifi

    Abstract: In one example, a system comprises at least one processor configured to determine an indication of an audio portion of video content, determine, based at least in part on the indication, one or more candidate audio tracks, determine, based at least in part on the one or more candidate audio tracks, one or more search terms, and provide a search query that includes the search terms. The at least one processor may be further configured to, in response to the search query, receive a response that indicates a number of search results, wherein each one of the search results is associated with content that includes the one or more search terms, select, based at least in part on the response, a particular audio track of the one or more candidate audio tracks, and send a message that associates the video content with at least the particular audio track.
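
One way to picture the selection step is sketched below. It assumes each candidate track is a dictionary with `artist` and `title` keys, that the search terms are simply those two strings, and that the track whose query returns the most results is selected. The abstract says only that the selection is based on the response. The search backend is passed in as a `count_results` callable because no real search API is implied here.

```python
def select_audio_track(candidates, count_results):
    """Return the candidate track whose search query yields the most results.

    `count_results` is a caller-supplied function (hypothetical) that takes a
    query string and returns the number of search results for it.
    """
    best_track = None
    best_count = -1
    for track in candidates:
        # Assumed search terms: the candidate's artist and title.
        query = " ".join([track["artist"], track["title"]])
        count = count_results(query)
        if count > best_count:
            best_track, best_count = track, count
    return best_track
```

The selected track would then be associated with the video content in a follow-up message.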

    Detection of inactive broadcasts during live stream ingestion
    28.
    Granted patent
    Legal status: In force

    Publication number: US09536151B1

    Publication date: 2017-01-03

    Application number: US14581789

    Filing date: 2014-12-23

    Applicant: Google Inc.

    CPC classification number: G06K9/00744 G06K9/00751 G10L19/018

    Abstract: Systems and methods are provided herein relating to real-time detection of inactive broadcasts during live stream ingestion. Both audio fingerprints and video fingerprints can be dynamically and continuously generated for a live stream ingestion. Sets of video fingerprints and sets of audio fingerprints can be continuously generated based on common successive overlapping time windows. A set of audio fingerprints and a set of video fingerprints can be associated with each time window. Video similarity scores and audio similarity scores can be generated for each time window to determine whether the stream is inactive or static during the time window. Only fingerprints relating to an active broadcast can be indexed in a fingerprint index.
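
The windowing logic can be sketched roughly as follows. The sketch assumes fingerprints are tuples of integers, that similarity is the fraction of equal positions between consecutive fingerprints, and that a window counts as inactive only when both its audio and its video are static. The abstract does not spell out how the two scores are combined. All names and the 0.95 threshold are hypothetical.

```python
def similarity(fp_a, fp_b):
    """Fraction of equal positions between two fingerprints (assumed metric)."""
    longest = max(len(fp_a), len(fp_b), 1)
    return sum(1 for x, y in zip(fp_a, fp_b) if x == y) / longest


def is_static(window, threshold=0.95):
    """True if every consecutive pair of fingerprints in the window is near-identical."""
    scores = [similarity(a, b) for a, b in zip(window, window[1:])]
    return bool(scores) and min(scores) >= threshold


def active_windows(video_fps, audio_fps, size=3, step=1):
    """Yield (start, video, audio) for overlapping windows judged active.

    Windows are overlapping because `step` is smaller than `size`; only the
    returned windows would be added to a fingerprint index.
    """
    indexed = []
    last_start = min(len(video_fps), len(audio_fps)) - size
    for start in range(0, last_start + 1, step):
        video = video_fps[start:start + size]
        audio = audio_fps[start:start + size]
        # Assumption: inactive means both modalities are static in this window.
        if not (is_static(video) and is_static(audio)):
            indexed.append((start, video, audio))
    return indexed
```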

    Hold back and real time ranking of results in a streaming matching system
    29.
    Granted patent
    Legal status: In force

    Publication number: US09529907B2

    Publication date: 2016-12-27

    Application number: US13732108

    Filing date: 2012-12-31

    Applicant: Google Inc.

    CPC classification number: G06F17/30743 G06F17/30758 G06F17/30769 G10L25/54

    Abstract: A matching system receives probe audio samples for comparison to references of a data store. Comparisons are generated to determine a sufficient match for a portion or a first amount of the probe sample. Ranking scores are assigned to the resulting match references. The match references are retained, unless meeting a score threshold. Comparisons are continually generated with second amounts of the probe sample and the retained references are updated with further matching references assigned ranking scores. The retained results are merged and determined to satisfy a score threshold for release as outputted results for matching references.
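
The hold-back behavior can be illustrated with a small sketch. It assumes each successive portion of the probe sample yields a dictionary of reference identifiers and ranking scores, and that merging means summing scores per reference. That merge rule is an assumption, as are the names `stream_matches` and `release_threshold`.

```python
def stream_matches(chunk_results, release_threshold):
    """Hold back match references until their merged score reaches the threshold.

    `chunk_results` is an iterable of {reference_id: score} dictionaries, one
    per successive portion of the probe sample. Returns reference ids in the
    order they were released.
    """
    retained = {}   # reference id -> merged score so far (held back)
    released = []   # reference ids already output as results
    for results in chunk_results:
        for ref, score in results.items():
            if ref in released:
                continue
            # Assumed merge rule: accumulate scores across portions.
            retained[ref] = retained.get(ref, 0.0) + score
        # Release the best-ranked retained references that meet the threshold.
        for ref in sorted(retained, key=retained.get, reverse=True):
            if retained[ref] >= release_threshold:
                released.append(ref)
                del retained[ref]
    return released
```

Holding results back this way trades a little latency for fewer premature or low-confidence matches.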

    Method for Siren Detection Based on Audio Samples
    30.
    Patent application
    Legal status: In force

    Publication number: US20160155452A1

    Publication date: 2016-06-02

    Application number: US15004232

    Filing date: 2016-01-22

    Applicant: Google Inc.

    Abstract: The present disclosure provides methods and apparatuses that enable an apparatus to identify sounds from short samples of audio. The apparatus may capture an audio sample and create several audio signals of different lengths, each containing audio from the captured audio sample. The apparatus may process the several audio signals in an attempt to identify features of the audio signal that indicate an identification of the captured sound. Because shorter audio samples can be analyzed more quickly, the system may first process the shortest audio samples in order to quickly identify features of the audio signal. Because longer audio samples contain more information, the system may be able to more accurately identify features in the audio signal in longer audio samples. However, analyzing longer audio signals takes more buffered audio than identifying features in shorter signals. Therefore, the present system attempts to identify features in the shortest audio signals first.
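
The shortest-first strategy can be sketched as below. The sketch assumes the captured sample is a list of audio frames, that each shorter signal is the most recent tail of that sample, and that a caller-supplied `classify` function returns a `(label, confidence)` pair. The function names and the 0.8 confidence cutoff are hypothetical.

```python
def make_signals(sample, lengths):
    """Build signals of increasing length, each taken from the end of the sample."""
    return [sample[-n:] for n in sorted(lengths) if 0 < n <= len(sample)]


def identify_sound(sample, lengths, classify, min_confidence=0.8):
    """Try the shortest signal first and stop at the first confident result.

    Returns (label, frames_used), or (None, 0) if no signal length produced a
    confident identification.
    """
    for signal in make_signals(sample, lengths):
        label, confidence = classify(signal)
        if confidence >= min_confidence:
            return label, len(signal)
    return None, 0
```

A clear siren is identified from the short signal with little buffering, while ambiguous sounds fall through to the longer, more informative signals.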
