Patent search ap:("GOOGLE INC.") AND inv:"Matthew Sharifi" Page 3

21.

发明授权
Providing pre-computed hotword models 有权

公开(公告)号：US09646612B2

公开(公告)日：2017-05-09

申请号：US15288241

申请日：2016-10-07

Applicant: Google Inc.

Inventor： Matthew Sharifi

IPC: G10L15/00 , G10L15/22 , G06F3/16 , G10L15/08 , G06F3/0484 , G10L15/30

CPC classification number: G10L15/22 , G06F3/04842 , G06F3/167 , G10L15/063 , G10L15/08 , G10L15/18 , G10L15/265 , G10L15/30 , G10L2015/0631 , G10L2015/0638 , G10L2015/088 , G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, for each of multiple words or sub-words, audio data corresponding to multiple users speaking the word or sub-word; training, for each of the multiple words or sub-words, a pre-computed hotword model for the word or sub-word based on the audio data for the word or sub-word; receiving a candidate hotword from a computing device; identifying one or more pre-computed hotword models that correspond to the candidate hotword; and providing the identified, pre-computed hotword models to the computing device.

22.

发明申请
HOTWORD RECOGNITION 有权

公开(公告)号：US20170110144A1

公开(公告)日：2017-04-20

申请号：US14943287

申请日：2015-11-17

Applicant: Google Inc.

Inventor： Matthew Sharifi , Jakob Nicolaus Foerster

IPC: G10L25/51 , G10L17/08 , G10L15/18 , G10L15/02

CPC classification number: G10L15/22 , G06F21/31 , G06F21/32 , G10L15/02 , G10L15/08 , G10L15/1815 , G10L17/08 , G10L25/51 , G10L2015/088 , G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data corresponding to an utterance, determining that the audio data corresponds to a hotword, generating a hotword audio fingerprint of the audio data that is determined to correspond to the hotword, comparing the hotword audio fingerprint to one or more stored audio fingerprints of audio data that was previously determined to correspond to the hotword, detecting whether the hotword audio fingerprint matches a stored audio fingerprint of audio data that was previously determined to correspond to the hotword based on whether the comparison indicates a similarity between the hotword audio fingerprint and one of the one or more stored audio fingerprints that satisfies a predetermined threshold, and in response to detecting that the hotword audio fingerprint matches a stored audio fingerprint, disabling access to a computing device into which the utterance was spoken.

23.

发明申请
HOTWORD RECOGNITION 有权

公开(公告)号：US20170110130A1

公开(公告)日：2017-04-20

申请号：US15176830

申请日：2016-06-08

Applicant: Google Inc.

Inventor： Matthew Sharifi , Jakob Nicolaus Foerster

IPC: G10L17/02 , G10L15/18 , G10L15/28 , G10L25/51

CPC classification number: G10L17/02 , G06F21/32 , G07C9/00071 , G07C9/00158 , G10L15/1815 , G10L15/22 , G10L15/285 , G10L17/22 , G10L19/018 , G10L25/51 , G10L2015/088

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data corresponding to an utterance, determining that the audio data corresponds to a hotword, generating a hotword audio fingerprint of the audio data that is determined to correspond to the hotword, comparing the hotword audio fingerprint to one or more stored audio fingerprints of audio data that was previously determined to correspond to the hotword, detecting whether the hotword audio fingerprint matches a stored audio fingerprint of audio data that was previously determined to correspond to the hotword based on whether the comparison indicates a similarity between the hotword audio fingerprint and one of the one or more stored audio fingerprints that satisfies a predetermined threshold, and in response to detecting that the hotword audio fingerprint matches a stored audio fingerprint, disabling access to a computing device into which the utterance was spoken.

24.

发明授权
Multiple speech locale-specific hotword classifiers for selection of a speech locale 有权
Title translation: 用于选择语音区域的多语言区域特定的词典分类器

公开(公告)号：US09589564B2

公开(公告)日：2017-03-07

申请号：US14173264

申请日：2014-02-05

Applicant: Google Inc.

Inventor： Matthew Sharifi

IPC: G10L15/00 , G10L15/26 , G10L15/08 , G10L15/32

CPC classification number: G10L15/22 , G10L15/005 , G10L15/02 , G10L15/08 , G10L15/26 , G10L15/32 , G10L2015/088 , G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech in an utterance. The methods, systems, and apparatus include actions of receiving an utterance and obtaining acoustic features from the utterance. Further actions include providing the acoustic features from the utterance to multiple speech locale-specific hotword classifiers. Each speech locale-specific hotword classifier (i) may be associated with a respective speech locale, and (ii) may be configured to classify audio features as corresponding to, or as not corresponding to, a respective predefined term. Additional actions may include selecting a speech locale for use in transcribing the utterance based on one or more results from the multiple speech locale-specific hotword classifiers in response to providing the acoustic features from the utterance to the multiple speech locale-specific hotword classifiers. Further actions may include selecting parameters for automated speech recognition based on the selected speech locale.

Abstract translation: 方法，系统和装置，包括在计算机存储介质上编码的用于以话语识别语音的计算机程序。方法，系统和装置包括从话语中接收发音和获得声学特征的动作。进一步的动作包括将声音特征从话语提供给多个语音区域特定的词典分类器。每个语音区域特定的词典分类器（i）可以与相应的语音区域相关联，并且（ii）可以被配置为将音频特征分类为对应于或相应于相应的预定义术语。另外的动作可以包括：响应于将声学特征从话语提供给多语音区域特定的词典分类器，基于来自多个语音区域特定的词典分类器的一个或多个结果来选择用于转录话语的语音区域。进一步的动作可以包括基于所选择的语音区域来选择用于自动语音识别的参数。

25.

发明授权
Melody recognition systems 有权
Title translation: 旋律识别系统

公开(公告)号：US09569532B1

公开(公告)日：2017-02-14

申请号：US14300600

申请日：2014-06-10

Applicant: Google Inc.

Inventor： Matthew Sharifi , Dominik Roblek , Vera Dron , Ioannis Agiomyrgiannakis

IPC: H04N5/92 , G06F17/30

CPC classification number: G06F17/3082 , G06F17/30758 , G06F17/30787 , G10H1/0008 , G10H1/368 , G10H2240/075 , G10H2240/135 , G10H2240/141

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting, from among a collection of videos, a set of candidate videos that (i) are identified as being associated with a particular song, and (ii) are classified as a cappella video recordings; extracting, from each of the candidate videos of the set, a monophonic melody line from an audio channel of the candidate video; selecting, from among the set of candidate videos, a subset of the candidate videos based on a similarity of the monophonic melody line of the candidate videos of the subset with each other; and providing, to a recognizer that recognizes songs from sounds produced by a human voice, (i) an identifier of the particular song, and (ii) one or more of the monophonic melody lines of the candidate videos of the subset.

Abstract translation: 方法，系统和装置，包括在计算机存储介质上编码的计算机程序，用于从视频集合中选择一组候选视频，所述一组候选视频被识别为与特定歌曲相关联，以及（ii）被列为无伴奏视频录像; 从所述候选视频的音频频道中提取来自所述组的每个候选视频的单声道旋律线; 基于所述子集的候选视频的单声道旋律线的相似度，从所述一组候选视频中选择所述候选视频的子集; 以及提供识别器，其识别由人类声音产生的声音的歌曲，（i）特定歌曲的标识符，以及（ii）该子集的候选视频的一个或多个单声道旋律线。

26.

发明申请
COLLABORATIVE LANGUAGE MODEL BIASING 有权
Title translation: 协同语言模型偏差

公开(公告)号：US20170032781A1

公开(公告)日：2017-02-02

申请号：US14811190

申请日：2015-07-28

Applicant: Google Inc.

Inventor： Matthew Sharifi , Jakob Nicolaus Foerster

IPC: G10L15/197 , G10L15/24

CPC classification number: G10L15/197 , G06F17/277 , G06F17/2785 , G10L2015/227

Abstract: Methods, including computer programs encoded on a computer storage medium, for collaborative language model biasing. In one aspect, a method includes receiving (i) data including a set of terms associated with a target user, and, (ii) from each of multiple other users, data including a set of terms associated with the other user, selecting a particular other user based at least on comparing the set of terms associated with the target user to the sets of terms associated with the other users, selecting one or more terms from the set of terms that is associated with the particular other user, obtaining, based on the selected terms that are associated with the particular other user, a biased language model, and providing the biased language model to an automated speech recognizer.

Abstract translation: 方法，包括在计算机存储介质上编码的计算机程序，用于协作语言模型偏移。在一个方面，一种方法包括接收（i）包括与目标用户相关联的一组术语的数据，以及（ii）来自多个其他用户中的每一个的数据包括与其他用户相关联的一组术语，选择特定的至少基于将与目标用户相关联的术语集合与与其他用户相关联的术语集进行比较来选择其他用户，从与特定其他用户相关联的术语集中选择一个或多个术语，基于与特定的其他用户相关联的所选术语，有偏差的语言模型，以及向偏移语言识别器提供有偏差的语言模型。

27.

发明授权
Associating audio tracks with video content 有权
Title translation: 将音轨与视频内容相关联

公开(公告)号：US09542488B2

公开(公告)日：2017-01-10

申请号：US13957944

申请日：2013-08-02

Applicant: Google Inc.

Inventor： Matthew Sharifi

IPC: G06F7/00 , G06F17/30 , H04N21/439 , H04N21/4722 , H04N21/81

CPC classification number: G06F17/30769 , G06F17/30743 , G06F17/30749 , H04N21/4394 , H04N21/4722 , H04N21/8113

Abstract: In one example, a system comprises at least one processor configured to determine an indication of an audio portion of video content, determine, based at least in part on the indication, one or more candidate audio tracks, determine, based at least in part on the one or more candidate audio tracks, one or more search terms, and provide a search query that includes the search terms. The at least one processor may be further configured to, in response to the search query, receive a response that indicates a number of search results, wherein each one of the search results is associated with content that includes the one or more search terms, select, based at least in part on the response, a particular audio track of the one or more candidate audio tracks, and send a message that associates the video content with at least the particular audio track.

Abstract translation: 在一个示例中，系统包括被配置为确定视频内容的音频部分的指示的至少一个处理器，至少部分地基于指示确定一个或多个候选音轨，至少部分地基于一个或多个候选音轨，一个或多个搜索词，并提供包括搜索词的搜索查询。所述至少一个处理器还可以被配置为响应于搜索查询而接收指示搜索结果的数量的响应，其中每个搜索结果与包括一个或多个搜索项的内容相关联，选择至少部分地基于所述响应，所述一个或多个候选音频轨道的特定音轨，并且发送将所述视频内容与至少所述特定音频轨道相关联的消息。

28.

发明授权
Detection of inactive broadcasts during live stream ingestion 有权
Title translation: 在实况流摄入过程中检测无效广播

公开(公告)号：US09536151B1

公开(公告)日：2017-01-03

申请号：US14581789

申请日：2014-12-23

Applicant: Google Inc.

Inventor： Gheorghe Postelnicu , Matthew Sharifi

IPC: G06K9/00 , G10L19/018

CPC classification number: G06K9/00744 , G06K9/00751 , G10L19/018

Abstract: Systems and methods are provided herein relating to real-time detection of inactive broadcasts during live stream ingestion. Both audio fingerprints and video fingerprints can be dynamically and continuously generated for a live stream ingestion. Sets of video fingerprints and sets of audio fingerprints can be continuously generated based on common successive overlapping time windows. A set of audio fingerprints and a set of video fingerprints can be associated with each time window. Video similarity scores and audio similarity scores can be generates for each time window to determine whether the stream is inactive or static during the time window. Only fingerprints relating to an active broadcast can be indexed in a fingerprint index.

Abstract translation: 本文提供的系统和方法涉及在实时流摄取期间实时检测无效广播。可以动态连续地生成音频指纹和视频指纹，用于实况流摄取。可以基于共同的连续重叠时间窗口连续地生成视频指纹集和音频指纹集。一组音频指纹和一组视频指纹可以与每个时间窗口相关联。可以为每个时间窗口生成视频相似度分数和音频相似性分数，以在时间窗口中确定流是不活动还是静态。只有与主动广播相关的指纹才能在指纹索引中索引。

29.

发明授权
Hold back and real time ranking of results in a streaming matching system 有权
Title translation: 在流媒体匹配系统中保持结果的实时排名

公开(公告)号：US09529907B2

公开(公告)日：2016-12-27

申请号：US13732108

申请日：2012-12-31

Applicant: Google Inc.

Inventor： Dominik Roblek , Matthew Sharifi

IPC: G06F17/30 , G10L25/54

CPC classification number: G06F17/30743 , G06F17/30758 , G06F17/30769 , G10L25/54

Abstract: A matching system receives probe audio samples for comparison to references of a data store. Comparisons are generated to determine a sufficient match for a portion or a first amount of the probe sample. Ranking scores are assigned to the resulting match references. The match references are retained, unless meeting a score threshold. Comparisons are continually generated with second amounts of the probe sample and the retained references are updated with further matching references assigned ranking scores. The retained results are merged and determined to satisfy a score threshold for release as outputted results for matching references.

Abstract translation: 匹配系统接收探针音频样本，以便与数据存储的引用进行比较。生成比较以确定探针样品的一部分或第一量的足够的匹配。排名得分被分配给结果匹配引用。匹配引用被保留，除非满足分数阈值。使用第二量的探针样品不断产生比较，并且使用进一步的匹配参考指定排名分数来更新保留的参考。保留的结果被合并并确定为满足用于匹配引用的输出结果的释放分数阈值。

30.

发明申请
Method for Siren Detection Based on Audio Samples 有权
Title translation: 基于音频样本的警笛检测方法

公开(公告)号：US20160155452A1

公开(公告)日：2016-06-02

申请号：US15004232

申请日：2016-01-22

Applicant: Google Inc.

Inventor： Matthew Sharifi , Dominick Roblek

IPC: G10L19/06 , G08B29/18 , G08B3/10 , G10L19/022 , G10L25/51

CPC classification number: G10L19/06 , G06F17/30743 , G08B3/10 , G08B29/185 , G10L19/022 , G10L25/51 , H04R29/00

Abstract: The present disclosure provides methods and apparatuses that enable an apparatus to identify sounds from short samples of audio. The apparatus may capture an audio sample and create several audio signals of different lengths, each containing audio from the captured audio sample. The apparatus my process the several audio signals in an attempt to identify features of the audio signal that indicate an identification of the captured sound. Because shorter audio samples can be analyzed more quickly, the system may first process the shortest audio samples in order to quickly identify features of the audio signal. Because longer audio samples contain more information, the system may be able to more accurately identify features in the audio signal in longer audio samples. However, analyzing longer audio signals takes more buffered audio than identifying features in shorter signals. Therefore, the present system attempts to identify features in the shortest audio signals first.

Abstract translation: 本公开提供使装置能够从短音频样本识别声音的方法和装置。该设备可以捕获音频样本并创建不同长度的多个音频信号，每个音频信号包含来自捕获的音频样本的音频。该设备我的处理多个音频信号以尝试识别音频信号的特征，其指示捕获的声音的识别。因为可以更快地分析较短的音频样本，所以系统可以首先处理最短的音频样本，以便快速识别音频信号的特征。因为较长的音频样本包含更多信息，所以系统可能能够更准确地识别较长音频样本中的音频信号中的特征。然而，分析更长的音频信号比识别较短信号中的特征需要更多的缓冲音频。因此，本系统首先尝试识别最短音频信号中的特征。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification