OBJECT DETECTION USING NEURAL NETWORK SYSTEMS

    Publication number: US20190019050A1

    Publication date: 2019-01-17

    Application number: US15650790

    Application date: 2017-07-14

    Applicant: Google Inc.

    Abstract: Systems, methods, and apparatus, including computer programs encoded on a computer storage medium. In one aspect, a system includes initial neural network layers configured to: receive an input image, and process the input image to generate a plurality of first feature maps that characterize the input image; a location generating convolutional neural network layer configured to perform a convolution on the representation of the first plurality of feature maps to generate data defining a respective location of each of a predetermined number of bounding boxes in the input image, wherein each bounding box identifies a respective first region of the input image; and a confidence score generating convolutional neural network layer configured to perform a convolution on the representation of the first plurality of feature maps to generate a confidence score for each of the predetermined number of bounding boxes in the input image.
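
    The two convolutional heads described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes 1x1 convolutions, one bounding box per spatial cell of the feature maps, four location values per box, and a sigmoid on the confidence output. The names `conv1x1` and `detect_boxes`, and all weights, are hypothetical.

```python
import math


def conv1x1(feature_maps, weights, biases):
    """Apply a 1x1 convolution to feature maps laid out as [channel][row][col].

    Each output channel is a bias plus a weighted sum of the input channels
    at the same pixel.
    """
    channels = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    out = []
    for w_row, b in zip(weights, biases):
        plane = [
            [b + sum(w_row[c] * feature_maps[c][y][x] for c in range(channels))
             for x in range(width)]
            for y in range(height)
        ]
        out.append(plane)
    return out


def detect_boxes(feature_maps, loc_w, loc_b, conf_w, conf_b):
    """Return one (location, confidence) pair per spatial cell.

    Assumption: the location head has 4 output channels and the confidence
    head has 1, so the number of boxes is fixed at height * width.
    """
    loc = conv1x1(feature_maps, loc_w, loc_b)
    conf = conv1x1(feature_maps, conf_w, conf_b)
    height, width = len(loc[0]), len(loc[0][0])
    boxes = []
    for y in range(height):
        for x in range(width):
            coords = tuple(loc[k][y][x] for k in range(4))
            score = 1.0 / (1.0 + math.exp(-conf[0][y][x]))  # sigmoid
            boxes.append((coords, score))
    return boxes


# Two feature maps of size 1x2, with hand-picked illustrative weights.
features = [[[1.0, 2.0]], [[3.0, 4.0]]]
boxes = detect_boxes(
    features,
    loc_w=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]],
    loc_b=[0.0, 0.0, 0.0, 0.5],
    conf_w=[[0.0, 0.0]],
    conf_b=[0.0],
)
```

    Both heads read the same feature maps, so the number of boxes is fixed by the feature-map size rather than by the image content.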

    Personalized entity repository
    12.
    Granted patent

    Publication number: US10178527B2

    Publication date: 2019-01-08

    Application number: US14962415

    Application date: 2015-12-08

    Applicant: GOOGLE INC.

    Abstract: Systems and methods are provided for a personalized entity repository. For example, a computing device comprises a personalized entity repository having fixed sets of entities from an entity repository stored at a server, a processor, and memory storing instructions that cause the computing device to identify fixed sets of entities that are relevant to a user based on context associated with the computing device, rank the fixed sets by relevancy, and update the personalized entity repository using selected sets determined based on the rank and on set usage parameters applicable to the user. In another example, a method includes generating fixed sets of entities from an entity repository, including location-based sets and topic-based sets, and providing a subset of the fixed sets to a client, the client requesting the subset based on the client's location and on items identified in content generated for display on the client.
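
    The rank-then-select step in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: relevancy scores are taken as given, and the "set usage parameters" are reduced to a single hypothetical budget, `max_entities`. The function name `select_sets` is also hypothetical.

```python
def select_sets(fixed_sets, relevancy, max_entities):
    """Pick fixed sets in descending relevancy order while staying in budget.

    fixed_sets: dict mapping a set name to its list of entities.
    relevancy: dict mapping a set name to a relevancy score (assumed given).
    max_entities: assumed usage parameter capping the total entities stored.
    """
    ranked = sorted(fixed_sets, key=lambda name: relevancy.get(name, 0.0), reverse=True)
    chosen, used = [], 0
    for name in ranked:
        size = len(fixed_sets[name])
        if used + size <= max_entities:  # skip sets that would exceed the budget
            chosen.append(name)
            used += size
    return chosen


fixed_sets = {
    "nyc": ["a", "b", "c"],           # location-based set
    "jazz": ["d", "e"],               # topic-based set
    "soccer": ["f", "g", "h", "i"],   # topic-based set
}
relevancy = {"nyc": 0.9, "jazz": 0.4, "soccer": 0.7}
selected = select_sets(fixed_sets, relevancy, max_entities=5)
```

    In this example `soccer` ranks above `jazz` but is skipped because it would exceed the budget, so the selection is `nyc` followed by `jazz`.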

    Melody recognition systems
    14.
    Granted patent
    Status: In force

    Publication number: US09569532B1

    Publication date: 2017-02-14

    Application number: US14300600

    Application date: 2014-06-10

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting, from among a collection of videos, a set of candidate videos that (i) are identified as being associated with a particular song, and (ii) are classified as a cappella video recordings; extracting, from each of the candidate videos of the set, a monophonic melody line from an audio channel of the candidate video; selecting, from among the set of candidate videos, a subset of the candidate videos based on a similarity of the monophonic melody line of the candidate videos of the subset with each other; and providing, to a recognizer that recognizes songs from sounds produced by a human voice, (i) an identifier of the particular song, and (ii) one or more of the monophonic melody lines of the candidate videos of the subset.
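
    The subset-selection step, which keeps candidate videos whose melody lines agree with each other, can be sketched in Python. The abstract does not specify the similarity measure. This sketch assumes melody lines are lists of integer pitches and uses the fraction of matching positions as a stand-in. The names `similarity` and `select_consistent` and the threshold are hypothetical.

```python
def similarity(a, b):
    """Fraction of aligned positions where two pitch sequences agree."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / n


def select_consistent(melodies, threshold):
    """Keep melodies whose mean similarity to all other candidates meets the threshold.

    melodies: dict mapping a video id to its monophonic melody line.
    """
    kept = {}
    for video_id, line in melodies.items():
        scores = [similarity(line, other)
                  for other_id, other in melodies.items() if other_id != video_id]
        if scores and sum(scores) / len(scores) >= threshold:
            kept[video_id] = line
    return kept


melodies = {
    "a": [60, 62, 64, 65],
    "b": [60, 62, 64, 65],
    "c": [60, 62, 64, 67],
    "d": [70, 71, 72, 73],  # outlier, likely a different tune
}
subset = select_consistent(melodies, threshold=0.4)
```

    Video `d` shares no pitches with the others and is dropped. The surviving melody lines are the ones that would be passed to the recognizer together with the song identifier.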

    Hold back and real time ranking of results in a streaming matching system
    15.
    Granted patent
    Status: In force

    Publication number: US09529907B2

    Publication date: 2016-12-27

    Application number: US13732108

    Application date: 2012-12-31

    Applicant: Google Inc.

    CPC classification number: G06F17/30743 G06F17/30758 G06F17/30769 G10L25/54

    Abstract: A matching system receives probe audio samples for comparison to references of a data store. Comparisons are generated to determine a sufficient match for a portion or a first amount of the probe sample. Ranking scores are assigned to the resulting match references. The match references are retained, unless meeting a score threshold. Comparisons are continually generated with second amounts of the probe sample and the retained references are updated with further matching references assigned ranking scores. The retained results are merged and determined to satisfy a score threshold for release as outputted results for matching references.
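
    The hold-back behaviour in the abstract (retain match references, merge scores as more of the probe arrives, release once a threshold is met) can be sketched as a small Python class. Merging by summing scores is an assumption made for illustration. The class name `HoldBackRanker` and the threshold value are hypothetical.

```python
class HoldBackRanker:
    """Retain match references until their merged score reaches a release threshold."""

    def __init__(self, release_threshold):
        self.release_threshold = release_threshold
        self.retained = {}  # reference id -> merged ranking score

    def update(self, chunk_matches):
        """Merge one probe chunk's (reference_id, score) pairs.

        Returns the references released by this update, best score first.
        Released references are removed from the retained set.
        """
        for ref_id, score in chunk_matches:
            # Assumption: scores from successive chunks are merged by addition.
            self.retained[ref_id] = self.retained.get(ref_id, 0.0) + score
        released = [(ref_id, score) for ref_id, score in self.retained.items()
                    if score >= self.release_threshold]
        released.sort(key=lambda item: item[1], reverse=True)
        for ref_id, _ in released:
            del self.retained[ref_id]
        return released


ranker = HoldBackRanker(release_threshold=1.0)
first = ranker.update([("song_a", 0.5), ("song_b", 0.25)])   # nothing released yet
second = ranker.update([("song_a", 0.75)])                   # song_a crosses the threshold
```

    Holding results back until enough of the probe has been matched trades a little latency for fewer premature, low-confidence outputs.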

    TEXT-DEPENDENT SPEAKER IDENTIFICATION
    16.
    Patent application
    Status: In force

    Publication number: US20150294670A1

    Publication date: 2015-10-15

    Application number: US14612830

    Application date: 2015-02-03

    Applicant: Google Inc.

    CPC classification number: G10L17/18 G10L17/005

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speaker verification. The methods, systems, and apparatus include actions of inputting speech data that corresponds to a particular utterance to a first neural network and determining an evaluation vector based on output at a hidden layer of the first neural network. Additional actions include obtaining a reference vector that corresponds to a past utterance of a particular speaker. Further actions include inputting the evaluation vector and the reference vector to a second neural network that is trained on a set of labeled pairs of feature vectors to identify whether speakers associated with the labeled pairs of feature vectors are the same speaker. More actions include determining, based on an output of the second neural network, whether the particular utterance was likely spoken by the particular speaker.
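
    The final comparison step, in which an evaluation vector and a reference vector are fed to a second network that decides whether the speakers match, can be sketched in Python. The abstract describes a trained neural network. This sketch substitutes a single logistic unit over the element-wise product of the two vectors, with hand-picked rather than trained weights, purely as a stand-in. All names and values are hypothetical.

```python
import math


def same_speaker_probability(evaluation_vec, reference_vec, weights, bias):
    """Score a pair of vectors with one logistic unit (stand-in for the second network)."""
    pair_features = [e * r for e, r in zip(evaluation_vec, reference_vec)]
    logit = bias + sum(w * f for w, f in zip(weights, pair_features))
    return 1.0 / (1.0 + math.exp(-logit))


def is_same_speaker(evaluation_vec, reference_vec, weights, bias, threshold=0.5):
    """Decide whether the utterance was likely spoken by the reference speaker."""
    return same_speaker_probability(evaluation_vec, reference_vec, weights, bias) > threshold


weights, bias = [2.0, 2.0], -1.0          # illustrative values, not trained
reference = [1.0, 0.0]                     # vector from a past utterance
accept = is_same_speaker([1.0, 0.0], reference, weights, bias)
reject = is_same_speaker([0.0, 1.0], reference, weights, bias)
```

    In the system described by the abstract, the evaluation vector comes from a hidden layer of the first network, and the pair classifier is trained on labeled pairs of feature vectors.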

    Segment-based speaker verification using dynamically generated phrases
    17.
    Granted patent
    Status: In force

    Publication number: US08812320B1

    Publication date: 2014-08-19

    Application number: US14242098

    Application date: 2014-04-01

    Applicant: Google Inc.

    CPC classification number: G10L17/24 G10L15/02 G10L17/04 G10L2015/025

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
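
    The phrase-selection step (identify subwords, then obtain a candidate phrase that includes at least some of them) can be sketched in Python. The abstract does not say how candidates are compared. This sketch assumes each candidate phrase comes with a precomputed set of subwords and picks the phrase covering the most identified subwords. The name `pick_verification_phrase` and the example data are hypothetical.

```python
def pick_verification_phrase(needed_subwords, candidates):
    """Return the candidate phrase that covers the most needed subwords.

    needed_subwords: set of subwords the verification phrase should include.
    candidates: dict mapping a phrase to the set of subwords it contains.
    Returns None when no candidate covers any needed subword.
    """
    best_phrase, best_cover = None, 0
    for phrase, subwords in candidates.items():
        cover = len(needed_subwords & subwords)
        if cover > best_cover:
            best_phrase, best_cover = phrase, cover
    return best_phrase


needed = {"pea", "nut", "but"}
candidates = {
    "peanut": {"pea", "nut"},
    "donut": {"do", "nut"},
    "hello": {"hel", "lo"},
}
phrase = pick_verification_phrase(needed, candidates)
```

    Here `peanut` covers two of the three identified subwords and would be returned as the verification phrase.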

    Machine Learning to Generate Music from Text
    18.
    Patent application

    Publication number: US20180190249A1

    Publication date: 2018-07-05

    Application number: US15394895

    Application date: 2016-12-30

    Applicant: Google Inc.

    Abstract: The present disclosure provides systems and methods that leverage one or more machine-learned models to generate music from text. In particular, a computing system can include a music generation model that is operable to extract one or more structural features from an input text. The one or more structural features can be indicative of a structure associated with the input text. The music generation model can generate a musical composition from the input text based at least in part on the one or more structural features. For example, the music generation model can generate a musical composition that exhibits a musical structure that mimics or otherwise corresponds to the structure associated with the input text. For example, the music generation model can include a machine-learned audio generation model. In such fashion, the systems and methods of the present disclosure can generate music that exhibits a globally consistent theme and/or structure.

    Personalized Entity Repository
    19.
    Patent application

    Publication number: US20170118576A1

    Publication date: 2017-04-27

    Application number: US14962415

    Application date: 2015-12-08

    Applicant: GOOGLE INC.

    Abstract: Systems and methods are provided for a personalized entity repository. For example, a computing device comprises a personalized entity repository having fixed sets of entities from an entity repository stored at a server, a processor, and memory storing instructions that cause the computing device to identify fixed sets of entities that are relevant to a user based on context associated with the computing device, rank the fixed sets by relevancy, and update the personalized entity repository using selected sets determined based on the rank and on set usage parameters applicable to the user. In another example, a method includes generating fixed sets of entities from an entity repository, including location-based sets and topic-based sets, and providing a subset of the fixed sets to a client, the client requesting the subset based on the client's location and on items identified in content generated for display on the client.

    Noise based interest point density pruning
    20.
    Granted patent
    Status: In force

    Publication number: US09411884B1

    Publication date: 2016-08-09

    Application number: US14325115

    Application date: 2014-07-07

    Applicant: Google Inc.

    CPC classification number: G06F17/30743 G06F17/30758 G10L25/54

    Abstract: Systems and methods for noise based interest point density pruning are disclosed herein. The systems include determining an amount of noise in an audio sample and adjusting the amount of interest points within an audio sample fingerprint based on the amount of noise. Samples containing high amounts of noise correspondingly generate fingerprints with more interest points. The disclosed systems and methods allow reference fingerprints to be reduced in size while increasing the size of sample fingerprints. The benefits in scalability do not compromise the accuracy of an audio matching system using noise based interest point density pruning.
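
    The idea of scaling the interest-point count with the noise level can be sketched in Python. The abstract does not give the mapping from noise to density. This sketch assumes a noise level already normalized to the range 0 to 1, a linear mapping between a minimum and maximum count, and pruning that keeps the strongest points. The function names and default values are hypothetical.

```python
def target_point_count(noise_level, min_points=10, max_points=50):
    """Map a noise level in [0, 1] linearly to a number of interest points."""
    level = min(max(noise_level, 0.0), 1.0)  # clamp out-of-range inputs
    return round(min_points + level * (max_points - min_points))


def prune_interest_points(points, noise_level, min_points, max_points):
    """Keep the strongest interest points, with more kept for noisier samples.

    points: list of (time, frequency, strength) tuples.
    Returns the kept points in time order.
    """
    keep = target_point_count(noise_level, min_points, max_points)
    strongest = sorted(points, key=lambda p: p[2], reverse=True)[:keep]
    return sorted(strongest)


points = [(0, 100, 0.9), (1, 200, 0.1), (2, 300, 0.5), (3, 400, 0.7)]
clean = prune_interest_points(points, noise_level=0.0, min_points=2, max_points=4)
noisy = prune_interest_points(points, noise_level=1.0, min_points=2, max_points=4)
```

    A clean sample keeps only its two strongest points, while a noisy sample keeps all four, matching the abstract's statement that noisier samples generate fingerprints with more interest points.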
