Accurate extraction of chroma vectors from an audio signal

    Publication No.: US10297271B1

    Publication date: 2019-05-21

    Application No.: US15823357

    Filing date: 2017-11-27

    Applicant: Google Inc.

    Abstract: A matrix is generated that stores sinusoidal components evaluated for a given sample rate corresponding to the matrix. The matrix is then used to convert an audio signal to chroma vectors representing a set of “chromae” (frequencies of interest). The conversion of an audio signal portion into its chromae enables more meaningful analysis of the audio signal than would be possible using the signal data alone. The chroma vectors of the audio signal can be used to perform analyses such as comparisons with the chroma vectors obtained from other audio signals in order to identify audio matches.
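
The idea in this abstract, precomputing sinusoids for a fixed sample rate and projecting audio frames onto them, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names `build_sinusoid_matrix` and `chroma_vector`, the choice of twelve semitones starting at 440 Hz, the 8000 Hz sample rate, the 2048-sample frame, and the sum-to-one normalization. A fuller chroma extractor would typically also fold several octaves into each pitch class, which this sketch omits.

```python
import math

def build_sinusoid_matrix(freqs_hz, sample_rate, frame_len):
    """Precompute a (cosine row, sine row) pair per frequency of interest."""
    matrix = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f / sample_rate  # radians per sample
        cos_row = [math.cos(w * n) for n in range(frame_len)]
        sin_row = [math.sin(w * n) for n in range(frame_len)]
        matrix.append((cos_row, sin_row))
    return matrix

def chroma_vector(frame, matrix):
    """Project one audio frame onto each row pair; return magnitudes summing to 1."""
    mags = []
    for cos_row, sin_row in matrix:
        re = sum(x * c for x, c in zip(frame, cos_row))
        im = sum(x * s for x, s in zip(frame, sin_row))
        mags.append(math.hypot(re, im))
    total = sum(mags) or 1.0  # avoid dividing by zero on a silent frame
    return [m / total for m in mags]

# Illustrative frequencies of interest: 12 semitones upward from A4 = 440 Hz.
FREQS = [440.0 * 2 ** (k / 12) for k in range(12)]
SAMPLE_RATE = 8000
FRAME_LEN = 2048
MATRIX = build_sinusoid_matrix(FREQS, SAMPLE_RATE, FRAME_LEN)

# A pure 440 Hz test tone; its energy should land mostly in the first bin.
tone = [math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(FRAME_LEN)]
chroma = chroma_vector(tone, MATRIX)
```

Because the matrix depends only on the sample rate, frame length, and frequency set, it can be built once and reused for every frame of every signal sampled at that rate.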

    Machine-learned virtual sensor model for multiple sensors

    Publication No.: US20180189647A1

    Publication date: 2018-07-05

    Application No.: US15393322

    Filing date: 2016-12-29

    Applicant: Google, Inc.

    CPC classification number: G06N3/08 G05B13/0265 G06N3/00 G06N5/022

    Abstract: The present disclosure provides systems and methods that leverage machine learning to refine and/or predict sensor outputs for multiple sensors. In particular, systems and methods of the present disclosure can include and use a machine-learned virtual sensor model that has been trained to receive sensor data from multiple sensors that is indicative of one or more measured parameters in each sensor's physical environment, recognize correlations among sensor outputs of the multiple sensors, and, in response to receipt of the sensor data from the multiple sensors, output one or more virtual sensor output values. The one or more virtual sensor output values can include one or more refined sensor output values and/or one or more predicted future sensor output values.

    Multi-step sequence alignment
    3.
    Granted patent

    Publication No.: US09959448B2

    Publication date: 2018-05-01

    Application No.: US15251347

    Filing date: 2016-08-30

    Applicant: Google Inc.

    CPC classification number: G06K9/00087 G06K9/00758

    Abstract: A method of identifying similar media items is described. The method includes identifying a first multiplicity of fingerprints representative of content segments of variable duration for a first media item and a second multiplicity of fingerprints representative of content segments of variable duration for a second media item. The method further includes comparing, by a processing device, a first group of the first multiplicity of fingerprints to a second group of the second multiplicity of fingerprints to generate a first similarity score indicative of a similarity between the first group of fingerprints and the second group of fingerprints. The method also includes determining an alignment score for the first multiplicity of fingerprints and the second multiplicity of fingerprints using the first similarity score.
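
The abstract's two steps, scoring groups of fingerprints against each other and then deriving an alignment score from those group scores, can be sketched roughly as follows. The abstract does not say how fingerprints are represented or how the scores are computed, so all of this is assumed for illustration: 32-bit integer fingerprints, Hamming-based similarity, fixed-size groups, and an alignment score defined as the best mean group similarity over all relative offsets. The function names are hypothetical, and variable-duration segments are not modeled.

```python
def hamming_similarity(a, b, bits=32):
    """Fraction of matching bits between two integer fingerprints."""
    mask = (1 << bits) - 1
    return 1.0 - bin((a ^ b) & mask).count("1") / bits

def group_similarity(group_a, group_b, bits=32):
    """Mean position-wise similarity of two equally sized fingerprint groups."""
    pairs = list(zip(group_a, group_b))
    return sum(hamming_similarity(a, b, bits) for a, b in pairs) / len(pairs)

def alignment_score(fps_a, fps_b, group_size=2, bits=32):
    """Return (best mean group similarity, offset) over all relative offsets."""
    best_score, best_offset = 0.0, None
    lo = -(len(fps_b) - group_size)
    hi = len(fps_a) - group_size
    for offset in range(lo, hi + 1):
        scores = []
        # Walk fps_a in non-overlapping groups; compare with the shifted group in fps_b.
        for i in range(0, len(fps_a) - group_size + 1, group_size):
            j = i - offset
            if 0 <= j <= len(fps_b) - group_size:
                scores.append(group_similarity(
                    fps_a[i:i + group_size], fps_b[j:j + group_size], bits))
        if scores:
            score = sum(scores) / len(scores)
            if score > best_score:
                best_score, best_offset = score, offset
    return best_score, best_offset

# Hypothetical fingerprints: the second item is the first with its first two segments cut.
item_a = [0x0F0F0F0F, 0x12345678, 0xDEADBEEF, 0xCAFEBABE, 0x0BADF00D, 0xFEEDFACE]
item_b = item_a[2:]
score, offset = alignment_score(item_a, item_b)
```

Scoring groups rather than single fingerprints makes one chance collision less likely to dominate the result, which is one plausible reason for a multi-step design.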

    Derivation of probabilistic score for audio sequence alignment
    4.
    Granted patent
    Legal status: in force

    Publication No.: US09384758B2

    Publication date: 2016-07-05

    Application No.: US14754539

    Filing date: 2015-06-29

    Applicant: Google Inc.

    CPC classification number: G10L25/51 G06F17/30743 G10H2210/066 G10H2240/141

    Abstract: A match score provides a semantically meaningful quantification of the aural similarity of two chromae from two corresponding audio sequences. The match score can be applied to the chroma pairs of two corresponding audio sequences, and is independent of the lengths of the sequences, thereby permitting comparisons of matches across subsequences of different lengths. Accordingly, a single cutoff match score to identify “good” audio subsequence matches can be determined, and it has both good precision and good recall metrics. The function for determining the match score is derived by establishing a function PM indicating the probabilities that chroma correspondence scores indicate semantic correspondences and a function PR indicating the probabilities that chroma correspondence scores indicate random correspondences, and then repeatedly updating PM and the match function based on their existing values as applied to audio subsequences with known semantic correspondences.
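
The abstract names the two probability functions, PM and PR, but does not give the formula that combines them. One common way to combine a "match" distribution with a "random" distribution is a log-likelihood ratio, and averaging it over chroma pairs yields a score that does not grow with sequence length. The sketch below uses that approach as an assumption, not as the patent's method. The tables `P_M` and `P_R` hold made-up values over a hypothetical four-level correspondence score, and the iterative re-estimation of PM described in the abstract is not shown.

```python
import math

# Hypothetical probabilities over a quantized chroma correspondence score (0..3).
# P_M[s]: probability of score s when the pair is a true (semantic) match.
# P_R[s]: probability of score s when the pair is a random correspondence.
P_M = [0.05, 0.15, 0.30, 0.50]
P_R = [0.40, 0.30, 0.20, 0.10]

def pair_score(s):
    """Log-likelihood ratio for a single chroma pair's correspondence score."""
    return math.log(P_M[s] / P_R[s])

def match_score(correspondence_scores):
    """Mean per-pair log-likelihood ratio, so the result is length-independent."""
    total = sum(pair_score(s) for s in correspondence_scores)
    return total / len(correspondence_scores)

short_good = match_score([3, 3, 2])        # 3 well-matching pairs
long_good = match_score([3, 3, 2] * 10)    # the same pattern over 30 pairs
random_like = match_score([0, 1, 0, 1])    # scores typical of random pairs
```

Under these assumptions a positive score favors a semantic match and a negative score favors a random pairing, so a single cutoff can be applied to subsequences of any length.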

    DERIVATION OF PROBABILISTIC SCORE FOR AUDIO SEQUENCE ALIGNMENT
    5.
    Patent application
    Legal status: in force

    Publication No.: US20150380004A1

    Publication date: 2015-12-31

    Application No.: US14754539

    Filing date: 2015-06-29

    Applicant: Google Inc.

    CPC classification number: G10L25/51 G06F17/30743 G10H2210/066 G10H2240/141

    Abstract: A match score provides a semantically meaningful quantification of the aural similarity of two chromae from two corresponding audio sequences. The match score can be applied to the chroma pairs of two corresponding audio sequences, and is independent of the lengths of the sequences, thereby permitting comparisons of matches across subsequences of different lengths. Accordingly, a single cutoff match score to identify “good” audio subsequence matches can be determined, and it has both good precision and good recall metrics. The function for determining the match score is derived by establishing a function PM indicating the probabilities that chroma correspondence scores indicate semantic correspondences and a function PR indicating the probabilities that chroma correspondence scores indicate random correspondences, and then repeatedly updating PM and the match function based on their existing values as applied to audio subsequences with known semantic correspondences.

    MULTI-STEP SEQUENCE ALIGNMENT
    6.
    Patent application

    Publication No.: US20180053039A1

    Publication date: 2018-02-22

    Application No.: US15251347

    Filing date: 2016-08-30

    Applicant: Google Inc.

    CPC classification number: G06K9/00087 G06K9/00758

    Abstract: A method of identifying similar media items is described. The method includes identifying a first multiplicity of fingerprints representative of content segments of variable duration for a first media item and a second multiplicity of fingerprints representative of content segments of variable duration for a second media item. The method further includes comparing, by a processing device, a first group of the first multiplicity of fingerprints to a second group of the second multiplicity of fingerprints to generate a first similarity score indicative of a similarity between the first group of fingerprints and the second group of fingerprints. The method also includes determining an alignment score for the first multiplicity of fingerprints and the second multiplicity of fingerprints using the first similarity score.

    COLLABORATIVE VOICE CONTROLLED DEVICES
    7.
    Patent application

    Publication No.: US20180182397A1

    Publication date: 2018-06-28

    Application No.: US15387884

    Filing date: 2016-12-22

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for collaboration between multiple voice controlled devices are disclosed. In one aspect, a method includes the actions of identifying, by a first computing device, a second computing device that is configured to respond to a particular, predefined hotword; receiving audio data that corresponds to an utterance; receiving a transcription of additional audio data outputted by the second computing device in response to the utterance; based on the transcription of the additional audio data and based on the utterance, generating a transcription that corresponds to a response to the additional audio data; and providing, for output, the transcription that corresponds to the response.

    Enhanced Communication Assistance with Deep Learning

    Publication No.: US20180137400A1

    Publication date: 2018-05-17

    Application No.: US15349037

    Filing date: 2016-11-11

    Applicant: Google Inc.

    CPC classification number: G06N3/0445

    Abstract: The present disclosure provides systems and methods that leverage machine-learned models (e.g., neural networks) to provide enhanced communication assistance. In particular, the systems and methods of the present disclosure can include or otherwise leverage a machine-learned communication assistance model to detect problematic statements included in a communication and/or provide suggested replacement statements to respectively replace the problematic statements. In one particular example, the communication assistance model can include a long short-term memory recurrent neural network that detects an inappropriate tone or unintended meaning within a user-composed communication and provides one or more suggested replacement statements to replace the problematic statements.
