Objective speech quality metric
    1.
    Granted invention patent
    Objective speech quality metric (in force)

    Publication number: US09524733B2

    Publication date: 2016-12-20

    Application number: US13891978

    Filing date: 2013-05-10

    Applicant: Google Inc.

    CPC classification number: G10L25/60

    Abstract: Methods and systems are provided for using a model of human speech quality perception to provide an objective measure for predicting subjective quality assessments. A Virtual Speech Quality Objective Listener (ViSQOL) model is a signal-based full-reference metric that uses a spectro-temporal measure of similarity between a reference signal and test speech signal. Specifically, the model provides for the ability to detect and predict the level of clock drift, and determine whether such clock drift will impact a listener's quality of experience.


    Detection of chopped speech
    2.
    Granted invention patent
    Detection of chopped speech (in force)

    Publication number: US09263061B2

    Publication date: 2016-02-16

    Application number: US13899381

    Filing date: 2013-05-21

    Applicant: Google Inc.

    CPC classification number: G10L25/78 G10L21/0232 G10L25/60

    Abstract: Methods and systems are provided for detecting chop in an audio signal. A time-frequency representation, such as a spectrogram, is created for an audio signal and used to calculate a gradient of mean power per frame of the audio signal. Positive and negative gradients are defined for the signal based on the gradient of mean power, and a maximum overlap offset between the positive and negative gradients is determined by calculating a value that maximizes the cross-correlation of the positive and negative gradients. The negative gradient values may be combined (e.g., summed) with the overlap offset, and the combined values then compared with a threshold to estimate the amount of chop present in the audio signal. The chop detection model provided is low-complexity and is applicable to narrowband, wideband, and superwideband speech.
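
    The steps in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. Per-frame mean power is computed directly from the samples as a stand-in for averaging a spectrogram over frequency (for non-overlapping rectangular frames the two are roughly proportional by Parseval's theorem). The combining step is read as summing each negative-gradient value with the positive-gradient value found one overlap offset later, which is only one possible interpretation. The function names, frame length, lag range, and threshold are all hypothetical.

```python
import math


def frame_mean_power(samples, frame_len):
    """Mean power of each non-overlapping frame (stand-in for spectrogram mean power)."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def split_gradient(power):
    """Frame-to-frame power gradient, split into rises (pos) and drop magnitudes (neg)."""
    grad = [b - a for a, b in zip(power, power[1:])]
    pos = [max(g, 0.0) for g in grad]
    neg = [max(-g, 0.0) for g in grad]
    return pos, neg


def best_overlap_offset(pos, neg, max_lag):
    """Lag (in frames) that maximizes the cross-correlation of neg with pos."""
    def xcorr(lag):
        # Pairs neg[i] with pos[i + lag]: a drop followed `lag` frames later by a rise.
        return sum(n * p for n, p in zip(neg, pos[lag:]))
    return max(range(1, max_lag + 1), key=xcorr)


def detect_chop(samples, frame_len=80, max_lag=10, threshold=0.75):
    """Return (overlap offset in frames, number of frames flagged as chop)."""
    power = frame_mean_power(samples, frame_len)
    pos, neg = split_gradient(power)
    lag = best_overlap_offset(pos, neg, max_lag)
    # Assumed combining rule: add each drop to the rise found `lag` frames later.
    combined = [n + p for n, p in zip(neg, pos[lag:])]
    chop_frames = sum(1 for c in combined if c > threshold)
    return lag, chop_frames


# Synthetic check: a 400 Hz tone sampled at 8 kHz with three 3-frame dropouts.
tone = [math.sin(2 * math.pi * 400 * n / 8000) for n in range(8000)]
chopped = list(tone)
for start in (20, 50, 80):
    for n in range(start * 80, (start + 3) * 80):
        chopped[n] = 0.0

print(detect_chop(chopped))
print(detect_chop(tone))
```

    On the synthetic tone, each dropout produces a power drop followed three frames later by a matching rise, so the estimated offset equals the gap length and each gap yields one above-threshold combined value. Real speech would need a tuned threshold and a proper time-frequency front end.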

