-
Publication No.: US20060200346A1
Publication Date: 2006-09-07
Application No.: US11364251
Filing Date: 2006-02-28
Applicant: Wai-Yip Chan, Wei Zha, Mohamed El-Hennawey
Inventor: Wai-Yip Chan, Wei Zha, Mohamed El-Hennawey
IPC Classification: G10L15/20
CPC Classification: G10L25/69
Abstract: Auditory processing is used in conjunction with cognitive mapping to produce an objective measurement of speech quality that approximates a subjective measurement such as MOS. To generate a data model for measuring speech quality from a clean speech signal and a degraded speech signal, the clean speech signal is subjected to auditory processing to produce a subband decomposition of the clean speech signal; the degraded speech signal is subjected to auditory processing to produce a subband decomposition of the degraded speech signal; and cognitive mapping is performed based on the clean speech signal, the subband decomposition of the clean speech signal, and the subband decomposition of the degraded speech signal. Various statistical analysis techniques, such as MARS and CART, may be employed, either alone or in combination, to perform data mining for cognitive mapping. From the large number of features extracted from the distortion surface, MARS is employed to find a smaller subset of features to form the speech quality estimator. The subset of feature variables, together with the particular manner of combining them, is jointly optimized to produce a statistically consistent estimate (data model) of subjective opinion scores such as MOS.
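The abstract above names MARS as the tool that selects a feature subset and combines it into a quality estimate. As a rough illustration of what a fitted MARS model looks like when it is evaluated, the sketch below sums products of hinge functions of the form max(0, ±(x − knot)). The class names, the feature indices, the coefficients, and the final clamp to the 1 to 5 MOS range are all assumptions made for this example; none of them come from the patent.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Hinge:
    """One MARS hinge factor: max(0, sign * (x[feature] - knot))."""
    feature: int   # index into the feature vector (hypothetical layout)
    knot: float    # location where the hinge switches on
    sign: int      # +1 for max(0, x - knot), -1 for max(0, knot - x)

    def __call__(self, x: Sequence[float]) -> float:
        return max(0.0, self.sign * (x[self.feature] - self.knot))


@dataclass(frozen=True)
class MarsModel:
    """Additive model: intercept + sum of coef * product of hinge factors."""
    intercept: float
    terms: Sequence[Tuple[float, Tuple[Hinge, ...]]]

    def predict(self, x: Sequence[float]) -> float:
        total = self.intercept
        for coef, hinges in self.terms:
            product = 1.0
            for hinge in hinges:
                product *= hinge(x)
            total += coef * product
        return total


def clamp_mos(score: float, lo: float = 1.0, hi: float = 5.0) -> float:
    """Keep a raw estimate inside the five-point MOS range (assumed step)."""
    return min(hi, max(lo, score))
```

A model built this way only needs the few features that appear in its hinge terms, which is one way to read the abstract's statement that a smaller subset of features forms the estimator. Fitting the knots and coefficients is a separate step that this sketch does not attempt.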
-
Publication No.: US20070203694A1
Publication Date: 2007-08-30
Application No.: US11364252
Filing Date: 2006-02-28
Applicant: Wai-Yip Chan, Tiago Falk, Mohamed El-Hennawey
Inventor: Wai-Yip Chan, Tiago Falk, Mohamed El-Hennawey
IPC Classification: G10L19/00
CPC Classification: G10L25/69
Abstract: A non-intrusive speech quality estimation technique is based on statistical or probability models such as Gaussian Mixture Models (“GMMs”). Perceptual features are extracted from the received speech signal and assessed by an artificial reference model formed using statistical models. The models characterize the statistical behavior of speech features. Consistency measures between the input speech features and the models are calculated to form indicators of speech quality. The consistency values are mapped to a speech quality score using a mapping optimized using machine learning algorithms, such as Multivariate Adaptive Regression Splines (“MARS”). The technique provides competitive or better quality estimates relative to known techniques while having lower computational complexity.
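The abstract above describes scoring extracted features against a GMM reference model but does not say how the consistency measure is computed. One plausible reading, shown here purely as an assumption, is the average per-frame log-likelihood of the features under the GMM. The sketch also assumes diagonal covariances, and the function names and the (weight, means, variances) component layout are invented for this example.

```python
import math
from typing import Sequence, Tuple

# One mixture component: (weight, per-dimension means, per-dimension variances).
Component = Tuple[float, Sequence[float], Sequence[float]]


def log_gaussian_diag(x: Sequence[float],
                      mean: Sequence[float],
                      var: Sequence[float]) -> float:
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x: Sequence[float], gmm: Sequence[Component]) -> float:
    """Log of the weighted sum of component densities, via log-sum-exp."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(logs)
    return peak + math.log(sum(math.exp(value - peak) for value in logs))


def average_log_likelihood(frames: Sequence[Sequence[float]],
                           gmm: Sequence[Component]) -> float:
    """Assumed consistency measure: mean frame log-likelihood under the GMM."""
    if not frames:
        raise ValueError("at least one feature frame is required")
    return sum(gmm_log_likelihood(f, gmm) for f in frames) / len(frames)
```

Frames that resemble the data the reference model was trained on receive a higher value than frames far from every component. In the technique described above, values like this would then be passed to a learned mapping such as MARS to obtain the quality score; that mapping is outside this sketch.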
-
Publication No.: US20110288865A1
Publication Date: 2011-11-24
Application No.: US13195338
Filing Date: 2011-08-01
Applicant: Wai-Yip Chan, Tiago H. Falk, Qingfeng Xu
Inventor: Wai-Yip Chan, Tiago H. Falk, Qingfeng Xu
IPC Classification: G10L15/00
CPC Classification: G10L25/69
Abstract: A non-intrusive speech quality estimation technique is based on statistical or probability models such as Gaussian Mixture Models (“GMMs”). Perceptual features are extracted from the received speech signal and assessed by an artificial reference model formed using statistical models. The models characterize the statistical behavior of speech features. Consistency measures between the input speech features and the models are calculated to form indicators of speech quality. The consistency values are mapped to a speech quality score using a mapping optimized using machine learning algorithms, such as Multivariate Adaptive Regression Splines (“MARS”). The technique provides competitive or better quality estimates relative to known techniques while having lower computational complexity.
-
Publication No.: US09786300B2
Publication Date: 2017-10-10
Application No.: US13195338
Filing Date: 2011-08-01
Applicant: Wai-Yip Chan, Tiago H Falk, Qingfeng Xu
Inventor: Wai-Yip Chan, Tiago H Falk, Qingfeng Xu
CPC Classification: G10L25/69
Abstract: A non-intrusive speech quality estimation technique is based on statistical or probability models such as Gaussian Mixture Models (“GMMs”). Perceptual features are extracted from the received speech signal and assessed by an artificial reference model formed using statistical models. The models characterize the statistical behavior of speech features. Consistency measures between the input speech features and the models are calculated to form indicators of speech quality. The consistency values are mapped to a speech quality score using a mapping optimized using machine learning algorithms, such as Multivariate Adaptive Regression Splines (“MARS”). The technique provides competitive or better quality estimates relative to known techniques while having lower computational complexity.
-
Publication No.: US07295614B1
Publication Date: 2007-11-13
Application No.: US09945116
Filing Date: 2001-08-31
Applicant: Jiandong Shen, Wai-Yip Chan
Inventor: Jiandong Shen, Wai-Yip Chan
IPC Classification: H04N7/12
CPC Classification: H04N19/94, H04N19/103, H04N19/147, H04N19/51
Abstract: The present invention relates to systems and methods for compressing, decompressing, and transmitting video data. The systems and methods include pixel-by-pixel motion estimation and compensation and efficient quantization of residual errors. The present invention applies block estimation of the residual error produced by motion compensation. The block estimation is applied by a local decoder to generate synthesized blocks of video data. The approximated block estimation uses a set of predetermined motion estimation errors that are stored as error vectors in a codebook. The codebook is included in an encoder of the present invention and converts an error vector for each block to an error vector index. The error vector index, which introduces minimal transmission burden, is then sent from the encoder to a target decoder. A receiving decoder also includes a copy of the codebook and converts the error vector index to its associated error vector for reconstruction of video data.
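The codebook exchange in the abstract above follows the general pattern of vector quantization: the encoder sends only an index, and the decoder looks up the matching error vector in its own copy of the codebook. The sketch below illustrates that pattern under stated assumptions. The nearest-neighbour search by squared error, the function names, and the flat-list block layout are choices made for this example; the abstract does not specify how the index is selected.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_block(residual: Vector, codebook: Sequence[Vector]) -> int:
    """Encoder side: index of the codebook vector closest to the residual."""
    return min(
        range(len(codebook)),
        key=lambda i: squared_distance(residual, codebook[i]),
    )


def decode_block(index: int, codebook: Sequence[Vector]) -> List[float]:
    """Decoder side: recover the stored error vector from its index."""
    return list(codebook[index])


def reconstruct(prediction: Vector, index: int,
                codebook: Sequence[Vector]) -> List[float]:
    """Add the decoded error vector to the motion-compensated prediction."""
    error = decode_block(index, codebook)
    return [p + e for p, e in zip(prediction, error)]
```

With a hypothetical four-entry codebook, each block costs two bits to signal, which is the sense in which the index adds little transmission burden. Reconstruction is only as accurate as the closest stored error vector, so the codebook contents determine the quality loss.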
-