Method and apparatus for vocal tract resonance tracking using nonlinear predictor and target-guided temporal restraint
    51.
    Granted patent
    Status: in force

    Publication No.: US07643989B2

    Publication date: 2010-01-05

    Application No.: US10652976

    Filing date: 2003-08-29

    IPC classification: G10L19/06

    CPC classification: G10L25/48 G10L25/15

    Abstract: A method and apparatus map a set of vocal tract resonant frequencies, together with their corresponding bandwidths, into a simulated acoustic feature vector in the form of LPC cepstrum by calculating a separate function for each individual vocal tract resonant frequency/bandwidth and summing the results to form an element of the simulated feature vector. The simulated feature vector is applied to a model along with an input feature vector to determine a probability that the set of vocal tract resonant frequencies is present in a speech signal. Under one embodiment, the model includes a target-guided transition model that provides a probability of a vocal tract resonant frequency based on a past vocal tract resonant frequency and a target for the vocal tract resonant frequency. Under another embodiment, the phone segmentation is provided by an HMM system and is used to precisely determine which target value to use at each frame.
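
The per-resonance mapping described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard all-pole result in which each resonance is a conjugate pole pair, so that cepstral element n is a sum of one term per resonance; the abstract does not give the formula, so this choice, the function names, and the first-order form of `target_guided_step` are all assumptions made for illustration.

```python
import math


def vtr_to_lpc_cepstrum(freqs_hz, bandwidths_hz, sample_rate_hz, order):
    """Map resonance frequencies and bandwidths to a simulated LPC cepstrum.

    Assumption: each resonance is a conjugate pole pair of an all-pole
    filter, so element n is the sum of one term per resonance.
    """
    cepstrum = []
    for n in range(1, order + 1):
        total = 0.0
        for f, b in zip(freqs_hz, bandwidths_hz):
            decay = math.exp(-math.pi * n * b / sample_rate_hz)
            total += (2.0 / n) * decay * math.cos(2.0 * math.pi * n * f / sample_rate_hz)
        cepstrum.append(total)
    return cepstrum


def target_guided_step(prev_freq_hz, target_hz, rate):
    """Mean of a hypothetical target-guided transition.

    The next value is pulled from the past value toward the target;
    `rate` in [0, 1) is an assumed smoothing parameter.
    """
    return rate * prev_freq_hz + (1.0 - rate) * target_hz


if __name__ == "__main__":
    # Three resonances at 8 kHz sampling; the values are illustrative only.
    print(vtr_to_lpc_cepstrum([500.0, 1500.0, 2500.0], [80.0, 120.0, 160.0], 8000.0, 5))
    print(target_guided_step(500.0, 700.0, 0.5))
```

Because each cepstral element is a plain sum over resonances, the contribution of one resonance can be computed or updated without touching the others.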

    Indexing and ranking processes for directory assistance services
    52.
    Granted patent
    Status: in force

    Publication No.: US07580942B2

    Publication date: 2009-08-25

    Application No.: US11652733

    Filing date: 2007-01-12

    IPC classification: G06F17/30

    Abstract: A computer-implemented method is disclosed for providing a directory assistance service. The method includes generating an indexing file that is a representation of information associated with a collection of listings stored in an index. The indexing file is utilized as a basis for ranking listings in the index based on the strength of association with a query. Based at least in part on the ranking, an output is provided and is indicative of listings in the index that likely correspond to the query. At least one particular listing in the index is excluded from the output without there ever being a comparison of features in the query with features in the one particular listing.
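
One common way to rank listings while never comparing the query against most of them is an inverted index. The Python sketch below assumes that structure and a simple rarity-based weight; the abstract does not specify the indexing file format or the scoring function, so `build_index`, `rank`, and the weighting are illustrative assumptions.

```python
import math
from collections import defaultdict


def build_index(listings):
    """Map each lowercase word to the set of listing ids that contain it."""
    index = defaultdict(set)
    for listing_id, name in listings.items():
        for word in name.lower().split():
            index[word].add(listing_id)
    return index


def rank(query, index, num_listings):
    """Score only the listings reachable through the query's words.

    Listings that share no word with the query are never visited, so they
    drop out of the output without any feature comparison.
    """
    scores = defaultdict(float)
    for word in set(query.lower().split()):
        postings = index.get(word, ())
        if not postings:
            continue
        # Assumed weighting: rarer words contribute more to the score.
        weight = math.log(1.0 + num_listings / len(postings))
        for listing_id in postings:
            scores[listing_id] += weight
    return sorted(scores, key=lambda lid: (-scores[lid], lid))


if __name__ == "__main__":
    listings = {1: "Joe's Pizza", 2: "Pizza Palace", 3: "City Hardware"}
    index = build_index(listings)
    print(rank("joe's pizza", index, len(listings)))
```

In this toy data set, listing 3 shares no word with the query, so the ranking loop never looks at it.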

    Quantitative model for formant dynamics and contextually assimilated reduction in fluent speech
    54.
    Granted patent
    Status: in force

    Publication No.: US07565292B2

    Publication date: 2009-07-21

    Application No.: US10944262

    Filing date: 2004-09-17

    IPC classification: G10L13/00

    CPC classification: G10L13/02 G10L25/15

    Abstract: A method of identifying a sequence of formant trajectory values is provided in which a sequence of target values is identified for a formant as step functions. The target values and the duration of each segment target for the formant are applied to a finite impulse response filter to form a sequence of formant trajectory values. The parameters of this filter, as well as the duration of the targets for each phone, can be modified to produce many kinds of target undershooting effects in a contextually assimilated manner. The procedure for producing the formant trajectory values does not require any acoustic data from speech.
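
The idea of passing step-function targets through an FIR filter can be sketched in Python as follows. The filter shape (symmetric taps proportional to `gamma ** abs(k)`, normalized to unit gain) and the edge handling are assumptions chosen for illustration; the abstract says only that a finite impulse response filter is used.

```python
def step_targets(targets_hz, durations):
    """Expand per-segment targets into a frame-level step function."""
    frames = []
    for target, duration in zip(targets_hz, durations):
        frames.extend([target] * duration)
    return frames


def fir_trajectory(target_frames, gamma, half_width):
    """Smooth the step targets with a symmetric, unit-gain FIR filter.

    Assumption: taps are proportional to gamma ** abs(k); frames beyond
    either end repeat the first or last target.
    """
    offsets = range(-half_width, half_width + 1)
    taps = [gamma ** abs(k) for k in offsets]
    norm = sum(taps)
    last = len(target_frames) - 1
    trajectory = []
    for t in range(len(target_frames)):
        acc = 0.0
        for k, tap in zip(offsets, taps):
            idx = min(max(t + k, 0), last)
            acc += tap * target_frames[idx]
        trajectory.append(acc / norm)
    return trajectory


if __name__ == "__main__":
    # A short middle segment: its 1500 Hz target is never reached.
    frames = step_targets([500.0, 1500.0, 500.0], [10, 3, 10])
    trajectory = fir_trajectory(frames, gamma=0.6, half_width=7)
    print(max(trajectory))
```

Shortening the middle segment or increasing `gamma` widens the gap between the peak and the target, which is the undershoot effect the abstract refers to. No acoustic data enters the computation.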

    MAXIMUM ENTROPY MODEL PARAMETERIZATION
    55.
    Patent application
    Status: in force

    Publication No.: US20090150308A1

    Publication date: 2009-06-11

    Application No.: US11952130

    Filing date: 2007-12-07

    IPC classification: G06F15/18

    Abstract: Described is a technology by which a maximum entropy model used for classification is trained with significantly less training data than is normally used to train other maximum entropy models, yet provides accuracy similar to theirs. The maximum entropy model is initially parameterized with parameter values determined from weights obtained by training a vector space model or an n-gram model. The weights may be scaled into the initial parameter values by determining a scaling factor. Gaussian mean values may also be determined, and used for regularization in training the maximum entropy model. Scaling may also be applied to the Gaussian mean values. After initial parameterization, training comprises using training data to iteratively adjust the initial parameters into adjusted parameters until convergence is determined.
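
The initialization and regularization scheme can be sketched in Python as a small multinomial logistic regression. This is a sketch under stated assumptions: the scaling factor is taken as a given argument (the abstract does not say how it is determined), plain gradient ascent stands in for whatever optimizer is actually used, and the same scaled weights serve as both the initial parameters and the Gaussian prior means.

```python
import math


def predict(weights, features):
    """Return class posteriors p(c | x) under a maximum entropy model."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_maxent(data, prior_weights, scale, sigma2=1.0, lr=0.1, tol=1e-6, max_iter=1000):
    """Gradient ascent that starts from scaled weights of another model.

    `prior_weights` stands for weights from, e.g., a vector space model.
    They are scaled and used both as the starting point and as the means
    of a Gaussian prior (an assumption of this sketch).
    """
    means = [[scale * w for w in row] for row in prior_weights]
    weights = [row[:] for row in means]
    for _ in range(max_iter):
        # Gradient of the Gaussian prior term pulls weights toward the means.
        grad = [[-(w - m) / sigma2 for w, m in zip(wr, mr)]
                for wr, mr in zip(weights, means)]
        # Gradient of the conditional log-likelihood of the training data.
        for features, label in data:
            probs = predict(weights, features)
            for c, p in enumerate(probs):
                indicator = 1.0 if c == label else 0.0
                for j, x in enumerate(features):
                    grad[c][j] += (indicator - p) * x
        step = max(abs(lr * g) for row in grad for g in row)
        weights = [[w + lr * g for w, g in zip(wr, gr)]
                   for wr, gr in zip(weights, grad)]
        if step < tol:  # convergence: the largest update is negligible
            break
    return weights


if __name__ == "__main__":
    data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
    prior = [[1.0, 0.0], [0.0, 1.0]]
    trained = train_maxent(data, prior, scale=2.0)
    print(predict(trained, [1.0, 0.0]))
```

With no training data the function returns the scaled prior weights unchanged, which shows how the prior model carries the classifier when data is scarce.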

    Noise reduction using correction vectors based on dynamic aspects of speech and noise normalization
    56.
    Granted patent
    Status: in force

    Publication No.: US07542900B2

    Publication date: 2009-06-02

    Application No.: US11429630

    Filing date: 2006-05-05

    IPC classification: G10L21/02

    CPC classification: G10L21/0208

    Abstract: A method and apparatus are provided for reducing noise in a signal. Under one aspect of the invention, a correction vector is selected based on a noisy feature vector that represents a noisy signal. The selected correction vector incorporates dynamic aspects of pattern signals. The selected correction vector is then added to the noisy feature vector to produce a cleaned feature vector. In other aspects of the invention, a noise value is produced from an estimate of the noise in a noisy signal. The noise value is subtracted from a value representing a portion of the noisy signal to produce a noise-normalized value. The noise-normalized value is used to select a correction value that is added to the noise-normalized value to produce a cleaned noise-normalized value. The noise value is then added to the cleaned noise-normalized value to produce a cleaned value representing a portion of a cleaned signal.
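
The subtract, correct, and restore sequence of the noise-normalization aspect can be sketched in Python. The selection rule (nearest codeword by squared Euclidean distance) and the names `codebook` and `corrections` are assumptions; the abstract does not say how the correction is selected, and the dynamic aspects it mentions are not modeled here.

```python
def nearest(codebook, vector):
    """Index of the codeword closest (squared Euclidean) to the vector."""
    def dist(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def clean_feature(noisy, noise_estimate, codebook, corrections):
    """Noise-normalize, apply the selected correction, then restore the noise value."""
    # Step 1: subtract the noise value to get the noise-normalized value.
    normalized = [y - n for y, n in zip(noisy, noise_estimate)]
    # Step 2: use the normalized value to select a correction (assumed rule).
    k = nearest(codebook, normalized)
    # Step 3: add the correction to the noise-normalized value.
    cleaned_normalized = [z + r for z, r in zip(normalized, corrections[k])]
    # Step 4: add the noise value back to obtain the cleaned value.
    return [z + n for z, n in zip(cleaned_normalized, noise_estimate)]


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [4.0, 4.0]]          # hypothetical codewords
    corrections = [[-1.0, -1.0], [0.5, 0.5]]     # one correction per codeword
    print(clean_feature([5.0, 6.0], [1.0, 1.0], codebook, corrections))
```

Normalizing before the lookup means the codebook only has to cover feature values relative to the noise level, not every absolute noise level.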

    Method and apparatus using harmonic-model-based front end for robust speech recognition
    58.
    Granted patent
    Status: in force

    Publication No.: US07516067B2

    Publication date: 2009-04-07

    Application No.: US10647586

    Filing date: 2003-08-25

    IPC classification: G10L21/02 G10L15/00 G10L15/20

    CPC classification: G10L21/0208 G10L15/20

    Abstract: A system and method are provided that reduce noise in speech signals. The system and method decompose a noisy speech signal into a harmonic component and a residual component. The harmonic component and residual component are then combined as a sum to form a noise-reduced value. In some embodiments, the sum is a weighted sum where the harmonic component is multiplied by a scaling factor. In some embodiments, the noise-reduced value is used in speech recognition.
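
A harmonic-plus-residual split followed by a weighted sum can be sketched in Python. The sketch assumes the pitch period is already known and that the frame spans a whole number of periods, so each harmonic amplitude is a simple projection; the abstract does not describe the decomposition method, and the separate `residual_scale` argument is an added assumption.

```python
import math


def decompose(frame, period, num_harmonics):
    """Split a frame into harmonic and residual parts.

    Assumptions: len(frame) is a whole number of pitch periods and
    num_harmonics < period / 2, so the sinusoids are orthogonal and each
    amplitude is a plain projection.
    """
    n = len(frame)
    mean = sum(frame) / n
    harmonic = [mean] * n
    for k in range(1, num_harmonics + 1):
        w = 2.0 * math.pi * k / period
        a = 2.0 / n * sum(y * math.cos(w * t) for t, y in enumerate(frame))
        b = 2.0 / n * sum(y * math.sin(w * t) for t, y in enumerate(frame))
        for t in range(n):
            harmonic[t] += a * math.cos(w * t) + b * math.sin(w * t)
    residual = [y - h for y, h in zip(frame, harmonic)]
    return harmonic, residual


def noise_reduced(harmonic, residual, harmonic_scale, residual_scale):
    """Weighted sum of the two components."""
    return [harmonic_scale * h + residual_scale * r
            for h, r in zip(harmonic, residual)]


if __name__ == "__main__":
    # A pure tone with period 8 plus a small alternating disturbance.
    frame = [math.cos(2.0 * math.pi * t / 8) + 0.1 * (-1) ** t for t in range(16)]
    harmonic, residual = decompose(frame, period=8, num_harmonics=2)
    print(noise_reduced(harmonic, residual, harmonic_scale=1.0, residual_scale=0.2))
```

Down-weighting the residual relative to the harmonic part suppresses energy that does not follow the pitch structure of voiced speech.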

    Greedy algorithm for identifying values for vocal tract resonance vectors
    59.
    Granted patent
    Status: in force

    Publication No.: US07475011B2

    Publication date: 2009-01-06

    Application No.: US10925585

    Filing date: 2004-08-25

    IPC classification: G10L19/06

    CPC classification: G10L25/48 G10L15/02 G10L25/15

    Abstract: A method and apparatus identify values for components of a vocal tract resonance vector by sequentially determining values for each component of the vocal tract resonance vector. To determine a value for a component, the other components are set to static values. A plurality of values for a function are then determined using a plurality of values for the component that is being determined while using the static values for all of the other components. One of the plurality of values for the component is then selected based on the plurality of values for the function.
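
The component-by-component search can be sketched in Python as below. Selecting the candidate with the largest function value is an assumption (the abstract says only that the selection is based on the function values), and `score`, `initial`, and `candidates` are hypothetical names.

```python
def greedy_search(score, initial, candidates):
    """Pick each component in turn while the others stay fixed.

    `candidates[i]` lists the values to try for component i; the value
    with the highest score is kept (assumed selection rule).
    """
    vector = list(initial)
    for i, values in enumerate(candidates):
        best_value, best_score = vector[i], None
        for value in values:
            trial = vector[:i] + [value] + vector[i + 1:]
            s = score(trial)
            if best_score is None or s > best_score:
                best_value, best_score = value, s
        vector[i] = best_value
    return vector


if __name__ == "__main__":
    # Toy score that peaks at (500, 1500); illustrative only.
    def score(v):
        return -((v[0] - 500) ** 2 + (v[1] - 1500) ** 2)

    print(greedy_search(score, [0, 0], [[300, 500, 700], [1000, 1500, 2000]]))
```

The cost grows with the sum of the candidate counts per component rather than their product, which is the appeal of a greedy search over a joint one.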

    SENSOR ARRAY BEAMFORMER POST-PROCESSOR
    60.
    Patent application
    Status: in force

    Publication No.: US20080288219A1

    Publication date: 2008-11-20

    Application No.: US11750319

    Filing date: 2007-05-17

    IPC classification: H04B15/00 G06F15/00

    CPC classification: H04B7/0854

    Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability is described. The present beamforming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) that improves directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability of sound coming from a given incident angle or look-up direction, and applies a time-varying, gain-based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction, resulting in minimal artifacts and musical noise.
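
The direction-dependent gain can be sketched in Python for a two-microphone array. Everything specific here is an assumption made for illustration: the far-field phase-difference model for the instantaneous direction of arrival, the Gaussian-shaped likelihood, the gain floor, and all function names. The temporal smoothing implied by "spatio-temporal" is omitted.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def instantaneous_doa(phase_diff_rad, freq_hz, mic_spacing_m):
    """Incident angle (radians) implied by the inter-microphone phase difference."""
    s = SPEED_OF_SOUND * phase_diff_rad / (2.0 * math.pi * freq_hz * mic_spacing_m)
    return math.asin(max(-1.0, min(1.0, s)))


def spatial_gain(doa_rad, look_rad, spread_rad, floor=0.1):
    """Soft gain: near 1 toward the look direction, never below the floor."""
    likelihood = math.exp(-0.5 * ((doa_rad - look_rad) / spread_rad) ** 2)
    return max(floor, likelihood)


def post_process(bins, phase_diffs, freqs_hz, mic_spacing_m, look_rad, spread_rad):
    """Scale each beamformer output bin by its direction-dependent gain."""
    out = []
    for x, dphi, f in zip(bins, phase_diffs, freqs_hz):
        doa = instantaneous_doa(dphi, f, mic_spacing_m)
        out.append(x * spatial_gain(doa, look_rad, spread_rad))
    return out


if __name__ == "__main__":
    # Two bins at 1 kHz: one from straight ahead, one from far off-axis.
    print(post_process([1 + 0j, 1 + 0j], [0.0, 0.9], [1000.0, 1000.0],
                       mic_spacing_m=0.05, look_rad=0.0, spread_rad=0.3))
```

A nonzero floor is one simple way to limit the abrupt gain changes that tend to produce musical noise.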