Audio segmentation and classification using threshold values
    1.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07080008B2

    Publication date: 2006-07-18

    Application No.: US10843011

    Filing date: 2004-05-11

    IPC classification: G10L11/06

    CPC classification: G10L25/48 G10L25/36

    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.
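
The frame-then-rules flow described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the two features used here (mean frame energy and zero-crossing rate) are simplified stand-ins for the LSP, noise frame ratio, periodicity, and spectrum flux features named in the abstract, and the thresholds, the labels "noise-like" and "tonal", and all function names are hypothetical.

```python
import math


def split_frames(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def classify_portion(samples, frame_len=200, silence_energy=1e-4, noisy_zcr=0.3):
    """Classify a portion of audio with two threshold rules (assumed values)."""
    frames = split_frames(samples, frame_len)
    if not frames:
        raise ValueError("portion is shorter than one frame")
    mean_energy = sum(frame_energy(f) for f in frames) / len(frames)
    mean_zcr = sum(zero_crossing_rate(f) for f in frames) / len(frames)
    # Rule 1: very low energy is treated as silence.
    if mean_energy < silence_energy:
        return "silence"
    # Rule 2: frequent sign changes are treated as noise-like content.
    if mean_zcr > noisy_zcr:
        return "noise-like"
    return "tonal"


# A 100 Hz sine sampled at 8 kHz, used only as a toy input.
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(1600)]
print(classify_portion([0.0] * 1600))
print(classify_portion(tone))
```

A real classifier along the lines of the abstract would replace the two stand-in features with the listed ones and tune the rule thresholds on labeled audio.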

    Method and apparatus for compression and decompression of digital image data
    2.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US5347600A

    Publication date: 1994-09-13

    Application No.: US781586

    Filing date: 1991-10-23

    CPC classification: H04N19/99 G06T9/001 G10L25/36

    Abstract: Digital image data is automatically processed by dividing stored image data into domain blocks and range blocks. The range blocks are subjected to processes such as a shrinking process to obtain mapped range blocks. The range blocks or domain blocks may also be processed by processes such as affine transforms. Then, for each domain block, the mapped range block which is most similar to the domain block is determined, and the address of that range block and the processes the blocks were subjected to are combined as an identifier which is appended to a list of identifiers for other domain blocks. The list of identifiers for all domain blocks is called a fractal transform and constitutes a compressed representation of the input image. To decompress the fractal transform and recover the input image, an arbitrary input image is formed into range blocks and the range blocks processed in a manner specified by the identifiers to form a representation of the original input image.
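
The encoding step in this abstract (shrink each range block, then pick the most similar one for every domain block) can be sketched in Python as follows. This is an illustrative simplification under stated assumptions, not the patented procedure: the image is a square list of lists whose side is a multiple of twice the block size, range blocks are twice the side of domain blocks and are shrunk by 2x2 averaging, similarity is squared error, affine transforms are omitted, and all function names are hypothetical.

```python
def crop(image, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def shrink(block):
    """Halve a square block by averaging each 2x2 cell."""
    n = len(block) // 2
    return [[(block[2 * r][2 * c] + block[2 * r][2 * c + 1]
              + block[2 * r + 1][2 * c] + block[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(n)] for r in range(n)]


def squared_error(a, b):
    """Sum of squared pixel differences between two equal-sized blocks."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def encode(image, block=2):
    """Map each domain block address to the address of its best range block."""
    size = len(image)
    big = 2 * block
    # Shrunken ("mapped") range blocks, keyed by their address in the image.
    mapped = {(t, l): shrink(crop(image, t, l, big))
              for t in range(0, size - big + 1, big)
              for l in range(0, size - big + 1, big)}
    identifiers = {}
    for t in range(0, size, block):
        for l in range(0, size, block):
            domain = crop(image, t, l, block)
            identifiers[(t, l)] = min(
                mapped, key=lambda addr: squared_error(domain, mapped[addr]))
    return identifiers


# Toy 8x8 image made of four constant quadrants with values 1, 2, 3, 4.
image = [[1 + (r // 4) * 2 + (c // 4) for c in range(8)] for r in range(8)]
print(encode(image)[(0, 4)])
```

In this sketch the identifier is only the range block address; the abstract additionally records which processes each block was subjected to.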

    NOISE SPEED-UPS IN HIDDEN MARKOV MODELS WITH APPLICATIONS TO SPEECH RECOGNITION
    3.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20160005399A1

    Publication date: 2016-01-07

    Application No.: US14802760

    Filing date: 2015-07-17

    Abstract: A learning computer system may estimate unknown parameters and states of a stochastic or uncertain system having a probability structure. The system may include a data processing system that may include a hardware processor that has a configuration that: receives data; generates random, chaotic, fuzzy, or other numerical perturbations of the data, one or more of the states, or the probability structure; estimates observed and hidden states of the stochastic or uncertain system using the data, the generated perturbations, previous states of the stochastic or uncertain system, or estimated states of the stochastic or uncertain system; and causes perturbations or independent noise to be injected into the data, the states, or the stochastic or uncertain system so as to speed up training or learning of the probability structure and of the system parameters or the states.

    Apparatus for predicting the spectral information of voice signals and a method therefor
    4.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20070011001A1

    Publication date: 2007-01-11

    Application No.: US11483890

    Filing date: 2006-07-10

    Applicant: Hyun-Soo Kim

    Inventor: Hyun-Soo Kim

    IPC classification: G10L19/00

    CPC classification: G10L19/06 G10L25/18 G10L25/36

    Abstract: Disclosed is a method for predicting the spectral information of voice signals, including inputting the voice signals, performing morphological operations with the waveform image of the voice signals, extracting harmonic peaks as a result of the morphological operations, and predicting the spectral envelope information of the voice signals by interpolating the harmonic peaks.
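
The peak-extraction and interpolation steps named in this abstract can be sketched in Python. This is a loose illustration, not the disclosed apparatus: it applies a one-dimensional morphological dilation (sliding-window maximum) to a list of magnitude values rather than operating on a waveform image, it uses linear interpolation between peaks, and the function names and the `half_width` parameter are hypothetical.

```python
import bisect


def dilate(values, half_width):
    """1-D morphological dilation: sliding-window maximum."""
    n = len(values)
    return [max(values[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def find_peaks(values, half_width):
    """Indices where a positive value equals its dilation (local maxima)."""
    dilated = dilate(values, half_width)
    return [i for i, v in enumerate(values) if v > 0 and v == dilated[i]]


def interpolate_envelope(values, peaks):
    """Linearly interpolate between peaks; hold the end values outside them."""
    if not peaks:
        return [0.0] * len(values)
    envelope = []
    for i in range(len(values)):
        k = bisect.bisect_right(peaks, i)
        if k == 0:
            envelope.append(float(values[peaks[0]]))
        elif k == len(peaks):
            envelope.append(float(values[peaks[-1]]))
        else:
            p, q = peaks[k - 1], peaks[k]
            envelope.append(values[p] + (values[q] - values[p]) * (i - p) / (q - p))
    return envelope


# Toy magnitude spectrum with three isolated harmonic peaks.
spectrum = [0, 1, 0, 0, 0, 3, 0, 0, 0, 2, 0]
peaks = find_peaks(spectrum, half_width=2)
print(peaks)
print(interpolate_envelope(spectrum, peaks))
```

The window half-width controls how close two maxima may be before only the larger one is kept, so in practice it would be chosen relative to the expected harmonic spacing.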

    Audio segmentation and classification

    Publication No.: US20050060152A1

    Publication date: 2005-03-17

    Application No.: US10974298

    Filing date: 2004-10-27

    IPC classification: G10L11/00 G10L15/12

    CPC classification: G10L25/48 G10L25/36

    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.

    Audio Segmentation and Classification
    6.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20060178877A1

    Publication date: 2006-08-10

    Application No.: US11278250

    Filing date: 2006-03-31

    IPC classification: G10L19/00

    CPC classification: G10L25/48 G10L25/36

    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.

    Audio segmentation and classification
    7.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US06901362B1

    Publication date: 2005-05-31

    Application No.: US09553166

    Filing date: 2000-04-19

    IPC classification: G10L11/00 G10L19/02

    CPC classification: G10L25/48 G10L25/36

    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.

    Audio segmentation and classification

    Publication No.: US20050075863A1

    Publication date: 2005-04-07

    Application No.: US10998766

    Filing date: 2004-11-29

    IPC classification: G10L11/00 G10L19/14

    CPC classification: G10L25/48 G10L25/36

    Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.

    Fractal coding of data
    9.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US5768437A

    Publication date: 1998-06-16

    Application No.: US295637

    Filing date: 1994-08-26

    CPC classification: H04N19/99 G06T9/001 G10L25/36

    Abstract: A method of fractal coding of data and apparatus therefor, which method comprises dividing data into domains, determining a set of transformations relating the domains to the data in such a manner as to minimize error between the data and an approximation to the data obtained by application of the transformation, and providing an expression of a series of quantized fractal coefficients characterizing the transformations. A transformation includes at least one part indicating a domain and at least another (functional) part indicating a value for a measure associatable with a specific domain or aspect thereof.

    PCT information: PCT No. PCT/GB93/00422; Sec. 371 date 1994-08-26; Sec. 102(e) date 1994-08-26; PCT filed 1993-03-01; PCT Pub. No. WO93/17519; PCT publication date 1993-09-02.

    SAMPLING RATE CONVERSION APPARATUS AND METHOD THEREOF
    10.
    Type: Patent application
    Status: In force

    Publication No.: US20090240508A1

    Publication date: 2009-09-24

    Application No.: US12363293

    Filing date: 2009-01-30

    Applicant: Junichi Saito

    Inventor: Junichi Saito

    IPC classification: G10L19/00

    CPC classification: G10L25/90 G10L21/00 G10L25/36

    Abstract: A sampling rate conversion apparatus and a method thereof are provided which increase the sampling rate of a discrete audio signal sampled at a predetermined sampling rate by using a fractal interpolation function (FIF). An audio signal portion formed by a predetermined number of sampling data items is divided into a plurality of interpolation intervals. On the audio signal portion, mapping points are determined. The number of the mapping points is in accordance with the degree of increase in the sampling rate. For the respective interpolation intervals, mapping parameters for performing mapping using the FIF on the mapping points are calculated. In all of the interpolation intervals, the mapping using the FIF is performed on the mapping points with the use of the mapping parameters according to the respective interpolation intervals. Thereby, new sampling data items are generated.
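
The general idea of generating new samples with a fractal interpolation function can be sketched in Python using the standard affine FIF maps w_i(x, y) = (a_i x + e_i, c_i x + d y + f_i). The sketch makes several assumptions that are not taken from the abstract: samples sit at integer positions 0..N, every sample is used as a mapping point (so the rate increases by a factor of N rather than by a chosen factor), a single vertical scaling factor `d` with |d| < 1 is shared by all intervals, and only one mapping pass is applied. The function name `fif_upsample` is hypothetical.

```python
def fif_upsample(samples, d=0.0):
    """Apply one pass of affine FIF maps to samples at x = 0..N.

    Returns values at spacing 1/N over [0, N]; d is an assumed vertical
    scaling factor (d = 0 reduces to linear interpolation).
    """
    n = len(samples) - 1
    if n < 1:
        raise ValueError("need at least two samples")
    if not -1.0 < d < 1.0:
        raise ValueError("vertical scaling factor must satisfy |d| < 1")
    x0, xn = 0.0, float(n)
    y0, yn = samples[0], samples[-1]
    span = xn - x0
    out = []
    for i in range(1, n + 1):
        y_prev, y_cur = samples[i - 1], samples[i]
        # Map parameters chosen so the whole portion lands on interval i
        # with its end points on (i - 1, y_prev) and (i, y_cur).
        c = (y_cur - y_prev) / span - d * (yn - y0) / span
        f = (xn * y_prev - x0 * y_cur) / span - d * (xn * y0 - x0 * yn) / span
        # Skip the first mapped point after interval 1; it repeats the
        # last point of the previous interval.
        start = 0 if i == 1 else 1
        for k in range(start, n + 1):
            out.append(c * float(k) + d * samples[k] + f)
    return out


print(fif_upsample([0, 2, 1]))
print(fif_upsample([0, 2, 1], d=0.3))
```

With a nonzero `d` the original samples are still reproduced at their own positions, while the in-between values inherit a scaled copy of the overall shape of the portion.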
