SYSTEM AND APPARATUS FOR SPEECH COMMUNICATION AND SPEECH RECOGNITION
    71.
    Invention application
    Status: under examination (published)

    Publication number: WO03036614A2

    Publication date: 2003-05-01

    Application number: PCT/SG0200149

    Filing date: 2002-07-02

    Abstract: A headset system is proposed including a headset unit to be worn by a user and having two or more microphones, and a base unit in wireless communication with the headset. Signals received from the microphones are processed using a first adaptive filter to enhance a target signal, and then divided and supplied to a second adaptive filter arranged to reduce interference signals and a third filter arranged to reduce noise. The outputs of the second and third filters are combined and are subject to further processing in the frequency domain. The results are transmitted to a speech recognition engine.
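The abstract does not say which adaptation rule the filters use. As a rough illustration of one adaptive interference-reduction stage, the sketch below assumes a normalized LMS (NLMS) canceller with a separate reference signal; the function name `nlms_cancel`, the tap count, the step size, and the synthetic signals are all hypothetical choices, not details from the application.

```python
import math
import random


def nlms_cancel(primary, reference, taps=4, mu=0.2, eps=1e-8):
    """Remove the part of `primary` that is linearly predictable from `reference`."""
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate        # error doubles as the cleaned output sample
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
n = 4000
target = [0.5 * math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
# Assumed acoustic path: the interference reaches the primary mic via a 2-tap FIR.
leak = [0.8 * noise[k] + (0.3 * noise[k - 1] if k else 0.0) for k in range(n)]
primary = [t + l for t, l in zip(target, leak)]

cleaned, weights = nlms_cancel(primary, noise)
err_before = mse(primary[-1000:], target[-1000:])
err_after = mse(cleaned[-1000:], target[-1000:])
```

Comparing `err_after` with `err_before` over the last samples shows how much interference remains once the weights have settled.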

    CODING SUCCESSIVE PITCH PERIODS IN SPEECH SIGNAL
    72.
    Invention application
    Status: under examination (published)

    Publication number: WO02101718A3

    Publication date: 2003-04-10

    Application number: PCT/IB0202078

    Filing date: 2002-06-07

    CPC classification number: G10L19/08

    Abstract: A method and apparatus for coding successive pitch periods (Fig. 5) of a speech signal. Based on a priori knowledge of statistical properties of successive speech periods, a shaped lattice structure is designed to cover the most probable points in the pitch space. The codebook index search starts with finding an open-loop estimate in the pitch space considering all dimensions and refining the open-loop estimate in a closed-loop search separately in each dimension based on the shaped lattice structure. The closed-loop search for the first subframe is for obtaining an absolute pitch period or a delta pitch while the closed-loop search for each of the other subframes is for obtaining a delta pitch for the respective subframe.
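The absolute-then-delta representation described in the abstract can be illustrated with a small sketch. It covers only the case where the first subframe carries an absolute pitch lag and the remaining subframes carry bounded deltas; the shaped lattice and the open-loop and closed-loop searches are not modelled. The names `encode_pitch`, `decode_pitch`, and the `max_delta` bound are hypothetical.

```python
def encode_pitch(lags, max_delta=7):
    """Code the first subframe lag absolutely and the rest as clamped deltas."""
    codes = [lags[0]]
    previous = lags[0]
    for lag in lags[1:]:
        delta = max(-max_delta, min(max_delta, lag - previous))
        codes.append(delta)
        previous += delta           # track what the decoder will reconstruct
    return codes


def decode_pitch(codes):
    """Rebuild subframe lags from one absolute value followed by deltas."""
    lags = [codes[0]]
    for delta in codes[1:]:
        lags.append(lags[-1] + delta)
    return lags


codes = encode_pitch([60, 62, 61, 75])
decoded = decode_pitch(codes)
```

The last lag in this example jumps by more than `max_delta`, so the decoder recovers a clamped value. This is the kind of improbable point a restricted codebook gives up in exchange for fewer bits.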

    METHOD AND APPARATUS FOR VOICE SIGNAL EXTRACTION
    73.
    Invention application
    Status: under examination (published)

    Publication number: WO0176319A3

    Publication date: 2002-12-27

    Application number: PCT/US0110550

    Filing date: 2001-03-30

    Inventor: ERTEN GAMZE

    CPC classification number: H04R1/406 H04R25/405

    Abstract: A method is provided for positioning the individual elements of a microphone arrangement including at least two such elements. The spacing among the microphone elements supports the generation of numerous combinations of the signal of interest and a sum of interfering sources. Use of the microphone element placement method leads to the formation of many types of microphone arrangements, comprising at least two microphone elements, and provides the input data to a signal processing system for sound discrimination. Many examples of these microphone arrangements are provided, some of which are integrated with everyday objects. Also, enhancements and extensions are provided for a signal separation-based processing system for sound discrimination, which uses the microphone arrangements as the sensory front end.

    INTERPRETATION OF FEATURES FOR SIGNAL PROCESSING AND PATTERN RECOGNITION
    74.
    Invention application
    Status: under examination (published)

    Publication number: WO2002095730A1

    Publication date: 2002-11-28

    Application number: PCT/GB2002/002197

    Filing date: 2002-05-20

    Inventor: MING, Ji

    CPC classification number: G10L15/20 G10L15/142 G10L19/0204 G10L21/0232

    Abstract: A method of interpretation of features for signal processing and pattern recognition provides a model in which the pattern or signal to be interpreted is considered as a set of N observations, M of which are corrupt, and a disjunction is performed over all possible combinations of N different values (1,...,N) taken N-M at a time. The value of M defines the order of the model, and is determined using an optimality criterion which chooses the order that corresponds to a clean signal based on comparing the state duration probability of the signal or pattern to be interpreted with that of a clean signal.
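One way to read the disjunction over combinations is as a sum, over every choice of N-M observations, of the product of their likelihoods. The sketch below assumes that reading; the function name `disjunction_score` and the example likelihood values are hypothetical, and the optimality criterion for choosing M is not modelled.

```python
from itertools import combinations
from math import prod


def disjunction_score(likelihoods, m):
    """Sum the joint likelihood of every subset that drops m of the N observations."""
    n = len(likelihoods)
    return sum(
        prod(likelihoods[i] for i in subset)
        for subset in combinations(range(n), n - m)
    )


# Four per-observation likelihoods; the third is assumed to be corrupted.
observed = [0.9, 0.8, 0.001, 0.7]
order_0 = disjunction_score(observed, 0)   # plain product, dominated by the outlier
order_1 = disjunction_score(observed, 1)   # tolerates one corrupt observation
```

With order 0 the corrupt value drags the score toward zero. With order 1 the subset that excludes it dominates the sum, without the model needing to know which observation was corrupt.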

    SYSTEM AND METHOD FOR CLASSIFICATION OF SOUND SOURCES
    75.
    Invention application
    Status: under examination (published)

    Publication number: WO0116937A9

    Publication date: 2002-09-06

    Application number: PCT/US0023754

    Filing date: 2000-08-29

    CPC classification number: G10L17/26 G10L15/20

    Abstract: A system and method to identify a sound source among a group of sound sources. The invention matches the acoustic input to a number of signal models, one per source class, and produces a goodness-of-match number for each signal model. The sound source is declared to be of the same class as that of the signal model with the best goodness-of-match if that score is sufficiently high. The data are recorded with a microphone, digitized and transformed into the frequency domain. A signal detector is applied to the transient. A harmonic detection method can be used to determine if the sound source has harmonic characteristics. If at least some part of a transient contains a signal of interest, the spectrum of the signal after rescaling is compared to a set of signal models, and the input signal's parameters are fitted to the data. The average distortion is calculated to compare patterns with those of the sources that were used in training the signal models. Before classification can occur, a source model is trained with signal data. Each signal model is built by creating templates from input signal spectrograms when they are significantly different from existing templates. If an existing template is found that resembles the input pattern, the template is averaged with the pattern in such a way that the resulting template is the average of all the spectra that matched that template in the past.
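The template training and matching steps can be sketched as follows. This is a minimal illustration that assumes mean squared difference as the distortion measure (lower is better, so it plays the role of an inverted goodness-of-match) and treats each spectrum as a plain list of numbers; the class `SignalModel`, the function `classify`, and the thresholds are hypothetical.

```python
def distortion(a, b):
    """Mean squared difference between two equal-length spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


class SignalModel:
    """One model per source class, holding running-average templates."""

    def __init__(self, new_template_threshold):
        self.threshold = new_template_threshold
        self.templates = []                     # each entry: [mean_spectrum, match_count]

    def train(self, spectrum):
        best = min(self.templates,
                   key=lambda t: distortion(t[0], spectrum), default=None)
        if best is None or distortion(best[0], spectrum) > self.threshold:
            self.templates.append([list(spectrum), 1])   # significantly different
            return
        mean, count = best
        # Running mean keeps the template equal to the average of all matches so far.
        best[0] = [(m * count + s) / (count + 1) for m, s in zip(mean, spectrum)]
        best[1] = count + 1

    def score(self, spectrum):
        return min(distortion(t[0], spectrum) for t in self.templates)


def classify(models, spectrum, max_distortion):
    """Return the label of the best-matching model, or None if no match is good enough."""
    label, model = min(models.items(), key=lambda kv: kv[1].score(spectrum))
    return label if model.score(spectrum) <= max_distortion else None


models = {"a": SignalModel(0.1), "b": SignalModel(0.1)}
models["a"].train([1.0, 0.0, 0.0])
models["a"].train([1.2, 0.0, 0.0])              # close enough to merge into one template
models["b"].train([0.0, 1.0, 0.0])
match = classify(models, [1.05, 0.0, 0.0], max_distortion=0.5)
reject = classify(models, [5.0, 5.0, 5.0], max_distortion=0.5)
```

Here `match` resolves to class "a", while `reject` is `None` because neither model reaches the required match quality.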

    EMPIRICAL MODE DECOMPOSITION FOR ANALYZING ACOUSTICAL SIGNALS
    76.
    Invention application
    Status: under examination (published)

    Publication number: WO02065157A2

    Publication date: 2002-08-22

    Application number: PCT/US0201250

    Filing date: 2002-02-13

    Inventor: HUANG NORDEN E

    Abstract: The present invention discloses a computer (410) implemented signal analysis method through the Hilbert-Huang Transformation "HHT" for analyzing acoustical signals (10), which are assumed to be nonlinear and nonstationary. The Empirical Mode Decomposition "EMD" and the Hilbert Spectral Analysis "HSA" are used to obtain the HHT. Essentially, the acoustical signal will be decomposed into the Intrinsic Mode Function Components "IMFs". Once the invention decomposes the acoustic signal into its constituent components, all operations such as analyzing, identifying, and removing unwanted signals can be performed on these components. Upon transforming the IMFs into a Hilbert spectrum, the acoustical signal may be compared with other acoustical signals.

    METHOD AND APPARATUS FOR ROBUST SPEECH CLASSIFICATION
    77.
    Invention application
    Status: under examination (published)

    Publication number: WO0247068A3

    Publication date: 2002-08-22

    Application number: PCT/US0146971

    Filing date: 2001-12-04

    Applicant: QUALCOMM INC

    Inventor: HUANG PENGJUN

    CPC classification number: G10L25/93 G10L19/025 G10L19/22 G10L25/78

    Abstract: A speech classification technique (502-530) for robust classification of varying modes of speech to enable maximum performance of multi-mode variable bit rate encoding techniques. A speech classifier accurately classifies a high percentage of speech segments for encoding at minimal bit rates, meeting lower bit rate requirements. Highly accurate speech classification produces a lower average encoded bit rate, and higher quality decoded speech. The speech classifier considers a maximum number of parameters for each frame of speech, producing numerous and accurate speech mode classifications for each frame. The speech classifier correctly classifies numerous modes of speech under varying environmental conditions. The speech classifier inputs classification parameters from external components, generates internal classification parameters from the input parameters, sets a Normalized Auto-correlation Coefficient Function threshold and selects a parameter analyzer according to the signal environment, and then analyzes the parameters to produce a speech mode classification.

    METHOD FOR SEARCH IN AN AUDIO DATABASE
    78.
    Invention application
    Status: under examination (published)

    Publication number: WO0211123A3

    Publication date: 2002-05-30

    Application number: PCT/EP0108709

    Filing date: 2001-07-26

    Abstract: A method for recognizing an audio sample locates an audio file that most closely matches the audio sample from a database indexing a large set of original recordings. Each indexed audio file is represented in the database index by a set of landmark timepoints and associated fingerprints. Landmarks occur at reproducible locations within the file, while fingerprints represent features of the signal at or near the landmark timepoints. To perform recognition, landmarks and fingerprints are computed for the unknown sample and used to retrieve matching fingerprints from the database. For each file containing matching fingerprints, the landmarks are compared with landmarks of the sample at which the same fingerprints were computed. If a large number of corresponding landmarks are linearly related, i.e., if equivalent fingerprints of the sample and retrieved file have the same time evolution, then the file is identified with the sample. The method can be used for any type of sound or music, and is particularly effective for audio signals subject to linear and nonlinear distortion such as background noise, compression artifacts, or transmission dropouts. The sample can be identified in a time proportional to the logarithm of the number of entries in the database; given sufficient computational power, recognition can be performed in nearly real time as the sound is being sampled.
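The landmark comparison can be sketched as a vote over time offsets. This illustration assumes the simplest linear relation, a constant offset between file and sample landmarks (no time stretching), and uses opaque strings as fingerprints; `build_index`, `identify`, the `min_votes` threshold, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(files):
    """Map each fingerprint to the (file_id, landmark_time) pairs where it occurs."""
    index = defaultdict(list)
    for file_id, entries in files.items():
        for landmark_time, fingerprint in entries:
            index[fingerprint].append((file_id, landmark_time))
    return index


def identify(index, sample, min_votes=3):
    """Return (file_id, offset) with the most consistent landmark alignment, or None."""
    votes = Counter()
    for sample_time, fingerprint in sample:
        for file_id, file_time in index.get(fingerprint, ()):
            # Matching landmarks from the true source share one common offset.
            votes[(file_id, file_time - sample_time)] += 1
    if not votes:
        return None
    best, count = votes.most_common(1)[0]
    return best if count >= min_votes else None


files = {
    "song_a": [(0, "f1"), (5, "f2"), (9, "f3"), (14, "f4"), (20, "f5")],
    "song_b": [(1, "f3"), (4, "f9"), (8, "f1"), (15, "f2")],
}
index = build_index(files)
sample = [(0, "f2"), (4, "f3"), (9, "f4"), (15, "f5")]
result = identify(index, sample)
```

Stray fingerprint matches in `song_b` land on scattered offsets, so only `song_a` accumulates enough votes at a single offset.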

    SPEECH RECOGNITION SYSTEM AND METHOD
    79.
    Invention application
    Status: under examination (published)

    Publication number: WO02023525A1

    Publication date: 2002-03-21

    Application number: PCT/NZ2001/000192

    Filing date: 2001-09-17

    CPC classification number: G10L15/197 G10L15/142

    Abstract: The invention provides a method of speech recognition comprising the steps of receiving a signal comprising one or more spoken words, extracting a spoken word from the signal using a Hidden Markov Model, passing the spoken word to a plurality of word models, one or more of the word models based on a Hidden Markov Model, determining the word model most likely to represent the spoken word, and outputting the word model representing the spoken word. The invention also provides a related speech recognition system and a speech recognition computer program.
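The step of determining the most likely word model can be illustrated with the standard forward algorithm for discrete Hidden Markov Models. The sketch assumes the spoken word has already been extracted and quantized into a short symbol sequence; the two toy word models, their probabilities, and the names `forward_likelihood` and `recognize` are hypothetical.

```python
def forward_likelihood(observations, start, transition, emission):
    """Probability of an observation sequence under a discrete HMM (forward algorithm)."""
    states = range(len(start))
    alpha = [start[s] * emission[s][observations[0]] for s in states]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in states) * emission[s][symbol]
            for s in states
        ]
    return sum(alpha)


def recognize(observations, word_models):
    """Return the name of the word model that best explains the observations."""
    return max(
        word_models,
        key=lambda word: forward_likelihood(observations, *word_models[word]),
    )


# Two-state left-to-right models over a two-symbol alphabet {0, 1}.
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
word_models = {
    "yes": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]]),  # mostly 0s, then 1s
    "no":  ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),  # mostly 1s, then 0s
}
first = recognize([0, 0, 1, 1], word_models)
second = recognize([1, 1, 0], word_models)
```

For sequences of realistic length the products underflow, so practical implementations usually work with log probabilities or rescale `alpha` at each step.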

    METHODS AND APPARATUSES FOR SIGNAL ANALYSIS
    80.
    Invention application
    Status: under examination (published)

    Publication number: WO01033547A1

    Publication date: 2001-05-10

    Application number: PCT/NL2000/000808

    Filing date: 2000-11-06

    CPC classification number: G10L25/90 G01R23/175 G10L21/0208

    Abstract: A basilar membrane model is used to receive an input signal including a target signal in step I. In successive further steps the target signal is filtered from the input signal. After the filtering, the target signal can be used as an input for further processing, such as signal recognition or data compression. The target signal can also be applied to a substantially reverse method to obtain an improved or clean signal.