Feature-domain concatenative speech synthesis
    1.
    Granted invention patent (in force)

    Publication number: US07035791B2

    Publication date: 2006-04-25

    Application number: US09901031

    Filing date: 2001-07-10

    Applicants: Dan Chazan, Ron Hoory

    Inventors: Dan Chazan, Ron Hoory

    IPC classification: G10L11/04

    CPC classification: G10L13/07, G10L25/18

    Abstract: A method for speech synthesis includes receiving an input speech signal containing a set of speech segments, and estimating spectral envelopes of the input speech signal in a succession of time intervals during each of the speech segments. The spectral envelopes are integrated over a plurality of window functions in a frequency domain so as to determine elements of feature vectors corresponding to the speech segments. An output speech signal is reconstructed by concatenating the feature vectors corresponding to a sequence of the speech segments.
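
The integration step in this abstract can be illustrated with a minimal sketch. It assumes triangular, half-overlapping window functions and a uniformly sampled envelope; the abstract does not specify the window shape, so `triangular_window`, `binned_spectrum`, and the `edges` layout are hypothetical choices made for illustration only.

```python
def triangular_window(f, lo, center, hi):
    """Triangular frequency-domain window: 0 at lo and hi, 1 at center."""
    if f <= lo or f >= hi:
        return 0.0
    if f <= center:
        return (f - lo) / (center - lo)
    return (hi - f) / (hi - center)


def binned_spectrum(envelope, freqs, edges):
    """Integrate a sampled spectral envelope against overlapping windows.

    envelope[i] is the envelope value at freqs[i] (uniformly spaced, in Hz).
    Window k spans edges[k]..edges[k + 2] and peaks at edges[k + 1].
    The window layout is an assumption, not taken from the patent.
    """
    df = freqs[1] - freqs[0]
    features = []
    for k in range(len(edges) - 2):
        lo, center, hi = edges[k], edges[k + 1], edges[k + 2]
        # Rectangle-rule approximation of the integral of envelope * window.
        total = sum(e * triangular_window(f, lo, center, hi)
                    for e, f in zip(envelope, freqs))
        features.append(total * df)
    return features


freqs = [10.0 * i for i in range(401)]   # 0..4000 Hz in 10 Hz steps
envelope = [1.0] * len(freqs)            # flat envelope, for illustration
edges = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
features = binned_spectrum(envelope, freqs, edges)
```

Each element of `features` is one component of a feature vector for the frame. With a flat unit envelope, every window here integrates to roughly the area of its triangle.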

    Fast frequency-domain pitch estimation
    2.
    Granted invention patent (in force)

    Publication number: US06587816B1

    Publication date: 2003-07-01

    Application number: US09617582

    Filing date: 2000-07-14

    IPC classification: G10L11/04

    CPC classification: G10L25/90

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.
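
A minimal sketch of the final two steps follows, assuming the line spectrum has already been extracted. The abstract does not give the form of the utility function; the amplitude-weighted cosine below is a stand-in chosen only because it is periodic in the line frequency with period equal to the candidate pitch. The names `utility` and `estimate_pitch` are hypothetical.

```python
import math


def utility(lines, f0):
    """Score how well a line spectrum fits the candidate pitch f0.

    lines is a list of (amplitude, frequency_hz) pairs. The cosine term has
    period f0 in the line frequency, so it peaks when a line sits on a
    multiple of f0. This particular function is an illustrative assumption.
    """
    return sum(a * math.cos(2.0 * math.pi * f / f0) for a, f in lines)


def estimate_pitch(lines, f_min, f_max, step=1.0):
    """Return the candidate in [f_min, f_max] with the highest utility."""
    best_f0, best_u = f_min, utility(lines, f_min)
    n = int(round((f_max - f_min) / step))
    for i in range(1, n + 1):
        f0 = f_min + i * step
        u = utility(lines, f0)
        if u > best_u:
            best_f0, best_u = f0, u
    return best_f0


# Three harmonics of a 200 Hz tone with decaying amplitudes.
lines = [(1.0, 200.0), (0.6, 400.0), (0.3, 600.0)]
pitch = estimate_pitch(lines, 150.0, 400.0)
```

Restricting the search range matters in this sketch: sub-multiples of the true pitch (100 Hz here) score equally well with a purely periodic function, so they have to be excluded by the range or penalised separately.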

    Method and system for speech reconstruction from speech recognition features, pitch and voicing with resampled basis functions providing reconstruction of the spectral envelope
    4.
    Granted invention patent (in force)

    Publication number: US06725190B1

    Publication date: 2004-04-20

    Application number: US09432081

    Filing date: 1999-11-02

    IPC classification: G10L19/02

    CPC classification: G10L13/07, G10L25/18

    Abstract: A speech reconstruction method and system for converting a series of binned spectra or functions thereof, such as the Mel Frequency Cepstra Coefficients (MFCC), of an original digitized speech signal, into a reconstructed speech signal, where each binned spectrum has a respective pitch value and voicing decision. The binned spectra are derived from the original digitized speech signal at successive instances by multiplying each estimate of the spectral envelope by a predetermined set of frequency domain window functions and computing the integrals thereof. At each respective time instance, harmonic frequencies and weights are generated according to the respective pitch value and voicing decision. Basis functions having bounded supports on the frequency axis are each sampled at all said harmonic frequencies that lie within their support, and multiplied by respective harmonic weights. The sampled basis functions are combined with respective phases, generated according to the pitch value, voicing decision and possibly the binned spectrum, resulting in a complex line spectrum corresponding to each basis function. Coefficients of the basis functions are generated, and each of the points of the respective complex line spectra is multiplied by the respective basis function coefficient. The complex line spectra are summed up to generate for each time instance a single complex line spectrum with values for all harmonic frequencies. A time signal is generated from complex line spectra computed at successive instances of time.
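
The sampling-and-summing step can be sketched for a single frame as follows. This is an illustration under stated assumptions, not the patented procedure: the basis functions are taken to be triangles, all harmonic weights are set to 1, phases default to zero, and the coefficients are supplied directly rather than derived from binned spectra. `tri` and `line_spectrum` are hypothetical names.

```python
import cmath


def tri(f, lo, center, hi):
    """Triangular basis function supported on (lo, hi), peaking at center."""
    if f <= lo or f >= hi:
        return 0.0
    if f <= center:
        return (f - lo) / (center - lo)
    return (hi - f) / (hi - center)


def line_spectrum(pitch_hz, coeffs, edges, phases=None):
    """Return [(harmonic_frequency, complex_value), ...] for one frame.

    Basis function k spans edges[k]..edges[k + 2]. Harmonic weights are
    taken as 1 and phases default to zero; both are simplifications.
    """
    n_harmonics = int(edges[-1] // pitch_hz)
    spectrum = []
    for h in range(1, n_harmonics + 1):
        f = h * pitch_hz
        phase = phases[h - 1] if phases else 0.0
        value = 0j
        for k, c in enumerate(coeffs):
            # Sample basis k at this harmonic, attach the phase, scale by
            # its coefficient, and accumulate across basis functions.
            sample = tri(f, edges[k], edges[k + 1], edges[k + 2])
            value += c * sample * cmath.rect(1.0, phase)
        spectrum.append((f, value))
    return spectrum


frame = line_spectrum(250.0, coeffs=[2.0, 4.0],
                      edges=[0.0, 1000.0, 2000.0, 3000.0])
```

Because each basis function has bounded support, a harmonic only picks up contributions from the one or two basis functions that overlap it, which keeps the per-frame cost proportional to the number of harmonics.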

    Method and system for low bit rate speech coding with speech recognition features and pitch providing reconstruction of the spectral envelope
    6.
    Granted invention patent (in force)

    Publication number: US06678655B2

    Publication date: 2004-01-13

    Application number: US10291590

    Filing date: 2002-11-12

    IPC classification: G10L19/12

    CPC classification: G10L19/02, G10L15/02

    Abstract: A method for encoding a digitized speech signal so as to generate data capable of being decoded as speech. A digitized speech signal is first converted to a series of feature vectors using, for example, known Mel-frequency Cepstral coefficients (MFCC) techniques. At successive instances of time a respective pitch value of the digitized speech signal is computed, and successive acoustic vectors each containing the respective pitch value and feature vector are compressed so as to derive therefrom a bit stream. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.

    Method for encoding and decoding spectral phase data for speech signals

    Publication number: US07127389B2

    Publication date: 2006-10-24

    Application number: US10243580

    Filing date: 2002-09-13

    Applicants: Dan Chazan, Zvi Kons

    Inventors: Dan Chazan, Zvi Kons

    IPC classification: G10L11/04

    CPC classification: G10L25/90

    Abstract: A speech decoder and a segment aligner are provided in the present invention. The speech decoder may include a spectrum reconstructor operative to reconstruct the spectrum of a speech segment from the amplitude envelope of the spectrum of said speech segment and pitch information, and a phase combiner operative to reconstruct the complex spectrum of the speech segment from the reconstructed spectrum, phase information describing the speech segment, and pitch information describing the speech segment. The speech decoder may further include a delay operative to store a complex spectrum of a previous speech segment; and a segment aligner operative to determine the relative offset between the complex spectrum of the speech segment and the complex spectrum of the previous speech segment, to align the position of the first pitch excitation of the current speech segment to the last pitch excitation of the previous speech segment, and to apply a time shift and a complex Hilbert filter to said complex spectra, wherein the segment aligner is operative to cross-correlate the complex spectra as $C(\tau) = \sum_{n=0}^{N} F_n \overline{G}_m e^{-2\pi i n \tau}$, with $m = \lfloor n\,p_G/p_F + 0.5 \rfloor$, where $F_n$ and $G_m$ are the computed complex magnitude of the pitch harmonics $n$ and $m$ of the current and previous spectra respectively, and $p_F$ and $p_G$ are their corresponding pitch periods.
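
The cross-correlation in this abstract can be evaluated directly. The sketch below makes several assumptions that the abstract does not state: the offset is read as a fraction of one pitch period, the best offset is taken to be the grid point that maximises the real part of the correlation, and harmonics of the current segment with no counterpart in the previous one are skipped. `cross_correlation` and `best_offset` are hypothetical names.

```python
import cmath
import math


def cross_correlation(F, G, p_F, p_G, tau):
    """Sum over n of F[n] * conj(G[m]) * exp(-2*pi*i*n*tau).

    m = floor(n * p_G / p_F + 0.5) maps harmonic n of the current segment
    to the nearest harmonic of the previous segment.
    """
    total = 0j
    for n, F_n in enumerate(F):
        m = math.floor(n * p_G / p_F + 0.5)
        if m >= len(G):
            break  # assumption: ignore harmonics with no counterpart in G
        total += F_n * G[m].conjugate() * cmath.exp(-2j * math.pi * n * tau)
    return total


def best_offset(F, G, p_F, p_G, steps=100):
    """Grid-search tau in [0, 1) for the largest real correlation."""
    taus = [i / steps for i in range(steps)]
    return max(taus, key=lambda t: cross_correlation(F, G, p_F, p_G, t).real)


# Previous-segment harmonics, and a current segment that is the same
# spectrum shifted by a quarter of a pitch period.
G = [1.0 + 0j, 0.8 + 0.2j, 0.5 - 0.1j, 0.3 + 0.3j]
shift = 0.25
F = [g * cmath.exp(2j * math.pi * n * shift) for n, g in enumerate(G)]
offset = best_offset(F, G, p_F=100.0, p_G=100.0)
```

When the two pitch periods are equal, the index map reduces to `m = n`, and the correlation at the true shift collapses to the sum of the squared harmonic magnitudes, which is why the search recovers the shift in this example.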

    Adaptive noise cancellation device
    8.
    Granted invention patent (no longer in force)

    Publication number: US5568558A

    Publication date: 1996-10-22

    Application number: US164912

    Filing date: 1993-12-03

    Applicants: Dov Ramm, Dan Chazan

    Inventors: Dov Ramm, Dan Chazan

    CPC classification: H03H21/0027, H04B1/123

    Abstract: An adaptive noise cancellation device comprises: convolution logic (10) for convolving the signal from a reference input (x) with a discretized L-tap filter to form a filtered reference signal; logic (20) for subtracting the filtered reference signal from a signal input to form an output signal; logic for generating the filter taps as a linear combination of N basis functions each having a corresponding coefficient C_k; and logic for repeatedly determining the coefficients C_k which minimize the power in the output signal (z), characterized in that N is less than the number of filter taps L and the basis functions have respective values over a portion of finite width, outside of which portion the functions are substantially zero, both in the frequency and time domains; in an embodiment they are Gaussian. A full-duplex speakerphone is disclosed including such a noise cancellation device.

    Method for tracking a pitch signal
    9.
    Granted invention patent (no longer in force)

    Publication number: US07251597B2

    Publication date: 2007-07-31

    Application number: US10331451

    Filing date: 2002-12-27

    Applicant: Dan Chazan

    Inventor: Dan Chazan

    IPC classification: G10L11/04

    CPC classification: G10L25/90, G10L21/013

    Abstract: A method for tracking a pitch signal includes receiving a detected pitch signal that consists of a succession of pitch values and, for each current pitch value in the detected signal, performing the following steps: constructing sub-sequences of consistent pitch values from neighboring pitch values; calculating the significance of the sub-sequences; and selecting a sub-sequence or a collection of consistent sub-sequences with the highest significance. If the current pitch value is not consistent with the sub-sequence with the highest significance, the current pitch value is smoothed by dividing it or multiplying it by an integer value greater than 1, so as to render it consistent with that sub-sequence.
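
The final smoothing step can be sketched in isolation. The sketch assumes that the most significant sub-sequence has already been reduced to a single reference pitch `ref`, that "consistent" means within a relative tolerance of that reference, and that integer factors up to 4 are tried; none of these details come from the abstract. `is_consistent` and `correct_pitch` are hypothetical names, and the sub-sequence construction and significance scoring are not shown.

```python
def is_consistent(pitch, ref, tol=0.15):
    """True if pitch lies within a relative tolerance of the reference."""
    return abs(pitch - ref) <= tol * ref


def correct_pitch(current, ref, max_factor=4, tol=0.15):
    """Divide or multiply an outlier by an integer > 1 to match ref.

    Returns the value unchanged if it is already consistent, or if no
    integer factor up to max_factor makes it consistent.
    """
    if is_consistent(current, ref, tol):
        return current
    candidates = []
    for k in range(2, max_factor + 1):
        candidates.append(current / k)   # undo pitch doubling, tripling, ...
        candidates.append(current * k)   # undo pitch halving, ...
    best = min(candidates, key=lambda p: abs(p - ref))
    return best if is_consistent(best, ref, tol) else current


doubled = correct_pitch(202.0, 100.0)   # detector locked onto 2x the pitch
halved = correct_pitch(50.0, 100.0)     # detector locked onto half the pitch
kept = correct_pitch(105.0, 100.0)      # already consistent
```

Restricting corrections to integer factors targets the typical failure mode of pitch detectors, which tend to lock onto a multiple or sub-multiple of the true pitch rather than an arbitrary wrong value.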

    Audio mixer
    10.
    Granted invention patent (no longer in force)

    Publication number: US06459797B1

    Publication date: 2002-10-01

    Application number: US09053454

    Filing date: 1998-04-01

    IPC classification: H04R5/00

    CPC classification: H04S3/00

    Abstract: An audio mixer system is described for producing coded output in which at least a left audio signal, a right audio signal and a surround audio signal are encoded in two output channels so that the surround signal can be decoded from the difference of the two output channels. The system comprises means for generating position data designating a desired position for a sound source in a 360 degree sound field. Logic is provided for determining the relative volume of the sound source in the left, right and surround audio signals from the position data. A signed continuity factor is maintained so that the sign of the continuity factor is changed in response to the desired position crossing a nominal position of the surround signal in the sound field, and logic is provided for encoding the sound source data into the two output channels in accordance with the determined relative volume of the sound source in at least two of the left, right and surround signals, each multiplied by the continuity factor. This reduces audible artifacts associated with phase discontinuities in the output signals on either side of the surround speaker nominal position.
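
The basic two-channel encoding property described in the first sentence can be illustrated as follows. This is a generic matrix-style sketch, assuming the surround signal is added to one output and subtracted from the other with a gain of 1/sqrt(2); the abstract does not give the mixing coefficients, and the position-dependent volumes and the signed continuity factor are deliberately not modelled. `encode` and `decode_surround` are hypothetical names.

```python
import math

SURROUND_GAIN = 1.0 / math.sqrt(2.0)  # assumed mixing coefficient


def encode(left, right, surround):
    """Fold left, right and surround samples into two output channels."""
    out_a = [lv + SURROUND_GAIN * sv for lv, sv in zip(left, surround)]
    out_b = [rv - SURROUND_GAIN * sv for rv, sv in zip(right, surround)]
    return out_a, out_b


def decode_surround(out_a, out_b):
    """Recover the surround signal from the channel difference.

    The difference equals (left - right) + sqrt(2) * surround, so the
    result is exact only where left and right carry the same content.
    """
    return [(a - b) / math.sqrt(2.0) for a, b in zip(out_a, out_b)]


left = [0.2, -0.1, 0.4]
right = [0.2, -0.1, 0.4]       # identical to left, so it cancels on decode
surround = [0.5, 0.25, -0.3]
out_a, out_b = encode(left, right, surround)
recovered = decode_surround(out_a, out_b)
```

Because the surround component enters the two outputs with opposite signs, a source panned through the surround position changes polarity in one channel, which is the kind of phase discontinuity the continuity factor in the abstract is meant to address.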
