Method and apparatus for speech reconstruction in a distributed speech recognition system
    31.
    Granted invention patent
    Legal status: in force

    Publication No.: US06633839B2

    Publication date: 2003-10-14

    Application No.: US09775951

    Filing date: 2001-02-02

    IPC classification: G10L15/00

    Abstract: In a distributed speech recognition system comprising a first communication device which receives a speech input (34), encodes data representative of the speech input, and transmits the encoded data, and a second remotely located communication device which receives the encoded data and compares the encoded data with a known data set, the device including a processor with a program which controls the processor to operate according to a method of reconstructing the speech input including the step of receiving encoded data including encoded spectral data and encoded energy data. The method further includes the step of decoding the encoded spectral data and encoded energy data to determine the spectral data and energy data. The method also includes the step of combining the spectral data and energy data to reconstruct the speech input.
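The abstract does not say how the decoded spectral data and energy data are combined. One plausible reading is that the spectral data gives the shape of a frame and the energy data gives its scale. The sketch below illustrates only that reading: it builds a zero-phase cosine synthesis from decoded magnitudes and rescales it so the frame energy matches the decoded energy. The function name `reconstruct_frame`, the magnitude-per-bin representation, and the zero-phase synthesis are all assumptions for illustration and are not taken from the patent.

```python
import math


def reconstruct_frame(magnitudes, energy, frame_length):
    """Hypothetical combination of decoded spectral and energy data.

    magnitudes   -- decoded magnitude for frequency bins 0, 1, 2, ... (assumed format)
    energy       -- decoded frame energy, i.e. the target sum of squared samples
    frame_length -- number of samples to synthesize
    """
    # Spectral data fixes the waveform shape: a sum of zero-phase cosines.
    shape = []
    for n in range(frame_length):
        sample = sum(
            m * math.cos(2.0 * math.pi * k * n / frame_length)
            for k, m in enumerate(magnitudes)
        )
        shape.append(sample)

    shape_energy = sum(x * x for x in shape)
    if shape_energy == 0.0:
        # Nothing to scale: return silence.
        return [0.0] * frame_length

    # Energy data fixes the scale: choose the gain that hits the target energy.
    gain = math.sqrt(energy / shape_energy)
    return [gain * x for x in shape]
```

Whatever the magnitudes are, the returned frame has the requested energy, which is the only property this sketch is meant to show.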


Speech encoder using a soft interpolation decision for spectral parameters
    33.
    Granted invention patent
    Legal status: expired

    Publication No.: US5265219A

    Publication date: 1993-11-23

    Application No.: US944855

    Filing date: 1992-09-14

    IPC classification: G01L9/02 G10L19/00 G10L19/02

    CPC classification: G10L19/02

    Abstract: A speech encoder uses a soft interpolation decision for spectral parameters. For each frame, the encoder first calculates the residual energy for interpolated spectral parameters, and then calculates the residual energy for non-interpolated spectral parameters. The encoder then compares these residual energy calculations. If the encoder determines that the interpolated spectral parameters yield the lowest residual energy, it indicates to a far-end decoder to use the interpolated values for the current frame. Otherwise, it indicates to the far-end decoder to use the non-interpolated values for the current frame. The encoder signals the far-end decoder as to which spectral parameters (interpolated or non-interpolated values) to use by encoding and transmitting a special signalling bit.
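The per-frame decision described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes the spectral parameters are direct-form linear prediction coefficients, that the interpolated set is the midpoint of the previous and following frames' coefficients, and that ties go to the non-interpolated set. The abstract specifies none of these details, and the function names are hypothetical.

```python
def residual_energy(frame, lpc, history):
    """Energy of the prediction error e[n] = s[n] - sum_k lpc[k] * s[n-1-k]."""
    order = len(lpc)
    # Left-pad with zeros so every frame sample has `order` predecessors.
    past = ([0.0] * order + list(history))[-order:] if order else []
    samples = past + list(frame)
    energy = 0.0
    for n in range(order, len(samples)):
        prediction = sum(lpc[k] * samples[n - 1 - k] for k in range(order))
        error = samples[n] - prediction
        energy += error * error
    return energy


def choose_spectral_parameters(frame, history, direct, previous, following):
    """Return (signalling_bit, parameters) for the current frame.

    direct    -- non-interpolated coefficients for this frame
    previous  -- coefficients of the preceding frame (assumed interpolation input)
    following -- coefficients of the next frame (assumed interpolation input)
    """
    # Assumed interpolation rule: element-wise midpoint of the neighbours.
    interpolated = [(p + f) / 2.0 for p, f in zip(previous, following)]

    e_interpolated = residual_energy(frame, interpolated, history)
    e_direct = residual_energy(frame, direct, history)

    # Bit 1 tells the far-end decoder to use interpolated values, bit 0 otherwise.
    if e_interpolated < e_direct:
        return 1, interpolated
    return 0, direct
```

For a frame that decays by a factor of 0.9 per sample, neighbours with coefficients 0.8 and 1.0 interpolate to 0.9 and win over a direct coefficient of 0.5, so the signalling bit is 1.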


    Delta-coded lag information for use in a speech coder
    34.
    Granted invention patent
    Legal status: expired

    Publication No.: US5253269A

    Publication date: 1993-10-12

    Application No.: US755265

    Filing date: 1991-09-05

    IPC classification: G10L19/06 H04B14/06

    CPC classification: H04B14/06 G10L19/06

    Abstract: Lag information for use in a speech coder is developed by estimating lag values for the various subframes (201) of a speech coding frame (200) of information, and by then selecting lag values for each subframe that are both closely corresponding to the estimated lag values and that also observe the restrictions of a selected delta-coding routine. When a plurality of candidate sets of such information have been developed, they are compared against one another to identify that set which appears to provide the best set of lag values. This information is then available for framing and transmission. In one embodiment, the sets of candidate values are also selected to ensure provision for subsequent adjustment in either a positive or negative direction. With this adjustability capability so provided, closed-loop adjustments can be made with respect to the selected values to ensure that the ultimately transmitted coding for the lag value most closely corresponds to an ultimate output that most closely represents the speech signal to be represented.
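A rough sketch of the candidate-set idea in this abstract follows. It assumes the first subframe lag is sent as an absolute value and each later lag as a difference from the previous one, limited to a fixed range. Several candidate sets are built from different first-subframe lags, and the set with the smallest total deviation from the estimates is kept. The delta range, lag limits, search width, and deviation measure are hypothetical, since the abstract does not state them. The provision for later closed-loop adjustment is not modelled.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def build_candidate(estimates, first_lag, min_delta, max_delta, lag_min, lag_max):
    """One candidate set: follow the estimates as closely as the delta limits allow."""
    lags = [clamp(first_lag, lag_min, lag_max)]
    for estimate in estimates[1:]:
        # Each lag must stay within the delta range of the previous lag.
        low = max(lag_min, lags[-1] + min_delta)
        high = min(lag_max, lags[-1] + max_delta)
        lags.append(clamp(estimate, low, high))
    return lags


def total_deviation(lags, estimates):
    """Assumed comparison measure: summed absolute distance to the estimates."""
    return sum(abs(lag - est) for lag, est in zip(lags, estimates))


def select_delta_coded_lags(estimates, min_delta=-8, max_delta=7,
                            lag_min=20, lag_max=147, search=8):
    """Compare candidate sets and return (lags, deltas) for the best one.

    Assumes min_delta <= 0 <= max_delta so every subframe has a valid choice.
    """
    best_cost, best_lags = None, None
    for first_lag in range(estimates[0] - search, estimates[0] + search + 1):
        candidate = build_candidate(estimates, first_lag, min_delta, max_delta,
                                    lag_min, lag_max)
        cost = total_deviation(candidate, estimates)
        if best_cost is None or cost < best_cost:
            best_cost, best_lags = cost, candidate
    deltas = [b - a for a, b in zip(best_lags, best_lags[1:])]
    return best_lags, deltas
```

When the estimates jump by more than the delta range allows, shifting the first subframe lag away from its own estimate can lower the total deviation, which is why more than one candidate set is worth comparing.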
