-
Publication number: US20160203825A1
Publication date: 2016-07-14
Application number: US15079524
Filing date: 2016-03-24
Inventor: Takuya KAWASHIMA, Masahiro OSHIKIRI
CPC classification number: G10L19/0204, G10L19/0208, G10L19/0212, G10L21/0388, G10L25/06
Abstract: A threshold amplitude is calculated for each subband obtained by splitting an extension band. For each subband, an amplitude of transform coefficients is compared with the threshold amplitude to extract a transform coefficient having an amplitude larger than the threshold amplitude as a representative transform coefficient. When a number of the extracted representative transform coefficients is less than a predetermined number, the threshold amplitude is updated in accordance with an amount by which the number of the representative transform coefficients is less than the predetermined number. A transform coefficient is extracted again using the updated threshold amplitude. For each of the subbands, a value of correlation is calculated between the representative transform coefficient and a normalized core encoded low-band transform coefficient. A subband having a largest value of correlation is selected when the number of the extracted representative transform coefficients reaches the predetermined number.
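The extraction step in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the initial threshold (mean amplitude), the update rule (dividing by `1 + step * shortfall`), and the names `extract_representatives`, `sparse_correlation`, `step`, and `max_iter` are all assumptions, since the abstract only states that the threshold is updated according to how many representative coefficients are missing.

```python
def extract_representatives(coeffs, min_count, step=0.1, max_iter=50):
    """Pick indices of coefficients whose amplitude exceeds an adaptive threshold.

    Assumptions (not from the abstract): the threshold starts at the mean
    amplitude and is lowered by a factor that grows with the shortfall.
    """
    amps = [abs(c) for c in coeffs]
    threshold = sum(amps) / len(amps)
    picked = [i for i, a in enumerate(amps) if a > threshold]
    for _ in range(max_iter):
        shortfall = min_count - len(picked)
        if shortfall <= 0:
            break
        # The further we are from min_count, the more the threshold drops.
        threshold /= 1.0 + step * shortfall
        picked = [i for i, a in enumerate(amps) if a > threshold]
    return picked, threshold


def sparse_correlation(picked, coeffs, low_band):
    """Correlate only the representative coefficients with a low-band candidate."""
    return sum(coeffs[i] * low_band[i] for i in picked)
```

Restricting the correlation to the representative positions is what keeps the subsequent search cheap: only a handful of products are summed per candidate instead of one per coefficient.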
-
Publication number: US20180158466A1
Publication date: 2018-06-07
Application number: US15843842
Filing date: 2017-12-15
Inventor: Takuya KAWASHIMA, Katsunori DAIMOU, Masahiro OSHIKIRI
IPC: G10L19/26, G10L19/02, G10L21/0388
CPC classification number: G10L19/265, G10L19/0204, G10L21/0388
Abstract: A coding apparatus, including a processor that performs operations including encoding a first band of an input audio signal to be a first spectrum, dividing the first spectrum into a plurality of subbands, at equal intervals each including a predetermined number of samples for flattening the first spectrum, searching a largest amplitude value of the divided first spectrum in each of the subbands, normalizing the divided first spectrum with the largest amplitude values searched in each of the subbands, searching best bands among each normalized divided first spectrum which has a largest correlation value between each divided band of a second band spectrum and each normalized divided first spectrum, the second spectrum being higher than a predetermined frequency, and encoding the second spectrum using lag information identifying the best bands for transmitting the lag information to a decoder side.
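A rough sketch of the two operations named in this abstract, peak normalization per subband and a lag search by correlation, might look like this. The function names, the plain-list representation, and the use of an unnormalized dot product as the correlation value are assumptions made for illustration; the abstract does not specify them.

```python
def normalize_by_subband_peak(spectrum, width):
    """Flatten a spectrum by dividing each equal-width subband by its largest amplitude."""
    out = []
    for start in range(0, len(spectrum), width):
        band = spectrum[start:start + width]
        peak = max(abs(x) for x in band)
        # An all-zero subband is left as zeros to avoid dividing by zero.
        out.extend(x / peak if peak > 0 else 0.0 for x in band)
    return out


def best_lag(low_norm, high_band):
    """Return the offset into low_norm whose window correlates best with high_band."""
    n = len(high_band)
    best, best_corr = 0, float("-inf")
    for lag in range(len(low_norm) - n + 1):
        corr = sum(low_norm[lag + k] * high_band[k] for k in range(n))
        if corr > best_corr:
            best, best_corr = lag, corr
    return best
```

In this sketch the returned offset plays the role of the lag information: a decoder that holds the same normalized low-band spectrum could copy the window at that offset to rebuild the shape of the high band.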
-
Publication number: US20170337931A1
Publication date: 2017-11-23
Application number: US15646645
Filing date: 2017-07-11
Inventor: Takuya KAWASHIMA, Katsunori DAIMOU, Masahiro OSHIKIRI
IPC: G10L19/26, G10L21/0388, G10L19/02
CPC classification number: G10L19/265, G10L19/0204, G10L21/0388
Abstract: A coding apparatus encodes a first band of an input audio signal, normalizes a first spectrum included in each sub-band of the first band using a spectrum power envelope, and performs a clipping process on the normalized first spectrum. The clipping process compares the absolute value of an amplitude of the spectrum with a predetermined threshold and replaces the amplitude value with the threshold if the absolute value exceeds the threshold. The apparatus then calculates a correlation between a spectrum in each divided band of a second band and a spectrum in a plurality of candidate bands containing the clipped normalized first spectrum, the second spectrum being higher than a predetermined frequency, identifies the best bands of the plurality of candidate bands, and encodes the second spectrum using lag information identifying the best band for transmitting the lag information to a decoder.
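The normalize-then-clip step described here can be illustrated with a short sketch. Two details are assumptions rather than statements from the abstract: the power envelope is approximated by the RMS of each subband, and clipping preserves the sign of the coefficient. The function names are hypothetical.

```python
import math


def normalize_by_envelope(spectrum, width):
    """Divide each subband by its RMS value (assumed stand-in for the power envelope)."""
    out = []
    for start in range(0, len(spectrum), width):
        band = spectrum[start:start + width]
        rms = math.sqrt(sum(x * x for x in band) / len(band))
        out.extend(x / rms if rms > 0 else 0.0 for x in band)
    return out


def clip_spectrum(spectrum, threshold):
    """Limit each amplitude to +/- threshold, keeping the sign (an assumption)."""
    return [max(-threshold, min(threshold, x)) for x in spectrum]
```

Clipping after normalization caps isolated large peaks so that a single strong coefficient cannot dominate the correlation computed against the high-band spectrum.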
-
Publication number: US20170243594A1
Publication date: 2017-08-24
Application number: US15590360
Filing date: 2017-05-09
Inventor: Takuya KAWASHIMA, Masahiro OSHIKIRI
IPC: G10L19/02, G10L19/032, G10L19/002
CPC classification number: G10L19/0204, G10L19/002, G10L19/02, G10L19/0212, G10L19/032, G10L19/24, G10L21/038
Abstract: A speech/audio coding apparatus is provided that includes a receiver that receives a time-domain speech input signal and a processor. The processor transforms a time-domain speech input signal into a frequency-domain spectrum, and divides a frequency region of the spectrum in an extended band into a plurality of bands. The processor also sets a limited band for each divided band in the current frame, when a difference between a first frequency with a first maximum amplitude in a spectrum of the divided band in a preceding frame and a second frequency with a second maximum amplitude in a spectrum of the divided band in a current frame is below a threshold. The processor further encodes the spectrum in the limited band within each divided band in the current frame, and does not encode a spectrum outside the limited band within each divided band in the current frame.
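The frame-to-frame decision in this abstract can be sketched as below. The abstract only says that a limited band is set when the peak frequency moves by less than a threshold between frames; centering the window on the preceding frame's peak, the names `peak_bin` and `limited_band`, and the parameters `max_shift` and `width` are assumptions for illustration.

```python
def peak_bin(band):
    """Index of the coefficient with the largest absolute amplitude."""
    return max(range(len(band)), key=lambda i: abs(band[i]))


def limited_band(prev_band, cur_band, max_shift, width):
    """Return (start, end) of the limited band, or None to encode the whole band.

    Assumes width <= len(cur_band). The window is centered on the
    preceding frame's peak, which is an illustrative choice.
    """
    prev_peak = peak_bin(prev_band)
    cur_peak = peak_bin(cur_band)
    if abs(cur_peak - prev_peak) >= max_shift:
        return None
    start = max(0, min(prev_peak - width // 2, len(cur_band) - width))
    return start, start + width
```

When the function returns a range, only the coefficients inside it would be encoded for that divided band; a `None` result corresponds to the case where the peak has moved too far and no limited band is set.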
-
Publication number: US20160293178A1
Publication date: 2016-10-06
Application number: US15168805
Filing date: 2016-05-31
Inventor: Takuya KAWASHIMA, Katsunori DAIMOU, Masahiro OSHIKIRI
IPC: G10L19/26, G10L19/02, G10L21/0388
CPC classification number: G10L19/265, G10L19/0204, G10L21/0388
Abstract: A coding apparatus normalizes a low-frequency spectrum included in each of sub-bands obtained from dividing a low band part, using a largest amplitude value among the low-frequency spectrum included in each sub-band, obtains a normalized low-frequency spectrum by decoding the first encoded data, and calculates a correlation between each divided band of a high-frequency spectrum and a plurality of candidate bands of the normalized low-frequency spectrum. The best bands of a plurality of candidate bands are identified, each candidate band having a starting frequency position with non-zero amplitude in the normalized low-frequency spectrum, the high-frequency spectrum being in a high band part of the input audio signal that is higher than the predetermined frequency, and the high-frequency spectrum is encoded using lag information identifying the best band for transmitting the lag information to a decoder.
-
Publication number: US20190198035A1
Publication date: 2019-06-27
Application number: US16290321
Filing date: 2019-03-01
Inventor: Takuya KAWASHIMA, Katsunori DAIMOU, Masahiro OSHIKIRI
IPC: G10L19/26, G10L21/0388, G10L19/02
CPC classification number: G10L19/265, G10L19/0204, G10L21/0388
Abstract: A coding apparatus includes a processor and a memory that stores instructions which, when executed, cause the processor to perform operations, including encoding a first band of an input audio signal to be a first spectrum and dividing the first spectrum into a plurality of sub-bands. The operations also include searching a largest amplitude value of the divided first spectrum in each of the plurality of sub-bands, and normalizing the divided first spectrum in each of the plurality of sub-bands. The operations further include emphasizing a harmonic structure in the normalized first spectrum, searching a best band that has a largest correlation value between each divided band of a second band spectrum and the emphasized first spectrum in which the harmonic structure is emphasized, and encoding the second band spectrum using lag information identifying the best band and transmitting the lag information to a decoder side.
-
Publication number: US20190147897A1
Publication date: 2019-05-16
Application number: US16243588
Filing date: 2019-01-09
Inventor: Takuya KAWASHIMA, Masahiro OSHIKIRI
IPC: G10L19/02, G10L19/002, G10L19/032
Abstract: A speech/audio coding apparatus includes a receiver that receives a time-domain speech input signal. The apparatus also includes a processor that transforms a time-domain speech input signal into a frequency-domain spectrum, and divides a frequency region of the spectrum in an extended band into a plurality of bands. The processor sets a limited band for each divided band in the current frame, a width of the limited band in the current frame being narrower than the divided band and the limited band including a first frequency. The processor further encodes the spectrum in the limited band within each divided band in the current frame, wherein the width of the limited band is predetermined and is set to 31.
-
Publication number: US20240127830A1
Publication date: 2024-04-18
Application number: US18276752
Filing date: 2021-10-15
Inventor: Yuichi KAMIYA, Takuya KAWASHIMA, Akira HARADA, Hiroyuki EHARA
IPC: G10L19/008
CPC classification number: G10L19/008
Abstract: This encoding device comprises: a downmix circuit that switches mixing processing according to the characteristic of an input stereo signal to generate either a first stereo signal or a second stereo signal obtained by mixing processing of a left channel signal and a right channel signal; a first encoding circuit that encodes the first stereo signal; and a second encoding circuit that encodes two signals included in the second stereo signal. The second encoding circuit performs monaural encoding on the basis of the encoding mode of the first encoding circuit in a first section in which switching from the first stereo signal to the second stereo signal is performed and/or a second section in which switching from the second stereo signal to the first stereo signal is performed.
-
Publication number: US20170076728A1
Publication date: 2017-03-16
Application number: US15358184
Filing date: 2016-11-22
Inventor: Takuya KAWASHIMA, Masahiro OSHIKIRI
IPC: G10L19/002, G10L19/035, G10L19/02
CPC classification number: G10L19/002, G10L19/0208, G10L19/035, G10L19/06, G10L19/12
Abstract: A speech/audio encoding device for selectively allocating bits for higher precision encoding. The speech/audio encoding device receives a time-domain speech/audio input signal, transforms the speech/audio input signal into a frequency domain, and quantizes an energy envelope corresponding to an energy level for a frequency spectrum of the speech/audio input signal. The speech/audio encoding device further groups quantized energy envelopes into a plurality of groups, determines a perceptual significant group including one or more significant bands and a local-peak frequency, and allocates bits to a plurality of subbands corresponding to the grouped quantized energy envelopes, in which each of the subbands is obtained by splitting the frequency spectrum of the speech/audio input signal. The speech/audio encoding device encodes the frequency spectrum using the bits allocated to the subbands.
-
Publication number: US20230306978A1
Publication date: 2023-09-28
Application number: US18011390
Filing date: 2021-04-22
Inventor: Yuichi KAMIYA, Takuya KAWASHIMA, Hiroyuki EHARA, Akira HARADA
IPC: G10L19/20, G10L19/02, G10L19/005, G10L19/12, G10L19/008
CPC classification number: G10L19/20, G10L19/0204, G10L19/005, G10L19/12, G10L19/008
Abstract: A coding apparatus includes: a first coding circuit that codes an input signal selectively using coding in a time domain or a frequency domain according to the characteristic of the input signal in a core layer; and a second coding circuit that codes an error in coding by the first coding circuit using a coding method corresponding to the domain type of coding used in the core layer in an extension layer for the core layer.