-
Publication No.: US12001808B2
Publication Date: 2024-06-04
Application No.: US17435995
Filing Date: 2021-08-23
Inventors: Yoonjung Choi, Sangha Kim, Hakjung Kim, Yoonjin Yoon, Seokchan Ahn
IPC Classification: G06F40/58, G06F40/44, G10L15/183, G10L21/04, G10L25/87
CPC Classification: G06F40/58, G10L21/04, G10L25/87, G06F40/44, G10L15/183
Abstract: A method is provided. The method includes receiving a speech input in a first language from a first device; obtaining, by using an artificial intelligence (AI) model, an estimated interpretation time that indicates a time expected to be required to interpret the speech input in the first language into a second language; transmitting, based on the estimated interpretation time, interpretation situation information to at least one of the first device or a second device; interpreting the speech input in the first language into the second language; and transmitting, to the second device, a result of the interpreting of the speech input into the second language.
-
Publication No.: US11830507B2
Publication Date: 2023-11-28
Application No.: US17270035
Filing Date: 2019-08-21
Inventors: Arijit Biswas, Harald Mundt
IPC Classification: G10L19/025, G10L19/00, G10L19/16, G10L21/0316, G10L21/04, H03M7/50, H03M7/30
CPC Classification: G10L19/025, G10L19/00, G10L19/167, G10L21/0316, G10L21/04, H03M7/50, H03M7/3059
Abstract: Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A method of processing an audio signal includes the following operations. A system receives an audio signal. The system determines that a first frame of the audio signal includes a sparse transient signal. The system determines that a second frame of the audio signal includes a dense transient signal. The system compresses/expands (compands) the audio signal using a companding rule that applies a first companding exponent to the first frame of the audio signal and applies a second companding exponent to the second frame of the audio signal, each companding exponent being used to derive a respective degree of dynamic range compression and expansion for a corresponding frame. The system then provides the companded audio signal to a downstream device.
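To make the per-frame companding exponent more concrete, the following is a minimal sketch and not the patented method. It assumes a gain rule of the form gain = level^(exponent - 1), where level is the mean absolute amplitude of the frame; that rule, the function names, and the example exponent values are all illustrative assumptions that do not come from the abstract.

```python
# Illustrative sketch only: the gain rule and all names are assumptions.

def mean_abs(frame):
    """Mean absolute amplitude of one frame (assumed level measure)."""
    return sum(abs(x) for x in frame) / len(frame)


def compress(frame, exponent):
    """Scale the frame by level ** (exponent - 1).

    With 0 < exponent < 1, quiet frames (level < 1) are boosted and
    loud frames (level > 1) are attenuated.
    """
    level = mean_abs(frame)
    if level == 0:
        return list(frame)
    gain = level ** (exponent - 1.0)
    return [x * gain for x in frame]


def expand(frame, exponent):
    """Invert compress() using only the compressed frame's level.

    The compressed level equals original_level ** exponent, so the
    inverse gain is compressed_level ** ((1 - exponent) / exponent).
    """
    level = mean_abs(frame)
    if level == 0:
        return list(frame)
    gain = level ** ((1.0 - exponent) / exponent)
    return [x * gain for x in frame]


# Hypothetical exponents for the two frame types named in the abstract.
SPARSE_TRANSIENT_EXPONENT = 0.65
DENSE_TRANSIENT_EXPONENT = 0.5

sparse_frame = [0.5, -0.25, 0.125, 0.0]
dense_frame = [0.4, -0.3, 0.35, -0.45]

restored_sparse = expand(
    compress(sparse_frame, SPARSE_TRANSIENT_EXPONENT), SPARSE_TRANSIENT_EXPONENT
)
restored_dense = expand(
    compress(dense_frame, DENSE_TRANSIENT_EXPONENT), DENSE_TRANSIENT_EXPONENT
)
```

A smaller exponent gives stronger compression under this assumed rule, which is one way a different exponent per frame could yield a different degree of dynamic range compression and expansion.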
-
Publication No.: US20180247642A1
Publication Date: 2018-08-30
Application No.: US15697923
Filing Date: 2017-09-07
Inventors: Hyun Woo KIM, Ho Young JUNG, Jeon Gue PARK, Yun Keun LEE
CPC Classification: G10L15/16, G06N3/08, G06N3/084, G10L15/02, G10L15/04, G10L21/04, G10L25/84, G10L2015/025, G10L2015/027
Abstract: The present invention relates to a method and apparatus for improving spontaneous speech recognition performance. The present invention is directed to providing a method and apparatus for improving spontaneous speech recognition performance by extracting a phase feature as well as a magnitude feature of a voice signal transformed to the frequency domain, detecting a syllabic nucleus on the basis of a deep neural network using a multi-frame output, determining a speaking rate by dividing the number of syllabic nuclei by a voice section interval detected by a voice detector, calculating a length variation or an overlap factor according to the speaking rate, and performing cepstrum length normalization or time scale modification with a voice length appropriate for an acoustic model.
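The speaking-rate step in this abstract is a simple division, which the sketch below illustrates. The abstract does not say how the length variation is derived from the rate, so the `length_variation` function and its `reference_rate` parameter are hypothetical: they assume the goal is to stretch or shrink the utterance toward a rate the acoustic model was trained on.

```python
# Illustrative sketch; the length_variation rule is an assumption.

def speaking_rate(num_syllabic_nuclei, voiced_seconds):
    """Syllabic nuclei per second of detected voice, as in the abstract."""
    if voiced_seconds <= 0:
        raise ValueError("voiced_seconds must be positive")
    return num_syllabic_nuclei / voiced_seconds


def length_variation(rate, reference_rate):
    """Hypothetical time-scale factor toward a reference speaking rate.

    A value above 1 means the utterance is faster than the reference
    and would be lengthened; below 1 means it would be shortened.
    """
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return rate / reference_rate


rate = speaking_rate(num_syllabic_nuclei=12, voiced_seconds=2.0)
factor = length_variation(rate, reference_rate=4.0)
```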
-
Publication No.: US10014001B2
Publication Date: 2018-07-03
Application No.: US15880869
Filing Date: 2018-01-26
Applicant: Bose Corporation
Inventors: Michael Elliot, Debasmit Banerjee
CPC Classification: G10L21/04, G10L19/26, H04N21/4307, H04N21/8106, H04R29/007, H04R2227/003, H04R2227/005, H04R2420/07
Abstract: A method of synchronizing playback of audio data sent over a first wireless network from an audio source to a wireless speaker package that is adapted to play the audio data. The method includes comparing a first time period over which audio data was sent over the first wireless network to a second time period over which the audio data was received by the wireless speaker package, and playing the received audio data on the wireless speaker package over a third time period that is related to the comparison of the first and second time periods.
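One plausible reading of the comparison described here is a clock-drift correction, sketched below. The abstract only says the third time period is "related to" the comparison, so the ratio-based rule and every name in this code are assumptions rather than the claimed method.

```python
# Illustrative sketch; the ratio-based rule is an assumption.

def clock_ratio(sent_period_s, received_period_s):
    """Ratio of the receive-side period to the send-side period."""
    if sent_period_s <= 0 or received_period_s <= 0:
        raise ValueError("periods must be positive")
    return received_period_s / sent_period_s


def playback_period(nominal_duration_s, sent_period_s, received_period_s):
    """Hypothetical third time period: nominal duration scaled by the ratio.

    If the speaker measured a longer period than the source did for the
    same data, playback is stretched by the same proportion.
    """
    return nominal_duration_s * clock_ratio(sent_period_s, received_period_s)


# Source sent the data over 10.0 s; the speaker measured 10.5 s on its clock.
adjusted = playback_period(2.0, sent_period_s=10.0, received_period_s=10.5)
```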
-
Publication No.: US09997167B2
Publication Date: 2018-06-12
Application No.: US14973729
Filing Date: 2015-12-18
Inventors: Stefan Reuschl, Stefan Doehla, Jeremie Lecomte, Manuel Jander
IPC Classification: G10L19/00, G10L21/00, H04J3/00, H04L12/00, G10L19/022, H04J3/06, G10L19/012, G10L19/04, G10L21/04
CPC Classification: G10L19/022, G10L19/012, G10L19/04, G10L21/04, H04J3/0632, H04J3/0664
Abstract: A jitter buffer control for controlling a provision of a decoded audio content on the basis of an input audio content is configured to select a frame-based time scaling or a sample-based time scaling in a signal-adaptive manner. An audio decoder uses such a jitter buffer control.
-
Publication No.: US09940941B2
Publication Date: 2018-04-10
Application No.: US15480859
Filing Date: 2017-04-06
Inventors: Lars Villemoes
IPC Classification: G10L19/02, G10L19/025, G10L19/26
CPC Classification: G10L19/0208, G10L19/025, G10L19/265, G10L21/038, G10L21/04, H03G3/00, H03G3/3089
Abstract: The invention provides an efficient implementation of cross-product enhanced high-frequency reconstruction (HFR), wherein a new component at frequency QΩ+rΩ0 is generated on the basis of existing components at Ω and Ω+Ω0. The invention provides a block-based harmonic transposition, wherein a time block of complex subband samples is processed with a common phase modification. Superposition of several modified samples has the net effect of limiting undesirable intermodulation products, thereby enabling a coarser frequency resolution and/or lower degree of oversampling to be used. In one embodiment, the invention further includes a window function suitable for use with block-based cross-product enhanced HFR. A hardware embodiment of the invention may include an analysis filter bank, a subband processing unit configurable by control data and a synthesis filter bank.
-
Publication No.: US09858945B2
Publication Date: 2018-01-02
Application No.: US15644983
Filing Date: 2017-07-10
Inventors: Lars Villemoes
IPC Classification: G10L19/00, G10L21/038, G10L19/032, G10L25/18, G10L19/02, G10L19/022, G10L21/04
CPC Classification: G10L21/038, G10L19/0204, G10L19/022, G10L19/032, G10L21/04, G10L25/18
Abstract: The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where generation of harmonic distortion adds brightness to the processed signal, and to time stretchers where a signal duration is prolonged with maintained spectral content. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal is described. The system comprises an analysis filterbank configured to provide an analysis subband signal from the input signal; wherein the analysis subband signal comprises a plurality of complex valued analysis samples, each having a phase and a magnitude. Furthermore, the system comprises a subband processing unit configured to determine a synthesis subband signal from the analysis subband signal using a subband transposition factor Q and a subband stretch factor S. The subband processing unit performs a block based nonlinear processing wherein the magnitude of samples of the synthesis subband signal are determined from the magnitude of corresponding samples of the analysis subband signal and a predetermined sample of the analysis subband signal. In addition, the system comprises a synthesis filterbank configured to generate the time stretched and/or frequency transposed signal from the synthesis subband signal.
-
Publication No.: US09852734B1
Publication Date: 2017-12-26
Application No.: US14250710
Filing Date: 2014-04-11
Inventors: Zhuojin Sun, Bingsen Xie
IPC Classification: G10L19/00
Abstract: Systems and methods are provided for modifying audio signals. A waveform representing an audio signal changing over time is received. A first time length is selected. A first starting point in the waveform is selected. A first pair of adjacent segments of the waveform are determined based at least in part on the first starting point and the first time length. The first pair of adjacent segments each correspond to the first time length. A first difference measure associated with the first pair of adjacent segments is calculated. In response to the first difference measure being smaller than a threshold, compression or expansion of the waveform is performed based at least in part on the first time length and the first starting point.
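The steps in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the difference measure (mean absolute sample difference), the choice to compress by dropping the second segment and to expand by repeating the first, and all function names are assumptions the abstract does not specify.

```python
# Illustrative sketch; difference measure and edit strategy are assumptions.

def difference_measure(waveform, start, length):
    """Mean absolute difference between two adjacent equal-length segments."""
    first = waveform[start:start + length]
    second = waveform[start + length:start + 2 * length]
    if length <= 0 or len(first) < length or len(second) < length:
        raise ValueError("segments do not fit inside the waveform")
    return sum(abs(a - b) for a, b in zip(first, second)) / length


def modify(waveform, start, length, threshold, mode):
    """Compress or expand only when the adjacent segments are similar."""
    if mode not in ("compress", "expand"):
        raise ValueError("mode must be 'compress' or 'expand'")
    if difference_measure(waveform, start, length) >= threshold:
        return list(waveform)  # segments too different: leave unchanged
    head = list(waveform[:start + length])
    if mode == "compress":
        # Drop the second segment, shortening the waveform by `length`.
        return head + list(waveform[start + 2 * length:])
    # Repeat the first segment, lengthening the waveform by `length`.
    return head + list(waveform[start:start + length]) + list(waveform[start + length:])


periodic = [0, 1, 0, -1] * 4  # 16 samples with a period of 4
shorter = modify(periodic, start=0, length=4, threshold=0.1, mode="compress")
longer = modify(periodic, start=0, length=4, threshold=0.1, mode="expand")
unchanged = modify(periodic, start=0, length=2, threshold=0.1, mode="compress")
```

Choosing the segment length equal to the signal period makes the adjacent segments identical, so removing or repeating one of them changes the duration without introducing a discontinuity. With a length of 2 the segments differ and the waveform is returned unchanged.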
-
Publication No.: US09735750B2
Publication Date: 2017-08-15
Application No.: US14854498
Filing Date: 2015-09-15
Inventors: Lars Villemoes
IPC Classification: H03G3/00, H03G3/30, G10L21/04, G10L21/038
CPC Classification: G10L19/0208, G10L19/025, G10L19/265, G10L21/038, G10L21/04, H03G3/00, H03G3/3089
Abstract: The invention provides an efficient implementation of cross-product enhanced high-frequency reconstruction (HFR), wherein a new component at frequency QΩ+rΩ0 is generated on the basis of existing components at Ω and Ω+Ω0. The invention provides a block-based harmonic transposition, wherein a time block of complex subband samples is processed with a common phase modification. Superposition of several modified samples has the net effect of limiting undesirable intermodulation products, thereby enabling a coarser frequency resolution and/or lower degree of oversampling to be used. In one embodiment, the invention further includes a window function suitable for use with block-based cross-product enhanced HFR. A hardware embodiment of the invention may include an analysis filter bank, a subband processing unit configurable by control data and a synthesis filter bank.
-
Publication No.: US09502049B2
Publication Date: 2016-11-22
Application No.: US14538751
Filing Date: 2014-11-11
Inventors: Stefan Bayer, Sascha Disch, Ralf Geiger, Guillaume Fuchs, Max Neuendorf, Gerald Schuller, Bernd Edler
IPC Classification: G10L19/00, G10L21/04, G10L19/002, G10L19/028, G10L19/022, G10L19/032, G10L21/043, G10L25/90, G10L19/26
CPC Classification: G10L21/04, G10L19/002, G10L19/0212, G10L19/022, G10L19/028, G10L19/032, G10L19/265, G10L21/043, G10L25/90
Abstract: An audio encoder has a window function controller, a windower, a time warper with a final quality check functionality, a time/frequency converter, a TNS stage or a quantizer encoder. The window function controller, the time warper, the TNS stage or an additional noise filling analyzer are controlled by signal analysis results obtained by a time warp analyzer or a signal classifier. Furthermore, a decoder applies a noise filling operation using a manipulated noise filling estimate depending on a harmonic or speech characteristic of the audio signal.