-
Publication No.: US20240347042A1
Publication date: 2024-10-17
Application No.: US18298473
Filing date: 2023-04-11
Inventors: Felix Weninger, Marco Gaudesi, Puming Zhan
CPC classification: G10L15/063, G10L15/04, G10L15/16, G10L15/22, G10L25/45
Abstract: A method, computer program product, and computing system for dividing a speech signal into a plurality of chunks. A first context window is defined with a first period of past context for processing the plurality of chunks with a neural network of a speech processing system. The neural network is trained using the first context window. A second context window is defined with a first period of past context for processing the plurality of chunks with the neural network. The neural network is trained using the second context window.
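The chunking step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `chunk_with_past_context`, the sample-count units, and the choice to return plain lists are all assumptions made for illustration, and the neural-network training itself is left out.

```python
from typing import List, Sequence, Tuple


def chunk_with_past_context(
    samples: Sequence[float],
    chunk_len: int,
    past_context: int,
) -> List[Tuple[List[float], List[float]]]:
    """Split `samples` into consecutive chunks of `chunk_len` samples.

    Each chunk is paired with up to `past_context` samples that
    immediately precede it (hypothetical stand-in for a "context window
    with a period of past context"). The first chunk has no past context.
    """
    if chunk_len <= 0 or past_context < 0:
        raise ValueError("chunk_len must be > 0 and past_context >= 0")

    windows = []
    for start in range(0, len(samples), chunk_len):
        ctx_start = max(0, start - past_context)
        context = list(samples[ctx_start:start])        # past samples only
        chunk = list(samples[start:start + chunk_len])  # last chunk may be short
        windows.append((context, chunk))
    return windows
```

Calling the function twice with different `past_context` values (for example 2 and then 4) would yield two differently sized context windows over the same chunks, which is one plausible reading of the "first" and "second" context windows named in the abstract.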
-
Publication No.: US20240304192A1
Publication date: 2024-09-12
Application No.: US18646877
Filing date: 2024-04-26
Inventors: Martin SEHLSTEDT
CPC classification: G10L19/005, G06F17/142, G10L19/02, G10L19/0204, G10L19/0212, G10L25/18, G10L25/45, H04L65/75, H04L65/80
Abstract: Controlling a concealment method for a lost audio frame associated with a received audio signal is provided. At least one bin vector of a spectral representation for at least one tone is obtained, wherein the at least one bin vector includes three consecutive bin values for the at least one tone. Whether each of the three consecutive bin values has a complex value or a real value is determined. Responsive to the determination, the three consecutive bin values are processed to estimate a frequency of the at least one tone based on whether each bin value has a complex value or a real value.
-
Publication No.: US12073361B2
Publication date: 2024-08-27
Application No.: US17955693
Filing date: 2022-09-29
IPC classification: G16H15/00, G06F3/16, G06F40/117, G06F40/30, G06Q10/10, G06T7/20, G10L15/22, G10L15/26, G10L15/30, G10L25/45, G10L25/51, G16H10/20, G16H10/40, G16H10/60, G16H50/70, H04R1/40, H04R3/00
CPC classification: G06Q10/10, G06F3/165, G06F40/117, G06F40/30, G06T7/20, G10L15/22, G10L15/26, G10L15/30, G10L25/45, G10L25/51, G16H10/20, G16H10/40, G16H10/60, G16H15/00, G16H50/70, H04R1/406, H04R3/005, G06T2207/30196
Abstract: A method, computer program product, and computing system for obtaining encounter information of a patient encounter; processing the encounter information to generate an encounter transcript; and processing the encounter transcript to locate one or more procedural events within the encounter transcript.
-
Publication No.: USRE49999E1
Publication date: 2024-06-04
Application No.: US17845607
Filing date: 2022-06-21
Inventors: Markus Schnell, Manfred Lutzky, Markus Lohwasser, Markus Schmidt, Marc Gayer, Michael Mellar, Bernd Edler, Markus Multrus, Gerald Schuller, Ralf Geiger, Bernhard Grill
IPC classification: G10L19/00, G10L19/02, G10L19/022, G10L25/45, H03H17/02, G10L21/038
CPC classification: G10L19/0204, G10L19/022, G10L25/45, H03H17/0266, G10L21/038
Abstract: An embodiment of an apparatus for generating audio subband values in audio subband channels includes an analysis windower for windowing a frame of time-domain audio input samples being in a time sequence extending from an early sample to a later sample using an analysis window function including a sequence of window coefficients to obtain windowed samples. The analysis window function includes a first number of window coefficients derived from a larger window function including a sequence of a larger second number of window coefficients, wherein the window coefficients of the window function are derived by an interpolation of window coefficients of the larger window function. The apparatus further includes a calculator for calculating the audio subband values using the windowed samples.
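The idea of deriving a shorter window from a larger one by interpolation can be sketched as below. This is an illustrative assumption, not the scheme claimed in the patent: it uses a sine window as the larger window and linear interpolation midway between each pair of consecutive coefficients, and the names `sine_window` and `halve_window` are hypothetical.

```python
import math
from typing import List


def sine_window(length: int) -> List[float]:
    """Return a sine window of `length` coefficients (assumed example window)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def halve_window(larger: List[float]) -> List[float]:
    """Derive a window with half as many coefficients.

    Each output coefficient is the linear interpolation at the midpoint
    between two consecutive coefficients of the larger window, i.e. their
    arithmetic mean.
    """
    if len(larger) % 2 != 0:
        raise ValueError("larger window must have an even number of coefficients")
    return [
        0.5 * (larger[2 * n] + larger[2 * n + 1])
        for n in range(len(larger) // 2)
    ]


def apply_window(frame: List[float], window: List[float]) -> List[float]:
    """Multiply a frame of time-domain samples by the window coefficients."""
    if len(frame) != len(window):
        raise ValueError("frame and window must have the same length")
    return [x * w for x, w in zip(frame, window)]
```

For example, `halve_window(sine_window(8))` gives a 4-coefficient window that can be passed to `apply_window` together with a 4-sample frame to obtain windowed samples.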
-
Publication No.: US20240135936A1
Publication date: 2024-04-25
Application No.: US18524622
Filing date: 2023-11-30
Inventors: Martin SEHLSTEDT, Jonas SVEDBERG
CPC classification: G10L19/005, G06F17/142, G10L19/02, G10L19/0204, G10L19/0212, G10L25/18, G10L25/45, H04L65/75, H04L65/80
Abstract: A method, decoder, and program code for controlling a concealment method for a lost audio frame is provided. A first audio frame and a second audio frame of the received audio signal are decoded to obtain modified discrete cosine transform (MDCT) coefficients. Values of a first spectral shape based upon the MDCT coefficients decoded from the first audio frame decoded and values of a second spectral shape based upon MDCT coefficients decoded from the second audio frame decoded are determined, the spectral shapes each comprising a number of sub-bands. The values of the spectral shapes and frame energies of the first audio frame and second audio frame are transformed into representations of FFT based spectral analyses. A transient condition is detected based on the representations of the FFTs. Responsive to detecting the transient condition, the concealment method is modified by selectively adjusting a spectrum magnitude of a substitution frame spectrum.
-
Publication No.: USRE49813E1
Publication date: 2024-01-23
Application No.: US17589228
Filing date: 2022-01-31
Inventors: Hyen-O Oh, Chang Heon Lee, Hong-Goo Kang, Jeungook Song
IPC classification: G10L19/00, G10L25/45, G10L21/00, G10L19/04, G10L19/022, G10L19/18, G10L19/005
CPC classification: G10L19/04, G10L19/005, G10L19/022, G10L19/18, G10L25/45
Abstract: An apparatus for processing an audio signal and method thereof are disclosed. The present invention includes receiving, by an audio processing apparatus, an audio signal including a first data of a first block encoded with rectangular coding scheme and a second data of a second block encoded with non-rectangular coding scheme; receiving a compensation signal corresponding to the second block; estimating a prediction of an aliasing part using the first data; and, obtaining a reconstructed signal for the second block based on the second data, the compensation signal and the prediction of aliasing part.
-
Publication No.: US11862180B2
Publication date: 2024-01-02
Application No.: US17432260
Filing date: 2020-02-20
Inventors: Martin Sehlstedt, Jonas Svedberg
CPC classification: G10L19/005, G06F17/142, G10L19/02, G10L19/0204, G10L19/0212, G10L25/18, G10L25/45, H04L65/75, H04L65/80
Abstract: A method, decoder, and program code for controlling a concealment method for a lost audio frame is provided. A first audio frame and a second audio frame of the received audio signal are decoded to obtain modified discrete cosine transform (MDCT) coefficients. Values of a first spectral shape based upon the MDCT coefficients decoded from the first audio frame decoded and values of a second spectral shape based upon MDCT coefficients decoded from the second audio frame decoded are determined, the spectral shapes each comprising a number of sub-bands. The values of the spectral shapes and frame energies of the first audio frame and second audio frame are transformed into representations of FFT based spectral analyses. A transient condition is detected based on the representations of the FFTs. Responsive to detecting the transient condition, the concealment method is modified by selectively adjusting a spectrum magnitude of a substitution frame spectrum.
-
Publication No.: US11776528B2
Publication date: 2023-10-03
Application No.: US17380426
Filing date: 2021-07-20
Applicant: Xinapse Co., Ltd.
Inventors: Jinbeom Kang, Dong Won Joo, Yongwook Nam
IPC classification: G10L13/033, G10L21/013, G10L25/45, G10L21/043, G10L21/14
CPC classification: G10L13/0335, G10L21/013, G10L21/043, G10L21/14, G10L25/45
Abstract: This application relates to a method of synthesizing a speech of which a speed and a pitch are changed. In one aspect, the method includes a spectrogram may be generated by performing a short-time Fourier transformation on a first speech signal based on a first hop length and a first window length, and speech signals of sections having a second window length at the interval of a second hop length from the spectrogram. A ratio between the first hop length and the second hop length may be set to be equal to the value of a playback rate and a ratio between the first window length and the second window length may be set to be equal to the value of a pitch change rate, thereby generating a second speech signal of which the speed and the pitch are changed.
-
Publication No.: US20230298597A1
Publication date: 2023-09-21
Application No.: US18203280
Filing date: 2023-05-30
Inventors: Martin SEHLSTEDT
CPC classification: G10L19/005, G10L19/0204, G10L19/0212, G10L25/18, H04L65/75, G10L19/02, G10L25/45, H04L65/80, G06F17/142
Abstract: Controlling a concealment method for a lost audio frame associated with a received audio signal is provided. At least one bin vector of a spectral representation for at least one tone is obtained, wherein the at least one bin vector includes three consecutive bin values for the at least one tone. Whether each of the three consecutive bin values has a complex value or a real value is determined. Responsive to the determination, the three consecutive bin values are processed to estimate a frequency of the at least one tone based on whether each bin value has a complex value or a real value.
-
Publication No.: US20230245668A1
Publication date: 2023-08-03
Application No.: US17911733
Filing date: 2020-09-30
Inventors: Quanzhi XIAO, Yufeng YAN, Rongjun HUANG, Guiping FANG
Abstract: An audio packet loss repairing method, device and system based on a neural network. The method comprises: obtaining an audio data packet (S101), the audio data packet comprises a plurality of audio data frames, and the plurality of audio data frames at least comprise a plurality of voice signal frames; determining a position of a lost voice signal frame in the plurality of audio data packet to obtain position information of the lost frame (S103), the position comprising a first preset position or a second preset position; selecting, according to the position information of the lost frame, a neural network model for repairing the lost frame (S105), the neural network model comprising a first repairing model and a second repairing model; and sending the plurality of audio data frames to the selected neural network model so as to repair the lost voice signal frame (S107).
-