-
Publication No.: US20230050178A1
Publication date: 2023-02-16
Application No.: US17887178
Application date: 2022-08-12
Inventors: Jaeyoung CHOI, Sangheon KIM, Byunghyuk MOON, Jinhyuk WOO, Eunji LIM, Banghyun KWON, Inhyung JUNG, Yeunwook LIM
IPC classification: G10L21/055, G10L19/022
Abstract: An electronic device according to various embodiments of the disclosure includes: a display configured to output image data of content based on execution of an application, a sound output module comprising circuitry configured to output audio data of the content, and a processor adaptively connected to the display and the sound output module, wherein the processor is configured to: identify a schedule for sequentially receiving read tasks (RTs) at a specified time interval to encode audio segments sequentially input in a specified size into an audio buffer from the audio data, and control time points at which the RTs are called, based on at least one of a situation in which the RTs are received according to the schedule and an audio buffer state and encode the audio segments corresponding to the RTs received at the controlled time points.
-
Publication No.: US11581001B2
Publication date: 2023-02-14
Application No.: US16922934
Application date: 2020-07-07
Inventors: Ralf Geiger, Max Neuendorf, Yoshikazu Yokotani, Nikolaus Rettelbach, Juergen Herre, Stefan Geyersberger
IPC classification: G10L19/18, G10L19/02, H04N21/2368, H04N21/2383, H04N21/2662, H04N21/434, H04N21/438, H04N19/00, G10L19/00, G10L19/022, G10L19/26, G10L19/032
Abstract: An apparatus for decoding data segments representing a time-domain data stream, a data segment being encoded in the time domain or in the frequency domain, a data segment being encoded in the frequency domain having successive blocks of data representing successive and overlapping blocks of time-domain data samples. The apparatus includes a time-domain decoder for decoding a data segment being encoded in the time domain and a processor for processing the data segment being encoded in the frequency domain and output data of the time-domain decoder to obtain overlapping time-domain data blocks. The apparatus further includes an overlap/add-combiner for combining the overlapping time-domain data blocks to obtain a decoded data segment of the time-domain data stream.
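The overlap/add-combiner step named in the abstract above can be illustrated with a minimal sketch. It is not the patented apparatus: the function names, the sin² window, and the 50% overlap are assumptions chosen only to show how overlapping time-domain blocks sum back into one segment. The time-domain/frequency-domain switching that the abstract describes is not modelled here.

```python
import math

def overlap_add(blocks, hop):
    """Sum equally sized blocks whose start positions are `hop` samples apart."""
    if not blocks:
        return []
    size = len(blocks[0])
    out = [0.0] * (hop * (len(blocks) - 1) + size)
    for i, block in enumerate(blocks):
        start = i * hop
        for n, sample in enumerate(block):
            out[start + n] += sample
    return out

def sin_squared_window(size):
    """Assumed window: copies shifted by size/2 sum to exactly 1."""
    return [math.sin(math.pi * (n + 0.5) / size) ** 2 for n in range(size)]

# Three windowed blocks of a constant signal (value 1.0), 50% overlap.
size, hop = 8, 4
window = sin_squared_window(size)
blocks = [list(window) for _ in range(3)]
decoded = overlap_add(blocks, hop)
```

In this sketch the samples covered by two blocks reconstruct the constant input, while the first and last `hop` samples only carry a single window slope because they have no overlapping neighbour.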
-
Publication No.: US11580999B2
Publication date: 2023-02-14
Application No.: US17331416
Application date: 2021-05-26
Inventors: Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Woo-taek Lim, Inseon Jang
IPC classification: G10L19/022, G10L19/06, G10L19/16, G10L19/035
Abstract: An audio signal encoding method performed by an encoder includes identifying an audio signal of a time domain in units of a block, generating a combined block by combining i) a current original block of the audio signal and ii) a previous original block chronologically adjacent to the current original block, extracting a first residual signal of a frequency domain from the combined block using linear predictive coding of a time domain, overlapping chronologically adjacent first residual signals among first residual signals converted into a time domain, and quantizing a second residual signal of a time domain extracted from the overlapped first residual signal by converting the second residual signal of the time domain into a frequency domain using linear predictive coding of a frequency domain.
-
Publication No.: US11574639B2
Publication date: 2023-02-07
Application No.: US17127938
Application date: 2020-12-18
Inventors: Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka
IPC classification: G10L15/00, G10L17/02, G10L15/22, G10L15/26, G10L19/022, G10L21/0272
Abstract: A hypothesis stitcher for speech recognition of long-form audio provides superior performance, such as higher accuracy and reduced computational cost. An example disclosed operation includes: segmenting the audio stream into a plurality of audio segments; identifying a plurality of speakers within each of the plurality of audio segments; performing automatic speech recognition (ASR) on each of the plurality of audio segments to generate a plurality of short-segment hypotheses; merging at least a portion of the short-segment hypotheses into a first merged hypothesis set; inserting stitching symbols into the first merged hypothesis set, the stitching symbols including a window change (WC) symbol; and consolidating, with a network-based hypothesis stitcher, the first merged hypothesis set into a first consolidated hypothesis. Multiple variations are disclosed, including alignment-based stitchers and serialized stitchers, which may operate as speaker-specific stitchers or multi-speaker stitchers, and may further support multiple options for differing hypothesis configurations.
-
Publication No.: US11570564B2
Publication date: 2023-01-31
Application No.: US16753698
Application date: 2018-09-24
Inventors: Lasse Laaksonen, Mikko Tammi, Miikka Vilermo, Arto Lehtiniemi
IPC classification: H04S7/00, G10L19/008, G10L19/022
Abstract: An apparatus for audio signal processing audio objects within at least one audio scene, the apparatus comprising at least one processor configured to: define for at least one time period at least one contextual grouping comprising at least two of a plurality of audio objects and at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping, the plurality of audio objects within at least one audio scene; and define with respect to the at least one contextual grouping at least one first parameter and/or parameter rule type which is configured to be applied with respect to a common element associated with the at least two of the plurality of audio objects and wherein the at least one first parameter and/or parameter rule type is configured to be applied with respect to individual element associated with the at least one further audio object outside of the at least one contextual grouping, the at least one first parameter and/or parameter rule type being applied in audio rendering of both the at least two of the plurality of audio objects and the at least one further audio object.
-
Publication No.: US20220406321A1
Publication date: 2022-12-22
Application No.: US17895256
Application date: 2022-08-25
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
Inventors: Seungkwon BEACK, Tae Jin LEE, Min Je KIM, Kyeongok KANG, Dae Young JANG, Jeongil SEO, Jin Woo HONG, Chieteuk AHN, Ho Chong PARK, Young-cheol PARK
IPC classification: G10L19/22, G10L19/022, G10L19/06, G10L19/18
Abstract: A Unified Speech and Audio Codec (USAC) that may process a window sequence based on mode switching is provided. The USAC may perform encoding or decoding by overlapping between frames based on a folding point when mode switching occurs. The USAC may process different window sequences for each situation to perform encoding or decoding, and thereby may improve a coding efficiency.
-
Publication No.: US11410664B2
Publication date: 2022-08-09
Application No.: US16795548
Application date: 2020-02-19
Inventors: Stefan Bayer, Eleni Fotopoulou, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Markus Schnell, Stefan Doehla, Wolfgang Jaegers, Martin Dietz, Goran Markovic
IPC classification: G10L19/008, G10L19/022, G10L19/02, G10L19/04, G10L25/18, H04S3/00
Abstract: An apparatus for estimating an inter-channel time difference between a first channel signal and a second channel signal, includes: a calculator for calculating a cross-correlation spectrum for a time block from the first channel signal in the time block and the second channel signal in the time block; a spectral characteristic estimator for estimating a characteristic of a spectrum of the first channel signal or the second channel signal for the time block; a smoothing filter for smoothing the cross-correlation spectrum over time using the spectral characteristic to obtain a smoothed cross-correlation spectrum; and a processor for processing the smoothed cross-correlation spectrum to obtain the inter-channel time difference.
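The processing chain in the abstract above (cross-correlation spectrum, smoothing over time, peak search) can be sketched roughly as follows. Several parts are assumptions rather than the patented method: the smoothing factor `alpha` is a fixed, hypothetical parameter here, whereas the abstract derives the smoothing from an estimated spectral characteristic; the magnitude normalisation before the inverse transform is a common generalized cross-correlation choice that the abstract does not state; and all function names are illustrative.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform, adequate for short blocks."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def cross_spectrum(first, second):
    """Cross-correlation spectrum of one time block: conj(X1[k]) * X2[k]."""
    return [a.conjugate() * b for a, b in zip(dft(first), dft(second))]

def smooth(previous, current, alpha):
    """One-pole smoothing over time; alpha is an assumed fixed factor."""
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(previous, current)]

def time_difference(spectrum):
    """Lag (in samples) by which the second channel trails the first."""
    n = len(spectrum)
    # Assumed magnitude normalisation: keep only the phase of each bin.
    unit = [s / abs(s) if abs(s) > 1e-12 else 0j for s in spectrum]
    # Inverse DFT gives the circular cross-correlation over all lags.
    corr = [sum(unit[k] * cmath.exp(2j * cmath.pi * k * lag / n)
                for k in range(n)).real
            for lag in range(n)]
    peak = max(range(n), key=lambda lag: corr[lag])
    return peak if peak <= n // 2 else peak - n

# Toy input: the second channel is the first one delayed by two samples.
n, delay = 16, 2
first = [0.0] * n
first[3] = 1.0
second = [0.0] * n
second[3 + delay] = 1.0

smoothed = cross_spectrum(first, second)
for _ in range(3):  # further blocks with the same delay
    smoothed = smooth(smoothed, cross_spectrum(first, second), alpha=0.8)
itd = time_difference(smoothed)
```

A larger `alpha` makes the estimate steadier but slower to follow a changing delay, which is presumably why the abstract ties the smoothing to a spectral characteristic instead of fixing it.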
-
Publication No.: US11386906B2
Publication date: 2022-07-12
Application No.: US17006349
Application date: 2020-08-28
Inventors: Jérémie Lecomte, Adrian Tomasek
IPC classification: G10L19/005, G10L19/00, G10L19/022
Abstract: There is provided an error concealment unit, method, and computer program, for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. In one embodiment, the error concealment unit provides an error concealment audio information for a lost audio frame on the basis of a properly decoded audio frame preceding the lost audio frame. The error concealment unit derives a damping factor on the basis of characteristics of a decoded representation of the properly decoded audio frame preceding the lost audio frame. The error concealment unit performs a fade out using the damping factor.
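The fade-out described in the abstract above can be illustrated with a short sketch. The abstract does not say which characteristic of the last good frame drives the damping factor, so the energy trend used in `derive_damping` (second half versus first half of the frame) is purely a hypothetical stand-in, as are the clamping limits and all function names.

```python
def frame_energy(samples):
    return sum(s * s for s in samples)

def derive_damping(last_good, floor=0.5, ceiling=0.95):
    """Hypothetical rule: amplitude trend within the last good frame, clamped."""
    half = len(last_good) // 2
    head = frame_energy(last_good[:half])
    tail = frame_energy(last_good[half:])
    if head == 0.0:
        return floor
    ratio = (tail / head) ** 0.5  # square root turns energy ratio into amplitude ratio
    return max(floor, min(ceiling, ratio))

def conceal(last_good, lost_count):
    """Repeat the last good frame, attenuating further for each lost frame."""
    damping = derive_damping(last_good)
    gain = 1.0
    frames = []
    for _ in range(lost_count):
        gain *= damping
        frames.append([gain * s for s in last_good])
    return frames

# A frame that is already decaying yields a damping factor of about 0.8.
last_good = [1.0, 1.0, 0.8, 0.8]
concealed = conceal(last_good, lost_count=3)
```

Each consecutive lost frame is scaled by one more power of the damping factor, so a longer burst of losses fades further toward silence.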
-
Publication No.: US11341980B2
Publication date: 2022-05-24
Application No.: US17515286
Application date: 2021-10-29
Inventors: Markus Schnell, Manfred Lutzky, Eleni Fotopoulou, Konstantin Schmidt, Conrad Benndorf, Adrian Tomasek, Tobias Albert, Timon Seidl
IPC classification: G10L19/00, G10L19/02, G10L19/022
Abstract: A downscaled version of an audio decoding procedure may more effectively and/or at improved compliance maintenance be achieved if the synthesis window used for downscaled audio decoding is a downsampled version of a reference synthesis window involved in the non-downscaled audio decoding procedure by downsampling by the downsampling factor by which the downsampled sampling rate and the original sampling rate deviate, and downsampled using a segmental interpolation in segments of ¼ of the frame length.
-
Publication No.: US20220157328A1
Publication date: 2022-05-19
Application No.: US17592423
Application date: 2022-02-03
Inventors: Emmanuel RAVELLI, Manuel JANDER, Grzegorz PIETRZYK, Martin DIETZ, Marc GAYER
IPC classification: G10L19/26, G10L19/022, G10L19/20, G10L19/12
Abstract: A method is described that processes an audio signal. A discontinuity between a filtered past frame and a filtered current frame of the audio signal is removed using linear predictive filtering.
-