-
Publication No.: US20210272576A1
Publication Date: 2021-09-02
Application No.: US17255191
Filing Date: 2019-06-20
Applicant: Sony Corporation
Inventor: Mitsuyuki Hatanaka , Toru Chinen , Minoru Tsuji , Hiroyuki Honma , Yuki Yamamoto
IPC: G10L19/035 , G10L21/00
Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount.
The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
-
Publication No.: US11087774B2
Publication Date: 2021-08-10
Application No.: US16617785
Filing Date: 2018-04-24
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventor: Ryosuke Sugiura , Yutaka Kamamoto , Takehiro Moriya
IPC: G10L19/035
Abstract: A log spectral envelope sequence $L_0, L_1, \ldots, L_{N-1}$ and an envelope code for the log spectral envelope sequence $L_0, L_1, \ldots, L_{N-1}$ are obtained. The log spectral envelope sequence $L_0, L_1, \ldots, L_{N-1}$ is an integer value sequence corresponding to binary logarithms of respective sample values of a spectral envelope sequence and is an integer value sequence whose total sum is 0. For a quantized spectral sequence $\hat{X}_0, \hat{X}_1, \ldots, \hat{X}_{N-1}$, a smoothed spectral sequence $\tilde{X}_0, \tilde{X}_1, \ldots, \tilde{X}_{N-1}$ is obtained by: for $\hat{X}_k$ with $L_k$ being a positive value, adopting $\hat{X}_k$ with $L_k$ digits from its least significant digit removed as $\tilde{X}_k$; for $\hat{X}_k$ with $L_k$ being a negative value, adopting $\hat{X}_k$ with $-L_k$ digits added to its least significant digit in accordance with a predefined rule as $\tilde{X}_k$; and when $L_k$ is 0, adopting $\hat{X}_k$ as $\tilde{X}_k$. The respective samples of the smoothed spectral sequence $\tilde{X}_0, \tilde{X}_1, \ldots, \tilde{X}_{N-1}$ are then encoded with a fixed code length to obtain a signal code.
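The smoothing step in this abstract can be pictured as a per-sample shift. The following is only a rough sketch under assumptions the abstract does not confirm: "digits" are taken to be binary digits, the "predefined rule" for added digits is taken to be zero-padding, and the shift is applied to the magnitude with the sign kept separately. The function name `smooth_spectrum` is hypothetical.

```python
def smooth_spectrum(x_hat, log_env):
    """Sketch of envelope-driven smoothing of quantized spectral values.

    x_hat   : list of integer quantized spectral values
    log_env : list of integer log-envelope values (same length)

    Assumptions (not stated in the abstract): digits are binary,
    added digits are zeros, and the sign is handled separately.
    """
    if len(x_hat) != len(log_env):
        raise ValueError("sequences must have the same length")

    smoothed = []
    for x, l in zip(x_hat, log_env):
        sign = -1 if x < 0 else 1
        mag = abs(x)
        if l > 0:
            # Positive envelope value: drop l least significant bits.
            mag >>= l
        elif l < 0:
            # Negative envelope value: append -l zero bits (assumed rule).
            mag <<= -l
        # l == 0: value is adopted unchanged.
        smoothed.append(sign * mag)
    return smoothed


if __name__ == "__main__":
    # The example envelope sums to 0, as the abstract requires.
    print(smooth_spectrum([12, 3, 5, -8], [2, -1, 0, -1]))
```

Samples with a large envelope value lose low-order bits and samples with a small envelope value gain them, which is presumably what lets a single fixed code length cover the whole smoothed sequence.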
-
Publication No.: US11024323B2
Publication Date: 2021-06-01
Application No.: US15643908
Filing Date: 2017-07-07
Inventor: Nikolaus Rettelbach , Bernhard Grill , Guillaume Fuchs , Stefan Geyersberger , Markus Multrus , Harald Popp , Juergen Herre , Stefan Wabnik , Gerald Schuller , Jens Hirschfeld
IPC: G10L19/035 , G10L19/02 , G10L19/028 , G10L19/032 , G10L19/008 , G10L25/18
Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error. A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.
-
Publication No.: US20210142813A1
Publication Date: 2021-05-13
Application No.: US17154495
Filing Date: 2021-01-21
Applicant: DOLBY INTERNATIONAL AB
Inventor: Lars Villemoes , Heiko Purnhagen , Per Ekstrand
IPC: G10L19/16 , G10L19/035 , G10L19/24 , G10L21/038
Abstract: Embodiments relate to an audio processing unit that includes a bitstream payload deformatter and a decoding subsystem. The decoding subsystem is coupled to the bitstream payload deformatter and configured to decode at least a portion of a block of an encoded audio bitstream. The block includes a fill element with an identifier indicating a start of the fill element and fill data after the identifier. The fill data includes at least one flag identifying whether a base form of spectral band replication or an enhanced form of spectral band replication is to be performed on audio content of the block. The identifier is a three bit unsigned integer transmitted most significant bit first and having a value of 0x6.
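The identifier check described in this abstract (a three-bit unsigned integer, most significant bit first, with value 0x6) can be illustrated with a small bit reader. This is a generic sketch, not the decoder described in the patent; the helper names `read_uint_msb_first` and `starts_fill_element`, and the assumption that bits are packed into bytes from the high bit down, are illustrative only.

```python
FILL_ELEMENT_ID = 0x6  # value given in the abstract (binary 110)


def read_uint_msb_first(data: bytes, bit_offset: int, n_bits: int) -> int:
    """Read an n_bits unsigned integer, most significant bit first.

    Assumes bits are packed into each byte from bit 7 down to bit 0.
    """
    value = 0
    for i in range(n_bits):
        pos = bit_offset + i
        byte = data[pos // 8]
        bit = (byte >> (7 - pos % 8)) & 1
        value = (value << 1) | bit
    return value


def starts_fill_element(data: bytes, bit_offset: int = 0) -> bool:
    """Return True if the 3 bits at bit_offset equal the fill-element ID."""
    return read_uint_msb_first(data, bit_offset, 3) == FILL_ELEMENT_ID


if __name__ == "__main__":
    print(starts_fill_element(bytes([0b11000000])))  # bits 110
    print(starts_fill_element(bytes([0b01100000])))  # bits 011
```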
-
Publication No.: US10978084B2
Publication Date: 2021-04-13
Application No.: US16594867
Filing Date: 2019-10-07
Inventor: Maria Luis Valero , Christian Helmrich , Johannes Hilpert
IPC: G10L19/028 , G10L19/035 , H04S3/00 , G10L19/008
Abstract: In multichannel audio coding, an improved coding efficiency is achieved by the following measure: the noise filling of zero-quantized scale factor bands is performed using noise filling sources other than artificially generated noise or spectral replica. In particular, the coding efficiency in multichannel audio coding may be rendered more efficient by performing the noise filling based on noise generated using spectral lines from a previous frame of, or a different channel of the current frame of, the multichannel audio signal.
-
Publication No.: US20210056980A1
Publication Date: 2021-02-25
Application No.: US16548146
Filing Date: 2019-08-22
Applicant: Google LLC
Inventor: Beat Gfeller , Dominik Roblek , Félix de Chaumont Quitry , Marco Tagliasacchi
IPC: G10L19/035 , G10L25/18 , G10L19/038 , G06N20/00
Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
-
Publication No.: US20190392847A1
Publication Date: 2019-12-26
Application No.: US16479916
Filing Date: 2018-01-03
Applicant: Nokia Technologies Oy
Inventor: Adriana VASILACHE
IPC: G10L19/008 , G10L19/035 , G10L19/24
Abstract: A method comprising: receiving at least two audio channel signals; determining, for a first frame, at least two parameters representing a difference between the at least two channel audio signals; scalar quantising the at least two parameters to generate at least two index values; adaptively encoding an initial scalar quantized parameter of the at least two parameters; determining whether the initial scalar quantized parameter has a value different from a predetermined value; adaptively encoding any unencoded scalar quantized parameters where the initial scalar quantized parameter has a value different from the predetermined value; determining whether all of the at least two scalar quantized parameters have values equal to the predetermined value where the initial scalar quantized parameter has a value equal to the predetermined value; adaptively encoding any unencoded scalar quantized parameters and generating an indicator that an output is one of fixed or variable rate coding where the initial scalar quantized parameter has a value equal to the predetermined value and at least one of the at least two scalar quantized parameters have values different from the predetermined value; generating an indicator that the output is the other of the one of fixed or variable rate coding where the initial scalar quantized parameter has a value equal to the predetermined value and all of the at least two scalar quantized parameters have values equal to the predetermined value; generating a single channel representation of the at least two audio channel signals dependent on the at least two parameters; and encoding the single channel representation.
-
Publication No.: US20190341064A1
Publication Date: 2019-11-07
Application No.: US16050844
Filing Date: 2018-07-31
Applicant: QUALCOMM Incorporated
Inventor: Taher Shahbazi Mirzahasanloo , Rogerio Guedes Alves
IPC: G10L19/035 , G10L19/038 , G10L19/00 , G10L25/18 , H04W4/80
Abstract: An example apparatus includes a memory configured to store the audio data; and one or more processors in communication with the memory, the one or more processors configured to: obtain, for each of a plurality of subbands of audio data, a respective energy scalar and a respective residual identifier; determine overall distortion levels for a plurality of candidate subband pulse allocations for performing pyramid vector dequantization (PVdQ) of the residual identifiers; select, from the plurality of subband pulse allocations and based on the overall distortion levels, a candidate subband pulse allocation; and perform, using the candidate subband pulse allocation, PVdQ on the residual identifiers to reconstruct a residual vector for each subband.
-
Publication No.: US10468043B2
Publication Date: 2019-11-05
Application No.: US14812465
Filing Date: 2015-07-29
Inventor: Martin Dietz , Guillaume Fuchs , Christian Helmrich , Goran Markovic
IPC: G10L25/00 , G10L19/00 , G10L19/035 , G10L19/02 , G10H1/06 , G10L25/18 , G10L25/21 , G10L25/45 , G10L25/03
Abstract: The invention provides an audio encoder for encoding an audio signal so as to produce therefrom an encoded signal, the audio encoder including: a framing device configured to extract frames from the audio signal; a quantizer configured to map spectral lines of a spectrum signal derived from the frame of the audio signal to quantization indices, wherein the quantizer has a dead-zone, in which the input spectral lines are mapped to quantization index zero; and a control device configured to modify the dead-zone; wherein the control device includes a tonality calculating device configured to calculate at least one tonality indicating value for at least one spectrum line or for at least one group of spectral lines, wherein the control device is configured to modify the dead-zone for the at least one spectrum line or the at least one group of spectrum lines depending on the respective tonality indicating value.
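A dead-zone quantizer of the general kind this abstract mentions can be sketched as a uniform quantizer with an adjustable rounding offset. Everything specific below is an assumption for illustration and is not taken from the patent: the rounding-offset formulation, the function names, the `lo`/`hi` defaults, and in particular the direction of the tonality mapping (here, higher tonality gives a narrower dead zone).

```python
import math


def dead_zone_quantize(x: float, step: float, rounding_offset: float) -> int:
    """Map a spectral line to a quantization index.

    rounding_offset = 0.5 is plain rounding; smaller offsets widen the
    dead zone, so more small inputs are mapped to index 0.
    """
    index = math.floor(abs(x) / step + rounding_offset)
    return int(math.copysign(index, x)) if index else 0


def rounding_offset_for(tonality: float, lo: float = 0.2, hi: float = 0.5) -> float:
    """Hypothetical control rule: interpolate the offset from tonality in [0, 1].

    The mapping direction (tonal -> narrower dead zone) is an assumption.
    """
    t = min(max(tonality, 0.0), 1.0)
    return lo + (hi - lo) * t


if __name__ == "__main__":
    line = 0.6
    for tonality in (0.0, 1.0):
        offset = rounding_offset_for(tonality)
        print(tonality, dead_zone_quantize(line, 1.0, offset))
```

With a step of 1.0, the same line at 0.6 quantizes to 0 under the wide dead zone and to 1 under the narrow one, which is the kind of per-line control the abstract attributes to the tonality-driven control device.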
-
Publication No.: US10424304B2
Publication Date: 2019-09-24
Application No.: US14687008
Filing Date: 2015-04-15
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Ki-hyun Choo , Eun-mi Oh
IPC: G10L19/00 , G10L21/00 , G10L19/032 , G10L19/035 , G10L19/02
Abstract: A lossless encoding method is provided that includes determining a lossless encoding mode of a quantization coefficient as one of an infinite-range lossless encoding mode and a finite-range lossless encoding mode; encoding the quantization coefficient in the infinite-range lossless encoding mode in correspondence with a result of the lossless encoding mode determination; and encoding the quantization coefficient in the finite-range lossless encoding mode in correspondence with a result of the lossless encoding mode determination.