INFORMATION PROCESSING DEVICE AND METHOD, AND PROGRAM

    Publication No.: US20210272576A1

    Publication date: 2021-09-02

    Application No.: US17255191

    Filing date: 2019-06-20

    Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount.
    The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
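The calculation described in this abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the child space is an axis-aligned box described by an origin and a size in parent coordinates, and that the object's position in the child space is normalized to the range 0 to 1 on each axis. The names `ChildSpace` and `to_parent_position` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ChildSpace:
    """Space information: where the child space sits in the parent space.

    Assumption: an axis-aligned box given by its origin and size,
    both expressed in parent-space coordinates.
    """
    origin: Vec3
    size: Vec3


def to_parent_position(space: ChildSpace, pos_in_child: Vec3) -> Vec3:
    """Map a normalized child-space position (0..1 per axis, assumed)
    to a position in the parent space."""
    return tuple(
        o + s * p for o, s, p in zip(space.origin, space.size, pos_in_child)
    )


if __name__ == "__main__":
    space = ChildSpace(origin=(10.0, 0.0, -5.0), size=(4.0, 2.0, 10.0))
    print(to_parent_position(space, (0.5, 0.25, 1.0)))
```

Under this assumption, the code-amount saving would come from sending the space information once and then coding each object's position over the smaller range of the child space.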

    Encoding apparatus, decoding apparatus, smoothing apparatus, inverse smoothing apparatus, methods therefor, and recording media

    Publication No.: US11087774B2

    Publication date: 2021-08-10

    Application No.: US16617785

    Filing date: 2018-04-24

    Abstract: A log spectral envelope sequence $L_0, L_1, \ldots, L_{N-1}$ and an envelope code for the log spectral envelope sequence $L_0, L_1, \ldots, L_{N-1}$ are obtained. The log spectral envelope sequence $L_0, L_1, \ldots, L_{N-1}$ is an integer value sequence corresponding to binary logarithms of respective sample values of a spectral envelope sequence and is an integer value sequence whose total sum is 0. For a quantized spectral sequence $\hat{X}_0, \hat{X}_1, \ldots, \hat{X}_{N-1}$, a smoothed spectral sequence $\tilde{X}_0, \tilde{X}_1, \ldots, \tilde{X}_{N-1}$ is obtained by: for $\hat{X}_k$ with $L_k$ being a positive value, adopting $\hat{X}_k$ with $L_k$ digits from its least significant digit removed as $\tilde{X}_k$; for $\hat{X}_k$ with $L_k$ being a negative value, adopting $\hat{X}_k$ with $-L_k$ digits added to its least significant digit in accordance with a predefined rule as $\tilde{X}_k$; and when $L_k$ is 0, adopting $\hat{X}_k$ as $\tilde{X}_k$. The respective samples of the smoothed spectral sequence $\tilde{X}_0, \tilde{X}_1, \ldots, \tilde{X}_{N-1}$ are then encoded with a fixed code length to obtain a signal code.
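The smoothing rule in this abstract can be sketched as follows. The sketch assumes that the digits are binary digits of the sample magnitude (consistent with the envelope being a binary logarithm), that the sign is handled separately, and that the "predefined rule" for added digits is zero padding. None of these choices is confirmed by the abstract, and the function name `smooth` is hypothetical.

```python
from typing import List, Sequence


def smooth(x_hat: Sequence[int], log_env: Sequence[int]) -> List[int]:
    """Apply the per-sample smoothing rule.

    Assumptions (not from the abstract): digits are bits of the magnitude,
    and digits added for negative envelope values are zeros.
    """
    if len(x_hat) != len(log_env):
        raise ValueError("sequences must have the same length")
    smoothed = []
    for x, l in zip(x_hat, log_env):
        sign = -1 if x < 0 else 1
        magnitude = abs(x)
        if l > 0:
            magnitude >>= l      # drop l least significant bits
        elif l < 0:
            magnitude <<= -l     # append -l zero bits (assumed rule)
        smoothed.append(sign * magnitude)
    return smoothed


if __name__ == "__main__":
    # The envelope values sum to 0, as the abstract requires.
    print(smooth([40, -3, 7, -9], [2, -1, 0, -1]))
```

Samples under a large envelope lose bits and samples under a small envelope gain bits, which is what would let every smoothed sample be coded with the same fixed length.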

    DECODING AUDIO BITSTREAMS WITH ENHANCED SPECTRAL BAND REPLICATION METADATA IN AT LEAST ONE FILL ELEMENT

    Publication No.: US20210142813A1

    Publication date: 2021-05-13

    Application No.: US17154495

    Filing date: 2021-01-21

    Abstract: Embodiments relate to an audio processing unit that includes a bitstream payload deformatter and a decoding subsystem. The decoding subsystem is coupled to the bitstream payload deformatter and configured to decode at least a portion of a block of an encoded audio bitstream. The block includes a fill element with an identifier indicating a start of the fill element and fill data after the identifier. The fill data includes at least one flag identifying whether a base form of spectral band replication or an enhanced form of spectral band replication is to be performed on audio content of the block. The identifier is a three bit unsigned integer transmitted most significant bit first and having a value of 0x6.
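As a minimal sketch of the identifier check described above, the following reads a three-bit unsigned integer, most significant bit first, and compares it with 0x6. The helper names `read_bits` and `starts_fill_element` are hypothetical, and the bit offset of the element within the block is an assumed input. The layout of the fill data and its flag is not given in the abstract, so it is not parsed here.

```python
ID_FIL = 0x6  # identifier value stated in the abstract


def read_bits(data: bytes, bit_offset: int, count: int) -> int:
    """Read `count` bits starting at `bit_offset`, most significant bit first."""
    value = 0
    for i in range(count):
        pos = bit_offset + i
        if pos // 8 >= len(data):
            raise ValueError("read past end of data")
        bit = (data[pos // 8] >> (7 - pos % 8)) & 1
        value = (value << 1) | bit
    return value


def starts_fill_element(data: bytes, bit_offset: int = 0) -> bool:
    """Return True if the 3-bit identifier at `bit_offset` equals 0x6."""
    return read_bits(data, bit_offset, 3) == ID_FIL


if __name__ == "__main__":
    print(starts_fill_element(bytes([0b11000000])))     # bits 110 at offset 0
    print(starts_fill_element(bytes([0b00011000]), 3))  # bits 110 at offset 3
```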

    Self-Supervised Audio Representation Learning for Mobile Devices

    Publication No.: US20210056980A1

    Publication date: 2021-02-25

    Application No.: US16548146

    Filing date: 2019-08-22

    Applicant: Google LLC

    Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
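The distance-estimation variant can be illustrated by showing how a ground truth label might be produced from unlabeled audio. This sketch covers only the sampling step, not the model or the loss. It assumes non-overlapping slices of equal length and measures distance in samples between slice start positions. The function name `sample_slice_pair` is hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def sample_slice_pair(
    signal: Sequence[float], slice_len: int, rng: random.Random
) -> Tuple[List[float], List[float], int]:
    """Sample two non-overlapping slices and return them with their distance.

    The distance (in samples, start to start) is known from the sampling
    itself, so it can act as a ground truth target without manual labels.
    """
    if len(signal) < 2 * slice_len:
        raise ValueError("signal too short for two slices")
    first = rng.randrange(0, len(signal) - 2 * slice_len + 1)
    second = rng.randrange(first + slice_len, len(signal) - slice_len + 1)
    return (
        list(signal[first:first + slice_len]),
        list(signal[second:second + slice_len]),
        second - first,
    )


if __name__ == "__main__":
    audio = [float(i) for i in range(100)]  # stand-in for real samples
    a, b, distance = sample_slice_pair(audio, 10, random.Random(0))
    print(len(a), len(b), distance)
```

A model trained on such pairs would predict the distance from the two slices, and the loss would compare that prediction with the returned value.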

    STEREO AUDIO SIGNAL ENCODER
    Patent application

    Publication No.: US20190392847A1

    Publication date: 2019-12-26

    Application No.: US16479916

    Filing date: 2018-01-03

    Abstract: A method comprising: receiving at least two audio channel signals; determining, for a first frame, at least two parameters representing a difference between the at least two audio channel signals; scalar quantising the at least two parameters to generate at least two index values; adaptively encoding an initial scalar quantized parameter of the at least two parameters; determining whether the initial scalar quantized parameter has a value different from a predetermined value; adaptively encoding any unencoded scalar quantized parameters where the initial scalar quantized parameter has a value different from the predetermined value; determining whether all of the at least two scalar quantized parameters have values equal to the predetermined value where the initial scalar quantized parameter has a value equal to the predetermined value; adaptively encoding any unencoded scalar quantized parameters and generating an indicator that an output is one of fixed or variable rate coding where the initial scalar quantized parameter has a value equal to the predetermined value and at least one of the at least two scalar quantized parameters has a value different from the predetermined value; generating an indicator that the output is the other of the one of fixed or variable rate coding where the initial scalar quantized parameter has a value equal to the predetermined value and all of the at least two scalar quantized parameters have values equal to the predetermined value; generating a single channel representation of the at least two audio channel signals dependent on the at least two parameters; and encoding the single channel representation.
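The branching in this abstract is easier to follow as a three-way classification of a frame's quantized index values. The sketch below only identifies which branch applies. It does not perform the adaptive encoding, and it does not decide which branch signals fixed rate and which signals variable rate, because the abstract leaves that open. The names `Branch` and `classify_frame` and the default predetermined value of 0 are assumptions.

```python
from enum import Enum
from typing import Sequence


class Branch(Enum):
    # Initial parameter differs: remaining parameters are adaptively encoded.
    INITIAL_DIFFERS = "initial_differs"
    # Initial equals the predetermined value but another parameter differs:
    # remaining parameters are encoded and one rate-mode indicator is sent.
    INITIAL_EQUAL_OTHERS_DIFFER = "initial_equal_others_differ"
    # Every parameter equals the predetermined value:
    # the other rate-mode indicator is sent.
    ALL_EQUAL = "all_equal"


def classify_frame(indices: Sequence[int], predetermined: int = 0) -> Branch:
    """Pick the branch for one frame of scalar-quantized index values."""
    if len(indices) < 2:
        raise ValueError("at least two parameters are required")
    if indices[0] != predetermined:
        return Branch.INITIAL_DIFFERS
    if all(i == predetermined for i in indices):
        return Branch.ALL_EQUAL
    return Branch.INITIAL_EQUAL_OTHERS_DIFFER


if __name__ == "__main__":
    print(classify_frame([3, 0, 1]))
    print(classify_frame([0, 0, 2]))
    print(classify_frame([0, 0, 0]))
```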

    COOPERATIVE PYRAMID VECTOR QUANTIZERS FOR SCALABLE AUDIO CODING

    Publication No.: US20190341064A1

    Publication date: 2019-11-07

    Application No.: US16050844

    Filing date: 2018-07-31

    Abstract: An example apparatus includes a memory configured to store the audio data; and one or more processors in communication with the memory, the one or more processors configured to: obtain, for each of a plurality of subbands of audio data, a respective energy scalar and a respective residual identifier; determine overall distortion levels for a plurality of candidate subband pulse allocations for performing pyramid vector dequantization (PVdQ) of the residual identifiers; select, from the plurality of subband pulse allocations and based on the overall distortion levels, a candidate subband pulse allocation; and perform, using the candidate subband pulse allocation, PVdQ on the residual identifiers to reconstruct a residual vector for each subband.
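The selection step can be sketched as picking the candidate allocation with the lowest overall distortion. The abstract does not say how distortion is computed or that the minimum is the selection criterion, so both are assumptions here. The distortion model `toy_distortion` is purely illustrative, and the names `select_allocation` and `toy_distortion` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Allocation = Tuple[int, ...]  # number of pulses assigned to each subband


def toy_distortion(energies: Sequence[float], allocation: Allocation) -> float:
    """Illustrative stand-in: more pulses in a subband lowers its distortion."""
    return sum(e / (1 + k) for e, k in zip(energies, allocation))


def select_allocation(
    energies: Sequence[float],
    candidates: Sequence[Allocation],
    distortion: Callable[[Sequence[float], Allocation], float] = toy_distortion,
) -> Allocation:
    """Return the candidate with the smallest overall distortion (assumed rule)."""
    if not candidates:
        raise ValueError("no candidate allocations given")
    return min(candidates, key=lambda alloc: distortion(energies, alloc))


if __name__ == "__main__":
    energies = [4.0, 1.0]                   # per-subband energy scalars
    candidates = [(1, 3), (3, 1), (2, 2)]   # each splits four pulses
    print(select_allocation(energies, candidates))
```

With the toy model, the allocation that gives more pulses to the higher-energy subband wins. A real decoder would substitute its own distortion measure.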
