METHODS, SYSTEMS, AND MEDIA FOR SEAMLESS AUDIO MELDING BETWEEN SONGS IN A PLAYLIST

    Publication No.: US20220093130A1

    Publication Date: 2022-03-24

    Application No.: US17542757

    Application Date: 2021-12-06

    Applicant: Google LLC

    Abstract: In accordance with some embodiments of the disclosed subject matter, mechanisms for seamless audio melding between audio items in a playlist are provided. In some embodiments, a method for transitioning between audio items in playlists is provided, comprising: identifying a sequence of audio items in a playlist of audio items, wherein the sequence of audio items includes a first audio item and a second audio item that is to be played subsequent to the first audio item; and modifying an end portion of the first audio item and a beginning portion of the second audio item, where the end portion of the first audio item and the beginning portion of the second audio item are to be played concurrently to transition between the first audio item and the second audio item, wherein the end portion of the first audio item and the beginning portion of the second audio item have an overlap duration, and wherein modifying the end portion of the first audio item and the beginning portion of the second audio item comprises: generating a first spectrogram corresponding to the end portion of the first audio item and a second spectrogram corresponding to the beginning portion of the second audio item; identifying, for each frequency band in a series of frequency bands, a window over which the first spectrogram within the end portion of the first audio item and the second spectrogram within the beginning portion of the second audio item have a particular cross-correlation; modifying, for each frequency band in the series of frequency bands, the end portion of the first spectrogram and the beginning portion of the second spectrogram such that amplitudes of frequencies within the frequency band decrease within the first spectrogram over the end portion of the first spectrogram and that amplitudes of frequencies within the frequency band increase within the second spectrogram over the beginning portion of the second spectrogram; and generating a modified version of the first 
audio item that includes the modified end portion of the first audio item based on the modified end portion of the first spectrogram and generating a modified version of the second audio item that includes the modified beginning portion of the second audio item based on the modified beginning portion of the second spectrogram.
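
The per-band transition described in this abstract can be illustrated with a minimal sketch. It assumes each frequency band is already available as a list of per-frame magnitudes over the overlap, that the "particular cross-correlation" means the highest Pearson correlation among candidate windows, and that the fades are linear. The function names (`correlation`, `best_window`, `fade_gains`, `modify_band`) are hypothetical and do not come from the patent.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y) if var_x > 0 and var_y > 0 else 0.0


def best_window(band_a, band_b, length):
    """Start frame of the window (assumed rule) where the two bands correlate most."""
    starts = range(len(band_a) - length + 1)
    return max(starts, key=lambda s: correlation(band_a[s:s + length],
                                                 band_b[s:s + length]))


def fade_gains(n_frames, start, length):
    """Per-frame (fade_out, fade_in) gains; linear ramp inside the window (length >= 2)."""
    fade_in = [min(1.0, max(0.0, (t - start) / (length - 1))) for t in range(n_frames)]
    fade_out = [1.0 - g for g in fade_in]
    return fade_out, fade_in


def modify_band(band_a, band_b, length):
    """Return the faded-out band of the first item and the faded-in band of the second."""
    start = best_window(band_a, band_b, length)
    fade_out, fade_in = fade_gains(len(band_a), start, length)
    faded_a = [a * g for a, g in zip(band_a, fade_out)]
    faded_b = [b * g for b, g in zip(band_b, fade_in)]
    return faded_a, faded_b
```

Because the window is chosen independently for each band, different bands may cross over at different times within the overlap, which is the point of working on spectrograms rather than applying a single time-domain fade.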

    LOOK-UP TABLE BASED NEURAL NETWORKS
    Invention application

    Publication No.: US20200234126A1

    Publication Date: 2020-07-23

    Application No.: US16751175

    Application Date: 2020-01-23

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing a network input using a neural network to generate a network output for the network input. One of the methods includes maintaining, for each of a plurality of neural network layers, a respective look-up table that maps each possible combination of a quantized input index and a quantized weight index to a multiplication result; and generating a network output from a network input, comprising, for each of the neural network layers: receiving data specifying a quantized input to the neural network layer, the quantized input comprising a plurality of quantized input values; and generating a layer output for the neural network layer from the quantized input to the neural network layer using the respective look-up table for the neural network layer.
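
The look-up idea in this abstract can be sketched as follows. The sketch assumes a fully connected layer with no bias or activation, and that quantized inputs and weights are stored as indices into small lists of representative values. The names `build_lut` and `lut_layer` are hypothetical and not taken from the patent.

```python
def build_lut(input_levels, weight_levels):
    """table[i][w] holds input_levels[i] * weight_levels[w] for every index pair."""
    return [[x * w for w in weight_levels] for x in input_levels]


def lut_layer(input_idx, weight_idx, lut):
    """Layer output where every multiplication is replaced by a table look-up.

    input_idx:  quantized input indices, one per input unit
    weight_idx: one row of quantized weight indices per output unit
    """
    return [sum(lut[i][w] for i, w in zip(input_idx, row)) for row in weight_idx]


# Example: 3 input levels, 2 weight levels, 3 inputs, 2 outputs.
lut = build_lut([-1.0, 0.0, 1.0], [-0.5, 0.5])
outputs = lut_layer([2, 2, 0], [[1, 1, 0], [0, 1, 1]], lut)
```

The table has only `len(input_levels) * len(weight_levels)` entries, so it stays small even when the layer itself has many weights.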

    IMAGE COMPRESSION AND DECOMPRESSION USING TRIANGULATION

    Publication No.: US20190356931A1

    Publication Date: 2019-11-21

    Application No.: US16413992

    Application Date: 2019-05-16

    Applicant: GOOGLE LLC

    Abstract: An encoder system can include a pixel grid generator to receive an image having a first dimension, generate a grid having a second dimension, add a plurality of points to positions on the grid, and map a plurality of pixels of the image to the plurality of points. The encoder system can include a color module to assign a color to each of the plurality of points using a color table, a triangulation module to generate a plurality of vertices based on the plurality of points and triangulate the grid using the vertices, and a compression module to compress the vertices as a set of compressed vertex positions and a set of vertex colors.
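
A rough sketch of the encoder stages named in this abstract follows. It makes several simplifying assumptions that the abstract does not state: points are placed on a regular sub-sampled grid, colors are assigned by nearest match in the color table, each grid cell is split into two triangles, and the final compression step is omitted. The names `nearest_color_index` and `encode_grid` are hypothetical.

```python
def nearest_color_index(pixel, color_table):
    """Index of the color-table entry closest to `pixel` (squared RGB distance)."""
    return min(range(len(color_table)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(pixel, color_table[k])))


def encode_grid(image, step, color_table):
    """Sample `image` (rows of RGB tuples) every `step` pixels and triangulate.

    Returns (vertices, colors, triangles): vertex positions as (x, y),
    one color-table index per vertex, and triangles as vertex-index triples.
    """
    ys = list(range(0, len(image), step))
    xs = list(range(0, len(image[0]), step))
    vertices = [(x, y) for y in ys for x in xs]
    colors = [nearest_color_index(image[y][x], color_table) for x, y in vertices]
    cols = len(xs)
    triangles = []
    for r in range(len(ys) - 1):
        for c in range(cols - 1):
            top_left = r * cols + c
            top_right = top_left + 1
            bottom_left = top_left + cols
            bottom_right = bottom_left + 1
            # Assumed rule: split each grid cell along one diagonal.
            triangles.append((top_left, top_right, bottom_left))
            triangles.append((top_right, bottom_right, bottom_left))
    return vertices, colors, triangles
```

A decoder could then rebuild an approximation of the image by interpolating vertex colors across each triangle.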

    Compression of occupancy or indicator grids
    Invention application

    Publication No.: US20190238893A1

    Publication Date: 2019-08-01

    Application No.: US15883639

    Application Date: 2018-01-30

    Applicant: GOOGLE LLC

    Abstract: Encoding and decoding occupancy information is disclosed. A method includes determining row sums for a region, determining column sums for the region, encoding, in a compressed bitstream, at least one of the row sums and the column sums, and encoding, in the compressed bitstream and based on a coding order, at least one of the rows and the columns of the region. The coding order is based on the encoded at least one of the row sums and the column sums. The row sums include, for each row of the region, a respective count of a number of locations in the row having a specified value. The column sums include, for each column of the region, a respective count of a number of locations in the column having the specified value. A location having the specified value is indicative of the occupancy information at the location.
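
The row and column sums defined in this abstract are straightforward to compute, as the sketch below shows. The abstract does not say how the coding order is derived from the sums, so `coding_order` uses an assumed rule for illustration only: lines whose sum already determines every cell (all empty or all full) are skipped, and the remaining lines are ordered from most to least constrained. Both function names are hypothetical.

```python
def line_sums(grid, value=1):
    """Count cells equal to `value` in every row and every column of `grid`."""
    row_sums = [sum(1 for cell in row if cell == value) for row in grid]
    col_sums = [sum(1 for row in grid if row[c] == value)
                for c in range(len(grid[0]))]
    return row_sums, col_sums


def coding_order(sums, line_length):
    """Assumed rule: skip fully determined lines, then code constrained lines first."""
    open_lines = [i for i, s in enumerate(sums) if 0 < s < line_length]
    return sorted(open_lines, key=lambda i: min(sums[i], line_length - sums[i]))


grid = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
]
row_sums, col_sums = line_sums(grid)
```

Since a decoder that receives the sums first can derive the same order, the order itself does not need to be transmitted under this assumed rule.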

    Stop code tolerant image compression neural networks

    Publication No.: US11354822B2

    Publication Date: 2022-06-07

    Application No.: US16610063

    Application Date: 2018-05-16

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image compression and reconstruction. A request to generate an encoded representation of an input image is received. The encoded representation of the input image is then generated. The encoded representation includes a respective set of binary codes at each iteration. Generating the set of binary codes for the iteration from an initial set of binary codes includes: for any tiles that have already been masked off during any previous iteration, masking off the tile. For any tiles that have not yet been masked off during any of the previous iterations, a determination is made as to whether a reconstruction error of the tile when reconstructed from binary codes at the previous iterations satisfies an error threshold. When the reconstruction error satisfies the error threshold, the tile is masked off.
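
The per-iteration masking rule in this abstract can be sketched in a few lines. The sketch assumes that "satisfies an error threshold" means the error is at or below the threshold, and that masking a tile replaces its codes with a placeholder value of 0. The names `update_mask`, `apply_mask`, and `stop_code` are hypothetical.

```python
def update_mask(masked, tile_errors, threshold):
    """Mask for the current iteration.

    A tile masked in any previous iteration stays masked; an unmasked tile is
    masked once its reconstruction error is at or below `threshold` (assumed
    meaning of "satisfies").
    """
    return [was_masked or error <= threshold
            for was_masked, error in zip(masked, tile_errors)]


def apply_mask(codes, masked, stop_code=0):
    """Replace the binary codes of masked tiles with a placeholder value."""
    return [[stop_code] * len(tile) if is_masked else tile
            for tile, is_masked in zip(codes, masked)]


mask = update_mask([False, True, False], [0.1, 0.9, 0.5], threshold=0.2)
codes = apply_mask([[1, -1], [1, 1], [-1, 1]], mask)
```

In the example, the second tile stays masked even though its current error is high, which matches the rule that masking is never undone in later iterations.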

    STOP CODE TOLERANT IMAGE COMPRESSION NEURAL NETWORKS

    Publication No.: US20210335017A1

    Publication Date: 2021-10-28

    Application No.: US16610063

    Application Date: 2018-05-16

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image compression and reconstruction. A request to generate an encoded representation of an input image is received. The encoded representation of the input image is then generated. The encoded representation includes a respective set of binary codes at each iteration. Generating the set of binary codes for the iteration from an initial set of binary codes includes: for any tiles that have already been masked off during any previous iteration, masking off the tile. For any tiles that have not yet been masked off during any of the previous iterations, a determination is made as to whether a reconstruction error of the tile when reconstructed from binary codes at the previous iterations satisfies an error threshold. When the reconstruction error satisfies the error threshold, the tile is masked off.

    Image compression with recurrent neural networks

    Publication No.: US10713818B1

    Publication Date: 2020-07-14

    Application No.: US16259207

    Application Date: 2019-01-28

    Applicant: Google LLC

    Abstract: Methods and systems, including computer programs encoded on computer storage media, for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
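
The link between time steps and variable compression rate can be illustrated with a toy sketch. It is not the LSTM architecture from the abstract: the learned encoder stack is replaced by a plain residual computation, the binarizing layer by a sign function, and the decoder by a fixed per-step scale. Those stand-ins, and the names `binarize` and `progressive_encode`, are assumptions made purely for illustration.

```python
def binarize(values):
    """Map each value to +1 or -1 (a sign binarizer, assumed for this sketch)."""
    return [1 if v >= 0 else -1 for v in values]


def progressive_encode(signal, scales):
    """Emit one set of binary codes per time step.

    At each step the input is the residual between the signal and the current
    reconstruction, so running more steps yields more bits and a closer
    reconstruction. `scales` is a toy stand-in for a learned decoder.
    """
    reconstruction = [0.0] * len(signal)
    codes = []
    for scale in scales:
        residual = [x - r for x, r in zip(signal, reconstruction)]
        bits = binarize(residual)
        codes.append(bits)
        reconstruction = [r + scale * b for r, b in zip(reconstruction, bits)]
    return codes, reconstruction


codes, reconstruction = progressive_encode([0.6, -0.3], [0.5, 0.25, 0.125])
```

Stopping after fewer steps gives a shorter code and a coarser reconstruction, which is the sense in which the rate is variable.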
