-
Publication No.: US20220093130A1
Publication date: 2022-03-24
Application No.: US17542757
Filing date: 2021-12-06
Applicant: Google LLC
Inventor: Michele Covell , Shumeet Baluja
Abstract: In accordance with some embodiments of the disclosed subject matter, mechanisms for seamless audio melding between audio items in a playlist are provided. In some embodiments, a method for transitioning between audio items in playlists is provided, comprising: identifying a sequence of audio items in a playlist of audio items, wherein the sequence of audio items includes a first audio item and a second audio item that is to be played subsequent to the first audio item; and modifying an end portion of the first audio item and a beginning portion of the second audio item, where the end portion of the first audio item and the beginning portion of the second audio item are to be played concurrently to transition between the first audio item and the second audio item, wherein the end portion of the first audio item and the beginning portion of the second audio item have an overlap duration, and wherein modifying the end portion of the first audio item and the beginning portion of the second audio item comprises: generating a first spectrogram corresponding to the end portion of the first audio item and a second spectrogram corresponding to the beginning portion of the second audio item; identifying, for each frequency band in a series of frequency bands, a window over which the first spectrogram within the end portion of the first audio item and the second spectrogram within the beginning portion of the second audio item have a particular cross-correlation; modifying, for each frequency band in the series of frequency bands, the end portion of the first spectrogram and the beginning portion of the second spectrogram such that amplitudes of frequencies within the frequency band decrease within the first spectrogram over the end portion of the first spectrogram and that amplitudes of frequencies within the frequency band increase within the second spectrogram over the beginning portion of the second spectrogram; and generating a modified version of the first audio item that includes the modified end portion of the first audio item based on the modified end portion of the first spectrogram and generating a modified version of the second audio item that includes the modified beginning portion of the second audio item based on the modified beginning portion of the second spectrogram.
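Illustrative sketch (not taken from the patent): the per-band amplitude ramp described in the abstract could look roughly like the following. The function name `fade_band`, the linear ramp shape, and the assumption that the fade window has already been chosen for the band are all hypothetical; the cross-correlation step that selects the window is not shown.

```python
def fade_band(magnitudes, start, end, fade_in):
    """Linearly ramp one frequency band's magnitudes over frames [start, end).

    Hypothetical helper: the window (start, end) is assumed to have been
    picked elsewhere, and end must be greater than start.
    """
    span = end - start
    faded = []
    for t, value in enumerate(magnitudes):
        if t < start:
            gain = 0.0 if fade_in else 1.0
        elif t >= end:
            gain = 1.0 if fade_in else 0.0
        else:
            ramp = (t - start) / span
            gain = ramp if fade_in else 1.0 - ramp
        faded.append(value * gain)
    return faded


# One band, four overlapping frames, constant magnitude in both items.
first_item_band = [1.0, 1.0, 1.0, 1.0]
second_item_band = [1.0, 1.0, 1.0, 1.0]
faded_out = fade_band(first_item_band, 0, 4, fade_in=False)
faded_in = fade_band(second_item_band, 0, 4, fade_in=True)
```

In this toy case the two gains sum to one in every frame, so the combined magnitude of the band stays constant across the overlap.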
-
Publication No.: US20200234126A1
Publication date: 2020-07-23
Application No.: US16751175
Filing date: 2020-01-23
Applicant: Google LLC
Inventor: Michele Covell , David Marwood , Shumeet Baluja , Nicholas Johnston
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing a network input using a neural network to generate a network output for the network input. One of the methods includes maintaining, for each of the plurality of neural network layers, a respective look-up table that maps each possible combination of a quantized input index and a quantized weight index to a multiplication result; and generating a network output from a network input, comprising, for each of the neural network layers: receiving data specifying a quantized input to the neural network layer, the quantized input comprising a plurality of quantized input values; and generating a layer output for the neural network layer from the quantized input to the neural network layer using the respective look-up table for the neural network layer.
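Illustrative sketch (not taken from the patent): a look-up table that maps every pair of quantized input index and quantized weight index to a precomputed product, so that a layer output can be formed by look-ups and additions only. The function names, the quantization levels, and the plain weighted-sum layer are assumptions made for this example.

```python
from itertools import product


def build_lookup_table(input_levels, weight_levels):
    """Precompute the product for every (input index, weight index) pair."""
    return {
        (i, w): x * y
        for (i, x), (w, y) in product(
            enumerate(input_levels), enumerate(weight_levels)
        )
    }


def layer_output(input_indices, weight_index_rows, table):
    """Form one output per row of weight indices using table look-ups only."""
    return [
        sum(table[(i, w)] for i, w in zip(input_indices, row))
        for row in weight_index_rows
    ]


# Hypothetical quantization levels for inputs and weights.
input_levels = [0.0, 0.5, 1.0]
weight_levels = [-1.0, 0.0, 1.0, 2.0]
table = build_lookup_table(input_levels, weight_levels)

# Quantized input (as indices) and two output units (rows of weight indices).
outputs = layer_output([2, 1, 0], [[3, 0, 1], [0, 0, 2]], table)
```

The table has one entry per index pair (12 here), and no multiplication happens at inference time in this sketch.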
-
Publication No.: US10681374B2
Publication date: 2020-06-09
Application No.: US16016857
Filing date: 2018-06-25
Applicant: GOOGLE LLC
Inventor: Debargha Mukherjee , Emil Keyder , Michele Covell , Chen Wang , Sarah Parker , Ramin Zabih
IPC: H04N19/527 , G06T7/246 , H04N19/172 , H04N19/44 , H04N19/159 , H04N19/124 , H04N19/176 , H04N19/119 , H04N19/192 , H04N19/109 , H04N19/137 , H04N19/17 , H04N19/167 , H04N19/543 , H04N19/573 , H04N19/147 , H04N19/14 , H04N19/557
Abstract: A method for encoding a current frame of a video includes jointly determining respective motion models for reference frames and encoding the current frame using the respective motion models. The reference frames are used for encoding the current frame. Jointly determining respective motion models for reference frames includes determining respective aggregated residuals for combinations of candidate motion models and selecting the combination of candidate motion models that corresponds to the smallest aggregated residual. The respective motion models correspond to the candidate motion models of the selected combination.
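Illustrative sketch (not taken from the patent): joint selection evaluates every combination of candidate motion models, one per reference frame, and keeps the combination with the smallest aggregated residual. The toy 1-D frames, the integer-shift "motion models", and the averaging predictor below are assumptions chosen only to make the example runnable.

```python
from itertools import product


def shift(frame, offset):
    """Circularly shift a 1-D frame by an integer offset (toy motion model)."""
    n = len(frame)
    return [frame[(i - offset) % n] for i in range(n)]


def select_joint_models(candidates, aggregated_residual):
    """Pick one candidate per reference frame, minimizing the joint residual."""
    best = min(product(*candidates), key=aggregated_residual)
    return best, aggregated_residual(best)


current = [0, 0, 4, 0, 0, 0]
references = [[0, 4, 0, 0, 0, 0], [0, 0, 0, 4, 0, 0]]
candidates = [[0, 1], [-1, 0]]  # hypothetical candidate shifts per reference


def aggregated_residual(combo):
    """Sum of absolute errors when predicting from the averaged references."""
    shifted = [shift(ref, s) for ref, s in zip(references, combo)]
    prediction = [sum(column) / len(column) for column in zip(*shifted)]
    return sum(abs(c - p) for c, p in zip(current, prediction))


best_models, best_residual = select_joint_models(candidates, aggregated_residual)
```

Because the residual is computed on the combined prediction, the models are chosen together rather than independently per reference frame.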
-
Publication No.: US20190356931A1
Publication date: 2019-11-21
Application No.: US16413992
Filing date: 2019-05-16
Applicant: GOOGLE LLC
Inventor: David Marwood , Michele Covell , Shumeet Baluja , Nicholas Milo Johnston , Pascal Massimino
IPC: H04N19/90 , H04N19/186 , H04N19/176 , G06T9/00
Abstract: An encoder system can include a pixel grid generator to receive an image having a first dimension, generate a grid having a second dimension, add a plurality of points to positions on the grid, and map a plurality of pixels of the image to the plurality of points. The encoder system can include a color module to assign a color to each of the plurality of points using a color table, a triangulation module to generate a plurality of vertices based on the plurality of points and triangulate the grid using the vertices, and a compression module to compress the vertices as a set of compressed vertex positions and a set of vertex colors.
-
Publication No.: US20190238893A1
Publication date: 2019-08-01
Application No.: US15883639
Filing date: 2018-01-30
Applicant: GOOGLE LLC
Inventor: Michele Covell , David Marwood , Shumeet Baluja , Rahul Sukthankar
CPC classification number: H04N19/91 , H04N19/13 , H04N19/14 , H04N19/176 , H04N19/44 , H04N19/463
Abstract: Encoding and decoding occupancy information is disclosed. A method includes determining row sums for the region, determining column sums for the region, encoding, in a compressed bitstream, at least one of the row sums and the column sums, and encoding, in the compressed bitstream and based on a coding order, at least one of the rows and the columns of the region. The coding order is based on the encoded at least one of the row sums and the column sums. The row sums include, for each row of the region, a respective count of a number of locations in the row having a specified value. The column sums include, for each column of the region, a respective count of a number of locations in the column having the specified value. A location having the specified value is indicative of the occupancy information at the location.
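Illustrative sketch (not taken from the patent): computing the row sums and column sums of a binary occupancy region, then deriving a coding order from the row sums. The specific ordering rule used here (rows with larger counts first) is an assumption; the abstract only states that the coding order is based on the encoded sums.

```python
def occupancy_sums(region, value=1):
    """Count, per row and per column, the locations holding the given value."""
    row_sums = [sum(1 for cell in row if cell == value) for row in region]
    column_sums = [
        sum(1 for cell in column if cell == value) for column in zip(*region)
    ]
    return row_sums, column_sums


def coding_order(sums):
    """Hypothetical rule: visit indices in order of decreasing count."""
    return sorted(range(len(sums)), key=lambda index: -sums[index])


region = [
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
row_sums, column_sums = occupancy_sums(region)
row_order = coding_order(row_sums)
```

A decoder that receives the same sums can reproduce the same order, which is why the order can depend on them without extra signaling in this sketch.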
-
Publication No.: US11354822B2
Publication date: 2022-06-07
Application No.: US16610063
Filing date: 2018-05-16
Applicant: GOOGLE LLC
Inventor: Michele Covell , Damien Vincent , David Charles Minnen , Saurabh Singh , Sung Jin Hwang , Nicholas Johnston , Joel Eric Shor , George Dan Toderici
IPC: G06T9/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image compression and reconstruction. A request to generate an encoded representation of an input image is received. The encoded representation of the input image is then generated. The encoded representation includes a respective set of binary codes at each iteration. Generating the set of binary codes for the iteration from an initial set of binary codes includes: for any tiles that have already been masked off during any previous iteration, masking off the tile. For any tiles that have not yet been masked off during any of the previous iterations, a determination is made as to whether a reconstruction error of the tile when reconstructed from binary codes at the previous iterations satisfies an error threshold. When the reconstruction quality satisfies the error threshold, the tile is masked off.
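Illustrative sketch (not taken from the patent): the per-iteration tile masking rule described in the abstract. A tile that was masked off earlier stays masked; otherwise it is masked once its reconstruction error satisfies the threshold. Reading "satisfies" as "error less than or equal to the threshold" is an assumption, as are the function name and the sample error values.

```python
def update_mask(previous_mask, tile_errors, threshold):
    """Return the mask for this iteration (True means the tile is masked off)."""
    return [
        already_masked or error <= threshold
        for already_masked, error in zip(previous_mask, tile_errors)
    ]


# Hypothetical reconstruction errors for three tiles over two iterations.
errors_per_iteration = [
    [0.9, 0.2, 0.5],
    [0.4, 0.3, 0.1],
]
threshold = 0.25

mask = [False, False, False]
mask_history = []
for tile_errors in errors_per_iteration:
    mask = update_mask(mask, tile_errors, threshold)
    mask_history.append(mask)
```

Note that the second tile stays masked in the second iteration even though its listed error (0.3) is above the threshold, since masking is never undone in this sketch.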
-
Publication No.: US20210335017A1
Publication date: 2021-10-28
Application No.: US16610063
Filing date: 2018-05-16
Applicant: GOOGLE LLC
Inventor: Michele Covell , Damien Vincent , David Charles Minnen , Saurabh Singh , Sung Jin Hwang , Nicholas Johnston , Joel Eric Shor , George Dan Toderici
IPC: G06T9/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image compression and reconstruction. A request to generate an encoded representation of an input image is received. The encoded representation of the input image is then generated. The encoded representation includes a respective set of binary codes at each iteration. Generating the set of binary codes for the iteration from an initial set of binary codes includes: for any tiles that have already been masked off during any previous iteration, masking off the tile. For any tiles that have not yet been masked off during any of the previous iterations, a determination is made as to whether a reconstruction error of the tile when reconstructed from binary codes at the previous iterations satisfies an error threshold. When the reconstruction quality satisfies the error threshold, the tile is masked off.
-
Publication No.: US11115678B2
Publication date: 2021-09-07
Application No.: US16861299
Filing date: 2020-04-29
Applicant: GOOGLE LLC
Inventor: Debargha Mukherjee , Emil Keyder , Michele Covell , Chen Wang , Sarah Parker , Ramin Zabih
IPC: H04N19/527 , H04N19/172 , H04N19/176 , H04N19/109 , G06T7/246 , H04N19/44 , H04N19/159 , H04N19/124 , H04N19/119 , H04N19/192 , H04N19/137 , H04N19/17 , H04N19/167 , H04N19/543 , H04N19/573 , H04N19/147 , H04N19/14 , H04N19/557
Abstract: An apparatus for encoding a current frame of a video. The apparatus includes a memory and a processor. The processor is configured to execute instructions stored in the memory to generate, for each reference frame of a subset of available reference frames, at least one respective candidate global motion model (GMM); partition the current frame into blocks; generate an aggregated residual frame for the current frame; and encode the respective residual blocks in a compressed bitstream. To generate the aggregated residual frame includes to select, for predicting each block of the blocks, a respective selected GMM, where the respective selected GMM corresponds to the one of the at least one respective candidate GMMs that minimizes a total error associated with the aggregated residual frame; and obtain respective residual blocks for the block.
-
Publication No.: US10713818B1
Publication date: 2020-07-14
Application No.: US16259207
Filing date: 2019-01-28
Applicant: Google LLC
Inventor: George Dan Toderici , Sean O'Malley , Rahul Sukthankar , Sung Jin Hwang , Damien Vincent , Nicholas Johnston , David Charles Minnen , Joel Shor , Michele Covell
Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
-
Publication No.: US20190149841A1
Publication date: 2019-05-16
Application No.: US16016857
Filing date: 2018-06-25
Applicant: GOOGLE LLC
Inventor: Debargha Mukherjee , Emil Keyder , Michele Covell , Chen Wang , Sarah Parker , Ramin Zabih
IPC: H04N19/527 , G06T7/246 , H04N19/172 , H04N19/176 , H04N19/159 , H04N19/124 , H04N19/44
Abstract: A method for encoding a current frame of a video includes jointly determining respective motion models for reference frames and encoding the current frame using the respective motion models. The reference frames are used for encoding the current frame. Jointly determining respective motion models for reference frames includes determining respective aggregated residuals for combinations of candidate motion models and selecting the combination of candidate motion models that corresponds to the smallest aggregated residual. The respective motion models correspond to the candidate motion models of the selected combination.