-
Publication No.: US20230007284A1
Publication date: 2023-01-05
Application No.: US17779380
Filing date: 2019-12-23
Applicant: Google LLC
Inventor: Shan Li, Claudionor Coelho, In Suk Chong, Aki Kuusela
IPC: H04N19/436, H04N19/176, H04N19/593, H04N19/159, H04N19/11, H04N19/124, H04N19/149
Abstract: Ultra light models and decision fusion for increasing the speed of intra-prediction are described. Using a machine-learning (ML) model, an ML intra-prediction mode is obtained. A most-probable intra-prediction mode is obtained from amongst available intra-prediction modes for encoding the current block. As an encoding intra-prediction mode, one of the ML intra-prediction mode or the most-probable intra-prediction mode is selected, and the encoding intra-prediction mode is encoded in a compressed bitstream. A current block is encoded using the encoding intra-prediction mode. Selection of the encoding intra-prediction mode is based on relative reliabilities of the ML intra-prediction mode and the most-probable intra-prediction mode.
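The selection step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the reliabilities are computed or compared, so the `reliability` score, the `ModeCandidate` type, and the tie-breaking rule below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModeCandidate:
    mode: int           # intra-prediction mode index
    reliability: float  # hypothetical confidence score, higher is better


def select_encoding_mode(ml: ModeCandidate, mpm: ModeCandidate) -> int:
    """Return the mode of the more reliable candidate.

    Assumption: ties go to the most-probable mode, since it needs no
    model inference. The source does not specify a tie-breaking rule.
    """
    return ml.mode if ml.reliability > mpm.reliability else mpm.mode
```

For example, `select_encoding_mode(ModeCandidate(5, 0.9), ModeCandidate(1, 0.4))` picks the ML mode, 5.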
-
Publication No.: US20200280717A1
Publication date: 2020-09-03
Application No.: US16289149
Filing date: 2019-02-28
Applicant: GOOGLE LLC
Inventor: Shan Li, Claudionor Coelho, Aki Kuusela, Dake He
IPC: H04N19/107, H04N19/119, H04N19/176, H04N19/96, G06N3/04, G06N3/08
Abstract: Convolutional neural networks (CNN) that determine a mode decision (e.g., block partitioning) for encoding a block include feature extraction layers and multiple classifiers. A non-overlapping convolution operation is performed at a feature extraction layer by setting a stride value equal to a kernel size. The block has an N×N size, and a smallest partition output for the block has an S×S size. Classification layers of each classifier receive feature maps having a feature dimension. An initial classification layer receives the feature maps as an output of a final feature extraction layer. Each classifier infers partition decisions for sub-blocks of size (αS)×(αS) of the block, wherein α is a power of 2 and α = 2, …, N/S, by applying, at some successive classification layers, a 1×1 kernel to reduce respective feature dimensions; and outputting by a last layer of the classification layers an output corresponding to an N/(αS)×N/(αS)×1 output map.
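A non-overlapping convolution, where the stride equals the kernel size, can be sketched in plain Python as follows. This is a single-channel illustration only; the function name and the list-of-lists representation are assumptions, not anything taken from the patent.

```python
def non_overlapping_conv2d(block, kernel):
    """Apply a k x k kernel to an n x n block with stride k.

    Because the stride equals the kernel size, each input sample
    contributes to exactly one output sample.
    """
    n, k = len(block), len(kernel)
    if n % k != 0:
        raise ValueError("block size must be a multiple of the kernel size")
    out = []
    for i in range(0, n, k):          # step by k: stride == kernel size
        row = []
        for j in range(0, n, k):
            acc = sum(block[i + a][j + b] * kernel[a][b]
                      for a in range(k) for b in range(k))
            row.append(acc)
        out.append(row)
    return out
```

A 4×4 block with a 2×2 kernel yields a 2×2 output map, one value per tile.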
-
Publication No.: US20200092556A1
Publication date: 2020-03-19
Application No.: US16134134
Filing date: 2018-09-18
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho, Dake He, Aki Kuusela, Shan Li
IPC: H04N19/124, H04N19/176, H04N19/96, H04N19/164
Abstract: A method for encoding an image block includes presenting, to a machine-learning model, the image block and a first value corresponding to a first quantization parameter; obtaining first mode decision parameters from the machine-learning model; and encoding the image block using the first mode decision parameters. The first value results from a non-linear function using the first quantization parameter as input. The machine-learning model is trained to output mode decision parameters by using training data. Each training datum includes a training block that is encoded by a second encoder, second mode decision parameters used by the second encoder for encoding the training block, and a second value corresponding to a second quantization parameter. The second encoder used the second quantization parameter for encoding the training block and the second value results from the non-linear function using the second quantization parameter as input.
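The abstract states only that the value fed to the model is a non-linear function of the quantization parameter. The sketch below shows how such a value might be paired with a block at both training and inference time. The two candidate functions (quadratic and exponential) and every name here are hypothetical choices for illustration.

```python
import math


def qp_feature(qp: int, kind: str = "square") -> float:
    """Map a quantization parameter to a non-linear model input.

    Both options are illustrative assumptions; the source does not
    name the function it uses.
    """
    if kind == "square":
        return float(qp * qp)
    if kind == "exp":
        return math.exp(qp / 10.0)
    raise ValueError(f"unknown kind: {kind}")


def build_model_input(block, qp: int, kind: str = "square"):
    """Flatten the block and attach the QP-derived value.

    The same function must be applied to training data and to blocks
    presented at encoding time, as the abstract requires.
    """
    pixels = [p for row in block for p in row]
    return pixels, qp_feature(qp, kind)
```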
-
Publication No.: US20200092552A1
Publication date: 2020-03-19
Application No.: US16134165
Filing date: 2018-09-18
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho, Aki Kuusela, Shan Li, Dake He
IPC: H04N19/119, H04N19/176, H04N19/147, H04N19/19
Abstract: A convolutional neural network (CNN) for determining a partitioning of a block is disclosed. The block is of size N×N and a smallest partition is of size S×S. The CNN includes feature extraction layers; a concatenation layer that receives, from the feature extraction layers, first feature maps of the block, where each first feature map is of size S×S; and classifiers. Each classifier includes classification layers, and each classification layer receives second feature maps having a respective feature dimension. Each classifier is configured to infer partition decisions for sub-blocks of size (αS)×(αS) of the block, wherein α is a power of 2 and α = 2, …, N/S, by: applying, at some successive layers of the classification layers, a kernel of size 1×1 to halve the respective feature dimension; and outputting by a last layer of the classification layers an output corresponding to an N/(αS)×N/(αS)×1 output map.
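The output-map sizes follow directly from the formula N/(αS)×N/(αS)×1. As a worked illustration, the helper below (a hypothetical name, not from the patent) enumerates the shape for every α that is a power of 2 between 2 and N/S.

```python
def classifier_output_shapes(n: int, s: int) -> dict:
    """Map each alpha (2, 4, ..., N/S) to its N/(alpha*S) square output map."""
    if n % s != 0:
        raise ValueError("N must be a multiple of S")
    shapes = {}
    alpha = 2
    while alpha <= n // s:
        side = n // (alpha * s)   # one decision per (alpha*S) x (alpha*S) sub-block
        shapes[alpha] = (side, side, 1)
        alpha *= 2
    return shapes


print(classifier_output_shapes(64, 8))
# {2: (4, 4, 1), 4: (2, 2, 1), 8: (1, 1, 1)}
```

With N = 64 and S = 8, the three classifiers cover 16×16, 32×32, and 64×64 sub-blocks respectively.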
-
Publication No.: US20190191168A1
Publication date: 2019-06-20
Application No.: US15847093
Filing date: 2017-12-19
Applicant: GOOGLE LLC
Inventor: Aki Kuusela, Daniel Stodolsky, Juha Pekka Maaninen
IPC: H04N19/14, H04N19/567, H04N19/82
CPC classification number: H04N19/14, H04N19/127, H04N19/149, H04N19/156, H04N19/194, H04N19/567, H04N19/82
Abstract: Systems and methods are disclosed for encoding video. For example, methods may include: receiving a throughput setting; adjusting, based on the throughput setting, an effort level selection for an encoder to utilize multiple effort levels from a set of effort levels, wherein each effort level of the set of effort levels specifies parameters of the encoder that control processing time for a coding unit of video data; and encoding video data, using the encoder configured using effort levels identified by the effort level selection, to generate data of an encoded bitstream.
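One way to use multiple effort levels to meet a throughput setting is to blend the two levels whose per-coding-unit costs bracket the target. The abstract does not describe the adjustment rule, so the linear blend, the function name, and the cost figures below are assumptions for illustration only.

```python
def mix_effort_levels(costs, target):
    """Choose two adjacent effort levels and the share of units for the higher one.

    costs:  strictly ascending processing times per coding unit, one per level
    target: desired average processing time per coding unit
    Returns (low_index, high_index, fraction_at_high).
    """
    if target <= costs[0]:
        return 0, 0, 0.0                      # even the fastest level is too slow or exact
    last = len(costs) - 1
    if target >= costs[last]:
        return last, last, 0.0                # budget allows the highest effort throughout
    for i in range(last):
        if costs[i] <= target < costs[i + 1]:
            frac = (target - costs[i]) / (costs[i + 1] - costs[i])
            return i, i + 1, frac
    raise ValueError("costs must be strictly ascending")
```

With hypothetical costs `[1.0, 2.0, 4.0]` and a target of 3.0, half of the coding units would use level 1 and half level 2.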
-
Publication No.: US20180035129A1
Publication date: 2018-02-01
Application No.: US15728661
Filing date: 2017-10-10
Applicant: GOOGLE LLC
Inventor: Aki Kuusela
IPC: H04N19/527, H04N19/40, H04N19/176, H04N19/513, H04N19/30, H04N19/196, H04N19/194, H04N19/59, H04N19/139, H04N19/53, H04N19/423
CPC classification number: H04N19/527, H04N19/139, H04N19/176, H04N19/194, H04N19/196, H04N19/30, H04N19/40, H04N19/423, H04N19/513, H04N19/53, H04N19/59
Abstract: An apparatus for use in low-latency two-pass video coding may include a memory and a processor configured to execute instructions stored in the memory to identify an input frame from an input video stream, determine a reduced frame from the input frame, the reduced frame having a size smaller than a size of the input frame, generate an encoded reduced frame by encoding the reduced frame, wherein encoding the reduced frame includes generating encoding metrics, generate encoding parameters based on the encoding metrics, generate an encoded frame by encoding the input frame using an encoding parameter from the encoding parameters, include the encoded frame in an output bitstream, and store or transmit the output bitstream.
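The two-pass flow (reduce the frame, gather metrics from a cheap first pass, derive parameters for the full-size encode) can be sketched as follows. Everything concrete here is an assumption: 2×2 averaging stands in for the reduction, pixel variance stands in for the encoding metrics, and the QP rule and its threshold are invented for illustration.

```python
def reduce_frame(frame):
    """Halve width and height by averaging each 2x2 group of pixels."""
    h, w = len(frame), len(frame[0])
    return [[(frame[y][x] + frame[y][x + 1]
              + frame[y + 1][x] + frame[y + 1][x + 1]) / 4
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]


def first_pass_metrics(reduced):
    """Stand-in for encoding the reduced frame: report pixel variance."""
    pixels = [p for row in reduced for p in row]
    mean = sum(pixels) / len(pixels)
    return {"variance": sum((p - mean) ** 2 for p in pixels) / len(pixels)}


def derive_parameters(metrics, base_qp=30):
    """Hypothetical rule: use a coarser quantizer for busier frames."""
    return {"qp": base_qp + (4 if metrics["variance"] > 1000 else 0)}
```

The second pass would then encode the full-size input frame with the derived parameters; that step is encoder-specific and is left out of the sketch.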
-
Publication No.: US11956447B2
Publication date: 2024-04-09
Application No.: US17601639
Filing date: 2019-03-21
Applicant: Google LLC
Inventor: Claudionor Coelho, Aki Kuusela, Joseph Young, Shan Li, Dake He
IPC: H04N19/147, G06T9/00, H04N19/176, H04N19/96
CPC classification number: H04N19/147, G06T9/002, H04N19/176, H04N19/96
Abstract: An apparatus for encoding an image block includes a processor that presents, to a machine-learning model, the image block, obtains a partition decision for encoding the image block from the model, and encodes the image block using the partition decision. The model is trained to output a partition decision for encoding the image block by using training data for a plurality of training blocks as input, the training data including, for a training block, partition decisions for encoding the training block and, for each partition decision, a rate-distortion value resulting from encoding the training block using the partition decision. The model is trained using a loss function combining a partition loss function based upon a relationship between the partition decisions and respective predicted partitions, and a rate-distortion cost loss function based upon a relationship between the rate-distortion values and respective predicted rate-distortion values.
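The combined loss can be sketched as a weighted sum of the two terms. The abstract does not specify the form of either term or how they are combined, so binary cross-entropy for the partition term, mean squared error for the rate-distortion term, and the `weight` parameter are all assumptions.

```python
import math


def partition_loss(targets, predictions, eps=1e-12):
    """Binary cross-entropy between split decisions (0/1) and predicted probabilities."""
    total = 0.0
    for t, p in zip(targets, predictions):
        total += t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
    return -total / len(targets)


def rd_cost_loss(rd_true, rd_pred):
    """Mean squared error between measured and predicted rate-distortion values."""
    return sum((a - b) ** 2 for a, b in zip(rd_true, rd_pred)) / len(rd_true)


def combined_loss(targets, predictions, rd_true, rd_pred, weight=1.0):
    """Partition loss plus a weighted rate-distortion cost loss (assumed form)."""
    return partition_loss(targets, predictions) + weight * rd_cost_loss(rd_true, rd_pred)
```

Setting `weight` to 0 reduces the sketch to a pure partition classifier, which makes the contribution of the rate-distortion term easy to isolate.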
-
Publication No.: US11310498B2
Publication date: 2022-04-19
Application No.: US17086591
Filing date: 2020-11-02
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho, Aki Kuusela, Shan Li, Dake He
IPC: H04N19/119, H04N19/19, H04N19/147, H04N19/176
Abstract: An apparatus for encoding a block of a picture includes a convolutional neural network (CNN) for determining a block partitioning of the block, the block having an N×N size and a smallest partition determined by the CNN being of size S×S. The CNN includes feature extraction layers; a concatenation layer that receives, from the feature extraction layers, first feature maps of the block, where each first feature map of the first feature maps is of the smallest possible partition size S×S of the block; and at least one classifier that is configured to infer partition decisions for sub-blocks of size (αS)×(αS) of the block, where α is a power of 2.
-
Publication No.: US20210051322A1
Publication date: 2021-02-18
Application No.: US17086591
Filing date: 2020-11-02
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho, Aki Kuusela, Shan Li, Dake He
IPC: H04N19/119, H04N19/19, H04N19/147, H04N19/176
Abstract: An apparatus for encoding a block of a picture includes a convolutional neural network (CNN) for determining a block partitioning of the block, the block having an N×N size and a smallest partition determined by the CNN being of size S×S. The CNN includes feature extraction layers; a concatenation layer that receives, from the feature extraction layers, first feature maps of the block, where each first feature map of the first feature maps is of the smallest possible partition size S×S of the block; and at least one classifier that is configured to infer partition decisions for sub-blocks of size (αS)×(αS) of the block, where α is a power of 2.
-
Publication No.: US10547869B2
Publication date: 2020-01-28
Application No.: US15835501
Filing date: 2017-12-08
Applicant: GOOGLE LLC
Inventor: Aki Kuusela, Dake He
IPC: H04N19/60, H04N19/18, H04N19/13, H04N19/124
Abstract: A method of coding a transform block having transform coefficients includes selecting, based on a transform type used for the transform block, a spatial template for a coding context; defining shift registers to each hold one or more stored values regarding the coding context; initializing the shift registers by setting the stored values to default values; and coding values indicative of magnitudes of the transform coefficients from the transform block in a reverse scan order. Coding includes, for each of one or more values, obtaining a value to be coded at a scan position, determining the coding context using the stored values from the shift registers, entropy coding the value to be coded using the coding context, and subsequent to entropy coding the value to be coded, updating at least some of the stored values in the shift registers.
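The loop described in this abstract (initialize the registers, derive a context from stored values, code the value, then update the registers) can be sketched with a single shift register. The register length, the context rule (a capped sum of stored magnitudes), and the recording of `(magnitude, context)` pairs in place of real entropy coding are assumptions; the template selection by transform type is omitted.

```python
from collections import deque


def code_magnitudes(coeffs_in_scan_order, register_len=2, max_context=3):
    """Walk coefficients in reverse scan order, deriving a context from a shift register."""
    # Initialize the register with default values (zeros here, by assumption).
    reg = deque([0] * register_len, maxlen=register_len)
    coded = []
    for value in reversed(coeffs_in_scan_order):
        magnitude = abs(value)
        # Context comes only from previously coded values held in the register.
        context = min(sum(reg), max_context)
        coded.append((magnitude, context))   # stand-in for entropy coding
        # Update after coding: shift in the new value, dropping the oldest.
        reg.appendleft(magnitude)
    return coded


print(code_magnitudes([3, 0, 1, 2]))
# [(2, 0), (1, 2), (0, 3), (3, 1)]
```

Because the context depends only on register contents, no access to the full coefficient array is needed when the context is derived.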