Abstract:
This application relates to the field of artificial intelligence technologies, and provides a feature map processing method and a related device. The method is implemented by invoking a neural network model that includes a plurality of input adaptation branches and a post-processing part. An output of each of the plurality of input adaptation branches is an input of the post-processing part, and each of the plurality of input adaptation branches can downsample a feature map at a different ratio.
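The branch structure described above can be illustrated with a minimal sketch. It assumes that each input adaptation branch is a simple average-pooling step, that the post-processing part is a global mean, and that a branch is selected so the feature map reaches a fixed size before post-processing. The names `downsample`, `post_process`, `run_model`, and `target_size` are hypothetical and do not come from the abstract.

```python
from typing import Callable, Dict, List

FeatureMap = List[List[float]]


def downsample(feature_map: FeatureMap, ratio: int) -> FeatureMap:
    """Average-pool the map over non-overlapping ratio x ratio windows."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            sum(feature_map[y + dy][x + dx]
                for dy in range(ratio) for dx in range(ratio)) / (ratio * ratio)
            for x in range(0, width - ratio + 1, ratio)
        ]
        for y in range(0, height - ratio + 1, ratio)
    ]


def post_process(feature_map: FeatureMap) -> float:
    """Stand-in for the shared post-processing part: a global mean (assumption)."""
    return sum(sum(row) for row in feature_map) / (len(feature_map) * len(feature_map[0]))


def run_model(feature_map: FeatureMap,
              branches: Dict[int, Callable[[FeatureMap], FeatureMap]],
              target_size: int) -> float:
    """Pick the branch whose ratio brings the map to target_size (assumed rule),
    then feed that branch's output to the post-processing part."""
    ratio = len(feature_map) // target_size
    return post_process(branches[ratio](feature_map))


# One branch per downsampling ratio; all branches feed the same post-processing part.
branches = {r: (lambda fm, r=r: downsample(fm, r)) for r in (1, 2, 4)}
```

With this arrangement, inputs of different sizes can share one post-processing part, since each is routed through the branch that matches its size.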
Abstract:
A method comprising: obtaining a bitstream that comprises a transform unit syntax and a coding unit syntax, wherein the transform unit syntax includes a value of a first flag and a value of a second flag related to, respectively, a first chroma transform block and a second chroma transform block of a current transform unit or of a current sub-transform unit within the current transform unit, the first or second flag specifies whether the first or second chroma transform block contains at least one transform coefficient level not equal to 0, and the coding unit syntax includes a value of a third flag specifying whether a transform tree structure is present; and deriving a value of a fourth flag based on the values of the first, second, and third flags, wherein the fourth flag specifies whether a luma transform block contains at least one transform coefficient level not equal to 0.
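The abstract does not state the derivation rule itself. The sketch below shows one plausible rule, given purely as an assumption: if a transform tree is present but neither chroma block has a non-zero coefficient level, the luma block must have one, so the fourth flag can be inferred rather than parsed. The function name `derive_luma_cbf` and the use of `None` for "not derivable, must be parsed" are hypothetical.

```python
from typing import Optional


def derive_luma_cbf(cb_cbf: bool, cr_cbf: bool,
                    transform_tree_present: bool) -> Optional[bool]:
    """Derive the luma flag (fourth flag) from the two chroma flags and the
    transform-tree flag. Returns None when the value cannot be inferred and
    would have to be read from the bitstream instead (assumed behavior)."""
    if not transform_tree_present:
        # No transform tree: no coded residual, so no luma coefficients.
        return False
    if not cb_cbf and not cr_cbf:
        # A residual exists but neither chroma block carries it: luma must.
        return True
    return None
```

Inferring the flag in this way saves signaling it in the cases where its value is already implied by the other three flags.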
Abstract:
This application provides a motion vector obtaining method and apparatus. The method includes: determining a target offset vector of a block and identifier information of a target picture, wherein the block comprises at least one sub-block; determining a location of the sub-block; determining, as a target location coordinate value of a collocated sub-block, a location coordinate value obtained by performing a clipping operation on an initial location coordinate value in a range, wherein the initial location coordinate value is based on the location of the sub-block and the target offset vector; and obtaining a motion vector of the sub-block based on a motion vector corresponding to the target location coordinate value. Because the range of the target offset vector is thereby limited, the number of memory reads needed to obtain the motion vector of the collocated sub-block can be reduced.
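The clipping step can be sketched as follows. This is a minimal illustration, assuming that the initial coordinate is the sub-block location plus the offset vector and that the motion field of the target picture is available as a dictionary keyed by coordinate. The names `clip`, `collocated_mv`, `x_range`, `y_range`, and `mv_field` are hypothetical.

```python
from typing import Dict, Tuple

Vec = Tuple[int, int]


def clip(value: int, low: int, high: int) -> int:
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(value, high))


def collocated_mv(sub_block_pos: Vec, offset: Vec,
                  x_range: Tuple[int, int], y_range: Tuple[int, int],
                  mv_field: Dict[Vec, Vec]) -> Vec:
    """Return the motion vector stored at the clipped collocated position."""
    # Initial coordinate: sub-block location shifted by the target offset vector.
    init_x = sub_block_pos[0] + offset[0]
    init_y = sub_block_pos[1] + offset[1]
    # Clipping keeps the lookup inside the permitted range.
    target = (clip(init_x, *x_range), clip(init_y, *y_range))
    return mv_field[target]
```

Because every lookup lands inside the clipped range, only the motion vectors within that range need to be fetched from memory.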
Abstract:
An image prediction method, apparatus, and system, a device, and a storage medium are provided. The method includes: (401) obtaining a split mode of a current node, where the current node is an image block in a coding tree unit in a current image; (402) determining, based on the split mode of the current node and a size of the current node, whether the current node satisfies a first condition; and (403) when it is determined that the current node satisfies the first condition, performing intra prediction on all coding blocks belonging to the current node, to obtain predictors of all the coding blocks belonging to the current node.
Abstract:
A method includes parsing a transform coefficient of a transform block in a current coding unit to obtain a first transform coefficient matrix. The method also includes obtaining a quantity K of non-zero transform coefficients in a top-left preset region of the first transform coefficient matrix. The method also includes parsing an index value based on the quantity K being greater than a threshold. The method also includes determining a transform matrix based on the index value. The method also includes multiplying N transform coefficients of the first transform coefficient matrix by the transform matrix to obtain M transform coefficients. The method also includes updating the first transform coefficient matrix by using the M transform coefficients to obtain a second transform coefficient matrix. The method also includes performing an inverse transform on the second transform coefficient matrix to obtain residual samples of the current coding unit.
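The steps above can be sketched as follows. The sketch assumes a row-major ordering for selecting the N coefficients, writes the M results back to the first M positions, and leaves the matrix unchanged when K does not exceed the threshold; none of these details is stated in the abstract. The final inverse transform is omitted. All function and parameter names are hypothetical.

```python
from typing import Callable, List

Matrix = List[List[int]]


def count_nonzero_top_left(coeffs: Matrix, region: int) -> int:
    """Quantity K of non-zero coefficients in the top-left region x region area."""
    return sum(1 for row in coeffs[:region] for c in row[:region] if c != 0)


def apply_transform(coeffs: Matrix, transform: Matrix, n: int) -> Matrix:
    """Multiply the first n coefficients (row-major order, an assumption) by an
    M x N transform matrix and write the M results back into the matrix."""
    width = len(coeffs[0])
    flat = [c for row in coeffs for c in row]
    vec = flat[:n]
    out = [sum(t * v for t, v in zip(t_row, vec)) for t_row in transform]
    flat[:len(out)] = out
    return [flat[i * width:(i + 1) * width] for i in range(len(coeffs))]


def second_coefficient_matrix(coeffs: Matrix, region: int, threshold: int,
                              parse_index: Callable[[], int],
                              matrices: List[Matrix], n: int) -> Matrix:
    """Parse the index only when K exceeds the threshold, then update the matrix."""
    k = count_nonzero_top_left(coeffs, region)
    if k <= threshold:
        return coeffs  # Assumed: no index is parsed and nothing is updated.
    index = parse_index()
    return apply_transform(coeffs, matrices[index], n)
```

Here `parse_index` stands in for reading the index value from the bitstream, and `matrices` for the set of candidate transform matrices that the index selects from.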
Abstract:
The present disclosure provides a video decoding method and a video decoder. The method includes: parsing coding tree split information to obtain a current node; determining coordinates of an upper-left corner of a region covered by a current quantization group based on a depth N of the current node; obtaining a QP delta of a current CU in the region covered by the current quantization group; and obtaining a reconstructed picture of the current CU based on the QP delta of the current CU.
Abstract:
Embodiments of the present invention provide a method for presenting communication information in video communication, including: controlling collection of audio information and video information of a video communication site; determining a position of a speaker in the video communication site according to the audio information, where the speaker is a participant in the video communication site who is speaking; acquiring speech video information from the video information according to the position of the speaker, where the speech video information is video information of the speaker within a speaking period; and controlling presentation of the speech video information.
Abstract:
Embodiments of this application disclose a decoding method that includes: obtaining a bitstream including picture data; parsing the bitstream to obtain node split mode information of a first-level coding tree and node split mode information of a second-level coding tree; if the split mode corresponding to the first node is no further splitting, parsing the bitstream to obtain encoding information of the first node; and decoding and reconstructing, based on the encoding information of the first node, a coding unit corresponding to the first node, to obtain a picture corresponding to the picture data.
Abstract:
Embodiments of this application disclose a picture reconstruction method and apparatus for a video picture. The picture reconstruction method includes: obtaining a prediction mode of a current coding unit, and/or obtaining a prediction partition mode of the current coding unit, where the current coding unit includes a luma coding block and a chroma coding block, and the prediction partition mode is a mode of splitting the current coding unit into prediction blocks or prediction units; obtaining a transform block of the current coding unit based on the prediction partition mode and/or the prediction mode; and generating a reconstructed picture block of the current coding unit based on the transform block. According to the method in the embodiments of this application, video coding efficiency can be improved.