Abstract:
A graphics processing unit (GPU) may perform three-dimensional (3D) graphics processing in accordance with a 3D graphics pipeline using a first plurality of graphics processing hardware units of the GPU. The GPU may further perform a two-dimensional (2D) graphics operation using a second plurality of graphics processing hardware units of the GPU not used in performing the 3D graphics processing and one or more graphics processing hardware units of the first plurality of graphics processing hardware units of the GPU.
Abstract:
A system and method are disclosed that include an execution unit that can be used to count the leading zeros in a data word. During operation, the execution unit can receive a data word that has a width of 2 to the Nth power. Further, the execution unit can sign extend the data word to a temporary data word that has a width of 2 to the Mth power, wherein M is greater than N. The temporary data word can be input to a counter that has a width of 2 to the Mth power, and the counter can count the leading zeros within the temporary data word to get a result.
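The extend-then-count flow described above can be sketched in Python. This is a minimal illustration, not the disclosed hardware: the function names are hypothetical, words are modeled as unsigned bit patterns, and the final adjustment that subtracts the padding width is an assumption, since the abstract only says the counter produces "a result".

```python
def sign_extend(word: int, from_bits: int, to_bits: int) -> int:
    """Sign extend a from_bits-wide bit pattern to to_bits bits."""
    word &= (1 << from_bits) - 1
    if word >> (from_bits - 1):  # sign bit set: fill the upper bits with ones
        word |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return word


def count_leading_zeros(word: int, width: int) -> int:
    """Count leading zeros of a width-bit pattern."""
    word &= (1 << width) - 1
    return width - word.bit_length()


def clz_via_wider_counter(word: int, n: int, m: int) -> int:
    """Count leading zeros of a 2**n-bit word using a 2**m-bit counter."""
    if m <= n:
        raise ValueError("m must be greater than n")
    narrow, wide = 1 << n, 1 << m
    temp = sign_extend(word, narrow, wide)
    raw = count_leading_zeros(temp, wide)
    # Assumption (not stated in the abstract): remove the padding bits from
    # the raw count so the result refers to the original narrow word.
    return max(raw - (wide - narrow), 0)
```

For example, with `n = 3` and `m = 5` an 8-bit word is counted by a 32-bit counter, and `clz_via_wider_counter(0x10, 3, 5)` evaluates to 3.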
Abstract:
A graphics processing unit (GPU) may include a triangle setup engine (TSE) configured to determine coordinates of a triangle and rotate the coordinates of the triangle based on an angle. To rotate the coordinates, the TSE generates coordinates of the triangle in a rotated domain, and determines coordinates of a bounding box in the rotated domain based on the coordinates of the triangle in the rotated domain. The TSE determines a first plurality of parallel scanlines in the rotated domain, and a second plurality of parallel scanlines in the rotated domain. The first and second pluralities of scanlines are perpendicular to each other. The TSE determines whether the bounding box coordinates are located within two adjacent scanlines. If the bounding box coordinates are located within the two adjacent scanlines, the TSE removes the triangle from the scene.
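The culling test described above might look like the following Python sketch. Several details are assumptions not given in the abstract: scanlines are taken to lie at integer multiples of a uniform `spacing` along both rotated axes, "within" is read as strictly between two adjacent scanlines, and a triangle is culled if its bounding box fits between adjacent scanlines of either set. All function names are hypothetical.

```python
import math


def rotate(point, angle):
    """Rotate a 2D point about the origin by angle (radians)."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)


def bounding_box(points):
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def between_adjacent_scanlines(lo, hi, spacing=1.0):
    """True if [lo, hi] lies strictly between scanlines k and k + 1."""
    k = math.floor(lo / spacing)
    return k * spacing < lo and hi < (k + 1) * spacing


def should_cull(triangle, angle, spacing=1.0):
    """Decide whether a triangle falls between scanlines in the rotated domain."""
    rotated = [rotate(p, angle) for p in triangle]
    min_x, min_y, max_x, max_y = bounding_box(rotated)
    # Assumption: fitting between adjacent scanlines of either perpendicular
    # set is enough to remove the triangle.
    return (between_adjacent_scanlines(min_x, max_x, spacing)
            or between_adjacent_scanlines(min_y, max_y, spacing))
```

Under these assumptions, a thin sliver that fits between two horizontal scanlines is removed even if it crosses several vertical ones, because it cannot cover a scanline intersection.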
Abstract:
A graphics processing unit (GPU) may determine a workload of a fragment shader program that executes on the GPU. The GPU may compare the workload of the fragment shader program to a threshold. In response to determining that the workload of the fragment shader program is lower than the threshold, the fragment shader program may process one or more fragments without the GPU performing early depth testing of the one or more fragments before the processing by the fragment shader program. The GPU may perform, after processing by the fragment shader program, late depth testing of the one or more fragments to result in one or more non-occluded fragments. The GPU may write pixel values for the one or more non-occluded fragments into a frame buffer.
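The workload-based choice between early and late depth testing can be illustrated with a small software model. This is a sketch under stated assumptions: the `Fragment` type, the `render` function, the less-than depth comparison, and the fallback to early depth testing when the workload is at or above the threshold are all hypothetical and are not specified in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int
    y: int
    depth: float
    color: str


def depth_test(frag, depth_buffer):
    """Pass if the fragment is nearer than the stored depth (assumed 'less' test)."""
    key = (frag.x, frag.y)
    if frag.depth < depth_buffer.get(key, float("inf")):
        depth_buffer[key] = frag.depth
        return True
    return False


def render(fragments, workload, threshold, shader):
    """Return (frame_buffer, shader_invocations) for the given fragments."""
    depth_buffer, frame_buffer = {}, {}
    invocations = 0
    # Light shaders skip early depth testing; the early-test branch for
    # heavier shaders is an assumption added for contrast.
    use_early = workload >= threshold
    for frag in fragments:
        if use_early and not depth_test(frag, depth_buffer):
            continue  # occluded fragment rejected before shading
        value = shader(frag)
        invocations += 1
        if not use_early and not depth_test(frag, depth_buffer):
            continue  # late depth test rejects the shaded fragment
        frame_buffer[(frag.x, frag.y)] = value
    return frame_buffer, invocations
```

In this model both paths write the same pixel values; they differ only in how many times the shader runs, which is the trade-off the threshold comparison is meant to balance.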
Abstract:
This disclosure describes an adaptive memory address scanning technique that defines an address scanning pattern, to be used for a particular surface, based on one or more properties of the surface. In addition, a number, shape, and arrangement of sub-primitives of a surface to process in parallel may be determined. In one example of the disclosure, a memory accessing method for graphics processing comprises determining, by a graphics processing unit (GPU), properties of a surface; determining, by the GPU, a memory address scanning technique based on the determined properties of the surface; and performing, by the GPU, at least one of a read or a write of data associated with the surface in a memory based on the determined memory address scanning technique.
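As a purely illustrative sketch of choosing a scan pattern from surface properties, the Python below picks between two orders using only the surface's width and height. The selection rule, the two pattern names, the row-major memory layout, and the function names are all assumptions made for the example; the abstract does not say which properties are used or which patterns are available.

```python
def choose_scan_pattern(width: int, height: int) -> str:
    """Hypothetical rule: scan along the longer dimension of the surface."""
    return "row_major" if width >= height else "column_major"


def scan_addresses(width: int, height: int, base: int = 0, bytes_per_pixel: int = 4):
    """List byte addresses in the chosen scan order.

    Assumes the surface is stored linearly, one row after another.
    """
    pitch = width * bytes_per_pixel
    if choose_scan_pattern(width, height) == "row_major":
        coords = [(x, y) for y in range(height) for x in range(width)]
    else:
        coords = [(x, y) for x in range(width) for y in range(height)]
    return [base + y * pitch + x * bytes_per_pixel for x, y in coords]
```

A read or write pass would then visit the surface in the returned order; a real implementation would likely weigh further properties such as pixel format or tiling.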
Abstract:
A method of transferring data between two caches comprises sending a first message from a first processor to a second processor indicating that data is available for transfer from a first cache associated with the first processor, requesting, from the second processor, a data transfer of the data from the first cache to a second cache associated with the second processor, transferring the data from the first cache to the second cache in response to the request, and sending a second message from the second processor to the first processor indicating that the data transfer is complete.
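The four-step handshake above can be modeled with a short Python sketch. The `Processor` class, the message names, and the choice to copy the data rather than remove it from the first cache are hypothetical; the abstract specifies only the order and direction of the messages.

```python
class Processor:
    """Toy processor with an associated cache, modeled as a dict."""

    def __init__(self, name):
        self.name = name
        self.cache = {}


def transfer(first, second, key):
    """Run the handshake and return the message log as (sender, receiver, message)."""
    log = []
    # 1. The first processor announces that data is available in its cache.
    log.append((first.name, second.name, "DATA_AVAILABLE"))
    # 2. The second processor requests the transfer.
    log.append((second.name, first.name, "REQUEST_TRANSFER"))
    # 3. The data moves from the first cache to the second cache.
    #    Assumption: the first cache keeps its copy.
    second.cache[key] = first.cache[key]
    # 4. The second processor reports that the transfer is complete.
    log.append((second.name, first.name, "TRANSFER_COMPLETE"))
    return log
```

For instance, after `a.cache["line0"] = b"payload"` and `transfer(a, b, "line0")`, processor `b` holds the same bytes under `"line0"` and the log contains the three messages in order.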