-
Publication No.: US20210287325A1
Publication date: 2021-09-16
Application No.: US17182952
Filing date: 2021-02-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Lou Isabelle Kramer, Matthäus G. Chajdas
Abstract: Systems, apparatuses, and methods for implementing a downsampler in a single compute shader pass are disclosed. A central processing unit (CPU) issues a single-pass compute shader kernel to perform downsampling of a texture on a graphics processing unit (GPU). The GPU includes a plurality of compute units for executing thread groups of the kernel. Each thread group fetches a patch of the texture, and each individual thread downsamples four quads of texels to compute mip levels 1 and 2 independently of the other threads. For mip level 3, texel data is written back over one of the local data share (LDS) entries from which the texel data was loaded. This eliminates the need for a barrier between loads and stores for computing mip level 3. The remaining mip levels are computed in a similar fashion by the thread groups of the single-pass kernel.
-
Publication No.: US11915337B2
Publication date: 2024-02-27
Application No.: US17182952
Filing date: 2021-02-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Lou Isabelle Kramer, Matthäus G. Chajdas
CPC classification number: G06T1/20, G06F9/4881, G06T3/40, G06T15/04
Abstract: Systems, apparatuses, and methods for implementing a downsampler in a single compute shader pass are disclosed. A central processing unit (CPU) issues a single-pass compute shader kernel to perform downsampling of a texture on a graphics processing unit (GPU). The GPU includes a plurality of compute units for executing thread groups of the kernel. Each thread group fetches a patch of the texture, and each individual thread downsamples four quads of texels to compute mip levels 1 and 2 independently of the other threads. For mip level 3, texel data is written back over one of the local data share (LDS) entries from which the texel data was loaded. This eliminates the need for a barrier between loads and stores for computing mip level 3. The remaining mip levels are computed in a similar fashion by the thread groups of the single-pass kernel.
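The reduction scheme described in the abstract can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the patented kernel: the names `average_quad`, `downsample_block`, `reduce_lds_in_place`, and `downsample_tile` are hypothetical, a plain box filter (mean of four texels) stands in for whatever reduction the real shader uses, the texture is a square single-channel grid whose side is four times a power of two, and a Python list of lists stands in for the local data share (LDS). Threads are simulated by ordinary loops.

```python
def average_quad(a, b, c, d):
    """Box filter over one 2x2 quad (assumed reduction; the source names none)."""
    return (a + b + c + d) / 4.0


def downsample_block(texture, x0, y0):
    """One simulated thread: reduce a 4x4 texel block (four 2x2 quads).

    Returns the four mip-1 texels and the single mip-2 texel, computed
    without reading anything produced by another thread.
    """
    quad_means = [[0.0, 0.0], [0.0, 0.0]]
    for qy in range(2):
        for qx in range(2):
            x, y = x0 + 2 * qx, y0 + 2 * qy
            quad_means[qy][qx] = average_quad(
                texture[y][x], texture[y][x + 1],
                texture[y + 1][x], texture[y + 1][x + 1])
    block_mean = average_quad(quad_means[0][0], quad_means[0][1],
                              quad_means[1][0], quad_means[1][1])
    return quad_means, block_mean


def reduce_lds_in_place(lds, stride):
    """Compute the next mip level from the simulated LDS.

    Each simulated thread reads four entries spaced `stride` apart and
    writes its result over the first entry it read. The groups are
    disjoint, so no thread overwrites an entry another thread reads
    in the same step.
    """
    n = len(lds)
    level = []
    for y in range(0, n, 2 * stride):
        row = []
        for x in range(0, n, 2 * stride):
            value = average_quad(lds[y][x], lds[y][x + stride],
                                 lds[y + stride][x], lds[y + stride][x + stride])
            lds[y][x] = value  # write back over one of the loaded entries
            row.append(value)
        level.append(row)
    return level


def downsample_tile(texture):
    """Return [mip1, mip2, mip3, ...] for one square tile of the texture."""
    size = len(texture)
    threads = size // 4  # simulated threads per row, one per 4x4 block
    mip1 = [[0.0] * (size // 2) for _ in range(size // 2)]
    lds = [[0.0] * threads for _ in range(threads)]
    for ty in range(threads):
        for tx in range(threads):
            quad_means, block_mean = downsample_block(texture, 4 * tx, 4 * ty)
            for qy in range(2):
                for qx in range(2):
                    mip1[2 * ty + qy][2 * tx + qx] = quad_means[qy][qx]
            lds[ty][tx] = block_mean  # the mip-2 texel is parked in the LDS
    mips = [mip1, [row[:] for row in lds]]
    stride = 1
    while 2 * stride <= threads:
        mips.append(reduce_lds_in_place(lds, stride))
        stride *= 2
    return mips


# Usage: a 16x16 gradient tile yields mip levels of size 8, 4, 2 and 1.
tile = [[float(x + y) for x in range(16)] for y in range(16)]
mips = downsample_tile(tile)
```

In the sketch, the property the abstract relies on is that each group of four LDS entries is read and overwritten by the same simulated thread, which is why the write cannot clobber a value that another thread still needs within that step. How the real kernel schedules threads and synchronizes between successive mip levels is not described in the abstract and is not modeled here.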