Publication No.: US10353591B2
Publication Date: 2019-07-16
Application No.: US15442499
Filing Date: 2017-02-24
Applicant: Advanced Micro Devices, Inc.
Inventors: Michael L. Schmitt, Radhakrishna Giduthuri
Abstract: Improvements in compute shader programs executed on parallel processing hardware are disclosed. An application or other entity defines a sequence of shader programs to execute. Each shader program defines inputs and outputs that, if left unmodified, would be performed as loads from and stores to a general-purpose memory, incurring high latency. A compiler combines the shader programs into groups that can operate in a lower-latency but lower-capacity local data store memory. The boundaries of these combined shader programs are defined by several aspects, including where memory barrier operations are to execute, whether combinations of shader programs can execute using only the local data store and not the global memory (except for initial reads and writes), and other aspects.
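
Illustrative sketch (not part of the patent record): the grouping idea in the abstract can be pictured as a greedy pass over the shader sequence that closes a group whenever the next shader would overflow the local data store or a memory barrier is required. Everything below is an assumption made for illustration: the `Shader` fields, the `group_shaders` helper, the byte sizes, and the rule that a barrier ends a group. The abstract names barriers and local data store capacity as boundary criteria but does not specify the algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shader:
    name: str
    lds_bytes: int               # hypothetical: local data store space its intermediates need
    barrier_after: bool = False  # hypothetical: a memory barrier must execute after this shader


def group_shaders(shaders: List[Shader], lds_capacity: int) -> List[List[str]]:
    """Greedily split a shader sequence into groups that fit in the local data store."""
    groups: List[List[str]] = []
    current: List[str] = []
    used = 0
    for s in shaders:
        # Close the current group if adding this shader would exceed the capacity.
        if current and used + s.lds_bytes > lds_capacity:
            groups.append(current)
            current, used = [], 0
        # A shader larger than the capacity still gets a group of its own.
        current.append(s.name)
        used += s.lds_bytes
        # Assumption: a required memory barrier also ends the group.
        if s.barrier_after:
            groups.append(current)
            current, used = [], 0
    if current:
        groups.append(current)
    return groups
```

In this sketch only the first shader of each group would read from global memory and only the last would write back to it; intermediate results would stay in the local data store.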