Abstract:
In an example, a method of arbitrating memory requests may include tagging a first batch of memory requests with first metadata identifying that the first batch of memory requests originates from a first group of threads. The method may include tagging a second batch of memory requests with second metadata identifying that the second batch of memory requests originates from the first group of threads. The method may include storing the first and second batches of memory requests in a conflict arbitration queue. The method may include performing, using the first metadata and the second metadata, conflict arbitration between only the first batch of memory requests and the second batch of memory requests stored in the conflict arbitration queue, even though the queue may also store at least one other batch of memory requests that originates from a group of threads different from the first group of threads.
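As a rough illustration of the idea only, the sketch below models a conflict arbitration queue in which each batch carries a thread-group tag and conflict checks compare only batches that share the same tag. The types and function names (MemoryRequestBatch, arbitrateSameGroup, and so on) are hypothetical and are not taken from the disclosure.

```cpp
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical batch of memory requests tagged with the ID of the
// group of threads that issued it.
struct MemoryRequestBatch {
    uint32_t threadGroupId;              // metadata identifying the originating group
    std::vector<uint64_t> addresses;     // addresses requested by the batch
};

// Conflict arbitration queue: batches from many thread groups may be stored,
// but conflict checks are performed only between batches that share a tag.
class ConflictArbitrationQueue {
public:
    void store(MemoryRequestBatch batch) { queue_.push_back(std::move(batch)); }

    // Returns true if any two batches tagged with `threadGroupId` request the
    // same address (an address conflict in this toy model). Batches tagged
    // with other thread groups are ignored during this arbitration pass.
    bool arbitrateSameGroup(uint32_t threadGroupId) const {
        std::vector<uint64_t> seen;
        for (const auto& batch : queue_) {
            if (batch.threadGroupId != threadGroupId)
                continue;                            // skip batches from other groups
            for (uint64_t addr : batch.addresses) {
                for (uint64_t prev : seen)
                    if (prev == addr) return true;   // conflict found
                seen.push_back(addr);
            }
        }
        return false;
    }

private:
    std::deque<MemoryRequestBatch> queue_;
};

int main() {
    ConflictArbitrationQueue q;
    q.store({/*threadGroupId=*/0, {0x100, 0x140}});  // first batch, group 0
    q.store({/*threadGroupId=*/0, {0x140, 0x180}});  // second batch, group 0
    q.store({/*threadGroupId=*/7, {0x140}});         // batch from a different group
    // Arbitration considers only the two group-0 batches; the group-7 batch
    // stored in the same queue does not participate.
    return q.arbitrateSameGroup(0) ? 0 : 1;
}
```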
Abstract:
Techniques are described for allowing concurrent execution of multiple different tasks, and preempted, prioritized execution of tasks, on a shader processor. In an example operation, a driver executed by a central processing unit (CPU) configures GPU resources based on the needs of a first "host" shader to allow the first shader to execute "normally" on the GPU. The GPU may observe two sets of tasks: "host" tasks and "guest" tasks. Based on, for example, detecting an availability of resources, the GPU may determine that a "guest" task may be run while the "host" task is running. A second "guest" shader then executes on the GPU by using resources that were configured for the first "host" shader if those resources are available; in some examples, additional resources are obtained through software-programmable means.
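Purely as a conceptual sketch of the host/guest scheduling idea (the names ShaderTask, ResourceBudget, and tryLaunchGuest are hypothetical and not from the disclosure), the example below admits a guest task only when the resources left over from configuring the host task can cover it.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical per-task resource needs on the shader processor.
struct ResourceBudget {
    uint32_t registers;    // general purpose registers required
    uint32_t waveSlots;    // hardware wave/warp slots required
    uint32_t localMemory;  // bytes of on-chip local memory required
};

struct ShaderTask {
    const char*    name;
    ResourceBudget needs;
};

// Resources the driver configured for the "host" shader; whatever the host
// task does not currently occupy can be lent to a "guest" task.
struct ConfiguredResources {
    ResourceBudget total;
    ResourceBudget inUseByHost;

    ResourceBudget available() const {
        return { total.registers   - inUseByHost.registers,
                 total.waveSlots   - inUseByHost.waveSlots,
                 total.localMemory - inUseByHost.localMemory };
    }
};

// Returns the guest task if it fits in the leftover resources, so it can run
// concurrently while the host task keeps running; otherwise returns nothing.
std::optional<ShaderTask> tryLaunchGuest(const ConfiguredResources& res,
                                         const ShaderTask& guest) {
    const ResourceBudget avail = res.available();
    const bool fits = guest.needs.registers   <= avail.registers &&
                      guest.needs.waveSlots   <= avail.waveSlots &&
                      guest.needs.localMemory <= avail.localMemory;
    return fits ? std::optional<ShaderTask>(guest) : std::nullopt;
}

int main() {
    ConfiguredResources res{ {256, 16, 32768}, {192, 12, 24576} };
    ShaderTask guest{ "guest_compute", {32, 2, 4096} };
    return tryLaunchGuest(res, guest) ? 0 : 1;  // 0: guest can run concurrently
}
```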
Abstract:
Techniques are described for copying data only from a subset of memory locations allocated to a set of instructions to free memory locations for higher priority instructions to execute. Data from a dynamic portion of one or more general purpose registers (GPRs) allocated to the set of instructions may be copied and stored to another memory unit while data from a static portion of the one or more GPRs allocated to the set of instructions may not be copied and stored to another memory unit.
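The sketch below is only a loose illustration of the partial-save idea in this abstract: on preemption, only the "dynamic" portion of a GPR allocation is copied out to another memory unit, while the "static" portion stays in place. The structures and the spillDynamicPortion function are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical GPR allocation for a set of instructions, split into a static
// portion (left resident across preemption) and a dynamic portion (saved).
struct GprAllocation {
    std::vector<uint32_t> staticRegs;   // not copied on preemption
    std::vector<uint32_t> dynamicRegs;  // copied to another memory unit
};

// "Other memory unit" that receives only the dynamic portion, freeing those
// registers for the higher priority instructions that caused the preemption.
struct SpillBuffer {
    std::vector<uint32_t> saved;
};

void spillDynamicPortion(const GprAllocation& gprs, SpillBuffer& spill) {
    spill.saved.resize(gprs.dynamicRegs.size());
    std::memcpy(spill.saved.data(), gprs.dynamicRegs.data(),
                gprs.dynamicRegs.size() * sizeof(uint32_t));
    // gprs.staticRegs is intentionally left untouched: less data moves,
    // so preemption costs less than a full register save would.
}

void restoreDynamicPortion(GprAllocation& gprs, const SpillBuffer& spill) {
    std::memcpy(gprs.dynamicRegs.data(), spill.saved.data(),
                spill.saved.size() * sizeof(uint32_t));
}

int main() {
    GprAllocation gprs{ std::vector<uint32_t>(64, 0xA), std::vector<uint32_t>(32, 0xB) };
    SpillBuffer spill;
    spillDynamicPortion(gprs, spill);    // only 32 of the 96 registers are copied
    restoreDynamicPortion(gprs, spill);  // dynamic portion restored after preemption
    return 0;
}
```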
Abstract:
In an example, a method of transferring data may include synchronizing work-items corresponding to a first subgroup and work-items corresponding to a second subgroup with a barrier. The method may include performing an inter-subgroup data transfer between the first subgroup and the second subgroup.
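A minimal CPU-side sketch of the barrier-then-transfer pattern described here, assuming C++20 std::barrier; on a GPU the same shape would typically use a work-group barrier and shared local memory, and every name below is illustrative rather than taken from the disclosure.

```cpp
#include <array>
#include <barrier>
#include <cstdio>
#include <thread>
#include <vector>

constexpr int kSubgroupSize = 4;

// Staging area shared by the two subgroups (stands in for GPU local memory).
std::array<int, kSubgroupSize> staging{};

int main() {
    // One barrier spans the work-items of both subgroups, so a work-item in
    // the second subgroup cannot read before the first subgroup has written.
    std::barrier<> sync(2 * kSubgroupSize);

    std::array<int, kSubgroupSize> received{};
    std::vector<std::thread> workItems;

    // First subgroup: each work-item writes its value into the staging area.
    for (int lane = 0; lane < kSubgroupSize; ++lane) {
        workItems.emplace_back([&, lane] {
            staging[lane] = lane * 10;   // produce data
            sync.arrive_and_wait();      // synchronize both subgroups
        });
    }

    // Second subgroup: each work-item reads the value written by the matching
    // lane of the first subgroup (the inter-subgroup data transfer).
    for (int lane = 0; lane < kSubgroupSize; ++lane) {
        workItems.emplace_back([&, lane] {
            sync.arrive_and_wait();      // wait until the first subgroup has written
            received[lane] = staging[lane];
        });
    }

    for (auto& t : workItems) t.join();
    for (int lane = 0; lane < kSubgroupSize; ++lane)
        std::printf("lane %d received %d\n", lane, received[lane]);
    return 0;
}
```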
Abstract:
This disclosure describes examples of using two vertex shaders, each during a different graphics processing pass, in a binning architecture for graphics processing. A first vertex shader processes a subset of the attributes of a vertex in a binning pass, where the subset of attributes includes those that contribute to visibility determination and attributes that may benefit from being processed with a vertex shader that provides functional flexibility. A second, different vertex shader processes another subset of the attributes of the vertex in the rendering pass.
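To make the attribute split concrete in a very rough way, the sketch below is a host-side C++ mock-up rather than actual shader code, and all structure and function names are hypothetical: one stand-in shader handles only the visibility-related attribute subset in the binning pass, and a second stand-in shader handles the remaining attributes in the rendering pass.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };

// Hypothetical vertex: position contributes to visibility determination,
// the remaining attributes are only needed when the vertex is rendered.
struct Vertex {
    Vec3 position;
    Vec3 normal;
    Vec2 texCoord;
};

// Binning-pass vertex shader stand-in: processes only the attribute subset
// that contributes to visibility (here, a trivial transform of position).
Vec3 binningPassShader(const Vertex& v, const std::array<float, 3>& translate) {
    return { v.position.x + translate[0],
             v.position.y + translate[1],
             v.position.z + translate[2] };
}

// Rendering-pass vertex shader stand-in: processes the other attribute
// subset (normal, texture coordinates) for the vertices being rendered.
struct ShadedVertex { Vec3 position; Vec3 normal; Vec2 texCoord; };

ShadedVertex renderingPassShader(const Vertex& v, const Vec3& binnedPosition) {
    return { binnedPosition, v.normal, v.texCoord };
}

int main() {
    std::vector<Vertex> mesh = { {{0.f, 0.f, 0.f}, {0.f, 0.f, 1.f}, {0.f, 0.f}},
                                 {{1.f, 0.f, 0.f}, {0.f, 0.f, 1.f}, {1.f, 0.f}} };
    std::array<float, 3> translate{0.5f, 0.f, 0.f};

    // Binning pass: only positions are shaded to support visibility decisions.
    std::vector<Vec3> binned;
    for (const auto& v : mesh) binned.push_back(binningPassShader(v, translate));

    // Rendering pass: the second shader handles the remaining attributes.
    std::vector<ShadedVertex> shaded;
    for (std::size_t i = 0; i < mesh.size(); ++i)
        shaded.push_back(renderingPassShader(mesh[i], binned[i]));
    return shaded.size() == mesh.size() ? 0 : 1;
}
```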