Abstract:
Position-based rendering apparatus and method for multi-die/GPU graphics processing. For example, one embodiment of a method comprises: distributing a plurality of graphics draws to a plurality of graphics processors; performing position-only shading using vertex data associated with tiles of a first draw on a first graphics processor, the first graphics processor responsively generating visibility data for each of the tiles; distributing subsets of the visibility data associated with different subsets of the tiles to different graphics processors; limiting geometry work to be performed on each tile by each graphics processor using the visibility data, each graphics processor to responsively generate rendered tiles; and wherein the rendered tiles are combined to generate a complete image frame.
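The flow described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the round-robin tile assignment, and the use of boolean flags in place of real position-only shading are all assumptions made for illustration.

```python
from typing import Dict, List


def position_only_pass(tiles: Dict[int, List[bool]]) -> Dict[int, List[int]]:
    """Produce per-tile visibility data on the "first" graphics processor.

    Each boolean flag is a stand-in (assumption) for the outcome of
    position-only shading of one triangle in that tile.
    """
    return {tile: [i for i, visible in enumerate(flags) if visible]
            for tile, flags in tiles.items()}


def distribute(visibility: Dict[int, List[int]],
               num_gpus: int) -> List[Dict[int, List[int]]]:
    """Hand subsets of the visibility data to different processors.

    Round-robin by tile id is an assumed policy, chosen for simplicity.
    """
    shards: List[Dict[int, List[int]]] = [dict() for _ in range(num_gpus)]
    for n, tile in enumerate(sorted(visibility)):
        shards[n % num_gpus][tile] = visibility[tile]
    return shards


def render(shard: Dict[int, List[int]]) -> Dict[int, str]:
    """Render only the triangles listed as visible for each tile."""
    return {tile: f"tile{tile}:{len(tris)}tris" for tile, tris in shard.items()}


def render_frame(tiles: Dict[int, List[bool]], num_gpus: int) -> Dict[int, str]:
    """Combine the rendered tiles from all processors into one frame."""
    visibility = position_only_pass(tiles)
    frame: Dict[int, str] = {}
    for shard in distribute(visibility, num_gpus):
        frame.update(render(shard))
    return frame
```

In this sketch the geometry work per tile shrinks to the visible triangles only, which is the limiting step the abstract attributes to the visibility data.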
Abstract:
Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, and assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
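The partition-and-assign step can be sketched as follows. The abstract does not say how suitability is determined, so the `width` field and the rule "one lane means scalar" are hypothetical, as are all names in the block.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Op:
    name: str
    width: int  # number of data lanes; a hypothetical suitability criterion


def partition(ops: List[Op]) -> Tuple[List[Op], List[Op]]:
    """Split operations into a scalar-suitable and a vector-suitable subset."""
    scalar = [op for op in ops if op.width == 1]
    vector = [op for op in ops if op.width > 1]
    return scalar, vector


def run_scalar(ops: List[Op]) -> List[str]:
    """Stand-in for execution on the scalar processor complex."""
    return [f"scalar:{op.name}" for op in ops]


def run_vector(ops: List[Op]) -> List[str]:
    """Stand-in for execution on the vector processor complex."""
    return [f"vector:{op.name}x{op.width}" for op in ops]


def dispatch(ops: List[Op]) -> Tuple[List[str], List[str]]:
    """Assign each subset to its complex and return both sets of outputs."""
    scalar_ops, vector_ops = partition(ops)
    return run_scalar(scalar_ops), run_vector(vector_ops)
```

For example, `dispatch([Op("branch", 1), Op("fma", 16)])` sends `branch` to the scalar stand-in and `fma` to the vector stand-in, yielding two separate output lists.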
Abstract:
A mechanism for command stream processing is described. A method of embodiments, as described herein, includes fetching cache lines from a memory to fill a command first-in, first-out (FIFO) buffer, wherein the fetched cache lines include an overfetching of data necessary to process a command. A first parser fetches and executes batch commands stored in the command FIFO, and a second parser fetches and executes both the batch commands and non-batch commands stored in the command FIFO.
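The FIFO fill step alone can be sketched as below; the two parsers are not modeled. The 64-byte cache line size, the function name, and the byte-addressed memory model are assumptions, not details from the abstract. The point shown is that fetching whole cache lines brings in more bytes than the command itself needs.

```python
from collections import deque
from typing import Deque

CACHE_LINE = 64  # bytes per cache line; an assumed value


def fill_fifo(fifo: Deque[bytes], memory: bytes, addr: int, cmd_len: int) -> int:
    """Append every cache line that overlaps [addr, addr + cmd_len) to the FIFO.

    Returns the number of overfetched bytes, i.e. bytes fetched beyond
    the cmd_len bytes the command actually occupies.
    """
    first = addr // CACHE_LINE
    last = (addr + cmd_len - 1) // CACHE_LINE
    for line in range(first, last + 1):
        fifo.append(memory[line * CACHE_LINE:(line + 1) * CACHE_LINE])
    return (last - first + 1) * CACHE_LINE - cmd_len
```

An 8-byte command starting at byte 60 straddles two cache lines, so both lines are fetched and 120 of the 128 fetched bytes are overfetch under these assumptions.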
Abstract:
A mechanism is described for facilitating efficient scheduling of graphics workloads at computing devices. A method of embodiments, as described herein, includes receiving a work request for processing a work item at a graphics processor, where the work request is placed by an application. The method may further include allowing the application to directly call into a graphics driver associated with the graphics processor to generate a work queue for the work item, where direct calling allows the application to bypass an intermediary call to the graphics driver and directly submit the work item to the graphics processor, where direct calling further includes notifying the graphics processor of the work item by writing into a memory location monitored by the graphics processor. The method may further include submitting the work item from the work queue to a submit queue of a plurality of submit queues, where one or more tasks associated with the work item are processed at the graphics processor.
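The notification path can be sketched with a counter standing in for the monitored memory location. The class and function names, the `doorbell` counter, and the priority-based choice of submit queue are illustrative assumptions; the abstract specifies none of them.

```python
from collections import deque
from typing import Deque, Dict, List


class Gpu:
    """Toy model of a graphics processor that monitors one memory location."""

    def __init__(self, num_submit_queues: int) -> None:
        self.doorbell = 0   # the monitored memory location (assumed to be a counter)
        self.seen = 0       # how many notifications have been consumed so far
        self.work_queue: Deque[Dict] = deque()
        self.submit_queues: List[Deque[Dict]] = [
            deque() for _ in range(num_submit_queues)
        ]

    def poll(self) -> int:
        """Move newly signalled work items from the work queue to submit queues."""
        moved = 0
        while self.seen < self.doorbell:
            item = self.work_queue.popleft()
            # Selecting a submit queue by priority is an assumed policy.
            index = item["priority"] % len(self.submit_queues)
            self.submit_queues[index].append(item)
            self.seen += 1
            moved += 1
        return moved


def submit_direct(gpu: Gpu, item: Dict) -> None:
    """Application-side path: enqueue the item, then write the monitored location."""
    gpu.work_queue.append(item)
    gpu.doorbell += 1
```

Here the application never passes through an intermediary layer: it enqueues the item and writes the monitored location, and the processor picks the item up on its next poll.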
Abstract:
In an embodiment, a system includes a graphics processing unit (GPU) that includes one or more GPU engines, and a microcontroller. The microcontroller is to assign a respective schedule slot for each of a plurality of virtual machines (VMs). When a particular VM is scheduled to access a first GPU engine, the particular VM has exclusive access to the first GPU engine. Other embodiments are described and claimed.
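The slot assignment can be sketched as a simple time-sliced schedule. The equal-length, repeating slots and every name in the block are assumptions; the abstract states only that each VM gets a schedule slot and has exclusive access to the engine while scheduled.

```python
from typing import Dict, List, Tuple


class EngineScheduler:
    """Toy model of a microcontroller assigning one schedule slot per VM."""

    def __init__(self, vms: List[str], slot_ms: int) -> None:
        if not vms:
            raise ValueError("at least one VM is required")
        # Equal-length back-to-back slots are an assumed policy.
        self.slots: Dict[str, Tuple[int, int]] = {
            vm: (i * slot_ms, (i + 1) * slot_ms) for i, vm in enumerate(vms)
        }
        self.period = slot_ms * len(vms)

    def owner_at(self, t_ms: int) -> str:
        """Return the single VM that owns the engine at time t_ms."""
        t = t_ms % self.period
        for vm, (start, end) in self.slots.items():
            if start <= t < end:
                return vm
        raise AssertionError("slots cover the whole period")

    def may_access(self, vm: str, t_ms: int) -> bool:
        """Exclusive access: only the slot owner may use the engine."""
        return self.owner_at(t_ms) == vm
```

Because the slots do not overlap, at most one VM passes `may_access` for the engine at any instant, which models the exclusivity the abstract describes.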