摘要:
In accordance with some embodiments, a system may detect whether or not a workload currently being worked on by two processors is serialized or concurrent. A workload is serialized or a producer consumer workload when the workload is such that one processor must receive the output from the other processor before it can begin. A workload is concurrent if both processors can work on the workload at the same time. In one embodiment, the nature of memory accesses can be used to determine the workload type. For example, when both processors use a shared virtual memory, the memory accesses can be tracked to detect whether serialized or concurrent workloads are involved.
摘要:
Methods and devices to augment volatile memory in a graphics subsystem with certain types of non-volatile memory are described. In one embodiment, includes storing one or more static or near-static graphics resources in a non-volatile random access memory (NVRAM). The NVRAM is directly accessible by a graphics processor using at least memory store and load commands. The method also includes a graphics processor executing a graphics application. The graphics processor sends a request using a memory load command for an address corresponding to at least one static or near-static graphics resources stored in the NVRAM. The method also includes directly loading the requested graphics resource from the NVRAM into a cache for the graphics processor in response to the memory load command.
摘要:
Embodiments of an invention for dynamic error correction using parity and redundant rows are disclosed. In one embodiment, an apparatus includes a storage structure, parity logic, an error storage space, and an error event generator. The storage structure is to store a plurality of data values. The parity logic is to detect a parity error in a data value stored in the storage structure. The error storage space is to store an indication of a detection of the parity error. The error event generator is to generate an event in response to the indication of the parity error being stored in the error storage space.
摘要:
In accordance with some embodiments, a graphics process frame generation frame rate may be monitored in combination with a utilization or work load metric for the graphics process in order to allocate performance resources to the graphics process and in some cases, between the graphics process and a central processing unit.
摘要:
Position-based rendering apparatus and method for multi-die/GPU graphics processing. For example, one embodiment of a method comprises: distributing a plurality of graphics draws to a plurality of graphics processors; performing position-only shading using vertex data associated with tiles of a first draw on a first graphics processor, the first graphics processor responsively generating visibility data for each of the tiles; distributing subsets of the visibility data associated with different subsets of the tiles to different graphics processors; limiting geometry work to be performed on each tile by each graphics processor using the visibility data, each graphics processor to responsively generate rendered tiles; and wherein the rendered tiles are combined to generate a complete image frame.
摘要:
Systems and methods related to memory paging and memory translation are disclosed. The systems may allow allocation of memory pages with increased diversity in the memory page sizes using page tables dimensioned in a manner that optimizes memory usage by the data structures of the page system.
摘要:
An embodiment of a conditional shader apparatus may include a conditional pixel shader to determine if one or more pixels meet a shader condition, and a pixel regrouper communicatively coupled to the conditional pixel shader to regroup pixels based on whether the one or more pixels are determined to meet the shader condition. Another embodiment of a conditional shader apparatus may include a thread analyzer to determine if a set of threads meet a thread condition, and a conditional kernel loader communicatively coupled to the thread analyzer to load an appropriate kernel from a set of two or more kernels based on whether the set of threads are determined to meet the thread condition. Other embodiments are disclosed and claimed.
摘要:
Transitions to ring 0, each time an application wants to use an adjunct processor, are avoided, saving central processor operating cycles and improving efficiency. Instead, initially each application is registered and setup to use adjunct processor resources in ring 3.
摘要:
By scheduling/managing workload submission to a POSH pipe one can exploit parallelism with minimum impact to the software scheduler in some embodiments.
摘要:
A method, device, and system to distribute code and data stores between volatile and non-volatile memory are described. In one embodiment, the method includes storing one or more static code segments of a software application in a phase change memory with switch (PCMS) device, storing one or more static data segments of the software application in the PCMS device, and storing one or more volatile data segments of the software application in a volatile memory device. The method then allocates an address mapping table with at least a first address pointer to point to each of the one or more static code segments, at least a second address pointer to point to each of the one or more static data segments, and at least a third address pointer to point to each of the one or more volatile data segments.