摘要:
Methods and devices to augment volatile memory in a graphics subsystem with certain types of non-volatile memory are described. In one embodiment, includes storing one or more static or near-static graphics resources in a non-volatile random access memory (NVRAM). The NVRAM is directly accessible by a graphics processor using at least memory store and load commands. The method also includes a graphics processor executing a graphics application. The graphics processor sends a request using a memory load command for an address corresponding to at least one static or near-static graphics resources stored in the NVRAM. The method also includes directly loading the requested graphics resource from the NVRAM into a cache for the graphics processor in response to the memory load command.
摘要:
Methods and devices to augment volatile memory in a graphics subsystem with certain types of non-volatile memory are described. In one embodiment, includes storing one or more static or near-static graphics resources in a non-volatile random access memory (NVRAM). The NVRAM is directly accessible by a graphics processor using at least memory store and load commands. The method also includes a graphics processor executing a graphics application. The graphics processor sends a request using a memory load command for an address corresponding to at least one static or near-static graphics resources stored in the NVRAM. The method also includes directly loading the requested graphics resource from the NVRAM into a cache for the graphics processor in response to the memory load command.
摘要:
An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type
摘要:
An apparatus to facilitate compute optimization is disclosed. The apparatus includes a memory device including a first integrated circuit (IC) including a plurality of memory channels and a second IC including a plurality of processing units, each coupled to a memory channel in the plurality of memory channels.
摘要:
Briefly, in accordance with one or more embodiments, an apparatus comprises a processor to monitor cache utilization of an application during execution of the application for a workload; and a memory to store cache utilization statistics responsive to the monitored cache utilization. The processor is to determine an optimal cache configuration for the application based at least in part on the cache utilization statistics for the workload such that a smallest amount of cache is turned on for subsequent executions of the workload by the application.
摘要:
In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
摘要:
Graphics processing systems and methods are described. A graphics processing apparatus may comprise one or more graphics processing cores, a shared buffer accessible to a user mode driver (UMD) associated with an application in an unprivileged domain, the UMD to write one or more commands to the shared buffer, and a controller parse a workload in the shared buffer to identify one or more commands in the workload, the workload added by the application executing in the unprivileged domain, associate a trigger with a command in the workload, transfer the workload to one or more components of the graphics processing apparatus for execution, and upon execution of the command associated with the trigger, sample the shared buffer to identify a new workload added to the shared buffer. The one or more components of the graphics processing apparatus automatically execute the new workload added to the shared buffer.
摘要:
A method, device, and system to distribute code and data stores between volatile and non-volatile memory are described. In one embodiment, the method includes storing one or more static code segments of a software application in a phase change memory with switch (PCMS) device, storing one or more static data segments of the software application in the PCMS device, and storing one or more volatile data segments of the software application in a volatile memory device. The method then allocates an address mapping table with at least a first address pointer to point to each of the one or more static code segments, at least a second address pointer to point to each of the one or more static data segments, and at least a third address pointer to point to each of the one or more volatile data segments.
摘要:
Described herein are technologies related to a ensuring that graphics commands and graphics context are offloading and scheduled for consumption as the commands and graphics context are sent from coherent to non-coherent memory/fabric in a “processor to processor” handoff or transaction.
摘要:
A computing device for performing scheduling operations for graphics hardware is described herein. The computing device includes a central processing unit (CPU) that is configured to execute an application. The computing device also includes a graphics scheduler configured to operate independently of the CPU. The graphics scheduler is configured to receive work queues relating to workloads from the application that are to execute on the CPU and perform scheduling operations for any of a number of graphics engines based on the work queues.