摘要:
An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type
摘要:
Briefly, in accordance with one or more embodiments, an apparatus comprises a processor to monitor cache utilization of an application during execution of the application for a workload; and a memory to store cache utilization statistics responsive to the monitored cache utilization. The processor is to determine an optimal cache configuration for the application based at least in part on the cache utilization statistics for the workload such that a smallest amount of cache is turned on for subsequent executions of the workload by the application.
摘要:
Embodiments of the present invention provide a memory arbiter for directing chipset and graphics traffic to system memory. Page consistency and priorities are used to optimize memory bandwidth utilization and guarantee latency to isochronous display requests. The arbiter also contains a mechanism to prevent CPU requests from starving lower priority requests. The memory arbiter thus provides a simple, easy to validate architecture that prevents the CPU from unfairly starving low priority agent and takes advantage of grace periods and memory page detection to optimize arbitration switches, thus increasing memory bandwidth utilization.
摘要:
An apparatus to facilitate compute optimization is disclosed. The apparatus includes a memory device including a first integrated circuit (IC) including a plurality of memory channels and a second IC including a plurality of processing units, each coupled to a memory channel in the plurality of memory channels.
摘要:
In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
摘要:
Methods and apparatus relating to a current change mitigation policy for limiting voltage droop in graphics logic are described. In an embodiment, logic inserts one or more bubbles in one or more Execution Unit (EU) logic pipelines or one or more sampler logic pipelines of a processor. The bubbles at least temporarily reduce execution of operations in one or more subsystems of the processor based at least partially on a comparison of a first value and one or more clamping threshold values. The first value is determined based at least partially on a summation of products of one or more event counts and dynamic capacitance weights for one or more subsystems of the processor. Other embodiments are also disclosed and claimed.
摘要:
Embodiments of the present invention provide a memory arbiter for directing chipset and graphics traffic to system memory. Page consistency and priorities are used to optimize memory bandwidth utilization and guarantee latency to isochronous display requests. The arbiter also contains a mechanism to prevent CPU requests from starving lower priority requests. The memory arbiter thus provides a simple, easy to validate architecture that prevents the CPU from unfairly starving low priority agent and takes advantage of grace periods and memory page detection to optimize arbitration switches, thus increasing memory bandwidth utilization.
摘要:
A buffer circuit coupling an input bus having a first portion and a second portion to an output bus. Each of the first portion, the second portion, and the output bus carry data of a predetermined width. The buffer circuit comprises a first plurality of registers, a second plurality of registers, an unload counter, and a multiplexer. The first plurality of registers is coupled to store data from the first portion of the input bus. The second plurality of registers is coupled to store data from the second portion of the input bus and from a data order signal. The unload counter provides an unload count that selects one of the first plurality of registers and a corresponding one of the second plurality of registers. The multiplexer provides either the selected one of the first plurality of registers or the corresponding one of the second plurality of registers to the output bus. The multiplexer is responsive to the data order signal stored in the corresponding one of the second plurality of registers.
摘要:
Embodiments of the present invention provide a memory arbiter for directing chipset and graphics traffic to system memory. Page consistency and priorities are used to optimize memory bandwidth utilization and guarantee latency to isochronous display requests. The arbiter also contains a mechanism to prevent CPU requests from starving lower priority requests. The memory arbiter thus provides a simple, easy to validate architecture that prevents the CPU from unfairly starving low priority agent and takes advantage of grace periods and memory page detection to optimize arbitration switches, thus increasing memory bandwidth utilization.
摘要:
An embodiment of a memory controller that improves main memory bandwidth utilization during graphics translational lookaside buffer fetch cycles is disclosed. The memory controller includes a first request path and a second request path. The memory controller further includes a graphics translational lookaside buffer that includes a cache. The graphics translational lookaside buffer issues an address fetch request to a memory interface when a graphics memory request received from the first request path or from the second request path misses the cache. The memory controller also includes a memory arbiter that includes a first request path cycle tracker and a second request path cycle tracker. The memory arbiter allows a request received from the second request path to be issued to the memory interface when a graphics memory request received from the first request path is stalled due to a graphics translational lookaside buffer cache miss.