摘要:
In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
摘要:
In an example, an apparatus comprises a plurality of execution units, and a first general register file (GRF) communicatively couple to the plurality of execution units, wherein the first GRF is shared by the plurality of execution units. Other embodiments are also disclosed and claimed.
摘要:
In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively couple to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
摘要:
In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively couple to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
摘要:
An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes logic a plurality of execution units to execute single instruction, multiple data (SIMD) and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.
摘要:
An apparatus and method are described for implementing memory management in a graphics processing system. For example, one embodiment of an apparatus comprises: a first plurality of graphics processing resources to execute graphics commands and process graphics data; a first memory management unit (MMU) to communicatively couple the first plurality of graphics processing resources to a system-level MMU to access a system memory; a second plurality of graphics processing resources to execute graphics commands and process graphics data; a second MMU to communicatively couple the second plurality of graphics processing resources to the first MMU; wherein the first MMU is configured as a master MMU having a direct connection to the system-level MMU and the second MMU comprises a slave MMU configured to send memory transactions to the first MMU, the first MMU either servicing a memory transaction or sending the memory transaction to the system-level MMU on behalf of the second MMU.
摘要:
A processing device comprises an instruction execution unit, a memory agent and pinning logic to pin memory pages in a multi-level memory system upon request by the memory agent. The pinning logic includes an agent interface module to receive, from the memory agent, a pin request indicating a first memory page in the multi-level memory system, the multi-level memory system comprising a near memory and a far memory. The pinning logic further includes a memory interface module to retrieve the first memory page from the far memory and write the first memory page to the near memory. In addition, the pinning logic also includes a descriptor table management module to mark the first memory page as pinned in the near memory, wherein marking the first memory page as pinned comprises setting a pinning bit corresponding to the first memory page in a cache descriptor table and to prevent the first memory page from being evicted from the near memory when the first memory page is marked as pinned.
摘要:
In one embodiment, a method includes: receiving, in a root tile of an accelerator device having a plurality of tiles, a message from a processor, the message comprising a register write request to a register of a first remote tile of the plurality of remote tiles; decoding, in an endpoint controller of the root tile, a system address of the message to identify a destination tile for the message, based at least in part on a base address register decode of the system address; and in response to identifying the first remote tile as the destination tile, updating a first portion of an address offset field of the system address to a predetermined value and directing the message to the first remote tile coupled to the root tile via a sideband interconnect. Other embodiments are described and claimed.
摘要:
A processor may operate at a first frequency level for a first time interval. The processor automatically may transition to a sleep state from the first frequency level after the first time interval. Then the processor automatically transitions from the sleep state to the first frequency level after a second time interval. As a result the processor may operate at a reduced power consumption and higher performance.
摘要:
A processor may operate at a first frequency level for a first time interval. The processor automatically may transition to a sleep state from the first frequency level after the first time interval. Then the processor automatically transitions from the sleep state to the first frequency level after a second time interval. As a result the processor may operate at a reduced power consumption and higher performance.