Abstract:
Systems, apparatuses, and methods include technology that determines, with a neural network, that a first eviction node stored in a cache will be evicted from the cache based on a cache policy. The first eviction node is part of a plurality of nodes associated with a graph. Further, a subset of nodes of the plurality of nodes remains in the cache after the eviction of the first eviction node from the cache. The technology further tracks a number of cache hits on the cache during an aggregation operation associated with a hardware accelerator, where the aggregation operation is executed on the subset of nodes that remain in the cache after the eviction of the first eviction node from the cache. The technology executes a training process on the neural network to adjust the cache policy based on the number of the cache hits.
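The eviction-and-feedback loop described above can be illustrated with a minimal sketch. The abstract does not specify the network, the policy, or the aggregation, so everything here is an assumption: `ScoredCache`, `aggregate`, and the `score` callable (a stand-in for the neural network's output) are hypothetical names, and the aggregation is reduced to a plain sum over node values.

```python
from typing import Callable, Dict, Hashable, Iterable


class ScoredCache:
    """Fixed-capacity cache that evicts the node with the lowest policy score."""

    def __init__(self, capacity: int, score: Callable[[Hashable], float]):
        self.capacity = capacity
        self.score = score  # hypothetical stand-in for the neural network
        self.store: Dict[Hashable, float] = {}
        self.hits = 0
        self.misses = 0

    def access(self, node: Hashable, load=lambda n: float(n)) -> float:
        """Return the node's value, counting a hit or loading it on a miss."""
        if node in self.store:
            self.hits += 1
            return self.store[node]
        self.misses += 1
        if len(self.store) >= self.capacity:
            # The policy picks the eviction node: lowest score goes first.
            victim = min(self.store, key=self.score)
            del self.store[victim]
        self.store[node] = load(node)
        return self.store[node]


def aggregate(cache: ScoredCache, neighbors: Iterable[Hashable]):
    """Sum neighbor values through the cache; return (sum, hits in this call)."""
    before = cache.hits
    total = sum(cache.access(n) for n in neighbors)
    return total, cache.hits - before


# Usage: the hit count is the feedback a training step could consume.
cache = ScoredCache(capacity=2, score=lambda n: n)  # assumed policy: keep higher ids
for node in (1, 2, 3):          # node 1 is evicted when node 3 arrives
    cache.access(node)
total, hits = aggregate(cache, [2, 3, 1])
```

In this sketch the returned `hits` value plays the role of the tracked cache-hit count; how that count adjusts the network's weights is left out because the abstract does not describe it.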
Abstract:
Techniques to output a media stream, capture a media stream, or synchronize the output or capture of the media stream at a specified time are described. A media stream output or capture apparatus may include a media processor to receive a media stream to output or a request to capture a media stream and a start time. A buffer generator may be included to generate an input or an output buffer and a media mixer may be included to mix the media stream into the output buffer at the start time or capture the media stream from the input buffer at the start time.
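As a rough illustration of mixing a stream into an output buffer at a specified start time, the sketch below converts the start time into a sample offset and adds the stream's samples from that position. The function name `mix_at`, the parameter names, and the use of seconds and a fixed sample rate are assumptions; the abstract does not define the buffer layout or clock.

```python
def mix_at(buffer, buffer_start, sample_rate, stream, start_time):
    """Add `stream` samples into `buffer`, beginning at `start_time` (seconds).

    `buffer_start` is the time (seconds) of buffer[0]; samples that would
    fall outside the buffer are skipped.
    """
    offset = round((start_time - buffer_start) * sample_rate)
    for i, sample in enumerate(stream):
        j = offset + i
        if 0 <= j < len(buffer):
            buffer[j] += sample  # additive mix with whatever is already there
    return buffer


# Usage: an 8-sample buffer starting at t = 10.0 s, 4 samples per second.
out = mix_at([0.0] * 8, 10.0, 4, [1.0, 1.0], 10.5)
```

With these values the stream lands at sample offset 2, so only `out[2]` and `out[3]` are modified.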
Abstract:
Systems, apparatus and methods are described including operations for a dual mode GMM (Gaussian Mixture Model) scoring accelerator for both speech and video data.
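For context, GMM scoring generally means evaluating the log-likelihood of a feature vector under a mixture of Gaussians. The sketch below shows the standard diagonal-covariance form in software; it is an assumption that the accelerator computes something of this shape, since the abstract gives no formula, and the function name `gmm_log_score` is hypothetical.

```python
import math


def gmm_log_score(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    component_scores = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            # Log of a one-dimensional Gaussian density, summed per dimension.
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        component_scores.append(ll)
    # Log-sum-exp over components, shifted by the maximum for stability.
    top = max(component_scores)
    return top + math.log(sum(math.exp(c - top) for c in component_scores))


# Usage: one unit-variance component centered at the observation.
score = gmm_log_score([0.0], [1.0], [[0.0]], [[1.0]])
```

The same routine applies to any feature vector, whether derived from speech or video frames; only the dimensionality and model parameters change.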
Abstract:
A processor includes at least one core, a power control unit, and a first interconnect to couple with a peripheral controller. The first interconnect is to provide a first uni-directional communication path for communication of first power management data from the processor to the peripheral controller. Other embodiments are described and claimed.
Abstract:
A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe), is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering is provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and the setting thereof are included to allow for more efficient power management. In addition, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space, is provided for to improve bandwidth and latency for memory accesses.
Abstract:
Embodiments of the invention are generally directed to systems, methods, and apparatuses for linear to physical address translation with support for page attributes. In some embodiments, a system receives an instruction to translate a memory pointer to a physical memory address for a memory location. The system may return the physical memory address and one or more page attributes. Other embodiments are described and claimed.
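The idea of returning both a physical address and page attributes from a single translation can be sketched as follows. This is a software model under stated assumptions, not the claimed instruction: it assumes 4 KiB pages and a single-level table held in a dictionary, and the names `translate`, `PAGE_SHIFT`, and the attribute keys are hypothetical.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(page_table, linear_addr):
    """Return (physical_addr, attributes) for a linear address.

    `page_table` maps a virtual page number to (frame_number, attributes).
    Raises LookupError when the page is not mapped.
    """
    vpn = linear_addr >> PAGE_SHIFT
    entry = page_table.get(vpn)
    if entry is None:
        raise LookupError(f"no mapping for linear address {linear_addr:#x}")
    frame, attrs = entry
    physical = (frame << PAGE_SHIFT) | (linear_addr & PAGE_MASK)
    return physical, attrs


# Usage: virtual page 0x1 maps to physical frame 0x80.
table = {0x1: (0x80, {"present": True, "writable": True})}
phys, attrs = translate(table, 0x1234)
```

Here the page offset `0x234` is carried over unchanged, and the caller receives the attributes alongside the address rather than through a separate lookup.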
Abstract:
Methods and apparatus to implement multiple inference compute engines are disclosed herein. A disclosed example apparatus includes a first inference compute engine, a second inference compute engine, and an accelerator on coherent fabric to couple the first inference compute engine and the second inference compute engine to a converged coherency fabric of a system-on-chip, the accelerator on coherent fabric to arbitrate requests from the first inference compute engine and the second inference compute engine to utilize a single in-die interconnect port.
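Arbitrating requests from two engines onto a single port can be modeled in a few lines. The abstract does not state the arbitration policy, so round-robin is an assumption chosen for illustration, and `RoundRobinArbiter` with its methods is a hypothetical name rather than anything taken from the disclosure.

```python
from collections import deque


class RoundRobinArbiter:
    """Grants a single shared port to pending requests, one source at a time."""

    def __init__(self, n_sources):
        self.queues = [deque() for _ in range(n_sources)]
        self.next_src = 0  # source that gets first consideration next time

    def request(self, src, payload):
        """Queue a request from source index `src`."""
        self.queues[src].append(payload)

    def grant(self):
        """Return (src, payload) for the next granted request, or None if idle."""
        n = len(self.queues)
        for k in range(n):
            src = (self.next_src + k) % n
            if self.queues[src]:
                self.next_src = (src + 1) % n
                return src, self.queues[src].popleft()
        return None


# Usage: engine 0 issues two requests, engine 1 issues one.
arb = RoundRobinArbiter(2)
arb.request(0, "a0")
arb.request(0, "a1")
arb.request(1, "b0")
grants = [arb.grant() for _ in range(4)]
```

Under this policy neither engine can starve the other: engine 1's request is granted between engine 0's two requests, and the final call returns `None` because nothing is pending.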