Abstract:
Systems, apparatuses, and methods for implementing efficient queues and other data structures. A queue may be shared among multiple processors and/or threads without using explicit software atomic instructions to coordinate access to the queue. System software may allocate an atomic queue and corresponding queue metadata in system memory and return, to the requesting thread, a handle referencing the queue metadata. Any number of threads may utilize the handle for accessing the atomic queue. The logic for ensuring the atomicity of accesses to the atomic queue may reside in a management unit in the memory controller coupled to the memory where the atomic queue is allocated.
Abstract:
A memory accessing agent includes a memory access generating circuit and a memory controller. The memory access generating circuit is adapted to generate multiple memory accesses in a first ordered arrangement. The memory controller is coupled to the memory access generating circuit and has an output port, for providing the multiple memory accesses to the output port in a second ordered arrangement based on the memory accesses and characteristics of an external memory. The memory controller determines the second ordered arrangement by calculating an efficient row burst value and interrupting multiple row-hit requests to schedule a row-miss request based on the efficient row burst value.
Abstract:
The described embodiments include a computing device with one or more entities (processor cores, processors, etc.). In some embodiments, during operation, a thermal power management unit in the computing device uses a linear prediction to compute a predicted duration of a next idle period for an entity based on the duration of one or more previous idle periods for the entity. Based on the predicted duration of the next idle period, the thermal power management unit configures the entity to operate in a corresponding idle state.
Abstract:
A die-stacked memory device incorporates a reconfigurable logic device to provide implementation flexibility in performing various data manipulation operations and other memory operations that use data stored in the die-stacked memory device or that result in data that is to be stored in the die-stacked memory device. One or more configuration files representing corresponding logic configurations for the reconfigurable logic device can be stored in a configuration store at the die-stacked memory device, and a configuration controller can program a reconfigurable logic fabric of the reconfigurable logic device using a selected one of the configuration files. Due to the integration of the logic dies and the memory dies, the reconfigurable logic device can perform various data manipulation operations with higher bandwidth and lower latency and power consumption compared to devices external to the die-stacked memory device.
Abstract:
A system and method are disclosed for managing memory interleaving patterns in a system with multiple memory devices. The system includes a processor configured to access multiple memory devices. The method includes receiving a first plurality of data blocks, and then storing the first plurality of data blocks using an interleaving pattern in which successive blocks of the first plurality of data blocks are stored in each of the memory devices. The method also includes receiving a second plurality of data blocks, and then storing successive blocks of the second plurality of data blocks in a first memory device of the multiple memory devices.
Abstract:
A system includes an atomic processing engine (APE) coupled to an interconnect. The interconnect is to couple to one or more processor cores. The APE receives a plurality of commands from the one or more processor cores through the interconnect. In response to a first command, the APE performs a first plurality of operations associated with the first command. The first plurality of operations references multiple memory locations, at least one of which is shared between two or more threads executed by the one or more processor cores.
Abstract:
Power gating decisions can be made based on measures of cache dirtiness. Analyzer logic can selectively power gate a component of a processor system based on a cache dirtiness of one or more caches associated with the component. The analyzer logic may power gate the component when the cache dirtiness exceeds a threshold and may maintains the component in an idle state when the cache dirtiness does not exceed the threshold. Idle time prediction logic may be used to predict a duration of an idle time of the component. The analyzer logic may then selectively power gates the component based on the cache dirtiness and the predicted idle time.
Abstract:
The described embodiments include a computing device that comprises at least one memory die having memory circuits and memory die processing circuits, and a logic die coupled to the at least one memory die, the logic die having logic die processing circuits. In the described embodiments, the memory die processing circuits are configured to perform memory die processing operations on data retrieved from or destined for the memory circuits and the logic die processing circuits are configured to perform logic die processing operations on data retrieved from or destined for the memory circuits.
Abstract:
The described embodiments include a computing device with one or more entities (processor cores, processors, etc.). In some embodiments, during operation, a thermal power management unit in the computing device uses a linear prediction to compute a predicted duration of a next idle period for an entity based on the duration of one or more previous idle periods for the entity. Based on the predicted duration of the next idle period, the thermal power management unit configures the entity to operate in a corresponding idle state.
Abstract:
Modifying memory page attributes using a programmable page attribute register of a core executing a process is described. In accordance with the described techniques, a host includes a core that is configured to generate a modified page table attribute for a page in system memory. The modified page table attribute represents at least one demoted permission for a page as specified by a system page table. The core is configured to maintain the modified page table attribute locally in the programmable page attribute register and execute at least one operation allocated to the page according to the modified page table attribute.