Abstract:
Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.
Abstract:
Methods, devices, and systems for capturing an accuracy of an instruction executing on a processor. An instruction may be executed on the processor, and the accuracy of the instruction may be captured using a hardware counter circuit. The accuracy of the instruction may be captured by analyzing bits of at least one value of the instruction to determine a minimum or maximum precision datatype for representing the field, and determining whether to adjust a value of the hardware counter circuit accordingly. The representation may be output to a debugger or logfile for use by a developer, or may be output to a runtime or virtual machine to automatically adjust instruction precision or gating of portions of the processor datapath.
Abstract:
A plurality of first controllers operate according to a plurality of access protocols to control a plurality of memory modules. A second controller receives access requests that target the plurality of memory modules and selectively provides the access requests and control information to the plurality of first controllers based on physical addresses in the access requests. The second controller generates the control information for the first controllers based on statistical representations of the access requests to the plurality of memory modules.
Abstract:
A cache controller to configure a portion of a first memory as cache for a second memory responsive to an indicator of locality of memory access requests to the second memory. The indicator of locality determines a probability that a location of a memory access request to the second memory is predictable based upon at least one previous memory access request. The cache controller may determine a size of the cache based on a value of the indicator of locality or modify the size of the cache in response to changes in the value of the indicator of locality.
Abstract:
A computing system includes a set of computing resources and a datastore to store information representing a corresponding idle power consumption metric and a corresponding peak power consumption metric for each computing resource of the set. The computing system further includes a controller coupled to the set of computing resources and the datastore. The controller is to configure the set of computing resources to meet a power budget constraint for the set based on the corresponding idle power consumption metric and the corresponding peak power consumption metric for each computing resource of the set.
Abstract:
A processor employs multiple prefetchers at a processor to identify patterns in memory accesses to different memory modules. The memory accesses can include transfers between the memory modules, and the prefetchers can prefetch data directly from one memory module to another based on patterns in the transfers. This allows the processor to efficiently organize data at the memory modules without direct intervention by software or by a processor core, thereby improving processing efficiency.
Abstract:
A cooling system controller for a set of computing resources of a data center includes a first interface to couple to a first flow controller that controls a rate of thermal energy transfer to a PCM store from the set of computing resources, a second interface to couple to a second flow controller that controls a rate of thermal energy transfer from the PCM store to a cooling system, and a controller to determine a current set of operational parameters for the data center and to manipulate the first and second flow controllers and via the first and second interfaces to control a net thermal energy transfer to and from the PCM store based on the current set of parameters.
Abstract:
Embodiments are described for a communications interconnect scheme for 3D stacked memory devices. A ring network design is used for networks of memory chips organized as individual devices with multiple dies or wafers. The design comprises a three-tier ring network where each ring serves a different set of memory blocks. One ring or set of rings interconnects memory within a die (inter-bank), a second ring or set of rings interconnects memory across die in a stack (inter-die), and the third ring or set of rings interconnects memory across stacks or chip packages (inter-stack).
Abstract:
A processing device includes a plurality of components and a system management unit to selectively schedule an application phase to one of the plurality of components based on one or more comparisons of predictions of a plurality of thermal impacts of executing the application phase on each of the plurality of components. The predictions may be generated based on a thermal history associated with the application phase, thermal sensitivities of the plurality of components, or a layout of the plurality of components in the processing device.
Abstract:
A power management controller is used to control power management states of a processing device. A register stores a timer tick value accessible to the power management controller. The timer tick value indicates when an interrupt is to occur in the processing device. The power management controller may use the exposed timer tick value to decide whether to transition between power management states such as an active state, an idle state, and a power-gated state. The timer tick value stored in the register may be modified by an operating system, an application, or software implemented on the processing device.