Abstract:
A multi-die semiconductor package includes a first integrated circuit (IC) die having a first intrinsic performance level and a second IC die having a second intrinsic performance level different from the first intrinsic performance level. A power management controller distributes, based on a determined die performance differential between the first IC die and the second IC die, a level of power allocated to the semiconductor chip package between the first IC die and the second IC die. In this manner, the first IC die receives and operates at a first level of power resulting in performance exceeding its intrinsic performance level. The second IC die receives and operates at a second level of power resulting in performance below its intrinsic performance level, thereby reducing performance differentials between the IC dies.
Abstract:
Some die-stacked memories will contain a logic layer in addition to one or more layers of DRAM (or other memory technology). This logic layer may be a discrete logic die or logic on a silicon interposer associated with a stack of memory dies. Additional circuitry/functionality is placed on the logic layer to implement functionality to perform various computation operations. This functionality would be desired where performing the operations locally near the memory devices would allow increased performance and/or power efficiency by avoiding transmission of data across the interface to the host processor.
Abstract:
Durations of power management states are predicted on a per-process basis. Some embodiments include storing, in one or more data structures associated with one or more processes, information indicating previous durations of a power management state associated with the process(es). Some embodiments also include predicting a subsequent duration of the power management state for the process(es) using information stored in the data structure(s).
Abstract:
A die-stacked memory device incorporates a reconfigurable logic device to provide implementation flexibility in performing various data manipulation operations and other memory operations that use data stored in the die-stacked memory device or that result in data that is to be stored in the die-stacked memory device. One or more configuration files representing corresponding logic configurations for the reconfigurable logic device can be stored in a configuration store at the die-stacked memory device, and a configuration controller can program a reconfigurable logic fabric of the reconfigurable logic device using a selected one of the configuration files. Due to the integration of the logic dies and the memory dies, the reconfigurable logic device can perform various data manipulation operations with higher bandwidth and lower latency and power consumption compared to devices external to the die-stacked memory device.
Abstract:
The disclosed computer-implemented method for interpolating register-based lookup tables can include identifying, within a set of registers, a lookup table that has been encoded for storage within the set of registers. The method can also include receiving a request to look up a value in the lookup table and responding to the request by interpolating, from the encoded lookup table stored in the set of registers, a representation of the requested value. Various other methods, systems, and computer-readable media are also disclosed.
Abstract:
Durations of power management states are predicted on a per-process basis. Some embodiments include storing, in one or more data structures associated with one or more processes, information indicating previous durations of a power management state associated with the process(es). Some embodiments also include predicting a subsequent duration of the power management state for the process(es) using information stored in the data structure(s).
Abstract:
A multi-core data processor includes multiple data processor cores and a circuit. The multiple data processor cores each include a power state controller having a first input for receiving an idle signal, a second input for receiving a release signal, a third input for receiving a control signal, and an output for providing a current power state. In response to the idle signal, the power state controller causes a corresponding data processor core to enter an idle state. In response to the release signal, the power state controller changes the current power state from the idle state to an active state in dependence on the control signal. The circuit is coupled to each of the multiple data processor cores for providing the control signal in response to current power states in the multiple data processor cores.
Abstract:
A die-stacked memory device incorporates a data translation controller at one or more logic dies of the device to provide data translation services for data to be stored at, or retrieved from, the die-stacked memory device. The data translation operations implemented by the data translation controller can include compression/decompression operations, encryption/decryption operations, format translations, wear-leveling translations, data ordering operations, and the like. Due to the tight integration of the logic dies and the memory dies, the data translation controller can perform data translation operations with higher bandwidth and lower latency and power consumption compared to operations performed by devices external to the die-stacked memory device.
Abstract:
In some embodiments, a method for improving reliability in a processor is provided. The method can include replicating input data for first and second lanes of a processor, the first and second lanes being located in a same cluster of the processor and the first and second lanes each generating a respective value associated with an instruction to be executed in the respective lane, and responsive to a determination that the generated values do not match, providing an indication that the generated values do not match.
Abstract:
An accelerator device includes a first processing unit to access a structure of a graph dataset, and a second processing unit coupled with the first processing unit to perform computations based on data values in the graph dataset.