Abstract:
A three-dimensional (3-D) processor stack includes a plurality of processor cores implemented in a plurality of layers. A controller is to selectively throttle one or more of a plurality of processor cores in response to detecting a thermal event. The controller selectively throttles the one or more of the plurality of processor cores based on values of thermal couplings between the plurality of layers and based on measures of criticality of threads executing on the plurality of processor cores.
Abstract:
An integrated circuit (IC) package includes a stacked-die memory device. The stacked-die memory device includes a set of one or more stacked memory dies implementing memory cell circuitry. The stacked-die memory device further includes a set of one or more logic dies electrically coupled to the memory cell circuitry. The set of one or more logic dies includes a query controller and a memory controller. The memory controller is coupleable to at least one device external to the stacked-die memory device. The query controller is to perform a query operation on data stored in the memory cell circuitry responsive to a query command received from the external device.
Abstract:
A cache memory receives a request to perform a write operation. The request specifies an address. A first determination is made that the cache memory does not include a cache line corresponding to the address. A second determination is made that the address is between a previous value of a stack pointer and a current value of the stack pointer. A third determination is made that a write history indicator is set to a specified value. The write operation is performed in the cache memory without waiting for a cache fill corresponding to the address to be performed, in response to the first, second, and third determinations.
Abstract:
A system and method for efficiently limiting storage space for data with particular properties in a cache memory. A computing system includes a cache and one or more sources for memory requests. In response to receiving a request to allocate data of a first type, a cache controller allocates the data in the cache responsive to determining a limit of an amount of data of the first type permitted in the cache is not reached. The controller maintains an amount and location information of the data of the first type stored in the cache. Additionally, the cache may be partitioned with each partition designated for storing data of a given type. Allocation of data of the first type is dependent at least upon the availability of a first partition and a limit of an amount of data of the first type in a second partition.
Abstract:
A system and method for efficient predicting and processing of memory access dependencies. A computing system includes control logic that marks a detected load instruction as a first type responsive to predicting the load instruction has high locality and is a candidate for store-to-load (STL) data forwarding. The control logic marks the detected load instruction as a second type responsive to predicting the load instruction has low locality and is not a candidate for STL data forwarding. The control logic processes a load instruction marked as the first type as if the load instruction is dependent on an older store operation. The control logic processes a load instruction marked as the second type as if the load instruction is independent on any older store operation.
Abstract:
In some embodiments, a method of managing cache memory includes identifying a group of cache lines in a cache memory, based on a correlation between the cache lines. The method also includes tracking evictions of cache lines in the group from the cache memory and, in response to a determination that a criterion regarding eviction of cache lines in the group from the cache memory is satisfied, selecting one or more (e.g., all) remaining cache lines in the group for eviction.
Abstract:
An integrated circuit (IC) package includes a stacked-die memory device. The stacked-die memory device includes a set of one or more stacked memory dies implementing memory cell circuitry. The stacked-die memory device further includes a set of one or more logic dies electrically coupled to the memory cell circuitry. The set of one or more logic dies includes a query controller and a memory controller. The memory controller is coupleable to at least one device external to the stacked-die memory device. The query controller is to perform a query operation on data stored in the memory cell circuitry responsive to a query command received from the external device.
Abstract:
A data processing device is provided that includes an array of working memory banks and an associated processing engine. The working memory bank array is configured with at least one independently activatable memory bank. A dirty data counter (DDC) is associated with the independently activatable memory bank and is configured to reflect a count of dirty data migrated from the independently activatable memory bank upon selective deactivation of the independently activatable memory bank. The DDC is configured to selectively decrement the count of dirty data upon the reactivation of the independently activatable memory bank in connection with a transient state. In the transient state, each dirty data access by the processing engine to the reactivated memory bank is also conducted with respect to another memory bank of the array. Upon a condition that dirty data is found in the other memory bank, the count of dirty data is decremented.
Abstract:
An electronic device includes processing circuitry that executes a lookup table (LUT) vector instruction. Executing the lookup table vector instruction causes the processing circuitry to acquire a set of reference values by using each input value from an input vector as an index to acquire a reference value from a reference vector. The processing circuitry then provides the set of reference values for one or more subsequent operations. The processing circuitry can also use the set of reference values for controlling vector elements from among a set of vector elements for which a vector operation is performed.
Abstract:
An exemplary computing device includes a plurality of circuits and/or a plurality of in-situ monitors configured to generate outputs that indicate one or more operating conditions of the circuits. The computing device also includes a system management unit configured to detect a potentially faulty voltage-to-frequency ratio implemented by one of the circuits based at least in part on one or more of the outputs. The system management unit is also configured to modify the potentially faulty voltage-to-frequency ratio based at least in part on one or more of the outputs. Various other devices, systems, and methods are also disclosed.