Abstract:
A die-stacked hybrid memory device implements a first set of one or more memory dies implementing first memory cell circuitry of a first memory architecture type and a second set of one or more memory dies implementing second memory cell circuitry of a second memory architecture type different than the first memory architecture type. The die-stacked hybrid memory device further includes a set of one or more logic dies electrically coupled to the first and second sets of one or more memory dies, the set of one or more logic dies comprising a memory interface and a page migration manager, the memory interface coupleable to a device external to the die-stacked hybrid memory device, and the page migration manager to transfer memory pages between the first set of one or more memory dies and the second set of one or more memory dies.
Abstract:
The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism performs a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the resource is not available for performing the operation and until another resource is selected for performing the operation, the selection mechanism identifies a next resource in the table and selects the next resource for performing the operation when the next resource is available for performing the operation.
Abstract:
A system, method, and memory device embodying some aspects of the present invention for remapping external memory addresses and internal memory locations in stacked memory are provided. The stacked memory includes one or more memory layers configured to store data. The stacked memory also includes a logic layer connected to the memory layer. The logic layer has an Input/Output (I/O) port configured to receive read and write commands from external devices, a memory map configured to maintain an association between external memory addresses and internal memory locations, and a controller coupled to the I/O port, memory map, and memory layers, configured to store data received from external devices to internal memory locations.
Abstract:
A method of managing memory includes installing a first cacheline at a first location in a cache memory and receiving a write request. In response to the write request, the first cacheline is modified in accordance with the write request and marked as dirty. Also in response to the write request, a second cacheline is installed that duplicates the first cacheline, as modified in accordance with the write request, at a second location in the cache memory.
Abstract:
The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism is configured to perform a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the identified resource is not available for performing the operation and until a resource is selected for performing the operation, the selection mechanism is configured to identify a next resource in the table and select the next resource for performing the operation when the next resource is available for performing the operation.
Abstract:
A memory accessing agent includes a memory access generating circuit and a memory controller. The memory access generating circuit is adapted to generate multiple memory accesses in a first ordered arrangement. The memory controller is coupled to the memory access generating circuit and has an output port, for providing the multiple memory accesses to the output port in a second ordered arrangement based on the memory accesses and characteristics of an external memory. The memory controller determines the second ordered arrangement by calculating an efficient row burst value and interrupting multiple row-hit requests to schedule a row-miss request based on the efficient row burst value.
Abstract:
Methods, and media, and computer systems are provided. The method includes, the media includes control logic for, and the computer system includes a processor with control logic for overriding an execution mask of SIMD hardware to enable at least one of a plurality of lanes of the SIMD hardware. Overriding the execution mask is responsive to a data parallel computation and a diverged control flow of a workgroup.
Abstract:
A system and method for efficiently limiting storage space for data with particular properties in a cache memory. A computing system includes a cache array and a corresponding cache controller. The cache array includes multiple banks, wherein a first bank is powered down. In response a write request to a second bank for data indicated to be stored in the powered down first bank, the cache controller determines a respective bypass condition for the data. If the bypass condition exceeds a threshold, then the cache controller invalidates any copy of the data stored in the second bank. If the bypass condition does not exceed the threshold, then the cache controller stores the data with a clean state in the second bank. The cache controller writes the data in a lower-level memory for both cases.
Abstract:
A processor discards spill data from a memory hierarchy in response to the final access to the spill data has been performed by a compiled program executing at the processor. In some embodiments, the final access determined based on a special-purpose load instruction configured for this purpose. In some embodiments the determination is made based on the location of a stack pointer indicating that a method of the executing program has returned, so that data of the returned method that remains in the stack frame is no longer to be accessed. Because the spill data is discarded after the final access, it is not transferred through the memory hierarchy.
Abstract:
Methods, media, and computing systems are provided. The method includes, the media are configured for, and the computing system includes a processor with control logic for allocating memory for storing a plurality of local register states for work items to be executed in single instruction multiple data hardware and for repacking wavefronts that include work items associated with a program instruction responsive to a conditional statement. The repacking is configured to create repacked wavefronts that include at least one of a wavefront containing work items that all pass the conditional statement and a wavefront containing work items that all fail the conditional statement.