Abstract:
A system, method, and computer program product are provided for a memory device system. One or more memory dies and at least one logic die are disposed in a package and communicatively coupled. The logic die comprises a processing device configurable to manage virtual memory and operate in an operating mode. The operating mode is selected from a set of operating modes comprising a slave operating mode and a host operating mode.
Abstract:
A system and method are disclosed for managing memory interleaving patterns in a system with multiple memory devices. The system includes a processor configured to access multiple memory devices. The method includes receiving a first plurality of data blocks, and then storing the first plurality of data blocks using an interleaving pattern in which successive blocks of the first plurality of data blocks are stored in each of the memory devices. The method also includes receiving a second plurality of data blocks, and then storing successive blocks of the second plurality of data blocks in a first memory device of the multiple memory devices.
Abstract:
A system and method are disclosed for managing memory interleaving patterns in a system with multiple memory devices. The system includes a processor configured to access multiple memory devices. The method includes receiving a first plurality of data blocks, and then storing the first plurality of data blocks using an interleaving pattern in which successive blocks of the first plurality of data blocks are stored in each of the memory devices. The method also includes receiving a second plurality of data blocks, and then storing successive blocks of the second plurality of data blocks in a first memory device of the multiple memory devices.
Abstract:
A die-stacked memory device implements an integrated coherency manager to offload cache coherency protocol operations for the devices of a processing system. The die-stacked memory device includes a set of one or more stacked memory dies and a set of one or more logic dies. The one or more logic dies implement hardware logic providing a memory interface and the coherency manager. The memory interface operates to perform memory accesses in response to memory access requests from the coherency manager and the one or more external devices. The coherency manager comprises logic to perform coherency operations for shared data stored at the stacked memory dies. Due to the integration of the logic dies and the memory dies, the coherency manager can access shared data stored in the memory dies and perform related coherency operations with higher bandwidth and lower latency and power consumption compared to the external devices.
Abstract:
A system, method, and computer program product are provided for a memory device system. One or more memory dies and at least one logic die are disposed in a package and communicatively coupled. The logic die comprises a processing device configurable to manage virtual memory and operate in an operating mode. The operating mode is selected from a set of operating modes comprising a slave operating mode and a host operating mode.
Abstract:
The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism performs a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the resource is not available for performing the operation and until another resource is selected for performing the operation, the selection mechanism identifies a next resource in the table and selects the next resource for performing the operation when the next resource is available for performing the operation.
Abstract:
A system and method for efficiently powering down banks in a cache memory for reducing power consumption. A computing system includes a cache array and a corresponding cache controller. The cache array includes multiple banks, each comprising multiple cache sets. In response to a request to power down a first bank of the multiple banks in the cache array, the cache controller selects a cache line of a given type in the first bank and determines whether a respective locality of reference for the selected cache line exceeds a threshold. If the threshold is exceeded, then the selected cache line is migrated to a second bank in the cache array. If the threshold is not exceeded, then the selected cache line is written back to lower-level memory.
Abstract:
The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism is configured to perform a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the identified resource is not available for performing the operation and until a resource is selected for performing the operation, the selection mechanism is configured to identify a next resource in the table and select the next resource for performing the operation when the next resource is available for performing the operation.
Abstract:
A die-stacked memory device implements an integrated QoS manager to provide centralized QoS functionality in furtherance of one or more specified QoS objectives for the sharing of the memory resources by other components of the processing system. The die-stacked memory device includes a set of one or more stacked memory dies and one or more logic dies. The logic dies implement hardware logic for a memory controller and the QoS manager. The memory controller is coupleable to one or more devices external to the set of one or more stacked memory dies and operates to service memory access requests from the one or more external devices. The QoS manager comprises logic to perform operations in furtherance of one or more QoS objectives, which may be specified by a user, by an operating system, hypervisor, job management software, or other application being executed, or specified via hardcoded logic or firmware.
Abstract:
Methods and apparatuses are provided for avoiding cold translation lookaside buffer (TLB) misses in a computer system. A typical system is configured as a heterogeneous computing system having at least one central processing unit (CPU) and one or more graphic processing units (GPUs) that share a common memory address space. Each processing unit (CPU and GPU) has an independent TLB. When offloading a task from a particular CPU to a particular GPU, translation information is sent along with the task assignment. The translation information allows the GPU to load the address translation data into the TLB associated with the one or more GPUs prior to executing the task. Preloading the TLB of the GPUs reduces or avoids cold TLB misses that could otherwise occur without the benefits offered by the present disclosure.