Abstract:
Aspects presented herein relate to methods and devices for data or graphics processing including an apparatus, e.g., a graphics processing unit (GPU). The apparatus may obtain an indication of a data write for data associated with data processing. The apparatus may write, based on the indication, the data associated with the data processing to a memory address. The apparatus may receive a read request for the data including the memory address, where the read request is associated with a read with invalidate process. The apparatus may retrieve, based on the read request, the data from at least one of a first cache or at least one second memory, where the retrieval of the data is based on a timing of the indication of the data write. The apparatus may output the retrieved data from at least one of the first cache or the at least one second memory.
Abstract:
Various embodiments include methods for implementing flexible ranks in a memory system. Embodiments may include receiving, at a memory controller, a first memory access command and a first address at which to implement the first memory access command in a logical rank, generating, by the memory controller, a first signal configured to indicate to a first memory device of the logical rank to implement the first memory access command via a first partial channel, sending, from the memory controller, the first signal to the first memory device, generating, by the memory controller, a second signal configured to indicate to a second memory device of the logical rank that is different from the first memory device to implement the first memory access command via a second partial channel, and sending, from the memory controller, the second signal to the second memory device.
Abstract:
Various embodiments include systems and methods for improving the efficiency of a memory subsystem in a computing device. The memory subsystem may be configured to detect memory access events and determining their associated timings and determine an efficiency of the memory subsystem based on operational parameters of the memory subsystem, the detecting memory access events, and associated timings. The memory subsystem may adjust the operational parameters of the memory subsystem based on the determined efficiency of the memory subsystem. The memory subsystem may dynamically modify the operations of the memory subsystem based on the adjusted operational parameters.
Abstract:
Various embodiments include methods and devices for reducing latency in pseudo channel based memory systems. Embodiments may include a first pseudo channel selection device configured to selectively communicatively connect one of a plurality of pseudo channels to a first input/output (IO), and a second pseudo channel selection device configured to selectively communicatively connect one of the plurality of pseudo channels to a second IO, in which the first pseudo channel selection device and the second pseudo channel selection device may be operable to communicatively connect a first pseudo channel of the plurality of pseudo channels to the first IO and to the second IO concurrently. Embodiments may include the pseudo channel based memory system configured to receive a memory access command targeting the first pseudo channel, and use a first pseudo channel data bus and a second pseudo channel data bus to implement the memory access command.
Abstract:
Data caching may include storing data associated with DRAM transaction requests in data storage structures organized in a manner corresponding to the DRAM bank, bank group and rank organization. Data may be selected for transfer to the DRAM by selecting among the data storage structures.
Abstract:
Ephemeral data stored in a cache is read when needed but is not written to system memory so as to save power and bandwidth. In an embodiment, a no-writeback bit associated with the ephemeral data is set in response to a read-no-writeback instruction. Data in a cache line for which its no-writeback bit has been set is not written back into system memory. Accordingly, when evicting cache lines, if a cache line has a no-writeback bit set, then the data in that cache line is discarded without being written back to system memory.
Abstract:
Aspects presented herein relate to methods and devices for data or graphics processing including an apparatus, e.g., a graphics processing unit (GPU). The apparatus may configure an address range in a cache. The apparatus may also obtain a request to access data in the cache, where the request to access the data includes an address in the cache that maps to a set index in a plurality of set indexes, where an address value for the set index corresponds to a portion of the address. Further, the apparatus may select an updated address value for the set index, where the updated address value is associated with an updated address within the address range, and the updated address value corresponds to a portion of the updated address. The apparatus may also allocate the data in the request to the updated address value for the set index.
Abstract:
Various embodiments include methods and devices for reducing latency in pseudo channel based memory systems. Embodiments may include a first pseudo channel selection device configured to selectively communicatively connect one of a plurality of pseudo channels to a first input/output (IO), and a second pseudo channel selection device configured to selectively communicatively connect one of the plurality of pseudo channels to a second IO, in which the first pseudo channel selection device and the second pseudo channel selection device may be operable to communicatively connect a first pseudo channel of the plurality of pseudo channels to the first IO and to the second IO concurrently. Embodiments may include the pseudo channel based memory system configured to receive a memory access command targeting the first pseudo channel, and use a first pseudo channel data bus and a second pseudo channel data bus to implement the memory access command.
Abstract:
Various embodiments include methods and devices for implementing a criterion aware cache replacement policy by a computing device. Embodiments may include updating a staling counter, writing a value of a local counter to a system cache in association with a location in the system cache for with data, in which the value of the local counter includes a value of the staling counter when (i.e., at the time) the associated data is written to the system cache, and using the value of the local counter of the associated data to determine whether the associated data is stale.
Abstract:
Various embodiments include methods and devices for managing optional commands. Some embodiments may include receiving an optional command from an optional command request device, determining whether the optional command can be implemented, and transmitting, to the optional command request device, an optional command no data response in response to determining that the optional command cannot be implemented.