Abstract:
Methods and systems for pre-fetching address translations in a memory management unit (MMU) of a device are disclosed. In an embodiment, the MMU receives a pre-fetch command from an upstream component of the device, the pre-fetch command including an address of an instruction, pre-fetches a translation of the instruction from a translation table in a memory of the device, and stores the translation of the instruction in a translation cache associated with the MMU.
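The pre-fetch flow described above can be illustrated with a minimal software sketch. It is a model under stated assumptions, not the disclosed hardware: the `Mmu` class, the 4 KiB page size, and the dictionary-based translation table are all hypothetical names and choices introduced for illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


class Mmu:
    """Toy MMU with a translation cache in front of a translation table."""

    def __init__(self, translation_table):
        self.translation_table = translation_table  # virtual page -> physical page
        self.translation_cache = {}
        self.table_reads = 0  # counts accesses to the in-memory table

    def _walk(self, vpn):
        self.table_reads += 1
        return self.translation_table[vpn]

    def prefetch(self, instruction_address):
        """Handle a pre-fetch command: fill the cache, return no data."""
        vpn = instruction_address >> PAGE_SHIFT
        if vpn not in self.translation_cache:
            self.translation_cache[vpn] = self._walk(vpn)

    def translate(self, address):
        """Translate an address, walking the table only on a cache miss."""
        vpn = address >> PAGE_SHIFT
        if vpn not in self.translation_cache:
            self.translation_cache[vpn] = self._walk(vpn)
        offset = address & ((1 << PAGE_SHIFT) - 1)
        return (self.translation_cache[vpn] << PAGE_SHIFT) | offset


mmu = Mmu({0x40: 0x9A})
mmu.prefetch(0x40123)             # upstream component sends the command early
physical = mmu.translate(0x40123)  # later access hits the translation cache
```

In this sketch the later `translate` call does not touch the translation table again, which is the latency benefit the pre-fetch command is meant to provide.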
Abstract:
Methods and systems are disclosed for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network. An aspect includes transmitting, from a DVM initiator to a DVM network, a DVM operation, broadcasting, by the DVM network to a plurality of DVM targets, the DVM operation, and, based on the DVM operation being broadcast to the plurality of DVM targets by the DVM network, performing one or more hardware optimizations comprising: turning on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increasing a frequency of the clock domain, turning on a power domain coupled to the DVM target based on the power domain being turned off, or terminating the DVM operation to the DVM target based on the DVM target being turned off.
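The listed optimizations can be sketched as a small behavioral model. Everything here is an assumption made for illustration: the `DvmTarget` fields, the `wake_powered_off` policy flag that chooses between waking a target and terminating the operation, and the `boost_mhz` value. The abstract describes hardware, not software, and does not specify how the choice among the optimizations is made.

```python
from dataclasses import dataclass


@dataclass
class DvmTarget:
    name: str
    powered: bool = True    # state of the target's power domain
    clocked: bool = True    # state of the target's clock domain
    clock_mhz: int = 100


def broadcast(op, targets, wake_powered_off=False, boost_mhz=200):
    """Broadcast `op` and return the names of targets that received it."""
    delivered = []
    for target in targets:
        if not target.powered:
            if not wake_powered_off:
                continue  # terminate the operation for a powered-off target
            target.powered = True  # otherwise turn the power domain on
        target.clocked = True  # turn on the clock domain if it was gated
        target.clock_mhz = max(target.clock_mhz, boost_mhz)  # raise frequency
        delivered.append(target.name)
    return delivered


targets = [
    DvmTarget("smmu0"),
    DvmTarget("smmu1", clocked=False),
    DvmTarget("smmu2", powered=False, clocked=False),
]
delivered = broadcast("tlb_invalidate", targets)
```

With the default policy, the powered-off target is skipped and stays off, while the clock-gated target is ungated so that it can process the operation.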
Abstract:
Systems and methods relate to performing address translations in a multithreaded memory management unit (MMU). Two or more address translation requests can be received by the multithreaded MMU and processed in parallel to retrieve address translations to addresses of a system memory. If the address translations are present in a translation cache of the multithreaded MMU, the address translations can be received from the translation cache and scheduled for access of the system memory using the translated addresses. If there is a miss in the translation cache, two or more address translation requests can be scheduled in two or more translation table walks in parallel.
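As a rough software analogy for the hit and miss paths above, the following sketch serves hits from a cache and schedules the misses on a pool of worker threads. The page table contents, the `translate_batch` function, and the use of Python threads are assumptions for illustration; threads only stand in for the parallel hardware table walkers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical translation table: virtual page number -> physical page number.
PAGE_TABLE = {0x1: 0xA, 0x2: 0xB, 0x3: 0xC}


def table_walk(vpn):
    """Stand-in for one translation table walk."""
    return PAGE_TABLE[vpn]


def translate_batch(vpns, cache, max_walkers=2):
    """Translate several requests, walking distinct misses concurrently."""
    # Deduplicate while preserving order, then keep only the cache misses.
    misses = [vpn for vpn in dict.fromkeys(vpns) if vpn not in cache]
    with ThreadPoolExecutor(max_workers=max_walkers) as pool:
        for vpn, ppn in zip(misses, pool.map(table_walk, misses)):
            cache[vpn] = ppn  # fill the translation cache
    return [cache[vpn] for vpn in vpns]


cache = {0x1: 0xA}
result = translate_batch([0x1, 0x2, 0x3, 0x2], cache)
```

The first request hits the cache directly; the two distinct misses are handed to separate walkers, and the repeated request for page `0x2` reuses the single walk result.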
Abstract:
A comparand that includes a virtual address is received. Upon determining a match of the comparand to a burst entry tag, a candidate matching translation data unit is selected. The selecting is from a plurality of translation data units associated with the burst entry tag, and is based at least in part on at least one bit of the virtual address. Content of the candidate matching translation data unit is compared to at least a portion of the comparand. Upon a match, a hit is generated.
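One possible reading of this lookup is sketched below. The layout is assumed, not taken from the abstract: each burst entry covers four contiguous 4 KiB pages, the low two bits of the virtual page number select the translation data unit, and the content compared against the comparand is an address space identifier (`asid`). All names are hypothetical.

```python
PAGE_SHIFT = 12       # assumption: 4 KiB pages
UNITS_PER_BURST = 4   # assumption: four translation data units per burst entry


def lookup(burst_entries, asid, vaddr):
    """Return the physical address on a hit, or None on a miss."""
    vpn = vaddr >> PAGE_SHIFT
    # Upper VPN bits form the burst entry tag; low bits pick the candidate unit.
    tag, index = divmod(vpn, UNITS_PER_BURST)
    units = burst_entries.get(tag)
    if units is None:
        return None  # comparand matched no burst entry tag
    unit = units[index]
    # Compare the candidate unit's content with part of the comparand.
    if unit is None or unit["asid"] != asid:
        return None
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return (unit["ppn"] << PAGE_SHIFT) | offset


burst_entries = {
    0x10: [
        {"asid": 7, "ppn": 0x80},
        None,  # no valid translation stored for this page
        {"asid": 7, "ppn": 0x95},
        {"asid": 3, "ppn": 0x96},
    ]
}
hit = lookup(burst_entries, 7, 0x42ABC)
```

Only one unit is compared per lookup, because the virtual address bits narrow the candidates before any content comparison takes place.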
Abstract:
Various embodiments include methods and devices for implementing decompression of compressed high dynamic ratio fields. Various embodiments may include receiving compressed first and second sets of data fields, decompressing the compressed first and second sets of data fields to generate first and second decompressed sets of data fields, receiving a mapping for mapping the first and second decompressed sets of data fields to a set of data units, and aggregating the first and second decompressed sets of data fields using the mapping to generate a compression block comprising the set of data units.
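A minimal sketch of the decompress-and-aggregate steps follows. The mapping format (a shift and bit width per field), the 16-bit data units, and the use of `zlib` as the compressor are assumptions for illustration; the abstract does not name a compression algorithm. Fields are limited to 8 bits here because each field is stored as one byte.

```python
import zlib

# Hypothetical mapping: (shift, width in bits) of each field in a 16-bit unit.
MAPPING = [(8, 8), (0, 8)]


def decompress_block(compressed_sets, mapping):
    """Decompress each field set, then merge the fields into data units."""
    field_sets = [zlib.decompress(data) for data in compressed_sets]
    units = []
    for fields in zip(*field_sets):  # one field from each set per data unit
        unit = 0
        for value, (shift, width) in zip(fields, mapping):
            unit |= (value & ((1 << width) - 1)) << shift
        units.append(unit)
    return units


compressed = [
    zlib.compress(bytes([0x3C, 0x3C, 0x3D])),  # first set: high-order fields
    zlib.compress(bytes([0x01, 0xF7, 0x80])),  # second set: low-order fields
]
block = decompress_block(compressed, MAPPING)
```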
Abstract:
Various embodiments include methods and devices for implementing compression of high dynamic ratio fields. Various embodiments may include receiving a compression block having data units, receiving a mapping for the compression block, wherein the mapping is configured to map bits of each data unit to two or more data fields to generate a first set of data fields and a second set of data fields, compressing the first set of data fields together to generate a compressed first set of data fields, and compressing the second set of data fields together to generate a compressed second set of data fields.
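The split-then-compress idea can be sketched as follows, assuming 16-bit data units, a mapping given as a shift and bit width per field, and `zlib` as a stand-in compressor. None of these specifics come from the abstract. Grouping similar fields together, such as slowly varying high-order bits, is what gives each compressed set a chance at a better ratio.

```python
import zlib

# Hypothetical mapping: (shift, width in bits) of each field in a 16-bit unit.
MAPPING = [(8, 8), (0, 8)]


def compress_block(units, mapping):
    """Split every data unit into fields and compress each field set together."""
    compressed_sets = []
    for shift, width in mapping:
        # Gather the same field from every data unit (widths up to 8 bits).
        fields = bytes((unit >> shift) & ((1 << width) - 1) for unit in units)
        compressed_sets.append(zlib.compress(fields))
    return compressed_sets


block = [0x3C01, 0x3CF7, 0x3D80]
first, second = compress_block(block, MAPPING)
```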
Abstract:
Systems and methods for pre-fetching address translations in a memory management unit (MMU) are disclosed. The MMU detects a triggering condition related to one or more translation caches associated with the MMU, the triggering condition associated with a trigger address, generates a sequence descriptor describing a sequence of address translations to pre-fetch into the one or more translation caches, the sequence of address translations comprising a plurality of address translations corresponding to a plurality of address ranges adjacent to an address range containing the trigger address, and issues an address translation request to the one or more translation caches for each of the plurality of address translations, wherein the one or more translation caches pre-fetch at least one address translation of the plurality of address translations into the one or more translation caches when the at least one address translation is not present in the one or more translation caches.
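The trigger, descriptor, and request steps can be sketched in software as below. This is a model under assumptions: the descriptor holds only a first page and a count, the adjacent ranges are the pages immediately following the trigger page, and the translation cache and table are plain dictionaries. The names `SequenceDescriptor`, `describe`, and `issue` are hypothetical.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumption: each address range is one 4 KiB page


@dataclass(frozen=True)
class SequenceDescriptor:
    first_vpn: int  # first page to pre-fetch
    count: int      # number of adjacent pages in the sequence


def describe(trigger_address, count=3):
    """Build a descriptor covering the pages that follow the trigger page."""
    return SequenceDescriptor((trigger_address >> PAGE_SHIFT) + 1, count)


def issue(descriptor, cache, table):
    """Issue one request per page; fetch only translations not yet cached."""
    fetched = []
    last = descriptor.first_vpn + descriptor.count
    for vpn in range(descriptor.first_vpn, last):
        if vpn not in cache and vpn in table:
            cache[vpn] = table[vpn]
            fetched.append(vpn)
    return fetched


table = {vpn: vpn + 0x100 for vpn in range(0x20, 0x28)}
cache = {0x22: 0x122}  # one adjacent translation is already present
fetched = issue(describe(0x20FFF), cache, table)
```

Here the request for page `0x22` is still issued, but it causes no fetch because that translation is already in the cache.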