Abstract:
A method and computing system for handling a page fault while executing a cross-platform system call with a shared page cache. A first kernel running in a first computer system receives a request for a faulted page associated with raw data from a second kernel running in a second computer system. In response to the request: (i) a first validity flag in a first copy of the shared page cache is updated to denote that the faulted page is unavailable to the first computer system, and (ii) the faulted page is transmitted to the second kernel for insertion into a second copy of the shared page cache and for updating of a second validity flag in that copy to denote that the faulted page is available to the second computer system.
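A minimal C sketch of the flag handoff described above; all names (cache_copy, send_page_to_peer, CACHE_PAGES) are hypothetical, and the transport between the two kernels is reduced to a direct memory copy:

#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <stdio.h>

#define CACHE_PAGES 64
#define PAGE_SIZE   4096

struct page_entry {
    bool    valid;              /* validity flag: page usable on this system */
    uint8_t data[PAGE_SIZE];    /* raw page contents */
};

/* One copy of the shared page cache per computer system. */
struct cache_copy {
    struct page_entry pages[CACHE_PAGES];
};

/* Stand-in for the transport that carries the page to the second kernel. */
static void send_page_to_peer(struct cache_copy *peer, int pfn,
                              const uint8_t *data)
{
    memcpy(peer->pages[pfn].data, data, PAGE_SIZE);
    peer->pages[pfn].valid = true;   /* (ii) mark available in second copy */
}

/* First kernel's handler for a faulted-page request from the second kernel. */
static void handle_page_request(struct cache_copy *local,
                                struct cache_copy *peer, int pfn)
{
    local->pages[pfn].valid = false; /* (i) mark unavailable in first copy */
    send_page_to_peer(peer, pfn, local->pages[pfn].data);
}

int main(void)
{
    static struct cache_copy first, second;
    first.pages[3].valid = true;

    handle_page_request(&first, &second, 3);
    printf("first valid=%d second valid=%d\n",
           first.pages[3].valid, second.pages[3].valid);
    return 0;
}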
Abstract:
In one embodiment of the present invention, a method includes switching between a first address space and a second address space, determining whether the second address space exists in a list of address spaces, and maintaining entries of the first address space in a translation buffer after the switch. In this manner, the overhead associated with such a context switch may be reduced.
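A rough C sketch of the address-space check, assuming an ASID-tagged translation buffer in which a flush is needed only when the incoming space is absent from the list; the names (live_spaces, context_switch) are illustrative, not from the source:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_SPACES 8

/* Recently seen address spaces whose entries are still tagged in the TLB. */
static uint32_t live_spaces[MAX_SPACES];
static int live_count;

static bool space_in_list(uint32_t asid)
{
    for (int i = 0; i < live_count; i++)
        if (live_spaces[i] == asid)
            return true;
    return false;
}

/* Switch address spaces; flush the translation buffer only when the target
   space is not in the list, so entries of the first space are kept. */
static void context_switch(uint32_t next_asid)
{
    if (space_in_list(next_asid)) {
        printf("asid %u found: TLB entries retained, no flush\n", next_asid);
    } else {
        printf("asid %u unknown: flush TLB, record space\n", next_asid);
        if (live_count < MAX_SPACES)
            live_spaces[live_count++] = next_asid;
    }
}

int main(void)
{
    context_switch(1);   /* miss: flushed and recorded */
    context_switch(2);   /* miss: flushed and recorded */
    context_switch(1);   /* hit: entries of space 1 survive the switch */
    return 0;
}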
Abstract:
Cache utility curves are determined for different software entities based on how frequently their storage access requests lead to cache hits or cache misses. Not every access request need be tested; rather, only a subset is sampled, selected by whether a hash value of each storage location identifier (such as an address or block number) meets one or more sampling criteria.
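The sampling criterion lends itself to a short illustration. The following C sketch uses a 64-bit mixing function as the hash and keeps a block when its hash falls in the bottom few percent of the hash space; the particular mixer and the function names are assumptions, not taken from the source:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Any cheap, well-distributed hash works; this mixer is only illustrative. */
static uint64_t mix64(uint64_t x)
{
    x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* Sampling criterion: keep a block if its hash lands in the bottom
   rate_percent of the hash space. */
static bool sample_block(uint64_t block_id, unsigned rate_percent)
{
    return mix64(block_id) % 100 < rate_percent;
}

int main(void)
{
    unsigned sampled = 0, total = 1000000;
    for (uint64_t blk = 0; blk < total; blk++)
        if (sample_block(blk, 5))   /* ~5% of requests feed the curve */
            sampled++;
    printf("sampled %u of %u accesses\n", sampled, total);
    return 0;
}

Because the decision depends only on the hash of the identifier, the same blocks are sampled consistently across the whole trace, which is what lets a small subset stand in for the full access stream.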
Abstract:
A computer system and a method are provided that reduce the amount of time and computing resources required to perform a hardware table walk (HWTW) when a translation lookaside buffer (TLB) miss occurs. If a TLB miss occurs while performing a stage 2 (S2) HWTW to find the physical address (PA) at which a stage 1 (S1) page table is stored, the memory management unit (MMU) uses the intermediate physical address (IPA) to predict the corresponding PA, thereby avoiding the S2 table lookups altogether. This greatly reduces the number of lookups required for these HWTW read transactions, and with them the associated processing overhead and performance penalties.
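A toy C illustration of the prediction step, under the assumed (and implementation-specific) premise that page-table memory is offset-mapped, so the PA can be computed from the IPA directly; PT_REGION_OFFSET and the function names are hypothetical:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical fixed offset between IPA and PA for page-table memory;
   the real predictor is implementation-specific. */
#define PT_REGION_OFFSET 0x80000000ULL

/* Instead of the S2 walk, predict the PA of the S1 table directly. */
static uint64_t predict_pa_from_ipa(uint64_t ipa)
{
    return ipa + PT_REGION_OFFSET;
}

int main(void)
{
    uint64_t s1_table_ipa = 0x0004a000ULL;   /* IPA of an S1 page table */
    printf("predicted PA = 0x%llx (S2 lookups skipped)\n",
           (unsigned long long)predict_pa_from_ipa(s1_table_ipa));
    return 0;
}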
Abstract:
A data processing device is provided that employs multiple translation look-aside buffers (TLBs) associated with respective processors, each configured to store selected address translations from a page table of a memory shared by the processors. The device is configured such that when a processor requests an address translation that is not found in its associated TLB, another TLB is probed for the requested translation. The probe of the other TLB may occur in advance of a walk of the page table for the requested address, or a walk may be initiated concurrently with the probe. Where the probe finds the requested translation, the page table walk can be avoided or discontinued.
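A simplified C sketch of the probe path, showing only the sequential variant (probe the peer TLB before walking); the structures and names are invented for illustration, and the concurrent variant would overlap the walk with the probe rather than ordering them:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define TLB_SLOTS 16

struct tlb {
    struct { uint64_t va, pa; bool valid; } e[TLB_SLOTS];
};

static bool tlb_lookup(const struct tlb *t, uint64_t va, uint64_t *pa)
{
    for (int i = 0; i < TLB_SLOTS; i++)
        if (t->e[i].valid && t->e[i].va == va) {
            *pa = t->e[i].pa;
            return true;
        }
    return false;
}

/* Stand-in for the shared page table walk the probe tries to avoid. */
static uint64_t page_table_walk(uint64_t va) { return va | 0x1000; }

/* On a local miss, probe the peer TLB first; walk only if that also misses. */
static uint64_t translate(const struct tlb *local, const struct tlb *peer,
                          uint64_t va)
{
    uint64_t pa;
    if (tlb_lookup(local, va, &pa))
        return pa;
    if (tlb_lookup(peer, va, &pa)) {
        printf("peer TLB hit: page table walk avoided\n");
        return pa;
    }
    return page_table_walk(va);
}

int main(void)
{
    static struct tlb a, b;
    b.e[0].va = 0x2000; b.e[0].pa = 0x9000; b.e[0].valid = true;
    printf("pa = 0x%llx\n", (unsigned long long)translate(&a, &b, 0x2000));
    return 0;
}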
Abstract:
A method of storing stack data in a cache hierarchy is provided. The cache hierarchy comprises a data cache and a stack filter cache. In response to a request to access a stack data block, the method stores the block in the stack filter cache, which is configured to store any requested stack data block.
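A minimal C sketch of the routing decision, assuming stack accesses can be recognized by an address window; the bounds and names (STACK_LO, is_stack_access) are hypothetical stand-ins for however a real design identifies stack references:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stack window; a real design might track SP-relative bounds. */
#define STACK_LO 0x7ff000000000ULL
#define STACK_HI 0x7ff000100000ULL

static bool is_stack_access(uint64_t addr)
{
    return addr >= STACK_LO && addr < STACK_HI;
}

/* Route stack blocks to the stack filter cache; everything else goes to
   the regular data cache. The filter cache accepts every stack block. */
static void access_block(uint64_t addr)
{
    if (is_stack_access(addr))
        printf("0x%llx -> stack filter cache\n", (unsigned long long)addr);
    else
        printf("0x%llx -> data cache\n", (unsigned long long)addr);
}

int main(void)
{
    access_block(0x7ff000000040ULL);     /* stack block: filtered */
    access_block(0x0000555500000000ULL); /* heap/global: data cache */
    return 0;
}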
Abstract:
An apparatus and method are described for coupling a front end core to an accelerator component (e.g., a graphics accelerator). For example, an apparatus is described comprising: an accelerator comprising one or more execution units (EUs) to execute a specified set of instructions; and a front end core comprising a translation lookaside buffer (TLB) communicatively coupled to the accelerator, the front end core providing memory access services to the accelerator, including performing TLB lookup operations to map virtual to physical addresses on the accelerator's behalf when the accelerator requires access to system memory.
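The request/reply shape of such a memory access service can be sketched in C as below; the message structs, the frontend_tlb_service name, and the toy address mapping are all assumptions standing in for the real TLB lookup:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Translation request an EU sends to the front end core; names hypothetical. */
struct tlb_request { uint64_t va; };
struct tlb_reply   { uint64_t pa; bool fault; };

/* Front end core's service: perform the TLB lookup on the accelerator's
   behalf. The toy mapping below stands in for the real TLB / page walk. */
static struct tlb_reply frontend_tlb_service(struct tlb_request req)
{
    struct tlb_reply r = { 0, false };
    r.pa = req.va | 0x100000000ULL;
    return r;
}

/* Accelerator side: an EU needing system memory asks the front end core. */
int main(void)
{
    struct tlb_request req = { .va = 0x4021a8 };
    struct tlb_reply rep = frontend_tlb_service(req);
    printf("EU access: va 0x%llx -> pa 0x%llx\n",
           (unsigned long long)req.va, (unsigned long long)rep.pa);
    return 0;
}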
Abstract:
Embodiments of the invention relate to hybrid address translation. An aspect of the invention includes receiving a first address that references a location in a first address space. The computer searches a segment lookaside buffer (SLB) for an SLB entry corresponding to the first address, the SLB entry comprising a type field and an address field, and determines whether the value of the type field indicates a hashed page table (HPT) search or a radix tree search. Based on determining that the value of the type field indicates the HPT search, an HPT is searched to determine a second address comprising a translation of the first address into a second address space; based on determining that the value of the type field indicates the radix tree search, a radix tree is searched to determine the second address.
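A compact C sketch of the type-field dispatch; the enum, the entry layout, and the toy hpt_search/radix_walk bodies are illustrative placeholders for the real translation mechanisms:

#include <stdint.h>
#include <stdio.h>

enum xlate_type { XLATE_HPT, XLATE_RADIX };

/* SLB entry whose type field selects the translation mechanism. */
struct slb_entry {
    enum xlate_type type;   /* type field: HPT search or radix tree search */
    uint64_t base;          /* address field (table base, simplified) */
};

/* Toy stand-ins for the two translation structures. */
static uint64_t hpt_search(uint64_t base, uint64_t ea) { return base + (ea & 0xffff); }
static uint64_t radix_walk(uint64_t base, uint64_t ea) { return base ^ (ea >> 12); }

/* Dispatch on the SLB entry's type field, as the abstract describes. */
static uint64_t translate(const struct slb_entry *e, uint64_t ea)
{
    return e->type == XLATE_HPT ? hpt_search(e->base, ea)
                                : radix_walk(e->base, ea);
}

int main(void)
{
    struct slb_entry hpt   = { XLATE_HPT,   0x10000000 };
    struct slb_entry radix = { XLATE_RADIX, 0x20000000 };
    printf("hpt:   0x%llx\n", (unsigned long long)translate(&hpt,   0x12345678));
    printf("radix: 0x%llx\n", (unsigned long long)translate(&radix, 0x12345678));
    return 0;
}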