Abstract:
Systems, apparatuses, and methods for improving TM throughput using a TM region indicator (or color) are described. Through the use of TM region indicators younger TM regions can have their instructions retired while waiting for older TM regions to commit.
Abstract:
A processor includes a cache, a front end to decode an instruction, execution units to execute the instruction, and a retirement unit to retire the instruction. The instruction specifies that a vector of data will be prefetched. The instruction is to include a mask, the mask is to indicate whether corresponding values of the vector of data are included within the cache. The execution units include logic to issue prefetches for each value of the vector of data for which the mask indicates that is unavailable within the cache, and logic to suppress prefetches for each value of the vector of data for which the mask indicates that is available within the cache.
Abstract:
Systems, apparatuses, and methods for improving TM throughput using a TM region indicator (or color) are described. Through the use of TM region indicators younger TM regions can have their instructions retired while waiting for older TM regions to commit.
Abstract:
A method and apparatus of a device that reads and writes data using a shared memory hash table and a lookaside buffer is described. In an exemplary embodiment, a device locates a bucket for the data in a shared memory hash table, where a writer updates the shared memory hash table and a reader that is one of a plurality of readers reads from the shared memory hash table. The device further retrieves an initial value of a version of the bucket. If the initial value of the version is odd, the device copies the data from a lookaside buffer of the writer to a local buffer for the reader, wherein the lookaside buffer stores a copy of the data while the bucket is being modified.
Abstract:
Systems and methods enable a virtual machine, including any applications executing thereon, to quickly start executing and servicing users based on pre-staged data blocks supplied from a backup copy in secondary storage. An enhanced media agent may pre-stage certain backed up data blocks which may be needed to launch the virtual machine, based on predictive analysis pertaining to the virtual machine's operational profile. The enhanced media agent may also pre-stage backed up data blocks for a virtual-machine-file-relocation operation, based on the operation's relocation scheme. Servicing read requests to the virtual machine may take priority over ongoing pre-staging of backed up data. Read requests may be tracked so that the media agent may properly maintain the contents of an associated read cache. Some embodiments of the illustrative storage management system may lack, or may simply not require, the relocation operation, and may operate in a “live mount” configuration.
Abstract:
A cache lock validation apparatus for a cache having sets of cache lines and coupled to a cache controller. The apparatus includes a memory coupled to a processor. The memory includes test case data related to an architecture of the cache. The processor selects a first set of the sets of cache lines and generates a corresponding first group of addresses and an overflow status address. The processor instructs the cache controller to sequentially lock the first group of addresses and the overflow status address. The processor checks a status of an overflow bit in a status register of the cache controller upon locking the overflow status address, and generates a FAIL status signal when the overflow bit is reset.
Abstract:
In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing a speculative cache modification design. For example, in one embodiment, such means may include an integrated circuit having a data bus; a cache communicably interfaced with the data bus; a pipeline communicably interfaced with the data bus, in which the pipeline is to receive a store instruction corresponding to a cache line to be written to cache; caching logic to perform a speculative cache write of the cache line into the cache before the store instruction retires from the pipeline; and cache line validation logic to determine if the cache line written into the cache is valid or invalid, in which the cache line validation logic is to invalidate the cache line speculatively written into the cache when determined invalid and further in which the store instruction is allowed to retire from the pipeline when the cache line is determined to be valid.
Abstract:
A Fill Buffer (FB) based data forwarding scheme that stores a combination of Virtual Address (VA), TLB (Translation Look-aside Buffer) entry# or an indication of a location of a Page Table Entry (PTE) in the TLB, and a TLB page size information in the FB and uses these values to expedite FB forwarding. Load (Ld) operations send their non-translated VA for an early comparison against the VA entries in the FB, and are then further qualified with the TLB entry# to determine a “hit.” This hit determination is fast and enables FB forwarding at higher frequencies without waiting for a comparison of Physical Addresses (PA) to conclude in the FB. A safety mechanism may detect a false hit in the FB and generate a late load cancel indication to cancel the earlier-started FB forwarding by ignoring the data obtained as a result of the Ld execution. The Ld is then re-executed later and tries to complete successfully with the correct data.
Abstract:
In one embodiment, a method for detecting false sharing includes running code on a plurality of cores, where the code includes instrumentation and tracking cache invalidations in the code while running the code to produce tracked invalidations in accordance with the instrumentation, where tracking the cache invalidations includes tracking cache accesses to a plurality of cache lines by a plurality of tasks. The method also includes reporting false sharing in accordance with the tracked invalidations to produce a false sharing report.
Abstract:
In one embodiment, a method for predicting false sharing includes running code on a plurality of cores and tracking potential false sharing in the code while running the code to produce tracked potential false sharing, where tracking the potential false sharing includes determining whether there is potential false sharing between a first cache line and a second cache line, and where the first cache line is adjacent to the second cache line. The method also includes reporting potential false sharing in accordance with the tracked potential false sharing to produce a false sharing report.