Abstract:
A method, system and computer program product are disclosed for maintaining data coherence, for use in a multi-node processing system where each of the nodes includes one or more components. In one embodiment, the method comprises establishing a data domain, assigning a group of the components to the data domain, sending a coherence message from a first component of the processing system to a second component of the processing system, and determining if that second component is assigned to the data domain. In this embodiment, if that second component is assigned to the data domain, the coherence message is transferred to all of the components assigned to the data domain to maintain data coherency among those components. In an embodiment, if that second component is assigned to the data domain, the first component is assigned to the data domain.
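The domain check described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `DataDomain` and `send_coherence_message`, the `inboxes` dictionary standing in for the interconnect, the exclusion of the sender from the fan-out, and the fallback of delivering only to the receiver when it is outside the domain are all assumptions of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class DataDomain:
    """Hypothetical model of a data domain: a set of component IDs."""
    members: set = field(default_factory=set)

    def assign(self, component_id):
        self.members.add(component_id)


def send_coherence_message(domain, sender, receiver, message, inboxes,
                           join_sender=True):
    """Send `message` from `sender` to `receiver` and return the targets.

    If the receiver is assigned to the domain, the message is transferred to
    every component in the domain (excluding the sender, a choice made for
    this sketch). With `join_sender` set, the sender is then assigned to the
    domain as well, as in the last embodiment of the abstract.
    """
    if receiver in domain.members:
        targets = sorted(domain.members - {sender})
        if join_sender:
            domain.assign(sender)
    else:
        # Assumption: outside the domain, only the addressed component is told.
        targets = [receiver]
    for component in targets:
        inboxes.setdefault(component, []).append(message)
    return targets
```

A call such as `send_coherence_message(domain, "cpu7", "cpu1", "invalidate 0x40", {})` would reach every current member of the domain and leave `cpu7` assigned to it for later messages.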
Abstract:
A system, method, and computer program product for enhancing timeliness of cache memory prefetching in a processing system are provided. The system includes a stride pattern detector to detect a stride pattern for a stride size in an amount of bytes as a difference between successive cache accesses. The system also includes a confidence counter. The system further includes eager prefetching control logic for performing a method when the stride size is less than a cache line size. The method includes adjusting the confidence counter in response to the stride pattern detector detecting the stride pattern, comparing the confidence counter to a confidence threshold, and requesting a cache prefetch in response to the confidence counter reaching the confidence threshold. The system may also include selection logic to select between the eager prefetching control logic and standard stride prefetching control logic.
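The eager path can be illustrated with a small software model. The abstract does not say how the confidence counter is adjusted or which address is requested, so the choices below (increment on a repeated stride, reset otherwise, prefetch the adjacent cache line in the stride direction) are assumptions, as are the class name `EagerStridePrefetcher`, the 64-byte line size, and the threshold of 2.

```python
CACHE_LINE_SIZE = 64       # bytes; assumed value for illustration
CONFIDENCE_THRESHOLD = 2   # assumed value for illustration


class EagerStridePrefetcher:
    """Hypothetical model of stride detection plus eager prefetch control."""

    def __init__(self, line_size=CACHE_LINE_SIZE, threshold=CONFIDENCE_THRESHOLD):
        self.line_size = line_size
        self.threshold = threshold
        self.last_addr = None
        self.last_stride = None
        self.confidence = 0

    def access(self, addr):
        """Record one cache access; return a line address to prefetch, or None."""
        prefetch = None
        if self.last_addr is not None:
            # Stride is the byte difference between successive accesses.
            stride = addr - self.last_addr
            if stride != 0 and stride == self.last_stride:
                self.confidence += 1   # stride pattern seen again
            else:
                self.confidence = 0    # assumption: reset on a broken pattern
            self.last_stride = stride
            # The eager logic applies only to strides smaller than a line.
            eager = 0 < abs(stride) < self.line_size
            if eager and self.confidence >= self.threshold:
                direction = 1 if stride > 0 else -1
                current_line = addr // self.line_size
                prefetch = (current_line + direction) * self.line_size
        self.last_addr = addr
        return prefetch
```

In this model the sequence of accesses 0, 8, 16, 24 requests the line at address 64 on the fourth access, well before the stream actually crosses the line boundary. Strides of a full line or more return `None` here and would be left to the standard stride prefetcher.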
Abstract:
An event tracking hardware engine having N (≧2) caches is invoked when an event of interest occurs, using a corresponding key. The engine stores, for each of the different kinds of events, a corresponding cumulative number of occurrences, by carrying out additional steps. In some instances, the additional steps include searching in the N caches for an entry for the key; if an entry for the key is found, and no overflow of the corresponding cumulative number of occurrences for the entry for the key would occur by incrementing the corresponding cumulative number of occurrences, incrementing; if the entry for the key is found, and overflow would occur, promoting the entry to a next highest cache; and if the entry for the key is not found, entering the entry for the key in a zeroth one of the caches with the corresponding cumulative number of occurrences being initialized. In other instances, the additional steps include searching in a zeroth one of the caches for an entry for the key; if an entry for the key is found in the zeroth one of the caches, and no overflow of the corresponding cumulative number of occurrences for the entry for the key would occur by incrementing the corresponding cumulative number of occurrences, incrementing; if the entry for the key is found in the zeroth one of the caches, and overflow would occur, promoting the entry from the zeroth one of the caches in which the entry exists to a next highest cache; and if the entry for the key is not found, entering the entry for the key in the zeroth one of the caches with the corresponding cumulative number of occurrences being initialized. The engine includes a plurality of caches and a corresponding plurality of control circuits.
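The first variant (search all N caches, increment, promote on overflow, insert into the zeroth cache on a miss) can be sketched in software. Several details are assumptions of this sketch rather than statements from the abstract: each level is modeled as a dictionary with a wider counter than the level below, a promoted entry carries its incremented count with it, new entries start at a count of 1, the top level saturates, and cache capacity and eviction are not modeled. The name `EventTracker` is hypothetical.

```python
class EventTracker:
    """Hypothetical software model of an N-level event counting engine."""

    def __init__(self, counter_bits=(2, 4, 8)):
        # Level i can count up to 2**counter_bits[i] - 1 before overflowing.
        self.limits = [2 ** bits - 1 for bits in counter_bits]
        self.caches = [dict() for _ in counter_bits]

    def _find(self, key):
        """Return the level holding `key`, or None if no cache has it."""
        for level, cache in enumerate(self.caches):
            if key in cache:
                return level
        return None

    def record(self, key):
        """Account for one occurrence of the event identified by `key`."""
        level = self._find(key)
        if level is None:
            self.caches[0][key] = 1            # enter in the zeroth cache
            return
        count = self.caches[level][key]
        if count < self.limits[level]:
            self.caches[level][key] = count + 1   # no overflow: increment
        elif level + 1 < len(self.caches):
            del self.caches[level][key]           # overflow: promote
            self.caches[level + 1][key] = count + 1
        # Otherwise the top-level counter saturates (assumption).

    def count(self, key):
        level = self._find(key)
        return 0 if level is None else self.caches[level][key]
```

With `counter_bits=(2, 4)`, the fourth occurrence of a key overflows the 2-bit counter and moves the entry from level 0 to level 1, where counting continues.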
Abstract:
A method and apparatus for maintaining membership in a set of items to be used in a predetermined manner in a computer system. A representation of each member of the set is mapped into a number of components of a primary and secondary vector when a member is added to the set. Periodically, the primary vector is changed to the secondary vector and the secondary vector to the primary vector. When members of the set are deleted, the components of the secondary vector are changed to indicate deletion of these members after the primary vector is changed to the secondary vector. Finally, membership in the set is determined by examining the components in the primary vector, and the members in the set of items are then used in a predetermined manner in the computer system. More specifically, in a sample embodiment of the present invention, membership in the set would determine if data is to be stored in or removed from cache memory in a computer system. This invention, for example, provides a low-cost, high-performance mechanism to phase out aging membership information in a prefetching mechanism for caching data or instructions in a computer system.
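One way to read the primary and secondary vectors is as a pair of Bloom-filter-style bit vectors that trade roles periodically. The sketch below follows that reading, which is an interpretation rather than something the abstract states. The class name `AgingMembershipFilter`, the SHA-256-based mapping, the vector size, and the choice to clear the whole secondary vector after each swap are assumptions.

```python
import hashlib


class AgingMembershipFilter:
    """Hypothetical two-vector membership filter that ages out old members."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes   # at most 8 with the mapping below
        self.primary = [False] * size
        self.secondary = [False] * size

    def _positions(self, item):
        # Map the item's representation to several vector components.
        digest = hashlib.sha256(repr(item).encode()).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.size
                for i in range(self.num_hashes)]

    def add(self, item):
        # Adding a member marks components in both vectors.
        for pos in self._positions(item):
            self.primary[pos] = True
            self.secondary[pos] = True

    def rotate(self):
        # Swap roles, then clear the new secondary to record deletions.
        self.primary, self.secondary = self.secondary, self.primary
        self.secondary = [False] * self.size

    def __contains__(self, item):
        # Membership is decided from the primary vector only.
        return all(self.primary[pos] for pos in self._positions(item))
```

Under this scheme a member added once survives the next rotation and disappears after the one following it, so stale entries phase out without any per-member delete operation. As with any Bloom-style structure, false positives are possible.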
Abstract:
A pointer is for pointing to a next-to-read location within a stack of information. For pushing information onto the stack: a value is saved of the pointer, which points to a first location within the stack as being the next-to-read location; the pointer is updated so that it points to a second location within the stack as being the next-to-read location; and the information is written for storage at the second location. For popping the information from the stack: in response to the pointer, the information is read from the second location as the next-to-read location; and the pointer is restored to equal the saved value so that it points to the first location as being the next-to-read location.
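The push and pop sequences can be sketched as a linked stack in which each entry remembers the pointer value that was current before it was pushed. Storing the saved value next to the information, and allocating the second location by appending to a growing list, are assumptions of this sketch; the name `PointerStack` is hypothetical.

```python
class PointerStack:
    """Hypothetical stack driven by a single next-to-read pointer."""

    def __init__(self):
        self.slots = []            # each slot holds (info, saved_pointer)
        self.next_to_read = None   # None means the stack is empty

    def push(self, info):
        saved = self.next_to_read             # save pointer to first location
        self.next_to_read = len(self.slots)   # point at the second location
        self.slots.append((info, saved))      # write the info there

    def pop(self):
        if self.next_to_read is None:
            raise IndexError("pop from empty stack")
        info, saved = self.slots[self.next_to_read]  # read next-to-read slot
        self.next_to_read = saved                    # restore the saved value
        return info
```

Because a pop only restores the pointer and never moves data, an entry pushed after a pop can occupy a fresh location while the older entries stay reachable through the saved pointers. This sketch never reclaims slots, which a bounded hardware structure would have to do.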
Abstract:
An event tracking hardware engine having N (≧2) caches is invoked when an event of interest occurs, using a corresponding key. The event tracking engine stores a cumulative number of occurrences for each one of the different kinds of events, and searches in the N caches for an entry for the key. When an entry for the key is found, the engine increments the number of occurrences if no overflow of the cumulative number of occurrences would occur. However, if the incrementing would cause overflow, then instead of incrementing the cumulative number of occurrences, the engine promotes the entry for the event of interest to a next higher cache.
Abstract:
A method for prefetching data from an array, A, the method including: detecting a stride, dB, of a stream of index addresses of an indirect array, B, contents of each index address having information for determining an address of an element of the array A; detecting an access pattern from the indirect array, B, to data in the array, A, wherein the detecting an access pattern includes: using a constant value of an element size, dA; using a domain size k; executing a load instruction to load bi at address, ia, and receiving index data, mbi; multiplying mbi by dA to produce the product mbi*dA; executing another load instruction to load for a column address, j, where 1≦j≦k, and receiving address aj; recording the difference, aj−mbi*dA; iterating the executing a load instruction, the multiplying, the executing another load instruction, and the recording to produce another difference; incrementing a counter by one if the difference and the another difference are the same; and confirming column address j when the counter reaches a pre-determined threshold; executing a load instruction to load bi+dB and receiving index data nextmbi; and executing a load instruction to load Aj+nextmbi*dA, where Aj=(aj−mbi*dA) when the column address j is confirmed to prefetch the data from the array, A.
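The core arithmetic of the method is the difference aj − mbi*dA, which stays constant (and equals the base address of A) whenever a load address really is A[B[i]]. The sketch below models only that confirmation step and the final prefetch address. It works on recorded (index value, data address) pairs instead of real loads, it omits the search over the k candidate column addresses, and it resets the counter on a mismatch, which the abstract does not specify. The function names are hypothetical.

```python
def confirm_base(observations, element_size, threshold):
    """Return the confirmed base address of A, or None.

    `observations` is an iterable of (index_value, data_address) pairs,
    corresponding to (mbi, aj) in the text; `element_size` is dA.
    """
    previous = None
    counter = 0
    for index_value, data_address in observations:
        difference = data_address - index_value * element_size
        if previous is not None and difference == previous:
            counter += 1                 # same difference seen again
            if counter >= threshold:
                return difference        # confirmed: this is Aj
        else:
            counter = 0                  # assumption: reset on a mismatch
        previous = difference
    return None


def prefetch_address(base, next_index_value, element_size):
    """Address to prefetch: Aj + nextmbi * dA."""
    return base + next_index_value * element_size
```

For example, with A based at 0x1000, 8-byte elements, and index values 5, 2, 9 observed in B, every difference equals 0x1000, so a threshold of 2 confirms the base on the third observation. If the next index value loaded from B is 4, the prefetch targets 0x1000 + 4 * 8 = 0x1020.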
Abstract:
Methods, systems and computer program products for concomitant pair prefetching. Exemplary embodiments include a method for concomitant pair prefetching, the method including detecting a stride pattern, detecting an indirect access pattern to define an access window, prefetching candidates within the defined access window, wherein the prefetching comprises obtaining prefetch addresses from a history table, updating a miss stream window, selecting a candidate of a concomitant pair from the miss stream window, producing an index from the candidate pair, accessing an aging filter, updating the history table, and selecting another concomitant pair candidate from the miss stream window.