Abstract:
A computer architecture that includes a hierarchical memory system and one or more processors. The processors execute memory access instructions whose semantics are defined in terms of the hierarchical structure of the memory system. That is, rather than attempting to maintain the illusion that the memory system is shared by all processors such that changes made by one processor are immediately visible to other processors, the memory access instructions explicitly address access to a processor-specific memory, and data transfer between the processor-specific memory and the shared memory system. Various alternative embodiments of the memory system are compatible with these instructions. These alternative embodiments do not change the semantic meaning of a computer program which uses the memory access instructions, but allow different approaches to how and when data is actually passed from one processor to another.
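As an illustration only, the following sketch models a processor-specific memory whose contents reach the shared memory only through explicit transfer operations. The class and method names (`store_local`, `load_local`, `write_back`, `refresh`) are hypothetical; the abstract does not name the actual instructions, and a real design would involve hardware details not shown here.

```python
class SharedMemory:
    """Shared level of the hierarchy: a plain address -> value map."""
    def __init__(self):
        self.cells = {}


class Processor:
    """Hypothetical processor with its own processor-specific memory."""
    def __init__(self, shared):
        self.shared = shared
        self.local = {}     # processor-specific memory
        self.dirty = set()  # addresses modified locally, not yet transferred

    def store_local(self, addr, value):
        # Writes touch only the processor-specific memory.
        self.local[addr] = value
        self.dirty.add(addr)

    def load_local(self, addr):
        # Reads use the local copy; a miss pulls the value from shared memory.
        if addr not in self.local:
            self.local[addr] = self.shared.cells.get(addr, 0)
        return self.local[addr]

    def write_back(self, addr):
        # Explicit transfer: local -> shared.
        if addr in self.dirty:
            self.shared.cells[addr] = self.local[addr]
            self.dirty.discard(addr)

    def refresh(self, addr):
        # Explicit transfer in the other direction: drop a clean local copy
        # so the next load re-reads shared memory.
        if addr not in self.dirty:
            self.local.pop(addr, None)


shared = SharedMemory()
p0, p1 = Processor(shared), Processor(shared)
p1.load_local(0x10)        # p1 caches the initial value
p0.store_local(0x10, 42)   # visible only to p0
p0.write_back(0x10)        # now in shared memory
p1.refresh(0x10)           # p1 discards its stale copy
value_seen_by_p1 = p1.load_local(0x10)
```

In this model, another processor observes a write only after the writer transfers it out and the reader explicitly refreshes its copy. When those transfers physically happen is left to the memory system.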
Abstract:
A methodology for designing a distributed shared-memory system, which can incorporate adaptation or selection of cache protocols during operation, guarantees semantically correct processing of memory instructions by multiple processors. A set of rules includes a first subset of “mandatory” rules and a second subset of “voluntary” rules such that correct operation of the memory system is provided by application of all of the mandatory rules and selective application of the voluntary rules. A policy for enabling voluntary rules specifies a particular coherent cache protocol. The policy can include various types of adaptation and selection of different operating modes for different addresses and at different caches. A particular coherent cache protocol can make use of a limited-capacity directory in which some, but not necessarily all, caches that hold a particular address are identified. In another coherent cache protocol, various caches hold an address in different modes which, for example, affect communication between a cache and a shared memory in processing particular memory instructions.
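The split between mandatory and voluntary rules can be sketched with a toy single-cache model. The rule names used here (`prefetch_next`, `eager_writeback`) are hypothetical examples of voluntary rules, not rules taken from the abstract. The point shown is only that enabling different subsets of voluntary rules changes when data moves, not what the program observes.

```python
class Cache:
    def __init__(self, memory, policy=frozenset()):
        self.memory = memory   # shared memory: dict addr -> value
        self.lines = {}        # addr -> [value, dirty]
        self.policy = policy   # names of the enabled voluntary rules

    def _fetch(self, addr):
        if addr not in self.lines:
            self.lines[addr] = [self.memory.get(addr, 0), False]

    def _write_back(self, addr):
        line = self.lines.get(addr)
        if line and line[1]:
            self.memory[addr] = line[0]
            line[1] = False

    def load(self, addr):
        self._fetch(addr)                   # mandatory: a load must obtain the data
        if "prefetch_next" in self.policy:
            self._fetch(addr + 1)           # voluntary: may fire, or not
        return self.lines[addr][0]

    def store(self, addr, value):
        self.lines[addr] = [value, True]
        if "eager_writeback" in self.policy:
            self._write_back(addr)          # voluntary: early write-back

    def commit(self, addr):
        self._write_back(addr)              # mandatory: commit forces write-back


def run(policy):
    """Run one fixed program under the given policy; return what it observed."""
    memory = {0: 5, 1: 7}
    cache = Cache(memory, frozenset(policy))
    seen = [cache.load(0)]
    cache.store(1, 9)
    seen.append(cache.load(1))
    cache.commit(1)
    return seen, memory
```

Calling `run` with an empty policy, with `{"prefetch_next"}`, or with `{"eager_writeback"}` yields the same loaded values and the same final memory contents in this sketch.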
Abstract:
In a computer system with a memory hierarchy, when a high-level cache supplies a data copy to a low-level cache, the shared copy can be either volatile or non-volatile. When the data copy is later replaced from the low-level cache, if the data copy is non-volatile, it needs to be written back to the high-level cache; otherwise it can be simply flushed from the low-level cache. The high-level cache can employ a volatile-prediction mechanism that adaptively determines whether a volatile copy or a non-volatile copy should be supplied when the high-level cache needs to send data to the low-level cache. An exemplary volatile-prediction mechanism suggests use of a non-volatile copy if the cache line has been accessed consecutively by the low-level cache. Further, the low-level cache can employ a volatile-promotion mechanism that adaptively changes a data copy from volatile to non-volatile according to some promotion policy, or changes a data copy from non-volatile to volatile according to some demotion policy.
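A minimal sketch of the volatile and non-volatile copy handling follows. It assumes one possible reading of "accessed consecutively": the same low-level cache requested the line on the previous request as well. The class names and the `write_backs` log are hypothetical, and promotion and demotion policies are not modelled.

```python
class HighLevelCache:
    def __init__(self, data):
        self.data = dict(data)     # addr -> value
        self.last_requester = {}   # addr -> id of the cache that asked last
        self.write_backs = []      # log of addresses written back (for illustration)

    def supply(self, addr, requester_id):
        """Return (value, volatile) for a request from a low-level cache."""
        # Assumed predictor: a repeat request from the same cache
        # suggests a non-volatile copy; otherwise supply a volatile one.
        volatile = self.last_requester.get(addr) != requester_id
        self.last_requester[addr] = requester_id
        return self.data[addr], volatile

    def accept_write_back(self, addr, value):
        self.data[addr] = value
        self.write_backs.append(addr)


class LowLevelCache:
    def __init__(self, cache_id, parent):
        self.cache_id = cache_id
        self.parent = parent
        self.lines = {}            # addr -> (value, volatile)

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.parent.supply(addr, self.cache_id)
        return self.lines[addr][0]

    def replace(self, addr):
        value, volatile = self.lines.pop(addr)
        if not volatile:
            # Non-volatile copies must be written back on replacement.
            self.parent.accept_write_back(addr, value)
        # Volatile copies are simply dropped.
```

In this sketch, the first replacement of a line costs no write-back, while a line that the same cache fetches again is supplied as non-volatile and is written back when replaced.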
Abstract:
A method for reconfiguring a cache memory is provided. The method in one aspect may include analyzing one or more characteristics of an execution entity accessing a cache memory and reconfiguring the cache based on the one or more characteristics analyzed. Examples of analyzed characteristics may include, but are not limited to, the data structure used by the execution entity, the expected reference pattern of the execution entity, the type of the execution entity, and the heat and power consumption of the execution entity. Examples of cache attributes that may be reconfigured may include, but are not limited to, the associativity of the cache memory, the amount of the cache memory available to store data, the coherence granularity of the cache memory, and the line size of the cache memory.
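One way to picture the analyze-then-reconfigure step is a function that maps a profile of the execution entity to a new cache configuration. Everything specific in this sketch is an assumption made for illustration: the profile fields, the pattern labels "streaming" and "random", the default sizes, and the adjustment rules are not taken from the abstract.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CacheConfig:
    associativity: int = 8
    line_size: int = 64       # bytes
    capacity_kib: int = 512   # amount of cache available to store data


@dataclass(frozen=True)
class EntityProfile:
    reference_pattern: str    # hypothetical labels: "streaming" or "random"
    power_constrained: bool


def reconfigure(config, profile):
    """Return a new configuration chosen from the analyzed characteristics."""
    if profile.reference_pattern == "streaming":
        # Sequential access benefits from longer lines (assumed heuristic).
        config = replace(config, line_size=config.line_size * 2)
    elif profile.reference_pattern == "random":
        # Scattered access wastes less bandwidth with shorter lines.
        config = replace(config, line_size=max(32, config.line_size // 2))
    if profile.power_constrained:
        # Reduce the active capacity to save power (assumed heuristic).
        config = replace(config, capacity_kib=config.capacity_kib // 2)
    return config
```

For example, `reconfigure(CacheConfig(), EntityProfile("streaming", True))` yields a configuration with 128-byte lines and 256 KiB of active capacity under these assumed rules.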
Abstract:
A system, method, and computer-readable article of manufacture for sharing buffer management. The system includes: a predictor module to predict at runtime a transaction data size of a transaction according to history information of the transaction; and a resource management module to allocate sharing buffer resources for the transaction according to the predicted transaction data size in response to the beginning of the transaction, to record the actual sharing buffer size occupied by the transaction in response to successful commitment of the transaction, and to update the history information of the transaction.
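The predictor and resource management modules could interact roughly as sketched below. The prediction rule (the maximum of the last four recorded sizes), the default size, and all class and method names are assumptions for illustration; the abstract does not specify how the prediction is computed.

```python
class SizePredictor:
    """Predicts a transaction's data size from its recorded history."""
    def __init__(self, default=4, window=4):
        self.history = {}        # transaction id -> recent actual sizes
        self.default = default   # used when no history exists yet
        self.window = window

    def predict(self, txn):
        sizes = self.history.get(txn)
        return max(sizes) if sizes else self.default

    def update(self, txn, actual_size):
        sizes = self.history.setdefault(txn, [])
        sizes.append(actual_size)
        del sizes[:-self.window]  # keep only the most recent entries


class BufferManager:
    """Allocates buffer space at begin; records actual use at commit."""
    def __init__(self, total, predictor):
        self.free = total
        self.predictor = predictor
        self.allocated = {}       # transaction id -> units currently held

    def begin(self, txn):
        size = min(self.predictor.predict(txn), self.free)
        self.free -= size
        self.allocated[txn] = size
        return size

    def commit(self, txn, actual_size):
        # Release the allocation and feed the observed size back to the predictor.
        self.free += self.allocated.pop(txn)
        self.predictor.update(txn, actual_size)
```

With this rule, a transaction that first receives the default allocation and then commits with a larger actual size is given that larger size the next time it begins.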
Abstract:
A system and method for latency-aware thread scheduling in a non-uniform cache architecture are provided. Instructions may be provided to the hardware specifying in which banks to store data. Information as to which banks store which data may also be provided, for example, by the hardware. This information may be used to schedule threads on one or more cores. A selected bank in cache memory may be reserved strictly for selected data.
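As a rough sketch of how bank placement information might drive scheduling, the function below greedily places each thread on the free core with the lowest latency to the bank holding that thread's data. The greedy strategy, the latency table, and all names are assumptions for illustration; the abstract does not describe a specific scheduling algorithm.

```python
def schedule(threads, bank_of, latency):
    """Assign each thread to a distinct core.

    threads: thread names, in scheduling order
    bank_of: thread name -> bank that holds the thread's data
    latency: core id -> {bank: access latency in cycles}
    Assumes there are at least as many cores as threads.
    """
    free_cores = set(latency)
    placement = {}
    for thread in threads:
        bank = bank_of[thread]
        # Sort first so that ties are broken deterministically by core id.
        core = min(sorted(free_cores), key=lambda c: latency[c][bank])
        placement[thread] = core
        free_cores.remove(core)
    return placement


# Hypothetical two-core, two-bank machine: each core is close to one bank.
latency = {0: {"b0": 10, "b1": 30}, 1: {"b0": 30, "b1": 10}}
bank_of = {"t0": "b1", "t1": "b0"}
placement = schedule(["t0", "t1"], bank_of, latency)
```

Here `t0`, whose data sits in bank `b1`, lands on core 1, and `t1` lands on core 0. A greedy pass is not guaranteed to minimize total latency in general; it is used here only to keep the example short.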