Abstract:
A method, system, and computer usable program product for improved register allocation in a simultaneous multithreaded processor. A determination is made that a thread of an application in the data processing environment needs more physical registers than are available to allocate to the thread. The thread is configured to utilize a logical register that is mapped to a memory register. The thread is executed utilizing the physical registers and the memory registers.
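The mapping described above can be pictured with a small software model. This is only a sketch under stated assumptions: the class `RegisterMapper`, its method names, and the first-come allocation policy are hypothetical and are not taken from the abstract, which does not say how the mapping is implemented.

```python
class RegisterMapper:
    """Toy model: logical registers map to physical or memory-backed registers."""

    def __init__(self, num_physical):
        self.free_physical = list(range(num_physical))  # shared pool of physical registers
        self.physical = [0] * num_physical               # physical register file
        self.memory = {}                                 # memory registers, keyed by (thread, logical)
        self.tables = {}                                 # per-thread map: logical -> (kind, slot)

    def configure_thread(self, thread_id, needed):
        """Map `needed` logical registers, falling back to memory registers
        once the physical pool is exhausted (assumed policy)."""
        table = {}
        for logical in range(needed):
            if self.free_physical:
                table[logical] = ("physical", self.free_physical.pop(0))
            else:
                slot = (thread_id, logical)
                self.memory[slot] = 0
                table[logical] = ("memory", slot)
        self.tables[thread_id] = table
        return table

    def write(self, thread_id, logical, value):
        kind, slot = self.tables[thread_id][logical]
        if kind == "physical":
            self.physical[slot] = value
        else:
            self.memory[slot] = value

    def read(self, thread_id, logical):
        kind, slot = self.tables[thread_id][logical]
        return self.physical[slot] if kind == "physical" else self.memory[slot]


mapper = RegisterMapper(num_physical=4)
mapper.configure_thread(0, needed=3)           # fits entirely in physical registers
table = mapper.configure_thread(1, needed=3)   # only one physical register is left
mapper.write(1, 2, 99)                         # logical register 2 is memory-backed
```

In this model the second thread runs with one physical register and two memory registers, and its reads and writes go through the same logical register numbers either way.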
Abstract:
A system and computer usable program product for fast remote communication and computation between processors are provided in the illustrative embodiments. A direct core to core communication unit (DCC) is configured to operate with a first processor, the first processor being a remote processor. A memory associated with the DCC receives a set of bytes, the set of bytes being sent from a second processor. An operation specified in the set of bytes is executed at the remote processor such that the operation is invoked without causing a software thread to execute.
Abstract:
A method for fast remote communication and computation between processors is provided in the illustrative embodiments. A direct core to core communication unit (DCC) is configured to operate with a first processor, the first processor being a remote processor. A memory associated with the DCC receives a set of bytes, the set of bytes being sent from a second processor. An operation specified in the set of bytes is executed at the remote processor such that the operation is invoked without causing a software thread to execute.
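As a rough illustration of the flow above, the following sketch models a DCC whose memory receives a byte sequence and which executes the encoded operation directly on receipt. The byte layout (one opcode byte followed by two 32-bit operands), the opcode table `OPS`, and the names `DCC` and `encode` are all assumptions made for this example; the abstract does not define a wire format.

```python
import struct

# Hypothetical opcode table; the abstract does not list concrete operations.
OPS = {
    0x01: lambda a, b: a + b,
    0x02: lambda a, b: a * b,
}

LAYOUT = "<Bii"  # assumed layout: opcode byte, then two little-endian int32 operands


def encode(opcode, a, b):
    """Build the set of bytes that the sending processor would transmit."""
    return struct.pack(LAYOUT, opcode, a, b)


class DCC:
    """Toy direct core to core communication unit attached to the remote processor."""

    def __init__(self):
        self.memory = bytearray(struct.calcsize(LAYOUT))  # memory associated with the DCC
        self.results = []

    def receive(self, payload):
        # The bytes land in the DCC memory and the operation is executed
        # immediately, with no software thread being scheduled to handle it.
        self.memory[:] = payload
        opcode, a, b = struct.unpack(LAYOUT, bytes(self.memory))
        self.results.append(OPS[opcode](a, b))


dcc = DCC()
dcc.receive(encode(0x01, 2, 3))
dcc.receive(encode(0x02, 4, 5))
```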
Abstract:
A system and method for cache management in a data processing system. The data processing system includes a processor and a memory hierarchy. The memory hierarchy includes at least an upper memory cache, at least a lower memory cache, and a write-back data structure. In response to replacing data from the upper memory cache, the upper memory cache examines the write-back data structure to determine whether or not the data is present in the lower memory cache. If the data is present in the lower memory cache, the data is replaced in the upper memory cache without casting out the data to the lower memory cache.
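The replacement decision above can be sketched as follows. This is a simplified model, not the disclosed design: the write-back data structure is represented by a plain set (`in_lower`), the lower cache by a dictionary, replacement is oldest-first, and lines are never modified while in the upper cache. All names are hypothetical.

```python
from collections import OrderedDict


class UpperCache:
    """Toy upper-level cache that skips castouts for lines already held below."""

    def __init__(self, capacity, lower):
        self.capacity = capacity
        self.lower = lower          # dict standing in for the lower memory cache
        self.lines = OrderedDict()  # address -> data, oldest entry first
        self.in_lower = set()       # stand-in for the write-back data structure
        self.castouts = 0

    def load(self, addr, data):
        if addr in self.lines:
            return
        if len(self.lines) >= self.capacity:
            self._replace()
        if addr in self.lower:
            self.in_lower.add(addr)  # record that the lower cache holds this line
        self.lines[addr] = data

    def _replace(self):
        victim, data = self.lines.popitem(last=False)
        if victim in self.in_lower:
            # Already present in the lower cache: drop it without a castout.
            self.in_lower.discard(victim)
            return
        self.lower[victim] = data    # otherwise cast the line out to the lower cache
        self.castouts += 1


lower = {0x10: "a"}
upper = UpperCache(capacity=1, lower=lower)
upper.load(0x10, "a")
upper.load(0x20, "b")   # replaces 0x10, which the lower cache already holds
upper.load(0x30, "c")   # replaces 0x20, which must be cast out
```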
Abstract:
A method of accessing data from a cache is disclosed. Tag bits of data among sets and ways of cache lines are divided into common subtags and remaining subtags. Similarly, an access address tag is divided into an address common subtag and an address remaining subtag. When the index of an access address selects a set, a match comparison of the address common subtag and the selected set common subtag is performed. Also, the address remaining subtag and the selected set remaining subtags are compared for matching before the selected set and associated data are supplied to the requester.
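A minimal sketch of the two-part comparison is shown below. The bit widths (`INDEX_BITS`, `REMAINING_BITS`), the address layout, and the assumption that one common subtag is stored per set while each way stores only its remaining subtag are choices made for illustration and are not specified in the abstract.

```python
from dataclasses import dataclass, field

INDEX_BITS = 2      # assumed: low bits of the address select the set
REMAINING_BITS = 4  # assumed: low bits of the tag form the remaining subtag


def make_address(common, remaining, index):
    """Assemble an address from its three fields (helper for the example)."""
    return (((common << REMAINING_BITS) | remaining) << INDEX_BITS) | index


def split_address(addr):
    """Return (address common subtag, address remaining subtag, index)."""
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = addr >> INDEX_BITS
    remaining = tag & ((1 << REMAINING_BITS) - 1)
    common = tag >> REMAINING_BITS
    return common, remaining, index


@dataclass
class CacheSet:
    common_subtag: int                        # shared by every way in the set
    ways: dict = field(default_factory=dict)  # remaining subtag -> cached data


def lookup(sets, addr):
    common, remaining, index = split_address(addr)
    selected = sets[index]
    if selected.common_subtag != common:   # common subtag comparison
        return None
    return selected.ways.get(remaining)    # remaining subtag comparison per way


sets = {1: CacheSet(common_subtag=5, ways={3: "line A", 9: "line B"})}
hit = lookup(sets, make_address(5, 3, 1))
```

A lookup succeeds only when both comparisons match, so an address that agrees on the remaining subtag but differs in the common subtag is a miss.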
Abstract:
For a flexible replication with skewed mapping in a multi-core chip, a request for a cache line is received, at a receiver core in the multi-core chip from a requester core in the multi-core chip. The receiver and requester cores comprise electronic circuits. The multi-core chip comprises a set of cores including the receiver and the requester cores. A target core is identified from the request to which the request is targeted. A determination is made whether the target core includes the requester core in a neighborhood of the target core, the neighborhood including a first subset of cores mapped to the target core according to a skewed mapping. The cache line is replicated, responsive to the determining being negative, from the target core to a replication core. The cache line is provided from the replication core to the requester core.
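The decision sequence above can be sketched in a few lines. Several parts are assumptions: the abstract does not give the skewed mapping function or the rule for choosing the replication core, so `neighborhood` (neighbors stepped by a fixed skew) and the "first core whose neighborhood contains the requester" policy are hypothetical, as is serving directly from the target core when the requester is in its neighborhood.

```python
NUM_CORES = 8


def neighborhood(core, size=3, skew=3):
    """Hypothetical skewed mapping: neighbors are spaced `skew` cores apart."""
    return {(core + k * skew) % NUM_CORES for k in range(1, size + 1)}


def serve_request(requester, target, caches, line):
    """Return the core that supplies `line` to the requester."""
    if requester == target or requester in neighborhood(target):
        return target  # assumed: no replication needed in this case
    # Determination was negative: replicate the line to a replication core.
    # Assumed policy: pick the first other core whose neighborhood holds the requester.
    replica = next(
        c for c in range(NUM_CORES)
        if c != target and requester in neighborhood(c)
    )
    caches[replica][line] = caches[target][line]
    return replica


caches = [dict() for _ in range(NUM_CORES)]
caches[0]["X"] = "data"
near = serve_request(requester=3, target=0, caches=caches, line="X")
far = serve_request(requester=2, target=0, caches=caches, line="X")
```

With these parameters the neighborhood of core 0 is cores 1, 3, and 6, so the request from core 3 is served by the target itself, while the request from core 2 causes the line to be replicated to core 1 first.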