摘要:
A method, system and computer-usable medium are disclosed for managing transient instruction streams. Transient flags are defined in Branch-and-Link (BRL) instructions that are known to be infrequently executed. A bit is likewise set in a Special Purpose Register (SPR) of the hardware (e.g., a core) that is executing an instruction request thread. Subsequent fetches or prefetches in the request thread are treated as transient and are not written to lower-level caches. If an instruction is non-transient, and if a lower-level cache is non-inclusive of the L1 instruction cache, a fetch or prefetch miss that is obtained from memory may be written in both the L1 and the lower-level cache. If it is not inclusive, a cast-out from the L1 instruction cache may be written in the lower-level cache.
摘要:
A method, system and computer-usable medium are disclosed for managing transient instruction streams. Transient flags are defined in Branch-and-Link (BRL) instructions that are known to be infrequently executed. A bit is likewise set in a Special Purpose Register (SPR) of the hardware (e.g., a core) that is executing an instruction request thread. Subsequent fetches or prefetches in the request thread are treated as transient and are not written to lower-level caches. If an instruction is non-transient, and if a lower-level cache is non-inclusive of the L1 instruction cache, a fetch or prefetch miss that is obtained from memory may be written in both the L1 and the lower-level cache. If it is not inclusive, a cast-out from the L1 instruction cache may be written in the lower-level cache.
摘要:
A method, system and computer-usable medium are disclosed for managing transient instruction streams. Transient flags are defined in Branch-and-Link (BRL) instructions that are known to be infrequently executed. A bit is likewise set in a Special Purpose Register (SPR) of the hardware (e.g., a core) that is executing an instruction request thread. Subsequent fetches or prefetches in the request thread are treated as transient and are not written to lower-level caches. If an instruction is non-transient, and if a lower-level cache is non-inclusive of the L1 instruction cache, a fetch or prefetch miss that is obtained from memory may be written in both the L1 and the lower-level cache. If it is not inclusive, a cast-out from the L1 instruction cache may be written in the lower-level cache.
摘要:
A method, system and computer-usable medium are disclosed for managing prefetch streams in a virtual machine environment. Compiled application code in a first core, which comprises a Special Purpose Register (SPR) and a plurality of first prefetch engines, initiates a prefetch stream request. If the prefetch stream request cannot be initiated due to unavailability of a first prefetch engine, then an indicator bit indicating a Prefetch Stream Dispatch Fault is set in the SPR, causing a Hypervisor to interrupt the execution of the prefetch stream request. The Hypervisor then calls its associated operating system (OS), which determines prefetch engine availability for a second core comprising a plurality of second prefetch engines. If a second prefetch engine is available, then the OS migrates the prefetch stream request from the first core to the second core, where it is initiated on an available second prefetch engine.
摘要:
A method, system and computer-usable medium are disclosed for managing prefetch streams in a virtual machine environment. Compiled application code in a first core, which comprises a Special Purpose Register (SPR) and a plurality of first prefetch engines, initiates a prefetch stream request. If the prefetch stream request cannot be initiated due to unavailability of a first prefetch engine, then an indicator bit indicating a Prefetch Stream Dispatch Fault is set in the SPR, causing a Hypervisor to interrupt the execution of the prefetch stream request. The Hypervisor then calls its associated operating system (OS), which determines prefetch engine availability for a second core comprising a plurality of second prefetch engines. If a second prefetch engine is available, then the OS migrates the prefetch stream request from the first core to the second core, where it is initiated on an available second prefetch engine.
摘要:
A method, system and computer-usable medium are disclosed for managing transient instruction streams. Transient flags are defined in Branch-and-Link (BRL) instructions that are known to be infrequently executed. A bit is likewise set in a Special Purpose Register (SPR) of the hardware (e.g., a core) that is executing an instruction request thread. Subsequent fetches or prefetches in the request thread are treated as transient and are not written to lower-level caches. If an instruction is non-transient, and if a lower-level cache is non-inclusive of the L1 instruction cache, a fetch or prefetch miss that is obtained from memory may be written in both the L1 and the lower-level cache. If it is not inclusive, a cast-out from the L1 instruction cache may be written in the lower-level cache.
摘要:
Functionality is implemented to determine that a plurality of multi-core processing units of a system are configured in accordance with a plurality of operating performance modes. It is determined that a first of the plurality of operating performance modes satisfies a first performance criterion that corresponds to a first workload of a first logical partition of the system. Accordingly, the first logical partition is associated with a first set of the plurality of multi-core processing units that are configured in accordance with the first operating performance mode. It is determined that a second of the plurality of operating performance modes satisfies a second performance criterion that corresponds to a second workload of a second logical partition of the system. Accordingly, the second logical partition is associated with a second set of the plurality of multi-core processing units that are configured in accordance with the second operating performance mode.
摘要:
Functionality is implemented to determine that a plurality of multi-core processing units of a system are configured in accordance with a plurality of operating performance modes. It is determined that a first of the plurality of operating performance modes satisfies a first performance criterion that corresponds to a first workload of a first logical partition of the system. Accordingly, the first logical partition is associated with a first set of the plurality of multi-core processing units that are configured in accordance with the first operating performance mode. It is determined that a second of the plurality of operating performance modes satisfies a second performance criterion that corresponds to a second workload of a second logical partition of the system. Accordingly, the second logical partition is associated with a second set of the plurality of multi-core processing units that are configured in accordance with the second operating performance mode.
摘要:
Functionality is implemented to determine that a plurality of multi-core processing units of a system are configured in accordance with a plurality of operating performance modes. It is determined that a first of the plurality of operating performance modes satisfies a first performance criterion that corresponds to a first workload of a first logical partition of the system. Accordingly, the first logical partition is associated with a first set of the plurality of multi-core processing units that are configured in accordance with the first operating performance mode. It is determined that a second of the plurality of operating performance modes satisfies a second performance criterion that corresponds to a second workload of a second logical partition of the system. Accordingly, the second logical partition is associated with a second set of the plurality of multi-core processing units that are configured in accordance with the second operating performance mode.
摘要:
Functionality is implemented to determine that a plurality of multi-core processing units of a system are configured in accordance with a plurality of operating performance modes. It is determined that a first of the plurality of operating performance modes satisfies a first performance criterion that corresponds to a first workload of a first logical partition of the system. Accordingly, the first logical partition is associated with a first set of the plurality of multi-core processing units that are configured in accordance with the first operating performance mode. It is determined that a second of the plurality of operating performance modes satisfies a second performance criterion that corresponds to a second workload of a second logical partition of the system. Accordingly, the second logical partition is associated with a second set of the plurality of multi-core processing units that are configured in accordance with the second operating performance mode.