Abstract:
A system, processor, and method to record the interleavings of shared memory accesses in the presence of complex multi-operation instructions. An extension to instruction atomicity (IA) is disclosed that makes it possible for software to infer partial information about a multi-operation execution if the hardware has recorded a dependency due to an instruction atomicity violation (IAV). By monitoring the progress of a multi-operation instruction, the need for complex multi-operation emulation is unnecessary.
Abstract:
Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
Abstract:
Example methods and apparatus to facilitate dynamic core selection are disclosed. An example apparatus includes a first processor core of a first type; a second processor core of a second type different from the first type; and software to: access a user-supplied hint indicative of a user preference to execute program code on the first processor core, the user-supplied hint including a user-defined attribute of the program code; monitor performance of the program code on the first processor core; determine, based on the user-defined attribute of the program code, a predicted performance of the program code on the second processor core is better than the performance of the program code on the first processor core; and ignore the user preference by migrating the program code from the first processor core for execution on the second processor core
Abstract:
Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
Abstract:
A mechanism is described for facilitating dynamic and efficient management of instruction atomicity violations in software programs according to one embodiment. A method of embodiments, as described herein, includes receiving, at a replay logic from a recording system, a recording of a first software thread running a first macro instruction, and a second software thread running a second macro instruction. The first software thread and the second software thread are executed by a first core and a second core, respectively, of a processor at a computing device. The recording system may record interleavings between the first and second macro instructions. The method includes correctly replaying the recording of the interleavings of the first and second macro instructions precisely as they occurred. The correctly replaying may include replaying a local memory state of the first and second macro instructions and a global memory state of the first and second software threads.
Abstract:
Various embodiments are generally directed to detecting race conditions arising from uncoordinated data accesses by different portions of an application routine by detecting occurrences of a selected cache event associated with such accesses. An apparatus includes a processor component; a trigger component for execution by the processor component to configure a monitoring unit of the processor component to detect a cache event associated with a race condition between accesses to a piece of data and to capture an indication of a state of the processor component to generate monitoring data in response to an occurrence of the cache event; and a counter component for execution by the processor component to configure a counter of the monitoring unit to enable capture of the indication of the state of the processor component at a frequency less than every occurrence of the cache event. Other embodiments are described and claimed.
Abstract:
An apparatus and method is described herein for conditionally committing and/or speculative checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. And the conditional commit enables efficient execution of the dynamic optimization code, while attempting to prevent transactions from running out of hardware resources. While the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. And processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.