Abstract:
A system may comprise an optimizer/scheduler to schedule a set of instructions, compute a data dependence, a checking constraint, and/or an anti-checking constraint for the set of scheduled instructions, and allocate alias registers for the set of scheduled instructions based on the data dependence, the checking constraint, and/or the anti-checking constraint. In one embodiment, the optimizer is to release unused registers to reduce the number of alias registers used to protect the scheduled instructions. The optimizer is further to insert a dummy instruction after a fused instruction to break cycles in the checking and anti-checking constraints.
Abstract:
Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
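The decision step described above can be sketched as a small function. This is a minimal illustration, not the claimed implementation: the names `SwitchDecision` and `decide_core` are hypothetical, and it assumes that a larger metric value means better performance.

```python
from dataclasses import dataclass


@dataclass
class SwitchDecision:
    run_on: str             # "first" or "second": core that continues execution
    second_core_power: str  # "down" or "up": power signal for the second core


def decide_core(first_metric: float, previous_metric: float) -> SwitchDecision:
    """Choose where to keep running after the second core has been powered up.

    Assumption: larger metric values are better (e.g. instructions per cycle).
    """
    if first_metric > previous_metric:
        # The first core is doing better than the stored reference:
        # stay on it and signal power down of the second core.
        return SwitchDecision(run_on="first", second_core_power="down")
    # Otherwise switch execution over to the second core, which stays up.
    return SwitchDecision(run_on="second", second_core_power="up")
```

A metric that merely equals the previously determined one is "not better", so the sketch switches cores in that case.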
Abstract:
An apparatus and method for speculative vectorization are described. For example, one embodiment of a processor comprises: a queue comprising a set of locations for storing addresses associated with vectorized memory access instructions; and execution logic to execute a first vectorized memory access instruction to access the queue and to compare a new address associated with the first vectorized memory access instruction with existing addresses stored within a specified range of locations within the queue to detect whether a conflict exists, the existing addresses having been previously stored responsive to one or more prior vectorized memory access instructions.
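The queue lookup can be modeled in software as follows. This is a rough sketch under stated assumptions: the class `AddressQueue` and its method are hypothetical names, a "conflict" is taken to mean exact address equality, and the comparison is done before the new address is stored.

```python
from typing import List, Optional


class AddressQueue:
    """Software model of a queue of addresses used for conflict detection."""

    def __init__(self, size: int) -> None:
        # None marks a location that has not been written yet.
        self.slots: List[Optional[int]] = [None] * size

    def check_and_store(self, slot: int, address: int, first: int, last: int) -> bool:
        """Compare `address` with the addresses in locations first..last
        (inclusive), then store it at `slot`. Returns True on a conflict."""
        window = self.slots[first:last + 1]
        conflict = any(existing == address for existing in window if existing is not None)
        self.slots[slot] = address
        return conflict
```

Restricting the comparison to a range of locations lets an access be checked only against the earlier accesses it could actually have been reordered with.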
Abstract:
Methods and apparatus relating to conjugate code generation for efficient dynamic optimizations are described. In an embodiment, a binary code and an intermediate representation (IR) code are generated based at least partially on a source program. The binary code and the IR code are transmitted to a virtual machine logic. The binary code and the IR code each include a plurality of regions that are in one-to-one correspondence. Other embodiments are also claimed and described.
Abstract:
Technologies for persistent memory programming include a computing device having a persistent memory including one or more nonvolatile regions. The computing device may assign a virtual memory address of a target location in persistent memory to a persistent memory pointer using a persistent pointer strategy, and may dereference the pointer using the same strategy. Persistent pointer strategies include off-holder, ID-in-value, optimistic rectification, and pessimistic rectification. The computing device may log changes to persistent memory during the execution of a data consistency section, and commit changes to the persistent memory when the last data consistency section ends. Data consistency sections may be grouped by log group identifier. Using type metadata stored in the nonvolatile region, the computing device may identify the type of a root object within the nonvolatile region and then recursively identify the type of all objects referenced by the root object. Other embodiments are described and claimed.
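The logging behavior of data consistency sections can be illustrated with a toy model. This is only a sketch: `PersistentRegion` and its methods are hypothetical names, a dictionary stands in for nonvolatile memory, the log is assumed to be a redo log, and log group identifiers are left out.

```python
class PersistentRegion:
    """Toy model: writes inside a data consistency section are logged and
    only applied when the outermost (last) section ends."""

    def __init__(self) -> None:
        self.data = {}    # stands in for the nonvolatile region
        self._log = {}    # pending writes (redo log, an assumption)
        self._depth = 0   # number of currently open sections

    def begin_section(self) -> None:
        self._depth += 1

    def write(self, key, value) -> None:
        if self._depth > 0:
            self._log[key] = value   # defer until commit
        else:
            self.data[key] = value   # outside any section: write through

    def end_section(self) -> None:
        if self._depth == 0:
            raise RuntimeError("end_section without matching begin_section")
        self._depth -= 1
        if self._depth == 0:
            # Last open section ended: commit all logged changes together.
            self.data.update(self._log)
            self._log.clear()
```

With nested sections, nothing reaches `data` until the final `end_section` call, so a failure in the middle leaves the stored state unchanged in this model.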
Abstract:
In one embodiment, the present invention includes a software-controlled method of forming instruction strands. The software may include instructions to obtain code of a superblock including a plurality of basic blocks, build a dependency directed acyclic graph (DAG) for the code, sort nodes coupled by edges of the dependency DAG into a topological order, form strands from the nodes based on hardware constraints, rule constraints, and scheduling constraints, and generate executable code for the strands and store the executable code in a storage. Other embodiments are described and claimed.
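The sort-then-form pipeline can be sketched as below. This is an illustration under assumptions, not the patented method: the function names are hypothetical, the topological sort is Kahn's algorithm with alphabetical tie-breaking, and strand formation is a simple greedy rule in which a single `max_len` parameter stands in for the hardware, rule, and scheduling constraints.

```python
from collections import defaultdict


def topological_order(nodes, edges):
    """Kahn's algorithm; ties are broken by sorting so the result is stable."""
    succs = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        succs[src].append(dst)
        indegree[dst] += 1
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for nxt in succs[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
        ready.sort()
    if len(order) != len(nodes):
        raise ValueError("dependency graph has a cycle")
    return order


def form_strands(nodes, edges, max_len):
    """Greedy strand formation: extend a strand when the node depends on the
    strand's current tail and the strand is shorter than max_len."""
    preds = defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)
    strands = []
    tail_of = {}  # node -> index of the strand whose tail it currently is
    for node in topological_order(nodes, edges):
        target = None
        for p in preds[node]:
            idx = tail_of.get(p)
            if idx is not None and len(strands[idx]) < max_len:
                target = idx
                break
        if target is None:
            strands.append([node])
            tail_of[node] = len(strands) - 1
        else:
            del tail_of[strands[target][-1]]
            strands[target].append(node)
            tail_of[node] = target
    return strands
```

For example, with edges a→b, b→c, and a→d and `max_len=3`, the chain a, b, c forms one strand and d starts its own, because a is no longer the tail of its strand when d is visited.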
Abstract:
Methods and systems to identify and reproduce concurrency bugs in multi-threaded programs are disclosed. An example method disclosed herein includes defining a data type. The data type includes a first predicate that is associated with a first thread of a multi-threaded program, the first predicate being associated with a first condition, a second predicate that is associated with a second thread of the multi-threaded program, the second predicate being associated with a second condition, and an expression that defines a relationship between the first predicate and the second predicate. The relationship, when satisfied, causes a concurrency bug to be detected. A concurrency bug detector conforming to the data type is used to detect the concurrency bug in the multi-threaded program.
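One way to picture such a data type is a record holding two predicates and a relation over their results. The sketch below is an assumption-laden illustration: `ConcurrencyBugSpec`, its field names, and the dictionary-based thread state are all hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping

ThreadState = Mapping[str, Any]  # hypothetical snapshot of one thread's state


@dataclass(frozen=True)
class ConcurrencyBugSpec:
    first_predicate: Callable[[ThreadState], bool]   # condition on the first thread
    second_predicate: Callable[[ThreadState], bool]  # condition on the second thread
    relation: Callable[[bool, bool], bool]           # expression relating the two

    def detects(self, first_state: ThreadState, second_state: ThreadState) -> bool:
        """Return True when the relationship between the predicates is satisfied."""
        return self.relation(
            self.first_predicate(first_state),
            self.second_predicate(second_state),
        )


# Example: report a bug when both threads are inside the same critical section.
both_in_critical = ConcurrencyBugSpec(
    first_predicate=lambda s: s["in_critical"],
    second_predicate=lambda s: s["in_critical"],
    relation=lambda a, b: a and b,
)
```

Keeping the relation separate from the predicates means the same pair of conditions can be reused for different bug patterns, for instance by swapping the conjunction for an ordering check.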
Abstract:
Example methods and apparatus to manage partial commit-checkpoints are disclosed. A disclosed example method includes identifying a commit instruction associated with a region of instructions executed by a processor, identifying candidate instructions from the region of instructions, and generating a processor partial commit-checkpoint to save a current state of the processor, the checkpoint based on calculated register values associated with live instructions, and including instruction reference addresses to link the candidate instructions.
Abstract:
An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enable a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, for example by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. The processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
Abstract:
A system may comprise an optimizer/scheduler to schedule a set of instructions, compute a data dependence, a checking constraint, and/or an anti-checking constraint for the set of scheduled instructions, and allocate alias registers for the set of scheduled instructions based on the data dependence, the checking constraint, and/or the anti-checking constraint. In one embodiment, the optimizer is to release unused registers to reduce the number of alias registers used to protect the scheduled instructions. The optimizer is further to insert a dummy instruction after a fused instruction to break cycles in the checking and anti-checking constraints.