Abstract:
Generally, this disclosure provides technologies for generating and executing partially vectorized code that may include backward dependencies within a loop body of the code to be vectorized. The method may include identifying backward dependencies within a loop body of the code; selecting one or more ranges of iterations within the loop body, wherein the selected ranges exclude the identified backward dependencies; and vectorizing the selected ranges. The system may include a vector processor configured to provide predicated vector instruction execution, loop iteration range enabling, and dynamic loop dependence checking.
Abstract:
A method to optimize speculative parallel thread execution comprises selecting a plurality of partition candidate pairs for speculative parallel thread execution, transforming each partition candidate pair of the plurality of partition candidate pairs to improve the expected performance gain of each pair, and selecting a set of one or more transformed partition candidate pairs that do not interfere with each other and produce a maximum expected performance gain.
Abstract:
A method for statement shifting to increase the parallelism of loops includes constructing a data dependence graph (DDG) to represent dependences between statements in a loop, constructing a basic equations group from the DDG, constructing a dependence equations group derived in part from the basic equations group, and determining a shifting vector for the loop from the dependence equations group, wherein the shifting vector to represent an offset to apply to each statement in the loop for statement shifting. Other embodiments are also disclosed.
Abstract:
A Markov chain model of a software system may be used to compute all-pairs reaching probabilities to provide guidance in performing speculative operations with respect to the software system.
Abstract:
A method to provide effective control and data flow information in an Intermediate Representation (IR) form. A Path Sensitive single Assignment (PSA) IR form with effective and explicit control and data path information supports control flow sensitive optimizations such as path sensitive symbolic substitution, array privatization and speculative multi threading. In the definition of PSA form, besides defining new versioned variables, the gamma functions keep control path information. The gamma function in PSA form keeps the basic attribute of SSA IR form and only one definition exists for each use. Therefore, all existing Single Static Assignment (SSA) IR form based analysis can be applied in PSA form. The gamma function in PSA form keeps all essential control flow information and eliminates unnecessary predicates at the same time.
Abstract:
A method to optimize speculative parallel thread execution comprises selecting a plurality of partition candidate pairs for speculative parallel thread execution, transforming each partition candidate pair of the plurality of partition candidate pairs to improve the expected performance gain of each pair, and selecting a set of one or more transformed partition candidate pairs that do not interfere with each other and produce a maximum expected performance gain.
Abstract:
A method and system to provide user-level multithreading are disclosed. The method according to the present techniques comprises receiving programming instructions to execute one or more shared resource threads (shreds) via an instruction set architecture (ISA). One or more instruction pointers are configured via the ISA; and the one or more shreds are executed simultaneously with a microprocessor, wherein the microprocessor includes multiple instruction sequencers.
Abstract:
Methods and apparatus to predict software values are disclosed. In one example, a method identifies a variable associated with one or more machine readable instructions, determines a predicted value of the variable based on a pattern, generates a value prediction instruction to predict a run-time value using the predicted value of the variable based on the pattern, and combines the value prediction instruction with the one or more machine readable instructions.
Abstract:
Methods and apparatus are disclosed to compile programs to use speculative parallel threads. An example method disclosed herein identifies a set of speculative parallel thread candidates; determines misspeculation cost values for at least some of the speculative parallel thread candidates; selects a set of speculative parallel threads from the set of speculative parallel thread candidates based on the cost values; and generates program code based on the set of speculative parallel threads.
Abstract:
A method and apparatus for enabling the speculative forking of a speculative thread is disclosed. In one embodiment, a speculative fork instruction is conditioned by the results of a fork predictor. The fork predictor may issue predictions as to whether or not a speculative thread would execute desirably. The fork predictor may be implemented as a modified branch predictor circuit, and may have execution history updates entered by a determination of whether or not the execution of a speculative thread was or would have been desirable.