摘要:
A processor includes an instruction pipeline and control circuitry. The instruction pipeline is configured to process instructions of program code. The control circuitry is configured to monitor the processed instructions at run-time, to construct an invocation data structure comprising multiple entries, wherein each entry (i) specifies an initial instruction that is a target of a branch instruction, (ii) specifies a portion of the program code that follows one or more possible flow-control traces beginning from the initial instruction, and (iii) specifies, for each possible flow-control trace specified in the entry, a next entry that is to be processed following processing of that possible flow-control trace, and to configure the instruction pipeline to process segments of the program code, by continually traversing the entries of the invocation data structure.
摘要:
A method includes, in a processor having a pipeline, fetching instructions of program code at run-time, in an order that is different from an order-of-appearance of the instructions in the program code. The instructions are divided into segments having segment identifiers (IDs). An event, which warrants flushing of instructions starting from an instruction belonging to a segment, is detected. In response to the event, at least some of the instructions in the segment that are subsequent to the instruction, and at least some of the instructions in one or more subsequent segments that are subsequent to the segment, are flushed from the pipeline based on the segment IDs.
摘要:
A method includes, in a processor (20) that executes instructions of program code, monitoring instructions of a repetitive sequence of the instructions that traverses a flow-control trace so as to construct a specification of register access by the monitored instructions. Based on the specification, multiple hardware threads are invoked to execute respective segments of the repetitive instruction sequence at least partially in parallel. Monitoring of the instructions continues in at least one of the segments during execution.
摘要:
A method includes, in a processor (20), processing program code that includes memory- access instructions, wherein at least some of the memory-access instructions include symbolic expressions that specify memory addresses in an external memory (41) in terms of one or more register names. A relationship between the memory addresses accessed by two or more of the memory-access instructions is identified, based on respective formats of the memory addresses specified in the symbolic expressions. An outcome of at least one of the memory-access instructions is assigned to be served from an internal memory (50) in the processor, based on the identified relationship.
摘要:
A method includes, in a processor (20) that processes instructions of program code, processing one or more of the instructions by a first hardware thread. Upon detecting that an instruction defined as a parallelization point has been fetched for the first thread, a second hardware thread is invoked to process at least one of the instructions at least partially in parallel with processing of the instructions by the first hardware thread.
摘要:
A method includes, in a processor that executes instructions of program code, identifying a region of the code containing one or more segments of the instructions that are at least partially repetitive. The instructions in the region are monitored, and an approximate specification of register access by the monitored instructions is constructed for the region. Execution of the segments in the region is parallelized using the specification.
摘要:
A method includes, in a processor that executes instructions of program code, monitoring the instructions in a segment of a repetitive sequence of the instructions so as to construct a specification of register access by the monitored instructions. In response to detecting a branch mis-prediction in the monitored instructions, the specification is corrected so as to compensate for the branch mis-prediction. Execution of the repetitive sequence is parallelized based on the corrected specification.
摘要:
A method includes processing a sequence of instructions of program code that are specified using one or more architectural registers, by a hardware-implemented pipeline that renames the architectural registers in the instructions so as to produce operations specified using one or more physical registers. At least first and second segments of the sequence of instructions are selected, wherein the second segment occurs later in the sequence than the first segment. One or more of the architectural registers in the instructions of the second segment are renamed, before completing renaming the architectural registers in the instructions of the first segment, by pre-allocating one or more of the physical registers to one or more of the architectural registers.
摘要:
A method includes, in a processor that processes instructions of program code, processing a first segment of the instructions. One or more destination registers are identified in the first segment using an approximate specification of register access by the instructions. Respective values of the destination registers are made available to a second segment of the instructions only upon verifying that the values are valid for readout by the second segment in accordance with the approximate specification. The second segment is processed at least partially in parallel with processing of the first segment, using the values made available from the first segment.