Abstract:
A method and device for return instruction prediction in microprocessors and digital signal processors. The method and device use a return target buffer, in which a return instruction address table stores the addresses of return instructions and a return target stack stores the target pointers of return instructions, so that correct prediction results can be provided in the fetch stage of a pipeline.
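The two structures named in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name, the method names, the stack depth, the fixed call size, and the policy of learning return addresses when a return resolves are all hypothetical, since the abstract does not say how the table is filled or when the stack is pushed.

```python
class ReturnTargetBuffer:
    """Toy model: a table of return-instruction addresses plus a stack of return targets."""

    def __init__(self, stack_depth=8):
        self.return_addrs = set()       # return instruction address table
        self.target_stack = []          # return target stack
        self.stack_depth = stack_depth  # hypothetical capacity

    def on_call(self, call_addr, call_size=4):
        # Assumption: a call pushes the address of the instruction that follows it.
        if len(self.target_stack) == self.stack_depth:
            self.target_stack.pop(0)    # drop the oldest entry on overflow
        self.target_stack.append(call_addr + call_size)

    def on_return_resolved(self, return_addr):
        # Assumption: the table learns a return's address once it has executed.
        self.return_addrs.add(return_addr)

    def predict_at_fetch(self, fetch_addr):
        # A prediction is made only if the fetch address is a known return
        # and the stack holds a target; otherwise no prediction (None).
        if fetch_addr in self.return_addrs and self.target_stack:
            return self.target_stack.pop()
        return None
```

Because the lookup is keyed on the fetch address alone, the model can produce a target without first decoding the instruction, which is the property the abstract attributes to predicting in the fetch stage.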
Abstract:
An accelerator 120 is tightly coupled to the normal execution unit 110. The operand store, which could be a register file 130, a stack-based operand store or another operand store, is shared by the execution unit and the accelerator. Operands may also be accessed as immediate values within the instructions themselves. The sequences of individual program instructions corresponding to computational subgraphs remain within a program but can be recognized by the accelerator as suitable for acceleration and, when encountered, are executed by the accelerator instead of by the normal execution unit. Within such a tightly coupled arrangement, problems can arise due to a lack of register resources within the system. The present technique provides that at least some intermediate operand values which are generated within the accelerator, but are determined not to be referenced outside of the computational subgraph concerned, are not written to the operand store.
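The write-back decision described here can be sketched as a simple set computation. The function name, the tuple representation of a subgraph, and the `live_after` set (operands referenced after the subgraph) are assumptions made for illustration; the abstract does not describe how the determination is actually made.

```python
def split_results(subgraph, live_after):
    """Separate values that must reach the operand store from internal-only ones.

    subgraph:   list of (dest, sources) tuples, one per operation (hypothetical format)
    live_after: set of operand names referenced outside the subgraph (assumed known)
    """
    produced = {dest for dest, _sources in subgraph}
    write_back = produced & live_after     # referenced outside: must be stored
    internal_only = produced - live_after  # intermediate: can stay in the accelerator
    return write_back, internal_only
```

In this model, every value in `internal_only` is one that never needs a slot in the shared operand store, which is how the technique eases the register pressure mentioned above.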
Abstract:
There is provided an information processor for executing a program comprising a plurality of separate program instructions: processing logic operable to individually execute said separate program instructions of said program; an operand store operable to store operand values; and an accelerator having an array comprising a plurality of functional units, said accelerator being operable to execute a combined operation corresponding to a computational subgraph of said separate program instructions by configuring individual ones of said plurality of functional units to perform particular processing operations associated with one or more processing stages of said combined operation; wherein said accelerator executes said combined operation in dependence upon operand mapping data providing a mapping between operands of said combined operation and storage locations within said operand store and in dependence upon separately specified configuration data providing a mapping between said plurality of functional units and said particular processing operations such that said configuration data can be re-used for different operand mappings.
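The separation between configuration data and operand mapping data can be illustrated in software. The following is a minimal sketch, assuming a hypothetical configuration format of `(operation, source, source, destination)` tuples over abstract slot names and a hypothetical operand map with `inputs` and `outputs` entries; none of these names come from the abstract.

```python
import operator

# Hypothetical set of operations a functional unit can be configured to perform.
OPS = {"add": operator.add, "sub": operator.sub,
       "and": operator.and_, "shl": operator.lshift}

# Configuration data: which operation each stage performs, over abstract slots.
CONFIG = [("add", "in0", "in1", "t0"),
          ("shl", "t0", "in2", "out0")]

def run_subgraph(config, operand_map, regs):
    """Execute one combined operation; regs models the operand store (updated in place)."""
    # Operand mapping data binds abstract input slots to storage locations.
    slots = {slot: regs[reg] for slot, reg in operand_map["inputs"].items()}
    for op, a, b, dst in config:
        slots[dst] = OPS[op](slots[a], slots[b])
    # Only mapped outputs are written back to the operand store.
    for slot, reg in operand_map["outputs"].items():
        regs[reg] = slots[slot]
    return regs
```

Calling `run_subgraph` twice with the same `CONFIG` but two different operand maps reuses the configuration unchanged, which is the reuse property the abstract describes.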
Abstract:
A method and apparatus for decompressing relative addresses. A compressed relative address is retrieved from one or more micro-operation entries of a micro-operation storage and an uncompressed relative address is reconstructed from the compressed relative address and an instruction pointer (IP) address associated with the head of the micro-operation storage line in which the compressed relative address was stored. IP-relative addresses may be computed in a manner similar to relative branch targets, then compressed and stored in one or more micro-operation entries of a micro-operation storage line to be reconstructed later according to an IP address associated with the respective micro-operation storage line in which their compressed counterpart was stored.
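One way to picture the compression and reconstruction steps is as a signed offset from the IP of the line head. This is a sketch under assumptions: the 16-bit field width, the function names, and the offset encoding itself are hypothetical, because the abstract does not specify how the compressed form is encoded.

```python
OFFSET_BITS = 16  # hypothetical width of the compressed field

def compress(address, line_head_ip):
    """Store an address as a signed offset from the line-head IP (assumed encoding)."""
    delta = address - line_head_ip
    lo = -(1 << (OFFSET_BITS - 1))
    hi = (1 << (OFFSET_BITS - 1)) - 1
    if not lo <= delta <= hi:
        raise ValueError("address too far from line head to compress")
    return delta & ((1 << OFFSET_BITS) - 1)  # two's-complement truncation

def decompress(compressed, line_head_ip):
    """Rebuild the full address from the compressed field and the line-head IP."""
    sign_bit = 1 << (OFFSET_BITS - 1)
    delta = (compressed ^ sign_bit) - sign_bit  # sign-extend the field
    return line_head_ip + delta
```

The round trip only works when the same line-head IP is supplied to both functions, which mirrors the abstract's requirement that reconstruction use the IP associated with the line in which the compressed value was stored.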
Abstract:
A return address for a return instruction corresponding to a call instruction is stored in a return address stack when a branch history detects the presence of the call instruction. When the branch history detects the presence of the return instruction before a branch reservation station completes executing the call instruction, the return address for the return instruction is not stored in the return address stack. In that case, an output selection circuit predicts a correct return target using information stored in the return address stack.
Abstract:
A processor is configured to support programmable flag masking during processing of a system call instruction such as Syscall. The processor includes a register storing a mask, where an indication within the mask corresponds to each of a plurality of flags used by the processor. Based on the state of the indication, the processor may clear a corresponding flag or may retain the value of the corresponding flag. By programming the register appropriately, the desired clearing and retaining of the plurality of flags may be performed as part of the system call instruction. Flexibility may be provided for different operating systems having different sets of flags to be preserved or cleared.
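The masking step can be sketched as a single bitwise operation. The polarity is an assumption: here a set mask bit means "clear the corresponding flag", which the abstract does not state. The flag constants use the x86 EFLAGS bit positions for CF, IF and DF purely as illustrative values, and the function name is hypothetical.

```python
# Illustrative flag bits (x86 EFLAGS positions, used here only as example values).
CF = 1 << 0   # carry flag
IF = 1 << 9   # interrupt enable flag
DF = 1 << 10  # direction flag

def apply_syscall_flag_mask(flags, mask):
    """Clear each flag whose mask bit is set; retain all others (assumed polarity)."""
    return flags & ~mask
```

Under this model, an operating system that wants interrupts disabled and the direction flag cleared on entry would program the mask register with `IF | DF`, while another operating system could program a different set without any change to the instruction itself.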
Abstract:
Method and apparatus for instrumentation of an executable computer program that includes a predicated branch-call instruction followed by a call-shadow instruction. The predicated branch-call instruction and the call-shadow instruction are stored in a first bundle of instructions, which is followed by a second bundle. The predicated branch-call instruction is changed to a predicated branch instruction that targets a fifth bundle of instructions, and the predicate of the predicated branch instruction is the same as the predicate of the predicated branch-call instruction. Third, fourth, and fifth bundles are created to preserve program semantics. The third bundle is inserted following the first bundle and includes the call-shadow instruction. The fourth bundle is inserted following the third bundle and includes a branch instruction that targets the second bundle. The fifth bundle is inserted following the fourth bundle and includes a branch-call instruction that has a target address equal to the target address of the predicated branch-call instruction. Instrumentation instructions are then inserted.
Abstract:
A high-performance, low-cost information processing technique is provided that keeps an instruction buffer updated for effective prefetching across branch instructions and subroutine returns with a small amount of hardware. An information processing apparatus is equipped with a CPU, a memory, prefetch means and the like, wherein a prefetch address generator unit in the prefetch means decodes a branching series of instructions, including at least one branch address calculating instruction and a branching instruction to the branch address, out of a current instruction buffer storing the series of instructions currently accessed by the CPU, and thereby looks ahead to the branch destination address. The information processing apparatus further comprises an RTS instruction buffer for storing a series of instructions at the return destinations of RTS instructions, and the series of instructions stored in the current instruction buffer is saved into the RTS instruction buffer.
Abstract:
In a system where multiple instruction pipes share access to a common return stack buffer (RSB), coordination is provided to ensure that no instruction pipe gains unfair access to the RSB. Additionally, further coordination control may be provided to ensure that a pipe operates upon valid data notwithstanding communication delays that may be present in a communication path between the pipe and the RSB. In one embodiment, if a system must gain access to the RSB, it determines with reference to prior accesses to the RSB whether immediate access to the RSB would exceed a predetermined access rate. If so, it delays its attempt to access the RSB until it re-synchronizes to the access rate. In another embodiment, it delays use of data from the RSB until communication delays are overcome.
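The rate check in the first embodiment can be sketched as a small scheduling function. This is an illustration under assumptions: the function name, the cycle-count model, and the interpretation of the access rate as "at most one access every `period` cycles" are hypothetical, since the abstract does not define how the rate is expressed.

```python
def next_access_cycle(now, last_access, period):
    """Return the earliest cycle at which the RSB may be accessed.

    now:         current cycle
    last_access: cycle of the previous RSB access, or None if there was none
    period:      assumed minimum spacing between accesses, in cycles
    """
    if last_access is None:
        return now                  # no prior access: proceed immediately
    earliest = last_access + period
    return max(now, earliest)       # delay only if accessing now would exceed the rate
```

When the returned cycle is later than `now`, the difference is the delay the requester waits before it is back in step with the permitted rate.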
Abstract:
The present invention relates to a central processing unit comprising: (a) a number of functional units (A, B, ..., N), and (b) at least one module for processing of a function call received from one of the functional units, the module having a decoder to obtain an instruction address from the function call, a memory for storing a plurality of control instructions and a plurality of branch instructions, each control instruction having an assigned instruction address for a next instruction and each branch instruction having at least two assigned alternative instruction addresses for a next instruction, first logic circuitry for processing of the branch instructions in order to select one of the at least two alternative instruction addresses of one of the branch instructions, and second logic circuitry for processing of the control instructions in order to return a result in response to the function call.