Abstract:
A microprocessor includes a fetch unit, an instruction cracking unit, and dispatch and completion control logic. The fetch unit retrieves a set of instructions from an instruction cache. The instruction cracking unit receives the set of fetched instructions and organizes the set of instructions into an instruction group. The dispatch and completion control logic assigns a group tag to the instruction group and records the group tag in an entry of a completion table for tracking the completion status of the instructions comprising the instruction group. The dispatch and completion control logic may record a single instruction address in the completion table entry corresponding to each instruction group. Preferably, the single instruction address is the instruction address of the first instruction in the instruction group. The processor may flush the instruction group in response to detecting an exception generated by an instruction in the instruction group.
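The group-level bookkeeping described above can be sketched in software as follows. This is a minimal illustration, not the disclosed hardware: the class and method names (`CompletionTable`, `dispatch_group`, `flush_group`) are hypothetical, and returning the recorded first address on a flush, as a point from which fetching could restart, is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CompletionEntry:
    group_tag: int
    first_address: int  # the single instruction address recorded per group
    pending: int        # instructions in the group that have not finished


class CompletionTable:
    """Hypothetical model: one completion table entry per instruction group."""

    def __init__(self):
        self.entries = {}
        self.next_tag = 0

    def dispatch_group(self, addresses):
        """Assign a group tag and record only the first instruction's address."""
        tag = self.next_tag
        self.next_tag += 1
        self.entries[tag] = CompletionEntry(tag, addresses[0], len(addresses))
        return tag

    def finish_instruction(self, tag):
        """Mark one instruction finished; return True once the group completes."""
        entry = self.entries[tag]
        entry.pending -= 1
        if entry.pending == 0:
            del self.entries[tag]
            return True
        return False

    def flush_group(self, tag):
        """Discard the whole group after an exception.

        Assumption: the recorded first address is returned as the restart point.
        """
        return self.entries.pop(tag).first_address
```

Because only one address is stored per group, an exception anywhere in the group discards the group as a whole rather than a single instruction.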
Abstract:
A microprocessor and related method and data processing system are disclosed. The microprocessor includes a dispatch unit suitable for issuing an instruction executable by the microprocessor, an execution pipeline configured to receive the issued instruction, and a pending instruction unit. The pending instruction unit includes a set of pending instruction entries. A copy of the issued instruction is maintained in one of the set of pending instruction entries. The execution pipeline is adapted to record, in response to detecting a condition preventing the instruction from successfully completing one of the stages in the pipeline during a current cycle, an exception status with the copy of the instruction in the pending instruction unit and to advance the instruction to a next stage in the pipeline in the next cycle, thereby preventing the condition from stalling the pipeline. Preferably, the dispatch unit, in response to the instruction finishing pipeline execution with an exception status, is adapted to use the copy of the instruction to re-issue the instruction to the execution pipeline in a subsequent cycle. In one embodiment, the dispatch unit is adapted to deallocate the copy of the instruction in the pending instruction unit in response to the instruction successfully completing pipeline execution. The pending instruction unit may detect successful completion of the instruction by detecting when the instruction has been pending for a predetermined number of cycles without recording an exception status. In this embodiment, each entry in the pending instruction unit may include a timer field comprising a set of bits wherein the number of bits in the timer field equals the predetermined number of cycles. The pending instruction unit may set, in successive cycles, successive bits in the timer field such that successful completion of an instruction is indicated when a last bit in the timer field is set.
In one embodiment, the pending instruction unit includes a set of copies of instructions corresponding to each of a set of instructions pending in the execution pipeline at any given time. In various embodiments, the execution pipeline may comprise a load/store pipeline, a floating point pipeline, or a fixed point pipeline.
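The timer-field mechanism can be illustrated with a small software model. The sketch below assumes a hypothetical `PendingInstructionUnit` class that ages every entry once per cycle. Deciding between deallocation and re-issue at the moment the last timer bit is set is a simplification made for this example.

```python
class PendingEntry:
    def __init__(self, instruction, cycles):
        self.instruction = instruction  # copy of the issued instruction
        self.cycles = cycles            # number of bits in the timer field
        self.timer = 0                  # timer field, one bit per cycle
        self.exception = False          # set by the pipeline on a failed stage

    def tick(self):
        """Set the next successive bit of the timer field."""
        self.timer = ((self.timer << 1) | 1) & ((1 << self.cycles) - 1)

    def timer_expired(self):
        """True once the last bit of the timer field is set."""
        return bool(self.timer & (1 << (self.cycles - 1)))


class PendingInstructionUnit:
    """Hypothetical model of the pending instruction entries."""

    def __init__(self, cycles):
        self.cycles = cycles
        self.entries = []

    def issue(self, instruction):
        entry = PendingEntry(instruction, self.cycles)
        self.entries.append(entry)
        return entry

    def cycle(self):
        """Advance one cycle; return (completed, to_reissue) instruction lists."""
        completed, to_reissue = [], []
        for entry in list(self.entries):
            entry.tick()
            if entry.timer_expired():
                self.entries.remove(entry)
                if entry.exception:
                    to_reissue.append(entry.instruction)
                else:
                    completed.append(entry.instruction)
        return completed, to_reissue
```

With a three-bit timer field, an instruction that records no exception status is treated as complete on its third cycle, while one flagged with an exception status is handed back for re-issue.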
Abstract:
A microprocessor and method of processing instructions therein are disclosed. Initially, a sequence of instructions is dispatched by a dispatch unit of the microprocessor. A code sequence recognition unit (CSR) is configured to detect a short branch sequence within the sequence of instructions, where the short branch sequence includes a condition setting instruction, a conditional branch, and at least one additional instruction that is executed if the conditional branch is not taken. The short branch sequence is then internally converted to a predicated instruction sequence that includes the condition setting instruction and a predicated instruction corresponding to each additional instruction in the short branch sequence. The predicated instruction sequence is then executed in at least one functional unit of the processor. Detecting the short branch sequence may include calculating the relative branch address associated with the conditional branch instruction and comparing the relative branch address to a specified maximum. In one embodiment, the received sequence of instructions may be converted into an instruction group by the processor. In this embodiment, the specified maximum number of instructions in a short branch sequence may be a function of the number of instructions in an instruction group. In an embodiment where the conditional branch statement is preferably allocated to the last slot of the instruction group, the additional instructions in the short branch sequence are located in the next subsequent instruction group. Converting the short branch sequence to the predicated instruction sequence may include converting each additional instruction in the short branch sequence to an analogous predicated instruction.
In one embodiment, converting each additional instruction to its analogous predicated instruction includes determining a predicated instruction opcode for each additional instruction in the short branch sequence by adjusting the opcode of each additional instruction by a predetermined offset. In another embodiment, the opcode conversion may be accomplished with an opcode lookup table.
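As a rough illustration of the offset-based embodiment, the sketch below detects a short forward branch and replaces the skipped instructions with predicated forms. The `Instr` type, the value of `PREDICATE_OFFSET`, the limit `MAX_SKIP`, and the convention that a branch offset is counted in instructions are all assumptions made for this example. The condition setting instruction is assumed to sit immediately before the branch.

```python
from typing import NamedTuple

PREDICATE_OFFSET = 0x100  # hypothetical opcode offset to the predicated form
MAX_SKIP = 4              # hypothetical maximum number of skipped instructions


class Instr(NamedTuple):
    opcode: int
    is_cond_branch: bool = False
    branch_offset: int = 0  # relative target in instructions (branches only)


def convert_short_branch(seq, max_skip=MAX_SKIP):
    """Replace the first short branch sequence in seq with predicated code.

    A branch at index i with offset k targets index i + k, so it skips the
    k - 1 instructions that follow it when taken.
    """
    for i, ins in enumerate(seq):
        if not ins.is_cond_branch or i == 0:
            continue  # need a condition setting instruction before the branch
        skipped = ins.branch_offset - 1
        body = seq[i + 1 : i + 1 + skipped] if skipped > 0 else []
        is_short = (
            1 <= skipped <= max_skip
            and len(body) == skipped
            and not any(b.is_cond_branch for b in body)
        )
        if is_short:
            # Adjust each opcode by the fixed offset; the branch itself is dropped.
            predicated = [b._replace(opcode=b.opcode + PREDICATE_OFFSET) for b in body]
            return seq[:i] + predicated + seq[i + 1 + skipped :]
    return seq
```

The lookup-table embodiment would differ only in how the predicated opcode is derived, for example by indexing a dictionary keyed by the original opcode instead of adding a constant.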
Abstract:
A processor and data processing system suitable for dispatching an instruction to an issue unit are disclosed. The issue unit includes a primary issue queue and a secondary issue queue. The instruction is stored in the primary issue queue if the instruction is currently eligible to issue for execution. The instruction is stored in the secondary issue queue if the instruction is currently ineligible to issue for execution. An instruction may be moved from the primary issue queue to the secondary issue queue if the instruction is dependent upon results from another instruction. In one embodiment, the instruction may be moved from the primary issue queue to the secondary issue queue after issuing the instruction for execution. In this embodiment, the instruction may be maintained in the secondary issue queue for a specified duration. Thereafter, the secondary issue queue entry containing the instruction is deallocated if the instruction has not been rejected.
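The two-queue arrangement might be modeled as below. The `IssueUnit` class, the default hold duration of two cycles, and the choice to return a rejected instruction to the primary queue are assumptions for illustration. The abstract states only that an entry is deallocated if the instruction has not been rejected.

```python
class IssueUnit:
    """Hypothetical model of a primary and a secondary issue queue."""

    def __init__(self, hold_cycles=2):
        self.hold_cycles = hold_cycles
        self.primary = []    # instructions currently eligible to issue
        self.secondary = {}  # instruction -> cycles left (None = not yet eligible)

    def dispatch(self, instr, eligible):
        """Place a dispatched instruction according to its eligibility."""
        if eligible:
            self.primary.append(instr)
        else:
            self.secondary[instr] = None

    def issue(self):
        """Issue the oldest eligible instruction and park it in the secondary queue."""
        if not self.primary:
            return None
        instr = self.primary.pop(0)
        self.secondary[instr] = self.hold_cycles
        return instr

    def reject(self, instr):
        """Assumption: a rejected instruction becomes eligible to issue again."""
        del self.secondary[instr]
        self.primary.append(instr)

    def tick(self):
        """Age issued entries; deallocate those held for the full duration."""
        for instr, left in list(self.secondary.items()):
            if left is None:
                continue  # still waiting to become eligible
            if left <= 1:
                del self.secondary[instr]
            else:
                self.secondary[instr] = left - 1
```

Keeping issued instructions in the secondary queue for a fixed window frees primary queue slots immediately while preserving a copy in case of rejection.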
Abstract:
A microprocessor includes a first processor core and a second processor core. The first core includes a first processing block. The first processing block includes an execution unit suitable for executing a first type of instruction. The second core includes a second processing block. The second processing block includes an execution unit suitable for executing an instruction if the instruction is of the first type. The processor further includes a shared execution unit. The first and second processor cores are adapted to forward an instruction to the shared execution unit for execution if the instruction is of a second type. In one embodiment, the first type of instruction includes fixed point instructions, load/store instructions, and branch instructions and the second type of instruction includes floating point instructions.
Abstract:
A processor having improved branch prediction accuracy includes at least one execution unit that executes sequential instructions, a condition register, and a branch prediction circuit that predicts a condition register-dependent branch instruction by reference to a potentially stale condition register value to produce a speculative instruction fetch address. In a preferred embodiment, the processor includes branch execution circuitry that subsequently determines if the speculative instruction fetch address is correct by reference to a non-stale value of the condition register.
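A small sketch may help show the predict-then-verify flow. It assumes a hypothetical `Branch` record that tests a single condition register bit, with the branch taken when that bit is set, and a fixed four-byte instruction length. None of these details come from the abstract.

```python
from typing import NamedTuple


class Branch(NamedTuple):
    address: int     # address of the branch instruction
    target: int      # address fetched if the branch is taken
    cr_bit: int      # condition register bit the branch tests (assumed)
    length: int = 4  # assumed instruction length in bytes


def resolve(branch, cr_value):
    """Return the next fetch address implied by a condition register value."""
    taken = (cr_value >> branch.cr_bit) & 1
    return branch.target if taken else branch.address + branch.length


def predict_and_verify(branch, stale_cr, final_cr):
    """Predict with a possibly stale value, then check against the final one.

    Returns (speculative_address, prediction_correct, correct_address).
    """
    speculative = resolve(branch, stale_cr)
    correct = resolve(branch, final_cr)
    return speculative, speculative == correct, correct
```

When the stale and final values agree on the tested bit, the speculative fetch address is already correct and no redirect is needed.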
Abstract:
A microprocessor and method of processing instructions for addressing timing asymmetries are disclosed. A sequence of instructions including a first instruction and a second instruction is received. Dependency logic determines whether any dependencies exist between the first and second instructions. The dependency logic then selects between first and second issue queue partitions for storing the first and second instructions pending issue based upon the dependency determination, wherein the first issue queue partition issues instructions to a first execution unit and the second issue queue partition issues instructions to a second execution unit. The first and second issue queue partitions may be asymmetric with respect to a first register file in which instruction results are stored. The first and second instructions are then stored in the selected partitions. Selecting between the first and second issue queue partitions may include selecting a common issue queue partition for the first and second instructions if there is a dependency between the first and second instructions, and may be based upon a fairness algorithm if the first and second instructions lack dependencies.
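The partition selection rule can be sketched as follows. The abstract does not specify the fairness algorithm, so round-robin is assumed here. The `PartitionSelector` class and the representation of an instruction as a `(dest_reg, source_regs)` pair are likewise hypothetical.

```python
class PartitionSelector:
    """Hypothetical dependency logic choosing issue queue partitions."""

    def __init__(self, num_partitions=2):
        self.num_partitions = num_partitions
        self.next_partition = 0

    def _fair(self):
        """Assumed fairness algorithm: simple round-robin."""
        partition = self.next_partition
        self.next_partition = (partition + 1) % self.num_partitions
        return partition

    def select(self, first, second):
        """Return (partition_for_first, partition_for_second).

        Each instruction is a (dest_reg, source_regs) pair. The second
        instruction depends on the first if it reads the first's destination.
        """
        p_first = self._fair()
        dependent = first[0] in second[1]
        p_second = p_first if dependent else self._fair()
        return p_first, p_second
```

Placing a dependent pair in a common partition keeps the producer and consumer next to the same execution unit, which is where a result is available soonest when the partitions are asymmetric with respect to the register file.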
Abstract:
A processor having improved branch prediction accuracy includes at least one execution unit that executes sequential instructions and a plurality of branch prediction circuits including a lock acquisition branch prediction circuit that predicts a speculative execution path for a conditional branch instruction. The processor further includes a selector that selects the speculative execution path predicted by the lock acquisition branch prediction circuit in response to an indication that the conditional branch instruction is dependent upon lock acquisition. In a preferred embodiment, the indication that the conditional branch instruction is dependent upon lock acquisition is encoded within the conditional branch instruction.
Abstract:
A processor having improved branch prediction accuracy includes at least one execution unit that executes sequential instructions and branch processing circuitry that processes branch instructions. The branch processing circuitry includes a number of branch prediction circuits that are each capable of providing a branch prediction for a conditional branch instruction and a selector that selects a branch prediction of a branch prediction circuit based upon the type of condition upon which the conditional branch instruction depends. The selector preferably includes hardware that determines the type of condition upon which the conditional branch instruction depends by reference to an instruction context defined by one or more instructions adjacent the conditional branch instruction in programmed sequence. The branch processing circuitry further includes path address logic that determines a path address of the selected branch prediction. Thus, branch prediction accuracy can be improved by considering the type of condition upon which a conditional branch instruction depends, rather than just branch history.
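One way to picture the selector is the sketch below. The condition types (`lock`, `loop`, `other`), the mnemonics used to classify the preceding instruction, and the policy attached to each type are placeholders chosen for this example and do not describe the actual prediction circuits.

```python
def classify(prev_mnemonic):
    """Hypothetical: infer the condition type from the preceding instruction."""
    if prev_mnemonic == "store_conditional":
        return "lock"
    if prev_mnemonic == "decrement":
        return "loop"
    return "other"


def predict_taken(cond_type, history_taken):
    """Select one of several predictors based on the condition type."""
    predictors = {
        "lock": lambda: False,          # placeholder policy: fall through
        "loop": lambda: True,           # placeholder policy: loop again
        "other": lambda: history_taken, # fall back to branch history
    }
    return predictors[cond_type]()


def path_address(branch_addr, target, taken, length=4):
    """Path address logic: the address of the predicted path."""
    return target if taken else branch_addr + length


def predict(prev_mnemonic, branch_addr, target, history_taken):
    taken = predict_taken(classify(prev_mnemonic), history_taken)
    return path_address(branch_addr, target, taken)
```

The point of the structure is that branch history is consulted only when the instruction context does not identify a more specific condition type.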
Abstract:
A method and system in a data processing system for transferring data from a first device to a second device within the data processing system are disclosed. The data processing system includes a data bus, an address bus, a first address space associated with a memory and a second address space associated with an input/output device. Initially, an operation request package is transmitted to a second device from a first device, which informs the second device of the total amount of data to be transferred. A transfer signal is then transmitted in the data processing system. The transfer signal identifies the transfer as a transfer concerning an address in the second address space associated with the input/output device. A first address package is then transmitted to the second device from the first device on the address bus. The first address package includes a transfer identifier, a first identifier associated with the first device and a second identifier associated with the second device. A second address package, comprising a byte count and an address, is then transmitted to the second device from the first device on the address bus. If data is to be transferred, the data is then transferred on the data bus. Finally, a reply signal may be transmitted between the first and second devices, acknowledging the success or failure of the data transfer.