Abstract:
A data processor (10) has an execution unit (18, 20) for generating the address of each requested data double-word. When a requested double-word is not found in the data processor's memory cache, the data processor fetches the entire memory line containing it, four double-words of data. The data processor ultimately stores the requested data in the memory cache (40) once it is returned from an external memory system. The data processor also has forwarding circuitry (48, 50) for forwarding previously requested double-words directly to the execution unit under certain circumstances. The forwarding circuitry will forward a requested double-word if the data processor has not crossed a memory line boundary since the last memory cache miss and if the two least significant bits of the requested and received double-words logically match.
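The forwarding condition in the last sentence can be sketched as a small predicate. This is only an illustration under stated assumptions: the 8-byte double-word size is not given in the abstract, the "two least significant bits" are read here as the index of a double-word within its four-double-word line, and "has not crossed a memory line boundary" is modeled as the request falling in the same line as the last miss. All function and parameter names are hypothetical.

```python
DOUBLE_WORD_BYTES = 8        # assumption: the abstract does not state the double-word size
DOUBLE_WORDS_PER_LINE = 4    # from the abstract: a memory line holds four double-words
LINE_BYTES = DOUBLE_WORD_BYTES * DOUBLE_WORDS_PER_LINE


def line_number(address):
    """Return which memory line a byte address falls in."""
    return address // LINE_BYTES


def double_word_index(address):
    """Return the 2-bit index (0..3) of the double-word within its line."""
    return (address // DOUBLE_WORD_BYTES) % DOUBLE_WORDS_PER_LINE


def can_forward(requested_address, miss_address, received_index):
    """Sketch of the forwarding test.

    requested_address: address of the double-word the execution unit wants now
    miss_address:      address that caused the last cache miss (line in flight)
    received_index:    2-bit index of the double-word arriving from memory
    """
    same_line = line_number(requested_address) == line_number(miss_address)
    index_match = double_word_index(requested_address) == received_index
    return same_line and index_match
```

For example, with a miss at `0x100`, a later request for `0x108` can be forwarded when the double-word with index 1 arrives, but a request for `0x128` cannot, because it lies in the next line.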
Abstract:
A data processor (10) has a BTAC (48) storing a number of recently encountered fetch address-target address pairs. A branch unit (20) generates a fetch address that depends upon a condition precedent and a received branch instruction. After executing each branch instruction, the branch unit predicts whether the condition precedent will be met the next time it encounters the same branch instruction. If the predicted value of the condition precedent would cause the branch to be taken, then the branch unit adds the fetch address-target address pair corresponding to the branch instruction to the BTAC. If the predicted value of the condition precedent would cause the branch to be not taken, then the branch unit deletes the fetch address-target address pair corresponding to the branch instruction from the BTAC.
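The add-or-delete policy described above can be sketched as follows. This is a minimal model, not the patented circuit: the BTAC is represented as an unbounded dictionary (a real BTAC has a fixed number of entries), and the prediction is passed in as a boolean because the abstract does not say how it is computed. The class and method names are hypothetical.

```python
class SimpleBTAC:
    """Toy branch target address cache: fetch address -> target address."""

    def __init__(self):
        self.entries = {}

    def lookup(self, fetch_address):
        """Return the cached target address, or None on a miss."""
        return self.entries.get(fetch_address)

    def update(self, fetch_address, target_address, predicted_taken):
        """Apply the policy after a branch executes.

        predicted_taken is the branch unit's prediction for the NEXT
        encounter of this branch (how it is produced is outside this sketch).
        """
        if predicted_taken:
            # Predicted taken next time: add (or refresh) the pair.
            self.entries[fetch_address] = target_address
        else:
            # Predicted not taken next time: delete the pair if present.
            self.entries.pop(fetch_address, None)
```

Deleting with `pop(..., None)` keeps the update safe when the branch was never cached in the first place.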
Abstract:
A data processor (10) has a BTAC (48) storing a number of recently encountered fetch address-target address pairs. Each pair also includes an offset tag identifying which one of a plurality of instructions indexed by the fetch address generated the entry. A branch unit (20) generates an execution address that depends upon one of the plurality of instructions. After executing each instruction, the branch unit may delete an entry from the BTAC if the instruction's execution address differs from the target address and if the instruction is the same instruction which generated the BTAC entry initially.
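The role of the offset tag in the deletion rule can be illustrated with a short sketch. It assumes one entry per fetch address and an unbounded table, neither of which the abstract specifies. The abstract says the branch unit "may" delete the entry, and this sketch simply always deletes when both conditions hold. All names are hypothetical.

```python
class TaggedBTAC:
    """Toy BTAC whose entries carry an offset tag.

    entries maps fetch address -> (target address, offset tag), where the
    offset tag records which instruction in the fetched group made the entry.
    """

    def __init__(self):
        self.entries = {}

    def add(self, fetch_address, target_address, offset):
        self.entries[fetch_address] = (target_address, offset)

    def after_execute(self, fetch_address, offset, execution_address):
        """Delete the entry only if this instruction created it and its
        execution address differs from the cached target address.

        Returns True when an entry was deleted.
        """
        entry = self.entries.get(fetch_address)
        if entry is None:
            return False
        target_address, offset_tag = entry
        if execution_address != target_address and offset == offset_tag:
            del self.entries[fetch_address]
            return True
        return False
```

The tag check prevents a different instruction that shares the same fetch address from evicting an entry it did not create.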
Abstract:
A mechanism, which supports predictive register cache allocation and entry, uses a counter look-up table to determine the potential significances of physical register references.
Abstract:
Disclosed are an apparatus, system, and method for implementing predicated instructions using micro-operations. A micro-code engine receives an instruction, decomposes the instruction, and generates a plurality of micro-operations to implement the instruction. Each of the decomposed micro-operations indicates a single destination register. For predicated instructions, the decomposed micro-operations include “conditional move” micro-operations to select between two potential output values. Except in the case that one of the potential output values is a constant, the decomposed micro-operations for a predicated instruction also include an append instruction that saves the incoming value of a destination register in a temporary variable. For at least one embodiment, the qualifying predicate for a predicated instruction is appended to the incoming value stored in the temporary register.
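One possible reading of this decomposition, for a predicated add, is sketched below. The micro-operation names, the tuple encoding, the temporary names `t_old` and `t_new`, and the ordering of the micro-operations are all assumptions made for illustration. The abstract only states that each micro-operation has a single destination, that a conditional move selects between two candidate values, and that an append operation saves the incoming destination value together with the qualifying predicate.

```python
def decompose_predicated_add(pred, dest, src1, src2):
    """Decompose '(pred) dest = src1 + src2' into single-destination micro-ops.

    Each micro-op is (opcode, destination, operand_a, operand_b).
    """
    return [
        # Save the incoming value of dest, with the predicate appended to it.
        ("append", "t_old", dest, pred),
        # Compute the new value unconditionally into a temporary.
        ("add", "t_new", src1, src2),
        # Select the new value if the predicate is true, else the old value.
        ("cmov", dest, "t_new", "t_old"),
    ]


def run(micro_ops, regs):
    """Tiny interpreter for the three hypothetical micro-ops above."""
    regs = dict(regs)  # work on a copy so the caller's state is untouched
    for op, dst, a, b in micro_ops:
        if op == "append":
            regs[dst] = (regs[a], regs[b])      # (old value, predicate)
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "cmov":
            old_value, predicate = regs[b]
            regs[dst] = regs[a] if predicate else old_value
    return regs
```

Because the predicate travels with the saved value, the conditional move needs only two source operands, which is one plausible motivation for the append step.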
Abstract:
An apparatus including a first die including a plurality of conductive through substrate vias (TSVs); and a plurality of second dice each including a plurality of contact points coupled to the TSVs of the first die, the plurality of second dice arranged to collectively include a surface area approximating a surface area of the first die. A method including arranging a plurality of second dice on a first die such that collectively the plurality of second dice include a surface area approximating the surface area of the first die; and electrically coupling the plurality of second dice to the first die. A system including an electronic appliance including a printed circuit board and a module, the module including a first die including a plurality of TSVs; and a plurality of second dice arranged to collectively include a surface area approximating the surface area of the first die.
Abstract:
Disclosed are a multi-die processor apparatus and system. Processor logic to execute one or more instructions is allocated among two or more face-to-face stacked dice. The processor includes a conductive interface between the stacked dice to facilitate die-to-die communication.
Abstract:
A method and apparatus provide ready information to a scheduler. Dependence information is maintained in a relatively small map table, with potential loss of information when dependence information exceeds available space in the map table. Ready instructions are maintained, as space allows, in a select queue. Tags for scheduled instructions are maintained in a lookup queue, and dependency information for the scheduled instructions is maintained in an update queue, as space allows. Ready information for instructions in a scheduling window is updated based upon the information in the update queue. Loss of instruction information may occur, due to space limitations, at the map table, lookup queue, update queue, and/or select queue. Scheduling of lost instructions is handled by a lossy instruction handler.
Abstract:
A data processor (10) has a branch target address cache (48) for storing the target addresses of a number of recently taken branch instructions. Normally, each fetch address is compared to the contents of the branch target address cache. If a hit occurs, then the data processor branches to the cached target address. The data processor also has a dispatch unit (60) that invalidates the data stored in the branch target address cache if and when it determines that the branch target address cache "hit" on an instruction that was not a branch instruction at all, a "phantom branch." The data processor thereby automatically invalidates its branch target address cache data after a context switch.
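The phantom-branch check can be sketched as below. The sketch assumes that a phantom hit clears the whole cache, which is one reading of "invalidates the data stored in the branch target address cache", and that the dispatch stage already knows whether the fetched instruction is a branch. The class, method, and function names are hypothetical.

```python
class InvalidatingBTAC:
    """Toy branch target address cache: fetch address -> target address."""

    def __init__(self):
        self.targets = {}

    def record_taken_branch(self, fetch_address, target_address):
        self.targets[fetch_address] = target_address

    def predict(self, fetch_address):
        """Return the cached target address, or None on a miss."""
        return self.targets.get(fetch_address)

    def invalidate_all(self):
        self.targets.clear()


def dispatch(btac, fetch_address, is_branch):
    """Model of the dispatch-time check.

    If the BTAC hit on an address whose instruction turns out not to be a
    branch (a "phantom branch"), the cached data is stale, for example
    because it belongs to a previous context, so the cache is invalidated.
    Returns True when an invalidation happened.
    """
    hit = btac.predict(fetch_address) is not None
    if hit and not is_branch:
        btac.invalidate_all()
        return True
    return False
```

After a context switch, the first phantom hit under the new address mapping would trigger the invalidation, so no explicit flush instruction is needed in this model.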