Abstract:
A method and apparatus for storing and utilizing set prediction information regarding which set of a set-associative memory will be accessed for enhancing performance of the set-associative memory and reducing power consumption. The set prediction information is stored in various locations including a branch target buffer, instruction cache and operand history table to decrease latency for accesses to set-associative instruction and data caches.
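The idea of probing a predicted set first can be illustrated with a minimal sketch. It assumes that "set" here means one of the associativity classes (ways) within a row of the cache, and it keeps the prediction in a plain dictionary that stands in for the branch target buffer, instruction cache, or operand history table. The class name, the round-robin fill policy, and the probe counter are all hypothetical and not taken from the abstract.

```python
class SetPredictedCache:
    """Toy set-associative cache that probes a predicted set (way) first."""

    def __init__(self, num_rows=4, num_ways=2):
        self.num_rows = num_rows
        self.num_ways = num_ways
        self.tags = [[None] * num_ways for _ in range(num_rows)]
        # Hypothetical stand-in for prediction storage (BTB, I-cache, OHT).
        self.predicted_way = {}
        self.next_victim = [0] * num_rows  # assumed round-robin fill policy
        self.probes = 0                    # counts tag comparisons performed

    def access(self, addr):
        row = addr % self.num_rows
        tag = addr // self.num_rows
        guess = self.predicted_way.get(addr)

        # Probe only the predicted way first: one comparison on a correct guess.
        if guess is not None:
            self.probes += 1
            if self.tags[row][guess] == tag:
                return "hit-predicted"

        # Prediction missing or wrong: search the remaining ways.
        for way in range(self.num_ways):
            if way == guess:
                continue
            self.probes += 1
            if self.tags[row][way] == tag:
                self.predicted_way[addr] = way  # correct the prediction
                return "hit-searched"

        # Miss: fill a way and remember where the line was placed.
        way = self.next_victim[row]
        self.next_victim[row] = (way + 1) % self.num_ways
        self.tags[row][way] = tag
        self.predicted_way[addr] = way
        return "miss"


cache = SetPredictedCache()
first = cache.access(5)    # no prediction yet, all ways are probed
second = cache.access(5)   # prediction available, a single way is probed
```

In this sketch a correct prediction costs one tag comparison instead of one per way, which is the kind of latency and power saving the abstract describes.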
Abstract:
In an approach to reducing delays resulting from resolution of conditional branch instructions, such instructions are pre-executed in a coprocessor which precedes a pipeline processor and prepares an instruction stream for input to the pipeline processor. Because of this pre-execution, the input instruction stream has fewer conditional branches for the pipeline processor to resolve. Also, the coprocessor may handle address generation interlock situations which also cause execution delays in the pipeline processor.
Abstract:
A method for branch prediction, the method comprising, receiving a load instruction including a first data location in a first memory area, retrieving data including a branch address and a target address from the first data location, and saving the data in a branch prediction memory, or receiving an unload instruction including the first data location in the first memory area, retrieving data including a branch address and a target address from the branch prediction memory, and saving the data in the first data location.
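The load and unload operations can be sketched as two small functions. This is only an illustration under stated assumptions: the memory area and the branch prediction memory are modeled as Python dictionaries, each stored record is a (branch address, target address) pair, and the function names are hypothetical.

```python
def load_predictions(memory_area, location, branch_prediction_memory):
    """Load: copy (branch address, target address) pairs from the data
    location in the memory area into the branch prediction memory."""
    for branch_addr, target_addr in memory_area[location]:
        branch_prediction_memory[branch_addr] = target_addr


def unload_predictions(memory_area, location, branch_prediction_memory):
    """Unload: copy the pairs held in the branch prediction memory back
    to the data location in the memory area."""
    memory_area[location] = sorted(branch_prediction_memory.items())


# Assumed example contents: two saved predictions at data location 0x100.
memory_area = {0x100: [(0x40, 0x80), (0x44, 0x20)]}
branch_prediction_memory = {}

load_predictions(memory_area, 0x100, branch_prediction_memory)
branch_prediction_memory[0x48] = 0x10   # a prediction learned while running
unload_predictions(memory_area, 0x100, branch_prediction_memory)
```

Modeling the two operations as inverses makes it easy to see how prediction state could be saved and later restored, for example around a context switch; that use case is an assumption, not something the abstract states.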
Abstract:
A method comprising receiving a branch instruction, decoding a branch address and the branch instruction, executing a branch action associated with the branch address, determining whether a branch associated with the branch action was taken, and saving an identifier of the branch instruction and an indicator that the branch action was taken in a prefetch history table responsive to determining that the branch associated with the branch action was taken.
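The recording step can be sketched as follows. The sketch assumes that the branch address serves as the identifier of the branch instruction and that the table is a simple mapping from identifier to a taken indicator; the class and function names are hypothetical.

```python
class PrefetchHistoryTable:
    """Toy table mapping a branch identifier to a 'taken' indicator."""

    def __init__(self):
        self.entries = {}

    def record_taken(self, branch_id):
        self.entries[branch_id] = True


def execute_branch(branch_address, condition, table):
    """Execute the branch action and record it only if the branch was taken.

    Using the branch address as the identifier is an assumption.
    """
    taken = bool(condition)
    if taken:
        table.record_taken(branch_address)
    return taken


table = PrefetchHistoryTable()
execute_branch(0x40, condition=True, table=table)    # taken: recorded
execute_branch(0x44, condition=False, table=table)   # not taken: no entry
```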
Abstract:
A system, method and computer program product for executing a cache replacement algorithm. A system includes a computer processor having an instruction processor, a cache and one or more useful indicators. The instruction processor processes instructions in a running program. The cache includes two or more cache levels including a level one (L1) cache level and one or more higher cache levels. Each cache level includes one or more cache lines and has an associated directory having one or more directory entries. A useful indicator is located within one or more of the directory entries and is associated with a particular cache line. The useful indicator is set to provide an indication that the associated cache line contains one or more instructions that are required by the running program and cleared to provide lack of such an indication.
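One way a useful indicator could influence replacement is sketched below. The abstract does not say how the indicator is consulted when choosing a victim, so the policy here (evict the first line whose indicator is clear, otherwise fall back to the first line) is purely an assumption, as are the names `DirectoryEntry` and `choose_victim`.

```python
from dataclasses import dataclass


@dataclass
class DirectoryEntry:
    tag: int
    useful: bool = False   # set when the line holds instructions the program needs


def choose_victim(entries):
    """Return the index of the line to replace.

    Assumed policy: prefer a line whose useful indicator is clear;
    if every line is marked useful, fall back to index 0.
    """
    for index, entry in enumerate(entries):
        if not entry.useful:
            return index
    return 0


directory = [
    DirectoryEntry(tag=1, useful=True),
    DirectoryEntry(tag=2, useful=False),
    DirectoryEntry(tag=3, useful=True),
]
victim = choose_victim(directory)
```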
Abstract:
Three-dimensional (3-D) processor structures are provided which are constructed by connecting processors in a stacked configuration. For example, a processor system includes a first processor chip comprising a first processor, and a second processor chip comprising a second processor. The first and second processor chips are connected in a stacked configuration with the first and second processors connected through vertical connections between the first and second processor chips. The processor system further includes a mode control circuit to selectively configure the first and second processors of the first and second processor chips to operate in one of a plurality of operating modes, wherein the processors can be selectively configured to operate independently, to aggregate resources, to share resources, and/or be combined to form a single processor image.
Abstract:
Three-dimensional (3-D) processor devices are provided, which are constructed by connecting processors in a stacked configuration. For instance, a processor system includes a first processor chip comprising a first processor and a second processor chip comprising a second processor. The first and second processor chips are connected in a stacked configuration with the first and second processors connected through vertical connections between the first and second processor chips. The processor system further includes a mode control circuit to selectively operate the processor system in one of a plurality of operating modes. For example, in one mode of operation, the first and second processors are configured to implement a run-ahead function, wherein the first processor operates a primary thread of execution and the second processor operates a run-ahead thread of execution.
Abstract:
A three-dimensional (3-D) processor system includes a first processor chip and a second processor chip in a stacked configuration. The first processor chip includes a first processor having a first set of state registers. The second processor chip includes a second processor having a second set of state registers that corresponds to the first set of state registers. The first and second processors are connected through vertical connections between the first and second processor chips. A mode control circuit operates the processor system in one of a plurality of operating modes. In one mode of operation, the first processor is active and the second processor is inactive, and the first processor operates at a speed greater than a maximum safe speed of the first processor, and the first processor uses the second set of state registers of the second processor to checkpoint a state of the first processor.
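The checkpointing mode can be illustrated with a small model. The abstract states only that the first processor uses the second set of state registers to checkpoint its state; the rollback step, the register names, and the class name below are assumptions added to show why a checkpoint would be useful when running above the maximum safe speed.

```python
class CheckpointedProcessor:
    """Toy model: an active processor checkpoints into an idle twin's registers."""

    def __init__(self, registers):
        self.registers = dict(registers)   # first set of state registers
        self.shadow = dict(registers)      # corresponding second set (inactive chip)

    def checkpoint(self):
        # Copy the current state into the second processor's registers.
        self.shadow = dict(self.registers)

    def rollback(self):
        # Assumed recovery step: restore the last checkpointed state.
        self.registers = dict(self.shadow)


cpu = CheckpointedProcessor({"pc": 0, "r1": 5})
cpu.registers["pc"] = 4
cpu.checkpoint()                 # known-good state saved
cpu.registers["r1"] = 999        # assumed corruption from running too fast
cpu.rollback()                   # state restored from the second register set
```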