Abstract:
A method for optimizing throughput in a microprocessor that is capable of processing multiple threads of instructions simultaneously. Instruction issue logic is provided between the input buffers and the pipeline of the microprocessor. The instruction issue logic speculatively issues instructions from a given thread based on the probability that the required operands will be available when the instruction reaches the stage in the pipeline where they are required. Issue of an instruction is blocked if the current pipeline conditions indicate that there is a significant probability that the instruction will need to stall in a shared resource to wait for operands. Once the probability that the instruction will stall is below a certain threshold, based on current pipeline conditions, the instruction is allowed to issue.
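The issue decision described in this abstract can be sketched as a threshold test. This is a minimal illustration, not the patented logic: `PipelineConditions`, `stall_probability`, `may_issue`, the `STALL_THRESHOLD` value, and the toy probability model (an outstanding load that may miss the cache) are all assumptions introduced here.

```python
from dataclasses import dataclass

STALL_THRESHOLD = 0.25  # assumed value; the abstract only says "a certain threshold"


@dataclass
class PipelineConditions:
    """Hypothetical snapshot of pipeline state relevant to one instruction."""
    depends_on_pending_load: bool   # does an operand come from an outstanding load?
    load_miss_probability: float    # estimated chance that load is late (0.0 to 1.0)


def stall_probability(cond):
    """Toy model: the instruction can only stall if it waits on a pending load."""
    if cond.depends_on_pending_load:
        return cond.load_miss_probability
    return 0.0


def may_issue(cond, threshold=STALL_THRESHOLD):
    """Block issue while the estimated stall probability is at or above the threshold."""
    return stall_probability(cond) < threshold
```

Under this model, an instruction that depends on a load with a 0.9 miss estimate is held in its input buffer, and it becomes eligible once the estimate drops below the threshold.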
Abstract:
A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching resumes when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
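The two prefetch policies in this abstract can be sketched as follows. The abstract does not say how the weights are produced; this sketch assumes a conventional 2-bit saturating counter (0 = strongly not taken, 3 = strongly taken), and the function names and the `halt_only_on_miss` flag are hypothetical.

```python
def predict(counter):
    """Return (taken, strong) for an assumed 2-bit saturating counter in 0..3."""
    taken = counter >= 2
    strong = counter in (0, 3)   # the end states are the strongly weighted ones
    return taken, strong


def should_prefetch(counter, cache_hit, halt_only_on_miss):
    """Decide whether to keep prefetching along the predicted path.

    halt_only_on_miss=False: halt on every weakly weighted prediction.
    halt_only_on_miss=True:  keep prefetching out of the cache on a hit and
                             halt only when a miss would trigger a cache fill.
    """
    _, strong = predict(counter)
    if strong:
        return True
    if halt_only_on_miss:
        return cache_hit
    return False
```

In either policy, fetching would pick up again once the branch resolves in the pipeline. That step is outside this sketch.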
Abstract:
A method of managing cache partitions provides a first pointer for higher priority writes and a second pointer for lower priority writes, and uses the first pointer to delimit the lower priority writes. For example, locked writes have greater priority than unlocked writes, and a first pointer may be used for locked writes, and a second pointer may be used for unlocked writes. The first pointer is advanced responsive to making locked writes, and its advancement thus defines a locked region and an unlocked region. The second pointer is advanced responsive to making unlocked writes. The second pointer is also advanced (or retreated) as needed to prevent it from pointing to locations already traversed by the first pointer. Thus, the first pointer delimits the unlocked region and allows the locked region to grow at the expense of the unlocked region.
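A minimal sketch of the two-pointer scheme, assuming a small array of entries in which the locked region grows upward from index 0 and the unlocked pointer wraps within the remaining entries. The class name, the wrap-around replacement order, and the error handling are assumptions made for illustration.

```python
class PartitionedEntries:
    def __init__(self, size):
        self.entries = [None] * size
        self.lock_ptr = 0     # next slot for a locked write; slots below it are locked
        self.unlock_ptr = 0   # next slot for an unlocked write

    def write_locked(self, value):
        if self.lock_ptr >= len(self.entries):
            raise RuntimeError("no room left for locked entries")
        self.entries[self.lock_ptr] = value
        self.lock_ptr += 1
        # Keep the unlocked pointer out of locations the locked pointer has traversed.
        if self.unlock_ptr < self.lock_ptr:
            self.unlock_ptr = self.lock_ptr

    def write_unlocked(self, value):
        if self.lock_ptr >= len(self.entries):
            raise RuntimeError("no unlocked region left")
        self.entries[self.unlock_ptr] = value
        self.unlock_ptr += 1
        if self.unlock_ptr >= len(self.entries):
            # Wrap to the start of the unlocked region, not to index 0.
            self.unlock_ptr = self.lock_ptr
```

Each locked write shrinks the range the unlocked pointer can cycle through, which is how the locked region grows at the expense of the unlocked one.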
Abstract:
A method and system for calculating a branch target address. Upon fetching a branch instruction from memory, the n−1 lower-order bits of the branch target address may be pre-calculated and stored in the branch instruction prior to storing the branch instruction in the instruction cache. Upon retrieving the branch instruction from the instruction cache, the upper-order bits of the branch target address may be recovered using the sign bit and the carry bit stored in the branch instruction. The sign bit and the carry bit may be used to select one of three possible upper-order bit value combinations of the branch target address. The selected upper-order bit value combination may then be appended to the n−1 lower-order bits of the branch target address to form the complete branch target address.
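The arithmetic behind the three candidates can be worked through in a short sketch. It assumes an n-bit two's-complement displacement in the same units as the program counter, a target computed as `pc + disp`, and no address wrap-around. The function names are hypothetical. With `upper = pc >> (n-1)`, the upper bits of the target are `upper + carry - sign`, which is always one of `upper - 1`, `upper`, or `upper + 1`.

```python
def precalculate(pc, disp, n):
    """Cache-fill step: return (low_bits, sign, carry) to store with the instruction."""
    mask = (1 << (n - 1)) - 1
    field = disp & ((1 << n) - 1)      # n-bit two's-complement encoding of disp
    sign = field >> (n - 1)            # sign bit of the displacement
    total = (pc & mask) + (field & mask)
    low_bits = total & mask            # n-1 lower-order bits of the target
    carry = total >> (n - 1)           # carry out of the low-order addition
    return low_bits, sign, carry


def recover_target(pc, low_bits, sign, carry, n):
    """Fetch step: select the upper bits from three candidates and append the low bits."""
    upper = pc >> (n - 1)
    candidates = (upper - 1, upper, upper + 1)
    selected = candidates[1 + carry - sign]
    return (selected << (n - 1)) | low_bits
```

For example, with `n = 8`, `pc = 0x1234`, and `disp = -100`, the fill step stores low bits 80 with sign 1 and carry 0, and the fetch step selects `upper - 1` to rebuild `0x1234 - 100`.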
Abstract:
A method and system for utilizing bits in a collection of illegal op codes in order to enable pre-decoded instructions to be stored in an instruction cache without increasing the number of bits required to represent the pre-decoded instructions. Upon fetching an instruction from memory, the op code is examined for membership in a collection of illegal op codes. If the instruction op code is a member of this collection, the instruction may be re-encoded to use a different, common illegal op code. If the instruction op code is not a member of the collection of illegal op codes, but is instead an instruction to be stored in the instruction cache in a pre-decoded format, the additional pre-decoded information may be stored in the instruction encoding by utilizing the portion of the op code space which has been vacated by the re-encoding of the illegal op codes.
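One way to picture the re-encoding is as a mapping over opcode values. The opcode numbers, the table names, and the choice to mark a pre-decoded instruction by swapping in a vacated opcode are all assumptions for illustration. The abstract does not specify the actual encodings.

```python
ILLEGAL_OPCODES = {0x3C, 0x3D, 0x3E, 0x3F}   # hypothetical set of illegal encodings
COMMON_ILLEGAL = 0x3F                        # the one encoding kept to mean "illegal"

# Hypothetical: legal opcode -> vacated opcode that marks its pre-decoded form.
PREDECODED_FORM = {0x10: 0x3C, 0x12: 0x3D}
ORIGINAL_FORM = {v: k for k, v in PREDECODED_FORM.items()}


def encode_for_cache(opcode):
    """Opcode written to the instruction cache; the field width is unchanged."""
    if opcode in ILLEGAL_OPCODES:
        return COMMON_ILLEGAL                 # collapse all illegal opcodes into one
    return PREDECODED_FORM.get(opcode, opcode)


def decode_from_cache(cached_opcode):
    """Return (original_opcode_or_None, is_predecoded) for a cached opcode."""
    if cached_opcode == COMMON_ILLEGAL:
        return None, False                    # still raises an illegal-instruction fault
    if cached_opcode in ORIGINAL_FORM:
        return ORIGINAL_FORM[cached_opcode], True
    return cached_opcode, False
```

Collapsing four illegal encodings into one frees three values in this example, and two of them carry the extra pre-decode marker without adding any bits.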
Abstract:
A system and method for tracing program code within a processor having an embedded cache memory. The non-invasive tracing technique minimizes the need for trace information to be broadcast externally. The tracing technique monitors changes in instruction flow from the normal execution stream of the code. Various features, individually and in combination, provide a real-time trace-forward and trace-back capability with a minimal number of pins running at a minimal frequency relative to the processor.
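The idea of reporting only departures from sequential execution can be sketched as a filter over a stream of program counter values. The fixed 4-byte instruction size and the `flow_changes` name are assumptions. The abstract does not describe the trace record format.

```python
def flow_changes(pcs, instr_size=4):
    """Return (from_pc, to_pc) pairs where execution left the sequential stream."""
    return [
        (prev, cur)
        for prev, cur in zip(pcs, pcs[1:])
        if cur != prev + instr_size      # sequential steps need no trace output
    ]
```

For the sequence `[0, 4, 8, 100, 104, 8]`, only the jump from 8 to 100 and the jump from 104 back to 8 would need to be reported, which is the kind of reduction that keeps the external pin count and frequency low.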
Abstract:
Data from a source domain operating at a first data rate is transferred to a FIFO in another domain operating at a different data rate. The FIFO buffers data before transfer to a sink for further processing or storage. A source side counter tracks space available in the FIFO. In disclosed examples, the initial counter value corresponds to FIFO depth. The counter decrements in response to a data ready signal from the source domain, without delay. The counter increments in response to signaling from the sink domain of a read of data off the FIFO. Hence, incrementing is subject to the signaling latency between domains. The source may send one more beat of data when the counter indicates the FIFO is full. The source sends this last beat continuously until a FIFO position is indicated as available, effectively providing one more FIFO position.
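The source-side bookkeeping can be sketched as a small state machine. This is a simplified model under stated assumptions: it ignores clock-domain synchronization, and the class name, method names, and returned status strings are hypothetical.

```python
class SourceCounter:
    def __init__(self, fifo_depth):
        self.space = fifo_depth   # initial counter value corresponds to FIFO depth
        self.holding = False      # True while the extra beat is held on the source outputs

    def send_beat(self):
        """Source has a beat ready; return what happens to it."""
        if self.space > 0:
            self.space -= 1       # decrement immediately, with no cross-domain delay
            return "written"
        if not self.holding:
            self.holding = True   # one more beat is driven and held while the FIFO is full
            return "held"
        return "blocked"          # nothing more can be sent until a read is signalled

    def read_signalled(self):
        """Sink domain reports (after its signalling latency) that one entry was read."""
        if self.holding:
            self.holding = False  # the held beat takes the freed position
        else:
            self.space += 1
```

With a depth of 2, three beats can be accepted before the source blocks: two written and one held, which is the "one more FIFO position" effect.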