摘要:
In a cache-to-memory interface, a means and method for timesharing a single bus to allow the concurrent processing of multiple misses. The multiplicity of misses can arise from a single processor if that processor has a nonblocking cache and/or does speculative prefetching, or it can arise from a multiplicity of processors in a shared-bus configuration.
摘要:
A computer processing system includes a first memory that stores instructions belonging to a first instruction set architecture and a second memory that stores instructions belonging to a second instruction set architecture. An instruction buffer is coupled to the first and second memories, for storing instructions that are executed by a processor unit. The system operates in one of two modes. In a first mode, instructions are fetched from the first memory into the instruction buffer according to data stored in a first branch history table. In the second mode, instructions are fetched from the second memory into the instruction buffer according to data stored in a second branch history table.The first instruction set architecture may be system level instructions and the second instruction set architecture may be millicode instructions that, for example, define a complex system level instruction and/or emulate a third instruction set architecture.
摘要:
Monitoring apparatus is provided to allow out-of-sequence fetching of operands while preserving the appearance of in-sequence fetching to the processor of a computer. The key elements include a stack (119) of N entries holding the addresses of the last M, where M is less than or equal to N, out-of-sequence fetches. A comparator (103) is provided for comparing addresses in the stack with a test address. This test address is supplied via an OR gate (107) as either store addresses or cross-invalidate addresses, the latter being for a multiprocessor system. The addresses in the stack that compare with the test address are set as invalid. In addition, all addresses in the stack are set as invalid on the occurrence of a cache miss or serializing instruction. Finally, a select and check entry function (113) associates an address in the stack with the instruction it represents and deletes the address from the stack when the instruction is handled in its proper sequence.
摘要:
Apparatus for fetching instructions in a computing system. A broadband branch history table is organized by cache line. The broadband branch history table determines from the history of branches the next cache line to be referenced and uses that information for prefetching lines into the cache.
摘要:
A method comprising receiving a branch instruction, decoding a branch address and the branch instruction, executing a branch action associated with the branch address, determining whether a branch associated with the branch action was taken, and saving an identifier of the branch instruction and in indicator that the branch action was taken in a prefetch history table responsive to determining that the branch associated with the branch action was taken.
摘要:
A multi-prediction branch prediction mechanism predicts each conditional branch at least twice, first during the instruction-fetch phase of the pipeline and then again during the decode phase of the pipeline. The mechanism uses at least two different branch prediction mechanisms, each a separate and independent mechanism from the other. A set of rules are used to resolve those instances as to when the predictions differ.
摘要:
A method for branch prediction, the method comprising, receiving a load instruction including a first data location in a first memory area, retrieving data including a branch address and a target address from the first data location data location, and saving the data in a branch prediction memory.
摘要:
A two level branch history table (TLBHT) is substantially improved by providing a mechanism to prefetch entries from the very large second level branch history table (L2 BHT) into the active (very fast) first level branch history table (L1 BHT) before the processor uses them in the branch prediction process and at the same time prefetch cache misses into the instruction cache. The mechanism prefetches entries from the very large L2 BHT into the very fast L1 BHT before the processor uses them in the branch prediction process. A TLBHT is successful because it can prefetch branch entries into the L1 BHT sufficiently ahead of the time the entry is needed. This feature of the TLBHT is also used to prefetch instructions into the cache ahead of their use. In fact, the timeliness of the prefetches produced by the TLBHT can be used to remove most of the cycle time penalty incurred by cache misses.
摘要:
System and method for predicting a multiplicity of future branches simultaneously (parallel) from an executing program, to enable the simultaneous fetching of multiple disjoint program segments. Additionally, the present invention detects divergence of incorrect branch predictions and provides correction for such divergence without penalty. By predicting an entire sequence of branches in parallel, the present invention removes restrictions that decoding of multiple instructions in a superscalar environment must be limited to a single branch group. As a result, the speed of today's superscalar processors can be significantly increased. The present invention includes three main embodiments: (1) the first embodiment is directed to a simplex multibranch prediction device, that can predict a plurality of branch groups in one cycle and provide early detection of wrong predictions; (2) the second embodiments is directed to a duplex multibranch prediction device that can detect divergence in a predicted stream, and provide redirection (correction) within the stream; and (3) the third embodiment is directed to an n-plex multibranch prediction device, that can predict n multiplicity of branch predictions simultaneously and provide an early detection of wrong predictions as well as correction of wrong predictions.
摘要:
A method for branch prediction, the method comprising, receiving a branch wrong guess instruction having a branch wrong guess instruction address and data including an opcode and a branch target address, determining whether the branch wrong guess instruction was predicted by a branch prediction mechanism, sending the branch wrong guess instruction to an execution unit responsive to determining that the branch wrong guess instruction was predicted by the branch prediction mechanism, and receiving and decoding instructions at the branch target address.