摘要:
A loop branch prediction system is provided to predict a final iteration of a loop and resteer an associated fetch module to an appropriate target address. The loop prediction system includes a counter and an end of loop (EOL) module. In one mode, the counter tracks loop branches in process. When a termination condition is detected, the counter switches to a second mode to track the number of loop branches still to be issued. The EOL module compares the number of loop branches still to be issued with one or more threshold values and generates a resteer signal when a match is detected.
摘要:
A processor and method that reduces instruction fetch penalty in the execution of a program sequence of instructions comprises a branch predict instruction that is inserted into the program at a location which precedes the branch. The branch predict instruction has an opcode that specifies a branch as likely to be taken or not taken, and which also specifies a target address of the branch. A block of target instructions, starting at the target address, is prefetched into the instruction cache of the processor so that the instructions are available for execution prior to the point in the program where the branch is encountered. Also specified by the opcode is an indication of the size of the block of target instructions, and a trace vector of a path in the program sequence that leads to the target from the branch predict instruction for better utilization of limited memory bandwidth.
摘要:
A branch operation is processed using a branch predict instruction and an associated branch instruction. The branch predict instruction indicates a predicted direction, a target address, and an instruction address for the associated branch instruction. When the branch predict instruction is detected, the target address is stored at an entry indicated by the associated branch instruction address and a prefetch request is triggered to the target address. The branch predict instruction may also include hint information for managing the storage and use of the branch prediction information.
摘要:
A branch operation is processed using a branch predict instruction and an associated branch instruction. The branch predict instruction indicates a predicted direction, a target address, and an instruction address for the associated branch instruction. When the branch predict instruction is detected, the target address is stored at an entry indicated by the associated branch instruction address and a prefetch request is triggered to the target address. The branch predict instruction may also include hint information for managing the storage and use of the branch prediction information.
摘要:
A method for processing one or more branch instructions in an instruction bundle is provided. The instructions are ordered in an execution sequence within the bundle, with the branch instructions ordered last in the sequence. The bundled instructions are transferred to execution units indicated by a template field that is associated with the bundle. The first branch instruction in the bundle's execution sequence that is resolved taken is determined, and retirement of subsequent instructions in the execution sequence is suppressed.
摘要:
Embodiments of apparatuses and methods for improving performance in a virtualization architecture are disclosed. In one embodiment, an apparatus includes a processor and a processor abstraction layer. The processor abstraction layer includes instructions that, when executed by the processor, support techniques to improve the performance of the apparatus in a virtualization architecture.
摘要:
The present invention relates to carcinogenesis biomarkers produced by phenobarbitol-treated rat hepatocytes, nucleic acid molecules that encode carcinogenesis biomarkers or a fragment thereof and nucleic acid molecules that are useful as probes or primers for detecting or inducing carcinogenesis, respectively. The invention also relates to applications of the factor or fragment such as forming antibodies capable of binding the carcinogenesis biomarkers or fragments thereof.
摘要:
Embodiments of apparatuses and methods for improving performance in a virtualization architecture are disclosed. In one embodiment, an apparatus includes a processor and a processor abstraction layer. The processor abstraction layer includes instructions that, when executed by the processor, support techniques to improve the performance of the apparatus in a virtualization architecture.
摘要:
Parallel table lookups are implemented using variable Mux instructions to reorder data. Table data can be represented in a “table” register, while the desired ordering can be represented in an “Index” register. A direct variable Mux instruction can specify the table register and the index register as arguments, along with a result register. The instruction writes at least some of the data from the table register into the result register as specified in the index register. If the entire table cannot fit within a single register, entries can be divided between two or more table registers. An indirect variable Mux instruction can specify both a table-register-select register and a subword-location-select register. Both the direct and indirect Mux instructions can be used with entry data that is divided in accordance with significance between registers. In that case, plural Mux instructions are used with UnPack instructions that concatenate portions of the table entries.
摘要:
The present invention provides a multiprocessor system and method in which plural memory locations are used for storing TLB-shootdown data respectively for plural processors. In contrast to systems in which a single area of memory serves for all processors' TLB-shootdown data, different processors can describe the memory they want to free concurrently. Thus, concurrent TLB-shootdown request are less likely to result in performance-limiting TLB-shootdown contentions that have previously constrained the scaleability of multiprocessor systems.