Abstract:
An instruction cache system having a virtually tagged instruction cache which, from a software program perspective, operates as if it were a physically tagged instruction cache is disclosed. The instruction cache system also includes a means for address translation, which is responsive to an address translation invalidate instruction, and a control logic circuit. The control logic circuit is configured to invalidate an entry in the virtually tagged instruction cache in response to the address translation invalidate instruction.
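The C++ sketch below is a minimal software model of this behavior, not the patent's implementation: a cache indexed by virtual tags whose control logic drops any entry covered by an address-translation-invalidate, so software observes the same result it would with physical tags. The class name, 32-byte line size, and page-granular invalidation are assumptions for illustration.

```cpp
#include <cstdint>
#include <unordered_map>

struct CacheLine { uint32_t words[8]; };

class VirtuallyTaggedICache {
public:
    bool lookup(uint64_t vaddr, CacheLine& out) const {
        auto it = lines_.find(lineTag(vaddr));
        if (it == lines_.end()) return false;   // miss
        out = it->second;
        return true;                            // hit on the virtual tag
    }
    void fill(uint64_t vaddr, const CacheLine& line) { lines_[lineTag(vaddr)] = line; }

    // Invoked by the control logic when the address-translation-invalidate
    // instruction executes: every line whose virtual address falls in the
    // invalidated page is removed, since its translation may be stale.
    void onTranslationInvalidate(uint64_t page_vaddr, uint64_t page_size) {
        for (auto it = lines_.begin(); it != lines_.end(); ) {
            uint64_t line_vaddr = it->first << kLineShift;
            if (line_vaddr >= page_vaddr && line_vaddr < page_vaddr + page_size)
                it = lines_.erase(it);          // invalidate the entry
            else
                ++it;
        }
    }

private:
    static constexpr unsigned kLineShift = 5;   // 32-byte lines (assumed)
    static uint64_t lineTag(uint64_t vaddr) { return vaddr >> kLineShift; }
    std::unordered_map<uint64_t, CacheLine> lines_;
};
```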
Abstract:
A preload instruction in a first instruction set is executed at a processor. The preload instruction causes the processor to preload one or more instructions into an instruction cache. The pre-loaded instructions are pre-decoded according to a second instruction set that is different from the first instruction set. The preloaded instructions are pre-decoded according to the second instruction set in response to an instruction set preload indicator (ISPI).
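As a rough illustration of the pre-decode selection described above, the following C++ sketch pre-decodes a fetched line according to a second instruction set when the ISPI is set. The decode rules, helper names, and pre-decode fields are placeholders invented for the example, not the patent's definitions.

```cpp
#include <cstdint>
#include <vector>

struct PreDecodeBits { uint8_t length; bool isBranch; };

// Placeholder decode rules for two hypothetical instruction sets.
PreDecodeBits preDecodeFirstSet(uint32_t w)  { return {4, (w >> 26) == 0x12}; }
PreDecodeBits preDecodeSecondSet(uint32_t w) { return {2, (w & 0xF800) == 0xE000}; }

// Pre-decode a preloaded cache line. The ISPI selects the second instruction
// set even though the preload instruction itself executed in the first set.
std::vector<PreDecodeBits> preloadLine(const std::vector<uint32_t>& line, bool ispi) {
    std::vector<PreDecodeBits> info;
    info.reserve(line.size());
    for (uint32_t word : line)
        info.push_back(ispi ? preDecodeSecondSet(word) : preDecodeFirstSet(word));
    return info;   // stored alongside the instructions in the I-cache line
}
```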
Abstract:
In a processor executing instructions from a variable-length instruction set, a preload instruction is operative to retrieve from memory a data block corresponding to an instruction cache line, pre-decode instructions from a variable-length instruction set in the data block, and load the instructions and pre-decode information into the instruction cache. An instruction execution unit indicates to a pre-decoder the position within the data block of a first valid instruction. The pre-decoder successively determines the length of each instruction and hence the instruction boundaries. An instruction cache line offset indicator that identifies the position of the first valid instruction may be generated and provided to the pre-decoder in a variety of ways.
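A small C++ sketch of the boundary-marking step described above: starting from the offset of the first valid instruction supplied by the execution unit, the pre-decoder determines each instruction's length and thus the next boundary. The 2-or-4-byte length rule here is an assumption standing in for a real variable-length encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Walk a fetched data block from the first valid instruction, recording the
// byte offset of each instruction boundary for the pre-decode information.
std::vector<std::size_t> markBoundaries(const std::vector<uint8_t>& block,
                                        std::size_t firstValidOffset) {
    std::vector<std::size_t> boundaries;
    std::size_t pos = firstValidOffset;          // offset indicator from the execution unit
    while (pos < block.size()) {
        boundaries.push_back(pos);
        // Assumed length rule: a set high bit in the first byte means a
        // 4-byte encoding, otherwise 2 bytes. Real ISAs use other rules.
        std::size_t len = (block[pos] & 0x80) ? 4 : 2;
        pos += len;
    }
    return boundaries;
}
```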
Abstract:
An instruction preload instruction executed in a first processor instruction set operating mode is operative to correctly preload instructions in a different, second instruction set. The instructions are pre-decoded according to the second instruction set encoding in response to an instruction set preload indicator (ISPI). In various embodiments, the ISPI may be set prior to executing the preload instruction, or may comprise part of the preload instruction or the preload target address.
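One of the variants mentioned above, the ISPI carried in the preload target address, might be modeled as in the C++ sketch below. Using the least-significant address bit as the indicator is an assumed convention (analogous to interworking branches), not a detail taken from the abstract.

```cpp
#include <cstdint>

struct DecodedPreload {
    uint64_t lineAddress;   // address actually used to fetch the cache line
    bool     ispi;          // which instruction set to pre-decode for
};

// Derive the ISPI from the preload target address and strip the indicator bit
// before the address is used for the cache-line fetch.
DecodedPreload decodePreloadTarget(uint64_t targetAddress) {
    DecodedPreload d;
    d.ispi        = (targetAddress & 0x1) != 0;   // ISPI carried in the address (assumed)
    d.lineAddress = targetAddress & ~0x1ULL;      // remove the indicator bit
    return d;
}
```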
Abstract:
A processor is operative to execute two or more instruction sets, each in a different instruction set operating mode. As each instruction is executed, a debug circuit compares the current instruction set operating mode to a target instruction set operating mode set by a programmer, and outputs an alert or indication if they match. The alert or indication may additionally be dependent upon the instruction address falling within a predetermined target address range. The alert or indication may comprise a breakpoint signal that halts execution and/or is output as an external signal of the processor. The instruction address at which the processor detects a match in the instruction set operating modes may additionally be output. Additionally or alternatively, the alert or indication may comprise starting or stopping a trace operation, causing an exception, or any other known debugger function.
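The C++ sketch below models only the comparison the debug circuit performs, with invented field and function names; raising the breakpoint, trace event, or exception is left to the caller.

```cpp
#include <cstdint>

struct DebugConfig {
    int      targetMode;        // programmer-selected instruction set operating mode
    bool     checkAddressRange; // optionally qualify the match by address
    uint64_t rangeStart, rangeEnd;
};

// Returns true when an alert or indication should be raised for the
// instruction currently being executed.
bool modeMatchAlert(const DebugConfig& cfg, int currentMode, uint64_t instrAddress) {
    if (currentMode != cfg.targetMode)
        return false;
    if (cfg.checkAddressRange &&
        (instrAddress < cfg.rangeStart || instrAddress >= cfg.rangeEnd))
        return false;
    return true;   // caller may halt execution, start/stop trace, raise an exception, etc.
}
```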
Abstract:
A processor performs a prefetch operation on non-sequential instruction addresses. If a first instruction address misses in an instruction cache and accesses a higher-order memory as part of a fetch operation, and a branch instruction associated with the first instruction address or an address following the first instruction address is detected and predicted taken, a prefetch operation is performed using a predicted branch target address, during the higher-order memory access. If the predicted branch target address hits in the instruction cache during the prefetch operation, associated instructions are not retrieved, to conserve power. If the predicted branch target address misses in the instruction cache during the prefetch operation, a higher-order memory access may be launched, using the predicted branch target address. In either case, the first instruction address is re-loaded into the fetch stage pipeline to await the return of instructions from its higher-order memory access.
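A C++ sketch of just the decision logic described above, with invented names; it records which higher-order accesses would be launched and does not model the fetch pipeline or the re-loading of the first address.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

struct ICacheModel {
    std::unordered_set<uint64_t> lines;             // cached line addresses
    bool hit(uint64_t addr) const { return lines.count(addr >> 5) != 0; }  // 32-byte lines (assumed)
};

struct MemoryRequests {
    std::vector<uint64_t> fetches;                  // demand higher-order accesses
    std::vector<uint64_t> prefetches;               // prefetch higher-order accesses
};

void onFetchMiss(const ICacheModel& icache, MemoryRequests& mem,
                 uint64_t missAddress, bool branchPredictedTaken,
                 uint64_t predictedTargetAddress) {
    mem.fetches.push_back(missAddress);             // higher-order access for the miss
    if (!branchPredictedTaken)
        return;
    if (icache.hit(predictedTargetAddress))
        return;                                     // target already cached: skip, conserving power
    // Target also misses: launch its higher-order access during the first one.
    mem.prefetches.push_back(predictedTargetAddress);
}
```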
Abstract:
Systems and methods for memory management units (MMUs) configured to automatically pre-fill a translation lookaside buffer (TLB) with address translation entries expected to be used in the future, thereby reducing the TLB miss rate and improving performance. The TLB may be pre-filled with translation entries, wherein addresses corresponding to the pre-fill may be selected based on predictions. Predictions may be derived from external devices, or based on stride values, wherein the stride values may be a predetermined constant or dynamically altered based on access patterns. Pre-filling the TLB may effectively remove the latency involved in determining address translations for TLB misses from the critical path.
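The following C++ sketch shows the stride-based pre-fill idea under simple assumptions: on each translation, the TLB model also installs the entry for the current virtual page number plus a fixed stride, so a later access at that stride does not take the miss latency. The identity page-table walk and fixed stride are placeholders, and dynamic stride adjustment is omitted.

```cpp
#include <cstdint>
#include <unordered_map>

class TlbModel {
public:
    explicit TlbModel(int64_t stride = 1) : stride_(stride) {}

    // Translate a virtual page number, pre-filling the predicted next entry.
    uint64_t translate(uint64_t vpn) {
        auto it = tlb_.find(vpn);
        uint64_t pfn = (it != tlb_.end()) ? it->second : walkPageTable(vpn);  // miss -> walk
        tlb_[vpn] = pfn;

        // Pre-fill the entry predicted by the stride, off the critical path.
        uint64_t predicted = vpn + stride_;
        if (!tlb_.count(predicted))
            tlb_[predicted] = walkPageTable(predicted);
        return pfn;
    }

private:
    // Placeholder page-table walk: identity mapping for the sketch.
    static uint64_t walkPageTable(uint64_t vpn) { return vpn; }

    std::unordered_map<uint64_t, uint64_t> tlb_;  // vpn -> pfn
    int64_t stride_;
};
```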
Abstract:
A first processing unit and a second processing unit can access a system memory that stores a common page table that is common to the first processing unit and the second processing unit. The common page table can store virtual memory address to physical memory address mappings for memory chunks accessed by a job of an application. A page entry, within the common page table, can include a first set of attribute bits that defines accessibility of the memory chunk by the first processing unit, a second set of attribute bits that defines accessibility of the same memory chunk by the second processing unit, and physical address bits that define a physical address of the memory chunk.
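A page entry of that shape might be laid out as in the C++ sketch below. The field widths, bit positions, and attribute encodings are assumptions chosen for the example, not the patent's layout.

```cpp
#include <cstdint>

// One common-page-table entry: per-processing-unit attribute bits plus the
// physical address bits of the memory chunk.
struct CommonPageTableEntry {
    uint64_t attrsUnitA  : 4;   // accessibility for the first processing unit (e.g. a CPU)
    uint64_t attrsUnitB  : 4;   // accessibility for the second processing unit (e.g. a GPU)
    uint64_t physAddress : 40;  // physical address bits of the memory chunk
    uint64_t reserved    : 16;
};

// Example attribute encoding (assumed): bit 0 = readable, bit 1 = writable.
constexpr uint64_t kRead  = 0x1;
constexpr uint64_t kWrite = 0x2;

// Check writability of the chunk for the requesting processing unit.
bool canWrite(const CommonPageTableEntry& e, bool isUnitA) {
    uint64_t attrs = isUnitA ? e.attrsUnitA : e.attrsUnitB;
    return (attrs & kWrite) != 0;
}
```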
Abstract:
Configuring a surrogate memory accessing agent using an instruction for translating and storing a data value is described. In one embodiment, the instruction is received that includes a first operand specifying a data value to be translated and a second operand specifying a virtual address associated with a location of a surrogate memory accessing agent register in which to store the data value. The data value can be translated to a first physical address. The virtual address can be translated to a second physical address. The first physical address is stored in the surrogate memory accessing agent register based on the second physical address.
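A rough C++ model of the instruction's semantics as described above: both operands are translated, and the translated data value is stored into the agent register located by the translated virtual address. The MMU model, the page-granular identity fallback, and the register-file map are placeholders for illustration.

```cpp
#include <cstdint>
#include <unordered_map>

struct MmuModel {
    std::unordered_map<uint64_t, uint64_t> map;    // virtual page -> physical page
    uint64_t translate(uint64_t vaddr) const {
        auto it = map.find(vaddr >> 12);
        return (it == map.end()) ? vaddr
                                 : (it->second << 12) | (vaddr & 0xFFF);
    }
};

// agentRegisters models the surrogate memory accessing agent's registers,
// keyed by their physical addresses.
void translateAndStore(const MmuModel& mmu,
                       std::unordered_map<uint64_t, uint64_t>& agentRegisters,
                       uint64_t dataValueVaddr, uint64_t registerVaddr) {
    uint64_t firstPhys  = mmu.translate(dataValueVaddr);   // translated data value
    uint64_t secondPhys = mmu.translate(registerVaddr);    // locates the agent register
    agentRegisters[secondPhys] = firstPhys;                // store the translated value there
}
```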