Abstract:
Systems, apparatuses, methods, and computer-readable mediums for preventing return oriented programming (ROP) attacks. A compiler may insert landing pads adjacent to valid return targets in an instruction sequence. When a return instruction is executed, the processor may treat the return as suspicious if the target of the return instruction does not have an adjacent landing pad. Additionally, each landing pad may be encoded with a color, and a colored launch pad may be inserted into the instruction stream next to each return instruction. When a return instruction is executed, the processor may determine if the target of the return has a landing pad with the same color as the launch pad of the return instruction. Return-target pairs with color mismatches may be treated as suspicious and the offending process may be killed.
Abstract:
A processor includes a mechanism that checks for and flushes only speculative loads and any respective dependent instructions that are younger than an executed wait for event (WEV) instruction, and which also match an address of a store instruction that has been determined to have been executed by a different processor prior to execution of the paired SEV instruction by the different processor. The mechanism may allow speculative loads that do not match the address of any store instruction that has been determined to have been executed by a different processor prior to execution of the paired SEV instruction by the different processor.
Abstract:
A processor includes a mechanism that checks for and flushes only speculative loads and any respective dependent instructions that are younger than an executed wait for event (WEV) instruction, and which also match an address of a store instruction that has been determined to have been executed by a different processor prior to execution of the paired SEV instruction by the different processor. The mechanism may allow speculative loads that do not match the address of any store instruction that has been determined to have been executed by a different processor prior to execution of the paired SEV instruction by the different processor.
Abstract:
A system and method for efficiently protecting branch prediction information. In various embodiments, a computing system includes at least one processor with a branch predictor storing branch target addresses and security tags in a table. The security tag includes one or more components of machine context. When the branch predictor receives a portion of a first program counter of a first branch instruction, and hits on a first table entry during an access, the branch predictor reads out a first security tag. The branch predictor compares one or more components of machine context of the first security tag to one or more components of machine context of the first branch instruction. When there is at least one mismatch, the branch prediction information of the first table entry is not used. Additionally, there is no updating of any branch prediction training information of the first table entry.
Abstract:
Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.
Abstract:
A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding the next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
Abstract:
Techniques are disclosed relating to capturing information related to instructions executing on in a processor. In one embodiment, an integrated circuit is disclosed that includes an execution pipeline configured to execute a sequence of instructions. The integrated circuit includes monitoring circuitry configured to monitor the execution pipeline for occurrences of an event associated with the sequence of instructions, and in response to detecting a particular number of occurrences of the event, capture a value of a program counter corresponding to an instruction of the sequence of instructions that is associated with an occurrence of the event. The monitoring circuitry stores the captured value of the program counter in a distinct capture register and signals an interrupt indicating that the captured value of the program counter is retrievable from the capture register. In some embodiments, a debugging application may retrieve the value and present it to a developer attempting perform code profiling.
Abstract:
In an embodiment, an indirect branch predictor generates indirect branch predictions based on one or more register values. The register values may be the contents of registers on which the indirect branch instruction is directly or indirectly dependent for generating the branch target address, for example. In an embodiment, at least one of the registers may be a source for a load instruction, and the indirect branch may be dependent (directly or indirectly) on the target of the load. In an embodiment, the indirect branch predictor may be one of at least two indirect branch predictors in a processor. The other indirect branch predictor may be based on a fetch address, or PC, associated with the indirect branch instruction. The other indirect branch predictor may generate a first predicted target address, and the indirect branch predictor may generate a second predicted target address for the same indirect branch instruction.
Abstract:
Systems, apparatuses, and methods for implementing a physical register last reference scheme are described. A system includes a processor with a mapper, history file, and freelist. When an entry in the mapper is updated with a new architectural register-to-physical register mapping, the processor creates a new history file entry for the given instruction that caused the update. The processor also searches the mapper to determine if the old physical register that was previously stored in the mapper entry is referenced by any other mapper entries. If there are no other mapper entries that reference this old physical register, then a last reference indicator is stored in the new history file entry. When the given instruction retires, the processor checks the last reference indicator in the history file entry to determine whether the old physical register can be returned to the freelist of available physical registers.
Abstract:
In some embodiments, a branch prediction unit includes a plurality of branch prediction circuits and selection logic. At least two of the branch prediction circuits are configured, based on an address of a branch instruction and different sets of history information, to provide a corresponding branch prediction for the branch instruction. At least one storage element of the at least two branch prediction circuits is set associative. The selection logic is configured to select a particular branch prediction output by one of the branch prediction circuits as a current branch prediction output of the branch prediction unit. In some instances, the branch prediction unit may be less likely to replace branch prediction information, as compared to a different branch prediction unit that does not include a set associative storage element. In some embodiments, this arrangement may lead to increased performance of the branch prediction unit.