Abstract:
Systems and methods for dependency-prediction include executing instructions in an instruction pipeline of a processor and detecting a conditionality-imposing control instruction, such as an If-Then (IT) instruction, which imposes dependent behavior on a conditionality block size of one or more dependent instructions. Prior to executing a first instruction, a dependency-prediction is made to determine if the first instruction is a dependent instruction of the conditionality-imposing control instruction, based on the conditionality block size and one or more parameters of the instruction pipeline. The first instruction is executed based on the dependency-prediction. When the first instruction is dependency-mispredicted, an associated dependency-misprediction penalty is mitigated. If the first instruction is a branch instruction, the mitigation involves training a branch prediction tracking mechanism to correctly dependency-predict future occurrences of the first instruction.
Abstract:
Systems and methods are directed to prefetch mechanisms involving non-equal magnitude stride values. A non-equal magnitude functional relationship between successive stride values, may be detected, wherein the stride values are based on distances between target addresses of successive load instructions. At least a next stride value for prefetching data, may be determined, wherein the next stride value is based on the non-equal magnitude functional relationship and a previous stride value. Data prefetch may be from at least one prefetch address calculated based on the next stride value and a previous target address. The non-equal magnitude functional relationship may include a logarithmic relationship corresponding to a binary search algorithm.
Abstract:
Systems and methods for identifying candidate load instructions for prefetch operations based on at least instruction encoding of the load instructions, include an identifier based on a function of at least one or more fields of a load instruction and optionally, a subset of bits of the PC value of the load instruction, wherein the one or more fields exclude a full address or program counter (PC) value of the load instruction. Prefetch mechanisms, including a prefetch table indexed by the identifier, can determine whether the load instruction is a candidate load instruction for prefetching load data, based on the identifier. The function may be a hash, a concatenation, or a combination thereof, of one or more bits of the one or more fields. The fields include one or more of a base register, a destination register, an immediate offset, an offset register, or other bits of instruction encoding of the load instruction.
Abstract:
Various aspects disclosed herein relate to combining instructions to load data from or store data in memory while processing instructions in a computer processor. More particularly, at least one pattern of multiple memory access instructions that reference a common base register and do not fully utilize an available bus width may be identified in a processor pipeline. In response to determining that the multiple memory access instructions target adjacent memory or non-contiguous memory that can fit on a single cache line, the multiple memory access instructions may be replaced within the processor pipeline with one equivalent memory access instruction that utilizes more of the available bus width than either of the replaced memory access instructions.
Abstract:
Aspects disclosed herein relate to combining instructions to load data from or store data in memory while processing instructions in processors. An exemplary method includes detecting a pattern of pipelined instructions to access memory using a first portion of available bus width and, in response to detecting the pattern, combining the pipelined instructions into a single instruction to access the memory using a second portion of the available bus width that is wider than the first portion. Devices including processors using disclosed aspects may execute currently available software in a more efficient manner without the software being modified.
Abstract:
Systems and methods for managing access to a cache relate to determining one or more execute permissions associated with a write-address of a write-request to the cache. The cache may be a unified cache for storing data as well as instructions. If there is a write-miss in the cache for the write-request, a cache controller may determine whether to implement a write-allocate policy or a write-no-allocate policy for servicing the write-miss, based on the one or more execute permissions. The one or more execute permissions can relate to a privilege level associated with the write-address. Execute permissions of a producing agent which generated the write-request and an execute permission of a consuming agent which can execute from the write-address may be based on the privilege levels of the producing agent and the consuming agent, respectively.
Abstract:
Systems and methods pertain to a branch target instruction cache (BTIC) of a processor. The BTIC is configured to store one or more branch target instructions at branch target addresses of branch instructions executable by the processor. At least one of the branch target instructions stored in the BTIC is a conditional branch instruction. Branch prediction techniques for predicting the direction of the conditional branch instruction allow one or more instructions following the conditional branch instruction, as well as a branch target address of the conditional branch instruction to also be stored in the BTIC.
Abstract:
Systems and methods for mitigating influence of wrong-path branch instructions in branch prediction include a branch prediction write queue. A first entry of the branch prediction write queue is associated with a first branch instruction based on an order in which the first branch instruction is fetched. Upon speculatively executing the first branch instruction, a correct direction of the first branch instruction is written in the first entry. Prior to committing the first branch instruction, the branch prediction write queue is configured to update one or more branch prediction mechanisms based on the first entry if the first branch instruction was speculatively executed in a correct-path. Updates to the one or more branch prediction mechanisms based on the first entry are prevented if the first branch instruction was speculatively executed in a wrong-path.
Abstract:
A memory structure compresses a portion of a memory tag using an indexed tag compression structure. A set of higher order bits of the memory tag may be stored in the indexed tag compression structure, where the set of higher order bits are identified by an index value. A tag array stores a set of lower order bits of the memory tag and the index value identifying the entry in the tag compression structure storing the set of higher order bits of the memory tag. The memory tag may comprise at least a portion of a memory address of a data element stored in a data array.