Efficient branch target address cache entry replacement
    1.
    Granted patent
    Status: in force

    Publication No.: US08832418B2

    Publication date: 2014-09-09

    Application No.: US12575951

    Filing date: 2009-10-08

    IPC classification: G06F9/38 G06F9/30

    CPC classification: G06F9/3806 G06F9/30054

    Abstract: A microprocessor includes a branch target address cache (BTAC), each entry thereof configured to store branch prediction information for at most N branch instructions. An execution unit executes a branch instruction previously fetched in a fetch quantum. Update logic determines whether the BTAC is already storing information for N branch instructions within the fetch quantum (N is at least two), updates the BTAC for the branch instruction if the BTAC is not already storing information for N branch instructions, determines whether a type of the branch instruction has a higher replacement priority than a type of the N branch instructions if the BTAC is already storing information for N branch instructions, and updates the BTAC for the branch instruction if the type of the branch instruction has a higher replacement priority than the type of the N branch instructions already stored in the BTAC.
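
The update decision in this abstract can be sketched roughly as follows. This is an illustration only, not the patented implementation: the priority ordering in `PRIORITY`, the choice of N = 2, the rule for picking which stored branch to evict, and the refresh of an already-tracked branch are all assumptions, since the abstract does not specify them.

```python
from dataclasses import dataclass, field

# Hypothetical ordering: a larger number means a higher replacement priority.
# The abstract does not say which branch types rank higher.
PRIORITY = {"conditional": 0, "unconditional": 1, "call_return": 2}

N = 2  # branches tracked per BTAC entry (the abstract requires N >= 2)


@dataclass
class Branch:
    offset: int  # byte offset of the branch within the fetch quantum
    kind: str    # key into PRIORITY
    target: int  # predicted target address


@dataclass
class BtacEntry:
    branches: list = field(default_factory=list)


def update(entry, new):
    """Return True if the entry was updated for the executed branch `new`."""
    # Assumption: a branch that is already tracked is simply refreshed.
    for i, b in enumerate(entry.branches):
        if b.offset == new.offset:
            entry.branches[i] = new
            return True
    # Fewer than N branches stored: there is room, so store the new one.
    if len(entry.branches) < N:
        entry.branches.append(new)
        return True
    # Entry is full: replace only if the new type outranks a stored type.
    # Assumption: the lowest-priority stored branch is the victim.
    victim = min(entry.branches, key=lambda b: PRIORITY[b.kind])
    if PRIORITY[new.kind] > PRIORITY[victim.kind]:
        entry.branches[entry.branches.index(victim)] = new
        return True
    return False
```

With a full entry holding two conditional branches, a third conditional branch is rejected, while a branch of a higher-priority type displaces one of them.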


    Early release of cache data with start/end marks when instructions are only partially present
    2.
    Granted patent
    Status: in force

    Publication No.: US08335910B2

    Publication date: 2012-12-18

    Application No.: US12572024

    Filing date: 2009-10-01

    IPC classification: G06F9/30

    Abstract: An apparatus extracts instructions from a stream of undifferentiated instruction bytes in a microprocessor having an instruction set architecture in which the instructions are variable length. Decoders generate an associated start/end mark for each instruction byte of a line from a first queue of entries each storing a line of instruction bytes. A second queue has entries each storing a line received from the first queue along with the associated start/end marks. Control logic detects a condition where the length of an instruction whose initial portion within a first line in the first queue is yet undeterminable because the instruction's remainder resides in a second line yet to be loaded into the first queue from the instruction cache; loads the first line and corresponding start/end marks into the second queue and refrains from shifting the first line out of the first queue, in response to detecting the condition; and extracts instructions from the first line in the second queue based on the corresponding start/end marks. The instructions exclude the yet undeterminable length instruction.


    Apparatus and method for selectively accessing disparate instruction buffer stages based on branch target address cache hit and instruction stage wrap
    3.
    Granted patent
    Status: in force

    Publication No.: US06823444B1

    Publication date: 2004-11-23

    Application No.: US09898832

    Filing date: 2001-07-03

    IPC classification: G06F9/38

    Abstract: A branch control apparatus in a microprocessor. The branch control apparatus includes an instruction buffer having a plurality of stages that buffer cache lines of instruction bytes received from an instruction cache. A multiplexer selects one of the bottom three stages in the instruction buffer to provide to instruction format logic. The multiplexer selects a stage based on a branch indicator, an instruction wrap indicator, and a carry indicator. The branch indicator indicates whether the processor previously branched to a target address provided by a branch target address cache. The branch indicator and target address are previously stored in association with the stage containing the branch instruction for which the target address is cached. The wrap indicator indicates whether the currently formatted instruction wraps across two cache lines. The carry indicator indicates whether the current instruction being formatted occupies the last byte of the currently formatted instruction buffer stage.


    Apparatus and method for marking start and end bytes of instructions in a stream of instruction bytes in a microprocessor having an instruction set architecture in which instructions may include a length-modifying prefix
    4.
    Granted patent
    Status: in force

    Publication No.: US08612727B2

    Publication date: 2013-12-17

    Application No.: US12571997

    Filing date: 2009-10-01

    IPC classification: G06F9/30

    Abstract: An apparatus in a microprocessor that has an instruction set architecture in which instructions may include a length-modifying prefix used to select an address/operand size other than a default address/operand size, wherein the apparatus marks the start byte and the end byte of each instruction in a stream of instruction bytes. Decode logic decodes each instruction byte of a predetermined number of instruction bytes to determine whether the instruction byte specifies a length-modifying prefix and generates a start mark and an end mark for each of the instruction bytes based on an address/operand size. Operand/address size logic provides the default operand/address size to the decode logic to use to generate the start and end marks during a first clock cycle during which the decode logic decodes the predetermined number of instruction bytes. If during the first clock cycle and any of N subsequent clock cycles the decode logic indicates that one of the predetermined number of instruction bytes specifies a length-modifying prefix, the operand/address size logic provides to the decode logic on the next clock cycle the address/operand size specified by the length-modifying prefix to use to generate the start and end marks.


    Out-of-order microprocessor with separate branch information circular queue table tagged by branch instructions in reorder buffer to reduce unnecessary space in buffer
    5.
    Granted patent
    Status: in force

    Publication No.: US08281110B2

    Publication date: 2012-10-02

    Application No.: US12581000

    Filing date: 2009-10-16

    IPC classification: G06F9/38

    CPC classification: G06F9/3844 G06F9/3857

    Abstract: An out-of-order execution in-order retire microprocessor includes a branch information table comprising N entries. Each of the N entries stores information associated with a branch instruction. The microprocessor also includes a reorder buffer comprising M entries. Each of the M entries stores information associated with an unretired instruction within the microprocessor. Each of the M entries includes a field that indicates whether the unretired instruction is a branch instruction and, if so, a tag identifying one of the N entries in the branch information table storing information associated with the branch instruction. N is significantly less than M such that the overall die space and power consumption is reduced over a processor in which each reorder buffer entry stores the branch information.
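
A minimal sketch of the arrangement described here, assuming a circular-queue branch information table (as the title suggests) and a stall when the table is full. The class names, the `dispatch`/`retire` interface, and the stall policy are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


class BranchInfoTable:
    """Circular queue of N entries holding per-branch information."""

    def __init__(self, n):
        self.slots = [None] * n
        self.head = 0   # index of the oldest allocated entry
        self.count = 0  # number of allocated entries

    def allocate(self, info):
        """Store `info` and return its tag, or None if the table is full."""
        if self.count == len(self.slots):
            return None
        tag = (self.head + self.count) % len(self.slots)
        self.slots[tag] = info
        self.count += 1
        return tag

    def release_oldest(self):
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1


@dataclass
class RobEntry:
    name: str
    is_branch: bool = False        # field: is this instruction a branch?
    bit_tag: Optional[int] = None  # tag of its branch information table entry


class Machine:
    def __init__(self, rob_size, bit_size):
        self.rob = deque()
        self.rob_size = rob_size               # M
        self.bit = BranchInfoTable(bit_size)   # N, much smaller than M

    def dispatch(self, name, branch_info=None):
        """Return False if the instruction must stall (assumed policy)."""
        if len(self.rob) == self.rob_size:
            return False
        entry = RobEntry(name)
        if branch_info is not None:
            tag = self.bit.allocate(branch_info)
            if tag is None:
                return False
            entry.is_branch, entry.bit_tag = True, tag
        self.rob.append(entry)
        return True

    def retire(self):
        # In-order retirement means the retiring branch always owns the
        # oldest table entry, so releasing the head is sufficient.
        entry = self.rob.popleft()
        if entry.is_branch:
            self.bit.release_oldest()
        return entry
```

Only branch instructions consume a table entry, so non-branch reorder buffer entries carry just the one-bit flag and an unused tag rather than the full branch record.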


    OUT-OF-ORDER EXECUTION IN-ORDER RETIRE MICROPROCESSOR WITH BRANCH INFORMATION TABLE TO ENJOY REDUCED REORDER BUFFER SIZE
    6.
    Patent application
    Status: in force

    Publication No.: US20110016292A1

    Publication date: 2011-01-20

    Application No.: US12581000

    Filing date: 2009-10-16

    IPC classification: G06F9/38 G06F9/30

    CPC classification: G06F9/3844 G06F9/3857

    Abstract: An out-of-order execution in-order retire microprocessor includes a branch information table comprising N entries. Each of the N entries stores information associated with a branch instruction. The microprocessor also includes a reorder buffer comprising M entries. Each of the M entries stores information associated with an unretired instruction within the microprocessor. Each of the M entries includes a field that indicates whether the unretired instruction is a branch instruction and, if so, a tag identifying one of the N entries in the branch information table storing information associated with the branch instruction. N is significantly less than M such that the overall die space and power consumption is reduced over a processor in which each reorder buffer entry stores the branch information.


    PREFIX ACCUMULATION FOR EFFICIENT PROCESSING OF INSTRUCTIONS WITH MULTIPLE PREFIX BYTES
    7.
    Patent application
    Status: in force

    Publication No.: US20100299500A1

    Publication date: 2010-11-25

    Application No.: US12572002

    Filing date: 2009-10-01

    IPC classification: G06F9/30

    Abstract: In a microprocessor that has an instruction set architecture in which the instructions may include a variable number of prefix bytes, an apparatus for efficiently extracting instructions from a stream of undifferentiated instruction bytes. Decode logic determines which byte is an opcode byte for each instruction of a plurality of instructions within the stream of undifferentiated instruction bytes. The opcode byte is the first non-prefix byte of the instruction. The decode logic accumulates prefix information onto the opcode byte of the instruction for each instruction of the plurality of instructions. A queue holds the stream of undifferentiated instruction bytes and the accumulated prefix information. Extraction logic extracts the plurality of instructions from the queue in one clock cycle independent of the number of prefix bytes included in each of the plurality of instructions.
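
The idea of accumulating prefix information onto the opcode byte can be illustrated with a small software model. It assumes the x86 legacy prefix bytes (REX and later prefixes are ignored), and it assumes the instruction start positions are already known; the function name `accumulate_prefixes`, the flag names, and the `starts` parameter are hypothetical.

```python
# x86 legacy prefix bytes mapped to illustrative flag names.
PREFIXES = {
    0x66: "operand_size", 0x67: "address_size",
    0xF0: "lock", 0xF2: "repne", 0xF3: "rep",
    0x2E: "cs", 0x36: "ss", 0x3E: "ds",
    0x26: "es", 0x64: "fs", 0x65: "gs",
}


def accumulate_prefixes(stream, starts):
    """For each instruction start, find the opcode byte (the first
    non-prefix byte) and collect the prefix flags that precede it.

    Returns a list of (opcode_index, set_of_prefix_flags) pairs.
    """
    result = []
    for start in starts:
        flags = set()
        i = start
        # Walk over prefix bytes, folding each one into the flag set.
        while i < len(stream) and stream[i] in PREFIXES:
            flags.add(PREFIXES[stream[i]])
            i += 1
        result.append((i, flags))
    return result


# Example: 66 F3 A5 (two prefixes, then opcode A5) followed by 90.
marks = accumulate_prefixes(bytes([0x66, 0xF3, 0xA5, 0x90]), starts=[0, 3])
```

Once the flags travel with the opcode byte, a later stage can treat every instruction as starting at its opcode, regardless of how many prefix bytes preceded it.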


    APPARATUS FOR EFFICIENTLY DETERMINING INSTRUCTION LENGTH WITHIN A STREAM OF X86 INSTRUCTION BYTES
    8.
    Patent application
    Status: in force

    Publication No.: US20100299497A1

    Publication date: 2010-11-25

    Application No.: US12572045

    Filing date: 2009-10-01

    IPC classification: G06F9/30

    Abstract: An apparatus efficiently determines the length of an instruction within a stream of instruction bytes processed by a microprocessor having a variable instruction length instruction set architecture. The apparatus includes combinatorial logic associated with each instruction byte of the stream, each configured to receive the associated instruction byte and the next instruction byte of the stream and to generate in response thereto a first length, a second length, and a select control. A multiplexor associated with each of the combinatorial logic selects and outputs one of the following inputs based on the select control received from the combinatorial logic: a zero input and the second length received from the combinatorial logic associated with each of the next three instruction bytes of the stream. An adder associated with each of the combinatorial logic and multiplexor adds the first length and the output of the multiplexor to generate the length of the instruction.


    Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence
    9.
    Granted patent
    Status: in force

    Publication No.: US07631172B2

    Publication date: 2009-12-08

    Application No.: US11609261

    Filing date: 2006-12-11

    Abstract: A microprocessor for predicting return instruction target addresses is disclosed. A branch target address cache stores a plurality of target address predictions and a corresponding plurality of override indicators for a corresponding plurality of return instructions, and provides a prediction of the target address of the return instruction from the target address predictions and provides a corresponding override indicator from the override indicators. Each has a true value when the return stack has mispredicted the target address of the corresponding return instruction for a most recent execution of the return instruction. A return stack also provides a prediction of the target address of the return instruction. Branch control logic causes the microprocessor to branch to the prediction of the target address provided by the BTAC, and not to the prediction of the target address provided by the return stack, when the override indicator is a true value.
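
The override mechanism can be modeled in a few lines. This is a behavioral sketch under simplifying assumptions: the BTAC is a plain dictionary keyed by the return instruction's address, the return stack is unbounded, and the method names are hypothetical.

```python
class ReturnPredictor:
    def __init__(self):
        self.return_stack = []
        # Return-instruction address -> (target prediction, override flag).
        self.btac = {}

    def on_call(self, return_address):
        self.return_stack.append(return_address)

    def predict_return(self, pc):
        """Return (chosen prediction, return-stack prediction)."""
        stack_pred = self.return_stack.pop() if self.return_stack else None
        target, override = self.btac.get(pc, (None, False))
        # Use the BTAC target only when the override indicator is true.
        chosen = target if override else stack_pred
        return chosen, stack_pred

    def resolve_return(self, pc, stack_pred, actual):
        # The override indicator records whether the return stack
        # mispredicted the most recent execution of this return.
        self.btac[pc] = (actual, stack_pred != actual)
```

A return whose code rewrites its own return address (a non-standard return sequence) makes the return stack mispredict once; on the next execution the override flag steers the prediction to the BTAC target instead.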


    Apparatus and method for speculatively performing a return instruction in a microprocessor
    10.
    Granted patent
    Status: in force

    Publication No.: US07200740B2

    Publication date: 2007-04-03

    Application No.: US09849822

    Filing date: 2001-05-04

    IPC classification: G06F9/30

    Abstract: A branch prediction apparatus that employs dual call/return stacks to predict return addresses in a microprocessor. The apparatus includes a first call/return stack that provides a speculative return address based upon a return instruction hit in a speculative branch target address cache (BTAC) of an instruction cache fetch address prior to decoding of the instruction to know whether it is actually a return instruction. The speculative return address is one of multiple return addresses simultaneously stored in the first call/return stack each pushed thereupon in response to the BTAC indicating a call instruction was fetched and prior to decoding the call instruction. The speculative return address is provided early in the pipeline and the microprocessor speculatively branches to the speculative return address. Later in the pipeline, a second call/return stack provides a non-speculative return address after the instruction is decoded and verified to be a return instruction. A comparator compares the speculative and non-speculative return addresses, and if the two addresses mismatch, the microprocessor branches to the non-speculative return address.
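
The interaction of the two stacks can be sketched as below. The model is deliberately simplified: pipeline timing is reduced to "early" (BTAC-driven, before decode) and "decoded" (after decode) events, both stacks are unbounded lists, and all names are hypothetical.

```python
class DualReturnStacks:
    def __init__(self):
        self.speculative = []      # updated early, on BTAC hits
        self.non_speculative = []  # updated after decode confirms the type

    def early_call(self, return_address):
        """BTAC indicates a call was fetched; push before decode."""
        self.speculative.append(return_address)

    def early_return(self):
        """BTAC indicates a return; provide the speculative target."""
        return self.speculative.pop() if self.speculative else None

    def decoded_call(self, return_address):
        self.non_speculative.append(return_address)

    def decoded_return(self, speculative_target):
        """Compare the two predictions.

        Returns the non-speculative address to redirect to on a
        mismatch, or None when the speculative branch was correct.
        """
        actual = self.non_speculative.pop() if self.non_speculative else None
        return actual if actual != speculative_target else None
```

If the BTAC misses a call, the speculative stack never sees the push, so the early prediction for the matching return is stale and the comparison after decode redirects to the non-speculative address.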
