Reorder buffer employing last in line indication
    91.
    Granted invention patent
    Status: Expired

    Publication No.: US06292884B1

    Publication Date: 2001-09-18

    Application No.: US09476388

    Filing Date: 1999-12-30

    Abstract: A reorder buffer is provided which stores a last in buffer (LIB) indication corresponding to each instruction. The last in buffer indication indicates whether or not the corresponding instruction is last, in program order, of the instructions within the buffer to update the storage location defined as the destination of that instruction. The LIB indication is included in the dependency checking comparisons. A dependency is indicated for a given source operand and a destination operand within the reorder buffer if the operand specifiers match and the corresponding LIB indication indicates that the instruction corresponding to the destination operand is last to update the corresponding storage location. At most one of the dependency comparisons for a given source operand can indicate dependency. According to one embodiment, the reorder buffer employs a line-oriented configuration. Concurrently decoded instructions are stored into a line of storage, and the concurrently decoded instructions are retired as a unit. A last in line (LIL) indication is stored for each instruction in the line. The LIL indication indicates whether or not the instruction is last within the line storing that instruction to update the storage location defined as the destination of that instruction. The LIL indications for a line can be used as write enables for the register file.

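    The dependency check described above can be illustrated with a minimal Python sketch. The flat entry list, integer register numbers, and names such as `ReorderBuffer` and `lookup_source` are illustrative assumptions, not structures taken from the patent.

```python
# Minimal sketch of the last-in-buffer (LIB) dependency check described above.
# The flat entry list, integer register numbers, and method names are
# illustrative assumptions, not structures taken from the patent.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RobEntry:
    tag: int                 # reorder buffer tag of the instruction
    dest: Optional[int]      # destination register number, or None
    lib: bool = False        # True if last in buffer to update `dest`


class ReorderBuffer:
    def __init__(self) -> None:
        self.entries: List[RobEntry] = []
        self.next_tag = 0

    def allocate(self, dest: Optional[int]) -> int:
        """Allocate an entry; the newest writer of `dest` takes over the LIB bit."""
        tag = self.next_tag
        self.next_tag += 1
        if dest is not None:
            for e in self.entries:
                if e.dest == dest:
                    e.lib = False          # an older writer is no longer last
        self.entries.append(RobEntry(tag, dest, lib=dest is not None))
        return tag

    def lookup_source(self, src: int) -> Optional[int]:
        """Return the tag to forward for source register `src`, or None.

        Because only one entry per register carries LIB=1, at most one of the
        comparisons can indicate a dependency, as the abstract notes.
        """
        for e in self.entries:
            if e.dest == src and e.lib:
                return e.tag
        return None


if __name__ == "__main__":
    rob = ReorderBuffer()
    rob.allocate(dest=3)            # older writer of r3
    newest = rob.allocate(dest=3)   # newest writer of r3 receives LIB=1
    assert rob.lookup_source(3) == newest
```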

    Superscalar microprocessor configured to predict return addresses from a return stack storage
    92.
    Granted invention patent
    Status: In force

    Publication No.: US06269436B1

    Publication Date: 2001-07-31

    Application No.: US09392300

    Filing Date: 1999-09-08

    Abstract: A microprocessor is provided which is configured to predict return addresses for return instructions according to a return stack storage included therein. The return stack storage is a stack structure configured to store return addresses associated with previously detected call instructions. Return addresses may be predicted for return instructions early in the instruction processing pipeline of the microprocessor. In one embodiment, the return stack storage additionally stores a call tag and a return tag with each return address. The call tag and return tag respectively identify call and return instructions associated with the return address. These tags may be compared to a branch tag conveyed to the return prediction unit upon detection of a branch misprediction. The results of the comparisons may be used to adjust the contents of the return stack storage with respect to the misprediction. The microprocessor may continue to predict return addresses correctly following a mispredicted branch instruction.

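    As a rough illustration of the tagged return stack and its misprediction repair, the Python below keeps a call tag and a return tag per entry. Treating tags as monotonically increasing ages and the exact drop/restore policy are assumptions, not the patent's precise scheme.

```python
# Rough model of a return stack whose entries carry a call tag and a return
# tag for misprediction repair.  Treating tags as monotonically increasing
# ages and the exact drop/restore policy are assumptions for illustration.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RasEntry:
    return_address: int
    call_tag: int                     # tag of the call that pushed this entry
    return_tag: Optional[int] = None  # tag of the return that consumed it, if any


class ReturnStack:
    def __init__(self) -> None:
        self.entries: List[RasEntry] = []

    def push_call(self, return_address: int, call_tag: int) -> None:
        self.entries.append(RasEntry(return_address, call_tag))

    def predict_return(self, return_tag: int) -> Optional[int]:
        """Predict from the youngest entry not yet consumed, and mark it."""
        for e in reversed(self.entries):
            if e.return_tag is None:
                e.return_tag = return_tag
                return e.return_address
        return None

    def recover(self, mispredicted_tag: int) -> None:
        """Undo the effect of calls and returns younger than the bad branch."""
        # Calls dispatched after the mispredicted branch never happened.
        self.entries = [e for e in self.entries if e.call_tag <= mispredicted_tag]
        # Returns predicted after the mispredicted branch must be "un-popped".
        for e in self.entries:
            if e.return_tag is not None and e.return_tag > mispredicted_tag:
                e.return_tag = None
```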

    Three state branch history using one bit in a branch prediction mechanism
    93.
    Granted invention patent
    Status: In force

    Publication No.: US06253316B1

    Publication Date: 2001-06-26

    Application No.: US09438963

    Filing Date: 1999-11-12

    CPC classification number: G06F9/3806 G06F9/30054 G06F9/3844

    Abstract: A branch prediction unit stores a set of branch prediction history bits and branch selectors corresponding to each of a group of contiguous instruction bytes stored in an instruction cache. While only one bit is used to represent branch prediction history, three distinct states are represented in conjunction with the absence of a branch prediction. This provides for the storage of fewer bits, while maintaining a high degree of branch prediction accuracy. Each branch selector identifies the branch prediction to be selected if a fetch address corresponding to that branch selector is presented. In order to minimize the number of branch selectors stored for a group of contiguous instruction bytes, the group is divided into multiple byte ranges. The largest byte range may include a number of bytes comprising the shortest branch instruction in the instruction set (exclusive of the return instruction). For example, the shortest branch instruction may be two bytes in one embodiment. Therefore, the largest byte range is two bytes in the example. Since the branch selectors as a group change value (i.e. indicate a different branch instruction) only at the end byte of a predicted-taken branch instruction, fewer branch selectors may be stored than the number of bytes within the group.

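    A toy model of how one stored bit plus the absence of a prediction can encode three states is sketched below. The update policy (allocate weak on a taken branch, demote strong to weak, drop a weak entry on not-taken) is an assumption rather than the patent's exact scheme.

```python
# Toy model of three states from one stored bit: no stored prediction is one
# state (predict not-taken), and a stored entry's single history bit supplies
# the other two.  The update policy below is an assumption for illustration.

from typing import Dict

predictions: Dict[int, int] = {}   # fetch address -> one-bit history (0 weak, 1 strong)


def predict_taken(addr: int) -> bool:
    """Predict taken iff a prediction entry exists for this address."""
    return addr in predictions


def update(addr: int, taken: bool) -> None:
    bit = predictions.get(addr)
    if taken:
        # Allocate in the weak state, or strengthen an existing entry.
        predictions[addr] = 0 if bit is None else 1
    elif bit == 1:
        predictions[addr] = 0      # strongly taken -> weakly taken
    elif bit == 0:
        del predictions[addr]      # weakly taken -> no prediction (not-taken)
```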

    Dependency table for reducing dependency checking hardware
    94.
    Granted invention patent
    Status: In force

    Publication No.: US06249862B1

    Publication Date: 2001-06-19

    Application No.: US09715467

    Filing Date: 2000-11-15

    Abstract: A dependency table stores a reorder buffer tag for each register. The stored reorder buffer tag corresponds to the last of the instructions within the reorder buffer (in program order) to update the register. Otherwise, the dependency table indicates that the value stored in the register is valid. When operand fetch is performed for a set of concurrently decoded instructions, dependency checking is performed including checking for dependencies between the set of concurrently decoded instructions as well as accessing the dependency table to select the reorder buffer tag stored therein. Either the reorder buffer tag of one of the concurrently decoded instructions, the reorder buffer tag stored in the dependency table, the instruction result corresponding to the stored reorder buffer tag, or the value from the register itself is forwarded as the source operand for the instruction. Information from the comparators and the information stored in the dependency table is sufficient to select which value is forwarded. Additionally, the dependency table stores the width of the register being updated. Prior to forwarding the reorder buffer tag stored within the dependency table, the width stored therein is compared to the width of the source operand being requested. If a narrow-to-wide dependency is detected the instruction is stalled until the instruction indicated in the dependency table retires. Still further, the dependency table recovers from branch mispredictions and exceptions by redispatching the instructions into the dependency table.

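    The per-register lookup can be sketched as follows; the entry fields, the bit-width encoding, and the stall signalling are illustrative assumptions, not the patent's implementation.

```python
# Sketch of a per-register dependency table holding either "register value is
# valid" or the reorder buffer tag and write width of the newest in-flight
# writer.  The field layout and the stall signalling are assumptions.

from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class DepEntry:
    rob_tag: Optional[int] = None   # None -> the register file value is valid
    width: int = 0                  # bits the pending writer will update


class DependencyTable:
    def __init__(self, num_regs: int) -> None:
        self.table: Dict[int, DepEntry] = {r: DepEntry() for r in range(num_regs)}

    def record_write(self, reg: int, rob_tag: int, width: int) -> None:
        """At dispatch the newest writer of `reg` takes over the entry."""
        self.table[reg] = DepEntry(rob_tag, width)

    def fetch_operand(self, reg: int, width: int) -> Tuple[str, Optional[int]]:
        """('value', None) if the register file holds the operand,
        ('tag', t) if it must be forwarded from the reorder buffer, or
        ('stall', t) for a narrow-to-wide dependency."""
        e = self.table[reg]
        if e.rob_tag is None:
            return ("value", None)
        if width > e.width:             # more bits requested than the pending write supplies
            return ("stall", e.rob_tag)
        return ("tag", e.rob_tag)

    def retire(self, reg: int, rob_tag: int) -> None:
        """Clear the entry only if the retiring instruction is still the newest writer."""
        if self.table[reg].rob_tag == rob_tag:
            self.table[reg] = DepEntry()
```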

    Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions
    95.
    Granted invention patent
    Status: In force

    Publication No.: US6141748A

    Publication Date: 2000-10-31

    Application No.: 366809

    Filing Date: 1999-08-04

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F9/30054 G06F9/3806 G06F9/3844

    Abstract: A branch prediction unit stores a set of branch selectors corresponding to each of a group of contiguous instruction bytes stored in an instruction cache. Each branch selector identifies the branch prediction to be selected if a fetch address corresponding to that branch selector is presented. In order to minimize the number of branch selectors stored for a group of contiguous instruction bytes, the group is divided into multiple byte ranges. The largest byte range may include a number of bytes comprising the shortest branch instruction in the instruction set (exclusive of the return instruction). For example, the shortest branch instruction may be two bytes in one embodiment. Therefore, the largest byte range is two bytes in the example. Since the branch selectors as a group change value (i.e. indicate a different branch instruction) only at the end byte of a predicted-taken branch instruction, fewer branch selectors may be stored than the number of bytes within the group.

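    A small sketch of byte-range branch selectors is given below, assuming a 16-byte fetch group, 2-byte ranges (matching the abstract's example), and a selector value of 0 meaning "sequential". The encodings are assumptions, not a definitive implementation.

```python
# Sketch of byte-range branch selectors for a 16-byte fetch group split into
# 2-byte ranges.  Selector value 0 stands for "sequential"; values 1..N pick
# one of the group's stored predictions.  The encodings are assumptions.

GROUP_BYTES = 16
RANGE_BYTES = 2                       # shortest branch instruction in the example
NUM_RANGES = GROUP_BYTES // RANGE_BYTES


def select_prediction(selectors, fetch_offset, predictions):
    """Return the predicted target for a fetch at `fetch_offset`, or None for sequential."""
    sel = selectors[fetch_offset // RANGE_BYTES]
    return predictions.get(sel) if sel != 0 else None


if __name__ == "__main__":
    # One predicted-taken branch ending at byte 9: ranges covering bytes 0-9
    # use selector 1; ranges after the branch fall back to sequential (0).
    selectors = [1, 1, 1, 1, 1, 0, 0, 0]
    predictions = {1: 0x4000}
    assert select_prediction(selectors, 4, predictions) == 0x4000
    assert select_prediction(selectors, 12, predictions) is None
```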

    Reorder buffer employed in a microprocessor to store instruction results having a plurality of entries predetermined to correspond to a plurality of functional units
    96.
    Granted invention patent
    Status: In force

    Publication No.: US6134651A

    Publication Date: 2000-10-17

    Application No.: 458816

    Filing Date: 1999-12-10

    Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor. The reorder buffer tag (or instruction result, if the instruction has executed) of the last instruction in program order to update the register is stored in the future file. The reorder buffer provides the value (either reorder buffer tag or instruction result) stored in the storage location corresponding to a register when the register is used as a source operand for another instruction. Another advantage of the future file for microprocessors which allow access and update to portions of registers is that narrow-to-wide dependencies are resolved upon completion of the instruction which updates the narrower register.

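    The future-file behaviour can be sketched as below; the slot layout and method names are illustrative, and retirement and misprediction recovery are omitted.

```python
# Sketch of a future file: one slot per architectural register holding either
# the most recent value or the reorder buffer tag of the newest in-flight
# writer.  Retirement and misprediction recovery are omitted; names are
# illustrative.

from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FutureSlot:
    value: int = 0
    rob_tag: Optional[int] = None    # None -> `value` holds the register's value


class FutureFile:
    def __init__(self, num_regs: int) -> None:
        self.slots: Dict[int, FutureSlot] = {r: FutureSlot() for r in range(num_regs)}

    def rename(self, dest: int, rob_tag: int) -> None:
        """At dispatch the newest writer's tag replaces any older tag or value."""
        self.slots[dest].rob_tag = rob_tag

    def write_result(self, dest: int, rob_tag: int, value: int) -> None:
        """At writeback the result replaces the tag if this is still the newest writer."""
        if self.slots[dest].rob_tag == rob_tag:
            self.slots[dest].value = value
            self.slots[dest].rob_tag = None

    def read_source(self, src: int) -> Tuple[str, int]:
        """('tag', t) while the operand is still in flight, else ('value', v)."""
        s = self.slots[src]
        return ("tag", s.rob_tag) if s.rob_tag is not None else ("value", s.value)
```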

    Way prediction logic for cache array
    97.

    Publication No.: US6115792A

    Publication Date: 2000-09-05

    Application No.: 436906

    Filing Date: 1999-11-09

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: A set-associative cache memory configured to use multiple portions of a requested address in parallel to quickly access data from a data array based upon stored way predictions. The cache memory comprises a plurality of memory locations, a plurality of storage locations configured to store way predictions, a decoder, a plurality of pass transistors, and a sense amp unit. A subset of the storage locations is selected according to a first portion of a requested address. The decoder is configured to receive and decode a second portion of the requested address. The decoded portion of the address is used to select a particular subset of the data array based upon the way predictions stored within the selected subset of storage locations. The pass transistors are configured to select a second subset of the data array according to a third portion of the requested address. The sense amp unit then reads a cache line from the intersection of the first subset and second subset within the data array.
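
    A software analogue of the way-predicted lookup is sketched below; the set/way counts, the address split, and the fall-back probe of the remaining ways are assumptions used only to illustrate the fast-path/slow-path split.

```python
# Software analogue of a way-predicted lookup: a per-set predictor names the
# way to read first; the tag check either confirms it or falls back to probing
# the remaining ways and retraining.  Sizes and the address split are assumptions.

NUM_SETS = 64
NUM_WAYS = 4
LINE_BYTES = 32


class WayPredictedCache:
    def __init__(self) -> None:
        self.tags = [[None] * NUM_WAYS for _ in range(NUM_SETS)]  # stored tags per set/way
        self.way_prediction = [0] * NUM_SETS                      # predicted way per set

    def _index_and_tag(self, addr: int):
        index = (addr // LINE_BYTES) % NUM_SETS
        tag = addr // (LINE_BYTES * NUM_SETS)
        return index, tag

    def lookup(self, addr: int):
        """Return (hit, way, fast); `fast` means the predicted way was correct."""
        index, tag = self._index_and_tag(addr)
        predicted = self.way_prediction[index]
        if self.tags[index][predicted] == tag:
            return True, predicted, True           # fast path: single way read
        for way in range(NUM_WAYS):                # slow path: probe the other ways
            if self.tags[index][way] == tag:
                self.way_prediction[index] = way   # retrain the predictor
                return True, way, False
        return False, None, False
```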

    Pipelined instruction cache and branch prediction mechanism therefor
    98.
    Granted invention patent
    Status: Expired

    Publication No.: US6101577A

    Publication Date: 2000-08-08

    Application No.: 929767

    Filing Date: 1997-09-15

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    Abstract: A microprocessor includes an instruction cache having a cache access time greater than the clock cycle time employed by the microprocessor. The instruction cache is banked, and access to alternate banks is pipelined. The microprocessor also includes a branch prediction unit. The branch prediction unit provides a branch prediction in response to each fetch address. The branch prediction predicts a non-consecutive instruction block within the instruction stream being executed by the microprocessor. Access to the consecutive instruction block is initiated prior to completing access to a current instruction block. Therefore, a branch prediction for the consecutive instruction block is produced as a result of fetching a prior instruction block. A branch prediction produced as a result of fetching the current instruction block predicts the non-consecutive instruction block, and the fetch address of the non-consecutive instruction block is provided to the instruction cache access pipeline.

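    The pipelined, banked fetch pattern can be illustrated with the toy model below, which assumes a two-deep fetch pipeline, 16-byte blocks, and bank selection by one block-address bit; the prediction that completes with block N steers the fetch issued two blocks later.

```python
# Toy model of the pipelined, banked fetch pattern: each access takes two
# cycles, banks are selected by one block-address bit, and the prediction that
# completes with block N steers the fetch issued two blocks later.  The block
# size, bank-select bit, and two-deep pipeline are assumptions.

BLOCK_BYTES = 16


def bank_of(addr: int) -> int:
    return (addr // BLOCK_BYTES) & 1              # alternate banks by block address


def fetch_stream(start: int, predictions: dict, n_blocks: int):
    """Yield (cycle, bank, fetch_addr).  `predictions` maps a block address to
    the predicted next non-consecutive block (e.g. a taken-branch target)."""
    pending = [start, start + BLOCK_BYTES]        # two accesses already in flight
    for cycle in range(n_blocks):
        addr = pending.pop(0)
        yield cycle, bank_of(addr), addr
        # The prediction produced by this access chooses the block fetched
        # after the one already in flight (consecutive unless predicted taken).
        pending.append(predictions.get(addr, pending[-1] + BLOCK_BYTES))


if __name__ == "__main__":
    # A predicted-taken branch in block 0x1010 redirects the stream to 0x4000.
    for cycle, bank, addr in fetch_stream(0x1000, {0x1010: 0x4000}, 5):
        print(cycle, bank, hex(addr))             # 0x1000, 0x1010, 0x1020, 0x4000, 0x4010
```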

    Microprocessor including virtual address branch prediction and current page register to provide page portion of virtual and physical fetch address
    99.
    Granted invention patent
    Status: Expired

    Publication No.: US6079005A

    Publication Date: 2000-06-20

    Application No.: 975224

    Filing Date: 1997-11-20

    CPC classification number: G06F9/3806 G06F12/1054 G06F9/30058

    Abstract: A microprocessor employs a branch prediction unit including a branch prediction storage which stores the index portion of branch target addresses and an instruction cache which is virtually indexed and physically tagged. The branch target index (if predicted-taken), or the sequential index (if predicted not-taken) is provided as the index to the instruction cache. The selected physical tag is provided to a reverse translation lookaside buffer (TLB) which translates the physical tag to a virtual page number. Concatenating the virtual page number to the virtual index from the instruction cache (and the offset portion, generated from the branch prediction) results in the branch target address being generated. In one embodiment, a current page register stores the most recently translated virtual page number and the corresponding real page number. The branch prediction unit predicts that each fetch address will continue to reside in the current page and uses the virtual page number from the current page to form the branch target address. The physical tag from the fetched cache line is compared to the corresponding real page number to verify that the fetch address is actually still within the current page. When a mismatch is detected between the corresponding real page number and the physical tag from the fetched cache line, the branch target address is corrected with the linear page number provided by the reverse TLB and the current page register is updated.

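    The current-page speculation and reverse-TLB correction can be sketched as below; the 4 KB page split and the dictionary standing in for the reverse TLB are illustrative assumptions.

```python
# Sketch of forming a branch target from a predicted index plus the current
# page register, then checking the fetched line's physical tag and correcting
# through a reverse TLB on a page change.  The 4 KB page split and the dict
# standing in for the reverse TLB are assumptions.

PAGE_BITS = 12                        # 4 KB pages: low 12 bits stay within the page


def form_target(predicted_index: int, virtual_page: int) -> int:
    """Concatenate a virtual page number with the predicted in-page index."""
    return (virtual_page << PAGE_BITS) | predicted_index


def verify_and_correct(predicted_index, current_page, fetched_physical_tag, reverse_tlb):
    """`current_page` is (virtual_page, real_page); returns (target, new_current_page)."""
    virtual_page, real_page = current_page
    if fetched_physical_tag == real_page:
        # The fetch really was in the current page: the speculative target stands.
        return form_target(predicted_index, virtual_page), current_page
    # Mismatch: recover the virtual page from the physical tag and correct the target.
    corrected_page = reverse_tlb[fetched_physical_tag]
    return (form_target(predicted_index, corrected_page),
            (corrected_page, fetched_physical_tag))


if __name__ == "__main__":
    reverse_tlb = {0x0ABC: 0x7FF1}            # physical page -> virtual page
    target, page = verify_and_correct(0x230, (0x12345, 0x0055), 0x0ABC, reverse_tlb)
    assert target == (0x7FF1 << PAGE_BITS) | 0x230
    assert page == (0x7FF1, 0x0ABC)
```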

    Superscalar microprocessor configured to predict return addresses from a return stack storage
    100.
    Granted invention patent
    Status: In force

    Publication No.: US6014734A

    Publication Date: 2000-01-11

    Application No.: 153770

    Filing Date: 1998-09-15

    Abstract: A microprocessor is provided which is configured to predict return addresses for return instructions according to a return stack storage included therein. The return stack storage is a stack structure configured to store return addresses associated with previously detected call instructions. Return addresses may be predicted for return instructions early in the instruction processing pipeline of the microprocessor. In one embodiment, the return stack storage additionally stores a call tag and a return tag with each return address. The call tag and return tag respectively identify call and return instructions associated with the return address. These tags may be compared to a branch tag conveyed to the return prediction unit upon detection of a branch misprediction. The results of the comparisons may be used to adjust the contents of the return stack storage with respect to the misprediction. The microprocessor may continue to predict return addresses correctly following a mispredicted branch instruction.

