Microprocessor configured to swap operands in order to minimize
dependency checking logic
    1.
    Granted invention patent; status: expired

    Publication number: US5835744A

    Publication date: 1998-11-10

    Application number: US561030

    Filing date: 1995-11-20

    IPC classification: G06F9/30 G06F9/38

    Abstract: A microprocessor is provided which is configured to locate memory and register operands regardless of their use as an A operand or B operand in an instruction. Memory operands are conveyed upon a memory operand bus, and register operands are conveyed upon a register operand bus. Decoding of the source and destination status of the operands may be performed in parallel with the operand fetch. Restricting memory operands to a memory operand bus enables reduced bussing between decode units and the operand fetch unit. After fetching operand values from an operand storage, the operand fetch unit reorders the operand values according to the instruction determined by the associated decode unit. The operand values are thereby properly aligned for conveyance to the associated reservation station.
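
The reordering step described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the names `FetchedOperands` and `align_operands` and the boolean `memory_is_a` are hypothetical, standing in for whatever the decode unit actually reports about which operand is the A operand.

```python
from typing import NamedTuple, Tuple


class FetchedOperands(NamedTuple):
    """Values as they arrive from the two buses (hypothetical model)."""
    register_value: int  # conveyed on the register operand bus
    memory_value: int    # conveyed on the memory operand bus


def align_operands(fetched: FetchedOperands, memory_is_a: bool) -> Tuple[int, int]:
    """Return (A, B) in the order the decoded instruction expects.

    memory_is_a is assumed to come from the decode unit, which works out
    the operand roles in parallel with the fetch.
    """
    if memory_is_a:
        return fetched.memory_value, fetched.register_value
    return fetched.register_value, fetched.memory_value
```

The point of the sketch is that the fetch itself never needs to know the A/B roles; only the final swap does.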

    Method for transferring data between a pair of caches configured to be
accessed from different stages of an instruction processing pipeline
    2.
    Granted invention patent; status: expired

    Publication number: US5903910A

    Publication date: 1999-05-11

    Application number: US561073

    Filing date: 1995-11-20

    Abstract: A microprocessor including a pair of caches is provided. One of the pair of caches is accessed by stack-relative memory accesses from the decode stage of the instruction processing pipeline. The second of the pair of caches is accessed by memory accesses from the execute stage of the instruction processing pipeline. When a miss is detected in the first of the pair of caches, the stack-relative memory access which misses is conveyed to the execute stage of the instruction processing pipeline. When the stack-relative memory access accesses the second of the pair of caches, the cache line containing the access is transmitted to the first of the pair of caches for storage. The first of the pair of caches selects a victim line for replacement when the data is transferred from the second of the pair of caches. If the victim line has been modified while stored in the first cache, then the victim line is stored in a copyback buffer. A signal is asserted by the first cache to inform the second cache of the need to perform a victim line copyback. Requests from the execute stage of the instruction processing pipeline are stalled to allow the copyback to occur.
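
The victim-line handling in this abstract can be modelled roughly as follows. The class and method names are hypothetical, and the oldest-first victim choice is an assumption made only to keep the sketch short; the abstract does not say how the victim is selected.

```python
from collections import OrderedDict


class StackCache:
    """Toy model of the first (decode-stage) cache; names are hypothetical."""

    def __init__(self, num_lines: int):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line address -> (data, dirty), oldest first
        self.copyback_buffer = []   # modified victims awaiting copyback

    def write(self, addr, data):
        """Modify a resident line and mark it dirty."""
        if addr not in self.lines:
            raise KeyError(addr)
        self.lines[addr] = (data, True)

    def fill(self, addr, data) -> bool:
        """Store a line transferred from the second cache.

        Returns True when a modified victim was moved to the copyback
        buffer, i.e. when the copyback signal would be asserted.
        """
        copyback_needed = False
        if addr not in self.lines and len(self.lines) >= self.num_lines:
            # Assumed policy: evict the oldest-filled line.
            victim_addr, (victim_data, dirty) = self.lines.popitem(last=False)
            if dirty:
                self.copyback_buffer.append((victim_addr, victim_data))
                copyback_needed = True
        self.lines[addr] = (data, False)
        return copyback_needed
```

A True return value corresponds to the case where execute-stage requests would be stalled until the copyback buffer drains.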

    Superscalar microprocessor including a load/store unit, decode units and a reorder buffer to detect dependencies between access to a stack cache and a data cache
    3.
    Granted invention patent; status: expired

    Publication number: US06192462B1

    Publication date: 2001-02-20

    Application number: US09162419

    Filing date: 1998-09-28

    IPC classification: G06F9/38

    Abstract: A superscalar microprocessor is provided which maintains coherency between a pair of caches accessed from different stages of an instruction processing pipeline. A dependency checking structure is provided within the microprocessor. The dependency checking structure compares memory accesses performed from the execution stage of the instruction processing pipeline to memory accesses performed from the decode stage. The decode stage performs memory accesses to a stack cache, while the execution stage performs its accesses (addresses for which are formed via indirect addressing) to the stack cache and to a data cache. If a read memory access performed by the execution stage is dependent upon a write memory access performed by the decode stage, the read memory access is stalled until the write memory access completes. If a read memory access performed by the decode stage is dependent upon a write memory access performed by the execution stage, then the instruction associated with the read memory access and subsequent instructions are flushed. Data coherency is maintained between the pair of caches while allowing stack-relative accesses to be performed from the decode stage. The comparator circuits used to perform the comparison are configured to compare a field of address bits instead of the entire address, reducing the size while still maintaining accurate dependency checking by qualifying the resulting comparison signals with an indication that both addresses hit in the same storage location within the stack cache.
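
A rough sketch of the qualified partial-address comparison may help here. The field position and width (`FIELD_SHIFT`, `FIELD_MASK`), the function names, and the use of an integer "location" for the stack-cache storage location are all assumptions for illustration; the abstract gives no bit widths.

```python
from typing import Optional

FIELD_SHIFT = 2    # assumed: skip the byte offset within a 4-byte word
FIELD_MASK = 0x3F  # assumed: compare only a 6-bit field of the address


def address_field(address: int) -> int:
    """Extract the small field that the narrow comparators look at."""
    return (address >> FIELD_SHIFT) & FIELD_MASK


def is_dependent(read_addr: int, write_addr: int,
                 read_location: Optional[int],
                 write_location: Optional[int]) -> bool:
    """Field match, qualified by both accesses hitting the same location.

    A location of None models a miss in the stack cache.
    """
    same_location = read_location is not None and read_location == write_location
    return same_location and address_field(read_addr) == address_field(write_addr)


def resolve(read_stage: str, dependent: bool) -> str:
    """Action described in the abstract for a dependent read."""
    if not dependent:
        return "proceed"
    # Execute-stage read waits for the decode-stage write; a decode-stage
    # read that depends on an execute-stage write is flushed.
    return "stall" if read_stage == "execute" else "flush"
```

Two addresses that differ only in their upper bits share the same field, so the field match alone would report a false dependency; the same-location qualifier is what rules it out.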

    Superscalar microprocessor including a reorder buffer which detects
dependencies between accesses to a pair of caches
    4.
    Granted invention patent; status: expired

    Publication number: US5848287A

    Publication date: 1998-12-08

    Application number: US603804

    Filing date: 1996-02-20

    IPC classification: G06F9/38

    Abstract: A superscalar microprocessor is provided which maintains coherency between a pair of caches accessed from different stages of an instruction processing pipeline. A dependency checking structure is provided within the microprocessor. The dependency checking structure compares memory accesses performed from the execution stage of the instruction processing pipeline to memory accesses performed from the decode stage. The decode stage performs memory accesses to a stack cache, while the execution stage performs its accesses (addresses for which are formed via indirect addressing) to the stack cache and to a data cache. If a read memory access performed by the execution stage is dependent upon a write memory access performed by the decode stage, the read memory access is stalled until the write memory access completes. If a read memory access performed by the decode stage is dependent upon a write memory access performed by the execution stage, then the instruction associated with the read memory access and subsequent instructions are flushed. Data coherency is maintained between the pair of caches while allowing stack-relative accesses to be performed from the decode stage. The comparator circuits used to perform the comparison are configured to compare a field of address bits instead of the entire address, reducing the size while still maintaining accurate dependency checking by qualifying the resulting comparison signals with an indication that both addresses hit in the same storage location within the stack cache.

    Reverse TLB for providing branch target address in a microprocessor
having a physically-tagged cache
    5.
    Granted invention patent; status: expired

    Publication number: US6079003A

    Publication date: 2000-06-20

    Application number: US974972

    Filing date: 1997-11-20

    Abstract: A microprocessor employs a branch prediction unit including a branch prediction storage which stores the index portion of branch target addresses and an instruction cache which is virtually indexed and physically tagged. The branch target index (if predicted-taken, or the sequential index if predicted not-taken) is provided as the index to the instruction cache. The selected physical tag is provided to a reverse translation lookaside buffer (TLB) which translates the physical tag to a virtual page number. Concatenating the virtual page number to the virtual index from the instruction cache (and the offset portion, generated from the branch prediction) results in the branch target address being generated. In one embodiment, the process of reading an index from the branch prediction storage, accessing the instruction cache, selecting the physical tag, and reverse translating the physical tag to achieve a virtual page number may require more than a clock cycle to complete. Such an embodiment may employ a current page register which stores the most recently translated virtual page number and the corresponding real page number. The branch prediction unit predicts that each fetch address will continue to reside in the current page and uses the virtual page number from the current page to form the branch target address. The physical tag from the fetched cache line is compared to the corresponding real page number to verify that the fetch address is actually still within the current page. When a mismatch is detected between the corresponding real page number and the physical tag from the fetched cache line, the branch target address is corrected with the linear page number provided by the reverse TLB and the current page register is updated.
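
The address formation and the current-page shortcut can be sketched as below. The bit widths (`OFFSET_BITS`, `INDEX_BITS`), the dictionary standing in for the reverse TLB, and all names are assumptions for illustration only; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

OFFSET_BITS = 5  # assumed: 32-byte cache lines
INDEX_BITS = 7   # assumed: index and offset together span a 4 KB page


@dataclass
class CurrentPage:
    """Most recently translated virtual page and its real page."""
    virtual_page: int
    real_page: int


def compose(virtual_page: int, index: int, offset: int) -> int:
    """Concatenate virtual page number, cache index, and offset."""
    return (virtual_page << (INDEX_BITS + OFFSET_BITS)) | (index << OFFSET_BITS) | offset


def branch_target(current: CurrentPage, reverse_tlb: Dict[int, int],
                  physical_tag: int, index: int, offset: int) -> Tuple[int, bool]:
    """Return (target address, corrected).

    The current page is predicted first; the physical tag read from the
    cache line verifies the prediction. On a mismatch the reverse TLB
    (modelled as real page -> virtual page) supplies the correct page and
    the current page register is updated.
    """
    if physical_tag == current.real_page:
        return compose(current.virtual_page, index, offset), False
    current.virtual_page = reverse_tlb[physical_tag]
    current.real_page = physical_tag
    return compose(current.virtual_page, index, offset), True
```

In this model the reverse TLB lookup is only consulted on the slow path, which mirrors the motivation given in the abstract for the current page register.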

    Reorder buffer having an improved future file for storing speculative
instruction execution results
    6.
    Granted invention patent; status: expired

    Publication number: US5946468A

    Publication date: 1999-08-31

    Application number: US974967

    Filing date: 1997-11-20

    IPC classification: G06F9/38

    Abstract: A reorder buffer for a microprocessor is provided, comprising a control unit, an instruction storage, and a future file. The future file has storage locations associated with each register implemented in the microprocessor. The future file is configured to store a reorder buffer tag that corresponds to the last instruction, in program order, stored within the instruction storage that has a destination operand corresponding to the register associated with said storage location. The future file is further configured to store instruction results. The control unit is configured to read a particular reorder buffer tag from the future file that corresponds to a completed instruction and to compare the particular reorder buffer tag with the completed instruction's result tag. If the two tags compare equal, the control unit is configured to write any result data corresponding to the completed instruction into the future file. This advantageously reduces the number of comparators needed to maintain the future file. The future file is also configured to improve branch misprediction recovery speed by examining each entry in said instruction storage for a valid destination starting with the mispredicted branch instruction. This configuration advantageously allows older instructions in the instruction storage to be retired while the future file is being recovered, thereby reducing the number of instructions the control unit must process to recover the future file.
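
The tag check on result write-back can be illustrated with a small model. The class, its method names, and the `valid` flag marking whether an entry currently holds a result are hypothetical; the abstract only states that each location stores a tag and can store a result.

```python
class FutureFile:
    """Toy future file: one entry per architectural register (hypothetical)."""

    def __init__(self, registers):
        self.entries = {r: {"tag": None, "value": 0, "valid": True}
                        for r in registers}

    def dispatch(self, dest, rob_tag):
        """Record the newest instruction, in program order, that writes dest."""
        entry = self.entries[dest]
        entry["tag"] = rob_tag
        entry["valid"] = False  # the result is not available yet

    def complete(self, dest, result_tag, value) -> bool:
        """Write a result only if it comes from the last writer of dest.

        Only the entry for the completed instruction's destination is read
        and compared, rather than comparing against every register.
        """
        entry = self.entries[dest]
        if entry["tag"] != result_tag:
            return False  # a younger instruction also writes this register
        entry["value"] = value
        entry["valid"] = True
        return True
```

For example, if two in-flight instructions both write the same register, only the younger one's result is accepted into the future file; the older result is dropped here because it would never be the speculative state of that register.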

    Branch misprediction recovery in a reorder buffer having a future file
    7.
    Granted invention patent; status: expired

    Publication number: US5915110A

    Publication date: 1999-06-22

    Application number: US975011

    Filing date: 1997-11-20

    IPC classification: G06F9/38

    Abstract: A reorder buffer for a microprocessor is provided, comprising a control unit, an instruction storage, and a future file. The future file has storage locations associated with each register implemented in the microprocessor. The future file is configured to store a reorder buffer tag that corresponds to the last instruction, in program order, stored within the instruction storage that has a destination operand corresponding to the register associated with said storage location. The future file is further configured to store instruction results. The control unit is configured to read a particular reorder buffer tag from the future file that corresponds to a completed instruction and to compare the particular reorder buffer tag with the completed instruction's result tag. If the two tags compare equal, the control unit is configured to write any result data corresponding to the completed instruction into the future file. This advantageously reduces the number of comparators needed to maintain the future file. The future file is also configured to improve branch misprediction recovery speed by examining each entry in said instruction storage for a valid destination starting with the mispredicted branch instruction. This configuration advantageously allows older instructions in the instruction storage to be retired while the future file is being recovered, thereby reducing the number of instructions the control unit must process to recover the future file.

    Computer system including a microprocessor having a reorder buffer
employing last in buffer and last in line indications
    8.
    Granted invention patent; status: expired

    Publication number: US6032251A

    Publication date: 2000-02-29

    Application number: US78213

    Filing date: 1998-05-13

    IPC classification: G06F9/38 G06F9/30

    Abstract: A computer system including a microprocessor employing a reorder buffer is provided which stores a last in buffer (LIB) indication corresponding to each instruction. The last in buffer indication indicates whether or not the corresponding instruction is last, in program order, of the instructions within the buffer to update the storage location defined as the destination of that instruction. The LIB indication is included in the dependency checking comparisons. A dependency is indicated for a given source operand and a destination operand within the reorder buffer if the operand specifiers match and the corresponding LIB indication indicates that the instruction corresponding to the destination operand is last to update the corresponding storage location. At most one of the dependency comparisons for a given source operand can indicate dependency. According to one embodiment, the reorder buffer employs a line-oriented configuration. Concurrently decoded instructions are stored into a line of storage, and the concurrently decoded instructions are retired as a unit. A last in line (LIL) indication is stored for each instruction in the line. The LIL indication indicates whether or not the instruction is last within the line storing that instruction to update the storage location defined as the destination of that instruction. The LIL indications for a line can be used as write enables for the register file.
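
The LIB mechanism can be sketched with a small model. The class and method names are hypothetical, and the buffer is modelled as a simple list with list positions used as tags; retirement and the line-oriented (LIL) variant are left out.

```python
from typing import Optional


class ReorderBuffer:
    """Toy reorder buffer that tracks a last-in-buffer (LIB) bit per entry."""

    def __init__(self):
        self.entries = []  # program order; each entry: {"dest": ..., "lib": bool}

    def dispatch(self, dest: str) -> int:
        """Add an instruction writing dest and return its tag (list position)."""
        for entry in self.entries:
            if entry["dest"] == dest:
                entry["lib"] = False  # an older writer is no longer the last one
        self.entries.append({"dest": dest, "lib": True})
        return len(self.entries) - 1

    def find_dependency(self, source: str) -> Optional[int]:
        """Tag of the instruction the source operand depends on, if any."""
        matches = [tag for tag, entry in enumerate(self.entries)
                   if entry["dest"] == source and entry["lib"]]
        # Because of the LIB bit, at most one comparison can match, so no
        # priority logic is needed to pick the newest writer.
        assert len(matches) <= 1
        return matches[0] if matches else None
```

Without the LIB bit, a source operand could match several older writers of the same register and the hardware would have to prioritize among them; with it, the match is unique by construction.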

    Update unit for providing a delayed update to a branch prediction array
    9.
    Granted invention patent; status: expired

    Publication number: US5878255A

    Publication date: 1999-03-02

    Application number: US969039

    Filing date: 1997-11-12

    IPC classification: G06F9/38 G06F9/40

    CPC classification: G06F9/3844 G06F9/3848

    Abstract: An update unit for an array in an integrated circuit is provided. The update unit delays the update of the array until a clock cycle in which the functional input to the array is idle. The input port normally used by the functional input is then used to perform the update. During clock cycles between receiving the update and storing the update into the array, the update unit compares the current functional input address to the update address. If the current functional input address matches the update address, then the update value is provided as the output of the array. Otherwise, the information stored in the indexed storage location is provided. In this manner, the update appears to have been performed in the clock cycle that the update value was received, as in a dual-ported array. A particular embodiment of the update unit is a branch prediction array update unit.
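
A behavioural sketch of the delayed update follows. It assumes a single pending update at a time and models an idle functional input as `None`; both choices, and all names, are illustrative assumptions rather than details taken from the patent.

```python
class DelayedUpdateArray:
    """Single-ported array with a one-entry delayed-update unit (toy model)."""

    def __init__(self, size: int):
        self.array = [0] * size
        self.pending = None  # (address, value) awaiting an idle cycle

    def post_update(self, address: int, value: int) -> None:
        """Accept an update; assumed: only one may be pending at a time."""
        if self.pending is not None:
            raise RuntimeError("an update is already pending")
        self.pending = (address, value)

    def cycle(self, read_address=None):
        """Simulate one clock cycle.

        read_address=None models an idle functional input: the shared port
        is then used to write the pending update into the array.
        """
        if read_address is None:
            if self.pending is not None:
                address, value = self.pending
                self.array[address] = value
                self.pending = None
            return None
        if self.pending is not None and self.pending[0] == read_address:
            return self.pending[1]  # bypass: behave as if already written
        return self.array[read_address]
```

From the reader's side the array behaves as if the write happened in the cycle the update arrived, which is the dual-ported behaviour the abstract describes.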

    Superscalar microprocessor which delays update of branch prediction
information in response to branch misprediction until a subsequent idle
clock
    10.
    Granted invention patent; status: expired

    Publication number: US5875324A

    Publication date: 1999-02-23

    Application number: US947225

    Filing date: 1997-10-08

    IPC classification: G06F9/38

    Abstract: A superscalar microprocessor employing a branch prediction array update unit is provided. The branch prediction array update unit collects the update prediction information for each branch misprediction or external fetch. When a fetch address is presented for branch prediction, the fetch address is compared to the update address stored in the update unit. If the addresses match, then the update prediction information is forwarded as the output of the array. If the addresses do not match, then the information stored in the indexed storage location is forwarded as the output of the array. When the next external fetch begins or misprediction is detected, the update is written into the branch prediction array. The update unit allows for a single-ported array implementation of the branch prediction array while still maintaining the operational aspects of the dual-ported array implementation, as well as allowing for speculative branch prediction update.
