Method and apparatus for synchronizing parallel pipelines in a superscalar microprocessor
    12.
    Granted patent
    Legal status: expired

    Publication No.: US06385719B1

    Publication date: 2002-05-07

    Application No.: US09345719

    Filing date: 1999-06-30

    IPC classification: G06F9/38

    Abstract: A transfer tag is generated by the Instruction Fetch Unit and passed to the decode unit in the instruction pipeline with each group of instructions fetched during a branch prediction by a fetcher. Individual instructions within the fetched group for the branch pipeline are assigned a concatenated version (group tag concatenated with instruction lane) of the transfer tag which is used to match on requests to flush any newer instructions. All potential instruction or Internal Operation latches in the decode pipeline must perform a match and, if a match is encountered, all valid bits associated with newer instructions or internal operations upstream from the match are cleared. The transfer tag representing the next instruction to be processed in the branch pipeline is passed to the Instruction Dispatch Unit. The Instruction Dispatch Unit queries the branch pipeline to compare its transfer tag with transfer tags of instructions in the branch pipeline. If the transfer tag matches a branch instruction tag, the Instruction Decode Unit is stalled until the branch instruction is processed, thus providing a synchronizing method for the parallel pipelines.
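The tag-match flush described in this abstract can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: the 2-bit lane width, the `Latch` record, and the oldest-first list ordering are assumptions, and tag wrap-around is ignored.

```python
from dataclasses import dataclass
from typing import List

LANE_BITS = 2  # assumption: at most 4 instructions per fetched group


def make_tag(group_tag: int, lane: int) -> int:
    """Concatenate the group tag with the instruction lane."""
    if not 0 <= lane < (1 << LANE_BITS):
        raise ValueError("lane out of range")
    return (group_tag << LANE_BITS) | lane


@dataclass
class Latch:
    """Hypothetical decode-pipeline latch holding one instruction."""
    tag: int
    valid: bool = True


def flush_newer(latches: List[Latch], flush_tag: int) -> bool:
    """Clear the valid bit of every latch newer than the matching one.

    `latches` is assumed to be ordered oldest first. Returns True when
    a valid latch matched the flush tag.
    """
    for i, latch in enumerate(latches):
        if latch.valid and latch.tag == flush_tag:
            for newer in latches[i + 1:]:
                newer.valid = False
            return True
    return False
```

Every latch performs the same comparison; only the latches positioned after the match lose their valid bit, so the matching instruction itself survives the flush.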

    Method and system for optimizing the fetching of dispatch groups in a superscalar processor
    13.
    Granted patent
    Legal status: in force

    Publication No.: US06286094B1

    Publication date: 2001-09-04

    Application No.: US09263663

    Filing date: 1999-03-05

    IPC classification: G06F9/30

    Abstract: A method and system for determining if a dispatch slot is required in a processing system is disclosed. The method and system comprise a plurality of predecode bits to provide routing information and utilize the predecode bits to allow instructions to be directed to specific decode slots and to obey dispatch constraints without examining the instructions. The purpose of this predecode encoding scheme is to provide the most information possible about the grouping of the instructions without increasing the complexity of the logic which uses this information for decode and group formation. In a preferred embodiment, predecode bits for each instruction that may be issued in parallel are analyzed and the multiplexer controls are retained for each of the possible starting positions within the stream of instructions.

    Support for out-of-order execution of loads and stores in a processor
    14.
    Granted patent
    Legal status: expired

    Publication No.: US5931957A

    Publication date: 1999-08-03

    Application No.: US829669

    Filing date: 1997-03-31

    Abstract: To support load instructions which execute out-of-order with respect to store instructions, a mechanism is implemented to detect (and correct) occurrences where a load instruction executed before a logically prior store instruction, received data for the location before the store instruction modified it, and the correct data for the load instruction included bytes from the store instruction. Additionally, to execute store instructions out-of-order with respect to load instructions, a mechanism is implemented to keep a store instruction from destroying data that will be used by a logically earlier load instruction. Further, to support load instructions that are executed out-of-order with respect to each other, a mechanism is implemented to ensure that any pair of load instructions (which access at least one byte in common) return data consistent with executing the load instructions in order.
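The first check in this abstract (a load that ran ahead of a logically earlier store to overlapping bytes) might be modeled as below. This is a sketch under stated assumptions: the `ExecutedLoad` record, the program-order sequence numbers, and the idea of returning the loads to be re-executed are all hypothetical, since the abstract does not say how detection or correction is performed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExecutedLoad:
    """Hypothetical record of a load that has already executed."""
    seq: int   # program-order sequence number (assumed; larger = younger)
    addr: int  # first byte accessed
    size: int  # number of bytes accessed


def overlaps(a_addr: int, a_size: int, b_addr: int, b_size: int) -> bool:
    """True when the two byte ranges share at least one byte."""
    return a_addr < b_addr + b_size and b_addr < a_addr + a_size


def loads_to_replay(executed: List[ExecutedLoad], store_seq: int,
                    store_addr: int, store_size: int) -> List[int]:
    """Return sequence numbers of younger loads that already executed
    and touched bytes written by the store that is executing now."""
    return [ld.seq for ld in executed
            if ld.seq > store_seq
            and overlaps(ld.addr, ld.size, store_addr, store_size)]
```

A load that is older than the store, or that touches disjoint bytes, is left alone; only a younger, overlapping load could have read stale data.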

    Systems and methods for synchronization of processing elements

    Publication No.: US11838397B2

    Publication date: 2023-12-05

    Application No.: US17350758

    Filing date: 2021-06-17

    IPC classification: H04L7/00

    Abstract: In an example, a synchronization signal can be transmitted to a plurality of synchronizers. The plurality of synchronizers can include a plurality of upstream synchronizers and a downstream synchronizer. Each synchronizer of the plurality of upstream synchronizers can be caused to count from a respective count value until a predetermined end count sequence value in response to receiving the synchronization signal. The respective count value stored at each synchronizer can be representative of a difference in time between a respective upstream synchronizer of the plurality of upstream synchronizers receiving the synchronization signal and the downstream synchronizer receiving the synchronization signal. A respective processing element of a plurality of processing elements can be caused to start a respective function or operation in response to a respective upstream synchronizer reaching the predetermined end count sequence value.
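A small simulation can illustrate why per-synchronizer count values align the start times. It assumes, purely for illustration, that each counter counts down one step per clock tick to an end value of 0 and that the stored count equals the downstream arrival time minus the synchronizer's own arrival time; the abstract does not fix the counting direction or the end value.

```python
from typing import List

END_COUNT = 0  # assumed predetermined end count value


def start_times(arrivals: List[int], downstream_arrival: int) -> List[int]:
    """Return the tick at which each processing element starts.

    `arrivals[i]` is the tick at which upstream synchronizer i receives
    the synchronization signal (hypothetical units of clock ticks).
    """
    starts = []
    for arrival in arrivals:
        if arrival > downstream_arrival:
            raise ValueError("upstream synchronizer must receive the signal first")
        count = downstream_arrival - arrival  # stored count value
        now = arrival
        while count != END_COUNT:  # one decrement per clock tick
            count -= 1
            now += 1
        starts.append(now)
    return starts
```

Under these assumptions, synchronizers that see the signal earlier hold a larger count, so every processing element starts on the tick at which the downstream synchronizer receives the signal.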

    Reducing store-hit-loads in an out-of-order processor
    16.
    Granted patent
    Legal status: in force

    Publication No.: US09069563B2

    Publication date: 2015-06-30

    Application No.: US13235174

    Filing date: 2011-09-16

    IPC classification: G06F9/312 G06F9/38

    CPC classification: G06F9/3834 G06F9/3838

    Abstract: A technique for reducing store-hit-loads in an out-of-order processor includes storing a store address of a store instruction associated with a store-hit-load (SHL) pipeline flush in an SHL entry. In response to detecting another SHL pipeline flush for the store address, a current count associated with the SHL entry is updated. In response to the current count associated with the SHL entry reaching a first terminal count, a dependency for the store instruction is created such that execution of a younger load instruction with a load address that overlaps the store address stalls until the store instruction executes.
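The counting scheme in this abstract can be sketched as a small table keyed by store address. The class name, the initial count of 0, the increment-by-one update, and the exact-match address lookup are assumptions made for illustration; the abstract only says the count is "updated" and that loads with overlapping addresses stall.

```python
class SHLTable:
    """Hypothetical store-hit-load (SHL) tracking table."""

    def __init__(self, terminal_count: int = 2):
        self.terminal_count = terminal_count  # assumed threshold
        self.counts = {}  # store address -> current count

    def record_flush(self, store_addr: int) -> None:
        """Called when an SHL pipeline flush is attributed to a store."""
        if store_addr not in self.counts:
            self.counts[store_addr] = 0   # first flush allocates the entry
        else:
            self.counts[store_addr] += 1  # later flushes update the count

    def must_stall_younger_loads(self, store_addr: int) -> bool:
        """True once the count reached the terminal count, meaning younger
        loads to this address should wait for the store to execute."""
        return self.counts.get(store_addr, 0) >= self.terminal_count
```

With the default threshold, a store must cause three flushes before the dependency is created, which keeps a single unlucky flush from permanently serializing the load behind the store.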

    Operating a stack of information in an information handling system
    17.
    Granted patent
    Legal status: in force

    Publication No.: US08943299B2

    Publication date: 2015-01-27

    Application No.: US12817609

    Filing date: 2010-06-17

    IPC classification: G06F9/30

    Abstract: A pointer is for pointing to a next-to-read location within a stack of information. For pushing information onto the stack: a value is saved of the pointer, which points to a first location within the stack as being the next-to-read location; the pointer is updated so that it points to a second location within the stack as being the next-to-read location; and the information is written for storage at the second location. For popping the information from the stack: in response to the pointer, the information is read from the second location as the next-to-read location; and the pointer is restored to equal the saved value so that it points to the first location as being the next-to-read location.
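The push and pop steps in this abstract can be modeled with a pointer plus a saved pointer value per location. Where the saved value is kept and how the second location is chosen are not stated in the abstract; the sketch below assumes the saved value is stored alongside the entry and that the second location is simply the next slot. Overflow and wrap-around are not handled.

```python
class PointerStack:
    """Hypothetical stack driven by a next-to-read pointer."""

    def __init__(self, size: int):
        self.size = size
        self.info = [None] * size       # stored information
        self.saved_ptr = [None] * size  # pointer value saved at push time
        self.ptr = 0                    # next-to-read location

    def push(self, item) -> None:
        saved = self.ptr                       # save pointer (first location)
        self.ptr = (self.ptr + 1) % self.size  # point to the second location
        self.info[self.ptr] = item             # write at the second location
        self.saved_ptr[self.ptr] = saved

    def pop(self):
        if self.saved_ptr[self.ptr] is None:
            raise IndexError("pop from empty stack")
        item = self.info[self.ptr]            # read next-to-read location
        self.ptr = self.saved_ptr[self.ptr]   # restore the saved value
        return item
```

Restoring a saved pointer, rather than decrementing, means a pop returns to whichever location was current at push time even if the locations are not adjacent.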

    REDUCING STORE-HIT-LOADS IN AN OUT-OF-ORDER PROCESSOR
    18.
    Patent application
    Legal status: in force

    Publication No.: US20130073833A1

    Publication date: 2013-03-21

    Application No.: US13235174

    Filing date: 2011-09-16

    IPC classification: G06F9/38 G06F9/312

    CPC classification: G06F9/3834 G06F9/3838

    Abstract: A technique for reducing store-hit-loads in an out-of-order processor includes storing a store address of a store instruction associated with a store-hit-load (SHL) pipeline flush in an SHL entry. In response to detecting another SHL pipeline flush for the store address, a current count associated with the SHL entry is updated. In response to the current count associated with the SHL entry reaching a first terminal count, a dependency for the store instruction is created such that execution of a younger load instruction with a load address that overlaps the store address stalls until the store instruction executes.

    DETERMINING EACH STALL REASON FOR EACH STALLED INSTRUCTION WITHIN A GROUP OF INSTRUCTIONS DURING A PIPELINE STALL
    19.
    Patent application
    Legal status: expired

    Publication No.: US20120278595A1

    Publication date: 2012-11-01

    Application No.: US13097284

    Filing date: 2011-04-29

    IPC classification: G06F9/38

    Abstract: During a pipeline stall in an out-of-order processor, until a next-to-complete instruction group completes, a monitoring unit receives, from a completion unit of a processor, a next-to-finish indicator indicating the finish of an oldest previously unfinished instruction from among a plurality of instructions of a next-to-complete instruction group. The monitoring unit receives, from a plurality of functional units of the processor, a plurality of finish reports including completion reasons for a plurality of separate instructions. The monitoring unit determines at least one stall reason from among multiple stall reasons for the oldest instruction from a selection of completion reasons from a selection of finish reports aligned with the next-to-finish indicator from among the plurality of finish reports. Once the monitoring unit receives a complete indicator from the completion unit, indicating the completion of the next-to-complete instruction group, the monitoring unit stores each determined stall reason aligned with each next-to-finish indicator in memory.

    Storing branch information in an address table of a processor
    20.
    Granted patent
    Legal status: in force

    Publication No.: US07984280B2

    Publication date: 2011-07-19

    Application No.: US12171370

    Filing date: 2008-07-11

    IPC classification: G06F9/00

    CPC classification: G06F9/3806

    Abstract: Methods for storing branch information in an address table of a processor are disclosed. A processor of the disclosed embodiments may generally include an instruction fetch unit connected to an instruction cache, a branch execution unit, and an address table being connected to the instruction fetch unit and the branch execution unit. The address table may generally be adapted to store a plurality of entries with each entry of the address table being adapted to store a base address and a base instruction tag. In a further embodiment, the branch execution unit may be adapted to determine the address of a branch instruction having an instruction tag based on the base address and the base instruction tag of an entry of the address table associated with the instruction tag. In some embodiments, the address table may further be adapted to store branch information.
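One plausible way to derive a branch address from a base address and a base instruction tag is a simple offset calculation. The sketch below assumes fixed 4-byte instructions and instruction tags that increase by one per sequential instruction within an entry; neither assumption is stated in the abstract, and the names are hypothetical.

```python
from dataclasses import dataclass

INSTR_BYTES = 4  # assumption: fixed-width 4-byte instructions


@dataclass
class AddressTableEntry:
    """Hypothetical address-table entry."""
    base_address: int  # address of the first instruction covered
    base_itag: int     # instruction tag of that first instruction


def branch_address(entry: AddressTableEntry, itag: int) -> int:
    """Address of the instruction carrying `itag`, relative to the entry."""
    if itag < entry.base_itag:
        raise ValueError("instruction tag precedes the entry's base tag")
    return entry.base_address + (itag - entry.base_itag) * INSTR_BYTES
```

For example, with a base address of 0x1000 and a base tag of 40, the instruction tagged 43 sits three instructions later, at 0x100C.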
