Efficient firm consistency support mechanisms in an out-of-order
execution superscaler multiprocessor
    51.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US5699538A

    Publication date: 1997-12-16

    Application No.: US352467

    Filing date: 1994-12-09

    IPC classification: G06F9/38 G06F12/08 G06F9/312

    CPC classification: G06F9/383 G06F9/3834

    Abstract: Two processor controls are provided for supporting efficient Firm Consistency while allowing out-of-order execution of Load instructions. The Touch control operates when the processor stores a subsequent Store in a pending Store buffer while awaiting any outstanding Loads or Stores. The efficiency of the pending Store is improved by issuing a Touch of the data, which pre-loads into the cache the line of data that is the subject of the Store. The processor can execute a subsequently issued Load out of order relative to a prior Load, but only to its finished state. The subsequently issued Load is not allowed to complete until the prior Load is completed. The Finished Load Cancellation control ensures that Firm Consistency is maintained by canceling any finished Loads, and subsequent instructions, when the subject of the Load is the same as that of an invalidation request from a multiprocessor.

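The finished/completed distinction and the cancellation rule in this abstract can be illustrated with a small software model. This is only a sketch under assumptions: the class and method names (`LoadQueue`, `invalidate`, and so on) are hypothetical, cache lines are plain integers, and a cancelled Load is simply marked unfinished so that it would be re-executed.

```python
from dataclasses import dataclass


@dataclass
class Load:
    seq: int                 # position in program order
    line: int                # cache line the Load reads
    finished: bool = False   # data obtained, possibly out of order
    completed: bool = False  # committed, strictly in program order


class LoadQueue:
    """Hypothetical model: Loads may finish out of order but complete in order."""

    def __init__(self):
        self.loads = []

    def issue(self, seq, line):
        self.loads.append(Load(seq, line))

    def finish(self, seq):
        for ld in self.loads:
            if ld.seq == seq:
                ld.finished = True

    def complete_ready(self):
        """Complete Loads in program order, stopping at the first unfinished one."""
        done = []
        for ld in sorted(self.loads, key=lambda l: l.seq):
            if ld.completed:
                continue
            if not ld.finished:
                break
            ld.completed = True
            done.append(ld.seq)
        return done

    def invalidate(self, line):
        """Cancel the oldest finished, uncompleted Load to `line` and all younger work."""
        hits = [ld.seq for ld in self.loads
                if ld.finished and not ld.completed and ld.line == line]
        if not hits:
            return []
        oldest = min(hits)
        cancelled = []
        for ld in self.loads:
            if ld.seq >= oldest and not ld.completed:
                ld.finished = False   # assumption: cancelled Loads are re-executed
                cancelled.append(ld.seq)
        return sorted(cancelled)
```

In this model a younger Load that finished early stays cancellable until every older Load has completed, which is the window in which an invalidation from another processor could otherwise expose a stale value.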

    Transactional memory system which employs thread assists using address history tables
    52.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US08117403B2

    Publication date: 2012-02-14

    Application No.: US11928758

    Filing date: 2007-10-30

    IPC classification: G06F12/00 G06F13/00

    CPC classification: G06F12/0842 G06F12/0817

    Abstract: A computing system uses specialized “Set Associative Transaction Tables” and additional “Summary Transaction Tables” to speed the processing of common transactional memory conflict cases, including those which employ assist threads using an Address History Table. The system processes memory transactions with a Transaction Table in memory for parallel processing of multiple threads of execution, with support of which an application need not be aware. Special instructions may mark the boundaries of a transaction and identify memory locations applicable to a transaction. A ‘private to transaction’ (PTRAN) tag, directly addressable as part of the main data storage memory location, enables quick detection of potential conflicts with other transactions that are concurrently executing on another thread of said computing system. The tag indicates whether a data entry in memory is part of a speculative memory state of an uncommitted transaction that is currently active in the system.

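The role of the PTRAN tag as a fast conflict filter can be sketched as follows. This is a minimal model under assumptions: the tag is represented as membership in a Python set, the Transaction Table is a dictionary from a transaction id to the addresses it has tagged, and the names `TransactionalMemory`, `access`, and `commit` are hypothetical. The Set Associative and Summary Transaction Tables and the assist threads are not modelled.

```python
class TransactionalMemory:
    """Hypothetical model of PTRAN-tag conflict detection."""

    def __init__(self):
        self.ptran = set()   # addresses whose PTRAN tag bit is set
        self.tx_table = {}   # transaction id -> addresses it has tagged

    def access(self, tx, addr):
        """Return "ok" or "conflict" for transaction `tx` touching `addr`."""
        owned = self.tx_table.setdefault(tx, set())
        if addr in self.ptran:
            # Tag already set: acceptable only if this transaction set it.
            return "ok" if addr in owned else "conflict"
        # Tag clear: no other uncommitted transaction uses this location.
        self.ptran.add(addr)
        owned.add(addr)
        return "ok"

    def commit(self, tx):
        """End the transaction and clear the tags it set."""
        for addr in self.tx_table.pop(tx, set()):
            self.ptran.discard(addr)
```

The common case (tag clear) needs no table lookup at all; the Transaction Table is consulted only when the tag is already set.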

    Method, apparatus and program product for enhancing performance of an in-order processor with long stalls
    53.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US07603543B2

    Publication date: 2009-10-13

    Application No.: US11055862

    Filing date: 2005-02-11

    IPC classification: G06F9/30

    Abstract: A method, system, and computer program product are provided for enhancing performance of an in-order microprocessor with long stalls. In particular, the mechanism of the present invention provides a data structure, stored within the processor, that includes information used by the processor. The data structure includes a group of bits to keep track of which instructions preceded a rejected instruction, and therefore will be allowed to complete, and which instructions follow the rejected instruction. The group of bits comprises a bit indicating whether a reject was a fast or slow reject, and a bit for each cycle that represents a state of an instruction passing through a pipeline. The processor speculatively continues to execute a set bit's corresponding instruction during stalled periods in order to generate addresses that will be needed when the stall period ends and normal dispatch resumes.

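One loose way to picture the group of bits is a small record with a fast/slow flag and a per-stage mask. The abstract does not give the bit layout, so everything below is an assumption: stages are numbered so that a higher index is further down the pipe (older in program order), and the names `RejectRecord`, `record_reject`, and `may_complete` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RejectRecord:
    slow_reject: bool   # True = slow reject (long stall), False = fast reject
    older_mask: int     # bit i set => stage i held an instruction older than the rejected one


def record_reject(num_stages, reject_stage, slow):
    """Build the tracking bits when the instruction in `reject_stage` is rejected."""
    mask = 0
    # Assumption: stages with a higher index are older in program order.
    for stage in range(reject_stage + 1, num_stages):
        mask |= 1 << stage
    return RejectRecord(slow, mask)


def may_complete(record, stage):
    """Instructions that preceded the rejected one are allowed to complete."""
    return bool((record.older_mask >> stage) & 1)
```

For a six-stage pipe with a reject in stage 2, the mask marks stages 3 to 5 as holding older instructions; stages 0 to 2 hold the rejected instruction and its followers.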

    Data stream prefetching in a microprocessor
    54.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US07350029B2

    Publication date: 2008-03-25

    Application No.: US11054889

    Filing date: 2005-02-10

    IPC classification: G06F12/00 G06F13/00

    CPC classification: G06F12/0862 G06F2212/6028

    Abstract: A method of prefetching data in a microprocessor includes identifying a data stream associated with a process and determining a depth associated with the data stream based upon prefetch factors, including the number of currently concurrent data streams and the data consumption rates associated with the concurrent data streams. Data prefetch requests are allocated for the data stream to reflect the determined depth of the data stream. Allocating data prefetch requests may include allocating prefetch requests for a number of cache lines away from the cache line currently being referenced, wherein the number of cache lines is equal to the determined depth. The method may include, responsive to determining the depth associated with a data stream, configuring prefetch hardware to reflect the determined depth for the identified data stream. Prefetch control bits in an instruction executed by the processor control the prefetch hardware configuration.

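The abstract names the inputs to the depth decision (number of concurrent streams and their consumption rates) but not the formula. The sketch below assumes one plausible policy, labelled as such: a fixed budget of outstanding prefetch lines is split among streams in proportion to their consumption rates. The function names, the `budget` parameter, and the proportional rule are all assumptions.

```python
def stream_depths(rates, budget=16, min_depth=1):
    """Split a prefetch budget (in cache lines) among concurrent streams.

    `rates` maps a stream id to an integer consumption rate. The proportional
    split is an assumed policy, not one stated in the abstract.
    """
    total = sum(rates.values())
    depths = {}
    for stream_id, rate in rates.items():
        share = budget * rate // total if total else 0
        depths[stream_id] = max(min_depth, share)
    return depths


def prefetch_target(current_line, depth):
    """Line to prefetch: `depth` cache lines ahead of the line being referenced."""
    return current_line + depth
```

With more concurrent streams each stream's depth shrinks, and a stream that consumes data faster is allowed to run further ahead.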

    Mechanism for self-initiated instruction issuing and method therefor
    55.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US07080241B2

    Publication date: 2006-07-18

    Application No.: US09903828

    Filing date: 2001-07-11

    IPC classification: G06F9/00

    Abstract: An apparatus and method for self-initiated instruction issuing are implemented. In a central processing unit (CPU) having a pipelined architecture, instructions are queued for issuing to the execution unit which will execute them. Instructions are issued each cycle, and an instruction should be selectable for issuing as soon as its source operands are available. An instruction in the issue queue whose source operands depend on other (target) instructions for their values is linked to each target instruction by a link mask in the queue entry corresponding to that target instruction. A bit in the link mask identifies the queue entry corresponding to the dependent instruction. When the target instruction issues to the execution unit, a bit is set in a predetermined portion of the queue entry containing the dependent instruction. That portion of the queue entry is associated with the source operand that depends on the issuing instruction. This bit informs selection logic circuitry that the dependency is resolved by the issuing instruction, and the dependent instruction may be selected for issuing.

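The link-mask wakeup described in this abstract can be sketched as a small queue model. The structure is an assumption made for illustration: `link_mask[t]` is an integer whose bit `j` marks entry `j` as dependent on entry `t`, `ready[j]` holds one resolved bit per source operand, and the names `IssueQueue`, `insert`, `selectable`, and `issue` are hypothetical.

```python
class IssueQueue:
    """Hypothetical model of self-initiated issue via per-entry link masks."""

    def __init__(self, size):
        self.size = size
        self.link_mask = [0] * size    # per entry: bit j set => entry j depends on it
        self.ready = [None] * size     # per entry: resolved bit for each source operand
        self.source = [None] * size    # per entry: operand index -> producing entry

    def insert(self, entry, num_operands, deps):
        """Queue an instruction; `deps` maps operand index -> target entry."""
        self.ready[entry] = [i not in deps for i in range(num_operands)]
        self.source[entry] = dict(deps)
        for target in deps.values():
            self.link_mask[target] |= 1 << entry

    def selectable(self, entry):
        return self.ready[entry] is not None and all(self.ready[entry])

    def issue(self, entry):
        """Issue `entry` and set the resolved bit in every dependent entry."""
        if not self.selectable(entry):
            raise ValueError("entry is not ready to issue")
        for dep in range(self.size):
            if (self.link_mask[entry] >> dep) & 1:
                for operand, target in self.source[dep].items():
                    if target == entry:
                        self.ready[dep][operand] = True
        self.link_mask[entry] = 0
        self.ready[entry] = None       # entry leaves the queue
```

The issuing instruction itself notifies its dependents, so the selection logic only has to check the resolved bits of each entry rather than compare register tags.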

    Method and system for dynamically shared completion table supporting multiple threads in a processing system
    56.
    Type: granted invention patent
    Status: in force

    Publication No.: US06721874B1

    Publication date: 2004-04-13

    Application No.: US09687078

    Filing date: 2000-10-12

    IPC classification: G06F9/38

    Abstract: A method and system for utilizing a completion table in a superscalar processor are disclosed. The method and system comprise providing a plurality of threads to the processor and associating a link list with each of the threads, wherein each entry associated with a thread is linked to a next entry. The completion table is implemented as link lists: each entry in the completion table for a thread is linked to the next entry via a pointer that is stored in a link list. In a second aspect, a method of determining the relative order between instructions is provided, in which a flush mask array is accessed to determine the relative order of entries in the completion table. A restore head pointer table is also implemented to save and restore the state of the pointer of the completion table.

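A completion table shared dynamically among threads through per-thread link lists might look like the sketch below. It covers only the first aspect of the abstract; the flush mask array and the restore head pointer table are not modelled. The names `CompletionTable`, `allocate`, `complete_oldest`, and `order`, and the use of a simple free list, are assumptions.

```python
class CompletionTable:
    """Hypothetical shared completion table with one link list per thread."""

    def __init__(self, size):
        self.next = [None] * size     # link pointer stored for each entry
        self.free = list(range(size)) # entries not owned by any thread
        self.head = {}                # thread -> oldest entry
        self.tail = {}                # thread -> youngest entry

    def allocate(self, thread):
        """Give `thread` a new entry, linked after its current youngest entry."""
        if not self.free:
            return None               # table full
        idx = self.free.pop(0)
        self.next[idx] = None
        tail = self.tail.get(thread)
        if tail is None:
            self.head[thread] = idx
        else:
            self.next[tail] = idx
        self.tail[thread] = idx
        return idx

    def complete_oldest(self, thread):
        """Retire the oldest entry of `thread` and return it to the free pool."""
        idx = self.head.get(thread)
        if idx is None:
            return None
        self.head[thread] = self.next[idx]
        if self.head[thread] is None:
            self.tail[thread] = None
        self.free.append(idx)
        return idx

    def order(self, thread):
        """Entries of `thread` in program order, found by following the links."""
        out, idx = [], self.head.get(thread)
        while idx is not None:
            out.append(idx)
            idx = self.next[idx]
        return out
```

Because order is carried by the links rather than by position, any free entry can be handed to any thread, which is what lets the table be shared dynamically.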

    Method and apparatus for patching problematic instructions in a microprocessor using software interrupts
    57.
    Type: granted invention patent
    Status: in force

    Publication No.: US06631463B1

    Publication date: 2003-10-07

    Application No.: US09436103

    Filing date: 1999-11-08

    IPC classification: G06F9/00

    Abstract: A method and apparatus for patching a problematic instruction within a pipelined processor in a data processing system are presented. A plurality of instructions are fetched, and the plurality of instructions are matched against at least one match condition to generate a matched instruction. The match conditions may include matching the opcode of an instruction, the pre-decode bits of an instruction, a type of instruction, or other conditions. A matched instruction may be marked using a match bit that accompanies the instruction through the instruction pipeline. The matched instruction is then replaced with an internal opcode or internal instruction that causes the instruction scheduling unit to take a special software interrupt. The problematic instruction is then patched through the execution of a set of instructions that cause the desired logical operation of the problematic instruction.

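The match, mark, and patch flow can be imitated in software on a toy instruction set. Everything concrete here is hypothetical: the three opcodes (`li`, `add`, and a "problematic" `dbl`), the register names, and the patch routine. The special software interrupt is represented simply by calling the patch routine for a matched instruction.

```python
def run(op, args, regs):
    """Execute one instruction of a tiny hypothetical ISA."""
    if op == "li":            # li rd, imm
        rd, imm = args
        regs[rd] = imm
    elif op == "add":         # add rd, ra, rb
        rd, ra, rb = args
        regs[rd] = regs[ra] + regs[rb]
    else:
        raise ValueError(f"unsupported opcode: {op}")


def mark_matches(program, match_opcodes):
    """Attach a match bit to every fetched instruction (opcode match only)."""
    return [(op, args, op in match_opcodes) for op, args in program]


def execute(program, patches, regs):
    """Run `program`, diverting matched instructions to their patch routine."""
    for op, args, matched in mark_matches(program, patches.keys()):
        if matched:
            # Stand-in for the special software interrupt: the handler supplies
            # a sequence of working instructions with the same logical effect.
            for patch_op, patch_args in patches[op](args):
                run(patch_op, patch_args, regs)
        else:
            run(op, args, regs)
    return regs


def patch_dbl(args):
    """Hypothetical patch: emulate `dbl rd, ra` as `add rd, ra, ra`."""
    rd, ra = args
    return [("add", (rd, ra, ra))]
```

A call such as `execute([("li", ("r1", 21)), ("dbl", ("r2", "r1"))], {"dbl": patch_dbl}, {})` never executes `dbl` directly; the matched instruction is carried out by the replacement sequence instead.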

    System and method for handling instructions occurring after an ISYNC instruction
    58.
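    Type: granted invention patent
    Status: lapsed
    Alternate title (translated from Chinese): System for selectively flushing, in program order, instructions that follow an ISYNC barrier instruction

    Publication No.: US06473850B1

    Publication date: 2002-10-29

    Application No.: US09389197

    Filing date: 1999-09-02

    IPC classification: G06F9/38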

    Abstract: An ISYNC instruction does not unconditionally cause a flush of speculatively dispatched or fetched instructions (instructions that are dispatched or fetched after the ISYNC instruction). The present invention detects the occurrence of any instruction that changes the state of the machine and requires context synchronization on completion; these instructions are called context-synchronizing-required instructions. When a context-synchronizing-required instruction completes, the present invention sets a flag to note the occurrence of that condition. When an ISYNC instruction completes, the present invention causes a flush and refetches the instruction after the ISYNC if the context-synchronizing-required flag is active. The present invention then resets the context-synchronizing-required flag. If the context-synchronizing-required flag is not active, then the present invention does not generate a flush operation.

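The flag protocol in this abstract reduces to a few lines of state. In the sketch below the set of context-synchronizing-required opcodes is an illustrative assumption (`mtmsr` is used as an example of an instruction that changes machine state), and the names `CompletionLogic` and `complete` are hypothetical.

```python
# Assumption: an illustrative set; the abstract does not list the opcodes.
CONTEXT_SYNC_REQUIRED = {"mtmsr"}


class CompletionLogic:
    """Hypothetical model of the conditional ISYNC flush."""

    def __init__(self):
        self.csr_flag = False   # context-synchronizing-required flag

    def complete(self, opcode):
        """Return True if completing `opcode` triggers a flush and refetch."""
        if opcode in CONTEXT_SYNC_REQUIRED:
            self.csr_flag = True
            return False
        if opcode == "isync":
            flush = self.csr_flag   # flush only if machine state changed
            self.csr_flag = False   # the flag is reset either way
            return flush
        return False
```

An ISYNC that follows only ordinary instructions costs nothing in this model; only an ISYNC completing after a state-changing instruction forces the younger instructions to be discarded and refetched.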

    Scoreboard mechanism for serialized string operations utilizing the XER
    59.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US06430678B1

    Publication date: 2002-08-06

    Application No.: US09363463

    Filing date: 1999-07-29

    IPC classification: G06F9/30

    Abstract: An XER scoreboard function is provided by utilizing the instruction sequencer unit scoreboard. A scoreboard bit is set if the XER is being used by a previous instruction. If a new instruction that uses the XER is fetched, a dummy read to the XER is generated to test whether the scoreboard bit is set. If the scoreboard bit is not set when the dummy read is executed, the X-form string proceeds to execution. If the scoreboard bit is set when the dummy read is executed, the pipeline is stalled until the scoreboard bit is cleared, and then the X-form string, padded with generated padding IOPs (dummies or NOPs), is executed. After an accessing instruction is executed, the scoreboard bit is cleared.

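The dummy-read test and the stall path can be sketched with a one-bit scoreboard and a cycle counter. The timing model is an assumption: the earlier XER user is given a latency in cycles, after which the bit clears. The number and position of padding IOPs (`pad_iops`, appended after the string operation) are also assumptions, as are the names `Scoreboard` and `issue_xform_string`.

```python
class Scoreboard:
    """Hypothetical one-bit XER scoreboard with a simple latency model."""

    def __init__(self):
        self.xer_bit = False
        self.cycles_left = 0      # cycles until the earlier XER user has executed

    def use_xer(self, latency):
        """An earlier instruction starts using the XER for `latency` cycles."""
        self.xer_bit = True
        self.cycles_left = latency

    def tick(self):
        """Advance one cycle; clear the bit once the accessing instruction is done."""
        if self.xer_bit:
            self.cycles_left -= 1
            if self.cycles_left <= 0:
                self.xer_bit = False


def issue_xform_string(scoreboard, pad_iops=2):
    """Return (stall_cycles, iops) for an X-form string operation."""
    if not scoreboard.xer_bit:          # dummy read finds the bit clear
        return 0, ["string_op"]
    stalls = 0
    while scoreboard.xer_bit:           # stall the pipeline until the bit clears
        scoreboard.tick()
        stalls += 1
    # Assumption: padding IOPs are modelled as NOPs appended after the string op.
    return stalls, ["string_op"] + ["nop"] * pad_iops
```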

    Method and system for managing registers in a data processing system supports out-of-order and speculative instruction execution
    60.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US06356918B1

    Publication date: 2002-03-12

    Application No.: US08507542

    Filing date: 1995-07-26

    IPC classification: G06F12/00

    Abstract: A method and a system are provided in a data processing system for managing registers in a register array, wherein the data processing system has M architected registers and the register array has more than M registers. A first physical register address is selected from a group of available physical register addresses in a rename table in response to dispatching a register-modifying instruction that specifies an architected target register address. The architected target register address is then associated with the first physical register address, and a result of executing the register-modifying instruction is stored in a physical register pointed to by the first physical register address. In response to completing the register-modifying instruction, the first physical address in the rename table is exchanged with a second physical address in a completion rename table, wherein the second physical address is located in the completion rename table at a location pointed to by the architected target register address. Therefore, upon instruction completion, the completion rename table contains pointers that map architected register addresses to physical register addresses. The rename table maps architected register addresses to physical register addresses for instructions currently being executed, or for instructions that have “finished” and have not yet been “completed.” Bits indicating the validity of an association between an architected register address and a physical register address are stored before instructions are speculatively executed following an unresolved conditional branch.

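The exchange between the rename table and the completion rename table can be sketched with two small arrays. This is a model under assumptions: the rename table has one slot per spare physical register, a slot is "available" when it has no architected register associated with it, and the names `Renamer`, `dispatch`, and `complete` are hypothetical. The validity bits saved for speculative execution past a branch are not modelled.

```python
class Renamer:
    """Hypothetical model of rename / completion rename table exchange."""

    def __init__(self, num_arch, num_phys):
        if num_phys <= num_arch:
            raise ValueError("register array must have more than M registers")
        # Completion rename table: architected register -> committed physical register.
        self.completion = list(range(num_arch))
        # Rename table: each slot holds a physical register and, while an
        # instruction is in flight, the architected register mapped to it.
        self.rename_phys = list(range(num_arch, num_phys))
        self.rename_arch = [None] * (num_phys - num_arch)

    def dispatch(self, arch):
        """Pick an available physical register for architected target `arch`."""
        slot = self.rename_arch.index(None)   # raises ValueError if none is free
        self.rename_arch[slot] = arch
        return slot, self.rename_phys[slot]

    def complete(self, slot):
        """On completion, swap the slot's register with the committed one."""
        arch = self.rename_arch[slot]
        self.rename_phys[slot], self.completion[arch] = (
            self.completion[arch], self.rename_phys[slot])
        self.rename_arch[slot] = None         # old register becomes available
```

After the swap, the previously committed physical register sits in the rename table slot and is available for reuse, so no separate free list has to be maintained in this model.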