Pipelined processor with multi-cycle grouping for instruction dispatch with inter-group and intra-group dependency checking
    51.
    Type: Granted patent
    Status: In force

    Publication number: US07430653B1

    Publication date: 2008-09-30

    Application number: US09583097

    Filing date: 1999-08-02

    Applicant: Marc Tremblay

    Inventor: Marc Tremblay

    IPC classification: G06F9/38

    CPC classification: G06F9/3885 G06F9/3853

    Abstract: A pipelined instruction dispatch or grouping circuit allows instruction dispatch decisions to be made over multiple processor cycles. In one embodiment, the grouping circuit performs resource allocation and data dependency checks on an instruction group based on a state vector that represents the source and destination registers of the instructions within that group, together with the corresponding state vectors for the instruction groups of a number of preceding processor cycles.
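
    The dependency checks described in this abstract can be illustrated with a minimal Python sketch. It assumes the state vector is a plain bitmask with one bit per register and that only read-after-write and write-after-write hazards are checked; resource allocation and write-after-read hazards are left out. The names `reg_mask`, `has_intra_group_hazard`, `has_inter_group_hazard`, and `can_dispatch` are hypothetical and do not come from the patent.

```python
def reg_mask(regs):
    """Encode register numbers as a bit vector (bit r set means register r)."""
    mask = 0
    for r in regs:
        mask |= 1 << r
    return mask


def has_intra_group_hazard(group):
    """group: list of (dest_regs, src_regs) tuples in program order.

    Returns True if an instruction reads or writes a register that an
    earlier instruction in the same group writes.
    """
    written = 0
    for dests, srcs in group:
        d, s = reg_mask(dests), reg_mask(srcs)
        if written & (s | d):
            return True
        written |= d
    return False


def has_inter_group_hazard(group, history):
    """history: destination masks of groups from preceding cycles
    whose results are assumed to be still in flight."""
    pending = 0
    for mask in history:
        pending |= mask
    for dests, srcs in group:
        if pending & (reg_mask(srcs) | reg_mask(dests)):
            return True
    return False


def can_dispatch(group, history):
    """A group may dispatch only if both checks find no hazard."""
    return (not has_intra_group_hazard(group)
            and not has_inter_group_hazard(group, history))
```

    For example, under these assumptions `can_dispatch([([1], [2, 3]), ([4], [5, 6])], [reg_mask([5])])` is false, because the second instruction reads register 5 while an earlier group is still writing it.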

    Fail instruction to support transactional program execution
    52.
    Type: Granted patent
    Status: In force

    Publication number: US07418577B2

    Publication date: 2008-08-26

    Application number: US10637169

    Filing date: 2003-08-08

    IPC classification: G06F9/00

    Abstract: One embodiment of the present invention provides a system that supports executing a fail instruction, which terminates transactional execution of a block of instructions. During operation, the system facilitates transactional execution of a block of instructions within a program, wherein changes made during the transactional execution are not committed to the architectural state of the processor until the transactional execution successfully completes. If a fail instruction is encountered during this transactional execution, the system terminates the transactional execution without committing results of the transactional execution to the architectural state of the processor.
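
    The commit-or-discard behaviour in this abstract can be modelled in a few lines of Python. This is only a software analogy under stated assumptions: the architectural state is a dictionary, speculative changes are made on a copy, and the fail instruction is modelled as an exception. `Processor`, `TransactionFailed`, `write`, and `fail` are hypothetical names chosen for illustration.

```python
class TransactionFailed(Exception):
    """Raised by the modelled fail instruction."""


class Processor:
    def __init__(self):
        self.arch_state = {}  # committed architectural state

    def run_transaction(self, block):
        """Run the operations in block on a speculative copy of the state.

        The copy replaces the architectural state only if every operation
        completes; a fail instruction discards it.
        """
        speculative = dict(self.arch_state)
        try:
            for op in block:
                op(speculative)
        except TransactionFailed:
            return False  # speculative changes are dropped
        self.arch_state = speculative
        return True


def write(reg, value):
    """Build an operation that writes value to reg."""
    def op(state):
        state[reg] = value
    return op


def fail(state):
    """Modelled fail instruction: terminate the transaction."""
    raise TransactionFailed()
```

    In this model, `run_transaction([write("r1", 9), fail])` returns `False` and leaves `arch_state` exactly as it was before the call.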

    Method and apparatus for synchronizing threads on a processor that supports transactional memory
    53.
    Type: Patent application
    Status: In force

    Publication number: US20070240158A1

    Publication date: 2007-10-11

    Application number: US11418652

    Filing date: 2006-05-05

    IPC classification: G06F9/46

    Abstract: One embodiment of the present invention provides a system that synchronizes threads on a multi-threaded processor. The system starts by executing instructions from a multi-threaded program using a first thread and a second thread. When the first thread reaches a predetermined location in the multi-threaded program, the first thread executes a Start-Transactional-Execution (STE) instruction to commence transactional execution, wherein the STE instruction specifies a location to branch to if transactional execution fails. During the subsequent transactional execution, the first thread accesses a mailbox location in memory (which is also accessible by the second thread) and then executes instructions that cause the first thread to wait. When the second thread reaches a second predetermined location in the multi-threaded program, the second thread signals the first thread by accessing the mailbox location, which causes the transactional execution of the first thread to fail, thereby causing the first thread to resume non-transactional execution from the location specified in the STE instruction. In this way, the second thread can signal to the first thread without the first thread having to poll a shared variable.

    WORKING REGISTER FILE ENTRIES WITH INSTRUCTION BASED LIFETIME
    54.
    Type: Patent application
    Status: In force

    Publication number: US20070226467A1

    Publication date: 2007-09-27

    Application number: US11425869

    Filing date: 2006-06-22

    IPC classification: G06F9/30

    Abstract: A technique for operating a computing apparatus includes allocating a working register file entry corresponding to a register in a working register file when an instruction referencing the register proceeds through a particular stage of the computing apparatus. The technique maintains the working register file entry until at least a predetermined number of subsequent instructions have similarly proceeded through the particular stage.
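
    The instruction-based lifetime in this abstract can be sketched as a counter of instructions that have passed the stage. The Python model below is an illustration only: it assumes that a repeated reference to a register refreshes its entry and that an entry is released on the first instruction after its lifetime has elapsed. `WorkingRegisterFile` and `pass_stage` are hypothetical names.

```python
class WorkingRegisterFile:
    def __init__(self, lifetime):
        self.lifetime = lifetime  # subsequent instructions an entry must outlive
        self.count = 0            # instructions that have passed the stage
        self.entries = {}         # register -> count at (re)allocation

    def pass_stage(self, regs):
        """Record that an instruction referencing regs passed the stage."""
        self.count += 1
        # Keep an entry while at most `lifetime` later instructions have passed.
        self.entries = {
            r: n for r, n in self.entries.items()
            if self.count - n <= self.lifetime
        }
        # Allocate (or refresh) entries for the registers this instruction uses.
        for r in regs:
            self.entries[r] = self.count
```

    With `lifetime=2`, an entry allocated by one instruction is still present after two later instructions have passed the stage and is released when a third one does.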

    Enforcing memory-reference ordering requirements at the L2 cache level
    55.
    Type: Patent application
    Status: In force

    Publication number: US20070198778A1

    Publication date: 2007-08-23

    Application number: US11592835

    Filing date: 2006-11-03

    IPC classification: G06F12/00

    CPC classification: G06F12/0897

    Abstract: One embodiment of the present invention provides a system that enforces memory-reference ordering requirements at an L2 cache. During operation, the system receives a load at the L2 cache, wherein the load previously caused a miss at an L1 cache. Upon receiving the load, the system performs a lookup for the load in reflections of store buffers associated with other L1 caches. These reflections are located at the L2 cache, and each reflection contains addresses for stores in a corresponding store buffer associated with an L1 cache, and possibly contains data that was overwritten by the stores. If the lookup generates a hit, which indicates that the load may potentially interfere with a store, the system causes the load to wait to execute until the store commits.
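
    The reflection lookup in this abstract can be pictured with a small Python model. It assumes each reflection is just a set of store addresses (the optional overwritten data is not modelled) and that a load waits whenever its address appears in the reflection of any other L1 cache. `L2Cache`, `note_store`, `commit_store`, and `load_must_wait` are hypothetical names.

```python
class L2Cache:
    def __init__(self, num_l1):
        # reflections[i] holds the addresses of uncommitted stores
        # in the store buffer associated with L1 cache i.
        self.reflections = [set() for _ in range(num_l1)]

    def note_store(self, l1_id, addr):
        """Mirror a store that entered the store buffer of L1 cache l1_id."""
        self.reflections[l1_id].add(addr)

    def commit_store(self, l1_id, addr):
        """Remove the mirrored address once the store commits."""
        self.reflections[l1_id].discard(addr)

    def load_must_wait(self, l1_id, addr):
        """True if the load hits in the reflection of another L1 cache."""
        return any(
            addr in refl
            for i, refl in enumerate(self.reflections)
            if i != l1_id
        )
```

    A load from L1 cache 0 to an address that L1 cache 1 has stored to but not yet committed returns `True` from `load_must_wait`, and `False` again once `commit_store` has been called.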

    Supporting out-of-order issue in an execute-ahead processor
    56.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20070186081A1

    Publication date: 2007-08-09

    Application number: US11367814

    Filing date: 2006-03-03

    IPC classification: G06F9/30

    Abstract: One embodiment of the present invention provides a system that supports out-of-order issue in a processor that normally executes instructions in-order. The system starts by issuing instructions from an issue queue in program order during a normal-execution mode. While issuing the instructions, the system determines whether any instruction in the issue queue has an unresolved short-latency data dependency, that is, a dependency on a short-latency operation. If so, the system generates a checkpoint and enters an out-of-order-issue mode, wherein instructions in the issue queue with unresolved short-latency data dependencies are held and not issued, and wherein other instructions in the issue queue without unresolved data dependencies are allowed to issue out-of-order.
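
    The hold-or-issue decision in this abstract can be sketched as a single pass over the issue queue. The Python below is a simplified illustration: it assumes that the result of a held instruction is itself treated as unresolved, and it omits checkpoint generation and write-after-read hazards. The function name `issue_cycle` and the tuple layout are hypothetical.

```python
def issue_cycle(queue, unresolved):
    """Decide which queued instructions may issue.

    queue: list of (name, dest_reg, src_regs) in program order.
    unresolved: registers still waiting on a short-latency operation.
    Returns (mode, issued_names, held_names).
    """
    unresolved = set(unresolved)
    issued, held = [], []
    for name, dest, srcs in queue:
        if unresolved.intersection(srcs):
            held.append(name)
            unresolved.add(dest)  # consumers of a held result must also wait
        else:
            issued.append(name)
    mode = "out-of-order-issue" if held else "normal"
    return mode, issued, held
```

    For a queue of `i1` (reads `r1`), `i2` (reads the result of `i1`), and `i3` (reads `r2`), with `r1` unresolved, the sketch holds `i1` and `i2` and lets `i3` issue ahead of them.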

    Multiple-thread processor with in-pipeline, thread selectable storage
    57.
    Type: Granted patent
    Status: In force

    Publication number: US07185185B2

    Publication date: 2007-02-27

    Application number: US10403406

    Filing date: 2003-03-31

    IPC classification: G06F12/12

    Abstract: A processor reduces wasted cycle time resulting from stalling and idling, and increases the proportion of execution time, by supporting and implementing both vertical multithreading and horizontal multithreading. Vertical multithreading permits overlapping or “hiding” of cache miss wait times. In vertical multithreading, multiple hardware threads share the same processor pipeline. A hardware thread is typically a process, a lightweight process, a native thread, or the like in an operating system that supports multithreading. Horizontal multithreading increases parallelism within the processor circuit structure, for example within a single integrated circuit die that makes up a single-chip processor. To further increase system parallelism in some processor embodiments, multiple processor cores are formed in a single die. Advances in on-chip multiprocessor horizontal threading are gained as processor core sizes are reduced through technological advancements.

    Method and structure for explicit software control of data speculation
    58.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20070006195A1

    Publication date: 2007-01-04

    Application number: US11082281

    Filing date: 2005-03-16

    IPC classification: G06F9/45

    Abstract: Explicit software control is used for data speculation. The explicit software control is applied at selected locations in a computer program to provide the benefit of data speculation while eliminating the need for hardware to perform data speculation. A computer-based method first determines, via explicit software control, whether data speculation for an item (a variable, a pointer, an address, etc.) is needed. Upon determining that data speculation for the item is needed, the data speculation is performed under explicit software control. Conversely, if the explicit software control determines that data speculation is not needed, e.g., because the value of the item, typically obtained by execution of a long-latency instruction, is available, an original code segment is executed using an actual value of the item.

    Implicitly derived register specifiers in a processor
    59.
    Type: Granted patent
    Status: In force

    Publication number: US07117342B2

    Publication date: 2006-10-03

    Application number: US09204479

    Filing date: 1998-12-03

    IPC classification: G06F9/30

    Abstract: A processor executes an instruction set including instructions in which a register specifier is implicitly derived, based on another register specifier. One technique for implicitly deriving a register specifier is to add or subtract one from a specifically-defined register specifier. Implicit derivation of a register specifier is selectively implemented for some opcodes. A decoder decodes instructions that use implicitly-derived register specifiers and reads the explicitly-defined register. The decoder generates pointers both to the explicitly-defined register and to the implicitly-derived register. In other embodiments, a pointer to registers within a register file includes an additional bit indicating that a register read is accompanied by a read of an implicitly-derived register.
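
    The add-or-subtract-one derivation in this abstract can be shown with a short Python sketch of the decode step. Everything beyond that rule is an assumption made for illustration: the register file size, the wrap-around at its boundary, and the opcode names in `PAIRED_OPCODES` are hypothetical, as is the function name `decode_register_pointers`.

```python
NUM_REGS = 32                            # assumed register file size
PAIRED_OPCODES = {"ldpair", "addpair"}   # hypothetical opcodes using derivation


def decode_register_pointers(opcode, reg, offset=1):
    """Return pointers to the registers an instruction accesses.

    For opcodes that use implicit derivation, the second pointer is the
    explicit specifier plus offset (+1 or -1), wrapped to the file size.
    Other opcodes yield only the explicit register.
    """
    if opcode in PAIRED_OPCODES:
        implicit = (reg + offset) % NUM_REGS
        return (reg, implicit)
    return (reg,)
```

    Under these assumptions, `decode_register_pointers("ldpair", 4)` yields pointers to registers 4 and 5, while an opcode outside the set yields only register 4.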

    Selectively deferring instructions issued in program order utilizing a checkpoint and multiple deferral scheme
    60.
    Type: Granted patent
    Status: In force

    Publication number: US07114060B2

    Publication date: 2006-09-26

    Application number: US10686061

    Filing date: 2003-10-14

    IPC classification: G06F9/38

    Abstract: One embodiment of the present invention provides a system that facilitates deferring execution of instructions with unresolved data dependencies as they are issued for execution in program order. During a normal execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system generates a checkpoint that can subsequently be used to return execution of the program to the point of the instruction. Next, the system executes subsequent instructions in an execute-ahead mode, wherein instructions that cannot be executed because of an unresolved data dependency are deferred, and wherein other non-deferred instructions are executed in program order.
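
    The checkpoint-and-defer flow in this abstract can be approximated in Python. The sketch assumes the checkpoint is a snapshot of the register values plus the index of the first deferred instruction, and that the result of a deferred instruction is itself unavailable to later instructions. Re-executing the deferred instructions is not modelled, and `execute_ahead` is a hypothetical name.

```python
def execute_ahead(instructions, regs, unresolved):
    """Execute instructions in program order, deferring those that cannot run.

    instructions: list of (dest_reg, src_regs, fn) tuples.
    regs: dict of register values, updated in place.
    unresolved: registers whose values are not yet available.
    Returns (checkpoint, deferred_indices).
    """
    checkpoint = None
    not_ready = set(unresolved)
    deferred = []
    for pc, (dest, srcs, fn) in enumerate(instructions):
        if not_ready.intersection(srcs):
            if checkpoint is None:
                # Snapshot taken at the first unresolved dependency.
                checkpoint = (pc, dict(regs))
            deferred.append(pc)
            not_ready.add(dest)  # the deferred result is unavailable too
        else:
            regs[dest] = fn(*(regs[s] for s in srcs))
            not_ready.discard(dest)
    return checkpoint, deferred
```

    Instructions that do not depend on an unavailable register still run in program order, so useful work continues past the instruction that triggered the checkpoint.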
