Elimination of traps and atomics in thread synchronization
    1.
    Granted invention patent
    Legal status: active

    Publication number: US06230230B1

    Publication date: 2001-05-08

    Application number: US09204794

    Filing date: 1998-12-03

    IPC classification: G06F12/00

    Abstract: Elimination of traps and atomics in thread synchronization is provided. In one embodiment, a processor includes a lock cache. The lock cache holds a value that corresponds to or identifies a computer resource only if a current thread executing on the processor owns the computer resource. A lock cache operation (e.g., a lockcachecheck instruction) determines whether a value identifying a computer resource is cached in the lock cache and returns a first predetermined value if the value identifying the computer resource is cached in the lock cache. Otherwise, a second predetermined value is returned.
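
    Illustrative sketch (not taken from the patent): the lock cache check described in the abstract can be modeled in software roughly as follows. The class name, the cache capacity, and the concrete return values 1 and 0 are assumptions made for illustration only.

```python
# Hypothetical software model of a per-processor lock cache.
# OWNED / NOT_OWNED stand in for the abstract's "first" and "second"
# predetermined values; the actual values are not given in the abstract.
OWNED = 1
NOT_OWNED = 0


class LockCache:
    """Holds resource identifiers only for locks the current thread owns."""

    def __init__(self, capacity=4):
        self.capacity = capacity  # assumed size, chosen arbitrarily
        self.entries = set()

    def add(self, resource_id):
        # Cache the identifier once the current thread owns the resource.
        if resource_id in self.entries:
            return True
        if len(self.entries) >= self.capacity:
            return False  # no free entry; a caller would need another path
        self.entries.add(resource_id)
        return True

    def remove(self, resource_id):
        # Drop the identifier when the thread gives up ownership.
        self.entries.discard(resource_id)

    def lockcachecheck(self, resource_id):
        # A hit means the current thread already owns the resource.
        return OWNED if resource_id in self.entries else NOT_OWNED


cache = LockCache()
cache.add(0x1000)
hit = cache.lockcachecheck(0x1000)
miss = cache.lockcachecheck(0x2000)
print(hit, miss)
```

    In this model a hit lets the caller skip any further ownership check, which is the kind of shortcut the title refers to as eliminating traps and atomics.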

    Locking of computer resources
    2.
    Granted invention patent
    Legal status: active

    Publication number: US06725308B2

    Publication date: 2004-04-20

    Application number: US10288393

    Filing date: 2002-11-05

    IPC classification: G06F9/46

    Abstract: A computer processor includes a number of register pairs LOCKADD/LOCKCOUNT to hold values identifying when a computer resource is locked. The LOCKCOUNT register is incremented or decremented in response to lock or unlock instructions, respectively. The lock is freed when a count associated with the LOCKCOUNT register is decremented to zero. In embodiments without LOCKCOUNT registers, the lock may be freed on any unlock instruction corresponding to the lock. In some embodiments, a computer object includes a header in which two header LSBs store: (1) a LOCK bit indicating whether the object is locked, and (2) a WANT bit indicating whether a thread is waiting to acquire a lock for the object.
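
    Illustrative sketch (not taken from the patent): the counting behavior and the two header bits can be modeled as below. The bit positions, the class name, and the error handling are assumptions; only the increment/decrement rule and the meaning of the LOCK and WANT bits come from the abstract.

```python
# Hypothetical model of one LOCKADD/LOCKCOUNT register pair and two header LSBs.
LOCK_BIT = 0b01  # assumed position of the LOCK bit
WANT_BIT = 0b10  # assumed position of the WANT bit


class LockRegisterPair:
    def __init__(self):
        self.lockadd = None  # value identifying the locked resource
        self.lockcount = 0   # number of outstanding lock instructions

    def lock(self, header, address):
        """Model a lock instruction; returns the updated object header."""
        if self.lockcount and self.lockadd != address:
            raise ValueError("register pair already tracks another resource")
        self.lockadd = address
        self.lockcount += 1
        return header | LOCK_BIT

    def unlock(self, header):
        """Model an unlock instruction; frees the lock when the count hits zero."""
        if self.lockcount == 0:
            raise ValueError("unlock without a matching lock")
        self.lockcount -= 1
        if self.lockcount == 0:
            self.lockadd = None
            header &= ~LOCK_BIT
        return header


pair = LockRegisterPair()
header = 0x40                       # arbitrary header with both LSBs clear
header = pair.lock(header, 0x1000)  # first acquisition
header = pair.lock(header, 0x1000)  # nested acquisition of the same resource
header |= WANT_BIT                  # another thread signals that it is waiting
header = pair.unlock(header)        # count 2 -> 1, still locked
still_locked = bool(header & LOCK_BIT)
header = pair.unlock(header)        # count 1 -> 0, lock freed
print(still_locked, bool(header & LOCK_BIT), bool(header & WANT_BIT))
```

    After the final unlock the WANT bit is still set in this model, which is the point at which a real system would hand the lock to the waiting thread.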

    Switching method in a multi-threaded processor

    Publication number: US06694347B2

    Publication date: 2004-02-17

    Application number: US10074419

    Filing date: 2002-02-12

    IPC classification: G06F9/00

    Abstract: A processor includes logic for attaining a very fast exception handling functionality while executing non-threaded programs by invoking a multithreaded-type functionality in response to an exception condition. The processor, while operating in multithreaded conditions or while executing non-threaded programs, progresses through multiple machine states during execution. The very fast exception handling logic includes connection of an exception signal line to thread select logic, causing an exception signal to evoke a switch in thread and machine state. The switch in thread and machine state causes the processor to enter and to exit the exception handler immediately, without waiting to drain the pipeline or queues and without the inherent timing penalty of the operating system's software saving and restoring of registers.
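
    Illustrative sketch (not taken from the patent): the idea of entering an exception handler by switching machine state, rather than saving and restoring registers, can be modeled as below. The two-bank layout, the register count, and all names are assumptions.

```python
# Hypothetical model: each machine state has its own register bank, and the
# exception signal only changes which bank the thread select logic points at.
class MachineState:
    def __init__(self, name):
        self.name = name
        self.registers = [0] * 8  # assumed register count
        self.pc = 0


class Processor:
    def __init__(self):
        self.states = [MachineState("program"), MachineState("handler")]
        self.select = 0  # thread select: index of the active machine state

    @property
    def active(self):
        return self.states[self.select]

    def raise_exception(self):
        # Enter the handler by switching state; nothing is copied or saved.
        self.select = 1

    def return_from_exception(self):
        # Exit the handler by switching back; nothing is restored.
        self.select = 0


cpu = Processor()
cpu.active.registers[0] = 42     # the program leaves a value in r0
cpu.raise_exception()
cpu.active.registers[0] = 99     # the handler uses its own r0
cpu.return_from_exception()
print(cpu.active.name, cpu.active.registers[0])
```

    Because the handler writes only to its own bank, the program's registers are untouched on return, without any software save and restore step.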

    Switching method in a multi-threaded processor

    Publication number: US06507862B1

    Publication date: 2003-01-14

    Application number: US09309735

    Filing date: 1999-05-11

    IPC classification: G06F9/46

    Abstract: A processor includes logic for attaining a very fast exception handling functionality while executing non-threaded programs by invoking a multithreaded-type functionality in response to an exception condition. The processor, while operating in multithreaded conditions or while executing non-threaded programs, progresses through multiple machine states during execution. The very fast exception handling logic includes connection of an exception signal line to thread select logic, causing an exception signal to evoke a switch in thread and machine state. The switch in thread and machine state causes the processor to enter and to exit the exception handler immediately, without waiting to drain the pipeline or queues and without the inherent timing penalty of the operating system's software saving and restoring of registers.

    Switching method in a multi-threaded processor
    5.
    Granted invention patent
    Legal status: active

    Publication number: US07316021B2

    Publication date: 2008-01-01

    Application number: US10779944

    Filing date: 2004-02-17

    IPC classification: G06F9/46 G06F9/30

    Abstract: A processor includes logic for attaining a very fast exception handling functionality while executing non-threaded programs by invoking a multithreaded-type functionality in response to an exception condition. The processor, while operating in multithreaded conditions or while executing non-threaded programs, progresses through multiple machine states during execution. The very fast exception handling logic includes connection of an exception signal line to thread select logic, causing an exception signal to evoke a switch in thread and machine state. The switch in thread and machine state causes the processor to enter and to exit the exception handler immediately, without waiting to drain the pipeline or queues and without the inherent timing penalty of the operating system's software saving and restoring of registers.

    Processor with multiple-thread, vertically-threaded pipeline
    6.
    Granted invention patent
    Legal status: active

    Publication number: US06938147B1

    Publication date: 2005-08-30

    Application number: US09309732

    Filing date: 1999-05-11

    IPC classification: G06F9/38 G06F9/48 G06F9/00

    CPC classification: G06F9/4843 G06F9/3851

    Abstract: A processor reduces wasted cycle time resulting from stalling and idling, and increases the proportion of execution time, by supporting and implementing both vertical multithreading and horizontal multithreading. Vertical multithreading permits overlapping or “hiding” of cache miss wait times. In vertical multithreading, multiple hardware threads share the same processor pipeline. A hardware thread is typically a process, a lightweight process, a native thread, or the like in an operating system that supports multithreading. Horizontal multithreading increases parallelism within the processor circuit structure, for example within a single integrated circuit die that makes up a single-chip processor. To further increase system parallelism in some processor embodiments, multiple processor cores are formed in a single die. Advances in on-chip multiprocessor horizontal threading are gained as processor core sizes are reduced through technological advancements.
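
    Illustrative sketch (not taken from the patent): a toy cycle count shows how vertical multithreading can hide cache miss wait times. The instruction mix, the miss latency of 3 cycles, and the fixed-priority switch-on-miss policy are assumptions chosen only to make the effect visible.

```python
# Hypothetical cycle-count model of several hardware threads sharing one pipeline.
def run(threads, miss_latency):
    """Return the cycles needed to finish all threads on one shared pipeline.

    Each thread is a list of "op" (one cycle) or "miss" (one cycle to issue,
    then the thread waits miss_latency cycles before it can issue again).
    """
    pcs = [0] * len(threads)       # next instruction index per thread
    ready_at = [0] * len(threads)  # first cycle each thread may issue again
    cycle = 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        runnable = [i for i, t in enumerate(threads)
                    if pcs[i] < len(t) and ready_at[i] <= cycle]
        if runnable:
            i = runnable[0]  # lowest-numbered ready thread issues
            if threads[i][pcs[i]] == "miss":
                ready_at[i] = cycle + 1 + miss_latency
            pcs[i] += 1
        cycle += 1  # the pipeline idles when no thread is ready
    # A trailing miss still has to complete before the thread is done.
    return max([cycle] + ready_at)


work = ["op", "miss", "op", "op"]
one_after_another = 2 * run([work], miss_latency=3)
shared_pipeline = run([work, work], miss_latency=3)
print(one_after_another, shared_pipeline)
```

    In this toy model, running the two threads back to back takes 14 cycles, while letting the second thread issue during the first thread's miss takes 9.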

    Combining results of selectively executed remaining sub-instructions with that of emulated sub-instruction causing exception in VLIW processor
    7.
    Granted invention patent
    Legal status: active

    Publication number: US06405300B1

    Publication date: 2002-06-11

    Application number: US09273602

    Filing date: 1999-03-22

    IPC classification: G06F9/44

    Abstract: One embodiment of the present invention provides a system that efficiently emulates sub-instructions in a very long instruction word (VLIW) processor. The system operates by receiving an exception condition during execution of a VLIW instruction within a VLIW program. This exception condition indicates that at least one sub-instruction within the VLIW instruction requires emulation in software or software assistance. In processing this exception condition, the system emulates the sub-instructions that require emulation in software and stores the results. The system also selectively executes in hardware any remaining sub-instructions in the VLIW instruction that do not require emulation in software. The system finally combines the results from the sub-instructions emulated in software with the results from the remaining sub-instructions executed in hardware, and resumes execution of the VLIW program.
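
    Illustrative sketch (not taken from the patent): the emulate, selectively execute, and combine steps can be modeled as below. The operation names, the choice of "div" as the sub-instruction lacking hardware support, and the tuple encoding are assumptions.

```python
# Hypothetical model of handling one VLIW instruction whose sub-instructions
# are (operation, operand_a, operand_b) tuples.
HARDWARE_OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}
SOFTWARE_OPS = {
    "div": lambda a, b: a // b,  # assumed to need software emulation
}


def execute_vliw(instruction):
    """Return one result per sub-instruction, in slot order."""
    results = [None] * len(instruction)
    # Exception condition: find the slots that the hardware cannot execute.
    needs_emulation = {i for i, (op, _, _) in enumerate(instruction)
                       if op not in HARDWARE_OPS}
    # Emulate those sub-instructions in software and store the results.
    for i in needs_emulation:
        op, a, b = instruction[i]
        results[i] = SOFTWARE_OPS[op](a, b)
    # Selectively execute the remaining sub-instructions "in hardware".
    for i, (op, a, b) in enumerate(instruction):
        if i not in needs_emulation:
            results[i] = HARDWARE_OPS[op](a, b)
    # The combined list is what the program sees when it resumes.
    return results


combined = execute_vliw([("add", 1, 2), ("div", 9, 3), ("mul", 4, 5)])
print(combined)
```

    The key property in this model is that the hardware slots are not re-run in software; only the flagged slot takes the slow path, and the results are merged by slot position.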

    Thread switch logic in a multiple-thread processor
    9.
    Granted invention patent
    Legal status: active

    Publication number: US06341347B1

    Publication date: 2002-01-22

    Application number: US09309733

    Filing date: 1999-05-11

    IPC classification: G06F9/30

    Abstract: A processor includes a thread switching control logic that performs a fast thread-switching operation in response to an L1 cache miss stall. The fast thread-switching operation implements one or more of several thread-switching methods. A first thread-switching operation is “oblivious” thread-switching for every N cycle in which the individual flip-flops locally determine a thread-switch without notification of stalling. The oblivious technique avoids usage of an extra global interconnection between threads for thread selection. A second thread-switching operation is “semi-oblivious” thread-switching for use with an existing “pipeline stall” signal (if any). The pipeline stall signal operates in two capacities, first as a notification of a pipeline stall, and second as a thread select signal between threads so that, again, usage of an extra global interconnection between threads for thread selection is avoided. A third thread-switching operation is an “intelligent global scheduler” thread-switching in which a thread switch decision is based on a plurality of signals including: (1) an L1 data cache miss stall signal, (2) an instruction buffer empty signal, (3) an L2 cache miss signal, (4) a thread priority signal, (5) a thread timer signal, (6) an interrupt signal, or other sources of triggering. In some embodiments, the thread select signal is broadcast as fast as possible, similar to a clock tree distribution. In some systems, a processor derives a thread select signal that is applied to the flip-flops by overloading a scan enable (SE) signal of a scannable flip-flop.
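
    Illustrative sketch (not taken from the patent): the three thread-switching methods can be contrasted as simple selection functions. The function names, the signal names, the round-robin choice of next thread, and the rule that any asserted signal triggers a switch are assumptions; the abstract does not specify how the scheduler weighs its inputs.

```python
# Hypothetical selection functions for the three methods named in the abstract.
def oblivious_select(cycle, n, num_threads):
    """Switch every n cycles, using only a local cycle count."""
    return (cycle // n) % num_threads


def semi_oblivious_select(current, pipeline_stall, num_threads):
    """Reuse the existing pipeline stall signal as the thread select trigger."""
    return (current + 1) % num_threads if pipeline_stall else current


TRIGGERS = ("l1_miss_stall", "ibuf_empty", "l2_miss",
            "priority", "timer", "interrupt")


def global_scheduler_select(current, signals, num_threads):
    """Decide from several signals; here any asserted trigger forces a switch."""
    if any(signals.get(name, False) for name in TRIGGERS):
        return (current + 1) % num_threads
    return current


schedule = [oblivious_select(c, n=4, num_threads=2) for c in range(10)]
after_stall = semi_oblivious_select(0, pipeline_stall=True, num_threads=2)
after_miss = global_scheduler_select(1, {"l2_miss": True}, num_threads=2)
print(schedule, after_stall, after_miss)
```

    The oblivious function needs no input beyond the cycle count, which reflects why that method avoids an extra global interconnection for thread selection.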

    Temporary pipeline register file for a superpipelined superscalar processor
    10.
    Granted invention patent
    Legal status: expired

    Publication number: US6128721A

    Publication date: 2000-10-03

    Application number: US153814

    Filing date: 1993-11-17

    IPC classification: G06F9/30 G06F9/38 G06F15/00

    Abstract: A processor method and apparatus. The processor has an execution pipeline, a register file and a controller. The execution pipeline is for executing an instruction and has a first stage for generating a first result and a last stage for generating a final result. The register file is for storing the first result and the final result. The controller makes the first result stored in the register file available in the event that the first result is needed for the execution of a subsequent instruction. By storing the result of the first stage in the register file, the length of the execution pipeline is reduced from that of the prior art. Furthermore, logic required for providing inputs to the execution pipeline is greatly simplified over that required by the prior art.
