Method and logical apparatus for switching between single-threaded and multi-threaded execution states in a simultaneous multi-threaded (SMT) processor
    41.
    Granted patent
    Status: No longer in force

    Publication No.: US07155600B2

    Publication date: 2006-12-26

    Application No.: US10422648

    Filing date: 2003-04-24

    CPC classification: G06F9/485

    Abstract: A method and logical apparatus for switching between single-threaded and multi-threaded execution states within a simultaneous multi-threaded (SMT) processor provides a mechanism for switching between single-threaded and multi-threaded execution. The processor receives an instruction specifying a transition from a single-threaded to a multi-threaded mode or vice-versa and halts execution of all threads executing on the processor. Internal control logic controls a sequence of events that ends instruction prefetching, dispatch of new instructions, interrupt processing and maintenance operations and waits for operation of the processor to complete for instructions that are in process. Then, the logic determines one or more threads to start in conformity with a thread enable state specifying the enable state of multiple threads and reallocates various resources, dividing them between threads if multiple threads are specified for further execution (multi-threaded mode) or allocating substantially all of the resources to a single thread if further execution is specified as single-threaded mode. The processor then starts execution of the remaining enabled threads.
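
The sequence in this abstract can be sketched as a small software model. This is only an illustration under stated assumptions, not the patented logic: the class `SmtCore`, the 64-entry resource size, the even split between threads, and giving the full resource to a lone thread are all hypothetical choices.

```python
from dataclasses import dataclass, field


@dataclass
class SmtCore:
    total_entries: int = 64      # hypothetical size of one shared resource
    in_flight: int = 0           # instructions still in process
    allocation: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def switch_mode(self, thread_enable):
        """Apply a new thread enable state given as a list of booleans."""
        enabled = [tid for tid, on in enumerate(thread_enable) if on]
        if not enabled:
            raise ValueError("at least one thread must be enabled")
        self.log.append("halt all threads")
        for activity in ("prefetch", "dispatch", "interrupts", "maintenance"):
            self.log.append(f"stop {activity}")
        while self.in_flight > 0:    # wait for in-process instructions
            self.in_flight -= 1
        # Assumed policy: split the resource evenly among enabled threads.
        share = self.total_entries // len(enabled)
        self.allocation = {tid: share for tid in enabled}
        self.log.append(f"start threads {enabled}")
        return self.allocation


core = SmtCore(in_flight=3)
multi = core.switch_mode([True, True])     # multi-threaded: split evenly
single = core.switch_mode([False, True])   # single-threaded: thread 1 gets all
```

In this model the only difference between the two modes is how many threads the enable state names, which mirrors the abstract's use of one thread enable state for both directions of the switch.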

    Enhanced STCX design to improve subsequent load efficiency
    42.
    Patent application
    Status: In force

    Publication No.: US20060212653A1

    Publication date: 2006-09-21

    Application No.: US11082761

    Filing date: 2005-03-17

    IPC classification: G06F12/00

    Abstract: A method, system and computer program product for processing in a multiprocessor data processing system are disclosed. The method includes, in response to executing a load-and-reserve instruction in a processor core, the processing core sending a load-and-reserve operation for an address to a lower level cache of a memory hierarchy, invalidating data for the address in a store-through upper level cache, and placing data returned from the lower level cache into the store-through upper level cache.
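
A toy model may help show why the steps in this abstract help a later load. It is a sketch under assumptions: the lower-level cache is a plain dictionary, `StoreThroughL1` is a hypothetical name, and reservation tracking and coherence traffic are left out.

```python
class StoreThroughL1:
    """Upper-level cache in front of a lower-level cache modelled as a dict."""

    def __init__(self, lower_level):
        self.lower_level = lower_level
        self.lines = {}

    def load(self, addr):
        if addr not in self.lines:              # miss: fill from below
            self.lines[addr] = self.lower_level[addr]
        return self.lines[addr]

    def load_and_reserve(self, addr):
        data = self.lower_level[addr]           # operation sent to lower level
        self.lines.pop(addr, None)              # invalidate any upper-level copy
        self.lines[addr] = data                 # install the returned data
        return data


l2 = {0x40: 7}
l1 = StoreThroughL1(l2)
l1.lines[0x40] = 99                 # pretend the upper-level copy is stale
reserved = l1.load_and_reserve(0x40)
followup = l1.load(0x40)            # later load hits the refreshed line
```

Because the returned data is installed in the upper-level cache, the follow-up load is served there instead of going back to the lower level.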

    Thread priority method, apparatus, and computer program product for ensuring processing fairness in simultaneous multi-threading microprocessors
    43.
    Patent application
    Status: No longer in force

    Publication No.: US20060184946A1

    Publication date: 2006-08-17

    Application No.: US11055850

    Filing date: 2005-02-11

    IPC classification: G06F9/46

    Abstract: A method, apparatus, and computer program product are disclosed in a data processing system for ensuring processing fairness in simultaneous multi-threading (SMT) microprocessors that concurrently execute multiple threads during each clock cycle. A clock cycle priority is assigned to a first thread and to a second thread during a standard selection state that lasts for an expected number of clock cycles. The clock cycle priority is assigned according to a standard selection definition during the standard selection state by selecting the first thread to be a primary thread and the second thread to be a secondary thread during the standard selection state. If a condition exists that requires overriding the standard selection definition, an override state is executed during which the standard selection definition is overridden by selecting the second thread to be the primary thread and the first thread to be the secondary thread. The override state is forced to be executed for an override period of time which equals the expected number of clock cycles plus a forced number of clock cycles. The forced number of clock cycles is granted to the first thread in response to the first thread again becoming the primary thread.

    Enabling and disabling cache bypass using predicted cache line usage
    44.
    Patent application
    Status: No longer in force

    Publication No.: US20060112233A1

    Publication date: 2006-05-25

    Application No.: US10993531

    Filing date: 2004-11-19

    IPC classification: G06F12/00

    CPC classification: G06F12/0888 G06F12/0897

    Abstract: Arrangements and method for enabling and disabling cache bypass in a computer system with a cache hierarchy. Cache bypass status is identified with respect to at least one cache line. A cache line identified as cache bypass enabled is transferred to one or more higher level caches of the cache hierarchy, whereby a next higher level cache in the cache hierarchy is bypassed, while a cache line identified as cache bypass disabled is transferred to one or more higher level caches of the cache hierarchy, whereby a next higher level cache in the cache hierarchy is not bypassed. Included is an arrangement for selectively enabling or disabling cache bypass with respect to at least one cache line based on historical cache access information.
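
One way to picture per-line bypass driven by access history is the sketch below. Everything specific in it is an assumption for illustration: the three-level hierarchy with the line moving up from L3, the `BypassPredictor` class, and the rule that two consecutive evictions without reuse enable bypass.

```python
class BypassPredictor:
    def __init__(self, threshold=2):
        self.threshold = threshold
        self.unused_evictions = {}   # line address -> evictions without reuse

    def record_eviction(self, line, was_reused):
        """Update history when the line leaves the cache that may be bypassed."""
        if was_reused:
            self.unused_evictions[line] = 0
        else:
            self.unused_evictions[line] = self.unused_evictions.get(line, 0) + 1

    def bypass_enabled(self, line):
        return self.unused_evictions.get(line, 0) >= self.threshold


def fill_targets(line, predictor):
    """Caches that receive the line on its way up from L3."""
    return ["L1"] if predictor.bypass_enabled(line) else ["L2", "L1"]


predictor = BypassPredictor()
before = fill_targets(0x80, predictor)        # no history yet: fill L2 and L1
predictor.record_eviction(0x80, was_reused=False)
predictor.record_eviction(0x80, was_reused=False)
after = fill_targets(0x80, predictor)         # history says L2 copy is wasted
```

A single reuse resets the counter in this sketch, so bypass is disabled again as soon as the history changes.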

    Performance throttling for temperature reduction in a microprocessor
    45.
    Granted patent
    Status: No longer in force

    Publication No.: US07051221B2

    Publication date: 2006-05-23

    Application No.: US10425399

    Filing date: 2003-04-28

    IPC classification: G06F1/32

    Abstract: A microprocessor includes a functional block having dynamic power savings circuitry, a functional block control circuit, and a thermal control unit. The functional block control circuits are capable of altering performance characteristics of their associated functional blocks automatically upon detecting an over temperature condition. The thermal control unit receives an over-temperature signal indicating a processor temperature exceeding a threshold and invokes the one or more of the functional block control units in response to the signal. The functional block control units respond to signals from the thermal control unit by reducing processor activity, slowing processor performance, or both. The reduced activity that results causes the dynamic power saving circuitry to engage. The functional block control units can throttle performance by numerous means including reducing the exploitable parallelism within the processor, suspending out-of-order execution, reducing effective resource size, and the like.
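
The control flow described here, a thermal control unit invoking per-block controls when a threshold is exceeded, can be sketched as follows. The class names, the 85 °C threshold, and the specific throttling actions (halving issue width, turning off out-of-order execution) are hypothetical examples of the "numerous means" the abstract lists.

```python
from dataclasses import dataclass


@dataclass
class FunctionalBlockControl:
    name: str
    issue_width: int
    out_of_order: bool = True

    def throttle(self):
        # Assumed actions: reduce parallelism and suspend out-of-order execution.
        self.issue_width = max(1, self.issue_width // 2)
        self.out_of_order = False


class ThermalControlUnit:
    def __init__(self, threshold_c, controls):
        self.threshold_c = threshold_c
        self.controls = controls

    def on_temperature(self, temp_c):
        """Invoke every block control when the temperature exceeds the threshold."""
        over = temp_c > self.threshold_c
        if over:
            for control in self.controls:
                control.throttle()
        return over


fxu = FunctionalBlockControl("fixed-point", issue_width=4)
tcu = ThermalControlUnit(threshold_c=85.0, controls=[fxu])
cool = tcu.on_temperature(70.0)    # below threshold: nothing changes
hot = tcu.on_temperature(90.0)     # above threshold: block is throttled
```

The model stops at the throttling step; the dynamic power saving circuitry that engages once activity drops is not represented.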

    Branch prediction circuits and methods and systems using the same
    46.
    Granted patent
    Status: No longer in force

    Publication No.: US07000096B1

    Publication date: 2006-02-14

    Application No.: US09631726

    Filing date: 2000-08-03

    Applicant: Balaram Sinharoy

    Inventor: Balaram Sinharoy

    IPC classification: G06F9/00

    CPC classification: G06F9/3861 G06F9/3848

    Abstract: A method of generating a Global History Vector includes the steps of determining if a selected group of instructions contains a branch instruction. A current Global History Vector is maintained in a shift register when the selected group does not contain a branch instruction. A first value is shifted into the shift register to generate a second vector if the selected group contains a branch instruction which is predicted to be a branch taken. A second value is shifted into the shift register to generate a second vector when the selected group contains a branch instruction and the selected group does not include a branch instruction predicted to be a branch taken.
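
The three cases in this abstract map onto a short update function. This is a sketch with assumed details: the abstract does not give the "first" and "second" values, so 1 and 0 are used here, and the 11-bit register width is likewise hypothetical.

```python
def update_ghv(ghv, group, bits=11):
    """Return the next Global History Vector for one fetched group.

    group is a list of (is_branch, predicted_taken) pairs, one per instruction.
    """
    predictions = [taken for is_branch, taken in group if is_branch]
    if not predictions:
        return ghv                       # no branch: vector is maintained
    # Assumed encoding: 1 if any branch is predicted taken, otherwise 0.
    new_bit = 1 if any(predictions) else 0
    return ((ghv << 1) | new_bit) & ((1 << bits) - 1)


no_branch = update_ghv(0b101, [(False, False)])
taken = update_ghv(0b101, [(False, False), (True, True)])
not_taken = update_ghv(0b101, [(True, False)])
```

The mask keeps the vector at a fixed width, so the oldest history bit falls off the top as each new bit is shifted in.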

    Apparatus and method for performing branch predictions using dual branch history tables and for updating such branch history tables
    47.
    Granted patent
    Status: No longer in force

    Publication No.: US06823446B1

    Publication date: 2004-11-23

    Application No.: US09549154

    Filing date: 2000-04-13

    Applicant: Balaram Sinharoy

    Inventor: Balaram Sinharoy

    IPC classification: G06F9/00

    CPC classification: G06F9/3848

    Abstract: A branch prediction method includes the step of retrieving prediction values from a local branch history table and a global branch history table. A branch prediction operation is selectively performed using the value retrieved from the local branch history table when the value from the local branch history table falls within first predetermined limits. A branch prediction operation is selectively performed using the value retrieved from the global branch history table when the value from the global branch history falls within a second predetermined limit.
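
The selection between the two tables can be sketched as below. The abstract does not say what the limits are or how values are encoded, so this example assumes 2-bit saturating counters (0 to 3), models each limit as the set of counter values for which a table is trusted, and adds a fallback to the local table when neither value qualifies. All of these are illustrative assumptions.

```python
STRONG_STATES = frozenset({0, 3})   # assumed: only saturated counters are trusted


def predict_taken(local_value, global_value,
                  local_limits=STRONG_STATES, global_limits=STRONG_STATES):
    """Return (taken, source) from values read out of both history tables."""
    if local_value in local_limits:
        return local_value >= 2, "local"
    if global_value in global_limits:
        return global_value >= 2, "global"
    # Assumed fallback, not stated in the abstract.
    return local_value >= 2, "local-fallback"


from_local = predict_taken(3, 1)      # local counter is saturated taken
from_global = predict_taken(2, 0)     # local is weak, global is saturated
fallback = predict_taken(1, 2)        # neither table is in a trusted state
```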

    Prefetching instructions in mis-predicted path for low confidence branches
    48.
    Granted patent
    Status: In force

    Publication No.: US06766441B2

    Publication date: 2004-07-20

    Application No.: US09765163

    Filing date: 2001-01-19

    Applicant: Balaram Sinharoy

    Inventor: Balaram Sinharoy

    IPC classification: G06F9/38

    Abstract: In a first aspect of the present invention, a method for prefetching instructions in a superscalar processor is disclosed. The method comprises the steps of fetching a set of instructions along a predicted path and prefetching a predetermined number of instructions if a low confidence branch is fetched and storing the predetermined number of instructions in a prefetch buffer. In a second aspect of the present invention, a system for prefetching instructions in a superscalar processor is disclosed. The system comprises a cache for fetching a set of instructions along a predicted path, a prefetching mechanism coupled to the cache for prefetching a predetermined number of instructions if a low confidence branch is fetched and a prefetch buffer coupled to the prefetching mechanism for storing the predetermined number of instructions. Through the use of the method and system in accordance with the present invention, existing prefetching algorithms are improved with minimal additional hardware cost.
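
Read together with the title, the abstract suggests that fetch continues down the predicted path while a fixed number of instructions from the other path are parked in a prefetch buffer. A minimal sketch follows; the 4-byte instruction size, the function name `handle_branch`, and the default of four prefetched instructions are assumptions.

```python
INSTRUCTION_BYTES = 4   # assumed fixed instruction size


def handle_branch(pc, target, predicted_taken, low_confidence, prefetch_count=4):
    """Return (next fetch address, addresses placed in the prefetch buffer)."""
    fall_through = pc + INSTRUCTION_BYTES
    next_fetch = target if predicted_taken else fall_through
    prefetch_buffer = []
    if low_confidence:
        # Prefetch from the path the predictor did NOT choose.
        alternate = fall_through if predicted_taken else target
        prefetch_buffer = [alternate + i * INSTRUCTION_BYTES
                           for i in range(prefetch_count)]
    return next_fetch, prefetch_buffer


taken_low = handle_branch(0x100, 0x200, True, True, prefetch_count=2)
not_taken_low = handle_branch(0x100, 0x200, False, True, prefetch_count=2)
confident = handle_branch(0x100, 0x200, True, False)
```

If the low-confidence branch does turn out to be mispredicted, the buffered addresses are exactly where fetch has to restart.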

    Processor and method for partially flushing a dispatched instruction group including a mispredicted branch
    49.
    Granted patent
    Status: In force

    Publication No.: US09489207B2

    Publication date: 2016-11-08

    Application No.: US12423495

    Filing date: 2009-04-14

    IPC classification: G06F9/30 G06F9/38

    Abstract: Mechanisms are provided for partial flush handling with multiple branches per instruction group. The instruction fetch unit sorts instructions into groups. A group may include a floating branch instruction and a boundary branch instruction. For each group of instructions, the instruction sequencing unit creates an entry in a global completion table (GCT), which may also be referred to herein as a group completion table. The instruction sequencing unit uses the GCT to manage completion of instructions within each outstanding group. Because each group may include up to two branches, the instruction sequencing unit may dispatch instructions beyond the first branch, i.e. the floating branch. Therefore, if the floating branch results in a misprediction, the processor performs a partial flush of that group, as well as a flush of every group younger than that group.
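
The effect of a floating-branch misprediction on the outstanding groups can be sketched with plain lists. This is an illustration only: the completion table is modelled as an oldest-first list of groups, instructions are string tags, and `partial_flush` is a hypothetical helper name.

```python
def partial_flush(groups, group_index, branch_slot):
    """Split outstanding instructions into kept groups and flushed tags.

    groups is an oldest-first list of instruction groups (lists of tags);
    the mispredicted floating branch sits at groups[group_index][branch_slot].
    """
    hit = groups[group_index]
    # Older groups and everything up to and including the branch survive.
    kept = groups[:group_index] + [hit[:branch_slot + 1]]
    # Partial flush: the rest of the branch's own group.
    flushed = hit[branch_slot + 1:]
    # Full flush: every younger group.
    for younger in groups[group_index + 1:]:
        flushed.extend(younger)
    return kept, flushed


outstanding = [["a0", "a1"], ["b0", "br", "b2", "b3"], ["c0", "c1"]]
kept, flushed = partial_flush(outstanding, group_index=1, branch_slot=1)
```

Here `br` is the floating branch in the middle group, so `b2` and `b3` are removed by the partial flush while `c0` and `c1` go with the flush of the younger group.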
