Apparatus and method for fine-grained multithreading in a multipipelined processor core
    1.
    Granted invention patent (status: in force)

    Publication number: US07401206B2

    Publication date: 2008-07-15

    Application number: US10880488

    Filing date: 2004-06-30

    IPC classification: G06F9/34

    Abstract: An apparatus and method for fine-grained multithreading in a multipipelined processor core. According to one embodiment, a processor may include instruction fetch logic configured to assign a given one of a plurality of threads to a corresponding one of a plurality of thread groups, where each of the plurality of thread groups may comprise a subset of the plurality of threads, to issue a first instruction from one of the plurality of threads during one execution cycle, and to issue a second instruction from another one of the plurality of threads during a successive execution cycle. The processor may further include a plurality of execution units, each configured to execute instructions issued from a respective thread group.
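
    The issue pattern described in this abstract can be sketched roughly as follows. This is only an illustration, not the patented design: the contiguous grouping policy, the round-robin order within a group, and the names assign_thread_groups and issue_schedule are all assumptions made for the example.

```python
from collections import deque

def assign_thread_groups(thread_ids, num_groups):
    """Split threads into contiguous groups (hypothetical assignment policy)."""
    size = -(-len(thread_ids) // num_groups)  # ceiling division
    return [deque(thread_ids[i:i + size])
            for i in range(0, len(thread_ids), size)]

def issue_schedule(groups, cycles):
    """Each cycle, every group issues from its next thread (assumed round-robin).

    Group index doubles as the index of the execution unit serving that group.
    """
    schedule = []
    for _ in range(cycles):
        issued = []
        for unit, group in enumerate(groups):
            thread = group[0]
            group.rotate(-1)  # a different thread issues in the next cycle
            issued.append((unit, thread))
        schedule.append(issued)
    return schedule

groups = assign_thread_groups(list(range(8)), 2)
schedule = issue_schedule(groups, 3)
```

    Here eight threads are split into two groups of four, and each execution unit sees instructions from a different thread of its own group on successive cycles.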

    Multi-threaded instruction buffer design

    Publication number: US10346173B2

    Publication date: 2019-07-09

    Application number: US13041881

    Filing date: 2011-03-07

    IPC classification: G06F9/30 G06F9/38

    Abstract: An instruction buffer for a processor configured to execute multiple threads is disclosed. The instruction buffer is configured to receive instructions from a fetch unit and provide instructions to a selection unit. The instruction buffer includes one or more memory arrays comprising a plurality of entries configured to store instructions and/or other information (e.g., program counter addresses). One or more indicators are maintained by the processor and correspond to the plurality of threads. The one or more indicators are usable such that for instructions received by the instruction buffer, one or more of the plurality of entries of a memory array can be determined as a write destination for the received instructions, and for instructions to be read from the instruction buffer (and sent to a selection unit), one or more entries can be determined as the correct source location from which to read.
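
    One way to picture the per-thread indicators is as read and write pointers into a shared array. The sketch below assumes a fixed partition of entries per thread and circular pointers; neither detail is stated in the abstract, and the class and method names are hypothetical.

```python
class InstructionBuffer:
    """Shared memory array with per-thread read/write pointers (assumed layout)."""

    def __init__(self, num_threads, entries_per_thread):
        self.entries_per_thread = entries_per_thread
        self.array = [None] * (num_threads * entries_per_thread)
        self.write_ptr = [0] * num_threads  # per-thread write indicator
        self.read_ptr = [0] * num_threads   # per-thread read indicator
        self.count = [0] * num_threads      # occupied entries per thread

    def _index(self, thread, ptr):
        # Map a thread-local pointer to an entry of the shared array.
        return thread * self.entries_per_thread + ptr

    def write(self, thread, instruction):
        """Store an instruction from the fetch unit; return False if full."""
        if self.count[thread] == self.entries_per_thread:
            return False
        self.array[self._index(thread, self.write_ptr[thread])] = instruction
        self.write_ptr[thread] = (self.write_ptr[thread] + 1) % self.entries_per_thread
        self.count[thread] += 1
        return True

    def read(self, thread):
        """Return the oldest instruction for the selection unit, or None if empty."""
        if self.count[thread] == 0:
            return None
        instruction = self.array[self._index(thread, self.read_ptr[thread])]
        self.read_ptr[thread] = (self.read_ptr[thread] + 1) % self.entries_per_thread
        self.count[thread] -= 1
        return instruction
```

    The pointers alone determine both the write destination and the read source for each thread, which is the role the abstract gives to the indicators.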

    System and method for balancing instruction loads between multiple execution units using assignment history
    4.
    Granted invention patent (status: in force)

    Publication number: US09122487B2

    Publication date: 2015-09-01

    Application number: US12490005

    Filing date: 2009-06-23

    IPC classification: G06F9/38

    Abstract: A system and method for balancing instruction loads between multiple execution units are disclosed. One or more execution units may be represented by a slot configured to accept instructions on behalf of the execution unit(s). A decode unit may assign instructions to a particular slot for subsequent scheduling for execution. Slot assignments may be made based on an instruction's type and/or on a history of previous slot assignments. A cumulative slot assignment history may be maintained in a bias counter, the value of which reflects the bias of previous slot assignments. Slot assignments may be determined based on the value of the bias counter, in order to balance the instruction load across all slots, and all execution units. The bias counter may reflect slot assignments made only within a desired historical window. A separate data structure may store data reflecting the actual slot assignments made during the desired historical window.
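
    The bias counter and its historical window might be modeled as below. This is a sketch under assumptions: two slots, a signed counter that moves by one per assignment, and a window of four assignments. The SlotAssigner name and its interface are invented for the example.

```python
from collections import deque

class SlotAssigner:
    """Two-slot balancer driven by a windowed bias counter (assumed encoding)."""

    def __init__(self, window=4):
        self.bias = 0           # > 0: slot 0 used more recently; < 0: slot 1
        self.history = deque()  # actual assignments inside the window, as +1/-1
        self.window = window

    def assign(self, required_slot=None):
        """Pick a slot; required_slot models an instruction type tied to one slot."""
        if required_slot is not None:
            slot = required_slot
        else:
            slot = 1 if self.bias > 0 else 0  # steer toward the less-used slot
        delta = 1 if slot == 0 else -1
        self.bias += delta
        self.history.append(delta)
        if len(self.history) > self.window:
            # Retire the oldest assignment so the counter reflects the window only.
            self.bias -= self.history.popleft()
        return slot

assigner = SlotAssigner(window=4)
slots = [assigner.assign(0), assigner.assign(0), assigner.assign(), assigner.assign()]
```

    After two instructions forced into slot 0, the next two unconstrained instructions go to slot 1, which returns the counter to zero.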

    Accessing a multibank register file using a thread identifier
    5.
    Granted invention patent (status: in force)

    Publication number: US08458446B2

    Publication date: 2013-06-04

    Application number: US12570682

    Filing date: 2009-09-30

    IPC classification: G06F9/30

    Abstract: A processor includes an instruction fetch unit configured to issue instructions for execution, where the instructions are selected from a number of threads, where each given instruction has a corresponding thread identifier, and where at least some of the instructions specify operand(s) via register identifiers. A register file stores operands usable by the instructions, and may include several banks, each corresponding to a register identifier and including several entries corresponding to the several threads, wherein the entries are configured to store data values. In response to receiving a request to read a particular register identifier for a given thread identifier, the register file may be configured to decode the given thread identifier to retrieve entries from the banks that correspond to the given thread identifier. The register file may further select, from among the retrieved entries, a data value corresponding to the particular register identifier to be output.
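
    The two-step read described here (decode the thread identifier first, then select by register identifier) can be sketched as follows. The sizes and the BankedRegisterFile name are assumptions for illustration; a real register file would do both steps in hardware rather than with lists.

```python
class BankedRegisterFile:
    """One bank per register identifier, one entry per thread in each bank."""

    def __init__(self, num_registers, num_threads):
        self.banks = [[0] * num_threads for _ in range(num_registers)]

    def write(self, thread_id, reg_id, value):
        self.banks[reg_id][thread_id] = value

    def read(self, thread_id, reg_id):
        # Step 1: decode the thread identifier and retrieve that thread's
        # entry from every bank.
        retrieved = [bank[thread_id] for bank in self.banks]
        # Step 2: select the value for the requested register identifier.
        return retrieved[reg_id]
```

    Decoding the thread identifier first means the register-identifier selection only has to choose among one entry per bank.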

    Method and system for sharing functional units of a multithreaded processor
    6.
    Granted invention patent (status: in force)

    Publication number: US08095778B1

    Publication date: 2012-01-10

    Application number: US10880712

    Filing date: 2004-06-30

    Applicant: Robert T. Golla

    Inventor: Robert T. Golla

    IPC classification: G06F9/40

    Abstract: Sharing functional units within a multithreaded processor. In one embodiment, the multithreaded processor may include a multithreaded instruction source that may provide an instruction from each of a plurality of thread groups in a given cycle. A given thread group may include one or more instructions from one or more threads. Arbitration functionality may arbitrate between the plurality of thread groups for access to a functional unit, such as a load store unit, that may be shared between the thread groups.

    THREAD FAIRNESS ON A MULTI-THREADED PROCESSOR WITH MULTI-CYCLE CRYPTOGRAPHIC OPERATIONS
    7.
    Invention patent application (status: in force)

    Publication number: US20110276783A1

    Publication date: 2011-11-10

    Application number: US12773278

    Filing date: 2010-05-04

    IPC classification: G06F9/38

    Abstract: Systems and methods for efficient execution of operations in a multi-threaded processor. Each thread may include a blocking instruction. A blocking instruction blocks other threads from utilizing hardware resources for an appreciable amount of time. One example of a blocking type instruction is a Montgomery multiplication cryptographic instruction. Each thread can operate in a thread-based mode that allows the insertion of stall cycles during the execution of blocking instructions, during which other threads may utilize the previously blocked hardware resources. At times when multiple threads are scheduled to execute blocking instructions, the thread-based mode may be changed to increase throughput for these multiple threads. For example, the mode may be changed to disallow the insertion of stall cycles. Therefore, the time for sequential operation of the blocking instructions corresponding to the multiple threads may be reduced.

    Register access protocol in a multihreaded multi-core processor
    8.
    Granted invention patent (status: in force)

    Publication number: US07747771B1

    Publication date: 2010-06-29

    Application number: US10881178

    Filing date: 2004-06-30

    IPC classification: G06F15/16 G06F15/76 G06F13/00

    CPC classification: G06F15/16

    Abstract: A method and mechanism for managing access to a plurality of registers in a processing device are contemplated. A processing device includes multiple nodes coupled to a ring bus, each of which includes one or more registers which may be accessed by processes executing within the device. Also coupled to the ring bus is a ring control unit which is configured to initiate transactions targeted to nodes on the ring bus. Each of the nodes is configured to receive and process bus transactions with a fixed latency, whether or not the transaction is targeted to the receiving node. The ring control unit is configured to periodically convey idle transactions on the ring bus in order to allow nodes responding to indeterminate transactions to gain access to the bus.

    Arbitration of window swap operations
    9.
    Granted invention patent (status: in force)

    Publication number: US07426630B1

    Publication date: 2008-09-16

    Application number: US10881151

    Filing date: 2004-06-30

    Abstract: In one embodiment, a processor comprises a register file, register management logic coupled to the register file, and at least two sources of window swap operations coupled to the register management logic. The register management logic is configured to control an interface to the register file to switch register windows in the register file in response to one or more window swap operations. The sources of window swap operations and the register management logic are configured to cooperate according to an arbitration scheme to arbitrate between conflicting window swap operations to be performed using the interface. In one particular implementation, for example, block signals may be used from higher priority sources to lower priority sources to block issuance of window swap operations by the lower priority sources.
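
    The block-signal implementation mentioned at the end of this abstract resembles a fixed-priority chain. A minimal sketch follows, assuming each source is reduced to a single request flag and that a requesting source blocks every source below it; the function name is hypothetical.

```python
def arbitrate_window_swaps(requests):
    """Return the index of the source allowed to issue, or None.

    requests is ordered from highest to lowest priority; each element is
    True when that source wants to issue a window swap operation.
    """
    blocked = False  # block signal seen by the current source
    granted = None
    for index, wants_swap in enumerate(requests):
        if wants_swap and not blocked:
            granted = index
        # A requesting source asserts the block signal toward all lower
        # priority sources; once asserted, it stays asserted down the chain.
        blocked = blocked or wants_swap
    return granted
```

    With this chain, at most one source issues a window swap per arbitration round, and it is always the highest-priority requester.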

    Software accessible fast VA to PA translation
    10.
    Granted invention patent (status: in force)

    Publication number: US07350053B1

    Publication date: 2008-03-25

    Application number: US11034345

    Filing date: 2005-01-11

    IPC classification: G06F9/26 G06F9/34 G06F12/00

    CPC classification: G06F12/1081 G06F12/1027

    Abstract: A method to communicate data is disclosed which includes communicating a virtual address to a translation lookaside buffer (TLB) and translating the virtual address to a physical address of a computer memory. The method also includes loading the physical address translated by the TLB into a register within a processor and transmitting the data from the physical address to a destination computing device.
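
    The translation step can be illustrated with a toy TLB lookup. The 4 KiB page size, the dictionary-based TLB, and the translate function are assumptions for the example; the abstract does not specify page sizes or how a TLB miss is handled.

```python
PAGE_SIZE = 4096  # assumed page size (4 KiB)

def translate(tlb, virtual_address):
    """Translate a virtual address using a TLB modeled as {VPN: PFN}.

    Returns the physical address, or None on a TLB miss.
    """
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    if vpn not in tlb:
        return None
    return tlb[vpn] * PAGE_SIZE + offset

tlb = {0x12: 0x9A}
# Stands in for loading the translated address into a processor register.
physical_register = translate(tlb, 0x12345)
```

    In this model the translated address is simply returned to software, standing in for the register load described in the abstract.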
