Method and apparatus for token triggered multithreading
    11.
    Type: Granted patent
    Status: Active

    Publication No.: US06842848B2

    Publication date: 2005-01-11

    Application No.: US10269245

    Filing date: 2002-10-11

    CPC classification: G06F9/3867 G06F9/3851

    Abstract: Techniques for token triggered multithreading in a multithreaded processor are disclosed. An instruction issuance sequence for a plurality of threads of the multithreaded processor is controlled by associating with each of the threads at least one register which stores a value identifying a next thread to be permitted to issue one or more instructions, and utilizing the stored value to control the instruction issuance sequence. For example, each of a plurality of hardware thread units of the multithreaded processor may include a corresponding local register updatable by that hardware thread unit, with the local register for a given one of the hardware thread units storing a value identifying the next thread to be permitted to issue one or more instructions after the given hardware thread unit has issued one or more instructions. A global register arrangement may also or alternatively be used. The processor may be configured so as to permit the instruction issuance sequence to correspond to an arbitrary alternating even-odd sequence of threads, without introducing blocking conditions leading to thread stalls.

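    The register-driven issue order described in this abstract can be illustrated with a minimal Python sketch. It is not the patented hardware: the function name issue_sequence, the dictionary standing in for the per-thread local registers, and the four-thread configuration are all assumptions made for illustration.

```python
def issue_sequence(next_thread, start, cycles):
    """Return the order in which threads issue over a number of cycles.

    next_thread maps a thread ID to the ID stored in that thread's
    (hypothetical) local register, i.e. the thread allowed to issue next.
    """
    order = []
    current = start
    for _ in range(cycles):
        order.append(current)
        # The thread that just issued hands the token to the thread
        # named in its own register.
        current = next_thread[current]
    return order


# Hypothetical 4-thread configuration that alternates even and odd IDs.
next_thread = {0: 3, 3: 2, 2: 1, 1: 0}
sequence = issue_sequence(next_thread, start=0, cycles=8)
print(sequence)
```

    Because each register can be rewritten by its own thread unit, changing one dictionary entry in this sketch changes the sequence from that point on, which is the flexibility the abstract attributes to the local-register arrangement.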

    Multithreaded processor with multiple concurrent pipelines per thread
    12.
    Type: Granted patent
    Status: Active

    Publication No.: US08918627B2

    Publication date: 2014-12-23

    Application No.: US12579912

    Filing date: 2009-10-15

    IPC classification: G06F9/38 G06F9/30

    Abstract: A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.

    Power saving circuit using a clock buffer and multiple flip-flops
    14.
    Type: Granted patent
    Status: Active

    Publication No.: US08471597B2

    Publication date: 2013-06-25

    Application No.: US12994115

    Filing date: 2009-05-07

    Abstract: A circuit is described including a clock input for at least one clock signal. Only one clock buffer is connected to the clock input to generate, based on the at least one clock signal, at least a first modified clock signal and a second modified clock signal. A plurality of flip-flops are connected to the clock buffer. Each of the flip-flops receive the first and second modified clock signals. A plurality of data inputs are each connected to at least one of the plurality of flip-flops to provide input data to the plurality of flip-flops. A plurality of data outputs each are connected to at least one of the plurality of flip-flops to provide output data from the plurality of flip-flops. Each of the plurality of flip-flops transform the input data to the output data utilizing the first modified clock signal and the second modified clock signal.

    POWER SAVING CIRCUIT USING A CLOCK BUFFER AND MULTIPLE FLIP-FLOPS
    15.
    Type: Patent application
    Status: Active

    Publication No.: US20110254588A1

    Publication date: 2011-10-20

    Application No.: US12994115

    Filing date: 2009-05-07

    IPC classification: G06F7/38 H01R43/00 H03K3/00

    Abstract: A circuit is described including a clock input for at least one clock signal. Only one clock buffer is connected to the clock input to generate, based on the at least one clock signal, at least a first modified clock signal and a second modified clock signal. A plurality of flip-flops are connected to the clock buffer. Each of the flip-flops receive the first and second modified clock signals. A plurality of data inputs are each connected to at least one of the plurality of flip-flops to provide input data to the plurality of flip-flops. A plurality of data outputs each are connected to at least one of the plurality of flip-flops to provide output data from the plurality of flip-flops. Each of the plurality of flip-flops transform the input data to the output data utilizing the first modified clock signal and the second modified clock signal.

    METHOD FOR ENABLING MULTI-PROCESSOR SYNCHRONIZATION
    16.
    Type: Patent application
    Status: Active

    Publication No.: US20090193279A1

    Publication date: 2009-07-30

    Application No.: US12362329

    Filing date: 2009-01-29

    IPC classification: G06F1/08

    CPC classification: G06F9/526 G06F9/52

    Abstract: A method for providing at least one sequence of values to a plurality of processors is described. In the method, a sequence generator from one or more sequence generators is associated with a memory location. The sequence generator is configured to generate the at least one sequence of values. One or more read accesses of the memory location are enabled by a processor from the plurality of processors. In response to enabling the read access, the sequence generator is executed so that it returns a first value from the sequence of values to the processor. After executing the sequence generator, the sequence generator is advanced so that the next access generates a second value from the sequence of values. The second value is sequentially subsequent to the first value.

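    The read-triggered sequence generator in this abstract can be modeled in software as a rough sketch. The class SequenceMappedMemory, its attach and read methods, and the address 0x1000 are hypothetical names chosen for illustration. In hardware the read and the advance would happen as one indivisible access; this single-threaded model does not try to reproduce that.

```python
import itertools


class SequenceMappedMemory:
    """Toy model of memory locations backed by sequence generators."""

    def __init__(self):
        self._generators = {}

    def attach(self, address, generator):
        # Associate a sequence generator with one memory location.
        self._generators[address] = generator

    def read(self, address):
        # A read returns the current value and advances the generator,
        # so the next read of the same location sees the next value.
        return next(self._generators[address])


TICKET_ADDR = 0x1000  # hypothetical address, not taken from the patent
mem = SequenceMappedMemory()
mem.attach(TICKET_ADDR, itertools.count(0))

# Three reads, as if issued by three different processors.
tickets = [mem.read(TICKET_ADDR) for _ in range(3)]
print(tickets)
```

    Handing each reader a distinct, increasing value is one way such a location could serve as a ticket dispenser for synchronization, consistent with the title of this entry.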

    Method of renaming registers in register file and microprocessor thereof

    Publication No.: US20060294342A1

    Publication date: 2006-12-28

    Application No.: US11511677

    Filing date: 2006-08-29

    Applicant: Mayan Moudgill

    Inventor: Mayan Moudgill

    IPC classification: G06F15/00

    Abstract: A microprocessor for processing instructions comprises multiple clusters for receiving the instructions, each of the clusters having a plurality of functional units for executing the instructions, multiple register sub-files each having multiple registers for storing data for executing the instructions, wherein each of the clusters is associated with corresponding one of the register sub-files so that an instruction dispatched to a cluster is executed by accessing registers in a register sub-file associated with the cluster to which the instruction is dispatched, a register-renaming unit for renaming target registers in an instruction with registers in a register sub-file associated with a cluster to which the instruction is dispatched, and issue-queue units each of which is associated with a corresponding one of the clusters, wherein an issue-queue unit holds instruction renamed by the register-renaming unit until the renamed instruction is issued to be executed in a cluster associated with the issue-queue unit.

    Method and apparatus for multithreaded cache with cache eviction based on thread identifier

    Publication No.: US06990557B2

    Publication date: 2006-01-24

    Application No.: US10161774

    Filing date: 2002-06-04

    IPC classification: G06F12/00

    Abstract: A cache memory for use in a multithreaded processor includes a number of set-associative thread caches, with one or more of the thread caches each implementing a thread-based eviction process that reduces the amount of replacement policy storage required in the cache memory. At least a given one of the thread caches in an illustrative embodiment includes a memory array having multiple sets of memory locations, and a directory for storing tags each corresponding to at least a portion of a particular address of one of the memory locations. The directory has multiple entries each storing multiple ones of the tags, such that if there are n sets of memory locations in the memory array, there are n tags associated with each directory entry. The directory is utilized in implementing a set-associative address mapping between access requests and memory locations of the memory array. An entry in a particular one of the memory locations is selected for eviction from the given thread cache in conjunction with a cache miss event, based at least in part on at least a portion of a thread identifier of the given thread cache.
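
    As a hedged illustration of thread-based eviction, the sketch below picks the victim slot from the low-order bits of the thread identifier instead of keeping replacement state such as LRU bits. The class ThreadCache, its parameters, and the choice of low-order bits are assumptions; the abstract only says that a portion of the thread identifier is used. The code uses the common term "ways" for what the abstract calls the n sets of memory locations.

```python
class ThreadCache:
    """Toy set-associative cache with thread-ID-based victim selection."""

    def __init__(self, thread_id, num_rows=4, num_ways=4):
        # num_ways is assumed to be a power of two so a bit mask works.
        self.thread_id = thread_id
        self.num_rows = num_rows
        self.num_ways = num_ways
        # Directory: one entry per row, each holding num_ways tags.
        self.tags = [[None] * num_ways for _ in range(num_rows)]

    def victim_way(self):
        # No per-entry replacement state is stored: the victim is fixed
        # by the low-order bits of the thread identifier.
        return self.thread_id & (self.num_ways - 1)

    def access(self, address):
        """Return True on a hit, False on a miss (after filling the line)."""
        row = address % self.num_rows
        tag = address // self.num_rows
        if tag in self.tags[row]:
            return True
        self.tags[row][self.victim_way()] = tag
        return False


cache = ThreadCache(thread_id=6)   # 6 = 0b110, so way 2 is the victim
first = cache.access(0x10)         # miss: the line is filled into way 2
second = cache.access(0x10)        # hit
print(first, second, cache.tags[0])
```

    The trade-off in this sketch is that each thread cache always evicts from the same way, which gives up replacement quality in exchange for storing no replacement-policy bits.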

    Method and apparatus for thread-based memory access in a multithreaded processor
    19.
    Type: Granted patent
    Status: Active

    Publication No.: US06925643B2

    Publication date: 2005-08-02

    Application No.: US10269247

    Filing date: 2002-10-11

    Abstract: Techniques for thread-based memory access by a multithreaded processor are disclosed. The multithreaded processor determines a thread identifier associated with a particular processor thread, and utilizes at least a portion of the thread identifier to select a particular portion of an associated memory to be accessed by the corresponding processor thread. In an illustrative embodiment, a first portion of the thread identifier is utilized to select one of a plurality of multiple-bank memory elements within the memory, and a second portion of the thread identifier is utilized to select one of a plurality of memory banks within the selected one of the multiple-bank memory elements. The first portion may comprise one or more most significant bits of the thread identifier, while the second portion comprises one or more least significant bits of the thread identifier. Advantageously, the invention reduces memory access times and power consumption, while preventing the stalling of any processor threads.

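    The two-level selection in this abstract (most significant bits choose a multiple-bank memory element, least significant bits choose a bank inside it) can be written as a short sketch. The function name select_bank and the bit widths used in the example are assumptions, not values from the patent.

```python
def select_bank(thread_id, element_bits, bank_bits):
    """Split a thread ID into (memory element index, bank index).

    The upper element_bits of the ID select the multiple-bank memory
    element; the lower bank_bits select the bank within that element.
    """
    bank = thread_id & ((1 << bank_bits) - 1)
    element = (thread_id >> bank_bits) & ((1 << element_bits) - 1)
    return element, bank


# Hypothetical layout: 3-bit thread IDs, 2 memory elements of 4 banks each.
mapping = {tid: select_bank(tid, element_bits=1, bank_bits=2) for tid in range(8)}
for tid, (element, bank) in mapping.items():
    print(f"thread {tid:03b} -> element {element}, bank {bank}")
```

    In this layout every thread ID maps to a different (element, bank) pair, so no two threads compete for the same bank, which fits the abstract's claim that no thread stalls.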

    Method and apparatus for reducing encoding needs and ports to shared resources in a processor
    20.
    Type: Granted patent
    Status: Expired

    Publication No.: US06704855B1

    Publication date: 2004-03-09

    Application No.: US09585766

    Filing date: 2000-06-02

    IPC classification: G06F9/30

    Abstract: The present invention relates to a method for accessing elements from a shared resource to be used by consumers that perform actions according to corresponding operations. The method creates a packet of operations to be processed simultaneously, wherein the elements from the shared resource used by the operations are specified by source and destination identifier fields that are shared among the operations in such a way that the sum of all the elements from the shared resource used by the operations does not exceed a total number of identifiers available in the packet. The method also reads the elements from the shared resource according to the shared identifier fields specified in the packet. The method decodes a number of elements from the shared resource needed by each operation, by passing the operations to an operation decoder having a defined routing scheme based on the needs of the operations. The method also routes the elements to the consumers performing operations and resulting values to the shared resource, according to a routing signal of the operation decoder.
