Multithreaded processor with multiple concurrent pipelines per thread
    1.
    Type: granted patent
    Status: in force

    Publication No.: US08918627B2

    Publication date: 2014-12-23

    Application No.: US12579912

    Filing date: 2009-10-15

    IPC classification: G06F9/38 G06F9/30

    Abstract: A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.
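
The issuance scheme in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented design: the names `issue_schedule`, `simulate`, and `in_flight` are hypothetical, the issue sequence is assumed to repeat cyclically, each designated thread issues exactly one instruction per turn, and every instruction is assumed to take a fixed `latency` in cycles.

```python
from itertools import cycle, islice


def issue_schedule(issue_sequence, num_cycles):
    """Return the single thread permitted to issue on each clock cycle."""
    return list(islice(cycle(issue_sequence), num_cycles))


def simulate(issue_sequence, programs, num_cycles):
    """Issue at most one thread's next instruction per cycle.

    programs maps a thread id to its ordered list of instructions.
    Returns a trace of (cycle, thread_id, instruction) tuples.
    """
    pending = {tid: list(instrs) for tid, instrs in programs.items()}
    trace = []
    for clk, tid in enumerate(issue_schedule(issue_sequence, num_cycles)):
        if pending.get(tid):  # the designated thread still has work
            trace.append((clk, tid, pending[tid].pop(0)))
    return trace


def in_flight(trace, thread_id, clk, latency):
    """Count a thread's instructions issued but not yet completed at clk."""
    return sum(
        1
        for issued, tid, _ in trace
        if tid == thread_id and issued <= clk < issued + latency
    )


# Two threads alternate; each instruction is assumed to take 3 cycles.
trace = simulate([0, 1], {0: ["a0", "a1"], 1: ["b0", "b1"]}, num_cycles=4)
print(trace)
print(in_flight(trace, thread_id=0, clk=2, latency=3))
```

With two threads alternating and a 3-cycle latency, thread 0 issues again at cycle 2 while its first instruction is still executing, so it has two instructions in flight at once. That overlap is one way to read "multiple concurrent instruction pipelines" for a single thread.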

    Method and apparatus for multithreaded cache with cache eviction based on thread identifier

    Publication No.: US06990557B2

    Publication date: 2006-01-24

    Application No.: US10161774

    Filing date: 2002-06-04

    IPC classification: G06F12/00

    Abstract: A cache memory for use in a multithreaded processor includes a number of set-associative thread caches, with one or more of the thread caches each implementing a thread-based eviction process that reduces the amount of replacement policy storage required in the cache memory. At least a given one of the thread caches in an illustrative embodiment includes a memory array having multiple sets of memory locations, and a directory for storing tags each corresponding to at least a portion of a particular address of one of the memory locations. The directory has multiple entries each storing multiple ones of the tags, such that if there are n sets of memory locations in the memory array, there are n tags associated with each directory entry. The directory is utilized in implementing a set-associative address mapping between access requests and memory locations of the memory array. An entry in a particular one of the memory locations is selected for eviction from the given thread cache in conjunction with a cache miss event, based at least in part on at least a portion of a thread identifier of the given thread cache.
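
A minimal sketch of thread-based eviction follows. It rests on several assumptions: the class `ThreadCache` and its fields are hypothetical, the abstract's n "sets" are modeled as the n tag slots of each directory entry, empty slots are filled before anything is evicted, and the victim slot is chosen from the low-order bits of the thread identifier alone. The abstract only says the choice is based "at least in part" on the identifier, so a real design may combine it with a small amount of policy state.

```python
class ThreadCache:
    """Toy thread cache: each directory entry holds one tag per slot."""

    def __init__(self, thread_id, num_entries=4, num_slots=4):
        self.thread_id = thread_id
        self.num_entries = num_entries
        self.num_slots = num_slots
        # directory[entry][slot] is a tag, or None when the slot is empty
        self.directory = [[None] * num_slots for _ in range(num_entries)]

    def _split(self, address):
        """Split an address into (directory entry index, tag)."""
        return address % self.num_entries, address // self.num_entries

    def access(self, address):
        """Return 'hit' or 'miss'; on a miss, fill or evict a slot."""
        index, tag = self._split(address)
        slots = self.directory[index]
        if tag in slots:
            return "hit"
        if None in slots:
            slots[slots.index(None)] = tag  # assumption: fill empties first
        else:
            # No per-entry replacement state: the thread id picks the victim.
            victim = self.thread_id % self.num_slots
            slots[victim] = tag
        return "miss"


cache = ThreadCache(thread_id=2, num_entries=1, num_slots=4)
results = [cache.access(a) for a in (0, 1, 2, 3, 4)]
print(results, cache.directory[0])
```

In this run the fifth access misses on a full entry, and thread identifier 2 selects slot 2 as the victim. No LRU or similar bits are stored per entry, which is the storage saving the abstract describes.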

    Method and apparatus for thread-based memory access in a multithreaded processor
    4.
    Type: granted patent
    Status: in force

    Publication No.: US06925643B2

    Publication date: 2005-08-02

    Application No.: US10269247

    Filing date: 2002-10-11

    Abstract: Techniques for thread-based memory access by a multithreaded processor are disclosed. The multithreaded processor determines a thread identifier associated with a particular processor thread, and utilizes at least a portion of the thread identifier to select a particular portion of an associated memory to be accessed by the corresponding processor thread. In an illustrative embodiment, a first portion of the thread identifier is utilized to select one of a plurality of multiple-bank memory elements within the memory, and a second portion of the thread identifier is utilized to select one of a plurality of memory banks within the selected one of the multiple-bank memory elements. The first portion may comprise one or more most significant bits of the thread identifier, while the second portion comprises one or more least significant bits of the thread identifier. Advantageously, the invention reduces memory access times and power consumption, while preventing the stalling of any processor threads.
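
The two-level selection described in the abstract amounts to splitting the thread identifier into bit fields. The sketch below assumes a 3-bit identifier with 1 bit selecting the memory element and 2 bits selecting the bank. The function name `select_bank` and these widths are illustrative choices, not values taken from the patent.

```python
def select_bank(thread_id, element_bits=1, bank_bits=2):
    """Map a thread identifier to (memory element, bank within element).

    The most significant bits choose the multiple-bank memory element;
    the least significant bits choose the bank inside that element.
    """
    total_bits = element_bits + bank_bits
    tid = thread_id & ((1 << total_bits) - 1)  # keep only the bits in use
    element = tid >> bank_bits
    bank = tid & ((1 << bank_bits) - 1)
    return element, bank


# Eight threads land in eight distinct (element, bank) pairs.
mapping = {tid: select_bank(tid) for tid in range(8)}
print(mapping)
```

Because every thread maps to its own bank under these assumptions, no two threads contend for the same bank, which is consistent with the abstract's claim that no thread stalls on memory access.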

    MULTITHREADED PROCESSOR WITH MULTIPLE CONCURRENT PIPELINES PER THREAD
    5.
    Type: patent application
    Status: in force

    Publication No.: US20120096243A1

    Publication date: 2012-04-19

    Application No.: US13282800

    Filing date: 2011-10-27

    Abstract: A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.

    Processor having parallel vector multiply and reduce operations with sequential semantics
    6.
    Type: granted patent
    Status: in force

    Publication No.: US07797363B2

    Publication date: 2010-09-14

    Application No.: US11096921

    Filing date: 2005-04-01

    IPC classification: G06F15/00

    Abstract: A processor comprises a plurality of arithmetic units, an accumulator unit, and a reduction unit coupled between the plurality of arithmetic units and the accumulator unit. The reduction unit receives products of vector elements from the arithmetic units and a first accumulator value from the accumulator unit, and processes the products and the first accumulator value to generate a second accumulator value for delivery to the accumulator unit. The processor implements a plurality of vector multiply and reduce operations having guaranteed sequential semantics, that is, operations which guarantee that the computational result will be the same as that which would be produced using a corresponding sequence of individual instructions.
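
Why sequential semantics need a guarantee at all can be shown with saturating arithmetic, where the order of accumulation changes the result. The sketch below is an illustration under assumptions: the abstract does not mention saturation, the 16-bit width is arbitrary, and the names `saturate`, `sequential_reference`, and `naive_parallel` are hypothetical.

```python
def saturate(value, bits=16):
    """Clamp value to the signed range representable in the given width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def sequential_reference(acc, a, b, bits=16):
    """Multiply-accumulate one element at a time, saturating every step.

    This models the 'corresponding sequence of individual instructions'.
    """
    for x, y in zip(a, b):
        acc = saturate(acc + x * y, bits)
    return acc


def naive_parallel(acc, a, b, bits=16):
    """Sum all products at full precision, then saturate once at the end."""
    return saturate(acc + sum(x * y for x, y in zip(a, b)), bits)


a, b = [200, 200, -200], [100, 100, 100]
print(sequential_reference(0, a, b))  # saturates mid-way at 32767
print(naive_parallel(0, a, b))        # never saturates
```

Here the sequential reference yields 12767 while the naive tree-style sum yields 20000. A reduction unit with guaranteed sequential semantics would have to reproduce the first value even though it receives all the products in parallel.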

    Multithreaded processor with multiple concurrent pipelines per thread
    7.
    Type: granted patent
    Status: in force

    Publication No.: US08959315B2

    Publication date: 2015-02-17

    Application No.: US12579867

    Filing date: 2009-10-15

    IPC classification: G06F9/38 G06F9/30

    Abstract: A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.

    Method and apparatus for multithreaded cache with simplified implementation of cache replacement policy
    10.
    Type: granted patent
    Status: in force

    Publication No.: US06912623B2

    Publication date: 2005-06-28

    Application No.: US10161874

    Filing date: 2002-06-04

    IPC classification: G06F12/08 G06F12/12 G06F12/00

    Abstract: A cache memory for use in a multithreaded processor includes a number of set-associative thread caches, with one or more of the thread caches each implementing an eviction process based on access request address that reduces the amount of replacement policy storage required in the cache memory. At least a given one of the thread caches in an illustrative embodiment includes a memory array having multiple sets of memory locations, and a directory for storing tags each corresponding to at least a portion of a particular address of one of the memory locations. The directory has multiple entries each storing multiple ones of the tags, such that if there are n sets of memory locations in the memory array, there are n tags associated with each directory entry. The directory is utilized in implementing a set-associative address mapping between access requests and memory locations of the memory array. An entry in a particular one of the memory locations is selected for eviction from the given thread cache in conjunction with a cache miss event, based at least in part on at least a portion of an address in an access request associated with the cache miss event.
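
Address-based victim selection can be sketched as a pure function of the missing address. The version below is a guess at one simple realization, not the patented circuit: `victim_slot` is a hypothetical name, the number of tag slots per directory entry is assumed to be a power of two, and the victim is taken from the tag bits just above an assumed index field.

```python
def victim_slot(address, num_slots=4, index_bits=2):
    """Pick the slot to evict from bits of the missing address.

    The low index_bits of the address select the directory entry; the
    next log2(num_slots) bits select the victim slot in that entry.
    """
    if num_slots <= 0 or num_slots & (num_slots - 1):
        raise ValueError("num_slots must be a power of two")
    tag = address >> index_bits
    return tag & (num_slots - 1)


print(victim_slot(0b10110))  # tag is 0b101, so slot 1
print(victim_slot(0b1100))   # tag is 0b11, so slot 3
```

Since the victim follows directly from the request address, the cache needs no stored replacement state for this step, matching the abstract's stated goal of reducing replacement policy storage.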
