HIERARCHICAL SHARED SEMAPHORE REGISTERS
    1.
    Patent application
    HIERARCHICAL SHARED SEMAPHORE REGISTERS (under examination, published)

    Publication number: US20100115236A1

    Publication date: 2010-05-06

    Application number: US12263305

    Filing date: 2008-10-31

    IPC classification: G06F15/76 G06F9/06

    CPC classification: G06F9/30101 G06F9/52

    Abstract: A multiprocessor computer system having a plurality of processing elements comprises one or more core-level hierarchical shared semaphore registers, wherein each core-level hierarchical shared semaphore register is coupled to a different processor core. Each hierarchical shared semaphore register is writable to each of a plurality of streams executing on the coupled processor core. One or more chip-level hierarchical shared semaphore registers are also coupled to a plurality of processor cores, each chip-level hierarchical shared semaphore register writable to each of the plurality of processor cores.
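
The two register levels in the abstract can be pictured with a small sequential model. This is only a sketch: the arrival-counting use shown here, the class name `HierarchicalSemaphores`, and its methods are assumptions made for illustration, and real hardware would update the registers atomically rather than in plain Python.

```python
class HierarchicalSemaphores:
    """Sequential model of one chip: a core-level register per core
    plus a single chip-level register (hypothetical layout)."""

    def __init__(self, num_cores, streams_per_core):
        self.num_cores = num_cores
        self.streams_per_core = streams_per_core
        self.core_level = [0] * num_cores  # each register is coupled to one core
        self.chip_level = 0                # writable by every core on the chip

    def stream_arrive(self, core, stream):
        """A stream writes its own core's register; the last stream to
        arrive on a core also writes the chip-level register.
        Assumes each stream arrives exactly once."""
        if not 0 <= core < self.num_cores:
            raise IndexError("no such core")
        if not 0 <= stream < self.streams_per_core:
            raise IndexError("no such stream on this core")
        self.core_level[core] += 1
        if self.core_level[core] == self.streams_per_core:
            self.chip_level += 1

    def all_arrived(self):
        """True once every core has signalled the chip-level register."""
        return self.chip_level == self.num_cores
```

In this model most writes touch only the core-level register of the writing stream's core, and the chip-level register receives one write per core.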

    Decoupled scalar/vector computer architecture system and method
    5.
    Granted patent
    Decoupled scalar/vector computer architecture system and method (in force)

    Publication number: US07334110B1

    Publication date: 2008-02-19

    Application number: US10643586

    Filing date: 2003-08-18

    IPC classification: G06F9/38

    Abstract: In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a system and method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising: sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands; reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit; predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed; dispatching the predispatched vector instruction if all required operands are ready; and executing the dispatched vector instruction as a function of the scalar operand.
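
The stages named in the abstract can be traced with a tiny state model. This is a sketch under stated assumptions: the `VectorInstruction` fields, the state names, and the helper functions are hypothetical, and execution itself is not modelled.

```python
from dataclasses import dataclass


@dataclass
class VectorInstruction:
    name: str
    is_memory: bool = False         # vector memory instruction?
    needs_scalar: bool = False      # requires a scalar operand?
    scalar_operand: object = None
    scalar_committed: bool = False
    operands_ready: bool = False
    marked_complete: bool = False   # completion as seen by the scalar unit
    state: str = "new"


def send(instr, dispatch_queue):
    """Scalar unit hands the instruction to the vector dispatch unit."""
    if not instr.is_memory and not instr.needs_scalar:
        instr.marked_complete = True  # marked complete at send time
    instr.state = "sent"
    dispatch_queue.append(instr)


def read_scalar(instr, value):
    """Transfer a scalar operand to the vector dispatch unit."""
    instr.scalar_operand = value


def advance(instr):
    """Move the instruction one stage forward if its condition holds."""
    if instr.state == "sent" and instr.scalar_committed:
        instr.state = "predispatched"
    elif instr.state == "predispatched" and instr.operands_ready:
        instr.state = "dispatched"
    return instr.state
```

An instruction that is neither a memory instruction nor dependent on scalar operands is marked complete as soon as it is sent, which is the point at which the scalar side stops tracking it in this model.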

    MULTIPROCESSOR COMPUTER CACHE COHERENCE PROTOCOL
    6.
    Patent application
    MULTIPROCESSOR COMPUTER CACHE COHERENCE PROTOCOL (under examination, published)

    Publication number: US20100318741A1

    Publication date: 2010-12-16

    Application number: US12483915

    Filing date: 2009-06-12

    IPC classification: G06F12/08 G06F12/00

    Abstract: A multiprocessor computer system comprises a processing node having a plurality of processors and a local memory shared among processors in the node. An L1 data cache is local to each of the plurality of processors, and an L2 cache is local to each of the plurality of processors. An L3 cache is local to the node but shared among the plurality of processors, and the L3 cache is a subset of data stored in the local memory. The L2 caches are subsets of the L3 cache, and the L1 caches are a subset of the L2 caches in the respective processors.
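
The subset relationships in the abstract amount to an inclusion invariant, which can be checked with sets of cache-line addresses. This is a minimal sketch: the `Node` class and its methods are hypothetical, and evicting from the inner caches before the outer one is an assumed way of keeping the invariant, not something the abstract specifies.

```python
class Node:
    """One processing node: local memory, a shared L3, and per-processor L2/L1."""

    def __init__(self, num_processors, memory_lines):
        self.memory = set(memory_lines)
        self.l3 = set()
        self.l2 = [set() for _ in range(num_processors)]
        self.l1 = [set() for _ in range(num_processors)]

    def fill(self, proc, line):
        """Bring a line into L3, then into the requesting processor's L2 and L1."""
        if line not in self.memory:
            raise KeyError("line is not in this node's local memory")
        self.l3.add(line)
        self.l2[proc].add(line)
        self.l1[proc].add(line)

    def evict_l2(self, proc, line):
        """Remove a line from one processor's L2, and from its L1 first."""
        self.l1[proc].discard(line)
        self.l2[proc].discard(line)

    def evict_l3(self, line):
        """Remove a line from L3 after removing it from every L2 and L1."""
        for proc in range(len(self.l2)):
            self.evict_l2(proc, line)
        self.l3.discard(line)

    def inclusion_holds(self):
        """L1 <= L2 per processor, every L2 <= L3, and L3 <= local memory."""
        return (self.l3 <= self.memory
                and all(l2 <= self.l3 for l2 in self.l2)
                and all(l1 <= l2 for l1, l2 in zip(self.l1, self.l2)))
```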

    System and method for processing memory instructions using a forced order queue
    7.
    Granted patent
    System and method for processing memory instructions using a forced order queue (in force)

    Publication number: US07519771B1

    Publication date: 2009-04-14

    Application number: US10643577

    Filing date: 2003-08-18

    IPC classification: G06F12/00

    Abstract: A novel system and method for processing memory instructions. One embodiment of the invention provides a method for processing a memory instruction. In this embodiment, the method includes obtaining a memory request; storing the memory request in an Initial Request Queue (IRQ); and processing the memory request from the IRQ by a cache controller, wherein processing includes: identifying a type of the memory request, and processing the memory request in both a local cache and a Force Order Queue (FOQ), wherein processing includes determining if a portion of an address associated with the memory request matches one or more partial addresses in the FOQ and, if the memory request misses in the cache and the address does not match one or more partial addresses in the FOQ, adding the memory request to the FOQ and allocating a cache line in the local cache corresponding to the local cache miss.
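
The one path the abstract spells out (a cache miss with no partial-address match in the FOQ) can be sketched as follows. The bit widths, the `CacheController` class, and the returned flags are assumptions for illustration, and the abstract does not say what happens on a hit or on an FOQ match, so the sketch only reports those cases.

```python
from collections import deque

LINE_BITS = 6      # hypothetical: 64-byte cache lines
PARTIAL_BITS = 12  # hypothetical: width of the partial address compared in the FOQ


def partial_address(addr):
    """Low PARTIAL_BITS bits of the cache-line address."""
    return (addr >> LINE_BITS) & ((1 << PARTIAL_BITS) - 1)


class CacheController:
    def __init__(self):
        self.irq = deque()        # Initial Request Queue
        self.foq = deque()        # Force Order Queue
        self.cache_lines = set()  # line addresses allocated in the local cache

    def submit(self, kind, addr):
        """Store an incoming memory request in the IRQ."""
        self.irq.append((kind, addr))

    def process_next(self):
        """Process the oldest IRQ request; returns (hit, foq_match, added)."""
        kind, addr = self.irq.popleft()
        line = addr >> LINE_BITS
        hit = line in self.cache_lines
        foq_match = any(partial_address(a) == partial_address(addr)
                        for _, a in self.foq)
        added = False
        if not hit and not foq_match:
            self.foq.append((kind, addr))  # add the request to the FOQ
            self.cache_lines.add(line)     # allocate a line for the miss
            added = True
        return hit, foq_match, added
```

Because only part of the address is compared, two different lines can match in the FOQ; the last assertion-style case below (an address 2^18 bytes away) shows such a false match in this model.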

    LARGE INTEGER SUPPORT IN VECTOR OPERATIONS
    8.
    Patent application
    LARGE INTEGER SUPPORT IN VECTOR OPERATIONS (under examination, published)

    Publication number: US20100115232A1

    Publication date: 2010-05-06

    Application number: US12263313

    Filing date: 2008-10-31

    IPC classification: G06F9/302 G06F15/76

    Abstract: A vector processor or vector processing computer has a first vector register operable to store two or more vector elements that together comprise a single first large integer and a second vector register operable to store two or more vector elements that together comprise a single second large integer. An adder having a carry-in bit is operable to add the large integer in the first vector register to the large integer in the second vector register by using the carry-in bit to add sequential elements of the vector registers.
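
The carry-chained addition described in the abstract can be illustrated in software. This is a sketch only: the 64-bit element width, the least-significant-first element order, and the function names are assumptions not taken from the abstract.

```python
ELEMENT_BITS = 64                 # assumed vector element width
MASK = (1 << ELEMENT_BITS) - 1


def to_elements(value, n):
    """Split a non-negative integer into n elements, least significant first."""
    return [(value >> (ELEMENT_BITS * i)) & MASK for i in range(n)]


def from_elements(elements):
    """Reassemble a large integer from its elements."""
    return sum(e << (ELEMENT_BITS * i) for i, e in enumerate(elements))


def add_large(va, vb, carry_in=0):
    """Add two large integers element by element, feeding each element's
    carry-out into the next element's carry-in. Returns (elements, carry_out)."""
    result = []
    carry = carry_in
    for a, b in zip(va, vb):
        total = a + b + carry
        result.append(total & MASK)    # keep the low ELEMENT_BITS bits
        carry = total >> ELEMENT_BITS  # 0 or 1
    return result, carry
```

For example, adding 1 to a two-element value whose low element is all ones clears the low element and carries into the high one: `add_large([MASK, 0], [1, 0])` returns `([0, 1], 0)`.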

    Vector and scalar data cache for a vector multiprocessor

    Publication number: US06665774B2

    Publication date: 2003-12-16

    Application number: US09981380

    Filing date: 2001-10-16

    IPC classification: G06F12/00

    Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses, wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operates on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a) an address in the cache array at which a read-data value will be placed when the read-data value is returned from the memory, and (b) a destination code that specifies which of the scalar registers and vector registers the read-data value is to be loaded into when the read-data value is returned from the memory.
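
The FIFO's bookkeeping can be pictured as a queue of (cache index, destination code) pairs consumed as read data returns. This is a sketch under assumptions: the class and method names are hypothetical, the destination code is modelled as a ("S" or "V", register number) pair, and memory is assumed to return data in the order the reads were issued.

```python
from collections import deque


class ScalarVectorCache:
    def __init__(self, num_lines):
        self.cache_array = [None] * num_lines
        self.fifo = deque()    # pending (cache_index, destination_code) entries
        self.scalar_regs = {}  # register number -> value
        self.vector_regs = {}  # register number -> list of element values

    def issue_read(self, cache_index, dest_kind, dest_reg):
        """Record where the returning data goes: a cache slot and a register."""
        if dest_kind not in ("S", "V"):
            raise ValueError("destination must be 'S' (scalar) or 'V' (vector)")
        self.fifo.append((cache_index, (dest_kind, dest_reg)))

    def data_returned(self, value):
        """Place returned read data using the oldest FIFO entry."""
        cache_index, (kind, reg) = self.fifo.popleft()
        self.cache_array[cache_index] = value
        if kind == "S":
            self.scalar_regs[reg] = value
        else:
            self.vector_regs.setdefault(reg, []).append(value)
```

Keeping both pieces of information in one FIFO entry means the returning data needs no tag of its own in this model; its position in the return order identifies it.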

    Vector and scalar data cache for a vector multiprocessor

    Publication number: US06496902B1

    Publication date: 2002-12-17

    Application number: US09223853

    Filing date: 1998-12-31

    IPC classification: G06F12/00

    Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses, wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operates on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a) an address in the cache array at which a read-data value will be placed when the read-data value is returned from the memory, and (b) a destination code that specifies which of the scalar registers and vector registers the read-data value is to be loaded into when the read-data value is returned from the memory.