1. OPERAND DATA STRUCTURE
    Invention application
    Status: Under examination (published)

    Publication number: WO2010069640A1

    Publication date: 2010-06-24

    Application number: PCT/EP2009/063720

    Filing date: 2009-10-20

    CPC classification number: G06F8/4441 G06F8/447

    Abstract: In response to receiving pre-processed code, a compiler identifies a code section that is not a candidate for acceleration and a code block that is a candidate for acceleration. The code block specifies an iterated operation having a first operand and a second operand, where each of multiple first operands and each of multiple second operands for the iterated operation has a defined addressing relationship. In response to the identifying, the compiler generates post-processed code containing lower level instruction(s) corresponding to the identified code section and creates and outputs an operand data structure separate from the post-processed code. The operand data structure specifies the defined addressing relationship for the multiple first operands and for the multiple second operands. The compiler places a block computation command in the post-processed code that invokes processing of the operand data structure to compute operand addresses.
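
The split described in this abstract can be pictured with a minimal sketch. Everything named here is hypothetical: the statement dictionaries, the `OperandEntry` layout (base, stride, count), and the `LOWERED` / `BLOCK_COMPUTE` mnemonics are assumptions made for illustration, not the patent's actual encoding. The sketch only shows the idea of lowering ordinary code while moving each operand's addressing relationship into a table kept apart from the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperandEntry:
    """Addressing relationship for one operand stream (assumed layout)."""
    base: int    # address of the first element
    stride: int  # byte distance between successive elements
    count: int   # number of iterations of the operation


def compile_unit(statements):
    """Return (post-processed code, operand tables) for a list of statements."""
    code, tables = [], []
    for stmt in statements:
        if stmt["kind"] == "loop":
            # Candidate for acceleration: describe operands in a separate table.
            entries = [OperandEntry(base, stride, stmt["count"])
                       for base, stride in stmt["operands"]]
            tables.append(entries)
            # The block computation command refers to the table by index.
            code.append(f"BLOCK_COMPUTE {stmt['op']} ods={len(tables) - 1}")
        else:
            # Not a candidate: emit a placeholder lower-level instruction.
            code.append(f"LOWERED {stmt['text']}")
    return code, tables


code, tables = compile_unit([
    {"kind": "scalar", "text": "n = 4"},
    {"kind": "loop", "op": "add", "count": 4,
     "operands": [(0x1000, 8), (0x2000, 8)]},
])
```

Here `code` holds the instruction stream and `tables` holds the operand descriptions, so the two can be output and consumed separately.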

2. OPERAND ADDRESS GENERATION
    Invention application
    Status: Under examination (published)

    Publication number: WO2010069638A1

    Publication date: 2010-06-24

    Application number: PCT/EP2009/063718

    Filing date: 2009-10-20

    CPC classification number: G06F9/345

    Abstract: A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, and an instruction sequencing unit that fetches instructions for execution by the execution unit. The processor further includes an operand data structure and an address generation accelerator. The operand data structure specifies a first relationship between addresses of sequential accesses within a first address region and a second relationship between addresses of sequential accesses within a second address region. The address generation accelerator computes a first address of a first memory access in the first address region by reference to the first relationship and a second address of a second memory access in the second address region by reference to the second relationship.
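
As a rough illustration of address generation from per-region relationships, the sketch below assumes that each relationship is a constant stride from a base address. The abstract does not say what form the relationship takes, so `Relationship`, `AddressGenerator`, and `next_address` are hypothetical names and the constant-stride model is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Assumed relationship between sequential accesses in one region."""
    base: int    # first address in the region
    stride: int  # byte distance between sequential accesses


class AddressGenerator:
    """Software stand-in for an address generation accelerator."""

    def __init__(self, relationships):
        self._relationships = relationships
        self._next_index = [0] * len(relationships)

    def next_address(self, region):
        """Compute the next access address for the given region."""
        rel = self._relationships[region]
        i = self._next_index[region]
        self._next_index[region] = i + 1
        return rel.base + i * rel.stride


agen = AddressGenerator([Relationship(0x1000, 4), Relationship(0x8000, 16)])
first = [agen.next_address(0) for _ in range(3)]
second = [agen.next_address(1) for _ in range(3)]
```

Each region advances independently, so interleaving calls for region 0 and region 1 would produce the same two sequences.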

3. ACCOUNTING METHOD AND LOGIC FOR DETERMINING PER-THREAD PROCESSOR RESOURCE UTILIZATION IN A SIMULTANEOUS MULTI-THREADED (SMT) PROCESSOR
    Invention application
    Status: Under examination (published)

    Publication number: WO2004095282A1

    Publication date: 2004-11-04

    Application number: PCT/GB2004/001586

    Filing date: 2004-04-14

    Abstract: An accounting method and logic for determining per-thread processor resource utilization in a simultaneous multi-threaded (SMT) processor provides a mechanism for accounting for processor resource usage by programs and threads within programs. Relative resource use is determined by detecting instruction dispatches for multiple threads active within the processor, which may include idle threads that are still occupying processor resources. If instructions are dispatched for all threads or for no threads, the processor cycle is accounted equally to all threads. Alternatively, if no threads are in a dispatch state, the accounting may be made using a prior state, or in conformity with ratios of the threads' priority levels. If only one thread is dispatching, that thread is accounted the entire processor cycle. If multiple threads are dispatching, but fewer than all threads are dispatching (in processors supporting more than two threads), the processor cycle is billed evenly across the dispatching threads. Multiple dispatches may be detected for the threads and a fractional resource usage determined for each thread, and the counters may be updated in accordance with their fractional usage.
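
The basic per-cycle rules in this abstract can be written as a short sketch. The function name `account_cycle` and the use of a boolean list for dispatch state are assumptions made for illustration. The sketch covers only the equal-split rules; the alternatives mentioned in the abstract (prior state, priority ratios, and fractional usage based on multiple dispatches) are not modeled.

```python
from fractions import Fraction


def account_cycle(counters, dispatching):
    """Bill one processor cycle to per-thread counters.

    dispatching[t] is True if thread t dispatched this cycle.
    """
    n = len(counters)
    billed = [t for t in range(n) if dispatching[t]]
    if not billed:
        # No thread dispatched: account the cycle equally to all threads.
        billed = list(range(n))
    # One thread gets the whole cycle; several (or all) split it evenly.
    share = Fraction(1, len(billed))
    for t in billed:
        counters[t] += share


counters = [Fraction(0)] * 4
cycles = (
    [True, True, True, True],      # all threads dispatch
    [False, False, False, False],  # no thread dispatches
    [True, False, False, False],   # only thread 0 dispatches
    [True, True, False, False],    # threads 0 and 1 dispatch
)
for cycle in cycles:
    account_cycle(counters, cycle)
```

Using exact fractions keeps the invariant that the counters always sum to the number of cycles accounted.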

4. HARDWARE ASSIST THREAD
    Invention application
    Status: Under examination (published)

    Publication number: WO2011141337A1

    Publication date: 2011-11-17

    Application number: PCT/EP2011/057106

    Filing date: 2011-05-04

    Abstract: Mechanisms are provided for offloading a workload from a main thread to an assist thread. The mechanisms receive, in a fetch unit of a processor of a data processing system, a branch-to-assist-thread instruction of a main thread. The branch-to-assist-thread instruction instructs the processor hardware to look for an already spawned idle thread to be used as an assist thread. Hardware-implemented pervasive thread control logic determines whether one or more already spawned idle threads are available for use as an assist thread. If so, the pervasive thread control logic selects an idle thread from among them, thereby providing the assist thread. In addition, the pervasive thread control logic offloads a portion of the workload of the main thread to the assist thread.
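
The selection step can be mimicked in software with a small sketch. The names `HwThread` and `branch_to_assist_thread` are hypothetical, and returning `None` when no idle thread exists is an assumption, since the abstract does not say what happens in that case. In the patent this logic is implemented in hardware; the code below is only a model of the decision.

```python
from dataclasses import dataclass, field


@dataclass
class HwThread:
    """Model of an already spawned hardware thread."""
    tid: int
    idle: bool = True
    work: list = field(default_factory=list)


def branch_to_assist_thread(threads, main_tid, workload):
    """Pick an idle thread as assist thread and offload work to it.

    Returns the assist thread id, or None if no idle thread is available
    (assumed behavior; not specified in the abstract).
    """
    for t in threads:
        if t.tid != main_tid and t.idle:
            t.idle = False           # the thread is now busy assisting
            t.work.extend(workload)  # offload part of the main thread's work
            return t.tid
    return None


threads = [HwThread(0, idle=False), HwThread(1, idle=False), HwThread(2)]
assist = branch_to_assist_thread(threads, main_tid=0, workload=["prefetch"])
retry = branch_to_assist_thread(threads, main_tid=0, workload=["prefetch"])
```

The second call finds no idle thread left, so the caller would have to keep the work on the main thread.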

5. OPERAND CACHING POLICY
    Invention application
    Status: Under examination (published)

    Publication number: WO2010069639A1

    Publication date: 2010-06-24

    Application number: PCT/EP2009/063719

    Filing date: 2009-10-20

    CPC classification number: G06F9/383 G06F2212/6028

    Abstract: A processor has an associated memory hierarchy including a cache memory. The processor includes an instruction sequencing unit that fetches instructions for processing, an operand data structure including a plurality of entries corresponding to operands of operations to be performed by the processor, and a computation engine. A first entry among the plurality of entries in the operand data structure specifies a first caching policy for a first operand, and a second entry specifies a second caching policy for a second operand. The computation engine computes and stores operands in the memory hierarchy in accordance with the caching policies indicated within the operand data structure.
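
A minimal sketch of per-operand caching policies follows. The abstract does not name any specific policy, so the two policies here (`CACHE` and `BYPASS`), the `OperandEntry` fields, and the dictionary-based cache and memory are all assumptions chosen to show how a store could consult the entry for its operand.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """Hypothetical caching policies; the abstract does not enumerate any."""
    CACHE = "cache"    # keep a copy of the stored value in the cache
    BYPASS = "bypass"  # write to memory only


@dataclass(frozen=True)
class OperandEntry:
    name: str
    policy: Policy


def store(entry, address, value, cache, memory):
    """Store an operand according to the policy in its table entry."""
    memory[address] = value
    if entry.policy is Policy.CACHE:
        cache[address] = value


cache, memory = {}, {}
store(OperandEntry("A", Policy.CACHE), 0x100, 7, cache, memory)
store(OperandEntry("B", Policy.BYPASS), 0x200, 9, cache, memory)
```

After both stores, memory holds both values while the cache holds only the operand whose entry requested caching.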

6. OPERATION DATA STRUCTURE
    Invention application
    Status: Under examination (published)

    Publication number: WO2010069637A1

    Publication date: 2010-06-24

    Application number: PCT/EP2009/063717

    Filing date: 2009-10-20

    CPC classification number: G06F8/4441

    Abstract: A compiler is designed, in response to receiving pre-processed code, to identify a code section that is not a candidate for acceleration and to identify a code block specifying an iterated operation that is a candidate for acceleration. In response to identifying the code section, the compiler generates post-processed code containing one or more lower level instructions corresponding to the identified code section, and in response to identifying the code block, the compiler creates and outputs an operation data structure, separate from the post-processed code, that identifies the iterated operation. The compiler places a block computation command in the post-processed code that invokes processing of the operation data structure to perform the iterated operation, and outputs the post-processed code.
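
To show what processing an operation data structure might look like when a block computation command is executed, here is a minimal sketch. `OperationEntry`, the `OPS` table, and `block_compute` are hypothetical, and the entry layout (operation name plus iteration count) is an assumption; the abstract states only that the structure identifies the iterated operation.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationEntry:
    """Assumed description of one iterated operation."""
    op: str     # which operation to apply on each iteration
    count: int  # how many iterations to perform


# Operations the hypothetical engine knows how to perform.
OPS = {"add": operator.add, "mul": operator.mul}


def block_compute(entry, a, b):
    """Perform the iterated operation described by the entry."""
    fn = OPS[entry.op]
    return [fn(a[i], b[i]) for i in range(entry.count)]


result = block_compute(OperationEntry("mul", 3), [1, 2, 3], [4, 5, 6])
```

The loop itself never appears in the instruction stream; a single command plus the table entry is enough to drive all iterations.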
