Floating point collect and operate
    2.
    Type: granted invention patent
    Status: in force

    Publication number: US08595467B2

    Publication date: 2013-11-26

    Application number: US12648527

    Filing date: 2009-12-29

    IPC classification: G06F9/00

    Abstract: Mechanisms are provided for performing a floating point collect and operate for a summation across a vector for a dot product operation. A routing network placed before the single instruction multiple data (SIMD) unit allows the SIMD unit to perform a summation across a vector with a single stage of adders. The routing network routes the vector elements to the adders in a first cycle. The SIMD unit stores the results of the adders into a results vector register. The routing network routes the summation results from the results vector register to the adders in a second cycle. The SIMD unit then stores the results from the second cycle in the results vector register.
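
    The two-cycle summation described in the abstract can be sketched in software. This is a minimal illustrative model, not the patented hardware: the four-lane width, the routing tables, and the names `adder_stage`, `sum_across`, and `dot_product` are assumptions introduced here.

```python
from typing import List, Sequence, Tuple

# Hypothetical routing table: for each adder, the indices of its two inputs.
Route = Sequence[Tuple[int, int]]


def adder_stage(source: Sequence[float], route: Route) -> List[float]:
    """Model one cycle: the routing network feeds two inputs to each adder."""
    return [source[a] + source[b] for a, b in route]


def sum_across(products: Sequence[float]) -> float:
    """Sum a 4-element vector by reusing a single stage of adders twice."""
    assert len(products) == 4, "this sketch assumes a 4-lane vector"
    # Cycle 1: route the vector elements pairwise to the adders.
    cycle1_route = [(0, 1), (2, 3), (0, 1), (2, 3)]
    results_register = adder_stage(products, cycle1_route)
    # Cycle 2: route the partial sums from the results register back
    # to the same adders, then store the final sum in the register.
    cycle2_route = [(0, 1), (0, 1), (0, 1), (0, 1)]
    results_register = adder_stage(results_register, cycle2_route)
    return results_register[0]


def dot_product(x: Sequence[float], y: Sequence[float]) -> float:
    """Element-wise multiply, then sum across the product vector."""
    return sum_across([a * b for a, b in zip(x, y)])
```

    The point of the model is that the second cycle reuses the same adders on the stored partial sums, so no second adder stage is needed.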

    Reducing power requirements of a multiple core processor
    3.
    Type: granted invention patent
    Status: lapsed

    Publication number: US08381006B2

    Publication date: 2013-02-19

    Application number: US12756570

    Filing date: 2010-04-08

    IPC classification: G06F1/00

    Abstract: A mechanism is provided for reducing power consumed by a multi-core processor. Responsive to a number of properly functioning processor cores being more than a required number of processor cores in a multi-core processor, a power consumption measurement module determines a number of the properly functioning processor cores to disable. The power consumption measurement module initiates an equal amount of workload to be processed by each of the properly functioning processor cores. The power consumption measurement module determines the power consumed by each of the properly functioning processor cores. The power consumption measurement module then deactivates the properly functioning processor cores that consume the most power, so that the number of cores deactivated equals the number of properly functioning processor cores to disable.
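
    The selection step in the abstract can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `cores_to_disable`, the dictionary of per-core power readings, and the example wattages are hypothetical, and the readings are assumed to have been taken while every core processed an equal workload.

```python
from typing import Dict, List


def cores_to_disable(power_by_core: Dict[int, float], required: int) -> List[int]:
    """Pick the surplus cores that drew the most power under equal workload.

    power_by_core maps each properly functioning core ID to its measured
    power (hypothetical units); required is the number of cores needed.
    """
    surplus = len(power_by_core) - required
    if surplus <= 0:
        return []  # no more working cores than required: disable nothing
    # Rank cores from highest to lowest measured power.
    ranked = sorted(power_by_core, key=lambda core: power_by_core[core], reverse=True)
    return sorted(ranked[:surplus])
```

    For example, with readings `{0: 4.1, 1: 5.3, 2: 3.9, 3: 4.8}` and two required cores, the sketch selects cores 1 and 3, the two highest consumers.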

    Runtime Extraction of Data Parallelism
    4.
    Type: invention patent application
    Status: in force

    Publication number: US20110161643A1

    Publication date: 2011-06-30

    Application number: US12649860

    Filing date: 2009-12-30

    IPC classification: G06F9/32

    Abstract: Mechanisms for extracting data dependencies during runtime are provided. The mechanisms execute a portion of code having a loop and generate, for the loop, a first parallel execution group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The mechanisms further execute the first parallel execution group and determine, for each iteration in the subset of iterations, whether the iteration has a data dependence. Moreover, the mechanisms commit store data to system memory only for stores performed by iterations in the subset of iterations for which no data dependence is determined. Store data of stores performed by iterations in the subset of iterations for which a data dependence is determined is not committed to the system memory.
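
    The selective commit described in the abstract can be modeled in a few lines. This is a simplified sketch, not the claimed mechanism: the names `run_parallel_group`, `example_body`, and `SRC` are hypothetical, iterations run one after another against a shared snapshot to stand in for parallel execution, and only read-after-write dependences on earlier iterations of the same group are checked.

```python
from typing import Callable, Dict, List, Set, Tuple

Memory = Dict[int, int]
Load = Callable[[int], int]
Store = Callable[[int, int], None]
Body = Callable[[int, Load, Store], None]


def run_parallel_group(memory: Memory, body: Body,
                       group: List[int]) -> Tuple[List[int], List[int]]:
    """Run one group speculatively; commit only dependence-free iterations."""
    snapshot = dict(memory)            # every iteration sees pre-group memory
    reads: Dict[int, Set[int]] = {}
    stores: Dict[int, Dict[int, int]] = {}
    for i in group:
        r: Set[int] = set()
        s: Dict[int, int] = {}

        def load(addr: int, r: Set[int] = r, s: Dict[int, int] = s) -> int:
            if addr in s:              # forward the iteration's own store
                return s[addr]
            r.add(addr)
            return snapshot.get(addr, 0)

        def store(addr: int, value: int, s: Dict[int, int] = s) -> None:
            s[addr] = value            # buffered, not yet in memory

        body(i, load, store)
        reads[i], stores[i] = r, s

    committed: List[int] = []
    deferred: List[int] = []
    earlier_writes: Set[int] = set()
    for i in group:
        if reads[i] & earlier_writes:  # read an address an earlier iteration wrote
            deferred.append(i)         # store data is discarded, not committed
        else:
            memory.update(stores[i])
            committed.append(i)
        earlier_writes.update(stores[i])
    return committed, deferred


# Hypothetical loop body: a[i] = a[SRC[i]] + 1, with addresses as plain ints.
SRC = [10, 11, 0, 13]


def example_body(i: int, load: Load, store: Store) -> None:
    store(i, load(SRC[i]) + 1)
```

    In the example, iteration 2 reads address 0, which iteration 0 writes, so its store is withheld and the iteration is returned in the deferred list for later re-execution.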

    Multithreaded Programmable Direct Memory Access Engine
    5.
    Type: invention patent application
    Status: in force

    Publication number: US20100161846A1

    Publication date: 2010-06-24

    Application number: US12342501

    Filing date: 2008-12-23

    IPC classification: G06F3/00

    Abstract: A mechanism for programming a direct memory access engine operating as a multithreaded processor is provided. A plurality of programs is received from a host processor in a local memory associated with the direct memory access engine. A request is received in the direct memory access engine from the host processor indicating that the plurality of programs located in the local memory is to be executed. The direct memory access engine executes two or more of the plurality of programs without intervention by the host processor. As each of the two or more of the plurality of programs completes execution, the direct memory access engine sends a completion notification to the host processor that indicates that the program has completed execution.
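
    The host/engine interaction in the abstract can be sketched with ordinary threads. This is a minimal software model, not the hardware: the class `DmaEngineModel`, its methods, and the helper `make_copy` are hypothetical names, and a Python callable stands in for a DMA program.

```python
import queue
import threading
from typing import Callable, Dict, List

Program = Callable[[], None]


class DmaEngineModel:
    """Toy model of a DMA engine that runs several programs concurrently."""

    def __init__(self) -> None:
        self.local_memory: Dict[str, Program] = {}
        self.completions: "queue.Queue[str]" = queue.Queue()

    def load_programs(self, programs: Dict[str, Program]) -> None:
        """Host step 1: place the programs in the engine's local memory."""
        self.local_memory.update(programs)

    def execute(self, names: List[str]) -> List[threading.Thread]:
        """Host step 2: request execution; the engine then runs on its own."""
        threads = [threading.Thread(target=self._run, args=(n,)) for n in names]
        for t in threads:
            t.start()
        return threads

    def _run(self, name: str) -> None:
        self.local_memory[name]()      # run the program without host help
        self.completions.put(name)     # per-program completion notification


def make_copy(src: bytes, dst: bytearray) -> Program:
    """Build a hypothetical 'program' that copies src into dst."""
    def program() -> None:
        dst[: len(src)] = src
    return program
```

    The host would load programs, call `execute`, and later read one entry from `completions` per finished program, which mirrors the per-program notification in the abstract.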

    Parallel Execution Unit that Extracts Data Parallelism at Runtime
    8.
    Type: invention patent application
    Status: under examination (published)

    Publication number: US20120191953A1

    Publication date: 2012-07-26

    Application number: US13434903

    Filing date: 2012-03-30

    IPC classification: G06F9/38 G06F9/312

    Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for the iterations is stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores for which no data dependence was determined is committed to memory.

    Data Parallel Function Call for Determining if Called Routine is Data Parallel
    9.
    Type: invention patent application
    Status: lapsed

    Publication number: US20120180031A1

    Publication date: 2012-07-12

    Application number: US13430168

    Filing date: 2012-03-26

    IPC classification: G06F9/45

    Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.
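
    The two runtime determinations in the abstract suggest a four-way dispatch, sketched below. The abstract does not say what is done in each case, so the handling shown here (per-element calls for a scalar target, a one-element vector for a scalar caller) is an assumption, as are the names `data_parallel`, `dp_call`, `vec_square`, and `scalar_negate`.

```python
from typing import Any, Callable, List


def data_parallel(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical marker: tag a routine as operating on whole vectors."""
    fn.is_data_parallel = True  # type: ignore[attr-defined]
    return fn


def dp_call(target: Callable[..., Any], arg: Any, caller_is_data_parallel: bool) -> Any:
    """Choose how to run target from the caller kind and the target kind."""
    target_is_dp = getattr(target, "is_data_parallel", False)
    if caller_is_data_parallel and target_is_dp:
        return target(arg)                 # vector in, vector out
    if caller_is_data_parallel:
        return [target(x) for x in arg]    # assumed: call scalar code per element
    if target_is_dp:
        return target([arg])[0]            # assumed: wrap scalar in a 1-lane vector
    return target(arg)                     # plain scalar call


@data_parallel
def vec_square(xs: List[int]) -> List[int]:
    return [x * x for x in xs]


def scalar_negate(x: int) -> int:
    return -x
```

    A data parallel caller invoking `scalar_negate` on `[1, 2, 3]` gets `[-1, -2, -3]` back, while a scalar caller invoking `vec_square` on `5` gets `25`.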

    Programmable Direct Memory Access Engine
    10.
    Type: invention patent application
    Status: in force

    Publication number: US20100161848A1

    Publication date: 2010-06-24

    Application number: US12342280

    Filing date: 2008-12-23

    IPC classification: G06F13/28

    CPC classification: G06F13/28

    Abstract: A mechanism for programming a direct memory access engine operating as a single thread processor is provided. A program is received from a host processor in a local memory associated with the direct memory access engine. A request is received in the direct memory access engine from the host processor indicating that the program located in the local memory is to be executed. The direct memory access engine executes the program without intervention by the host processor. Responsive to the program completing execution, the direct memory access engine sends a completion notification to the host processor that indicates that the program has completed execution.
