EFFICIENT TRANSFER OF MATRICES FOR MATRIX BASED OPERATIONS
    21.
    Invention application
    Status: In force

    Publication No.: US20100318758A1

    Publication date: 2010-12-16

    Application No.: US12485365

    Filing date: 2009-06-16

    IPC classification: G06F12/02 G06F7/32

    CPC classification: G06F17/16 G06F7/768

    Abstract: Techniques for transferring a matrix for performing one or more operations are provided. The techniques include applying a permutation on at least one of one or more columns and one or more rows of a matrix to group each of at least one of one or more columns and one or more rows of the matrix with a same alignment, blocking at least one of the grouped columns and grouped rows, and performing one or more operations on each matrix block.

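The three steps named in the abstract (permute to group by alignment, block the groups, operate per block) can be illustrated with a minimal sketch. It is not taken from the patent: the column-major layout, the 16-byte boundary, the 4-byte element size, and all function names are assumptions chosen only for illustration.

```python
from collections import defaultdict

ALIGNMENT = 16  # assumed alignment boundary in bytes (hypothetical)
ELEM_SIZE = 4   # assumed element size in bytes (hypothetical)


def column_alignment(base_addr, col, n_rows):
    """Offset of a column's first element from the alignment boundary,
    assuming column-major storage."""
    return (base_addr + col * n_rows * ELEM_SIZE) % ALIGNMENT


def group_columns_by_alignment(base_addr, n_rows, n_cols):
    """Return a column permutation that places columns with the same
    alignment next to each other, plus (alignment, size) per group."""
    groups = defaultdict(list)
    for col in range(n_cols):
        groups[column_alignment(base_addr, col, n_rows)].append(col)
    permutation, group_sizes = [], []
    for alignment in sorted(groups):
        permutation.extend(groups[alignment])
        group_sizes.append((alignment, len(groups[alignment])))
    return permutation, group_sizes


def blocks_of(permutation, group_sizes, block_width):
    """Split each alignment group into blocks of at most block_width columns."""
    blocks, start = [], 0
    for alignment, size in group_sizes:
        group = permutation[start:start + size]
        for i in range(0, size, block_width):
            blocks.append((alignment, group[i:i + block_width]))
        start += size
    return blocks


def column_sums_by_block(matrix, blocks):
    """Example per-block operation: sum each column, one block at a time."""
    sums = {}
    for _alignment, cols in blocks:
        for col in cols:
            sums[col] = sum(row[col] for row in matrix)
    return sums
```

Calling `group_columns_by_alignment(0, 3, 8)` and passing the result to `blocks_of` yields blocks whose columns all share one alignment, so a transfer routine could in principle handle every column in a block the same way.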

    Performance evaluation of algorithmic tasks and dynamic parameterization on multi-core processing systems
    22.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07793011B2

    Publication date: 2010-09-07

    Application No.: US12129245

    Filing date: 2008-05-29

    Abstract: A method for evaluating performance of DMA-based algorithmic tasks on a target multi-core processing system includes the steps of: inputting a template for a specified task, the template including DMA-related parameters specifying DMA operations and computational operations to be performed; evaluating performance for the specified task by running a benchmark on the target multi-core processing system, the benchmark being operative to generate data access patterns using DMA operations and invoking prescribed computation routines as specified by the input template; and providing results of the benchmark indicative of a measure of performance of the specified task corresponding to the target multi-core processing system.

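The template-driven benchmark described in the abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the patented method: the `TaskTemplate` fields and the `run_benchmark` function are hypothetical, and an ordinary memory copy stands in for a DMA transfer, since real hardware would expose its own DMA interface.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class TaskTemplate:
    """Hypothetical template: DMA-related parameters plus the compute routine."""
    transfer_bytes: int   # size of each simulated DMA transfer
    num_transfers: int    # number of transfers per benchmark run
    stride_bytes: int     # distance between consecutive source regions
    compute: Callable[[bytearray], int]  # routine run on each transferred buffer


def run_benchmark(template: TaskTemplate) -> dict:
    """Generate the access pattern given by the template and time it.

    A plain memory copy stands in for a DMA transfer (assumption).
    """
    span = (template.stride_bytes * (template.num_transfers - 1)
            + template.transfer_bytes)
    main_memory = bytearray(i % 256 for i in range(span))
    local_store = bytearray(template.transfer_bytes)
    checksum = 0
    start = time.perf_counter()
    for i in range(template.num_transfers):
        offset = i * template.stride_bytes
        # Simulated "DMA get": copy one strided region into the local buffer.
        local_store[:] = main_memory[offset:offset + template.transfer_bytes]
        checksum += template.compute(local_store)
    elapsed = time.perf_counter() - start
    total = template.transfer_bytes * template.num_transfers
    return {
        "elapsed_s": elapsed,
        "bytes_moved": total,
        "bytes_per_s": total / elapsed if elapsed > 0 else float("inf"),
        "checksum": checksum,
    }
```

For example, `run_benchmark(TaskTemplate(4096, 1000, 8192, sum))` times a strided pattern of 1000 transfers; changing the template fields changes the access pattern without touching the benchmark code, which is the kind of parameterization the abstract describes.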