Methods and apparatus for automatic communication optimizations in a compiler based on a polyhedral representation

    Publication No.: US09830133B1

    Publication date: 2017-11-28

    Application No.: US13712659

    Filing date: 2012-12-12

    IPC classification: G06F9/45

    CPC classification: G06F8/41 G06F8/453 G06F8/457

    Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least one local memory unit that allows for data reuse opportunities. The first custom computing apparatus optimizes the code for reduced communication execution on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
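The abstract above does not spell out how communication is reduced, so the following is only a minimal sketch of the general idea of data reuse in a local memory, not the patented method. It assumes a sliding-window sum as the workload; the function names, the `tile` parameter, and the transfer-counting model are all hypothetical and chosen purely for illustration.

```python
def naive_transfers(n, width):
    """Transfers when every operand is fetched from remote memory on each use."""
    return n * width


def tiled_transfers(n, width, tile):
    """Transfers when each tile of outputs copies its input window locally once."""
    transfers = 0
    for start in range(0, n, tile):
        outputs = min(tile, n - start)
        # A tile of `outputs` results touches `outputs + width - 1` distinct inputs.
        transfers += outputs + width - 1
    return transfers


def stencil_with_local_buffer(data, width, tile):
    """Sliding-window sums that read inputs only from a per-tile local copy."""
    n = len(data) - width + 1  # number of output elements
    out = []
    for start in range(0, n, tile):
        outputs = min(tile, n - start)
        # One bulk copy into "local memory" (a plain list slice in this sketch).
        local = data[start:start + outputs + width - 1]
        for i in range(outputs):
            out.append(sum(local[i:i + width]))
    return out
```

With 8 outputs, a window of 3, and tiles of 4 outputs, the naive count is 24 element transfers while the tiled count is 12, because overlapping window elements are reused from the local copy instead of being fetched again.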

    Methods and apparatus for joint scheduling and layout optimization to enable multi-level vectorization
    Patent type: granted invention patent (status: in force)

    Publication No.: US09489180B1

    Publication date: 2016-11-08

    Application No.: US13679861

    Filing date: 2012-11-16

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F8/443 G06F8/447

    Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least one vector execution unit that allows for parallel execution of tasks on constant-strided memory locations. The first custom computing apparatus optimizes the code for parallelism, locality of operations, constant-strided memory accesses and vectorized execution on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
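To make "constant-strided memory accesses" concrete, here is a small sketch of how the memory stride seen by the innermost loop depends on both the loop order (schedule) and the array layout. It is an illustration under stated assumptions (row-major storage, affine subscripts with integer coefficients), not the procedure claimed in the patent; `row_major_strides` and `innermost_stride` are hypothetical helper names.

```python
def row_major_strides(shape):
    """Element stride of each dimension for a row-major array of `shape`."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def innermost_stride(shape, subscripts, innermost):
    """Elements skipped between consecutive iterations of the `innermost` loop.

    `subscripts[d]` maps loop-variable names to integer coefficients for
    array dimension d, e.g. {"i": 1} stands for the subscript `i`.
    """
    strides = row_major_strides(shape)
    return sum(strides[d] * subscripts[d].get(innermost, 0)
               for d in range(len(shape)))


# Access A[i][j] on a 4x8 array.
a_ij = [{"i": 1}, {"j": 1}]
stride_j_inner = innermost_stride((4, 8), a_ij, "j")  # unit stride
stride_i_inner = innermost_stride((4, 8), a_ij, "i")  # stride of one full row

# Same data stored transposed as A_t[j][i] on an 8x4 array (a layout change).
at_ji = [{"j": 1}, {"i": 1}]
stride_i_inner_transposed = innermost_stride((8, 4), at_ji, "i")
```

In this sketch, making `j` the innermost loop or transposing the array both yield unit stride for the innermost accesses, which is the kind of access pattern a vector unit typically handles most efficiently. Choosing between the two is a simple example of why schedule and layout are worth considering together.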
