Cache aware self-referential structure peeling

    公开(公告)号:US10275230B2

    公开(公告)日:2019-04-30

    申请号:US15650368

    申请日:2017-07-14

    Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.

    Strided loading of non-sequential memory locations by skipping memory locations between consecutive loads

    公开(公告)号:US10353708B2

    公开(公告)日:2019-07-16

    申请号:US15273916

    申请日:2016-09-23

    Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.

    Optimizing runtime alias checks
    4.
    发明授权

    公开(公告)号:US11435987B2

    公开(公告)日:2022-09-06

    申请号:US16774756

    申请日:2020-01-28

    Abstract: Optimizing runtime alias checks includes identifying, by a compiler, a base pointer and a plurality of different memory accesses based on the base pointer in a code loop; generating, by the compiler, a first portion of runtime code to determine a minimum access and a maximum access of the plurality of different memory accesses; and generating, by the compiler, a second portion of runtime code including one or more runtime alias checks for the minimum access and one or more runtime alias checks for the maximum access.

    CACHE AWARE SELF-REFERENTIAL STRUCTURE PEELING

    公开(公告)号:US20190018664A1

    公开(公告)日:2019-01-17

    申请号:US15650368

    申请日:2017-07-14

    Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.

Patent Agency Ranking