-
Publication No.: US10275230B2
Publication Date: 2019-04-30
Application No.: US15650368
Filing Date: 2017-07-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Suresh Mani, Dibyendu Das, Shivarama Rao, Ashutosh Nema
IPC: G06F9/44, G06F8/41, G06F12/0802
Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.
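The layout described in this abstract can be modelled in a few lines. The following is a minimal sketch, not the patented compiler transformation itself: the function names `aosoa_layout` and `peel`, the zero padding value, and the block size of 4 are all assumptions chosen for illustration. It shows one block per field within each group of elements, and rounding the allocation up so that only complete blocks are used.

```python
def aosoa_layout(num_elements, num_fields, block_size):
    """Return (total_slots, index_fn) for a hypothetical AOSOA layout."""
    # Round up so that only complete blocks are allocated.
    num_groups = -(-num_elements // block_size)
    total_slots = num_groups * num_fields * block_size

    def index(i, f):
        # Element i lives in group i // block_size, at lane i % block_size
        # of the block that holds field f for that group.
        group, lane = divmod(i, block_size)
        return (group * num_fields + f) * block_size + lane

    return total_slots, index


def peel(aos, block_size, pad=0):
    """Copy a list of equal-length tuples (the AOS) into a flat AOSOA buffer."""
    num_fields = len(aos[0])
    total, index = aosoa_layout(len(aos), num_fields, block_size)
    buf = [pad] * total  # unused lanes of the last group keep the pad value
    for i, record in enumerate(aos):
        for f, value in enumerate(record):
            buf[index(i, f)] = value
    return buf


# Five elements with two fields (x, y), peeled with a block size of 4.
points = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]
flat = peel(points, block_size=4)
```

With five elements and a block size of 4, two groups are needed, so the buffer holds four complete blocks (16 slots) and the last group is padded.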
-
Publication No.: US10353708B2
Publication Date: 2019-07-16
Application No.: US15273916
Filing Date: 2016-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar
Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.
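The load, shuffle, operate, and masked-store sequence in this abstract can be simulated with plain lists. This is only an illustrative model under stated assumptions: the register width of 4 lanes, the helper names `vload`, `shuffle`, and `masked_store`, and the stride-2 access pattern are hypothetical and do not correspond to any particular instruction set or to the claimed hardware.

```python
WIDTH = 4  # lanes per simulated vector register (assumed)


def vload(mem, addr):
    """Load WIDTH consecutive words starting at addr."""
    return mem[addr:addr + WIDTH]


def shuffle(regs, picks):
    """Build one register from (register index, lane) pairs, one per output lane."""
    return [regs[r][lane] for r, lane in picks]


def masked_store(mem, addr, reg, mask):
    """Store only the lanes whose mask bit is set; other words are left untouched."""
    for lane in range(WIDTH):
        if mask[lane]:
            mem[addr + lane] = reg[lane]


# Operands of interest sit at the even addresses 0, 2, 4, 6 (non-sequential).
mem = list(range(100, 108))

# 1. Several ordinary vector loads cover the scattered operands.
regs = [vload(mem, 0), vload(mem, 4)]
# 2. One shuffle consolidates them into a single register.
packed = shuffle(regs, [(0, 0), (0, 2), (1, 0), (1, 2)])
# 3. The vector operation runs on the consolidated register.
result = [x * 2 for x in packed]

# 4. Store path: permute results back to their lanes, then store under a mask.
out0 = [result[0], None, result[1], None]
out1 = [result[2], None, result[3], None]
mask = [True, False, True, False]
masked_store(mem, 0, out0, mask)
masked_store(mem, 4, out1, mask)
```

The mask keeps the odd-addressed words unchanged, which is why the filler lanes of `out0` and `out1` never reach memory.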
-
Publication No.: US20180088948A1
Publication Date: 2018-03-29
Application No.: US15273916
Filing Date: 2016-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar
CPC classification number: G06F9/30036, G06F9/30032, G06F9/30043, G06F9/3455, G06F15/8007, G06F15/8053
Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.
-
Publication No.: US11435987B2
Publication Date: 2022-09-06
Application No.: US16774756
Filing Date: 2020-01-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Ganesh Gopalasubramanian, Ashutosh Nema, Venugopal Raghavan
IPC: G06F8/41
Abstract: Optimizing runtime alias checks includes identifying, by a compiler, a base pointer and a plurality of different memory accesses based on the base pointer in a code loop; generating, by the compiler, a first portion of runtime code to determine a minimum access and a maximum access of the plurality of different memory accesses; and generating, by the compiler, a second portion of runtime code including one or more runtime alias checks for the minimum access and one or more runtime alias checks for the maximum access.
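The idea of checking only the extreme accesses can be sketched as follows. This is a simplified model, not the generated runtime code the patent describes: it assumes accesses of the form `base[i + k]` over a loop of `trip_count` iterations (at least one), and the names `access_bounds` and `may_alias`, the addresses, and the element size are hypothetical.

```python
def access_bounds(base, offsets, trip_count, elem_size):
    """Byte range [lo, hi) touched by base[i + k] for k in offsets, i in 0..trip_count-1.

    First portion: reduce all accesses on one base pointer to a minimum
    and a maximum access instead of tracking each access separately.
    """
    lo = base + min(offsets) * elem_size
    hi = base + (max(offsets) + trip_count - 1) * elem_size + elem_size
    return lo, hi


def may_alias(range_a, range_b):
    """Second portion: a single overlap test on the two extreme bounds."""
    return range_a[0] < range_b[1] and range_b[0] < range_a[1]


# Loop reads a[i], a[i + 3], a[i - 2] and writes b[i], for 10 iterations of 4-byte elements.
a_range = access_bounds(base=1000, offsets=[0, 3, -2], trip_count=10, elem_size=4)
b_disjoint = access_bounds(base=1052, offsets=[0], trip_count=10, elem_size=4)
b_overlap = access_bounds(base=1048, offsets=[0], trip_count=10, elem_size=4)

safe_to_vectorize = not may_alias(a_range, b_disjoint)
must_fall_back = may_alias(a_range, b_overlap)
```

In this model three accesses on `a` cost one pair of comparisons against `b`, rather than one pair per access.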
-
Publication No.: US20190018664A1
Publication Date: 2019-01-17
Application No.: US15650368
Filing Date: 2017-07-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Suresh Mani, Dibyendu Das, Shivarama Rao, Ashutosh Nema
IPC: G06F9/45, G06F12/0802
Abstract: Methods of compiling source code are provided. A method includes identifying a first array of structures (AOS), having a plurality of array elements, each array element being a structure with a plurality of fields, and performing structure peeling on the first AOS to convert a data layout of the first AOS to an array of structure of arrays (AOSOA) including a plurality of memory blocks of uniform block size. At least one of the plurality of memory blocks is allocated for each field of the plurality of fields. The method further includes allocating a number of complete memory blocks to accommodate all of the plurality of array elements of the AOS.