-
1.
Publication No.: US10387128B2
Publication date: 2019-08-20
Application No.: US15499018
Filing date: 2017-04-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Md Asghar Ahmad Shahid, Shivarama Kokrady Rao, Dibyendu Das
Abstract: A method and apparatus provides for compiling a computer-readable computer program having a plurality of computer-readable statements into a plurality of computer-executable instructions. In one example, the method and apparatus determines when at least one pair of the computer-readable statements is partial-isomorphic having an equivalent operation and same order of operation but causing non-consecutive memory accesses, and generates the computer-executable instructions causing the at least one pair of the partial-isomorphic statements to perform sequential physical memory accesses.
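The classification described in the abstract can be illustrated with a minimal sketch. It reflects one possible reading of "partial-isomorphic" and is not the patented method: the names `Access`, `Statement`, and `classify_pair`, and the rule that every pair of corresponding operands must be adjacent in the same array, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Access:
    """One memory operand, e.g. b[4] (hypothetical model)."""
    array: str
    index: int


@dataclass(frozen=True)
class Statement:
    """A statement reduced to its operation order and its memory operands."""
    ops: Tuple[str, ...]          # operations in evaluation order
    accesses: Tuple[Access, ...]  # operands in order, destination first


def is_consecutive(a: Access, b: Access) -> bool:
    # Two operands are consecutive if they are neighbours in the same array.
    return a.array == b.array and b.index == a.index + 1


def classify_pair(s1: Statement, s2: Statement) -> str:
    # Different operations or a different operand count: no match at all.
    if s1.ops != s2.ops or len(s1.accesses) != len(s2.accesses):
        return "not-isomorphic"
    # Same operations and every operand pair adjacent in memory.
    if all(is_consecutive(a, b) for a, b in zip(s1.accesses, s2.accesses)):
        return "isomorphic"
    # Same operations in the same order, but some access is non-consecutive.
    return "partial-isomorphic"


# a[0] = b[0] + c[0]  and  a[1] = b[4] + c[1]
s1 = Statement(("add",), (Access("a", 0), Access("b", 0), Access("c", 0)))
s2 = Statement(("add",), (Access("a", 1), Access("b", 4), Access("c", 1)))
print(classify_pair(s1, s2))  # partial-isomorphic: b[0] and b[4] are not adjacent
```

In this reading, a compiler that finds a partial-isomorphic pair would still need to rearrange the non-adjacent operands (here `b[0]` and `b[4]`) so that the generated instructions access memory sequentially; that step is not modelled in the sketch.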
-
2.
Publication No.: US20180314506A1
Publication date: 2018-11-01
Application No.: US15499018
Filing date: 2017-04-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Md Asghar Ahmad Shahid, Shivarama Kokrady Rao, Dibyendu Das
IPC: G06F9/45
CPC classification number: G06F8/4452, G06F8/443
Abstract: A method and apparatus provides for compiling a computer-readable computer program having a plurality of computer-readable statements into a plurality of computer-executable instructions. In one example, the method and apparatus determines when at least one pair of the computer-readable statements is partial-isomorphic having an equivalent operation and same order of operation but causing non-consecutive memory accesses, and generates the computer-executable instructions causing the at least one pair of the partial-isomorphic statements to perform sequential physical memory accesses.
-
3.
Publication No.: US10353708B2
Publication date: 2019-07-16
Application No.: US15273916
Filing date: 2016-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar
Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.
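The load, shuffle, operate, and masked-store sequence in the abstract can be sketched as a scalar simulation. This is an illustration under stated assumptions, not the patented implementation: the vector width `VLEN = 4`, the stride-4 operand layout, and the helper names `vector_load`, `shuffle`, `masked_store`, and `scale_strided` are all hypothetical, and Python lists stand in for vector registers.

```python
from typing import List, Tuple

VLEN = 4  # assumed vector width, in elements


def vector_load(mem: List[int], base: int) -> List[int]:
    # Load VLEN contiguous elements starting at base into a "register".
    return mem[base:base + VLEN]


def shuffle(regs: List[List[int]], picks: List[Tuple[int, int]]) -> List[int]:
    # Build one register by picking (register, lane) pairs from several registers.
    return [regs[r][lane] for r, lane in picks]


def masked_store(mem: List[int], base: int, reg: List[int], mask: List[bool]) -> None:
    # Write only the lanes whose mask bit is set; other memory is untouched.
    for lane in range(VLEN):
        if mask[lane]:
            mem[base + lane] = reg[lane]


def scale_strided(mem: List[int], factor: int) -> Tuple[List[int], List[int]]:
    """Scale mem[0], mem[4], mem[8], mem[12] in place (assumed stride of 4)."""
    bases = [0, 4, 8, 12]
    # 1. One vector load per non-sequential operand.
    regs = [vector_load(mem, b) for b in bases]
    # 2. Shuffle lane 0 of each register into a single register.
    packed = shuffle(regs, [(r, 0) for r in range(VLEN)])
    # 3. One vector operation on the consolidated operands.
    result = [x * factor for x in packed]
    # 4. Permute each result into lane 0 of its own register, then masked store.
    mask = [True, False, False, False]
    for r, b in enumerate(bases):
        out = [result[r]] + [0] * (VLEN - 1)
        masked_store(mem, b, out, mask)
    return packed, result


mem = list(range(16))
packed, result = scale_strided(mem, 10)
print(packed)   # [0, 4, 8, 12]
print(result)   # [0, 40, 80, 120]
print(mem[:8])  # [0, 1, 2, 3, 40, 5, 6, 7]
```

The mask is what lets the store step write full-width registers back to memory without overwriting the neighbouring elements that were never part of the computation.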
-
4.
Publication No.: US20180088948A1
Publication date: 2018-03-29
Application No.: US15273916
Filing date: 2016-09-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Anupama Rajesh Rasale, Dibyendu Das, Ashutosh Nema, Md Asghar Ahmad Shahid, Prathiba Kumar
CPC classification number: G06F9/30036, G06F9/30032, G06F9/30043, G06F9/3455, G06F15/8007, G06F15/8053
Abstract: Systems, apparatuses, and methods for utilizing efficient vectorization techniques for operands in non-sequential memory locations are disclosed. A system includes a vector processing unit (VPU) and one or more memory devices. In response to determining that a plurality of vector operands are stored in non-sequential memory locations, the VPU performs a plurality of vector load operations to load the plurality of vector operands into a plurality of vector registers. Next, the VPU performs a shuffle operation to consolidate the plurality of vector operands from the plurality of vector registers into a single vector register. Then, the VPU performs a vector operation on the vector operands stored in the single vector register. The VPU can also perform a vector store operation by permuting and storing a plurality of vector operands in appropriate locations within multiple vector registers and then storing the vector registers to locations in memory using a mask.
-