-
Publication number: US20190050212A1
Publication date: 2019-02-14
Application number: US16076735
Application date: 2016-03-11
Applicant: INTEL CORPORATION
Inventor: Hideki Saito IDO, Serge V. PREIS, Sergey S. KOZHUKHOV, Xinmin TIAN, Sergey V. MASLOV, Clark NELSON, Jianfei YU
IPC: G06F8/41
Abstract: Technologies for indirectly calling vector functions include a compute device that includes a memory device to store source code and a compiler module. The compiler module is to identify a set of declarations of vector variants for scalar functions in the source code, generate a vector variant address map for each set of vector variants, generate an offset map for each scalar function, and identify, in the source code, an indirect call to the scalar functions, wherein the indirect call is to be vectorized. The compiler module is also to determine, based on a context of the indirect call, a vector variant to be called and store, in object code and in association with the indirect call, an offset into one of the vector variant address maps based on (i) the determined vector variant to be called and (ii) the offset map that corresponds to each scalar function.
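The following C++ sketch illustrates the kind of dispatch the abstract describes, assuming a canonical slot order shared by all scalar functions; every identifier here (VectorVariantAddressMap, offset_for, scale_v8, apply) is invented for illustration and is not taken from the patent.

#include <cstddef>

// Hypothetical sketch: one scalar function plus the addresses of its vector
// variants, laid out in a canonical slot order shared by every scalar function.
struct VectorVariantAddressMap {
    float (*scalar)(float);                       // original scalar entry point
    void (*variants[4])(const float *, float *);  // e.g. {xmm, ymm, masked-xmm, masked-ymm}
};

// Compile-time "offset map": a call context (vector length, masked or not)
// yields the same slot index for every scalar function.
constexpr std::size_t offset_for(unsigned vector_length, bool masked) {
    std::size_t base = masked ? 2u : 0u;
    return base + (vector_length == 8 ? 1u : 0u);
}

float scale(float x) { return 2.0f * x; }         // scalar function
void scale_v8(const float *in, float *out) {      // stand-in for its 8-wide variant
    for (int i = 0; i < 8; ++i) out[i] = 2.0f * in[i];
}

VectorVariantAddressMap scale_map = { &scale, { nullptr, &scale_v8, nullptr, nullptr } };

// Vectorized loop containing the indirect call: the offset chosen from the
// call context picks the matching variant out of the address map at runtime.
void apply(const VectorVariantAddressMap &map, const float *in, float *out, int n) {
    constexpr std::size_t off = offset_for(/*vector_length=*/8, /*masked=*/false);
    auto vfn = map.variants[off];
    int i = 0;
    for (; vfn && i + 8 <= n; i += 8) vfn(in + i, out + i);  // vector body
    for (; i < n; ++i) out[i] = map.scalar(in[i]);           // scalar remainder
}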
-
Publication number: US20230244456A1
Publication date: 2023-08-03
Application number: US18161105
Application date: 2023-01-30
Applicant: Intel Corporation
Inventor: Wei XIAO, Xinmin TIAN
CPC classification number: G06F8/441, G06F9/30112
Abstract: Examples provide an apparatus, device, method, computer program, and non-transitory machine-readable storage medium including program code for processing memory spill code during compilation of a computer program. The program code, when executed, causes a machine to identify a plurality of instructions related to scalar memory spill code during compilation and to transform at least a subset of those instructions into vectorized code.
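A hedged C++ sketch of the kind of rewrite the abstract describes, assuming adjacent 8-byte spill slots and AVX availability; SpillArea and the spill_* functions are invented names, not the patent's terms.

#include <immintrin.h>
#include <cstdint>

struct SpillArea {
    alignas(32) int64_t slot[4];   // four adjacent 8-byte spill slots on the stack
};

// "Before": scalar spill code, one store per live value.
void spill_scalar(SpillArea &a, int64_t r0, int64_t r1, int64_t r2, int64_t r3) {
    a.slot[0] = r0;
    a.slot[1] = r1;
    a.slot[2] = r2;
    a.slot[3] = r3;
}

// "After": the same four spills folded into a single 256-bit vector store.
void spill_vectorized(SpillArea &a, int64_t r0, int64_t r1, int64_t r2, int64_t r3) {
    __m256i v = _mm256_set_epi64x(r3, r2, r1, r0);          // pack the live values
    _mm256_store_si256(reinterpret_cast<__m256i *>(a.slot), v);
}

// Reload side: one vector load instead of four scalar loads.
__m256i reload_vectorized(const SpillArea &a) {
    return _mm256_load_si256(reinterpret_cast<const __m256i *>(a.slot));
}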
-
Publication number: US20210382717A1
Publication date: 2021-12-09
Application number: US16892202
Application date: 2020-06-03
Applicant: Intel Corporation
Inventor: Hong JIANG, Sabareesh GANAPATHY, Xinmin TIAN, Fangwen FU, James VALERIO
Abstract: Examples described herein relate to a graphics processing apparatus that includes a memory device and a graphics processing unit (GPU) coupled to the memory device, the GPU can be configured to: execute an instruction thread; determine if a signal barrier is associated with the instruction thread; for a signal barrier associated with the instruction thread, determine if the signal barrier is cleared; and based on the signal barrier being cleared, permit any waiting instruction thread associated with the signal barrier identifier to commence with execution but not permit any waiting thread that is not associated with the signal barrier identifier to commence with execution. In some examples, the signal barrier includes a signal barrier identifier. In some examples, the signal barrier identifier is one of a plurality of values. In some examples, a gateway is used to receive indications of a signal barrier identifier and to selectively clear a signal barrier for a waiting instruction thread associated with the signal barrier identifier based on clearance conditions associated with the signal barrier being met.
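The following host-side C++ sketch models the described behavior with a mutex and condition variable, assuming a simple signal-count clearance condition; BarrierGateway and its methods are hypothetical names, not a GPU or driver API.

#include <condition_variable>
#include <cstdint>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

class BarrierGateway {
public:
    explicit BarrierGateway(std::size_t ids) : cleared_(ids, false), pending_(ids, 0) {}

    // A producer signals barrier `id`; once the expected number of signals has
    // arrived, the barrier is cleared and only waiters on that id may proceed.
    void signal(std::uint32_t id, int expected) {
        std::lock_guard<std::mutex> lk(m_);
        if (++pending_[id] >= expected) {
            cleared_[id] = true;
            cv_.notify_all();
        }
    }

    // A consumer blocks until its barrier id is cleared; waiters on other ids
    // are not released.
    void wait(std::uint32_t id) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return cleared_[id]; });
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::vector<bool> cleared_;
    std::vector<int> pending_;
};

int main() {
    BarrierGateway gw(2);                       // two distinct barrier identifiers
    std::thread waiter0([&] { gw.wait(0); std::puts("id 0 released"); });
    std::thread waiter1([&] { gw.wait(1); std::puts("id 1 released"); });
    gw.signal(0, /*expected=*/1);               // clears only barrier id 0
    waiter0.join();
    gw.signal(1, /*expected=*/1);               // now barrier id 1
    waiter1.join();
}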
-
Publication number: US20190278577A1
Publication date: 2019-09-12
Application number: US16304644
Application date: 2016-07-01
Applicant: Intel Corporation
Inventor: Mikhail PLOTNIKOV, Hideki IDO, Xinmin TIAN, Sergey PREIS, Milind B. GIRKAR, Maxim SHUTOV
IPC: G06F8/41
Abstract: Methods, apparatus, and system to optimize compilation of source code into vectorized compiled code, notwithstanding the presence of output dependencies which might otherwise preclude vectorization.
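A hedged C++/AVX sketch of one way to vectorize a loop whose only obstacle is an output dependence (every iteration may overwrite the same scalar, last write wins); the conflict-resolution strategy shown here is an assumption for illustration, not necessarily the one claimed in the patent.

#include <immintrin.h>

// Scalar original: `last` carries an output dependence across iterations.
float last_positive_scalar(const float *a, int n, float init) {
    float last = init;
    for (int i = 0; i < n; ++i)
        if (a[i] > 0.0f) last = a[i];
    return last;
}

// Vectorized: compare 8 lanes at once; if any lane matched, the highest
// matching lane corresponds to the latest iteration and wins the conflict.
float last_positive_vector(const float *a, int n, float init) {
    float last = init;
    int i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 v    = _mm256_loadu_ps(a + i);
        __m256 mask = _mm256_cmp_ps(v, _mm256_setzero_ps(), _CMP_GT_OQ);
        int bits    = _mm256_movemask_ps(mask);
        if (bits) {
            alignas(32) float lanes[8];
            _mm256_store_ps(lanes, v);
            int hi = 31 - __builtin_clz(static_cast<unsigned>(bits)); // highest set lane (GCC/Clang builtin)
            last = lanes[hi];
        }
    }
    for (; i < n; ++i)                        // scalar remainder loop
        if (a[i] > 0.0f) last = a[i];
    return last;
}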
-