1. TECHNOLOGIES FOR INDIRECTLY CALLING VECTOR FUNCTIONS

    Publication Number: US20190050212A1

    Publication Date: 2019-02-14

    Application Number: US16076735

    Application Date: 2016-03-11

    Abstract: Technologies for indirectly calling vector functions include a compute device that includes a memory device to store source code and a compiler module. The compiler module is to identify a set of declarations of vector variants for scalar functions in the source code, generate a vector variant address map for each set of vector variants, generate an offset map for each scalar function, and identify, in the source code, an indirect call to the scalar functions, wherein the indirect call is to be vectorized. The compiler module is also to determine, based on a context of the indirect call, a vector variant to be called and store, in object code and in association with the indirect call, an offset into one of the vector variant address maps based on (i) the determined vector variant to be called and (ii) the offset map that corresponds to each scalar function.
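The dispatch scheme the abstract describes can be illustrated with a minimal sketch. This is not the patented implementation; the function names, the table layout, and the context keys ("simd4", "simd8") are all assumptions made for illustration. The idea shown is only the high-level one from the abstract: each scalar function has a table of vector variant entry points (the vector variant address map) and a per-function offset map, and an indirect call site resolves its variant by looking up an offset for its vectorization context.

```python
# Illustrative sketch only: resolving an indirect call to a vector
# variant through an offset map and a variant address table.

def f_scalar(x):
    return x * 2  # stand-in scalar function body

def f_vector4(xs):
    return [f_scalar(x) for x in xs]  # hypothetical 4-lane variant

def f_vector8(xs):
    return [f_scalar(x) for x in xs]  # hypothetical 8-lane variant

# "Vector variant address map": ordered table of variant entry points.
VARIANT_TABLE = [f_vector4, f_vector8]

# "Offset map" for the scalar function: call-site context -> offset
# into the variant table (context names are assumptions).
OFFSET_MAP = {"simd4": 0, "simd8": 1}

def indirect_vector_call(context, args):
    # The call site stores only an offset; the variant is resolved
    # indirectly at dispatch time from the table.
    offset = OFFSET_MAP[context]
    return VARIANT_TABLE[offset](args)

print(indirect_vector_call("simd4", [1, 2, 3, 4]))  # [2, 4, 6, 8]
```

In a real compiler this resolution happens at compile time: the offset is baked into the object code next to the indirect call, so the right variant can be selected without knowing the callee's address statically.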

2. Concept for Handling Memory Spills
    Invention Publication

    Publication Number: US20230244456A1

    Publication Date: 2023-08-03

    Application Number: US18161105

    Application Date: 2023-01-30

    CPC classification number: G06F8/441 G06F9/30112

    Abstract: Examples provide an apparatus, device, method, computer program, and non-transitory machine-readable storage medium including program code for processing memory spill code during compilation of a computer program. The program code, when executed, causes a machine to identify a plurality of instructions related to scalar memory spill code during compilation and to transform at least a subset of the plurality of instructions into vectorized code.
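The transformation the abstract names can be pictured with a small sketch. This is an assumption-laden illustration, not the claimed method: the frame is modeled as a list, and a "vector store" is modeled as a single slice assignment. The point shown is only that several adjacent scalar spill stores can be replaced by one wide store with the same effect on the stack frame.

```python
# Hedged sketch: scalar spill code (one store per spilled register)
# versus the vectorized form (one wide store). Layout is illustrative.

def spill_scalar(frame, base, regs):
    # One store instruction per spilled register value.
    for i, value in enumerate(regs):
        frame[base + i] = value

def spill_vectorized(frame, base, regs):
    # The same effect expressed as a single vector-width store.
    frame[base:base + len(regs)] = regs

frame_a = [0] * 8
frame_b = [0] * 8
spill_scalar(frame_a, 2, [10, 20, 30, 40])
spill_vectorized(frame_b, 2, [10, 20, 30, 40])
assert frame_a == frame_b  # both frames hold identical spilled values
```

The corresponding reload (fill) code can be vectorized the same way, turning several scalar loads into one vector load from the contiguous spill slots.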

3. HIERARCHICAL THREAD SCHEDULING
    Invention Application

    Publication Number: US20210382717A1

    Publication Date: 2021-12-09

    Application Number: US16892202

    Application Date: 2020-06-03

    Abstract: Examples described herein relate to a graphics processing apparatus that includes a memory device and a graphics processing unit (GPU) coupled to the memory device, the GPU can be configured to: execute an instruction thread; determine if a signal barrier is associated with the instruction thread; for a signal barrier associated with the instruction thread, determine if the signal barrier is cleared; and based on the signal barrier being cleared, permit any waiting instruction thread associated with the signal barrier identifier to commence with execution but not permit any waiting thread that is not associated with the signal barrier identifier to commence with execution. In some examples, the signal barrier includes a signal barrier identifier. In some examples, the signal barrier identifier is one of a plurality of values. In some examples, a gateway is used to receive indications of a signal barrier identifier and to selectively clear a signal barrier for a waiting instruction thread associated with the signal barrier identifier based on clearance conditions associated with the signal barrier being met.
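The selective-release behavior described above can be sketched as a tiny model. This is an illustration under stated assumptions, not the GPU implementation: the class name, method names, and the use of plain Python data structures are all hypothetical. It captures only the gateway idea from the abstract: threads wait on a specific signal barrier identifier, and clearing a barrier releases only the threads associated with that identifier, leaving other waiters blocked.

```python
# Illustrative model of a barrier "gateway" keyed by signal barrier
# identifier. Names and structure are assumptions for this sketch.

class BarrierGateway:
    def __init__(self):
        # Maps a signal barrier identifier to its waiting threads.
        self.waiting = {}

    def wait(self, barrier_id, thread):
        # Record that the thread is blocked on this barrier identifier.
        self.waiting.setdefault(barrier_id, []).append(thread)

    def clear(self, barrier_id):
        # Release only the threads tied to this identifier; threads
        # waiting on other identifiers stay blocked.
        return self.waiting.pop(barrier_id, [])

gw = BarrierGateway()
gw.wait(1, "t0")
gw.wait(1, "t1")
gw.wait(2, "t2")
released = gw.clear(1)
print(released)    # ['t0', 't1']
print(gw.waiting)  # {2: ['t2']}
```

Keying the release on an identifier rather than on a single global barrier is what allows the hierarchical behavior: distinct groups of threads can synchronize independently through the same gateway.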
