Scalarization of vector processing
Abstract:
A Single-Instruction-Multiple-Threads (SIMT) computing system includes multiple processors and a scheduler to schedule multiple threads to each of the processors. Each processor includes a scalar unit to provide a scalar lane for scalar execution and vector units to provide N parallel lanes for vector execution. At execution time, a processor detects that an instruction of N threads has been predicted by a compiler to have (N−M) inactive threads and the same source operands for the M active threads, where N>M≥1. Upon the detection, the instruction is sent to the scalar unit for scalar execution.
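The dispatch decision described in the abstract can be illustrated with a minimal software sketch. This is a toy model and not the patented hardware: the `Instruction` class, the `scalar_hint` flag standing in for the compiler's prediction, the `execute` function, and the lane count `N = 8` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

N = 8  # assumed number of parallel vector lanes (illustrative value)


@dataclass
class Instruction:
    op: Callable[[int, int], int]  # operation applied to the two source operands
    src_a: List[int]               # per-thread first source operand
    src_b: List[int]               # per-thread second source operand
    active: List[bool]             # per-thread active mask
    scalar_hint: bool              # hypothetical flag: compiler predicted that the
                                   # active threads share the same source operands


def execute(inst: Instruction) -> Tuple[str, List[Optional[int]]]:
    """Route an instruction to the scalar lane or the vector lanes.

    Returns the unit used and the per-thread results (None for inactive threads).
    """
    m = sum(inst.active)  # M, the number of active threads
    if inst.scalar_hint and 1 <= m < N:
        # Scalar path: evaluate once, using the operands of the first active
        # thread, then give the same value to every active thread.
        first = inst.active.index(True)
        value = inst.op(inst.src_a[first], inst.src_b[first])
        return "scalar", [value if is_on else None for is_on in inst.active]
    # Vector path: evaluate once per active lane.
    results = [
        inst.op(a, b) if is_on else None
        for a, b, is_on in zip(inst.src_a, inst.src_b, inst.active)
    ]
    return "vector", results
```

In this sketch the scalar path computes the operation once rather than M times, and it yields the same per-thread results as the vector path whenever the compiler's prediction holds. The model trusts the hint and does not re-check the operands at run time, which is an assumption of the sketch.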