Abstract:
Systems, apparatuses and methods include technology that identifies that a computation will be executed based on a plurality of values. The technology determines an order-of-operations associated with the computation and loads the plurality of values in an order determined based on the order-of-operations.
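The idea of loading values in an order derived from the order-of-operations can be sketched as follows. This is an illustration only: the abstract does not say how the order is determined, so the sketch assumes the computation is an arithmetic expression, that operations run in post-order over its syntax tree, and that each value is loaded just before the first operation that consumes it. The function names `operations_in_order` and `load_order` are hypothetical.

```python
import ast


def operations_in_order(node):
    """Yield binary operations in the order they would execute (post-order)."""
    if isinstance(node, ast.BinOp):
        yield from operations_in_order(node.left)
        yield from operations_in_order(node.right)
        yield node


def load_order(expression):
    """Return variable names in the order their values are first needed.

    Assumption: a value is loaded when the first operation that directly
    consumes it is about to execute.
    """
    tree = ast.parse(expression, mode="eval").body
    order = []
    for op in operations_in_order(tree):
        for operand in (op.left, op.right):
            if isinstance(operand, ast.Name) and operand.id not in order:
                order.append(operand.id)
    return order


if __name__ == "__main__":
    # The multiplication executes before the addition, so b and c load first.
    print(load_order("a + b * c"))
```

In this sketch, `a + b * c` yields the load order `b`, `c`, `a`, because the multiplication is the first operation to execute.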
Abstract:
A method and apparatus for maintaining statistical inference accuracy with 8-bit Winograd convolution. A calibration dataset and a pretrained CNN comprising 32-bit floating point weight values may be sampled to generate an input activation tensor and a weight tensor. A transformed input activation tensor may be generated by multiplying the input activation tensor and an input matrix. A transformed weight tensor may be generated by multiplying the weight tensor and a weight matrix. A scale factor may be computed for each transformed tensor. An 8-bit CNN model including the scale factors may be generated.
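A minimal sketch of the transform-then-scale steps follows. The abstract names neither the transform matrices nor the scale-factor formula, so this sketch assumes the common Winograd F(2x2, 3x3) matrices and a symmetric max-absolute-value scale onto the signed 8-bit range. All function names are hypothetical, and a single tile and kernel stand in for the sampled tensors.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


# Assumed F(2x2, 3x3) Winograd matrices (not specified in the abstract).
B_T = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]


def transform_input(tile):
    """Transform a 4x4 input activation tile: B^T d B."""
    return matmul(matmul(B_T, tile), transpose(B_T))


def transform_weight(kernel):
    """Transform a 3x3 weight kernel: G g G^T."""
    return matmul(matmul(G, kernel), transpose(G))


def scale_factor(tensor, qmax=127):
    """Assumed scale: map the largest magnitude onto the int8 limit."""
    max_abs = max(abs(v) for row in tensor for v in row)
    return qmax / max_abs if max_abs else 1.0


def quantize(tensor, scale, qmax=127):
    """Scale, round, and clamp every element to the signed 8-bit range."""
    return [[max(-qmax, min(qmax, round(v * scale))) for v in row] for row in tensor]


if __name__ == "__main__":
    kernel = [[1.0] * 3 for _ in range(3)]
    u = transform_weight(kernel)
    s = scale_factor(u)
    print(s, quantize(u, s))
```

Computing the scale after the transform, as the abstract describes, matters because the transformed tensors can have a wider value range than the originals: here a kernel of ones transforms to values as large as 2.25.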
Abstract:
Methods, apparatuses and storage media associated with execution of application code having multiple instruction set architectures (ISAs) are disclosed. In various embodiments, a runtime environment may execute application code having multiple instruction set architectures. The runtime environment may be configured to execute first code of the application code according to a first instruction set architecture, and to execute second code of the application code according to a second instruction set architecture that extends the first instruction set architecture. Using gates, the runtime environment may adapt an interaction from the first code to the second instruction set architecture and/or adapt an interaction from the second code to the first instruction set architecture and, subsequently, return to executing the application code according to the first instruction set architecture or the second instruction set architecture, respectively. Other embodiments may be disclosed or claimed.
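The gate mechanism can be pictured with a toy model. This is a sketch under stated assumptions, not the disclosed implementation: the `Runtime` class, the `gate` context manager, the `call_through_gate` helper, and the zero-extension adapter are all hypothetical, and the ISAs are represented only by labels. It shows a gate switching the execution mode, adapting the arguments of the interaction, and restoring the caller's mode on return.

```python
from contextlib import contextmanager


class Runtime:
    """Toy runtime that tracks which ISA the current code executes under."""

    def __init__(self, isa="ISA1"):
        self.isa = isa
        self.trace = []

    @contextmanager
    def gate(self, target_isa):
        """Switch to target_isa for the call, then return to the caller's ISA."""
        caller_isa = self.isa
        self.trace.append(f"{caller_isa}->{target_isa}")
        self.isa = target_isa
        try:
            yield
        finally:
            self.isa = caller_isa
            self.trace.append(f"{target_isa}->{caller_isa}")


def call_through_gate(runtime, target_isa, func, *args, adapt=lambda a: a):
    """Adapt the arguments, then invoke func under the target ISA."""
    with runtime.gate(target_isa):
        return func(runtime, *adapt(args))


def zero_extend_32(args):
    """Hypothetical adapter: present 32-bit values as unsigned wide values."""
    return tuple(a & 0xFFFFFFFF for a in args)


def wide_add(runtime, x, y):
    """Stand-in for second code that only runs under the extended ISA."""
    assert runtime.isa == "ISA2"
    return x + y


if __name__ == "__main__":
    rt = Runtime()
    print(call_through_gate(rt, "ISA2", wide_add, -1, 1, adapt=zero_extend_32))
    print(rt.isa, rt.trace)
```

Restoring the mode in a `finally` clause means the runtime returns to the caller's ISA even if the called code raises, which is one way to model the "subsequently, return" step.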