Abstract:
Various embodiments include nested emulation for a source application and source emulator. Duplicate source ISA libraries redirect the source emulator library calls to a target library, thereby forcing the native emulator through proper emulation channels between first and second ISAs. Other embodiments concern accelerating dynamic linking by determining certain function calls that, rather than being processed through emulation of PLT code, are instead directly called without the need for PLT code translation. Some embodiments address both nested emulation and accelerated dynamic linking but other embodiments include one of nested emulation and accelerated dynamic linking. Other embodiments are described herein.
Abstract:
Apparatuses, methods and storage media associated with multiple processor modes execution are described herein. In embodiments, an apparatus may include a processor with a plurality of processor modes, including a first processor mode to address a first address space, and a second processor mode to address a second address space, the second address space including the first address space. The apparatus may further include a signal handler to handle a signal from a kernel, in the first processor mode; and a signal handler wrapper to switch the processor to the second processor mode on delivery of the signal from the kernel, save a current extra context of the second processor mode from the second register file to a user stack, switch the processor back to the first processor mode, then invoke the signal handler to handle the signal. Other embodiments may be described or claimed.
Abstract:
Systems and methods may provide translation cache closure and consistent data recovery in dynamic code generating system. An apparatus may group translation cache together and restore a translation cache snapshot as a whole. Chaining between translations may be maintained during saving and restoration.
Abstract:
Systems, apparatuses and methods may provide technology for optimizing an inference neural network model that performs asymmetric quantization by generating a quantized neural network, wherein model weights of the neural network are quantized as signed integer values, and wherein an input layer of the neural network is configured to quantize input values as unsigned integer values, generating a weights accumulation table based on the quantized model weights and a kernel size for the neural network, and generating an output restoration function for an output layer of the neural network based on the weights accumulation table and the kernel size. The technology may also perform per-input channel quantization. The technology may also perform mixed-precision auto-tuning.
Abstract:
Method, systems and apparatuses may provide for technology that divides an application into a plurality of portions that are each associated with one or more functions of the application and determine a plurality of transition probabilities between the plurality of portions. Some technology may also receive at least a first portion of the plurality of portions, and receive a relation file indicating the plurality of transition probabilities between the plurality of portions.
Abstract:
Systems and methods may provide translation cache closure and consistent data recovery in dynamic code generating system. An apparatus may group translation cache together and restore a translation cache snapshot as a whole. Chaining between translations may be maintained during saving and restoration.
Abstract:
Apparatuses, methods and storage media associated with multiple processor modes execution are described herein. In embodiments, an apparatus may include a processor with a plurality of processor modes, including a first processor mode to address a first address space, and a second processor mode to address a second address space, the second address space including the first address space. The apparatus may further include a signal handler to handle a signal from a kernel, in the first processor mode; and a signal handler wrapper to switch the processor to the second processor mode on delivery of the signal from the kernel, save a current extra context of the second processor mode from the second register file to a user stack, switch the processor back to the first processor mode, then invoke the signal handler to handle the signal. Other embodiments may be described or claimed.
Abstract:
Method, systems and apparatuses may provide for technology that divides an application into a plurality of portions that are each associated with one or more functions of the application and determine a plurality of transition probabilities between the plurality of portions. Some technology may also receive at least a first portion of the plurality of portions, and receive a relation file indicating the plurality of transition probabilities between the plurality of portions.
Abstract:
Various embodiments include nested emulation for a source application and source emulator. Duplicate source ISA libraries redirect the source emulator library calls to a target library, thereby forcing the native emulator through proper emulation channels between first and second ISAs. Other embodiments concern accelerating dynamic linking by determining certain function calls that, rather than being processed through emulation of PLT code, are instead directly called without the need for PLT code translation. Some embodiments address both nested emulation and accelerated dynamic linking but other embodiments include one of nested emulation and accelerated dynamic linking. Other embodiments are described herein.
Abstract:
Methods, apparatuses and storage medium associated with execution of application code having multiple ISAs, are disclosed. In various embodiments, a runtime environment may execute application code having multiple instruction set architectures. The runtime environment may be configured to execute first code of the application code according to a first instruction set architecture, while also configured to execute second code of the application code according to a second instruction set architecture that extends the first instruction set architecture. Using gates, the runtime environment may be adapted to adapt an interaction from the first code to the second instruction set architecture and/or adapt an interaction from the second code to the first instruction set architecture and, subsequently, return to executing the application code according to the first instruction set architecture or the second instruction set architecture, respectively. Other embodiments may be disclosed or claimed.