摘要:
In the various aspects, a virtual machine operating at the machine layer may use power consumption models to partition object code into portions, identify the relative power efficiencies of the mobile device processors for the various code portions, and route the code portions to the mobile device processors that can perform the operations using the least amount of energy. A dynamic binary translator process may translate the object code portions into an instruction set language supported by the hardware component identified as being preferred. The code portions may be executed and the amount of power consumed may be measured, with the measurements used to generate and/or update performance and power consumption models.
摘要:
One embodiment of the present invention provides a system that facilitates precise exception semantics for a virtual machine. During operation, the system executes a program in the virtual machine using a processor that includes a gated store buffer that stores values to be written to a memory. This gated store buffer is configured to delay a store to the memory until after a speculatively-optimized region of the program commits. The processor signals an exception when it detects that a load following the store is attempting to access the same memory region being written by the store prior to the commitment of the speculatively-optimized region.
摘要:
The aspects enable a computing device to allocate memory space to variables during runtime compilation of a software application. A compiler may be modified to identify operations that can be performed on either a main pipe or an alternative pipe, identify chains of related operations that can be performed on either the main pipe or the alternative pipe, identify points in the execution of code at which the number of live values will exceed the number of registers, and choosing a chain of operations as a candidate to be moved to the alternative pipe in order to reduce the number of live values at identified points in the execution of code. The entire chosen chain of operations may be moved to the alternative pipe. The alternative pipe may perform the computations and return the results to the main pipe for execution.
摘要:
One embodiment of the present invention provides a system that improves program performance by enregistering memory locations. During operation, the system receives program object code which has been generated to use a specified number of registers that are available for a given target hardware implementation. Next, the system translates this object code to execute on a second hardware implementation which includes more registers than the first hardware implementation. The system uses these additional registers to improve the performance of the translated object code for the second hardware implementation. More specifically, the system identifies a memory access in the object code, and then rewrites an instruction associated with this memory access to access an available register instead of the original target memory location. To preserve program semantics, the system subsequently moderates accesses to the memory location to ensure that no threads access a stale value in the enregistered memory location.
摘要:
The aspects enable a computing device to allocate memory space to variables during runtime compilation of a software application. A compiler may be modified to identify operations that can be performed on either a main pipe or an alternative pipe, identify chains of related operations that can be performed on either the main pipe or the alternative pipe, identify points in the execution of code at which the number of live values will exceed the number of registers, and choosing a chain of operations as a candidate to be moved to the alternative pipe in order to reduce the number of live values at identified points in the execution of code. The entire chosen chain of operations may be moved to the alternative pipe. The alternative pipe may perform the computations and return the results to the main pipe for execution.
摘要:
A method for implementing a just-in-time compiler involves obtaining high-level code templates in a high-level programming language, where the high-level programming language is designed for compilation to an intermediate language capable of execution by a virtual machine, and where each high-level code template represents an instruction in the intermediate language. The method further involves compiling the high-level code templates to native code to obtain optimized native code templates, where compiling the high-level code templates is performed, prior to runtime, using an optimizing static compiler designed for runtime use with the virtual machine. The method further involves implementing the just-in-time compiler using the optimized native code templates, where the just-in-time compiler is configured to substitute an optimized native code template when a corresponding instruction in the intermediate language is encountered at runtime.
摘要:
In the various aspects, a virtual machine operating at the machine layer may use power consumption models to partition object code into portions, identify the relative power efficiencies of the mobile device processors for the various code portions, and route the code portions to the mobile device processors that can perform the operations using the least amount of energy. A dynamic binary translator process may translate the object code portions into an instruction set language supported by the hardware component identified as being preferred. The code portions may be executed and the amount of power consumed may be measured, with the measurements used to generate and/or update performance and power consumption models.
摘要:
One embodiment provides a system that protects translated guest program code in a virtual machine that supports self-modifying program code. While executing a guest program in the virtual machine, the system uses a guest shadow page table associated with the guest program and the virtual machine to map a virtual memory page for the guest program to a physical memory page on the host computing device. The system then uses a dynamic compiler to translate guest program code in the virtual memory page into translated guest program code (e.g., native program instructions for the computing device). During compilation, the dynamic compiler stores in a compiler shadow page table and the guest shadow page table information that tracks whether the guest program code in the virtual memory page has been translated. The compiler subsequently uses the information stored in the guest shadow page table to detect attempts to modify the contents of the virtual memory page. Upon detecting such an attempt, the system invalidates the translated guest program code associated with the virtual memory page.
摘要:
One embodiment of the present invention provides a system that facilitates precise exception semantics for a virtual machine. During operation, the system executes a program in the virtual machine using a processor that includes a gated store buffer that stores values to be written to a memory. This gated store buffer is configured to delay a store to the memory until after a speculatively-optimized region of the program commits. The processor signals an exception when it detects that a load following the store is attempting to access the same memory region being written by the store prior to the commitment of the speculatively-optimized region.
摘要:
A system and method are provided for inlining a program call between processes executing under separate ISAs (Instruction Set Architectures) within a system virtual machine. The system virtual machine hosts any number of virtual operating system instances, each of which may execute any number of applications. The system virtual machine interprets or dynamically compiles not only application code executing under virtual operating systems, but also the virtual operating systems. For a program call that crosses ISA boundaries, the virtual machine assembles an intermediate representation (IR) graph that spans the boundary. Region nodes corresponding to code on both sides of the call are enhanced with information identifying the virtual ISA of the code. The IR is optimized and used to generate instructions in a native ISA (Instruction Set Architecture) of the virtual machine. Individual instructions are configured and executed (or emulated) to perform as they would within the virtual ISA.