Abstract:
Hardware-assisted program tracing is facilitated by a processor that includes a root instruction address register, a program trace signature computation unit, and a call signature register. When a program instruction whose address matches the root instruction address register is executed, a program trace signature is captured in the call signature register and capture of branch history commences. By accumulating different values of the call signature register, for example in response to an interrupt generated when the root instruction is executed, program tracing software can obtain signatures of all of the execution paths that lead to the root instruction. The root instruction is itself specified by software, so that different root instructions can be set for program tracing. In an alternative implementation, storage for multiple call signatures is provided in the processor and is read at once by the software.
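The capture step can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `TracingCore`, the helper `collect_signatures`, and the multiply-and-add signature function are hypothetical and are not taken from the abstract, which does not say how the signature is computed.

```python
class TracingCore:
    """Toy software model of the described processor state (names are hypothetical)."""

    def __init__(self, root_address):
        self.root_address = root_address  # root instruction address register
        self.running_signature = 0        # state of the signature computation unit
        self.call_signature = None        # call signature register
        self.branch_history = []          # filled only after the root instruction
        self.capturing = False

    def execute(self, address, branch_taken=None):
        """Execute one instruction; branch_taken is None for non-branches."""
        if address == self.root_address:
            # Root instruction reached: latch the signature of the path so far
            # and start capturing branch history.
            self.call_signature = self.running_signature
            self.branch_history = []
            self.capturing = True
        elif self.capturing and branch_taken is not None:
            self.branch_history.append(branch_taken)
        # Assumed signature function: fold each executed address into a 32-bit value.
        self.running_signature = (self.running_signature * 31 + address) & 0xFFFFFFFF


def collect_signatures(root_address, paths):
    """Software side: accumulate the distinct call signatures seen over several runs."""
    seen = set()
    for path in paths:
        core = TracingCore(root_address)
        for address in path:
            core.execute(address)
        if core.call_signature is not None:
            seen.add(core.call_signature)
    return seen


paths = [[0x10, 0x20, 0x40], [0x10, 0x30, 0x40], [0x10, 0x20, 0x40]]
signatures = collect_signatures(0x40, paths)
```

Here two of the three runs follow the same path to the root instruction at `0x40`, so the software ends up with two distinct signatures.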
Abstract:
The disclosure provides a method, data processing system, and computer program product for managing a branch trace environment. In response to a branch being taken for a first branch instruction that is conditional and direct in the branch instructions, a performance monitoring unit stores an effective address of the first branch instruction into a first entry in a set of entries in a memory. The performance monitoring unit counts each branch not taken in processing the branch instructions occurring after the first branch instruction to form a branch count. In response to a branch being taken during processing of subsequent branch instructions in the branch instructions after the first branch instruction, the performance monitoring unit determines whether to create a second entry in the set of entries in the memory using the branch count with a set of rules identifying when the second entry is to be made.
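The counting scheme can be sketched in software as follows. The abstract does not state the rules that decide when a second entry is made, so the rule is passed in as a hypothetical callable `should_record`; the function name `trace_branches`, the entry layout, and resetting the count only when an entry is created are likewise assumptions made for illustration.

```python
def trace_branches(events, should_record):
    """Build branch trace entries from (effective_address, taken) events.

    events: conditional direct branches in program order.
    should_record: hypothetical rule taking the not-taken count, returning bool.
    Returns a list of (effective_address, not_taken_count) entries.
    """
    entries = []
    not_taken_count = 0
    for address, taken in events:
        if not taken:
            # Branches not taken are only counted once a first entry exists.
            if entries:
                not_taken_count += 1
            continue
        if not entries:
            # First taken branch: always stored as the first entry.
            entries.append((address, 0))
        elif should_record(not_taken_count):
            entries.append((address, not_taken_count))
            not_taken_count = 0  # assumption: the count restarts after a new entry
    return entries


events = [
    (0x100, False),
    (0x104, True),
    (0x108, False),
    (0x10C, False),
    (0x110, True),
    (0x114, True),
]
# Illustrative rule only: record when at least two branches were not taken.
entries = trace_branches(events, lambda count: count >= 2)
```

With this illustrative rule, the taken branch at `0x110` gets an entry because two branches were skipped before it, while the one at `0x114` does not.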
Abstract:
A method for a hybrid code signature including executing, via a processor, an application, the executing comprising executing a root instruction of the application; profiling, via the processor, the executing of the application, the profiling comprising storing a reference signature; determining, via the processor, a working signature of instructions executed subsequent to the executing of the root instruction, the determining comprising implementing a hashing function of the instructions in response to storing the reference signature; tracking the updating of the working signature by storing a value in a counter; and updating continuously, via the processor, the working signature with the hashing function while at least the working signature does not match the reference signature.
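A minimal sketch of the working-signature loop follows, assuming CRC-32 as the hashing function and byte strings as stand-ins for instructions. Neither choice comes from the abstract, and the function names `profile` and `run_until_match` are hypothetical.

```python
import zlib


def profile(instructions):
    """Profiling pass: hash a known instruction sequence into a reference signature."""
    signature = 0
    for instruction in instructions:
        signature = zlib.crc32(instruction, signature)
    return signature


def run_until_match(instructions, reference_signature):
    """Update the working signature per instruction until it matches the reference.

    Returns (working_signature, update_count, matched).
    """
    working_signature = 0
    update_count = 0  # counter tracking how often the working signature was updated
    for instruction in instructions:
        working_signature = zlib.crc32(instruction, working_signature)
        update_count += 1
        if working_signature == reference_signature:
            return working_signature, update_count, True
    return working_signature, update_count, False


# Instructions executed after the root instruction (illustrative byte strings).
executed = [b"load", b"add", b"store", b"branch"]
reference = profile(executed[:3])
working, updates, matched = run_until_match(executed, reference)
```

In this run the working signature reaches the reference after the third instruction, so updating stops there and the counter records three updates.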
Abstract:
Mechanisms are provided for data placement optimization during runtime of a computer program. The mechanisms detect cache misses in a cache of the data processing system and collect cache miss information for objects of the computer program. Data context information is generated for an object in an object access sequence of the computer program. The data context information identifies one or more additional objects accessed as part of the object access sequence in association with the object. The cache miss information is correlated with the data context information of the object. Data placement optimization is performed on the object, in the object access sequence, with which the cache miss information is associated. The data placement optimization places connected objects in the object access sequence in close proximity to each other in a memory structure of the data processing system.
Abstract:
This invention describes a method and several variants for compiling programs or components of programs in a mixed static and dynamic environment, so as to reduce the amount of time and memory spent in run-time compilation, or to exercise greater control over testing of the executable code for the program, or both. The invention involves generating persistent code images prior to program execution, based on static compilation or on dynamic compilation from a previous run, and then adapting those images during program execution. We describe a method for generating auxiliary information in addition to the executable code that is recorded in the persistent code image. Further, we describe a method for checking the validity of those code images, adapting those images to the new execution context, and generating new executable code to respond to dynamic events during program execution. Our method allows global interprocedural optimizations to be performed on the program, even if the programming language supports, or requires, dynamic binding. Variants of the method show how one or several of the features of the method may be performed. The invention is particularly useful in the context of implementing Java Virtual Machines, although it can also be used in implementing other programming languages.
Abstract:
A system and method for cache replacement includes: augmenting each cache block in a cache region with a region hint indicating a temporal priority of the cache block; receiving an indication that a cache miss has occurred; and selecting for eviction the cache block comprising the region hint indicating a low temporal priority.
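The eviction choice can be sketched for a single cache set as below. The integer encoding of the region hint (smaller means lower temporal priority), the tie-breaking by position, and the names `CacheBlock`, `select_victim`, and `access` are assumptions for illustration and are not specified in the abstract.

```python
from dataclasses import dataclass


@dataclass
class CacheBlock:
    tag: int
    region_hint: int  # assumed encoding: smaller value = lower temporal priority


def select_victim(cache_set):
    """Return the index of the block whose region hint has the lowest priority."""
    # min() keeps the first block on ties, an arbitrary choice made for this sketch.
    return min(range(len(cache_set)), key=lambda i: cache_set[i].region_hint)


def access(cache_set, capacity, tag, region_hint):
    """Look up a tag; on a miss, evict by region hint if the set is full.

    Returns True on a hit and False on a miss.
    """
    for block in cache_set:
        if block.tag == tag:
            return True
    if len(cache_set) >= capacity:
        del cache_set[select_victim(cache_set)]
    cache_set.append(CacheBlock(tag, region_hint))
    return False


cache_set = []
access(cache_set, 2, tag=1, region_hint=3)
access(cache_set, 2, tag=2, region_hint=0)
access(cache_set, 2, tag=3, region_hint=2)  # miss on a full set: evicts tag 2
remaining_tags = [block.tag for block in cache_set]
```

The third access misses on a full set, and the block with tag 2 is evicted because its hint marks the lowest temporal priority.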
Abstract:
A system for predicting multiple targets for a single branch includes: a branch target buffer that includes a previous next address for an instruction and that receives an indirect instruction address to provide a first branch target prediction; a first branch table for capturing local past target information of an indirect branch in an encoded form; a second branch table which is a correlation table for storing potential branch targets based on a local branch history and which provides a second branch target prediction when the first branch target prediction is not successful; an exclusion predictor for inhibiting updates of inefficient entries; and a multiplexer to select the predicted target as output.
Abstract:
A method and several variants are described for using information about the scope of access of objects acted upon by mutual exclusion, or mutex, locks to transform a computer program by eliminating locking operations from the program or simplifying the locking operations, while strictly preserving the semantics of the original program. In particular, if a compiler can determine that the locked object can only be accessed by a single thread, it is not necessary to perform the “acquire” or “release” part of the locking operation, and only its side effects must be performed. Likewise, if it can be determined that the side effects of a locking operation acting on a variable which is locked in multiple threads are not needed, then only the locking operation, and not the side effects, needs to be performed. This simplifies the locking operation and leads to faster programs that use fewer computer processor resources and perform fewer shared memory accesses, which in turn causes not only the optimized program but also other programs executing on the same computing machine to execute faster. The method also describes how information about the semantics of the locking operation side effects and information about the scope of access can be used to eliminate the side effect parts of the locking operation, thereby completely eliminating the locking operation. The method also describes how to analyze the program to compute the necessary information about the scope of access. Variants of the method show how one or several of the features of the method may be performed.
Abstract:
An apparatus includes a processor for executing instructions at runtime and instructions for dynamically compiling the set of instructions executing at runtime. A memory device stores the instructions to be executed and the dynamic compiling instructions. A memory device serves as a trace buffer that stores traces while they are being formed during the dynamic compiling. The dynamic compiling instructions include a next-executing-cycle (N-E-C) trace selection process for forming traces for the instructions executing at runtime. When forming traces, the N-E-C trace selection process continues through an existing trace-head, without terminating the recording of the current trace when an existing trace-head is encountered.
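The difference between stopping at an existing trace-head and continuing through it can be shown with a small sketch. The function `record_trace`, the block labels, and the length cap used as the end condition are hypothetical; the abstract does not say what ends an N-E-C trace.

```python
def record_trace(executed_blocks, trace_heads, continue_through_heads, max_blocks=8):
    """Record a trace from a sequence of executed basic blocks.

    trace_heads: blocks that already start an existing trace.
    continue_through_heads: True models the continue-through behavior described
    above; False models a selector that ends the trace at an existing trace-head.
    max_blocks: hypothetical cap standing in for the real end condition.
    """
    trace = []
    for position, block in enumerate(executed_blocks):
        hits_existing_head = position > 0 and block in trace_heads
        if hits_existing_head and not continue_through_heads:
            break  # terminate the current recording at the existing trace-head
        if len(trace) == max_blocks:
            break
        trace.append(block)
    return trace


executed_blocks = ["A", "B", "C", "D"]
trace_heads = {"C"}
stopping_trace = record_trace(executed_blocks, trace_heads, continue_through_heads=False)
continuing_trace = record_trace(executed_blocks, trace_heads, continue_through_heads=True)
```

With block `C` already a trace-head, the stopping selector records only `A` and `B`, while the continue-through selector records all four blocks in one trace.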