Abstract:
Computer-implemented method, computerized apparatus and a computer program product for test selection. The computer-implemented method comprising: obtaining a test suite comprising a plurality of tests for a Software Under Test (SUT); and selecting a subset of the test suite, wherein the subset provides coverage of the SUT that correlates to a coverage by a workload of the SUT, wherein the workload defines a set of input events to the SUT thereby defining portions of the SUT that are to be invoked during execution.
Abstract:
Detecting optimization opportunities is enabled by utilizing a trace of a target concurrent computer program and determining a relation between data objects accessed during the tracked execution. The relation may be stored in a Temporal Relation Graph (TRG), in an extended-TRG or another data structure. The relation may be affected by temporally-adjacent accesses to data objects. The relation may further be affected by accesses to data objects performed during critical sections of the target program.
Abstract:
An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to pack the packed data responsive to a pack instruction received by the decoder. A first packed data element and a second packed data element are received from the first source register. A third packed data element and a fourth packed data element are received from the second source register. The circuit packs packing a portion of each of the packed data elements into a destination register resulting with the portion from second packed data element adjacent to the portion from the first packed data element, and the portion from the fourth packed data element adjacent to the portion from the third packed data element.
Abstract:
A processor. The processor includes a first register for storing a first packed data, a decoder, and a functional unit. The decoder has a control signal input. The control signal input is for receiving a first control signal and a second control signal. The first control signal is for indicating a pack operation. The second control signal is for indicating an unpack operation. The functional unit is coupled to the decoder and the register. The functional unit is for performing the pack operation and the unpack operation using the first packed data. The processor also supports a move operation.
Abstract:
Systems and methods for cache optimization are provided. The method comprises tracing objects instantiated during execution of a program code under test according to type of access by one or more threads running in parallel, wherein said tracing provides information about order in which different instances of one or more objects are accessed by said one or more threads and whether the type of access is a read operation or a write operation; and utilizing tracing information to build a temporal relationship graph (TRG) for the accessed objects, wherein the objects are represented by nodes in the TRG and at least two types of edges for connecting the nodes are defined.
Abstract:
An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
Abstract:
An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
Abstract:
A processor. The processor includes a first register for storing a first packed data, a decoder, and a functional unit. The decoder has a control signal input. The control signal input is for receiving a first control signal and a second control signal. The first control signal is for indicating a pack operation. The second control signal is for indicating an unpack operation. The functional unit is coupled to the decoder and the register. The functional unit is for performing the pack operation and the unpack operation using the first packed data. The processor also supports a move operation.
Abstract:
Optimizing a computer program by setting a first compiler optimization configuration for a first entity of a computer program, setting a second compiler optimization configuration for a second entity of the computer program, where the first and second entities are of the same type and where the first and second compiler optimization configurations differ, and compiling the computer program in accordance with the compiler optimization configurations, thereby creating a compiled program.
Abstract:
A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. One embodiment of a central processing unit (CPU) includes instruction fetch logic to fetch a single-instruction-multiple-data (SIMD) shift instruction. A register stores a multiple data elements to be operated upon by the SIMD shift instruction. A barrel shifter concurrently shifts the data elements in a bit-wise manner by a variable number of bit positions in response to the SIMD shift instruction.