Abstract:
The present disclosure relates to a non-transitory computer-readable recording medium storing an analysis program that causes a computer to execute a process. The process includes sampling an instruction address of one of instructions included in a program during execution of the program, identifying a first function that includes the sampled instruction address in an address range, rewriting mark information associated with the identified first function, identifying first information corresponding to the instruction address of the first function among a plurality of first information based on the rewritten mark information, identifying second information corresponding to the instruction address of the first function among a plurality of second information based on the rewritten mark information, storing the first information and the second information in a memory, and analyzing performance of the program based on the first information and the second information stored in the memory.
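The step of identifying the function whose address range contains a sampled instruction address can be sketched as a binary search over a sorted symbol table. This is a minimal illustration, not the disclosed implementation: the table contents and the names `FUNCTIONS`, `find_function`, and `count_samples` are hypothetical, and the mark-information rewriting described in the abstract is not modeled.

```python
import bisect
from collections import Counter

# Hypothetical symbol table: (start, end_exclusive, name), sorted by start address.
FUNCTIONS = [
    (0x1000, 0x1040, "init"),
    (0x1040, 0x10C0, "compute"),
    (0x2000, 0x2010, "cleanup"),
]
STARTS = [start for start, _, _ in FUNCTIONS]


def find_function(address):
    """Return the name of the function whose range contains address, or None."""
    # Index of the last function starting at or before the address.
    i = bisect.bisect_right(STARTS, address) - 1
    if i < 0:
        return None
    _, end, name = FUNCTIONS[i]
    return name if address < end else None


def count_samples(addresses):
    """Count how many sampled addresses fall inside each known function."""
    hits = Counter()
    for address in addresses:
        name = find_function(address)
        if name is not None:
            hits[name] += 1
    return hits
```

Samples that land in a gap between functions (for example between `compute` and `cleanup` above) are simply ignored in this sketch; a real profiler would likely attribute them to an "unknown" bucket.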
Abstract:
A Conditional Transaction End (CTEND) instruction is provided that allows a program executing in a nonconstrained transactional execution mode to inspect a storage location that is modified by either another central processing unit or the Input/Output subsystem. Based on the inspected data, transactional execution may be ended or aborted, or the decision to end or abort may be delayed, e.g., until a predefined event occurs. For instance, when the instruction executes while the processor is in the nonconstrained transactional execution mode and the transaction nesting depth is one at the beginning of the instruction, a second operand of the instruction is inspected. The predefined event may be, for example, the value of the second operand becoming a prespecified value or a time interval being exceeded.
Abstract:
A method of encapsulating a long instruction in a set of short instructions for execution on a processor, the long instruction having k bits and each short instruction having l bits where l
Abstract:
A predecode repair cache is described in a processor capable of fetching and executing variable length instructions having instructions of at least two lengths which may be mixed in a program. An instruction cache is operable to store in an instruction cache line instructions having at least a first length and a second length, the second length longer than the first length. A predecoder is operable to predecode instructions fetched from the instruction cache that have invalid predecode information to form repaired predecode information. A predecode repair cache is operable to store the repaired predecode information associated with instructions of the second length that span across two cache lines in the instruction cache. Methods for filling the predecode repair cache and for executing an instruction that spans across two cache lines are also described.
Abstract:
An apparatus extracts instructions from a stream of undifferentiated instruction bytes in a microprocessor having an instruction set architecture in which the instructions are variable length. Decoders generate an associated start/end mark for each instruction byte of a line from a first queue of entries each storing a line of instruction bytes. A second queue has entries each storing a line received from the first queue along with the associated start/end marks. Control logic detects a condition where the length of an instruction whose initial portion within a first line in the first queue is yet undeterminable because the instruction's remainder resides in a second line yet to be loaded into the first queue from the instruction cache; loads the first line and corresponding start/end marks into the second queue and refrains from shifting the first line out of the first queue, in response to detecting the condition; and extracts instructions from the first line in the second queue based on the corresponding start/end marks. The instructions exclude the yet undeterminable length instruction.
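The extraction step based on start/end marks can be sketched as follows. The abstract does not specify how the marks are encoded, so this sketch assumes one start flag and one end flag per instruction byte; the function name `extract_instructions` and the return convention are hypothetical.

```python
def extract_instructions(line, start_marks, end_marks):
    """Split a line of instruction bytes into complete instructions.

    Returns (instructions, pending_start), where pending_start is the index
    of an instruction that begins in this line but has no end mark yet
    (its remainder is assumed to lie in the next line), or None.
    """
    instructions = []
    begin = None
    for i in range(len(line)):
        if start_marks[i]:
            begin = i
        if end_marks[i] and begin is not None:
            instructions.append(bytes(line[begin:i + 1]))
            begin = None
    return instructions, begin
```

An instruction whose start mark has no matching end mark in the line is excluded from the result, mirroring the statement that the extracted instructions exclude the instruction of yet undeterminable length.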
Abstract:
A method and apparatus are presented for identifying instructions in a stream of information by preprocessing the stream of information, creating a vector of instructions, and breaking the vector of instructions into two or more vectors for picking the identified instructions at a high frequency.
Abstract:
When a branch instruction is decoded by the instruction decoders 409a to 409c, the upper 29 bits of the PC relative value included in the branch instruction are sent to the upper PC calculator 411 and the lower 3 bits are sent to the lower PC calculator 405. The lower PC calculator 405 adds the lower 3 bits of the PC relative value and the lower 3 bits of the present lower PC 404 and sends the result to the lower PC 404 as the updated lower PC. The upper PC calculator 411 adds the upper 29 bits of the PC relative value, the upper 29 bits of the present upper PC 403, and a carry that may be received from the lower PC calculator 405, and sends the result to the upper PC 403 as the updated upper PC.
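The split addition described above can be checked with a short arithmetic sketch. It assumes a 32-bit program counter (29 upper bits plus 3 lower bits) and a two's-complement PC relative value; the function names `split` and `update_pc` are hypothetical and do not correspond to the numbered elements of the disclosure.

```python
LOWER_BITS = 3
UPPER_BITS = 29
LOWER_MASK = (1 << LOWER_BITS) - 1
UPPER_MASK = (1 << UPPER_BITS) - 1


def split(value):
    """Split a 32-bit value into (upper 29 bits, lower 3 bits)."""
    value &= 0xFFFFFFFF  # two's-complement wrap for negative offsets
    return value >> LOWER_BITS, value & LOWER_MASK


def update_pc(upper_pc, lower_pc, pc_relative):
    """Add a PC relative value using separate lower and upper adders."""
    rel_upper, rel_lower = split(pc_relative)
    # Lower adder: 3-bit sum, with any overflow passed on as a carry.
    lower_sum = lower_pc + rel_lower
    new_lower = lower_sum & LOWER_MASK
    carry = lower_sum >> LOWER_BITS
    # Upper adder: 29-bit sum of both upper parts plus the carry.
    new_upper = (upper_pc + rel_upper + carry) & UPPER_MASK
    return new_upper, new_lower
```

For example, with a present PC of 0x1006 (upper 0x200, lower 6) and a relative value of 0x13, the lower adder produces 1 with a carry, and the upper adder produces 0x203, which recombines to 0x1019, the same result as a single 32-bit addition.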
Abstract:
A memory subsystem includes a first memory, a second memory, a first compressor, and a first decompressor. The first memory is configured to store instruction bytes of a fetch window and to store first predecode information and first branch information that characterizes the instruction bytes of the fetch window. The second memory is configured to store the instruction bytes of the fetch window upon eviction of the instruction bytes from the first memory and to store combined predecode/branch information that also characterizes the instruction bytes of the fetch window. The first compressor is configured to compress the first predecode information and the first branch information into the combined predecode/branch information. The first decompressor is configured to decode at least some of the instruction bytes stored in the second memory to convert the combined predecode/branch information into second predecode information, which corresponds to an uncompressed version of the first predecode information, for storage in the first memory.
Abstract:
In a pipelined processor where instructions are pre-decoded prior to being stored in a cache, an incorrectly pre-decoded instruction is detected during execution in the pipeline. The corresponding instruction is invalidated in the cache, and the instruction is forced to evaluate as a branch instruction. In particular, the branch instruction is evaluated as “mispredicted not taken” with a branch target address of the incorrectly pre-decoded instruction's address. This, with the invalidated cache line, causes the incorrectly pre-decoded instruction to be re-fetched from memory with a precise address. The re-fetched instruction is then correctly pre-decoded, written to the cache, and executed.