Abstract:
Embodiments relate to fingerprint-based code version selection. An aspect includes based on a call to a method being issued by calling software that is currently executing on a processor of a computer system, determining a fingerprint comprising a representation of a sequence of behavior that occurs in the processor while the calling software is executing. Another aspect includes, based on determining that the match for the fingerprint is located in the entry in the fingerprint table, executing the associated code version of the method. Another aspect includes, based on determining that no match for the fingerprint is located in any entry in the fingerprint table: determining a new code version of the method by a compiler of the computer system; storing the fingerprint with an identifier of the new code version in a new entry in the fingerprint table; and executing the new code version.
Abstract:
A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary.
Abstract:
A method in a computer-aided design system for generating a functional design model of a processor, is described herein. The method comprises detecting memory address information corresponding to accessed data in a first instruction, and detecting memory address information corresponding to accessed data in a second instruction. The method further comprises comparing the memory address information corresponding to the first instruction and the memory address information corresponding to the second instruction, and detecting, based on the comparison, that the accessed data in the first instruction and the accessed data in the second instruction are in a same data range of the memory device. In addition the method comprise executing the second instruction using the accessed data from the first instruction and detecting an error from the execution of the second instruction.
Abstract:
Embodiments relate to accessing data in a memory. A method for accessing data in a memory coupled to a processor is provided. The method receives a memory reference instruction for accessing data of a first size at an address in the memory. The method determines an alignment size of the address in the memory. The method accesses the data of the first size in one or more groups of data by accessing each group of data block concurrently. The groups of data have sizes that are multiples of the alignment size.
Abstract:
Embodiments of the present invention provide systems and methods for increasing the efficiency of data conversion in a coprocessor by using the statistical occurrence of data patterns to convert frequently occurring data patterns in one conversion cycle. In one embodiment, a coprocessor system is disclosed containing a converter engine, which includes a parser and a converter, an input buffer, and a result store. The input buffer is configured to transfer a set of source data to the converter engine, which converts the source data from first code format to a second code format, and sends the converted source data to the result store.
Abstract:
A system and method of implementing a modified priority routing of an input/output (I/O) interruption. The system and method determines whether the I/O interruption is pending for a core and whether any of a plurality of guest threads of the core is enabled for guest thread processing of the interruption in accordance with the determining that the I/O interruption is pending. Further, the system and method determines whether at least one of the plurality of guest threads enabled for guest thread processing is in a wait state and, in accordance with the determining that the at least one of the plurality of guest threads enabled for guest thread processing is in the wait state, routes the I/O interruption to a guest thread enabled for guest thread processing and in the wait state.
Abstract:
A mechanism for simultaneous multithreading is provided. Responsive to performing a store instruction for a given thread of threads on a processor core and responsive to the core having ownership of a cache line in a cache, an entry of the store instruction is placed in a given store queue belonging to the given thread. The entry for the store instruction has a starting memory address and an ending memory address on the cache line. The starting memory addresses through ending memory addresses of load queues of the threads are compared on a byte-per-byte basis against the starting through ending memory address of the store instruction. Responsive to one memory address byte in the starting through ending memory addresses in the load queues overlapping with a memory address byte in the starting through ending memory address of the store instruction, the threads having the one memory address byte is flushed.
Abstract:
Fetching a cache line into a plurality of caches of a multilevel cache system. The multilevel cache system includes at least a first cache, a second cache on a next higher level and a memory, the first cache being arranged to hold a subset of information of the second cache, the second cache being arranged to hold a subset of information of a next higher level cache or memory if no higher level cache exists. A fetch request is sent from one cache to the next cache in the multilevel cache system. The cache line is fetched in a particular state into one of the caches, and in another state into at least one of the other caches.
Abstract:
Embodiments relate to intra-instructional transaction abort handling. An aspect includes using an emulation routine to execute an instruction within a transaction. The instruction includes at least one unit of operation. The transaction effectively delays committing stores to memory until the transaction has completed successfully. After receiving an abort indication, emulation of the instruction is terminated prior to completing the execution of the instruction. The instruction is terminated after the emulation routine completes any previously initiated unit of operation of the instruction.
Abstract:
A hierarchical cache structure includes at least one real indexed higher level cache with a directory and a unified cache array for data and instructions, and at least two lower level caches, each split in an instruction cache and a data cache. An instruction cache of a split real indexed second level cache includes a directory and a corresponding cache array connected to the real indexed third level cache. A data cache of the split second level cache includes a directory connected to the third level cache. An instruction cache of a split virtually indexed first level cache is connected to the second level instruction cache. A cache array of a data cache of the first level cache is connected to the cache array of the second level instruction cache and to the cache array of the third level cache. A directory of the first level data cache is connected to the second level instruction cache directory and to the third level cache directory.