摘要:
A system for binary multiplication in a superscalar processor includes a first pipeline, an execution unit, and a first multiplexer; a first rotator in communication with one register of the first pipeline and the execution unit; and a leading zero detection register in communication with the execution unit and another register of the first pipeline; a second pipeline, a second execution unit, and a second multiplexer; a rotator in communication with one register of the second pipeline and the second execution unit; and a leading zero detection register in communication with the second execution unit and another register of the first pipeline; and a third pipeline, a binary multiplier in communication with a pair registers of the third pipeline; a general register; an operand buffer for obtaining first and second operands; and a bus for communication between the pipelines, the general register and the operand buffer.
摘要:
A method of pre-aligning data for storage during instruction execution improves performance by eliminating the cycles otherwise required for data alignment. The method can convert data between ASCII and Packed Decimal format, and between Unicode Basic Latin and Packed Decimal format. Conversion to Packed Decimal format is needed for decimal hardware in a microprocessor designed to generate decimal results. Converting from Packed Decimal to ASCII and Unicode Basic Latin is necessary to report Decimal Arithmetic results in a required format for the application program. To further improve performance, all available write ports in the fixed point unit (FXU) are utilized to reduce the number of cycles necessary to store results. To prevent data fetching of the unused destination data from slowing down instruction execution, the destination locations are tested for storage access exceptions, but the data for these operands are not actually fetched. A single read request from the FXU to the operand buffers effectively reads the entire destination address (up to 8 double-words of data) in a single cycle.
摘要:
Verification of a system-under-test (SUT) supporting the functionality of operating a self modifying code is disclosed. A generator may generate a self modifying code. In response to identification that a simulator is about to simulate code generated by the self modifying code, the simulator may simulate the execution in a “rollover mode”. The code may include instruction codes having variable byte size, branching instructions, loops or the like. The simulator may further simulate execution of an invalid instruction. The simulator may perform rollback the simulation of the rollover mode in certain cases and avoid entering the rollover mode. The simulator may perform rollback in response to identifying a termination condition, as to insure avoiding endless loops. The simulator may perform rollback in response to reading an initialized value that is indefinite.
摘要:
Verification of a system-under-test (SUT) supporting the functionality of operating a self modifying code is disclosed. A generator may generate a self modifying code. In response to identification that a simulator is about to simulate code generated by the self modifying code, the simulator may simulate the execution in a “rollover mode”. The code may include instruction codes having variable byte size, branching instructions, loops or the like. The simulator may further simulate execution of an invalid instruction. The simulator may perform rollback the simulation of the rollover mode in certain cases and avoid entering the rollover mode. The simulator may perform rollback in response to identifying a termination condition, as to insure avoiding endless loops. The simulator may perform rollback in response to reading an initialized value that is indefinite.
摘要:
A method for decimal multiplication in a superscaler processor comprising: obtaining a first operand and a second operand; establishing a multiplier and an effective multiplicand from the first operand and the second operand; and generating and accumulating a partial product term every two cycles. The partial product terms are created from the effective multiplicand and multiples of the multiplier, where the effective multiplicand is stored in a first register file, the multiples being ones times the effective multiplier, two times the effective multiplier, four times the effective multiplier and eight times the effective multiplier and the partial product terms are added to an accumulation of previous partial product terms shifted one digit right such that a digit shifted off is preserved as a result digit.
摘要:
A method of implementing binary multiplication in a processing device includes obtaining a multiplicand and a multiplier from a storage device; in the event the multiplier is larger than a selected length, partitioning the multiplier into a plurality of multiplier subgroups; in the event the multiplicand is larger than a selected length, partitioning the multiplicand into a plurality of multiplicand subgroups and at least one of zeroing out of unused bits of the multiplicand subgroup and sign-extending a smaller portion of the multiplicand subgroup; establishing a plurality of multiplicand multiples based on at least one of a selected multiplicand subgroup of the plurality of multiplicand subgroups and the multiplicand; selecting one or more of the multiplicand multiples of the plurality of multiplicand multiples based on the each multiplier subgroup of the plurality of multiplier subgroups; and generating a first modular product based on the selected multiplicand multiples.