Abstract:
A method and system for an infinite precision split multiply and add operation with increased speed. The method and system for providing a split multiply and add of a plurality of operands include a multiplier and an adder. The multiplier multiplies a first portion of the plurality of operands, thereby providing a product. The adder, which combines the remaining operands and the product, comprises at least one pair of data paths. Each pair of data paths comprises a first data path and a second data path. The first data path comprises a first aligner, a first adder, and a first normalizer capable of shifting a mantissa by a substantially smaller number of digits than the first aligner. The second data path comprises a second aligner, a second adder, and a second normalizer capable of shifting a mantissa by a substantially larger number of digits than the second aligner. Accordingly, the present invention includes split multiply and add data paths which, individually, are faster than a fused multiply and add. In addition, the split multiply and add data paths can preserve the appearance of infinite precision. Consequently, overall system performance is increased.
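The "appearance of infinite precision" can be illustrated in software, as a rough sketch only: the product is kept exact and the sum is rounded once, instead of rounding the product before the add. The function names below are hypothetical, and Python's `fractions.Fraction` stands in for the full-width intermediate product; none of this models the hardware data paths themselves.

```python
from fractions import Fraction


def multiply_add_two_roundings(a, b, c):
    """Round the product to a double first, then round the sum again."""
    return a * b + c


def multiply_add_single_rounding(a, b, c):
    """Keep the product exact and round only the final sum.

    Fraction is used here as a stand-in for an unrounded intermediate
    product; this is an illustration, not the patented circuit.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Operands chosen so that the exact product, 1 - 2**-60, cannot be
# represented in a double and is lost if the product is rounded early.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

early = multiply_add_two_roundings(a, b, c)
late = multiply_add_single_rounding(a, b, c)
print(early, late)
```

With these operands the early-rounding version loses the entire result, while the single-rounding version keeps it.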
Abstract:
A system and method for calculating a floating point add/subtract of a plurality of floating point operands is disclosed. The system comprises at least one pair of data paths. Each pair of data paths comprises a first data path and a second data path. The first data path includes a first aligner, a first adder coupled to the first aligner, and a first normalizer coupled to the first adder. The first normalizer is capable of shifting a mantissa by a substantially smaller number of digits than the first aligner. The second data path comprises control logic, a second aligner coupled to the control logic, a second adder coupled to the second aligner, and a second normalizer coupled to the second adder. The control logic provides a control signal that is responsive to a first predetermined number of digits of each exponent of a pair of exponents. The pair of exponents are the exponents for a pair of inputs to the second data path. The second aligner is responsive to the control signal provided by the control logic. In addition, the second normalizer is capable of shifting a mantissa by a substantially larger number of digits than the second aligner.
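The trade-off between alignment shift and normalization shift in the two data paths can be sketched with integer mantissas. The sketch assumes the usual dual-path arrangement, in which a "far" path handles large exponent differences and a "near" path handles subtraction of operands whose exponents differ by at most one; the labels "far" and "near", the function name, and the 8-bit mantissa width are assumptions, not terms from the abstract. Arithmetic is kept exact so that only the shift amounts are of interest.

```python
def dual_path_add(ma, ea, mb, eb, subtract=False, n=8):
    """Add or subtract two operands of value m * 2**e and report the shifts.

    ma and mb are n-bit mantissas with the top bit set. Returns
    (path, align_shift, norm_shift, result, result_exponent), where the
    result value is result * 2**result_exponent. A positive norm_shift is a
    left shift; -1 is a one-digit right shift after a carry out.
    """
    # Order the operands so that (ma, ea) is the larger one.
    if (ea, ma) < (eb, mb):
        ma, ea, mb, eb = mb, eb, ma, ea
    align_shift = ea - eb

    # Path selection (assumed rule): only subtraction of nearly equal
    # exponents can cancel many leading digits.
    path = "near" if subtract and align_shift <= 1 else "far"

    # Exact alignment: scale the larger operand up instead of discarding
    # low-order bits of the smaller one.
    wide = ma << align_shift
    result = wide - mb if subtract else wide + mb
    if result == 0:
        return path, align_shift, 0, 0, eb

    # Distance of the leading one from where it sits with no cancellation.
    expected_lead = n - 1 + align_shift
    norm_shift = expected_lead - (result.bit_length() - 1)
    return path, align_shift, norm_shift, result, eb


# Far path: large alignment shift, normalization shift of at most one digit.
print(dual_path_add(0b10000000, 5, 0b11111111, 0, subtract=True))
# Near path: alignment shift of one digit, long normalization shift.
print(dual_path_add(0b10000000, 1, 0b11111111, 0, subtract=True))
```

In the far-path call the aligner moves five digits and the normalizer one; in the near-path call the aligner moves one digit and the normalizer eight, which mirrors the asymmetry the abstract describes for the two data paths.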
Abstract:
A method and system for reducing the dispatch latency of instructions of a processor provide for reordering the instructions in a predetermined format before the instructions enter the cache. The method and system also store information in the cache relating to the reordering of the instructions. The reordered instructions are then provided to the appropriate execution units based upon the predetermined format. With this system, a dispatch buffer is not required when sending the instructions to the cache.
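As a minimal sketch of the idea, the reordering step can be modeled as sorting a fetch group into a fixed slot order by execution unit and keeping the permutation alongside the cached line. The slot order, the unit names, and the function names are all hypothetical; the abstract does not specify the predetermined format.

```python
# Hypothetical predetermined format: one fixed ordering of unit types.
SLOT_ORDER = ("branch", "load_store", "integer", "float")


def predecode(group):
    """Reorder a fetch group before it is written to the cache.

    group is a list of (opcode, unit) pairs in program order. Returns the
    reordered group and a tag where tag[k] is the original program index
    of the instruction now in slot k.
    """
    tag = sorted(range(len(group)), key=lambda i: SLOT_ORDER.index(group[i][1]))
    return [group[i] for i in tag], tag


def program_order(reordered, tag):
    """Recover the original program order from the stored tag."""
    restored = [None] * len(reordered)
    for slot, original_index in enumerate(tag):
        restored[original_index] = reordered[slot]
    return restored


group = [("add", "integer"), ("lw", "load_store"),
         ("fmul", "float"), ("beq", "branch")]
cached, tag = predecode(group)
print(cached, tag)
```

Because the cached line is already arranged by unit, each slot can be routed directly to its execution unit on a cache hit, and the stored tag is enough to recover program order where it is needed.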
Abstract:
A system and method for improving arbitration of a plurality of events that may require access to a cache is disclosed. In a first aspect, the method and system provide dynamic arbitration. The first aspect comprises first logic for determining whether at least one of the plurality of events requires access to the cache and for outputting at least one signal in response thereto. Second logic coupled to the first logic determines the priority of each of the plurality of events in response to the at least one signal and outputs a second signal specifying the priority of each event. Third logic coupled to the second logic grants access to the cache in response to the second signal. A second aspect of the method and system provides user programmable arbitration. The second aspect comprises a storage unit which allows the user to input information indicating the priority of at least one of the plurality of events and outputs a first signal in response to the information. In the second aspect, first logic coupled to the storage unit determines the priority of each of the plurality of events in response to the first signal and outputs a second signal indicating the priority of each event. Second logic coupled to the first logic grants access to the cache in response to the second signal.
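The second, user-programmable aspect can be sketched as a priority table followed by ranking and grant stages. The event names, the convention that a lower number means higher priority, and the function names are assumptions made for illustration only.

```python
def rank_events(priority_register, requests):
    """Ranking stage: order the requesting events by programmed priority.

    priority_register maps event name to priority, with lower numbers
    meaning higher priority (an assumed convention).
    """
    return sorted(requests, key=lambda event: priority_register[event])


def grant_access(ranked):
    """Grant stage: give the cache to the highest-ranked requester, if any."""
    return ranked[0] if ranked else None


# Hypothetical contents of the storage unit, as written by the user.
priority_register = {"snoop": 0, "load": 1, "store": 2, "prefetch": 3}

winner = grant_access(rank_events(priority_register, {"store", "prefetch"}))
print(winner)
```

Reprogramming `priority_register` changes the outcome without touching the ranking or grant logic, which is the point of the storage unit in the second aspect.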
Abstract:
A method and system for fast calculation of the sticky bit and a function of the guard bit is disclosed. A first aspect of the method and system provides a fast calculation of the sticky bit. A second aspect provides a fast calculation of a function of the guard bit. Both aspects comprise means for providing an intermediate result of a floating point mathematical operation involving at least a first and a second operand and means for providing a mask indicating a position of a leading one in a mantissa of the intermediate result. In the first aspect, means for aligning a first bit of the mask to an (n+2)nd bit of the intermediate result, where n is the number of bits in a mantissa of the first or second operand, are coupled to the intermediate result providing means. In the second aspect, means for aligning a first bit of the mask to an (n+1)st bit of the intermediate result are coupled to the intermediate result providing means. In both aspects, means for providing an output are coupled to the aligning means and intermediate result providing means. The output of the first aspect comprises the sticky bit. The output of the second aspect comprises a function of the guard bit. Thus, the method and system allow the sticky bit and a function of the guard bit to be calculated substantially simultaneously with normalization. Because the method and system allow fast determination of the sticky bit and a function of the guard bit, the overall speed of the calculation is increased and system performance is improved.
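The mask-based calculation can be sketched on an unnormalized integer result, under the assumption that the mask is derived from the position of the leading one and that the guard bit itself is the function of interest. The function names are hypothetical. Counting from the leading one, bits 1 to n form the mantissa, bit n+1 is the guard bit, and bits n+2 onward feed the sticky bit, so both can be read without first normalizing the result.

```python
def guard_and_sticky(result, n):
    """Read guard and sticky bits from an unnormalized result via masks."""
    if result == 0:
        return 0, 0
    lead = result.bit_length() - 1      # position of the leading one
    one_hot = 1 << lead                 # mask marking the leading one
    thermometer = (one_hot << 1) - 1    # ones from the leading one downward

    # Aligning the first mask bit to the (n+1)st result bit selects the guard.
    guard = int(result & (one_hot >> n) != 0)
    # Aligning the first mask bit to the (n+2)nd result bit covers every
    # bit that contributes to the sticky bit.
    sticky = int(result & (thermometer >> (n + 1)) != 0)
    return guard, sticky


def guard_and_sticky_reference(result, n):
    """Reference version: normalize first (as a bit string), then inspect."""
    if result == 0:
        return 0, 0
    bits = bin(result)[2:]              # leading one is bits[0]
    guard = int(bits[n]) if len(bits) > n else 0
    sticky = int("1" in bits[n + 1:])
    return guard, sticky


print(guard_and_sticky(0b101101101, 4))
```

The mask version needs only the leading-one position, which is why it can run alongside normalization rather than after it.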