摘要:
A method and system for early instruction text based operand store compare avoidance in a processor are provided. The system includes a processor pipeline for processing instruction text in an instruction stream, where the instruction text includes operand address information. The system also includes delay logic to monitor the instruction stream. The delay logic performs a method that includes detecting a load instruction following a store instruction in the instruction stream, comparing the operand address information of the store instruction with the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to detecting a common field value between the operand address information of the store instruction and the load instruction.
摘要:
A method of implementing binary multiplication in a processing device includes obtaining a multiplicand and a multiplier from a storage device; in the event the multiplier is larger than a selected length, partitioning the multiplier into a plurality of multiplier subgroups; in the event the multiplicand is larger than a selected length, partitioning the multiplicand into a plurality of multiplicand subgroups and at least one of zeroing out of unused bits of the multiplicand subgroup and sign-extending a smaller portion of the multiplicand subgroup; establishing a plurality of multiplicand multiples based on at least one of a selected multiplicand subgroup of the plurality of multiplicand subgroups and the multiplicand; selecting one or more of the multiplicand multiples of the plurality of multiplicand multiples based on the each multiplier subgroup of the plurality of multiplier subgroups; and generating a first modular product based on the selected multiplicand multiples.
摘要:
A system for binary multiplication in a superscalar processor includes a first pipeline, an execution unit, and a first multiplexer; a first rotator in communication with one register of the first pipeline and the execution unit; and a leading zero detection register in communication with the execution unit and another register of the first pipeline; a second pipeline, a second execution unit, and a second multiplexer; a rotator in communication with one register of the second pipeline and the second execution unit; and a leading zero detection register in communication with the second execution unit and another register of the first pipeline; and a third pipeline, a binary multiplier in communication with a pair registers of the third pipeline; a general register; an operand buffer for obtaining first and second operands; and a bus for communication between the pipelines, the general register and the operand buffer.
摘要:
A method and apparatuses for performing binary multiplication on signed and unsigned operands of various lengths is discussed herein. It is a concept that may be split into two parts, the first of which is the multiplication hardware itself, a compact, less than-full sized multiplier employing Booth or other type of recoding methods upon the multiplier to reduce the number of partial products per scan, and implemented in such a manner so that a multiplication operation with large operands may be broken into subgroups of operations that will fit into this mid-sized multiplier whose results, here called modular products, may be knitted back together to form a correct, final product. The second part of the concept is the supporting hardware used to separate the operands into subgroups and input the data and control signals to the multiplier, and the algorithms and apparatuses used to align and combine the modular products properly to obtain the final product. These algorithms used to obtain a result as specified by the operation may be as varied as the supporting hardware with which the multiplier may be used, making this multiplier a very flexible and powerful design.
摘要:
A method and system for early instruction text based operand store compare avoidance in a processor are provided. The system includes a processor pipeline for processing instruction text in an instruction stream, where the instruction text includes operand address information. The system also includes delay logic to monitor the instruction stream. The delay logic performs a method that includes detecting a load instruction following a store instruction in the instruction stream, comparing the operand address information of the store instruction with the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to detecting a common field value between the operand address information of the store instruction and the load instruction.
摘要:
Systems, methods and computer program products for exploiting orthogonal control vectors in timing driven systems. An exemplary embodiment includes running an initial logic synthesis run on the system, identifying critical inputs to a logic cone related to the run, identifying orthogonal vectors in the logic cone, adding vectors to the logic cone, obtaining logical solutions and selecting a solution from the logical solutions.
摘要:
Systems, methods and computer program products for exploiting orthogonal control vectors in timing driven systems. An exemplary embodiment includes running an initial logic synthesis run on the system, identifying critical inputs to a logic cone related to the run, identifying orthogonal vectors in the logic cone, adding vectors to the logic cone, obtaining logical solutions and selecting a solution from the logical solutions.
摘要:
A system includes a storage device including a human readable common instruction table (CIT) stored as a text file. The system also includes CIT access software for performing a method including receiving a request from a first user for all or a subset of the CIT table relating to logic design and for providing the requested data to the first user. The method also includes receiving a request from a second user is received for all or a subset of the CIT table relating to performance analysis and for providing the requested data to the second user. A request is received from a third user for all or a subset of the CIT data relating to design verification and the requested data is provided to the third user.
摘要:
A system includes a storage device including a human readable common instruction table (CIT) stored as a text file. The system also includes CIT access software for performing a method including receiving a request from a first user for all or a subset of the CIT table relating to logic design and for providing the requested data to the first user. The method also includes receiving a request from a second user is received for all or a subset of the CIT table relating to performance analysis and for providing the requested data to the second user. A request is received from a third user for all or a subset of the CIT data relating to design verification and the requested data is provided to the third user.
摘要:
A method, system, and computer program product for minimizing branch prediction latency in a pipelined computer processing environment are provided. The method includes detecting a branch loop utilizing branch instruction addresses and corresponding target addresses stored in a branch target buffer (BTB). The method also includes fetching the branch loop into a pre-decode instruction buffer and qualifying the branch loop for loop lockdown. The method further includes locking an instruction stream that forms the branch loop in the pre-decode instruction buffer and processing qualified branch loop instructions from the buffer and powering down instruction fetching and branch prediction logic (BPL) associated with the BTB.