摘要:
A method for executing an MMX PSADBW instruction by a microprocessor. The method includes generating packed differences of packed operands of the instruction and generating borrow bits associated with each of the packed differences; for each of the packed differences: determining whether the borrow bit indicates the packed difference is positive or negative and selecting a value in response to the determining, the value comprising the packed difference if the associated borrow bit is positive and a complement of the packed difference if the associated borrow bit is negative; adding the selected values to generate a first sum and a first carry and in parallel adding the borrow bits to generate a second sum and a second carry; adding the first and second sums and the first and second carries to generate a result of the instruction; storing the result in a register of the microprocessor.
摘要:
A method and system for permitting the selective support of non-architected instructions within a superscalar processor system. A special access bit within the system machine state register is provided and set in response to each initiation of an application during which execution of non-architected instructions is desired. Thereafter, each time a non-architected instruction is decoded the status of the special access bit is determined. The non-architected instruction is executed in response to a set state of the special access bit. The illegal instruction program interrupt is issued in response to an attempted execution of a non-architected instruction if the special access bit is not set. In this manner, for example, complex instruction set computing (CISC) instructions may be selectively enabled for execution within a reduced instruction set computing (RISC) data processing system while maintaining full architectural compliance with the reduced instruction set computing (RISC) instructions.
摘要:
A method for selectively executing speculative load instructions in a high-performance processor is disclosed. In accordance with the present disclosure, when a speculative load instruction for which the data is not stored in a data cache is encountered, a bit within an enable speculative load table which is associated with that particular speculative load instruction is read in order to determine a state of the bit. If the associated bit is in a first state, data for the speculative load instruction is requested from a system bus and further execution of the speculative load instruction is then suspended to wait for control signals from a branch processing unit. If the associated bit is in a second state, the execution of the speculative load instruction is immediately suspended to wait for control signals from the branch processing unit. If the speculative load instruction is executed in response to the control signals, then the associated bit in the enable speculative load table will be set to the first state. However, if the speculative load instruction is not executed in response to the control signals, then the associated bit in the enable speculative load table is set to the second state. In this manner, the displacement of useful data in the data cache due to wrongful execution of the speculative load instruction is avoided.
摘要:
A method for executing an MMX PSADBW instruction by a microprocessor. The method includes generating packed differences of packed operands of the instruction and generating borrow bits associated with each of the packed differences; for each of the packed differences: determining whether the borrow bit indicates the packed difference is positive or negative and selecting a value in response to the determining, the value comprising the packed difference if the associated borrow bit is positive and a complement of the packed difference if the associated borrow bit is negative; adding the selected values to generate a first sum and a first carry and in parallel adding the borrow bits to generate a second sum and a second carry; adding the first and second sums and the first and second carries to generate a result of the instruction; storing the result in a register of the microprocessor.
摘要:
A BIU prioritizes L1 requests above L2 requests. The L2 generates a first request to the BIU and detects the generation of a snoop request and L1 request to the same cache line. The L2 determines whether a bus transaction to fulfill the first request may be retried and, if so, generates a miss, and otherwise generates a hit. Alternatively, the L2 detects the L1 generated a request to the L2 for the same line and responsively requests the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted the bus. Alternatively, a prefetch cache and the L2 allow the same line to be simultaneously present. If an L1 request hits in both the L2 and in the prefetch cache, the prefetch cache invalidates its copy of the line and the L2 provides the line to the L1.
摘要:
A method and apparatus in a superscalar microprocessor for early completion of floating-point instructions prior to a previous load/store multiple instruction is provided. The microprocessor's load/store execution unit loads or stores data to or from the general purpose registers, and the microprocessor's dispatch unit dispatches instructions to a plurality of execution units, including the load/store execution unit and the floating point execution unit. The method comprises the dispatch unit dispatching a multi-register instruction to the load/store unit to begin execution of the multi-register instruction, wherein the multi-register instruction, such as a store multiple or a load multiple, stores or loads data from more than one of the plurality of general purpose registers to memory, and further, prior to the multi-register instruction finishing execution in the load/store unit, the dispatch unit dispatches a floating-point instruction, which is dependent upon source operand data stored in one or more floating-point registers of the plurality of floating point registers, to the floating-point execution unit, wherein the dispatched floating-point instruction completes execution prior to the multi-register instruction finishing execution.
摘要:
A microprocessor configured to access an external memory includes a first-level cache, a second-level cache, and a bus interface unit (BIU) configured to interface the first-level and second-level caches to a bus used to access the external memory. The BIU is configured to prioritize requests from the first-level cache above requests from the second-level cache. The second-level cache is configured to generate a first request to the BIU to fetch a cache line from the external memory. The second-level cache is also configured to detect that the first-level cache has subsequently generated a second request to the second-level cache for the same cache line. The second-level cache is also configured to request the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted ownership of the bus to fulfill the first request.
摘要:
An apparatus for performing an MultiMedia extension (MMX) Packed Sum of Absolute Differences (PSADBW) instruction is disclosed. The apparatus includes carry-generating subtraction logic that generates packed differences of the subtrahend from the minuend and associated carry bits indicating whether the difference is positive or negative. The apparatus selectively inverts the differences based on the carry bits. Addition logic adds the selectively inverted differences and carry bits substantially in parallel to generate the PSADBW instruction result. In one embodiment, the apparatus also includes two muxes. The first mux selects the selectively inverted differences in the case of a PSADBW instruction and selects a multiply instruction's partial products otherwise. The second mux selects the carry bits in the case of a PSADBW instruction and selects a second multiply instruction's partial products otherwise. The two mux outputs are provided to the addition logic. In one embodiment, the microprocessor translates the PSADBW instruction into at least first and second microinstructions for execution.
摘要:
A memory subsystem in a microprocessor includes a first-level cache, a second-level cache, and a prefetch cache configured to speculatively prefetch cache lines from a memory external to the microprocessor. The second-level cache and the prefetch cache are configured to allow the same cache line to be simultaneously present in both. If a request by the first-level cache for a cache line hits in both the second-level cache and in the prefetch cache, the prefetch cache invalidates its copy of the cache line and the second-level cache provides the cache line to the first-level cache.
摘要:
A BIU prioritizes L1 requests above L2 requests. The L2 generates a first request to the BIU and detects the generation of a snoop request and L1 request to the same cache line. The L2 determines whether a bus transaction to fulfill the first request may be retried and, if so, generates a miss, and otherwise generates a hit. Alternatively, the L2 detects the L1 generated a request to the L2 for the same line and responsively requests the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted the bus. Alternatively, a prefetch cache and the L2 allow the same line to be simultaneously present. If an L1 request hits in both the L2 and in the prefetch cache, the prefetch cache invalidates its copy of the line and the L2 provides the line to the L1.