摘要:
A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data storing the result of performing multiply-add operations on data elements in the first and second packed data.
摘要:
A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data storing the result of performing multiply-add operations on data elements in the first and second packed data.
摘要:
A method and apparatus for including in a processor instructions for performing multiply-subtract operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least one of the data elements in this third packed data storing the result of performing a multiply-subtract operation on data elements in the first and second packed data.
摘要:
A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data storing the result of performing multiply-add operations on data elements in the first and second packed data.
摘要:
A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data storing the result of performing multiply-add operations on data elements in the first and second packed data.
摘要:
A method of multiplying and accumulating two sets of values in a computer system. A packed multiply add is performed on a first portion of a first set of values packed into a first source and a first portion of a second set of values packed into a second source to generate a first result. The first result is unpacked into a plurality of values (e.g. two). The plurality of values is then added together to form a resulting accumulation value.
摘要:
A method and apparatus for performing a shift operation on packed data elements having multiple values. One embodiment includes accessing the shift control signal of a first format from a memory. The shift control signal identifyies a first packed shift operation and whether the shift positions are byte positions or bit positions, and causes a processor to execute a set of control signals of a second format, thereby accessing the packed data, shifting the packed data by the number of shift positions according to the first packed shift operation, generating a first replacement data for one of the number of positions, and producing a shifted first packed data comprising the first replacement data.
摘要:
A method and apparatus for providing, in a processor, a shift operation on a packed data element having multiple values. The apparatus having multiple muxes, each of the multiple muxes having a first input, a second input, a select input and an output. Each of the multiple bits that represent a shifted packed intermediate result on a first bus is coupled to the corresponding first input. Each of the multiple bits representing a replacement bit for one of the multiple values is coupled to a corresponding second input. Each of the multiple bits driven by a correction circuit is coupled to a corresponding select input. Each output corresponds to a bit of a shifted packed result.
摘要:
An apparatus for performing a shift operation on a packed data element having multiple values. The apparatus having multiple muxes, each of the multiple muxes having a first input, a second input, a select input and an output. Each of the multiple bits that represent a shifted packed intermediate result on a first bus is coupled to the corresponding first input. Each of the multiple bits representing a replacement bit for one of the multiple values is coupled to a corresponding second input. Each of the multiple bits driven by a correction circuit is coupled to a corresponding select input. Each output corresponds to a bit of a shifted packed result.
摘要:
A technique for sorting packed signed numbers of two operands into maxima and minima operands and solving absolute differences for each pair of corresponding values of maxima and minima. After packing two source operands with a plurality of data elements containing signed values, a greater-than comparison operation is performed on each pair of corresponding numbers in the two operands to determine which is greater. An exclusive-OR mask is generated for use in swapping those values which need to be rearranged so that all maxima are in one operand and all minima are in another operand. Once the sorting of maxima and minima is complete, a packed subtraction operation is then performed by subtracting the minima from corresponding maxima to obtain absolute differences.