摘要:
A processing core is described having execution unit logic circuitry having a first register to store a first vector input operand, a second register to a store a second vector input operand and a third register to store a packed data structure containing scalar input operands a, b, c. The execution unit logic circuitry further include a multiplier to perform the operation (a*(first vector input operand))+(b*(second vector operand))+c.
摘要:
A processing core is described having execution unit logic circuitry having a first register to store a first vector input operand, a second register to a store a second vector input operand and a third register to store a packed data structure containing scalar input operands a, b, c. The execution unit logic circuitry further include a multiplier to perform the operation (a*(first vector input operand))+(b*(second vector operand))+c.
摘要:
Instructions and logic provide vector linear interpolation functionality. In some embodiments, responsive to an instruction specifying: a first operand from a set of vector registers, a size of each of the vector elements, a portion of the vector elements upon which to compute linear interpolations, a second operand from a set of vector registers, and a third operand; an execution unit, reads a first, a second and a third value of the size of vector elements from corresponding data fields in the first, the second and the third operand respectively and computes an interpolated value as the first value multiplied by the second value minus the second value multiplied by the third value plus the third value.
摘要:
A method of processing an instruction is described that includes fetching and decoding the instruction. The instruction has separate destination address, first operand source address and second operand source address components. The first operand source address identifies a location of a first mask pattern in mask register space. The second operand source address identifies a location of a second mask pattern in the mask register space. The method further includes fetching the first mask pattern from the mask register space; fetching the second mask pattern from the mask register space; merging the first and second mask patterns into a merged mask pattern; and, storing the merged mask pattern at a storage location identified by the destination address.
摘要:
A method of processing an instruction is described that includes fetching and decoding the instruction. The instruction has separate destination address, first operand source address and second operand source address components. The first operand source address identifies a location of a first mask pattern in mask register space. The second operand source address identifies a location of a second mask pattern in the mask register space. The method further includes fetching the first mask pattern from the mask register space; fetching the second mask pattern from the mask register space; merging the first and second mask patterns into a merged mask pattern; and, storing the merged mask pattern at a storage location identified by the destination address.
摘要:
A math circuit for computing an estimate of a transcendental function is described. A lookup table storage circuit has stored therein several groups of binary values, where each group of values represents a respective coefficient of a first polynomial that estimates the function to a high precision. A computing circuit uses a portion of a binary value, that is also taken from one of the groups of values, to evaluate a second polynomial that estimates the function to a low precision. Other embodiments are also described and claimed.
摘要:
A hierarchical power control system for an integrated circuit may be integrated into a clocking system that includes a global clock generator, a clock distribution network in communication with the global clock generator and a plurality of functional unit blocks each in communication with the global clock generator. The hierarchical power control system may include a first power controller provided in a communication path between the global clock generator and the clock distribution network, and a plurality of second power controllers, one provided in each communication path between the clock distribution network and a functional unit block.
摘要:
A processor including a first execution core section clocked to perform execution operations at a first clock frequency, and a second execution core section clocked to perform execution operations at a second clock frequency which is different than the first clock frequency. The second execution core section runs faster and includes a data cache and critical ALU functions, while the first execution core section includes latency-tolerant functions such as instruction fetch and decode units and non-critical ALU functions. The processor may further include an I/O ring which may be still slower than the first execution core section. Optionally, the first execution core section may include a third execution core section whose clock rate is between that of the first and second execution core sections. Clock multipliers/dividers may be used between the various sections to derive their clocks from a single source, such as the I/O clock.
摘要:
A logic structure adapted to receive pulsed active input signals produces a logical output with a very small inherent switching delay. Pull-down transistors and complementary pull-up transistors are ratioed such that the default logical output level remains close to nominal even when the logic structure sinks or sources a DC current. When the pulsed input signals are inactive, no DC current path is enabled.
摘要:
A method for compressing speech. An audio signal comprising speech is broken down into its phonetic components. These phonetic components are then converted into data elements that represent each of the phonetic components. The determination of data elements is accomplished using a predefined table that correlates phonetic sounds to data elements. The data elements representing the phonetic sounds are then stored.