摘要:
A method and apparatus for selectively writing data elements from packed data based upon a mask using predication. In one embodiment of the invention, for each data element of a packed data operand, the following is performed in parallel processing units: determining a predicate value for the data element from one or more bits of a corresponding packed data mask element indicating whether the data element is selected for writing to a corresponding storage location, and storing in the corresponding storage location the data element based on the predicate value.
摘要:
An interactive translation system (10) includes a front end (40), a back end (42), and a user interface (16). The front end (40) is operable to identify source elements (86) in a source file (24). The back end (42) is operable to generate a translation file having translation elements corresponding to translation of said identified source elements (86) and having an interface (16) for receiving inputs for modifying said translation.
摘要:
An apparatus for operating on the contents of an input register to generate the contents of an output register which contains a permutation, with or without repetitions, or a combination of the contents of the input register. The apparatus partitions the input register into a plurality of sub-words, each sub-word being characterized by a location in the input register and a length greater than one bit. In response to an instruction specifying a rearrangement of the input register, the present invention directs at least one of the sub-words in the input register to a location in the output register that differs from the location occupied by the sub-word in the input register. The ordering of the sub-words in the output register differ from the order obtainable by a single shift instruction. In the preferred embodiment of the present invention, the invention is implemented by modifying a conventional shifter comprising a plurality of layers of multiplexers. The modification comprises independently setting the control signals for at least one of the multiplexers in at least one of the layers.
摘要:
A digital signal processor includes a computation block with an arithmetic logic unit, a multiplier, a shifter and a register file. The computation block includes a plurality of registers for storing instructions and operands in a bit format as a continuous bit stream, and utilizes a bit transfer mechanism for transferring in a single cycle a bit field of an arbitrary bit length between the plurality of registers and the shifter. The plurality of registers may be general purpose registers located in the register file. The register file may further include at least one control information register for storing control information used by the bit transfer mechanism.
摘要:
In one embodiment, a programmable processor is adapted to conditionally move data between a pointer register and a data register in response to a single machine instruction. The processor has a plurality of pipelines. In response to the machine instruction, a control unit directs the pipelines to forward the data across the pipelines in order to move the data between the registers.
摘要:
An instruction aligner and method evaluates a fixed length instruction cache line by breaking it into at least two components. These two components, in one embodiment, include half of the instruction cache line being designated as most significant bytes and the second half of the instruction cache line being designated as least significant bytes. A byte right rotator is responsible for feeding the next sixteen bytes of the instruction stream, while a byte right shifter shifts the unused bytes of the current sixteen bytes the aligner is working on. The byte rotator and byte shifter combine to provide aligned variable length instructions for decoding based on either a fetch PC value or current instruction length.
摘要:
An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
摘要:
A lookup operation is carried out on a data table by logically dividing the data table into a number of smaller sets of data that can be indexed with a single byte of data. Each set of data consists of two vectors, which constitute the operands for a permute instruction. Only a limited number of bits are required to index into the table during the execution of this instruction. The remaining bits of each index are used as masks into a series of select instructions. The select instruction chooses between two vector components, based on the mask, and places the selected components into a new vector. The mask is generated by shifting one of the higher order bits of the index to the most significant position, and then propagating that bit throughout a byte, for example by means of an arithmetic shift. This procedure is carried out for all of the index bytes in the vector, to generate a select mask. The select mask is then used during a select operation, to choose between the results of permute instructions on different ones of the logically divided sets of data. Multi-byte table entries are retrieved by replicating each index value and adding consecutive values to form multiple consecutive index values that are then used in multiple permute operations.
摘要:
A computer-implemented method and apparatus for transferring the contents of a general register, in a register stack, to a location in a backing store in a main memory are described. When transferring the contents of a general register to a location in the backing store, the invention proposes collecting attribute bits included in each general register of a predetermined group of registers in a temporary collection register. Once the temporary collection register has been filled, the contents of this register are written to the next available location in the backing store. Similarly, on the restoration of registers from the backing store, a collection of attribute bits saved in the backing register is transferred to a temporary collection register. Thereafter, each attribute bit is saved together with associated data into a general register, thereby to restore the former contents of each general register.