Abstract:
A steaming engine in a system receives a first set of stream parameters into a queue to define a first stream along with an indication of either a queue mode of operation or a speculative mode of operation for the first stream. Acquisition of the first stream then begins. At some point, a second set of stream parameters is received into the queue to define a second stream. When the queue mode of operation was specified for the first stream, the second set of parameters is queued and acquisition of the second stream is delayed until completion of acquisition of the first stream. When the speculative mode of operation was specified for the first stream, acquisition of the first stream is canceled upon receipt of the second set of stream parameters and acquisition of the second stream begins immediately.
Abstract:
A Very Long Instruction Word (VLIW) digital signal processor particularly adapted for single instruction multiple data (SIMD) operation on various operand widths and data sizes. A vector compare instruction compares first and second operands and stores compare bits. A companion vector conditional instruction performs conditional operations based upon the state of a corresponding predicate data register bit. A predicate unit performs data processing operations on data in at least one predicate data register including unary operations and binary operations. The predicate unit may also transfer data between a general data register file and the predicate data register file.
Abstract:
One embodiment of this invention provides two conditional execution auxiliary instructions directed to disparate subsets of the plural functional units. Depending on the conditional execution desired, only one of the two conditional execution auxiliary instructions may be required for a particular execute packet. Another embodiment of this invention employs only one of two possible register files for the condition registers. In a VLIW processor it may be advantageous to split the functional units into separate sets with corresponding register files. This limits the number of functional units that may simultaneously access the register files. In the preferred embodiment of this invention the functional units are divided into a scalar set which access scalar registers and a vector set which access vector registers. The data registers storing the conditions for both scalar and vector instructions are in the scalar data register file.
Abstract:
Software instructions are executed on a processor within a computer system to configure a steaming engine to operate in either a linear mode or a transpose mode. A stream of addresses is generated using an address generator, in which the stream of addresses includes consecutive nested loop iterations for at least a first loop and a second loop. While in the linear mode, the first loop is treated as an inner loop. While in the transpose mode, the second loop is treated as the inner loop. A matrix can be fetched from memory in the linear mode to provide row-wise vectors. A matrix can be fetched from the memory in the transpose mode to provide column wise vectors. Local storage on the streaming engine is organized as sectors based on the number of rows in the matrix to allow overlapping transposition processing and to minimize memory accesses.
Abstract:
The level one memory controller maintains a local copy of the cacheability bit of each memory attribute register. The level two memory controller is the initiator of all configuration read/write requests from the CPU. Whenever a configuration write is made to a memory attribute register, the level one memory controller updates its local copy of the memory attribute register.
Abstract:
A memory management and protection system that manages memory access requests from a number of requestors. Memory accesses are allowed or disallowed based on the privilege level of the master, usually a CPU originating the request based on a Privilege Identifier that accompanies each memory access request. Deputy masters such as DMA controllers inherit the Privilege Identifier of the originating master. An extended memory controller selects the appropriate set of segment registers based on the Privilege Identifier to insure that the request is compared to and translated by the segment register associated with the master originating the request.
Abstract:
In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.
Abstract:
Devices and methods are provided for identifying a debug event associated with a data element of a data stream, and performing debugging when a processor executes a software program in connection with the data stream. The debug event is tracked through a data pipeline to the processor. In an embodiment, the debug event is acted on only when the processor is ready to consume the data element associated with the debug event. In an embodiment, the debug event is determined by monitoring iteration counts of loop counters associated with an address generator and comparing the iteration counts to respective stored count values.
Abstract:
A streaming engine in a system receives a first set of stream parameters into a queue to define a first stream along with an indication of either a queue mode of operation or a speculative mode of operation for the first stream. Acquisition of the first stream then begins. At some point, a second set of stream parameters is received into the queue to define a second stream. When the queue mode of operation was specified for the first stream, the second set of parameters is queued and acquisition of the second stream is delayed until completion of acquisition of the first stream. When the speculative mode of operation was specified for the first stream, acquisition of the first stream is canceled upon receipt of the second set of stream parameters and acquisition of the second stream begins immediately.
Abstract:
In a very long instruction word (VLIW) central processing unit instructions are grouped into execute packets that execute in parallel. A constant may be specified or extended by bits in a constant extension instruction in the same execute packet. If an instruction includes an indication of constant extension, the decoder employs bits of a constant extension instruction to extend the constant of an immediate field. Two or more constant extension slots are permitted in each execute packet, each extending constants for a different predetermined subset of functional unit instructions. In an alternative embodiment, more than one functional unit may have constants extended from the same constant extension instruction employing the same extended bits. A long extended constant may be formed using the extension bits of two constant extension instructions.