摘要:
A processor (100) is provided that is a programmable fixed point digital signal processor (DSP) with variable instruction length, offering both high code density and easy programming. Architecture and instruction set are optimized for low power consumption and high efficiency execution of DSP algorithms, such as for wireless telephones, as well as pure control tasks. The processor includes an instruction buffer unit (106), a program flow control unit (108), an address/data flow unit (110), a data computation unit (112), and multiple interconnecting busses. Dual multiply-accumulate blocks improve processing performance. A memory interface unit (104) provides parallel access to data and instruction memories. The instruction buffer is operable to buffer single and compound instructions pending execution thereof. A decode mechanism is configured to decode instructions from the instruction buffer. The use of compound instructions enables effective use of the bandwidth available within the processor. A soft dual memory instruction can be compiled from separate first and second programmed memory instructions. Instructions can be conditionally executed or repeatedly executed. Bit field processing and various addressing modes, such as circular buffer addressing, further support execution of DSP algorithms. The processor includes a multistage execution pipeline with pipeline protection features. Various functional modules can be separately powered down to conserve power. The processor includes emulation and code debugging facilities with support for cache analysis.
摘要:
A zero anticipation mechanism for an arithmetic unit 42 of a processing engine includes an array of cells 420, 430 interconnected to produce an ordered sequence of intermediate anticipation signals. The array of cells includes cells connected to receive intermediate result signals from the arithmetic unit, cells for forwarding an intermediate anticipation signal supplied thereto, and cells for generating a combination of first intermediate anticipation signals and second intermediate anticipation signals supplied thereto. The zero anticipation mechanism implements a zero look-ahead mechanism which can predict a zero result 479 prior to the arithmetic unit completing an arithmetic operation.
摘要:
Data processing apparatus 10 supporting circular buffers CB includes address storage ARx for holding a virtual buffer index and offset storage BOFxx for holding an offset address. Circular buffer management logic 802 is configured to be operable to apply a modifier to a virtual buffer index held in the address storage to derive a modified virtual buffer index and to apply a buffer offset held in the offset storage to the modified virtual buffer index to derive a physical address for addressing a circular buffer. By employing virtual addressing to a buffer index for a circular buffer management, it is possible to make efficient use of memory resources. One or more circular buffers can be located contiguously with respect to each other and/or other data in memory, avoiding fragmentation of the memory. The buffer index forms a pointer for the circular buffer. The apparatus enables circular buffers to be implemented without alignment constraints, while maintaining compatibility with prior circular buffer implementations with alignment constraints.
摘要:
A system and method for organizing pixel information in memory. A method according to an embodiment of the disclosure includes storing data representative of pixels of a scene in a growing window (“GW”) portion of a reference frame in an on-chip memory, storing data representative of pixels of the visual scene in a sliding window (“SW”) portion of the reference frame thereby forming a hybrid window, searching the memory to locate a portion of the stored data that corresponds with data representative of pixels in a current frame descriptive of the scene, performing motion estimation according to results of the search, generating a compressed version of the current frame according to results of the motion estimation, and storing the compressed version for later visual rendering. The system includes a processing unit and a video encoder. The processing unit includes an on-chip memory. The video encoder includes a motion estimation engine and a compression unit.
摘要:
A processing engine 10 includes an instruction buffer 502 operable to buffer single and compound instructions pending execution. A decode mechanism is configured to decode instructions from the instruction buffer. The decode mechanism is arranged to respond to a predetermined tag in a tag field of an instruction, which predetermined tag is representative of the instruction being a compound instruction formed from separate programmed memory instructions. The decode mechanism is operable in response to the predetermined tag to decode at least first data flow control for a first programmed instruction and second data flow control for a second programmed instruction. The use of compound instructions enables effective use of the bandwidth available within the processing engine. A soft dual memory instruction can be compiled from separate first and second programmed memory instructions. A compound address field of the predetermined compound instruction can be arranged at the same bit positions as the address field for a hard compound memory instruction, that is a compound instruction which is programmed. In this case the decoding of the addresses can be started before the operation code of the instructions have been decoded. To reduce the number of bits in the compound instruction, addressing can be restricted to indirect addressing and the operation codes for at least the first instruction can be reduced in size. In this way, the compound instruction can be arranged to have the same number of bits in total as the sum of the bits of the separate programmed instructions.
摘要:
A system and method for organizing pixel information in memory. A method according to an embodiment of the disclosure includes storing data representative of pixels of a scene in a growing window (“GW”) portion of a reference frame in an on-chip memory, storing data representative of pixels of the visual scene in a sliding window (“SW”) portion of the reference frame thereby forming a hybrid window, searching the memory to locate a portion of the stored data that corresponds with data representative of pixels in a current frame descriptive of the scene, performing motion estimation according to results of the search, generating a compressed version of the current frame according to results of the motion estimation, and storing the compressed version for later visual rendering. The system includes a processing unit and a video encoder. The processing unit includes an on-chip memory. The video encoder includes a motion estimation engine and a compression unit.
摘要:
A processor is provided that is a programmable digital signal processor (DSP) with variable instruction length, offering both high code density and easy programming. Architecture and instruction set are optimized for low power consumption and high efficiency execution of DSP algorithms, such as for wireless telephones, as well as pure control tasks. A coefficient data pointer is provided for accessing coefficient data for use in a multiply-accumulate (MAC) unit. Monitoring circuitry determines when the coefficient data pointer is modified (step 1104). When an instruction is executed (step 1102) that requires a coefficient datum from memory in accordance with the coefficient data pointer, a memory access is inhibited (step 1108) if the coefficient data pointer has not been modified since the last time a memory fetch was made in accordance with the coefficient data pointer and the previously fetched coefficient datum is reused. However, if the coefficient data pointer was modified since the last time a memory fetch was made in accordance with the coefficient data pointer, then the required coefficient datum is fetched from memory (step 1106). A shadow register within the MAC unit execution pipeline temporarily saves coefficient data for possible reuse.