摘要:
This invention is an application specific integrated processor to implement the complete fixed-rate DRX signal processing paths (FDRX) for a reconfigurable processor-based multi-mode 3G wireless application. This architecture is based on the baseline 16-bit RISC architecture with addition functional blocks (ADU) tightly coupled with the based processor's data path. Each ADU accelerates a computation-intensive tasks in FDRX signal path, such as multi-tap FIRs, IIRs, complex domain and vectored data processing. The ADUs are controlled through custom instructions based on the load/store architecture. The whole FDRX data path can be easily implemented by the software employing these custom instructions.
摘要:
A method of instruction issue (3200) in a microprocessor (1100, 1400, or 1500) with execution pipestages (E1, E2, etc.) and that executes a producer instruction Ip and issues a candidate instruction I0 (3245) having a source operand dependency on a destination operand of instruction Ip. The method includes issuing the candidate instruction I0 as a function (1720, 1950, 1958, 3235) of a pipestage EN(I0) of first need by the candidate instruction for the source operand, a pipestage EA(Ip) of first availability of the destination operand from the producer instruction, and the one execution pipestage E(Ip) currently associated with the producer instruction. A method of data forwarding (3300) in a microprocessor (1100, 1400, or 1500) having a pipeline (1640) having pipestages (E1, E2, etc.), wherein the method includes scoreboarding information E(Ip) (1710, 2220) to represent a changing pipestage position for data from a producer instruction Ip, and selectively forwarding (2310, 3360) the data from the pipestage having the represented pipestage position E(Ip), based on the information (1710), to a receiving pipestage (1682, E1) for a dependent instruction. Wireless communications devices (1010, 1010′, 1040, 1050, 1060, 1080), systems, circuits, devices, scoreboards (1700.N), processes and methods of operation, processes and articles of manufacture (FIGS. 13-16), are also disclosed.
摘要:
A digital circuitry comprising a processing unit that receives a first clock and comprising of a first self-clock circuitry that generates a first internal clock; wherein the said first self-clock circuitry further comprises of a mechanism to select between the said first clock and the first internal clock of the said processing unit for clock edge synchronization.
摘要:
A system and a method to identify a conditional branch instruction having a program counter and a target address, and increment a loop count each time the program counter and the target address equal a stored program counter and a target address. The system and method additionally includes assignment of a start loop pointer and an end loop pointer, based on an offset, when the loop count is equal to a threshold value, and capturing instructions for a loop, as defined by the start loop pointer and the end loop pointer, in an instruction queue.
摘要:
This invention is an application specific integrated processor to implement the complete fixed-rate DRX signal processing paths (FDRX) for a reconfigurable processor-based multi-mode 3G wireless application. This architecture is based on the baseline 16-bit RISC architecture with addition functional blocks (ADU) tightly coupled with the based processor's data path. Each ADU accelerates a computation-intensive tasks in FDRX signal path, such as multi-tap FIRs, IIRs, complex domain and vectored data processing. The ADUs are controlled through custom instructions based on the load/store architecture. The whole FDRX data path can be easily implemented by the software employing these custom instructions.
摘要:
A cache memory consists of plurality of memory bits within a random-access memory (RAM) cell. An extra address decode circuit is needed to select a single memory bit within the multi-bit RAM cell before normal access of RAM array circuit. Combining of multiple bits into a RAM cell reduces the number of interconnections in comparison to single bit RAM cell. This technique eliminates the need to break up the cache array into multiple sets for reducing power dissipation. The area advantages are also from optimal layout of multi-bit RAM cell, address decoder, and sense amplifier unit. Furthermore, the interconnections can be widened to reduce the RC delay as it is a dominating factor in future technology advancement. The multiplexing of the bits are done before the row decoding thus reducing one level of multiplexing after reading of data from the sense amplifier units.
摘要:
A self-timed and self-enabled distributed clock is provided for a functional unit that has variable executing time. The self-timed clock provides plurality of clock pulses within a clock cycle for latching of result and starting execution of the next operation. The functional unit can execute more than one operation per clock cycle thus increasing the utilization of the execute unit and the performance of the processor. The state machine is designed to keep track of the current clock pulse and the execution time of the current operation. The functional unit includes the output queue buffer to keep plurality of results from execute unit. The functional unit executes data close to its optimal timing while the data between functional units are synchronized on the clock boundary as in synchronous design. It is more efficient than synchronous design yet the outputs are deterministic as the clocking is preserved in the design.