Abstract:
An asynchronous processing system comprising an asynchronous scalar processor and an asynchronous vector processor coupled to the scalar processor. The asynchronous scalar processor is configured to perform processing functions on input data and to output instructions. The asynchronous vector processor is configured to perform processing functions in response to a very long instruction word (VLIW) received from the scalar processor. The VLIW comprises a first portion and a second portion, at least the first portion comprising a vector instruction.
Abstract:
A method and a system is provided for Coordinate Rotation Digital Computer (CORDIC) based matrix inversion of input digital signal streams from multiple antennas using an bi-directional ring-bus architecture. The bi-directional ring bus includes a first ring bus having signals flow in a clockwise direction, and a second ring bus having signals flow in a counter-clockwise direction. An I/O controller is coupled to the first and the second ring bus, respectively. A plurality of processing elements (PEs), where each of the plurality of PEs is coupled to the first and the second ring bus, respectively, wherein each of the plurality of PEs includes at least one CORDIC core for performing CORDIC iterations on the plurality of input digital stream signals to produce inversed matrix signals.
Abstract translation:提供了一种用于使用双向环形总线架构从多个天线输入数字信号流的坐标旋转数字计算机(CORDIC)矩阵求逆的方法和系统。 双向环形总线包括具有沿顺时针方向流动的信号的第一环形总线和具有逆时针方向的信号的第二环形总线。 I / O控制器分别耦合到第一和第二环形总线。 多个处理元件(PE),其中所述多个PE中的每一个分别耦合到所述第一和第二环形总线,其中所述多个PE中的每一个包括用于在所述多个PE中执行CORDIC迭代的至少一个CORDIC内核 输入数字流信号以产生反相矩阵信号。
Abstract:
A method of operating a clock-less asynchronous processing system comprising a plurality of successive asynchronous processing components. The method comprises providing a first token signal path in the plurality of processing components to allow propagation of a token through the processing components. Possession of the token by one of the processing components enables the processing component to conduct a transaction with a resource component that is shared among the processing components. The method comprises propagating the token from one processing component to another processing component along the token signal path.
Abstract:
A clock-less asynchronous processor comprising a plurality of parallel asynchronous processing logic circuits, each processing logic circuit configured to generate an instruction execution result. The processor comprises an asynchronous instruction dispatch unit coupled to each processing logic circuit, the instruction dispatch unit configured to receive multiple instructions from memory and dispatch individual instructions to each of the processing logic circuits. The processor comprises a crossbar coupled to an output of each processing logic circuit and to the dispatch unit, the crossbar configured to store the instruction execution results.
Abstract:
Embodiments are provided for an asynchronous processor with heterogeneous processors. In an embodiment, the apparatus for an asynchronous processor comprises a memory configured to cache instructions, and a first unit (XU) configured to processing a first instruction of the instructions. The apparatus also comprises a second XU having less restricted access than the first XU to a resource of the asynchronous processor and configured to process a second instruction of the instructions. The second instruction requires access to the resource. The apparatus further comprises a feedback engine configured to decode the first instruction and the second instruction, and issue the first instruction to the first XU, and a scheduler configured to send the second instruction to the second XU.
Abstract:
Embodiments are provided for an asynchronous processor with pipelined arithmetic and logic unit. The asynchronous processor includes a non-transitory memory for storing instructions and a plurality of instruction execution units (XUs) arranged in a ring architecture for passing tokens. Each one of the XUs comprises a logic circuit configured to fetch a first instruction from the non-transitory memory, and execute the first instruction. The logic circuit is also configured to fetch a second instruction from the non-transitory memory, and execute the second instruction, regardless whether the one of the XUs holds a token for writing the first instruction. The logic circuit is further configured to write the first instruction to the non-transitory memory after fetching the second instruction.
Abstract:
Embodiments are provided for an asynchronous processor using master and assisted tokens. In an embodiment, an apparatus for an asynchronous processor comprises a memory to cache a plurality of instructions, a feedback engine to decode the instructions from the memory, and a plurality of XUs coupled to the feedback engine and arranged in a token ring architecture. Each one of the XUs is configured to receive an instruction of the instructions form the feedback engine, and receive a master token associated with a resource and further receive an assisted token for the master token. Upon determining that the assisted token and the master token are received in an abnormal order, the XU is configured to detect an operation status for the instruction in association with the assisted token, and upon determining a needed action in accordance with the operation status and the assisted token, perform the needed action.
Abstract:
Embodiments are provided for an asynchronous processor with an asynchronous Instruction fetch, decode, and issue unit. The asynchronous processor comprises an execution unit for asynchronous execution of a plurality of instructions, and a fetch, decode and issue unit configured for asynchronous decoding of the instructions. The fetch, decode and issue unit comprises a plurality of resources supporting functions of the fetch, decode and issue unit, and a plurality of decoders arranged in a predefined order for passing a plurality of tokens. The tokens control access of the decoders to the resources and allow the decoders exclusive access to the resources. The fetch, decode and issue unit also comprises an issuer unit for issuing the instructions from the decoders to the execution unit
Abstract:
Embodiments are provided for an asynchronous processor with a Hierarchical Token System. The asynchronous processor includes a set of primary processing units configured to gate and pass a set of tokens in a predefined order of a primary token system. The asynchronous processor further includes a set of secondary units configured to gate and pass a second set of tokens in a second predefined order of a secondary token system. The set of tokens of the primary token system includes a token consumed in the set of primary processing units and designated for triggering the secondary token system in the set of secondary units.
Abstract:
A clock-less asynchronous processing circuit or system utilizes a self-clocked generator to adjust the processing delay (latency) needed/allowed to the processing cycle in the circuit/system. The timing of the self-clocked generator is dynamically adjustable depending on various parameters. These parameters may include processing instruction, opcode information, type of processing to be performed by the circuit/system, or overall desired processing performance. The latency may also be adjusted to change processing performance, including power consumption, speed etc.