摘要:
A clock-less asynchronous processing circuit or system having a plurality of pipelined processing stages utilizes self-clocked generators to tune the delay needed in each of the processing stages to complete the processing cycle. Because different processing stages may require different amounts of time to complete processing or may require different delays depending on the processing required in a particular stage, the self-clocked generators may be tuned to each stage's necessary delay(s) or may be programmably configured.
摘要:
Embodiments are provided for adding a token jump logic to an asynchronous processor with token passing. The token jump logic allows token forward jumps and token backward jumps over a cascade of token processing logics in the processor. An embodiment method includes determining, using a token jump logic coupled to a cascade of token processing logics, whether to administer a token forward jump or a token backward jump of a token signal passing through the token processing logics. The token forward jump and token backward jump allow the token signal to skip one or more token processing logics in the cascade. The method further includes monitoring, for each of the token processing logics, a polarity status of a token sense logic, and inverting the polarity status according to the determination at the token jump logic.
摘要:
Embodiments are provided for an asynchronous processor with multiple threading. The asynchronous processor includes a program counter (PC) logic and instruction cache unit comprising a plurality of PC logics configured to perform branch prediction and loop predication for a plurality of threads of instructions, and determine target PC addresses for caching the plurality of threads. The processor further comprises an instruction memory configured to cache the plurality of threads in accordance with the target PC addresses from the PC logic and instruction cache unit. The processor further includes a multi-threading (MT) scheduling unit configured to schedule and merge instruction flows for the plurality of threads from the instruction memory into a single combined thread of instructions. Additionally, a MT register window register is included to map operands in the plurality of threads to a plurality of corresponding register windows in a register file.
摘要:
A method of operating a clock-less asynchronous processing system comprising a plurality of successive asynchronous processing components. The method comprises providing a first token signal path in the plurality of processing components to allow propagation of a token through the processing components. Possession of the token by one of the processing components enables the processing component to conduct a transaction with a resource component that is shared among the processing components. The method comprises propagating the token from one processing component to another processing component along the token signal path.
摘要:
A clock-less asynchronous processor comprising a plurality of parallel asynchronous processing logic circuits, each processing logic circuit configured to generate an instruction execution result. The processor comprises an asynchronous instruction dispatch unit coupled to each processing logic circuit, the instruction dispatch unit configured to receive multiple instructions from memory and dispatch individual instructions to each of the processing logic circuits. The processor comprises a crossbar coupled to an output of each processing logic circuit and to the dispatch unit, the crossbar configured to store the instruction execution results.
摘要:
Embodiments are provided for an asynchronous processor with heterogeneous processors. In an embodiment, the apparatus for an asynchronous processor comprises a memory configured to cache instructions, and a first unit (XU) configured to processing a first instruction of the instructions. The apparatus also comprises a second XU having less restricted access than the first XU to a resource of the asynchronous processor and configured to process a second instruction of the instructions. The second instruction requires access to the resource. The apparatus further comprises a feedback engine configured to decode the first instruction and the second instruction, and issue the first instruction to the first XU, and a scheduler configured to send the second instruction to the second XU.
摘要:
Embodiments are provided for an asynchronous processor with pipelined arithmetic and logic unit. The asynchronous processor includes a non-transitory memory for storing instructions and a plurality of instruction execution units (XUs) arranged in a ring architecture for passing tokens. Each one of the XUs comprises a logic circuit configured to fetch a first instruction from the non-transitory memory, and execute the first instruction. The logic circuit is also configured to fetch a second instruction from the non-transitory memory, and execute the second instruction, regardless whether the one of the XUs holds a token for writing the first instruction. The logic circuit is further configured to write the first instruction to the non-transitory memory after fetching the second instruction.
摘要:
Embodiments are provided for an asynchronous processor using master and assisted tokens. In an embodiment, an apparatus for an asynchronous processor comprises a memory to cache a plurality of instructions, a feedback engine to decode the instructions from the memory, and a plurality of XUs coupled to the feedback engine and arranged in a token ring architecture. Each one of the XUs is configured to receive an instruction of the instructions form the feedback engine, and receive a master token associated with a resource and further receive an assisted token for the master token. Upon determining that the assisted token and the master token are received in an abnormal order, the XU is configured to detect an operation status for the instruction in association with the assisted token, and upon determining a needed action in accordance with the operation status and the assisted token, perform the needed action.
摘要:
Embodiments are provided for an asynchronous processor with an asynchronous Instruction fetch, decode, and issue unit. The asynchronous processor comprises an execution unit for asynchronous execution of a plurality of instructions, and a fetch, decode and issue unit configured for asynchronous decoding of the instructions. The fetch, decode and issue unit comprises a plurality of resources supporting functions of the fetch, decode and issue unit, and a plurality of decoders arranged in a predefined order for passing a plurality of tokens. The tokens control access of the decoders to the resources and allow the decoders exclusive access to the resources. The fetch, decode and issue unit also comprises an issuer unit for issuing the instructions from the decoders to the execution unit
摘要:
Embodiments are provided for an asynchronous processor with a Hierarchical Token System. The asynchronous processor includes a set of primary processing units configured to gate and pass a set of tokens in a predefined order of a primary token system. The asynchronous processor further includes a set of secondary units configured to gate and pass a second set of tokens in a second predefined order of a secondary token system. The set of tokens of the primary token system includes a token consumed in the set of primary processing units and designated for triggering the secondary token system in the set of secondary units.