Abstract:
An Explicit Multi-Threading (XMT) system and method are provided for processing multiple spawned threads associated with SPAWN-type commands of an XMT program. The method includes: executing a plurality of child threads by a plurality of thread control units (TCUs), including a first TCU executing a child thread that is allocated to it; completing execution of the child thread by the first TCU; announcing that the first TCU is available to execute another child thread; executing, by a second TCU, a parent child thread that includes a nested spawn-type command for spawning additional child threads of the plurality of child threads, wherein the parent child thread is related in a parent-child relationship to the child threads that are spawned in conjunction with the nested spawn-type command; assigning a thread ID (TID) to each child thread, wherein each TID is unique with respect to the other TIDs; and allocating a new child thread to the first TCU.
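The allocation steps listed in this abstract can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the patented mechanism: the function name `run_spawn`, the shared counter used as the TID source, the first-in-first-out completion order, and the simplification that a nested spawn takes effect when its parent thread completes are all hypothetical choices made for illustration.

```python
from collections import deque

def run_spawn(num_tcus, initial_threads, nested_spawns):
    """Simulate TCUs executing child threads (illustrative model only).

    nested_spawns maps a parent TID to the number of additional child
    threads that parent spawns. Returns (tcu, tid) pairs in allocation order.
    """
    next_tid = 0                    # shared counter: every TID handed out is unique
    total = initial_threads         # threads spawned so far; grows on a nested spawn
    idle = deque(range(num_tcus))   # TCUs that have announced availability
    running = deque()               # (tcu, tid) pairs currently executing
    log = []
    while True:
        # Allocate pending child threads to available TCUs.
        while idle and next_tid < total:
            tcu = idle.popleft()
            tid = next_tid
            next_tid += 1
            running.append((tcu, tid))
            log.append((tcu, tid))
        if not running:
            break
        # Assumption: the oldest running thread completes first.
        tcu, tid = running.popleft()
        # Assumption: a nested spawn is applied when its parent completes.
        total += nested_spawns.get(tid, 0)
        idle.append(tcu)            # the TCU announces it is available again
    return log

if __name__ == "__main__":
    for tcu, tid in run_spawn(num_tcus=2, initial_threads=3, nested_spawns={1: 2}):
        print(f"TCU {tcu} runs thread {tid}")
```

In this model, thread 1 acts as the parent child thread: its nested spawn raises the thread count from 3 to 5, and the TCUs that become free pick up the extra threads with fresh TIDs.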
Abstract:
A multi-chip processor/memory arrangement that replaces a large computer chip includes a number of modules, each including processing elements, registers, and/or memories, interconnected by an optical interconnection fabric that provides an all-to-all interconnection between the chips, so that the memory cells on each chip represent a portion of shared memory. The optical interconnect fabric is responsible for transporting data between the chips, while the processing elements on each chip dominate processing. Each chip is manufactured in mass production, so that the entire processor/memory arrangement is fabricated in an inexpensive and simplified technology process. The optical communication fabric is based on waveguide technology and includes a number of waveguides, the layout of which follows certain constraints. The waveguides can intersect each other in a single plane, or alternatively, a double layer of waveguide structures and a bent-over approach may be used. Specific layout patterns of the optical waveguides are presented. Data is communicated along the optical communication channels in a highly pipelined, decentralized routing manner, and the arrangement is envisioned for application to the XMT architecture.
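To give a sense of how an all-to-all interconnection scales, the sketch below enumerates the chip pairs that must be connected. It assumes, purely for illustration, one dedicated bidirectional waveguide per pair of chips; the abstract does not state the actual waveguide count or layout, and the name `all_to_all_links` is hypothetical.

```python
from itertools import combinations

def all_to_all_links(num_chips):
    """Return every unordered pair of chips.

    Assumption: each pair stands for one bidirectional waveguide.
    """
    return list(combinations(range(num_chips), 2))

if __name__ == "__main__":
    links = all_to_all_links(4)
    print(len(links))  # 6 pairs for 4 chips, i.e. n * (n - 1) / 2
```

Under this assumption the link count grows quadratically with the number of chips, which is one reason the layout constraints on the waveguides matter.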
Abstract:
A multi-chip processor/memory arrangement (20) is shown which includes a plurality of modules (22), also referred to herein as chips. The modules (22) are interconnected by an optical interconnect structure (24), also referred to herein as an optical interconnect fabric. The basic concept underlying the structure of the arrangement (20) is to position the processing elements and memory cells on small chips (22) that are fabricated in mass production using inexpensive technology, for example 0.25 micron technology, and interconnected with the optical interconnect fabric (24). Packaged with the optical interconnect structure (24), a plurality of inexpensive chips (22) provides sufficient performance at a small fraction of the cost of a processor/memory arrangement implemented on a single large computer chip (0.065 micron chip).
Abstract:
The invention presents a computational paradigm that provides the tools to take advantage of the parallelism inherent in parallel algorithms across the full spectrum, from algorithms through architecture to implementation. The invention provides a new processing architecture that extends the standard instruction set of the conventional uniprocessor architecture. The architecture used to implement this computational paradigm includes a thread control unit (34), a spawn control unit (38), and an enabled instruction memory (50). The architecture initiates multiple threads and executes them in parallel. The threads are controlled such that each may be suspended or allowed to execute at its own pace.