摘要:
A microprocessor architecture including a finite state machine in combination with a microcode instruction cache for executing microinstructions. Microinstructions which would normally result in small sequences of high-repetition looped operations are implemented in a finite state machine (FSM). The use of the FSM is more energy-efficient than looping instructions in a cache or register set. In addition, the flexibility of a cache, or other memory oriented approach, in executing microcode instructions is still available. A microinstruction is identified as an FSM operation (as opposed to a cache operation) by an ID tag. Other fields of the microinstruction can be used to identify the type of FSM circuitry to use, direct configuration of a FSM to implement the microinstruction, indicate that certain fields are to be implemented in one or more FSMs and/or in memory-oriented operations such as in a cache or register.
摘要:
A memory controller to provide memory access services in an adaptive computing engine is provided. The controller comprises: a network interface configured to receive a memory request from a programmable network; and a memory interface configured to access a memory to fulfill the memory request from the programmable network, wherein the memory interface receives and provides data for the memory request to the network interface, the network interface configured to send data to and receive data from the programmable network.
摘要:
A computing system and method are provided for algorithmic electronic system level design. An exemplary system comprises a plurality of databases for storing a plurality of functional models, a plurality of computational element models, and a plurality of hardware definition representations. An application design processor is adapted to perform a first functional simulation of an algorithm using a plurality of computational element architecture definitions to generate a first selection of a plurality of computational elements and corresponding control code for an implementation of the algorithm. A control and memory modeling processor is adapted to generate a plurality of flow transforms from the algorithm and to convert the plurality of flow transforms into the plurality of plurality of computational element models. A system simulation processor is adapted to convert the plurality of computational element models into the plurality of hardware definition representations and to perform a second functional simulation of the algorithm using the plurality of computational element models corresponding to the first selection and the corresponding control code.
摘要:
A method and system of efficient use and programming of a multi-processing core device. The system includes a programming construct that is based on stream-domain code. A programmable core based computing device is disclosed. The computing device includes a plurality of processing cores coupled to each other. A memory stores stream-domain code including a stream defining a stream destination module and a stream source module. The stream source module places data values in the stream and the stream conveys data values from the stream source module to the stream destination module. A runtime system detects when the data values are available to the stream destination module and schedules the stream destination module for execution on one of the plurality of processing cores.
摘要:
A method and system of compiling and linking source stream programs for efficient use of multi-node devices. The system includes a compiler, a linker, a loader and a runtime component. The process converts a source code stream program to a compiled object code that is used with a programmable node based computing device having a plurality of processing nodes coupled to each other. The programming modules include stream statements for input values and output values in the form of sources and destinations for at least one of the plurality of processing nodes and stream statements that determine the streaming flow of values for the at least one of the plurality of processing nodes. The compiler converts the source code stream based program to object modules, object module instances and executables. The linker matches the object module instances to at least one of the multiple cores. The loader loads the tasks required by the object modules in the nodes and configure the nodes matched with the object module instances. The runtime component runs the converted program.
摘要:
A method and system of efficient use and programming of a multi-processing core device. The system includes a programming construct that is based on stream-domain code. A programmable core based computing device is disclosed. The computing device includes a plurality of processing cores coupled to each other. A memory stores stream-domain code including a stream defining a stream destination module and a stream source module. The stream source module places data values in the stream and the stream conveys data values from the stream source module to the stream destination module. A runtime system detects when the data values are available to the stream destination module and schedules the stream destination module for execution on one of the plurality of processing cores.
摘要:
The present invention includes an apparatus, method and system for generating a configuration of an adaptive circuit which is inseparable from selected content. Either the adaptive circuit or encrypted, selected content has a unique identifier. In one of the preferred method and system embodiments in which the adaptive circuit has the unique identifier, a request for the selected content is received, along with the unique identifier, such as by a network server. The selected content is then encrypted, based upon the unique identifier, to form encrypted content. Configuration information for the adaptive circuit, corresponding to the unique identifier and the encrypted content, is generated to form corresponding configuration information. A service provider, such as through a network server, transfers the encrypted content and the corresponding configuration information to the adaptive circuit having the unique identifier, which may then be configured for use of the selected content. As a consequence, the present invention creates adaptive hardware configurations which are uniquely coupled to the selected content.
摘要:
The present invention includes an apparatus, method and system for generating a configuration of an adaptive circuit which is inseparable from selected content. Either the adaptive circuit or encrypted, selected content has a unique identifier. In one of the preferred method and system embodiments in which the adaptive circuit has the unique identifier, a request for the selected content is received, along with the unique identifier, such as by a network server. The selected content is then encrypted, based upon the unique identifier, to form encrypted content. Configuration information for the adaptive circuit, corresponding to the unique identifier and the encrypted content, is generated to form corresponding configuration information. A service provider, such as through a network server, transfers the encrypted content and the corresponding configuration information to the adaptive circuit having the unique identifier, which may then be configured for use of the selected content. As a consequence, the present invention creates adaptive hardware configurations which are uniquely coupled to the selected content.
摘要:
A method and system of compiling and linking source stream programs for efficient use of multi-node devices. The system includes a compiler, a linker, a loader and a runtime component. The process converts a source code stream program to a compiled object code that is used with a programmable node based computing device having a plurality of processing nodes coupled to each other. The programming modules include stream statements for input values and output values in the form of sources and destinations for at least one of the plurality of processing nodes and stream statements that determine the streaming flow of values for the at least one of the plurality of processing nodes. The compiler converts the source code stream based program to object modules, object module instances and executables. The linker matches the object module instances to at least one of the multiple cores. The loader loads the tasks required by the object modules in the nodes and configure the nodes matched with the object module instances. The runtime component runs the converted program.
摘要:
An apparatus, computer-readable medium, and computer-implemented method for parallelization of a computer program on a plurality of computing cores includes receiving a computer program comprising a plurality of commands, decomposing the plurality of commands into a plurality of node networks, each node network corresponding to a command in the plurality of commands and including one or more nodes corresponding to execution dependencies of the command, mapping the plurality of node networks to a plurality of systolic arrays, each systolic array comprising a plurality of cells and each non-data node in each node network being mapped to a cell in the plurality of cells, and mapping each cell in each systolic array to a computing core in the plurality of computing cores.