摘要:
An arbitration mechanism comprising a request FIFO (408) for storing ones and zeros corresponding to received requests in the order that they are made. A one indicates that the request was made by the module in which the FIFO is located, and a zero indicates that the request was made by one of a number of other similar modules. The request status information from the other modules is received over signal lines (411) connected between the modules. This logic separates multiple requests into time-ordered slots, such that all requests in a particular time slot may be serviced before any requests in the next time slot. A store (409) stores a unique logical module number. An arbiter (410) examines this logical number bit-by-bit in successive cycles and places a one in a grant queue (412) upon the condition that the bit examined in a particular cycle is a zero and signals this condition over the signal lines. If the bit examined in a particular cycle is a one, the arbiter drops out of contention and signals this condition over the signal lines (411). This logic orders multiple requests within a single time slot, which requests are made by multiple modules, in accordance with the logical module numbers of the modules making the requests. The grant queue (412) stores status information (ones and zeros) corresponding to granted requests in the order that they are granted--a one indicating that the granted request was granted to the module in which the grant queue is located, and a zero indicating that the granted request was granted to one of the other modules. The granted request status information from the other modules is received over the signal lines (411). This logic separates multiple granted requests such that only one request corresponding to a particular module is at the head of any one grant queue at any one time.
摘要:
An execution unit which is part of a general-purpose microprocessor, partitioned between two integrated circuit chips, with the execution unit on one chip and an instruction unit on another chip. The execution unit provides the interface for accessing a main memory to thereby fetch data and macroinstructions for transfer to the instruction unit when requested to do so by the instruction unit. The execution unit receives arithmetic microinstructions in order to perform various arithmetic operations, and receives access-memory microinstructions in order to develop memory references from logical addresses received from the instruction unit. Arithmetic operations are performed by a data manipulation unit which contains registers and arithmetic capability, controlled by a math sequencer. Memory references are performed by a reference-generation unit which contains base-and-length registers and an arithmetic capability to generate and check addresses for referencing an off-chip main memory, and is controlled by an access sequencer.
摘要:
An instruction translator unit which receives an instruction stream from a main memory of a microprocessor, for latching data fields, for generating microinstructions necessary to emulate the function encoded in an instruction, and for transferring the data and microinstructions to a microinstruction execution unit over an output bus. The instruction unit includes an instruction decoder (ID) which interprets the fields of received instructions and generates single forced microinstructions and starting addresses of multiple-microinstruction routines. A microinstruction sequencer (MIS) accepts the forced microinstructions and the starting addresses and places on the output bus correct microinstruction sequences necessary to execute the received instruction. The microinstruction routines are stored in a read-only memory (ROM) in the MIS. The starting addresses received from the ID are used to index into and to fetch these microinstructions from the ROM. Forced microinstructions bypass the ROM and are transferred directly by the MIS to the execution unit.The ID processes macroinstructions comprised of variable bit length fields by utilizing an extractor in conjunction with a bit pointer (BIP) for stripping off the bits comprising a particular field. The extracted field is presented to a state machine which decodes the particular field and generates data, microinstructions and starting addresses relating to the particular field for use by the MIS. The state machine then updates the BIP by the bit count of the particular field so that it points to the next field to be extracted.
摘要:
A data processor architecture wherein the processors recognize two basic types of objects, an object being a representation of related information maintained in a contiguously-addresed set of memory locations. The first type of object contains ordinary data, such as characters, integers, reals, etc. The second type of object contains a list of access descriptors. Each access descriptor provides information for locating and defining the extent of access to an object associated with that access descriptor. The processors recognize complex objects that are combinations of objects of the basic types. One such complex object (a context) defines an environment for execution of objects accessible to a given instance of a procedural operation. The dispatching of tasks to the processors is accomplished by hardware-controlled queuing mechanisms (dispatching-port objects) which allow multiple sets of processors to serve multiple, but independent sets of tasks. Communication between asynchronous tasks or processes is accomplished by related hardware-controlled queuing mechanisms (buffered-port objects) which allow messages to move between internal processes or input/output processes without the need for interrupts. A mechanism is provided which allows the processors to communicate with each other. This mechanism is used to reawaken an idle processor to alert the processor to the fact that a ready-to-run process at a dispatching port needs execution.
摘要:
A parallel processor comprised of a plurality of processing nodes (10), each node including a processor (100-114) and a memory (116). Each processor includes means (100, 102) for executing instructions, logic means (114) connected to the memory for interfacing the processor with the memory and means (112) for internode communication. The internode communication means (112) connect the nodes to form a first array (8) of order n having a hypercube topology. A second array (21) of order n having nodes (22) connected together in a hypercube topology is interconnected with the first array to form an order n+l array. The order n+l array is made up of the first and second arrays of order n, such that a parallel processor system may be structured with any number of processors that is a power of two. A set of I/O processors (24) are connected to the nodes of the arrays (8, 21) by means of I/O channels (106). The means for internode communication (112) comprises a serial data channel driven by a clock that is common to all of the nodes.
摘要:
A broadcast pointer instruction has a first source operand (address pointer value) which is the starting address in a memory of message data to be broadcast to a number of processors through output ports. The broadcast pointer instruction has a first destination operand (first multibit mask), there being one bit position in the first mask for each one of the plurality of output ports. The address pointer value is loaded into each of the output ports whose numbers correspond to bit positions in the first mask that are set to be one, such that each output port that is designated in the first mask receives the starting address of the message data in the memory. A broadcast count instruction has a second source operand (a byte count value) equal to the number of bytes in the message data. The broadcast count instruction has a second destination operand (a second multibit mask), there being one bit position in the second mask for each one of the plurality of output ports. The byte count value is sent to each of the output ports whose numbers correspond to bit positions in the second mask register that are set to be one, such that each output port that is designated in the second mask receives the byte count value corresponding to the number of bytes in the message data that are to be transferred from the memory. Once the byte count is initialized, data are transferred from the starting address in memory over each output port designated in the masks, until the byte count is decremented to zero.
摘要:
A parallel processor network comprised of a plurality of nodes, each node including a processor containing a number of I/O ports, and a local memory. Each processor in the network is assigned a unique processor ID (202) such that the processor IDs of two processors connected to each other through port number n, vary only in the nth bit. Input message decoding means (200) and compare logic and message routing logic (204) create a message path through the processor in response to the decoding of an address message packet and remove the message path in response to the decoding of an end of transmission (EOT) Packet. Each address message packet includes a Forward bit used to send a message to a remote destination either within the network or to a foreign network. Each address packet includes Node Address bits that contain the processor ID of the destination node, it the destination node is in the local network. If the destination node is in a foreign network space, the destination node must be directly connected to a node in the local network space. In this case, the Node Address bits contain the processor ID of the local node connected to the destination node. Path creation means in said processor node compares the masked node address with its own processor ID and sends the address packet out the port number corresponding to the bit position of the first difference between the masked node address and its own processor ID, starting at bit n+1, where n is the number of the port on which the message was received.