Abstract:
The high-performance RISC microprocessor architecture, based on a central memory, allows concurrent execution of instructions obtained from memory through an instruction fetch unit that has multiple fetch paths for a main program instruction stream, a target conditional-branch instruction stream, and a procedural instruction stream. The target conditional-branch fetch path allows both possible instruction streams for a conditional branch instruction to be fetched. The procedural instruction fetch path allows a supplementary instruction stream to be accessed without clearing the main or target fetch buffers. Each instruction set comprises a plurality of fixed-length instructions. An instruction FIFO (first-in, first-out) is provided to buffer instruction sets in a plurality of instruction set buffers, including a first and a second buffer. An instruction execution unit comprising a register file and a plurality of functional units is provided with an instruction control unit that can examine the instruction sets in the first and second buffers and schedule any of these instructions for execution by available functional units. Multiple data paths between the functional units and the register file give the functional units multiple independent accesses to the register file, as required for execution of the respective instructions.
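The scheduling step in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical instruction record with "name" and "unit" fields and a hypothetical table of free functional units; none of these names come from the abstract, and the sketch ignores data dependencies entirely.

```python
def schedule(first_buffer, second_buffer, free_units):
    """Pick instructions from both set buffers for the available units.

    Each instruction is a dict with hypothetical "name" and "unit" keys;
    free_units maps a unit type to how many units of that type are idle.
    """
    available = dict(free_units)  # copy so the caller's table is untouched
    issued = []
    # Examine the first buffer, then the second, in program order.
    for instr in first_buffer + second_buffer:
        unit = instr["unit"]
        if available.get(unit, 0) > 0:
            available[unit] -= 1
            issued.append(instr["name"])
    return issued
```

An instruction in the second buffer can be picked ahead of a stalled one in the first buffer, which is the point of letting the control unit look at both buffers at once.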
Abstract:
Disclosed is a data processing apparatus comprising a decode device for decoding an instruction code that includes an operation code and two register designation codes, and an instruction execution device for executing an appropriate process according to the result decoded by the decode device, wherein the instruction execution device executes a first process when the two register designation codes are different from each other and executes a second process when they are equal.
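A minimal sketch of this dispatch rule follows. The 16-bit encoding (8-bit operation code, two 4-bit register fields) and the two processes (a register move when the fields differ, a register clear when they match) are assumptions chosen for illustration; the abstract specifies neither.

```python
def decode(word):
    """Split a hypothetical 16-bit instruction into (opcode, reg_a, reg_b)."""
    opcode = (word >> 8) & 0xFF
    reg_a = (word >> 4) & 0xF
    reg_b = word & 0xF
    return opcode, reg_a, reg_b


def execute(word, regs):
    """Run one instruction and report which process was selected."""
    _opcode, reg_a, reg_b = decode(word)
    if reg_a != reg_b:
        # First process (assumed): move regs[reg_b] into regs[reg_a].
        regs[reg_a] = regs[reg_b]
        return "first"
    # Second process (assumed): a move onto itself would do nothing,
    # so the same encoding is reused to clear the register.
    regs[reg_a] = 0
    return "second"
```

Reusing an otherwise useless encoding this way gives one operation code two meanings without widening the instruction.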
Abstract:
A memory stack used for storing microinstruction addresses in a pipelined CPU is constructed as a last-in, first-out memory using a stack pointer which applies a read control to one location of the stack and applies a write control to the next higher location. An unconditional read and write is performed every machine cycle, before the microinstruction can be decoded. If a stack Push or Pop is then decoded, the data from the write bus or the data on the read bus is used, and the pointer is incremented or decremented accordingly. Push and Pop correspond to a Call or Return microinstruction. Thus the delay in decoding the microinstruction does not prevent completion of the stack operation in one machine cycle.
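The read-and-write-before-decode idea can be sketched as below. The class name, the stack depth, and the wrap-around of the pointer are assumptions made for the example and are not stated in the abstract.

```python
class MicroStack:
    """LIFO that reads and writes every cycle, before the decode is known."""

    def __init__(self, depth=8):
        self.cells = [0] * depth
        self.sp = 0  # read location; the write location is sp + 1

    def cycle(self, write_data, op=None):
        depth = len(self.cells)
        # Unconditional access: read at the pointer, write one location above.
        read_data = self.cells[self.sp]
        self.cells[(self.sp + 1) % depth] = write_data
        # The decode result arrives late and only moves the pointer.
        if op == "push":
            self.sp = (self.sp + 1) % depth  # keep the value just written
            return None
        if op == "pop":
            self.sp = (self.sp - 1) % depth  # consume the value just read
            return read_data
        return None  # neither Push nor Pop: the speculative write is ignored
```

A write that turns out not to be a Push lands above the top of the stack, so it never disturbs live entries.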
Abstract:
A system for issuing a family of instructions during a single clock includes a decoder for decoding the family of instructions and logic, responsive to the decode result, for determining whether resource conflicts would occur if the family were issued during one clock. If no resource conflicts occur, an execution unit executes the family regardless of whether dependencies among the instructions in the family exist.
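A minimal sketch of the issue decision follows. The unit table, the "unit" field on each instruction, and the fallback of issuing only the first instruction on a conflict are assumptions; the abstract does not say what happens when a conflict is found.

```python
from collections import Counter

# Hypothetical machine: how many units of each type can start work per clock.
UNITS = {"alu": 2, "mem": 1, "branch": 1}


def has_resource_conflict(family, units=UNITS):
    """True if the family needs more units of some type than exist."""
    needed = Counter(instr["unit"] for instr in family)
    return any(count > units.get(unit, 0) for unit, count in needed.items())


def issue(family, units=UNITS):
    """Return the instructions issued in this clock.

    Dependencies between family members are deliberately not checked here;
    per the abstract, the execution unit handles them.
    """
    if not has_resource_conflict(family, units):
        return list(family)
    return list(family[:1])  # assumed fallback: issue one instruction
```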
Abstract:
The local variables of procedures are automatically mapped from the main memory of a computer into a circular buffer comprising machine registers. As instructions are fetched from the main memory, the instructions are partially decoded by adding the stack pointer to the offset and then stored in an instruction cache. In cases where the sum of the memory locations required by the procedures exceeds available register memory, the contents of the registers nearest the maximum stack pointer are flushed back to main memory.
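The partial decode step can be sketched as a single address calculation. The register count and the modulo wrap are assumptions that follow from reading "circular buffer" literally; the flush to main memory is not modeled.

```python
NUM_REGS = 16  # assumed size of the circular register buffer


def partially_decode(offset, stack_pointer, num_regs=NUM_REGS):
    """Resolve a local-variable offset to a register index.

    The stack pointer is added to the offset at fetch time, so the cached
    instruction already names a physical register.
    """
    return (stack_pointer + offset) % num_regs
```

For example, with the stack pointer at 14 and an offset of 3, the variable wraps around to register 1.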
Abstract:
A threaded interpretive processor includes an input/output (I/O) bus (10) and an address bus (12) for carrying data thereon. An internal ROM/RAM (80) is interfaced with the I/O bus (10) and is addressable from the address bus (12). Instructions placed on the I/O bus (10) are clocked onto the address bus (12) through an instruction pointer (86) in response to a system clock. The data on the I/O bus (10) is also clocked to a microcode ROM (60) through an instruction register (58). The microcode ROM (60) outputs microcode instructions to control the system operation. The microcode instructions control a parameter stack. The parameter stack consists of an eight-register rotary stack (44) whose outputs are simultaneously accessible by two output buses (46) and (48) and whose inputs are accessible by an interface bus (36) and a data input bus (50). The outputs of the rotary stack (44) are input to an arithmetic logic unit (16), the output of which is input back into the rotary stack (44). Transfer gates are provided to control data flow on the output buses and input buses such that the data in the rotary stack (44) can be manipulated. Addresses of microcode instructions are sequentially placed onto the I/O bus (10) for controlling the microcode ROM (60), and the instruction pointer (86) increments this instruction address to select the next sequential instruction address. In this manner, instructions can be executed sequentially in successive clock cycles.
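The parameter stack and its ALU feedback path can be sketched as below. Which two registers drive the two output buses is an assumption (here, the top two entries), as are the class and function names; the abstract only says that two outputs are available at once.

```python
class RotaryStack:
    """Eight-register rotary stack; the oldest entry is overwritten on wrap."""

    SIZE = 8

    def __init__(self):
        self.regs = [0] * self.SIZE
        self.top = 0

    def push(self, value):
        self.top = (self.top + 1) % self.SIZE
        self.regs[self.top] = value

    def pop(self):
        value = self.regs[self.top]
        self.top = (self.top - 1) % self.SIZE
        return value

    def outputs(self):
        """Values on the two output buses (assumed: the top two entries)."""
        return self.regs[self.top], self.regs[(self.top - 1) % self.SIZE]


def alu_add(stack):
    """Feed both outputs to the ALU and write the sum back to the stack."""
    a, b = stack.outputs()
    stack.pop()
    stack.pop()
    stack.push(a + b)
```

Reading both operands in the same step is what lets a two-operand operation complete without a separate fetch for each operand.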
Abstract:
Disclosed are a method and apparatus for processing a neural network feature map using a plurality of accelerators. The method includes: reading first feature data of the neural network feature map from a first shift register array in a first accelerator among a plurality of neural network accelerators, and first weight data corresponding to the first feature data from a first buffer; performing a preset operation on the first feature data and the first weight data using the first accelerator, to obtain a first operation result; shifting, according to a preset shift rule, first overlapping feature data that is in the first feature data and is required by a second accelerator to a second shift register array of the second accelerator; and performing a preset operation on second feature data from the second shift register array, including the first overlapping feature data, and on read second weight data using the second accelerator, to obtain a second operation result.
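The overlap-shifting step can be sketched in one dimension. The choice of a sliding-window multiply-accumulate as the "preset operation", the list-based shift register arrays, and the function names are all assumptions made for the example.

```python
def convolve(features, weights):
    """Sliding-window multiply-accumulate (the assumed preset operation)."""
    k = len(weights)
    return [
        sum(f * w for f, w in zip(features[i:i + k], weights))
        for i in range(len(features) - k + 1)
    ]


def shift_overlap(src_array, dst_array, count):
    """Shift the last `count` entries of src in front of dst's own data."""
    return src_array[-count:] + dst_array


weights = [1, 0, -1]
first_array = [1, 4, 2, 8]   # feature data held by the first accelerator
second_array = [5, 7]        # feature data held by the second accelerator

first_result = convolve(first_array, weights)
# The second accelerator needs the last len(weights) - 1 values of the first.
second_input = shift_overlap(first_array, second_array, len(weights) - 1)
second_result = convolve(second_input, weights)
```

Concatenating the two results gives the same output as running the operation over the whole row on one device, while the overlapping values are moved between accelerators rather than read from memory a second time.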