摘要:
A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instructions bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes are then passed to an align latch where they are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions. The isolated complex instructions are decoded into nano-instructions which are processed by a RISC processor core.
摘要:
The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load/store unit is provided whose main purpose is to make load requests out-of-order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out-of-order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit. Thus, the three main tasks of the load/store unit are: (1) handling out of order cache requests; (2) detecting address collisions; and (3) alignment of data.
摘要:
A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instructions bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes are then passed to an align latch where they are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions. The isolated complex instructions are decoded into nano-instructions which are processed by a RISC processor core.
摘要:
A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instructions bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes are then passed to an align latch where they are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions. The isolated complex instructions are decoded into nano-instructions which are processed by a RISC processor core.
摘要:
A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instructions bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes are then passed to an align latch where they are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions. The isolated complex instructions are decoded into nano-instructions which are processed by a RISC processor core.
摘要:
A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instructions bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes are then passed to an align latch where they are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions. The isolated complex instructions are decoded into nano-instructions which are processed by a RISC processor core.
摘要:
The present invention provides a microcomputer that makes it possible to implement a real-time trace on a mass-produced chip using few terminals, acquire trace information from within a specified range, and measure execution times, together with electronic equipment and a debugging system comprising this microcomputer.A trace information output section (16) outputs trace information for implementing a real-time trace, to four dedicated terminals. It outputs instruction execution status information (DST[2:0]) of the CPU to three terminals and the PC value (DPCO) of a branch destination when an PC absolute branch has occurred, serially to one terminal. A microcomputer (10) outputs information indicating the start and end of a trace range or execution-time measurement range to DST[2] in a predetermined sequence. A debugging tool (20) determines the start and end of the trace range or execution-time measurement range from the values in DST[2].
摘要:
The present invention provides a microcomputer that makes it possible to implement a real-time trace on a mass-produced chip using few terminals, acquire trace information from within a specified range, and measure execution times, together with electronic equipment and a debugging system comprising this microcomputer. A trace information output section (16) outputs trace information for implementing a real-time trace, to four dedicated terminals. It outputs instruction execution status information (DST[2:0]) of the CPU to three terminals and the PC value (DPCO) of a branch destination when an PC absolute branch has occurred, serially to one terminal. A microcomputer (10) outputs information indicating the start and end of a trace range or execution-time measurement range to DST[2] in a predetermined sequence. A debugging tool (20) determines the start and end of the trace range or execution-time measurement range from the values in DST[2].
摘要:
The data processing circuit of this invention enables efficient description and execution of processes that act upon the stack pointer, using short instructions. It also enables efficient description of processes that save and restore the contents of registers, increasing the speed of processing of interrupts and subroutine calls and returns. A CPU that uses this data processing circuit comprises a dedicated stack pointer register SP and uses an instruction decoder to decode a group of dedicated stack pointer instructions that specify the SP as an implicit operand. This group of dedicated stack pointer instructions are implemented in hardware by using general-purpose registers, the PC, the SP, an address adder, an ALU, a PC incrementer, internal buses, internal signal lines, and external buses. This group of dedicated stack pointer instructions comprises SP-relative load instructions, stack pointer move instructions, a call instruction, a ret instruction, a sequential push instruction, and a sequential pop instruction.
摘要:
An objective of this invention is a design that improves the memory usage ratio and execution speed of a sum-of-products operation instruction, improves the critical path of sum-of-products operations, and prevents overflows. A sum-of-products operation circuit executes sum-of-products operations a number of times that is specified by number-of-executions information comprised within a sum-of-products operation instruction, under the control of a control circuit. The number of times the sum-of-products operation is to be executed is set into a register, that number is decremented every time one cycle of the sum-of-products operation ends, and the sum-of-products operation instruction ends when the value in the register reaches zero. If an interrupt is received during the execution of a plurality of sum-of-products operations, execution of the sum-of-products operations resumes after the interrupt processing. First and second sum-of-products input data are read at the same time by a single memory access. A 16-bit×16-bit multiplication result is added by a 32-bit adder, and upper 32-bit data is either incremented or decremented when a carry or borrow is generated by a lower 32-bit add.