Abstract:
In a data processing system employing virtual memory techniques and capable of performing a plurality of overlapping scalar and vector data processing operations, apparatus and method are provided to allow continuation of program execution after one or more vector load/store instructions, which refer to data values that are not currently in memory, receive page faults. At the occurrence of such a page fault, all instructions currently in execution are allowed to be completed, whereupon information summarizing the page fault condition is recorded in memory for use by the operating system software and a vector exception is generated. Operating system software responds to this exception, examines the fault information, causes the missing pages to be read into the main memory unit from the mass storage media, re-executes the exception-producing vector instruction(s) and continues with the program execution.
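The operating-system side of this sequence can be sketched in C. This is a minimal illustration only; the record layout and the helpers page_in() and reissue_vector_instruction() are assumptions standing in for whatever interfaces the system actually provides.

```c
#include <stddef.h>

/* Assumed layout of one entry of the recorded page-fault summary. */
struct vector_fault_record {
    unsigned long faulting_va;   /* virtual address that was not resident     */
    unsigned long instr_pc;      /* PC of the faulting vector load/store      */
    size_t        element;       /* vector element index that took the fault  */
};

/* Assumed OS primitives (hypothetical names). */
extern void page_in(unsigned long va);
extern void reissue_vector_instruction(unsigned long pc, size_t element);

/* Invoked on the vector exception: all in-flight instructions have already
 * completed and the fault summary has been written to memory. */
void handle_vector_page_faults(const struct vector_fault_record *rec, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        page_in(rec[i].faulting_va);                 /* bring missing page in   */
        reissue_vector_instruction(rec[i].instr_pc,  /* re-execute the faulting */
                                   rec[i].element);  /* vector instruction      */
    }
    /* Program execution then resumes with the next sequential instruction. */
}
```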
Abstract:
An integrated circuit (100) includes a firewall input terminal, a first circuit (110, 120, 170, 172), and a second circuit (220). The firewall input terminal is for receiving a firewall input signal. The first circuit (110, 120, 170, 172) is coupled to a first power supply voltage terminal (203) and has an output for providing a control signal. The second circuit is coupled to a second power supply voltage terminal (210), to the firewall input terminal (214), and to the first circuit (110, 120, 170, 172). When the firewall input signal is inactive, an activation of the control signal affects the operation of the second circuit. When the firewall input signal is active, an activation of the control signal does not affect the operation of the second circuit.
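The gating behaviour of the firewall input can be modelled as a single combinational function. This is a hedged sketch; the signal and function names are illustrative, not taken from the patent.

```c
#include <stdbool.h>

/* True when an activation of the control signal is allowed to affect the
 * second circuit: only while the firewall input signal is inactive. */
static bool second_circuit_sees_control(bool firewall_active, bool control_active)
{
    return control_active && !firewall_active;
}
```

With the firewall input asserted, the function returns false regardless of the control signal, matching the isolation described above.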
Abstract:
In a multiprocessor data processing system, a data element in the main memory unit that has system-wide significance may need to be altered in a controlled manner. Because other data processing units can have access to this data element, the alteration must be synchronized so that the other data processing units are not altering the same data element simultaneously. The present invention includes an instruction that acquires access to an interlock signal in the main memory unit and initiates an interlock there, thereby excluding other data processing units from gaining access to the interlock signal at the same time. The instruction causes the data element associated with the interlock signal to be transferred to the data processing unit, where the data element is saved, can be entered into mask apparatus, and can then have a quantity added to it. The altered data element is returned to its main memory unit location and the main memory interlock signal is released, thereby completing the instruction.
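The read-mask-add-write sequence under the interlock can be sketched in C11, modelling the main-memory interlock signal with an atomic flag. The function name and the mask/addend parameters are illustrative assumptions, not the instruction's actual operands.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sketch of the interlocked update: acquire the interlock, save and modify
 * the data element, write it back, then release the interlock. */
uint64_t interlocked_mask_add(atomic_flag *interlock, uint64_t *element,
                              uint64_t mask, uint64_t addend)
{
    /* Acquire the memory interlock, excluding other data processing units. */
    while (atomic_flag_test_and_set(interlock))
        ;                               /* spin until the interlock is free   */

    uint64_t saved = *element;          /* transfer the element to this unit  */
    *element = (saved & mask) + addend; /* mask, then add the quantity        */

    atomic_flag_clear(interlock);       /* release the main-memory interlock  */
    return saved;                       /* original value is retained locally */
}
```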
Abstract:
A data processing system (100) comprises a system bus (120), a plurality of devices (110, 150, 160, 170) coupled to the system bus (120), a bus monitor circuit (140), and a clock generator (130). The plurality of devices (110, 150, 160, 170) includes at least one bus master (110, 150) which is capable of performing accesses on the system bus (120). The bus monitor circuit (140) is coupled to the at least one bus master (110, 150), and has an output for providing a bus idle signal to indicate that no bus master is attempting to perform an access on the system bus (120). The clock generator (130) has an output coupled to at least one of the plurality of devices (110, 150, 160, 170) and provides a bus clock signal having a first frequency when the bus idle signal is inactive and having a second frequency lower than the first frequency when the bus idle signal is active.
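The clock generator's policy reduces to choosing between two frequencies based on the bus idle signal. The sketch below is illustrative; the frequency values and the function name are assumptions, not figures from the patent.

```c
#include <stdbool.h>
#include <stdint.h>

#define BUS_FREQ_FULL_HZ  66000000u   /* first (full-speed) bus clock frequency */
#define BUS_FREQ_IDLE_HZ   8000000u   /* second, lower frequency while idle     */

/* Slow the bus clock while no bus master is attempting an access. */
static uint32_t select_bus_clock(bool bus_idle)
{
    return bus_idle ? BUS_FREQ_IDLE_HZ : BUS_FREQ_FULL_HZ;
}
```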
Abstract:
A CPU of the RISC type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes, with the instruction set limited to register-to-register operations and register load/store operations. Byte manipulation instructions include the facility for doing in-register byte extract, insert and masking, along with non-aligned load and store instructions. The load/locked and store/conditional instructions permit the implementation of atomic byte writes. By providing a conditional move instruction, many short branches can be eliminated altogether. A conditional move instruction tests a register and moves a second register to a third if the condition is met; this function can be substituted for short branches and thus maintain the sequentiality of the instruction stream. Performance can be speeded up by predicting the target of a branch and prefetching the new instruction based upon this prediction; a branch prediction rule is followed that requires all forward branches to be predicted not-taken and all backward branches to be predicted as taken. Another embodiment uses unused bits in the standard-sized instruction to provide a hint of the expected target address for jump and jump to subroutine instructions or the like. The target can thus be prefetched before the actual address has been calculated and placed in a register. In addition, the unused displacement part of the jump instruction can contain a field to define the actual type of jump, i.e., jump, jump to subroutine, return from subroutine, and thus place a predicted target address in a stack to allow prefetching before the instruction has been executed. The processor can employ a variable memory page size, so that the entries in a translation buffer for implementing virtual addressing can be optimally used. A granularity hint is added to the page table entry to define the page size for this entry. An additional feature is the addition of a prefetch instruction which serves to move a block of data to a faster-access cache in the memory hierarchy before the data block is to be used.
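The static branch-prediction rule described above ("forward not-taken, backward taken") is simple enough to state directly in code. This is an illustration of the rule only; the function name is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Backward branches (typically loop closures) are predicted taken;
 * forward branches are predicted not-taken. */
static bool predict_branch_taken(uint64_t branch_pc, uint64_t target_pc)
{
    return target_pc < branch_pc;
}
```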
Abstract:
A high-performance CPU of the RISC (reduced instruction set) type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes. The instruction set is limited to register-to-register operations and register load/store operations. Byte manipulation instructions, included to permit use of previously-established data structures, include the facility for doing in-register byte extract, insert and masking, along with non-aligned load and store instructions. The provision of load/locked and store/conditional instructions permits the implementation of atomic byte writes. By providing a conditional move instruction, many short branches can be eliminated altogether. A conditional move instruction tests a register and moves a second register to a third if the condition is met; this function can be substituted for short branches and thus maintain the sequentiality of the instruction stream. Performance can be speeded up by predicting the target of a branch and prefetching the new instruction based upon this prediction; a branch prediction rule is followed that requires all forward branches to be predicted not-taken and all backward branches (as is common for loops) to be predicted as taken. Another performance improvement makes use of unused bits in the standard-sized instruction to provide a hint of the expected target address for jump and jump to subroutine instructions or the like. The target can thus be prefetched before the actual address has been calculated and placed in a register. In addition, the unused displacement part of the jump instruction can contain a field to define the actual type of jump, i.e., jump, jump to subroutine, return from subroutine, and thus place a predicted target address in a stack to allow prefetching before the instruction has been executed. The processor can employ a variable memory page size, so that the entries in a translation buffer for implementing virtual addressing can be optimally used. A granularity hint is added to the page table entry to define the page size for this entry. An additional feature is the addition of a prefetch instruction which serves to move a block of data to a faster-access cache in the memory hierarchy before the data block is to be used.
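The in-register byte extract and insert operations mentioned above can be illustrated with plain C bit manipulation on a 64-bit register value. These are hedged sketches of the operation's effect, not the architecture's mnemonics or encodings.

```c
#include <stdint.h>

/* Isolate the byte in the given lane (0..7) of a 64-bit register. */
static uint64_t extract_byte(uint64_t reg, unsigned lane)
{
    return (reg >> (8 * (lane & 7))) & 0xFFu;
}

/* Place a byte into the chosen lane, preserving the other seven lanes;
 * together with extract, this supports byte handling without byte-wide
 * memory accesses. */
static uint64_t insert_byte(uint64_t reg, unsigned lane, uint64_t byte)
{
    uint64_t shift = 8 * (lane & 7);
    return (reg & ~(0xFFull << shift)) | ((byte & 0xFFu) << shift);
}
```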
Abstract:
In one embodiment, a system comprises a memory; a memory interface coupled to the memory; a processor unit coupled to the memory interface; a second interface coupled to the processor unit; and a graphics processing unit. The processor unit comprises at least one processor core and a display controller configured to couple to a display. The graphics processing unit is configured to render data into a frame buffer representing an image to be displayed on the display. The processor unit is configured to deactivate the second interface if the graphics processing unit is not rendering, and the display controller is configured to read the frame buffer data for display even if the second interface is deactivated.
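The power-management decision described above can be summarized in a small C sketch. The structure and function names are illustrative assumptions, not the system's actual API.

```c
#include <stdbool.h>

struct soc_state {
    bool gpu_rendering;     /* GPU currently rendering into the frame buffer */
    bool second_if_active;  /* state of the second (GPU-facing) interface    */
};

/* Deactivate the second interface whenever the GPU is not rendering; the
 * display controller keeps reading the frame buffer through the memory
 * interface, so scan-out continues either way. */
static void update_second_interface(struct soc_state *s)
{
    s->second_if_active = s->gpu_rendering;
}
```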
Abstract:
A high-performance CPU of the RISC (reduced instruction set) type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes. The instruction set is limited to register-to-register operations and register load/store operations. Byte manipulation instructions, included to permit use of previously-established data structures, include the facility for doing in-register byte extract, insert and masking, along with non-aligned load and store instructions. The provision of load/locked and store/conditional instructions permits the implementation of atomic byte writes. By providing a conditional move instruction, many short branches can be eliminated altogether. A conditional move instruction tests a register and moves a second register to a third if the condition is met; this function can be substituted for short branches and thus maintain the sequentiality of the instruction stream.
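The effect of the conditional move described above can be illustrated in C: the condition is tested and the value is moved only when the test succeeds, with no change of control flow. This is a sketch of the semantics only; the function name is an assumption, and whether a compiler emits a conditional-move instruction for it depends on the target.

```c
#include <stdint.h>

/* Equivalent to: if (test != 0) dst = src;  expressed without a branch,
 * preserving the sequentiality of the instruction stream. */
static uint64_t cmov_if_nonzero(uint64_t test, uint64_t src, uint64_t dst)
{
    return test != 0 ? src : dst;
}
```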
Abstract:
An instruction eases exception handling in a data processing system having one or more parallel pipelined execution units by permitting the central processing unit to complete instructions currently being processed by the execution units, but preventing further instructions from being initiated until all currently executing instructions have been completed and all outstanding exception conditions have been resolved. After all the instructions preceding the DRAIN instruction of the present invention in the program instruction sequence have been executed, the central processing unit can continue to execute the sequential program instructions when no arithmetic exception has been identified, or can invoke an exception handling procedure when an arithmetic exception has been identified. The instruction is typically positioned in an instruction sequence after an instruction that has a high degree of probability of resulting in the identification of an arithmetic exception condition. The DRAIN instruction permits the source of the exception to be localized and permits the response to all arithmetic exceptions associated with instructions initiated before the DRAIN instruction, but identified after the execution of the DRAIN instruction, to be handled in the same context environment in which the instruction was initiated.
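The typical placement of such a barrier can be sketched in C. The intrinsic drain_exceptions() is hypothetical, standing in for the DRAIN instruction, and risky_divide() is a stand-in for any operation likely to raise an arithmetic exception.

```c
#include <stdint.h>

/* Hypothetical intrinsic: waits for all in-flight instructions to complete
 * and for any pending arithmetic exceptions to be reported. */
extern void drain_exceptions(void);

/* Stand-in for an operation likely to produce an arithmetic exception. */
extern double risky_divide(double a, double b);

double compute(double a, double b)
{
    double q = risky_divide(a, b);  /* exception, if any, is reported lazily   */
    drain_exceptions();             /* localize it here, in the same context   */
                                    /* environment in which it was initiated   */
    return q;                       /* safe to continue: no exceptions pending */
}
```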
Abstract:
A method for reading data blocks from main memory by central processing units in a multiprocessor system containing write-back caches. Load or gather instructions contain a write-intent flag. The status of the write-intent flag is determined. It is also determined whether a data block requested in the instruction by one of the processors is located in a corresponding cache, and if so, the requested data block is returned to the processor. If the data block is not in the cache and the write-intent flag indicates that the block will not be modified, the data block is read from main memory without obtaining a write privilege. The requested data block is subsequently returned from the cache to the processor. If the data block is not in the cache and the write-intent flag indicates the data block will be modified by the processor, then the data block is read from main memory while obtaining the write privilege. Subsequently, the requested data block is returned from the cache to the processor.
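The decision path described above can be sketched as a C routine. The cache and bus helper functions are illustrative assumptions used to show where the write-intent flag changes the memory read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed cache/bus primitives (hypothetical names). */
extern bool     cache_lookup(uint64_t block_addr, uint64_t *data);
extern uint64_t memory_read_shared(uint64_t block_addr);     /* no write privilege   */
extern uint64_t memory_read_exclusive(uint64_t block_addr);  /* with write privilege */
extern void     cache_fill(uint64_t block_addr, uint64_t data, bool writable);

uint64_t load_block(uint64_t block_addr, bool write_intent)
{
    uint64_t data;
    if (cache_lookup(block_addr, &data))
        return data;                           /* hit: return directly from cache */

    if (write_intent)
        data = memory_read_exclusive(block_addr);  /* block will be modified          */
    else
        data = memory_read_shared(block_addr);     /* read-only: skip write privilege */

    cache_fill(block_addr, data, write_intent);
    return data;                               /* requested block returned via cache */
}
```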