Abstract:
A high-speed computer processor system has a high-speed interface for a graphics processor. A preferred embodiment combines a PowerPC microprocessor, the Giga-Processor Ultralite (GPUL) from International Business Machines Corporation (IBM), with a high-speed interface on a multi-chip module.
Abstract:
An instruction, corresponding methods, and circuitry for efficiently performing partial dot sum products are provided. The instruction may include a source select field for specifying one or more source word elements to participate in the dot sum operation. The instruction may also include a target select field for specifying one or more (or none) target word elements for storing the result of the dot sum operation.
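As a minimal software sketch of the instruction described above, the following models a dot-sum operation whose source and target select fields are bit masks over four word elements; the field names, vector width, and mask encoding are illustrative assumptions, not taken from the abstract.

```python
def partial_dot_sum(va, vb, src_select, tgt_select, vt):
    """Sum the products of the word elements selected by src_select and
    write the scalar result into the vt elements selected by tgt_select."""
    total = sum(a * b
                for i, (a, b) in enumerate(zip(va, vb))
                if (src_select >> i) & 1)  # only selected lanes participate
    # tgt_select may select one, several, or no target word elements
    return [total if (tgt_select >> i) & 1 else old
            for i, old in enumerate(vt)]

# Lanes 0 and 2 participate (1*5 + 3*7 = 26); the result lands in lane 3.
result = partial_dot_sum([1, 2, 3, 4], [5, 6, 7, 8],
                         src_select=0b0101, tgt_select=0b1000,
                         vt=[0, 0, 0, 0])
```

Allowing an empty target select mask matches the abstract's "one or more (or none)" wording: with `tgt_select=0` the target register is left unchanged.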
Abstract:
This invention pertains to an apparatus, a method, and a computer program stored on a computer-readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections: a first section storing a plurality of instruction words and, in association with at least one instruction word, a second section storing an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page instead comprises a single section storing only instruction words.
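The partitioning scheme above can be sketched in a few lines; the page size, section layout, and fetch interface here are assumptions chosen for illustration.

```python
PAGE_WORDS = 8  # toy page size: 8 instruction-word slots

def fetch(page, pte_extended, index):
    """Return (instruction_word, extension) for slot `index` of a code page.

    If the page table entry bit (pte_extended) is set, the page is split:
    the first PAGE_WORDS entries are instruction words and the next
    PAGE_WORDS entries are their extensions. Otherwise the page is a
    single section of plain instruction words with no extensions."""
    word = page[index]
    ext = page[PAGE_WORDS + index] if pte_extended else None
    return word, ext

plain = list(range(0x10, 0x18))          # single-section page
split = plain + list(range(0xE0, 0xE8))  # partitioned page with extensions
```

The page table entry bit thus lets legacy (unpartitioned) and extended code pages coexist, decided page by page at translation time.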
Abstract:
An apparatus and method for inhibiting data cache thrashing in a multi-threading execution mode by simulating a higher level of associativity in a data cache. The apparatus temporarily splits the data cache into multiple regions, and each region is selected according to a thread ID indicator in an instruction register. The data cache is split when the apparatus is in the multi-threading execution mode, as indicated by an enable-cache-split bit.
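The region selection can be modeled as follows; the cache geometry and the two-thread split are illustrative assumptions.

```python
NUM_SETS = 64  # toy direct-mapped cache with 64 sets

def select_set(address, thread_id, split_enabled):
    """Map an address to a cache set. In split mode the high part of the
    set index comes from the thread ID, giving each thread its own half
    of the cache so the threads cannot evict each other's lines."""
    index = address % NUM_SETS
    if split_enabled:
        half = NUM_SETS // 2
        index = (index % half) + (thread_id * half)  # one region per thread
    return index
```

With the enable-cache-split bit clear, both threads index the full cache identically; with it set, the same address maps to disjoint regions for different thread IDs, which is what suppresses the inter-thread thrashing.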
Abstract:
A method, apparatus, system, and signal-bearing medium that, in an embodiment, detect an event that will cause idle cycles in the processor and issue diagnostic instructions to the processor during the cycles that would otherwise be idle. In another embodiment, the processor is periodically interrupted and diagnostic instructions are issued to the processor, where the diagnostic instructions are selected based on a history of activity at the processor and a log of previous errors at the processor. In this way, errors may be detected at the processor without undue cost and impact on performance.
Abstract:
A register file bit includes a primary latch and a secondary latch with a feedback path and a context switch mechanism that allows a fast context switch when execution changes from one thread to the next. A bit value for a second thread of execution is stored in the primary latch, then transferred to the secondary latch. The bit value for a first thread of execution is then written to the primary latch. When a context switch is needed (when the first thread stalls and the second thread needs to begin execution), the register file bit can perform a context switch from the first thread to the second thread in a single clock cycle. The register file bit contains a backup latch inside the register file itself so that minimal extra wire paths are needed to or from the existing register file.
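The latch choreography described above can be modeled for one bit; the class and method names are illustrative, not from the abstract.

```python
class RegisterFileBit:
    """Toy model of one register file bit with a primary latch (the value
    the executing thread sees) and a secondary backup latch."""
    def __init__(self):
        self.primary = 0
        self.secondary = 0

    def load_thread2(self, value):
        # The second thread's value is written to the primary latch first...
        self.primary = value
        # ...then transferred to the secondary latch.
        self.secondary = self.primary

    def load_thread1(self, value):
        # The first thread's value then takes over the primary latch.
        self.primary = value

    def context_switch(self):
        # Swap primary and secondary in one step, standing in for the
        # single-cycle switch the feedback path provides in hardware.
        self.primary, self.secondary = self.secondary, self.primary

bit = RegisterFileBit()
bit.load_thread2(1)   # thread 2's bit ends up backed up in the secondary
bit.load_thread1(0)   # thread 1 executes with its bit in the primary
bit.context_switch()  # thread 1 stalls; thread 2 resumes immediately
```

Because the backup latch sits inside the bit cell itself, the swap needs no traffic to or from memory, which is the abstract's point about minimal extra wiring.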
Abstract:
Methods and apparatus are disclosed that provide for improved addressing of a register file in a computer system. The register file has one or more redundant words. A logical address in an instruction is mapped, during a predecode operation, to a physical address having a larger address space than the logical address. Addresses of nonfaulty words are mapped to the same word in the larger address space as the logical address. Logical addresses that point to faulty words are mapped to a redundant word that is in the larger address space but not in the address space of the logical address. Because all addresses presented to a register file decoder at access time point to nonfaulty words, no delay penalty for an address compare is incurred during the access.
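The remapping is simple to sketch; the register count, number of redundant words, and fault map below are assumptions for illustration.

```python
LOGICAL_WORDS = 32            # architected register addresses 0..31
# faulty logical word -> redundant word in the larger physical space
faulty = {5: 32, 17: 33}

def predecode(logical_addr):
    """Map a logical register address into the larger physical space.
    Non-faulty words keep their address; faulty words are steered to a
    redundant word, so the access-time decoder sees only good words and
    never needs a fault-address compare on the critical path."""
    return faulty.get(logical_addr, logical_addr)
```

The key design point is that the compare against the fault list happens once, at predecode, off the timing-critical register file access path.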
Abstract:
Compressed memory systems and methods that reduce problems of memory overflow and data loss. A compression engine compresses blocks of data for storage in a compressed memory. A compression monitor monitors the achieved compression ratio and provides a software trap when the achieved compression ratio falls below a minimum. After the trap is provided, software monitors the fill state of the compressed memory. If the compressed memory is approaching full, the software changes the block size to improve the compression ratio.
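The monitoring policy can be condensed into one decision function; the thresholds and the doubling heuristic are assumptions, not values from the abstract.

```python
MIN_RATIO = 2.0      # assumed minimum uncompressed:compressed ratio
NEARLY_FULL = 0.9    # assumed fill fraction that forces a block-size change

def on_compression_sample(ratio, fill_fraction, block_size):
    """One monitoring sample: raise a software trap when the achieved
    compression ratio drops below the minimum; if the compressed memory
    is also approaching full, grow the block size (larger blocks tend to
    compress better, improving the ratio). Returns (trap, block_size)."""
    trap = ratio < MIN_RATIO
    if trap and fill_fraction >= NEARLY_FULL:
        block_size *= 2
    return trap, block_size
```

Splitting the check this way matches the abstract's two stages: the hardware monitor only raises the trap; the software decides, from the fill state, whether changing the block size is actually needed.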
Abstract:
Methods and systems for repairing ports are disclosed. Embodiments may detect a hard failure of a port, select an alternative port from the existing ports in use within an array, and share the alternative port to route operands bound for both the failed port and the alternative port, transmitting operands associated with the failed port to the corresponding destination unit. Embodiments include an additional wire, or alternative port path, that couples the alternative port to the destination unit associated with the failed port. For instance, in a multi-pipeline processor, an operand of an instruction that is bound for the failed read port may be routed via an alternative read port to the corresponding execution unit. Similarly, data bound for failed write ports may be, e.g., written back to a register file by routing the data via an alternative write port of the register file.
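The steering logic reduces to a small scheduling rule; the port names and the one-request-per-step serialization are illustrative assumptions.

```python
def route_reads(requests, failed_port, alt_port):
    """requests: list of (port, operand) pairs. Returns a schedule of
    (port_used, operand) pairs: the alternative port serves both its own
    traffic and the failed port's traffic in turn, and the extra wire
    (modeled implicitly) carries rerouted operands to the failed port's
    original destination unit."""
    schedule = []
    for port, operand in requests:
        if port == failed_port:
            schedule.append((alt_port, operand))  # reroute via the alt port
        else:
            schedule.append((port, operand))
    return schedule

reqs = [("rd0", "opA"), ("rd1", "opB")]
```

Sharing an existing port trades some bandwidth (the alternative port is now time-multiplexed) for avoiding a spare port per destination, which is the repair strategy the abstract describes.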
Abstract:
A computer system includes a main memory, at least one processor, and at least one level of cache. The system contains at least one segment table having multiple segment entries recording the assignment of segments in an address space. At least some segment table entries include pre-fetch data indicating which portions of a segment should be pre-fetched. Preferably, a segment table entry contains a list of pages for pre-fetching. Preferably, pre-fetching a listed page causes address translation data for the page to be cached in at least one address translation caching structure. Pre-fetching may also cause data contents of the page to be loaded into a cache.
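The segment-entry-driven pre-fetch can be sketched as below; the table layout, the fake translation, and the page list are assumptions for illustration.

```python
# Each segment table entry carries a list of pages to pre-fetch.
segment_table = {
    0x10: {"base": 0x100000, "prefetch_pages": [0, 1, 7]},
}

def on_segment_access(seg_id, tlb, data_cache, load_data=False):
    """When a segment is touched, walk its pre-fetch list: cache an
    address translation for each listed page, and optionally pull the
    page's data contents into the data cache as well."""
    entry = segment_table[seg_id]
    for page in entry["prefetch_pages"]:
        vpn = entry["base"] + page
        tlb[vpn] = vpn ^ 0xABC0   # stand-in for the real translation
        if load_data:
            data_cache.add(vpn)   # optional: warm the data cache too

tlb, dcache = {}, set()
on_segment_access(0x10, tlb, dcache, load_data=True)
```

After the call, translations for all three listed pages are resident, so the first real access to any of them avoids a translation miss, which is the latency the pre-fetch list is there to hide.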