摘要:
A method for optimizing a register file hierarchy in a multithreaded processor. The method includes providing a register file hierarchy with a plurality of register file cells, associating the plurality of register file cells with respective threads when the processor is operating in a multithreaded mode and flattening the plurality of register file cells with a single thread when the processor is operating in a single threaded mode. The register file cells correspond to threads of the multithreaded processor.
摘要:
A processor that protects an execution pipeline includes a residue-based error detection infrastructure including a first logic for computing a first residue of a result of an executed instruction instance, and a second logic for computing a second residue of the result. The second logic applies arithmetic operations of the executed instruction instance to residues of operands of the instruction instance. The execution pipeline includes registers and one or more arithmetic execution units. A method of protecting an execution pipeline includes performing one or more operations of an instruction instance on residues of operands of the instruction instance, computing a first residue of a result of the operations on the operand residues, computing a second residue from a result of executing the instruction instance, and checking the first residue against the second residue to determine whether errors were introduced while the instruction instance was resident in the execution pipeline.
摘要:
A system for handling a plurality of single precision floating point instructions and a plurality of double precision floating point instructions that both index a same set of registers is provided. The system comprises a decode unit arranged to decode, stall, and forward at least one of the plurality of single precision and at least one of the plurality of double precision floating point instructions in a fetch group. The decode unit includes a first counter arranged to increment for each of the plurality of single precision floating point instructions forwarded down a pipeline; a second counter arranged to increment for each of the plurality of double precision floating point instructions forwarded down the pipeline; a first mask register and a second mask register. The first mask register is updated by each of the single precision floating point instructions forwarded and the second mask register is updated by each of the double precision floating point instructions forwarded.
摘要:
A method and apparatus to determine readiness of a complex instruction for retirement includes decoding a complex instruction into a plurality of helper instructions; executing the plurality of helper instructions using an execution unit; indicating the plurality of helper instructions that are alive using a live instruction register; and maintaining a complex instruction identification for the complex instruction using a complex instruction identification register.
摘要:
A method for protecting reliability of data associated with a data array is provided. The method initiates with defining state information associated with the data array. Then, crucial state information is identified from the state information. Next, a copy of the crucial state information is generated. Then, the state information and the copy of the crucial state information are protected. Next, a worst case state associated with non-crucial information is defined. In response to detecting an error associated with the non-crucial information, the method includes defaulting to the worst case state. A computer readable media and a shared memory multiprocessor chip are also provided.
摘要:
An apparatus including a cache subsystem arrangement for efficient management of input/output operations and of memory shared by processors in a multiprocessor system. The apparatus includes a central processing unit, an input/output device such as a network device or a display device for example, and the cache arrangement, which includes a coalescing buffer coupled with the data processing unit for receiving non-cacheable data from the processing unit. The non-cacheable data is combined in the coalescing buffer into non-cacheable data blocks. A system bus is coupled with the buffer and the input/output device for storing the non-cacheable data blocks to the input/output device. By combining the non-cacheable data before storage to the input/output device, the coalescing buffer provides higher performance in the multiprocessor system, since fewer bus transactions are issued for serial store operations and more stores can complete in a given amount of time than if they were issued singly on the bus. This is particularly advantageous in the multiprocessing system since multiple processors must compete for limited bus transaction bandwidth.
摘要:
Embodiments of an invention for consecutive bit error detection and correction are disclosed. In one embodiment, an apparatus includes a storage structure, a second storage structure, a parity checker, an error correction code (ECC) checker, and an error corrector. The first storage structure is to store a plurality of data values, a plurality of parity values, and a plurality of ECC values, each parity value corresponding to one of the plurality of data values, a first bit of each parity value corresponding to a first of a plurality of portions of a corresponding data value, wherein the first of the plurality of portions of the corresponding data value is interleaved with a second of the plurality of portions of the corresponding data value, wherein a second bit of each parity value corresponds to a second of the plurality of portions of the corresponding data value, each ECC value corresponding to one of the plurality of data values. The parity checker is to detect a parity error in a data value stored in the first storage structure using a parity value corresponding to the data value. The ECC checker is to generate a syndrome. The error corrector is to detect and correct consecutive bit errors in the data value using the syndrome.
摘要:
The present application describes a method and a processor for handling register dependency conflicts between lesser and greater width instructions, colloquially referred to as “evil twins.” If there is a register dependency between a greater width producer instruction and a lesser width consumer instruction, a greater width source register is substituted for the source register specified by the lesser width producer. If there is a register dependency between a lesser width producer instruction and a greater width producer instruction, the greater width consumer instruction is replaced by multiple helper instructions. One or more of the helper instructions merge lesser width registers aliased onto the source registers specified by the greater width consumer instruction, into temporary registers. Another helper instruction executes the greater width consumer instruction using the temporary registers instead of the original source registers.
摘要:
The present application describes a method and a system for executing instructions while reducing the logic required for execution in a processor. Instructions (e.g., atomic, integer-multiply, integer-divide, move on integer registers, graphics, floating point calculations or the like) are expanded into helper instructions before execution (e.g., in the integer, floating point, graphics and memory units or the like). Such instructions are treated as complex instructions. The functionality of a complex instruction is shared among multiple helpers so that by executing the helpers representing the complex instruction, the functionality of complex instruction is achieved. The expansion of complex instructions into helper instructions reduces the amount of hardware and complexity involved in supporting these individual complex instructions in various units in the processor.
摘要:
An uncorrectable error is detected in the data of a computer system. The erroneous data is allowed to be stored in first and second caches of the computer system while the system runs first and second processes, the first process being associated with the data. The first process is terminated when an attempt is made to load the data from the cache. Meanwhile, the second process continues to run.