Abstract:
A processor including instruction support for implementing hash algorithms may issue, for execution, programmer-selectable hash instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include hash instructions defined within the ISA. In addition, the hash instructions may be executable by the cryptographic unit to implement a hash that is compliant with one or more respective hash algorithm specifications. In response to receiving a particular hash instruction defined within the ISA, the cryptographic unit may retrieve a set of input data blocks from a predetermined set of architectural registers of the processor, and generate a hash value of the set of input data blocks according to a hash algorithm that corresponds to the particular hash instruction.
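The behavior described above can be modeled in software. The following is a minimal sketch, not the patented design: the opcode names (`HASH_SHA1`, `HASH_SHA256`), the choice of algorithms, the register range, and the 8-byte register width are all hypothetical, since the abstract names none of them. For simplicity it computes a complete padded digest with Python's `hashlib`, whereas hardware might process one block per instruction.

```python
import hashlib

# Hypothetical mapping from hash opcodes to algorithms; the abstract does not
# name specific opcodes or algorithms.
HASH_OPCODES = {"HASH_SHA1": "sha1", "HASH_SHA256": "sha256"}

# Hypothetical "predetermined" architectural registers holding the input blocks.
INPUT_REGS = range(0, 8)


def execute_hash(opcode, regfile):
    """Model a cryptographic unit executing one hash instruction."""
    algorithm = HASH_OPCODES[opcode]
    # Gather the input data from the fixed register set (8 bytes per register).
    data = b"".join(regfile[r].to_bytes(8, "big") for r in INPUT_REGS)
    # Simplification: hash the gathered bytes as one complete message.
    return hashlib.new(algorithm, data).digest()
```

Note that the instruction carries no register operands in this model: the opcode alone selects the algorithm, and the input location is fixed.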
Abstract:
An apparatus and method for floating-point special case handling. In one embodiment, a processor may include a first execution unit configured to execute a longer-latency floating-point instruction, and a second execution unit configured to execute a shorter-latency floating-point instruction. In response to the longer-latency floating-point instruction being issued to the first execution unit, the second execution unit may be further configured to detect whether a result of the longer-latency floating-point instruction is determinable from one or more operands of the longer-latency floating-point instruction independently of the first execution unit executing the longer-latency floating-point instruction. In response to detecting that the result is determinable, the second execution unit may be further configured to flush the longer-latency floating-point instruction from the first execution unit and to determine the result.
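To illustrate the idea of a result that is determinable from the operands alone, here is a minimal sketch using division as the longer-latency operation. The choice of division, the particular IEEE 754 special cases checked, and the names `early_result` and `issue` are assumptions made for illustration; the abstract does not list which cases are detected.

```python
import math


def early_result(op, a, b):
    """Return (True, result) if the result follows from the operands alone."""
    if op != "fdiv":
        return (False, None)
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    if math.isnan(a) or math.isnan(b):
        return (True, math.nan)
    if (math.isinf(a) and math.isinf(b)) or (a == 0 and b == 0):
        return (True, math.nan)                      # invalid operation
    if math.isinf(a) or b == 0:
        return (True, math.copysign(math.inf, sign))  # signed infinity
    if math.isinf(b) or a == 0:
        return (True, math.copysign(0.0, sign))       # signed zero
    return (False, None)


def issue(op, a, b):
    """Model issuing a long-latency divide alongside the short-latency check."""
    determinable, result = early_result(op, a, b)
    if determinable:
        # The shorter-latency unit supplies the result; the divide is flushed.
        return (result, "flushed")
    return (a / b, "executed")
```

In this model, `issue("fdiv", 1.0, math.inf)` is resolved by the check and flushed, while `issue("fdiv", 6.0, 3.0)` runs to completion in the divider.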
Abstract:
In one embodiment, a processor is configured to execute a window swap instruction. The processor comprises a register file (that comprises a plurality of registers) and first and second execution units coupled to the register file. A first pipeline associated with the first execution unit has a first number of pipeline stages, and a second pipeline associated with the second execution unit has a second number of pipeline stages. The first execution unit is configured to change the current register window from the first register window to the second register window in the register file in response to the instruction. The second execution unit is configured to perform an operation defined by the instruction and write the result to the register file. The second number of pipeline stages exceeds the first number, whereby the second register window is established in the register file prior to writing the result.
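The ordering argument above can be sketched as a small event model. This is an illustration under assumed values: the stage counts (3 and 5), the function name `window_swap`, and the list-of-lists register file are hypothetical and not taken from the abstract.

```python
def window_swap(regfile, cwp, new_cwp, dest, compute,
                short_stages=3, long_stages=5):
    """Model a window swap instruction split across two pipelines.

    regfile is a list of windows, each a list of registers; cwp is the
    current window pointer. Returns the window pointer after completion.
    """
    # Each pipeline finishes its part after its own number of stages.
    events = [(short_stages, "swap"), (long_stages, "write")]
    for _, event in sorted(events):
        if event == "swap":
            cwp = new_cwp                  # first unit changes the window
        else:
            regfile[cwp][dest] = compute() # second unit writes the result
    return cwp
```

Because the write completes after the swap whenever `long_stages` exceeds `short_stages`, the result lands in the new window without extra synchronization.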
Abstract:
A method of enabling a single instruction stream multiple data stream operation and a double precision floating point operation within a single floating point execution unit which includes providing a floating point unit with a two way aligner and a two way normalizer, selectively aligning a value based upon whether a single instruction stream multiple data stream operation is to be performed or a double precision operation is to be performed, and selectively normalizing a value based upon whether a single instruction stream multiple data stream operation is to be performed or a double precision operation is to be performed.
Abstract:
Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
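The numerical difference between the two modes can be shown in a few lines. This sketch models only the rounding behavior, not the carry-save datapath: a fused multiply-add rounds once after the exact product and sum, while an unfused one rounds the product first. The function names and the `fused` flag (standing in for the opcode or mode bit) are hypothetical.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Compute a*b + c exactly, then round once to double precision."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def unfused_multiply_add(a, b, c):
    """Round the product to double precision, then round the sum."""
    product = a * b
    return product + c


def fuma(a, b, c, fused):
    """Select fused or unfused behavior, e.g. from an opcode or mode bit."""
    if fused:
        return fused_multiply_add(a, b, c)
    return unfused_multiply_add(a, b, c)
```

For example, with `a = 1 + 2**-30`, `b = 1 - 2**-30`, and `c = -1.0`, the exact product is `1 - 2**-60`. The unfused path rounds that product to `1.0` and returns `0.0`, while the fused path returns `-2**-60`.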
Abstract:
Techniques are disclosed relating to integrated circuits that include hardware support for divide and/or square root operations. In one embodiment, an integrated circuit is disclosed that includes a division unit that, in turn, includes a normalization circuit and a plurality of divide engines. The normalization circuit is configured to normalize a set of operands. Each divide engine is configured to operate on a respective normalized set of operands received from the normalization circuit. In some embodiments, the integrated circuit includes a scheduler unit configured to select instructions for issuance to a plurality of execution units including the division unit. The scheduler unit is further configured to maintain a counter indicative of a number of instructions currently being operated on by the division unit, and to determine, based on the counter, whether to schedule subsequent instructions for issuance to the division unit.
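The counter-based scheduling decision can be sketched as follows. The policy shown (hold back a divide when the counter equals the number of divide engines) is an assumption; the abstract says only that the decision is based on the counter. The class and method names are hypothetical.

```python
class DivideScheduler:
    """Track in-flight divide instructions and gate issue on the count."""

    def __init__(self, num_engines):
        self.num_engines = num_engines  # divide engines in the division unit
        self.in_flight = 0              # instructions currently in the unit

    def try_issue(self):
        """Issue a divide if an engine is free; return whether it issued."""
        if self.in_flight >= self.num_engines:
            return False                # assumed policy: unit is full, hold
        self.in_flight += 1
        return True

    def complete(self):
        """Record that the division unit finished one instruction."""
        if self.in_flight == 0:
            raise RuntimeError("no divide instruction in flight")
        self.in_flight -= 1
```

With two engines, two consecutive `try_issue()` calls succeed, a third returns `False`, and a `complete()` call frees a slot for the next divide.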
Abstract:
A processor includes an instruction fetch unit configured to issue instructions for execution, where the instructions are selected from a number of threads, where each given instruction has a corresponding thread identifier, and where at least some of the instructions specify operand(s) via register identifiers. A register file stores operands usable by the instructions, and may include several banks, each corresponding to a register identifier and including several entries corresponding to the several threads, wherein the entries are configured to store data values. In response to receiving a request to read a particular register identifier for a given thread identifier, the register file may be configured to decode the given thread identifier to retrieve entries from the banks that correspond to the given thread identifier. The register file may further select, from among the retrieved entries, a data value corresponding to the particular register identifier to be output.
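The two-step read described above (decode the thread identifier first, then select by register identifier) can be sketched with one bank per register identifier and one entry per thread. The class name, method names, and sizes are hypothetical.

```python
class BankedRegisterFile:
    """Register file with one bank per register id and one entry per thread."""

    def __init__(self, num_regs, num_threads):
        # banks[reg_id][thread_id] holds that thread's copy of the register.
        self.banks = [[0] * num_threads for _ in range(num_regs)]

    def write(self, thread_id, reg_id, value):
        self.banks[reg_id][thread_id] = value

    def read(self, thread_id, reg_id):
        # Step 1: decode the thread id to retrieve that thread's entry
        # from every bank.
        retrieved = [bank[thread_id] for bank in self.banks]
        # Step 2: select, from the retrieved entries, the one matching
        # the requested register id.
        return retrieved[reg_id]
```

A software model would normally index `banks[reg_id][thread_id]` directly; the read is split into two steps here only to mirror the order of operations in the abstract.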
Abstract:
Described is an execution unit for performing at least part of the Data Encryption Standard that includes a Left Half input; a Key input; and a Table input, as well as a first group of transistors configured to receive the Table input, perform a table look-up, and output data. The execution unit further includes a first exclusive-or operator having two inputs and an output that is configured to receive the Left Half input and the Key input. The execution unit also includes a second exclusive-or operator having two inputs and an output that is configured to receive the data output by the first group of transistors and to receive the output of the first exclusive-or operator. The execution unit also includes a third exclusive-or operator having two inputs and an output that is configured to receive the Left Half input and the data output by the first group of transistors.
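The dataflow among the table look-up and the three exclusive-or operators can be sketched as below. The table contents are a placeholder and not a real DES S-box, the 4-bit index is arbitrary, and returning the outputs of the second and third operators is an assumption; the abstract does not say which values leave the unit.

```python
# Placeholder 16-entry table; NOT a real DES S-box.
PLACEHOLDER_TABLE = [(i * 7 + 3) & 0xF for i in range(16)]


def des_partial_step(left_half, key, table_input):
    """Model the look-up and three XOR operators described in the abstract."""
    table_out = PLACEHOLDER_TABLE[table_input & 0xF]  # table look-up
    xor1 = left_half ^ key        # first XOR: Left Half input and Key input
    xor2 = table_out ^ xor1       # second XOR: table data and first XOR output
    xor3 = left_half ^ table_out  # third XOR: Left Half input and table data
    return xor2, xor3
```

One consequence of this wiring is that the two returned values differ exactly by the key, since `xor2 ^ xor3 == key` for any inputs.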
Abstract:
In one embodiment, a processor comprises a first register file configured to store speculative register state, a second register file configured to store committed register state, a check circuit and a control unit. The first register file is protected by a first error protection scheme and the second register file is protected by a second error protection scheme. The check circuit is coupled to receive a value and corresponding one or more check bits read from the first register file to be committed to the second register file in response to the processor selecting a first instruction to be committed. The check circuit is configured to detect an error in the value responsive to the value and the check bits. The control unit, coupled to the check circuit, is configured to cause reexecution of the first instruction responsive to the error detected by the check circuit.
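The check-at-commit flow can be sketched as follows, assuming a single parity bit as the first error protection scheme. The abstract does not specify either scheme, so parity is an assumption, the second register file's protection is not modeled, and the names `SpecCommitModel`, `execute`, and `commit` are hypothetical.

```python
def parity(value):
    """Return the even-parity check bit of a non-negative integer."""
    return bin(value).count("1") & 1


class SpecCommitModel:
    """Speculative and committed register files with a check at commit."""

    def __init__(self):
        self.speculative = {}  # reg -> (value, check bit)
        self.committed = {}    # reg -> value

    def execute(self, reg, compute):
        """Run an instruction and store its result with a check bit."""
        value = compute()
        self.speculative[reg] = (value, parity(value))

    def commit(self, reg, compute):
        """Check the speculative value; re-execute the instruction on error."""
        value, check = self.speculative[reg]
        if parity(value) != check:
            self.execute(reg, compute)        # control unit forces re-execution
            value, check = self.speculative[reg]
        self.committed[reg] = value
        return value
```

If a bit flips in the speculative copy between execution and commit, the parity mismatch triggers re-execution, so the committed register file still receives the correct value.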
Abstract:
A processor including instruction support for large-operand instructions that use multiple register windows may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may also include an instruction execution unit that, during operation, receives instructions for execution from the instruction fetch unit and executes a large-operand instruction defined within the ISA, where execution of the large-operand instruction is dependent upon a plurality of registers arranged within a plurality of register windows. The processor may further include control circuitry (which may be included within the fetch unit, the execution unit, or elsewhere within the processor) that determines whether one or more of the register windows depended upon by the large-operand instruction are not present. In response to determining that one or more of these register windows are not present, the control circuitry causes them to be restored.
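The restore-before-execute behavior can be sketched as follows. The dictionaries standing in for resident and spilled windows, and the names `WindowedRegisterFile`, `ensure_present`, and `execute_large_operand`, are hypothetical; the abstract does not say where absent windows are kept or how they are restored.

```python
class WindowedRegisterFile:
    """Register windows that may be resident or spilled (e.g. to memory)."""

    def __init__(self):
        self.resident = {}  # window index -> registers present in hardware
        self.spilled = {}   # window index -> registers saved elsewhere

    def ensure_present(self, windows):
        """Restore any needed window that is not present; return those restored."""
        restored = []
        for w in windows:
            if w not in self.resident:
                self.resident[w] = self.spilled.pop(w)
                restored.append(w)
        return restored

    def execute_large_operand(self, windows):
        """Gather one large operand spanning the registers of several windows."""
        self.ensure_present(windows)   # control circuitry restores first
        return [reg for w in windows for reg in self.resident[w]]
```

In this model the large-operand instruction never observes a missing window: any window it depends on is restored before its registers are read.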