摘要:
A multithreaded microprocessor includes an instruction fetch unit including a perceptron-based conditional branch prediction unit configured to provide, for each of one or more concurrently executing threads, a direction branch prediction. The conditional branch prediction unit includes a plurality of storages each including a plurality of entries. Each entry may be configured to store one or more prediction values. Each prediction value of a given storage may correspond to at least one conditional branch instruction in a cache line. The conditional branch prediction unit may generate a separate index value for accessing each storage by generating a first index value for accessing a first storage by combining one or more portions of a received instruction fetch address, and generating each other index value for accessing the other storages by combining the first index value with a different portion of direction branch history information.
摘要:
A multithreaded microprocessor includes an instruction fetch unit including a perceptron-based conditional branch prediction unit configured to provide, for each of one or more concurrently executing threads, a direction branch prediction. The conditional branch prediction unit includes a plurality of storages each including a plurality of entries. Each entry may be configured to store one or more prediction values. Each prediction value of a given storage may correspond to at least one conditional branch instruction in a cache line. The conditional branch prediction unit may generate a separate index value for accessing each storage by generating a first index value for accessing a first storage by combining one or more portions of a received instruction fetch address, and generating each other index value for accessing the other storages by combining the first index value with a different portion of direction branch history information.
摘要:
A processor including instruction support for implementing the Data Encryption Standard (DES) block cipher algorithm may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include one or more DES instructions defined within the ISA. In addition, the DES instructions may be executable by the cryptographic unit to implement portions of an DES cipher that is compliant with Federal Information Processing Standards Publication 46-3 (FIPS 46-3). In response to receiving a DES key expansion instruction defined within the ISA, the cryptographic unit may generate one or more expanded cipher keys of the DES cipher key schedule from an input key.
摘要:
Techniques relating to a processor including instruction support for implementing a cyclic redundancy check (CRC) operation. The processor may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit configured to receive instructions that include a first instance of a cyclic redundancy check (CRC) instruction defined within the ISA, where the first instance of the CRC instruction is executable by the cryptographic unit to perform a first CRC operation on a set of data that produces a checksum value. In one embodiment, the cryptographic unit is configured to generate the checksum value using a generator polynomial of 0x11EDC6F41. In some embodiments, the first instance of the CRC instruction specifies an initial value to be used in performing the first CRC operation, the set of data, and a storage location in which the cryptographic unit is configured to store the checksum value produced by the first CRC operation.
摘要:
A processor may include a hardware instruction fetch unit configured to issue instructions for execution, and a hardware functional unit configured to receive instructions for execution, where the instructions include cryptographic instruction(s) and non-cryptographic instruction(s). The functional unit may include a cryptographic execution pipeline configured to execute the cryptographic instructions with a corresponding cryptographic execution latency, and a non-cryptographic execution pipeline configured to execute the non-cryptographic instructions with a corresponding non-cryptographic execution latency that is longer than the cryptographic execution latency. The functional unit may further include a local bypass network configured to bypass results produced by the cryptographic execution pipeline to dependent cryptographic instructions executing within the cryptographic execution pipeline, such that each instruction within a sequence of dependent cryptographic instructions is executable with the cryptographic execution latency, and where the results of the cryptographic execution pipeline are not bypassed to any other functional unit within the processor.
摘要:
A system and method for servicing translation lookaside buffer (TLB) misses may manage separate input and output pipelines within a memory management unit. A pending request queue (PRQ) in the input pipeline may include an instruction-related portion storing entries for instruction TLB (ITLB) misses and a data-related portion storing entries for potential or actual data TLB (DTLB) misses. A DTLB PRQ entry may be allocated to each load/store instruction selected from the pick queue. The system may select an ITLB- or DTLB-related entry for servicing dependent on prior PRQ entry selection(s). A corresponding entry may be held in a translation table entry return queue (TTERQ) in the output pipeline until a matching address translation is received from system memory. PRQ and/or TTERQ entries may be deallocated when a corresponding TLB miss is serviced. PRQ and/or TTERQ entries associated with a thread may be deallocated in response to a thread flush.
摘要:
A processor including instruction support for implementing the Camellia block cipher algorithm may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include one or more Camellia instructions defined within the ISA. In addition, the Camellia instructions may be executable by the cryptographic unit to implement portions of a Camellia cipher that is compliant with Internet Engineering Task Force (IETF) Request For Comments (RFC) 3713. In response to receiving a Camellia F( )-operation instruction defined within the ISA, the cryptographic unit may perform an F( ) operation, as defined by the Camellia cipher, upon a data input operand and a subkey operand, in which the data input operand and subkey operand may be specified by the Camellia F( )-operation instruction.
摘要翻译:包括用于实现Camellia块密码算法的指令支持的处理器可以从定义的指令集体系结构(ISA)发出用于执行编程器可选择指令的执行。 处理器可以包括可以接收执行指令的密码单元。 说明书包括在ISA内定义的一个或多个Camellia指令。 另外,山茶花指令可以由加密单元执行,以实现符合因特网工程任务组(IETF)请求注释(RFC)3713的Camellia密码的部分。响应于接收到Camellia F()操作 在ISA内定义的指令中,加密单元可以在数据输入操作数和子键操作数上执行由Camellia密码定义的F()操作,其中数据输入操作数和子键操作数可以由Camellia F ()操作指令。
摘要:
A method and mechanism for monitoring events in a processing system. A performance monitoring mechanism includes is configured to store a count of events in an event counter. Periodically, the count stored in the event counter is updated to a new count. If the new count equals a predetermined value, an indication that the count equals the predetermined value is conveyed. If the new count does not equal the predetermined value, but is within a given epsilon of the predetermined value and the occurrence of a corresponding event is detected, an indication that the count equals the predetermined value is conveyed. The mechanism is further configured to suppress event counts which correspond to mis-speculations.
摘要:
An apparatus and method for implementing a unified hash algorithm pipeline. In one embodiment, a cryptographic unit may include hash logic configured to compute a hash value of a data block according to a hash algorithm, where the hash algorithm is dynamically selectable from a plurality of hash algorithms, and where the hash logic comprises a plurality of pipeline stages each configured to compute a portion of the hash algorithm. The cryptographic unit may further include a word buffer configured to store the data block during computing by the hash logic.
摘要:
An apparatus and method for fine-grained multithreading in a multipipelined processor core. According to one embodiment, a processor may include instruction fetch logic configured to assign a given one of a plurality of threads to a corresponding one of a plurality of thread groups, where each of the plurality of thread groups may comprise a subset of the plurality of threads, to issue a first instruction from one of the plurality of threads during one execution cycle, and to issue a second instruction from another one of the plurality of threads during a successive execution cycle. The processor may further include a plurality of execution units, each configured to execute instructions issued from a respective thread group.