-
Publication number: US20240354057A1
Publication date: 2024-10-24
Application number: US18523186
Filing date: 2023-11-29
Applicant: Intel Corporation
Inventor: Jongwook Sohn, David Dean, Eric Quintana, Wing Shek Wong
IPC: G06F7/523
CPC classification number: G06F7/523
Abstract: Techniques and mechanisms for circuitry to support the performance of a fused multiply-add (FMA) operation with one or more denormal numbers. In some embodiments, a processor is operable to execute an FMA instruction comprising or otherwise identifying two multiplicands and an addend. Such execution includes performing one-way alignment of an addend significand based on a difference between respective exponent values of the two multiplicands. The alignment is performed in parallel with operations by a multiplier circuit based on respective significand values of the two multiplicands. Subtraction of a J-bit correction value is performed in the multiplier circuit to mitigate execution delay. In another embodiment, first circuitry of a processor executes an FMA instruction, wherein components of the first circuitry are shared with second circuitry of the processor, and wherein the second circuitry supports the execution of a floating-point multiplication instruction.
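The abstract does not describe FMA semantics themselves, so the following is only a minimal sketch of what "fused" means numerically: the product and sum are formed exactly and rounded once. It does not model the patented alignment or J-bit correction circuitry. The helper name `fused_multiply_add` is hypothetical, and exact rational arithmetic stands in for the hardware datapath.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round once to the nearest double.

    Hypothetical software model; real hardware uses a wide adder instead
    of rational arithmetic.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # single rounding step


# Inputs chosen so the unfused product rounds away the interesting bits.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = a * b + c                    # product is rounded before the add
fused = fused_multiply_add(a, b, c)    # rounded only once, at the end
```

Here the exact product is 1 - 2^-60, which a separate multiply rounds to 1.0, so the unfused result loses the residual that the fused form keeps.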
-
Publication number: US12039329B2
Publication date: 2024-07-16
Application number: US17134108
Filing date: 2020-12-24
Applicant: Intel Corporation
Inventor: Wing Shek Wong, Vikash Agarwal, Charles Vitu, Mihir Shah
CPC classification number: G06F9/223, G06F9/30145
Abstract: Systems, methods, and apparatuses relating to circuitry to implement dynamic two-pass execution of a partial flag updating instruction in a processor are described. In one embodiment, a hardware processor core includes a decoder circuit to decode instructions into a set of one or more micro-operations, an execution circuit to execute the micro-operations decoded for the instructions, a data register to store data, a flag register to store a plurality of flags, and a reservation station circuit coupled between the decoder circuit and the execution circuit, the reservation station circuit to, in response to an indicator bit set to a multiple pass mode for a single micro-operation in a reservation station entry, perform a first dispatch of the single micro-operation to the execution circuit, when a source data operand in the data register is ready for execution and a source flag operand in the flag register is not ready for execution, to generate a data resultant, and a second dispatch of the single micro-operation to the execution circuit when both the source data operand in the data register and the source flag operand in the flag register are ready for execution to generate a flag resultant based on one or more of the plurality of flags in the flag register.
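The dispatch rule in this abstract can be sketched as a small decision function. This is an illustrative model only: the function name `select_dispatch`, the `data_done` bookkeeping bit, and the behavior when both operands are ready before any dispatch (producing both results in one pass) are assumptions, not details taken from the patent.

```python
def select_dispatch(multi_pass: bool, data_ready: bool,
                    flags_ready: bool, data_done: bool) -> str:
    """Decide what a reservation station entry dispatches this cycle.

    multi_pass  -- indicator bit set to multiple pass mode for the micro-op
    data_ready  -- source data operand is ready
    flags_ready -- source flag operand is ready
    data_done   -- assumed bit: the first (data) pass already happened
    """
    if data_ready and flags_ready:
        # Second pass yields the flag result; if no first pass occurred,
        # this sketch assumes a single pass yields both results.
        return "flag" if data_done else "data+flag"
    if multi_pass and data_ready and not data_done:
        # First pass: produce the data result without waiting on flags.
        return "data"
    return "wait"
```

For example, `select_dispatch(True, True, False, False)` selects the early data pass, and a later call with `flags_ready=True` and `data_done=True` selects the flag pass.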
-
Publication number: US20220206791A1
Publication date: 2022-06-30
Application number: US17134100
Filing date: 2020-12-24
Applicant: Intel Corporation
Inventor: Wing Shek Wong, Kameswar Subramaniam, Eric Quintana
IPC: G06F9/22
Abstract: Systems, methods, and apparatuses relating to circuitry to implement a cross-lane packed data instruction on a partial (e.g., half) width processor with a minimal number of micro-operations are described. In one embodiment, a hardware processor core includes a decoder circuit to decode a single packed data instruction into only a first micro-operation and a second micro-operation, a packed data execution circuit to execute the first micro-operation and the second micro-operation, and a reservation station circuit coupled between the decoder circuit and the packed data execution circuit, the reservation station circuit comprising a first reservation station entry for the first micro-operation to store a first set of fields that indicate three or more input sources and a first destination, and a second reservation station entry for the second micro-operation to store a second set of fields to indicate three or more input sources and a second destination.
-
Publication number: US09851976B2
Publication date: 2017-12-26
Application number: US14581101
Filing date: 2014-12-23
Applicant: Intel Corporation
Inventor: Wing Shek Wong, James E. Phillips
IPC: G06F9/38
CPC classification number: G06F9/3838
Abstract: A processor includes a core and a scheduler. The scheduler includes first and second dependency matrices and a ready determination unit. The scheduler also includes logic to queue a first parent operation, a second parent operation, and a child operation that includes first and second sources dependent on the first and second parent operations. The scheduler also includes logic to store physical addresses of the first and second sources of the child operation respectively in the first and second dependency matrices. Further, the scheduler includes logic to perform tag comparisons between the physical addresses of the destinations of the first and second parent operations and the respective physical addresses of the first and second sources of the child operation. In addition, the ready determination unit includes logic to determine that the child operation is ready for dispatch based on the tag comparisons.
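A minimal software sketch of the tag-comparison wakeup described in this abstract follows. The class and method names are hypothetical, each dependency matrix is reduced to a single stored tag per child source, and a parent's completion is modeled as a broadcast of its destination tag.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChildEntry:
    """Scheduler entry for a child operation with two source operands."""
    # Physical addresses (tags) of the two sources; the first is held in
    # the first dependency matrix and the second in the second one.
    src_tags: Tuple[int, int]
    matched: List[bool] = field(default_factory=lambda: [False, False])

    def wakeup(self, parent_dest_tag: int) -> None:
        """Compare a parent's destination tag against each source tag."""
        for i, tag in enumerate(self.src_tags):
            if tag == parent_dest_tag:
                self.matched[i] = True

    def ready(self) -> bool:
        """Ready for dispatch once both comparisons have matched."""
        return all(self.matched)


child = ChildEntry(src_tags=(5, 9))
child.wakeup(5)                  # first parent writes physical register 5
ready_after_first = child.ready()
child.wakeup(9)                  # second parent writes physical register 9
ready_after_second = child.ready()
```

The child becomes dispatchable only after both parents have broadcast matching destination tags.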
-
Publication number: US20240211253A1
Publication date: 2024-06-27
Application number: US18145744
Filing date: 2022-12-22
Applicant: Intel Corporation
Inventor: Santosh Ghosh, Christoph Dobraunig, Manoj Sastry, Andrew H. Reinders, Regev Shemy, Qian Wang, Rotem Ohana Peretz, Wing Shek Wong, Wajdi Feghali
CPC classification number: G06F9/30029, G06F9/3016, G06F9/3802
Abstract: A method comprises fetching, by fetch circuitry, an encoded parity instruction comprising at least one opcode, a first source identifier for a first source, a second source identifier for a second source, a third source identifier for a third source, and a destination identifier for a destination; decoding, by decode circuitry, the encoded parity instruction to generate a decoded parity instruction; and executing, by execution circuitry, the decoded parity instruction to retrieve operands representing a first register from the first source, a second register from the second source, a third register from the third source, and an index from the third source, perform an XOR operation of four words of data from the first register and a single word of data from the second register in a position represented by the index to generate a parity value, and store the parity value in the first register in a position represented by the index.
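The abstract leaves the exact operand layout open, so the sketch below shows only one possible reading: the first register holds at least four words, the four words XORed are its first four, and registers are modeled as Python lists of integers. The function name `parity_step` and all of these layout choices are assumptions for illustration.

```python
from functools import reduce
from operator import xor
from typing import List


def parity_step(first: List[int], second: List[int], index: int) -> List[int]:
    """XOR four words of `first` with the word of `second` at `index`.

    Returns a copy of `first` with the parity value stored at `index`.
    The choice of which four words take part is an assumed layout.
    """
    parity = reduce(xor, first[:4]) ^ second[index]
    result = list(first)
    result[index] = parity
    return result


updated = parity_step([1, 2, 4, 8], [0, 16, 0, 0], index=1)
```

With these inputs the four words XOR to 15, and combining that with the selected word 16 gives 31, which replaces position 1.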
-
Publication number: US09330022B2
Publication date: 2016-05-03
Application number: US13926564
Filing date: 2013-06-25
Applicant: Intel Corporation
Inventor: James E Phillips, Wing Shek Wong, Charles Vitu
CPC classification number: G06F12/1036, G06F9/3001, G06F9/32
Abstract: In an embodiment, a processor includes a plurality of cores. Each core includes conversion power logic to receive an instruction including an untranslated memory address, determine whether a code segment (CS) base address is equal to zero, and in response to a determination that the CS base address is equal to zero, execute the instruction using the untranslated memory address. Other embodiments are described and claimed.
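The zero-base shortcut in this abstract can be sketched as follows. The abstract does not say what the conversion consists of; this sketch assumes it is the addition of the CS base to the untranslated address, truncated to 64 bits. The function name and the returned `converted` indicator are hypothetical.

```python
from typing import Tuple

ADDRESS_MASK = (1 << 64) - 1  # assumed 64-bit address width


def resolve_address(cs_base: int, untranslated: int) -> Tuple[int, bool]:
    """Return (address to use, whether the conversion logic was exercised)."""
    if cs_base == 0:
        # Base is zero: use the untranslated address directly and skip
        # the conversion step (the assumed source of the power saving).
        return untranslated, False
    return (cs_base + untranslated) & ADDRESS_MASK, True


direct = resolve_address(0, 0x1000)
converted = resolve_address(0x2000, 0x1000)
```

In flat-memory configurations the base is typically zero, so the first path would be taken for most instructions under this assumption.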