Patent search ap:("Intel Corporation") AND inv:"Wing Shek Wong" Page 1

1.

Invention publication
PROCESSOR CIRCUITRY TO PERFORM A FUSED MULTIPLY-ADD Under examination (published)

Publication (announcement) number: US20240354057A1

Publication (announcement) date: 2024-10-24

Application number: US18523186

Application date: 2023-11-29

Applicant: Intel Corporation

Inventor: Jongwook Sohn, David Dean, Eric Quintana, Wing Shek Wong

IPC: G06F7/523

CPC classification number: G06F7/523

Abstract: Techniques and mechanisms for circuitry to support the performance of a fused multiply-add (FMA) operation with one or more denormal numbers. In some embodiments, a processor is operable to execute an FMA instruction comprising or otherwise identifying two multiplicands and an addend. Such execution includes performing one-way alignment of an addend significand based on a difference between respective exponent values of the two multiplicands. The alignment is performed in parallel with operations by a multiplier circuit based on respective significand values of the two multiplicands. Subtraction of a J-bit correction value is performed in the multiplier circuit to avoid or mitigate execution delay. In another embodiment, first circuitry of a processor executes an FMA instruction, wherein components of the first circuitry are shared with second circuitry of the processor, and wherein the second circuitry supports the execution of a floating-point multiplication instruction.
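
As background for the abstract above, the following minimal Python sketch illustrates what "fused" means numerically: the product and sum are formed exactly and rounded once, rather than rounding after the multiply and again after the add. It is a software emulation using exact rationals, not the patented circuit; the function names `fused_multiply_add` and `unfused_multiply_add` and the chosen operands are assumptions made for illustration.

```python
import math
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a*b + c with a single rounding at the very end."""
    # Converting a float to Fraction is exact, so no error is introduced here.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    # The only rounding step: exact rational -> nearest double.
    return float(exact)


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Ordinary a*b + c: rounds after the multiply and again after the add."""
    return a * b + c


# Normal operands: the exact product is 1 - 2**-60, which rounds to 1.0
# when the multiply is rounded on its own.
a, b, c = 1.0 + 2.0**-30, 1.0 - 2.0**-30, -1.0
fused = fused_multiply_add(a, b, c)
unfused = unfused_multiply_add(a, b, c)

# Denormal operands: the smallest positive double times one half.
tiny = math.ldexp(1.0, -1074)
fused_sub = fused_multiply_add(tiny, 0.5, tiny)
unfused_sub = unfused_multiply_add(tiny, 0.5, tiny)

print(fused, unfused)
print(fused_sub, unfused_sub)
```

In both cases the two results differ because the intermediate product loses information when it is rounded separately, which is the behavior a hardware FMA avoids.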

2.

Granted invention patent
Methods, systems, and apparatuses to optimize partial flag updating instructions via dynamic two-pass execution in a processor Active

Publication (announcement) number: US12039329B2

Publication (announcement) date: 2024-07-16

Application number: US17134108

Application date: 2020-12-24

Applicant: Intel Corporation

Inventor: Wing Shek Wong, Vikash Agarwal, Charles Vitu, Mihir Shah

IPC: G06F9/24, G06F9/22, G06F9/30

CPC classification number: G06F9/223, G06F9/30145

Abstract: Systems, methods, and apparatuses relating to circuitry to implement dynamic two-pass execution of a partial flag updating instruction in a processor are described. In one embodiment, a hardware processor core includes a decoder circuit to decode instructions into a set of one or more micro-operations, an execution circuit to execute the micro-operations decoded for the instructions, a data register to store data, a flag register to store a plurality of flags, and a reservation station circuit coupled between the decoder circuit and the execution circuit, the reservation station circuit to, in response to an indicator bit set to a multiple pass mode for a single micro-operation in a reservation station entry, perform a first dispatch of the single micro-operation to the execution circuit, when a source data operand in the data register is ready for execution and a source flag operand in the flag register is not ready for execution, to generate a data resultant, and a second dispatch of the single micro-operation to the execution circuit when both the source data operand in the data register and the source flag operand in the flag register are ready for execution to generate a flag resultant based on one or more of the plurality of flags in the flag register.

3.

Invention application
METHODS, SYSTEMS, AND APPARATUSES TO OPTIMIZE CROSS-LANE PACKED DATA INSTRUCTION IMPLEMENTATION ON A PARTIAL WIDTH PROCESSOR WITH A MINIMAL NUMBER OF MICRO-OPERATIONS Active

Publication (announcement) number: US20220206791A1

Publication (announcement) date: 2022-06-30

Application number: US17134100

Application date: 2020-12-24

Applicant: Intel Corporation

Inventor: Wing Shek Wong, Kameswar Subramaniam, Eric Quintana

IPC: G06F9/22

Abstract: Systems, methods, and apparatuses relating to circuitry to implement a cross-lane packed data instruction on a partial (e.g., half) width processor with a minimal number of micro-operations are described. In one embodiment, a hardware processor core includes a decoder circuit to decode a single packed data instruction into only a first micro-operation and a second micro-operation, a packed data execution circuit to execute the first micro-operation and the second micro-operation, and a reservation station circuit coupled between the decoder circuit and the packed data execution circuit, the reservation station circuit comprising a first reservation station entry for the first micro-operation to store a first set of fields that indicate three or more input sources and a first destination, and a second reservation station entry for the second micro-operation to store a second set of fields to indicate three or more input sources and a second destination.

4.

Granted invention patent
Instruction and logic for a matrix scheduler Active

Publication (announcement) number: US09851976B2

Publication (announcement) date: 2017-12-26

Application number: US14581101

Application date: 2014-12-23

Applicant: Intel Corporation

Inventor: Wing Shek Wong, James E. Phillips

IPC: G06F9/38

CPC classification number: G06F9/3838

Abstract: A processor includes a core and a scheduler. The scheduler includes first and second dependency matrices and a ready determination unit. The scheduler also includes logic to queue a first parent operation, a second parent operation, and a child operation that includes first and second sources dependent on the first and second parent operations. The scheduler also includes logic to store physical addresses of the first and second sources of the child operation respectively in the first and second dependency matrices. Further, the scheduler includes logic to perform tag comparisons between the respective physical addresses of the destinations of the first and second parent operations and the respective physical addresses of the first and second sources of the child operation. In addition, the ready determination unit includes logic to determine that the child operation is ready for dispatch based on the tag comparisons.
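
The tag-comparison idea in the abstract above can be sketched in a few lines of Python. This is a simplified software model under stated assumptions, not the patented design: the class name `MatrixSchedulerSketch`, its method names, and the use of dictionaries for the two dependency matrices are all hypothetical choices made for illustration.

```python
class MatrixSchedulerSketch:
    """Toy model: a child op is ready once both source tags have matched."""

    def __init__(self):
        # Hypothetical stand-ins for the first and second dependency matrices:
        # each maps a queued operation to the physical address of one source.
        self.first_matrix = {}
        self.second_matrix = {}
        # Destination physical addresses announced by completed parent ops.
        self.broadcast_tags = set()

    def queue(self, op_name, first_source, second_source):
        """Record the physical addresses of the op's two sources."""
        self.first_matrix[op_name] = first_source
        self.second_matrix[op_name] = second_source

    def broadcast(self, destination_tag):
        """A parent op announces the physical address of its destination."""
        self.broadcast_tags.add(destination_tag)

    def is_ready(self, op_name):
        """Tag comparison: both sources must match a broadcast destination."""
        first_match = self.first_matrix[op_name] in self.broadcast_tags
        second_match = self.second_matrix[op_name] in self.broadcast_tags
        return first_match and second_match


scheduler = MatrixSchedulerSketch()
scheduler.queue("child", first_source=10, second_source=11)

ready_before = scheduler.is_ready("child")
scheduler.broadcast(10)  # first parent writes physical register 10
ready_after_first = scheduler.is_ready("child")
scheduler.broadcast(11)  # second parent writes physical register 11
ready_after_both = scheduler.is_ready("child")

print(ready_before, ready_after_first, ready_after_both)
```

The child becomes dispatchable only after both parents have broadcast their destination tags, which mirrors the role of the ready determination unit described in the abstract.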

5.

Invention publication
ACCELERATING KECCAK ALGORITHMS Under examination (published)

Publication (announcement) number: US20240211253A1

Publication (announcement) date: 2024-06-27

Application number: US18145744

Application date: 2022-12-22

Applicant: Intel Corporation

Inventor: Santosh Ghosh, Christoph Dobraunig, Manoj Sastry, Andrew H. Reinders, Regev Shemy, Qian Wang, Rotem Ohana Peretz, Wing Shek Wong, Wajdi Feghali

IPC: G06F9/30, G06F9/38

CPC classification number: G06F9/30029, G06F9/3016, G06F9/3802

Abstract: A method comprises fetching, by fetch circuitry, an encoded parity instruction comprising at least one opcode, a first source identifier for a first source, a second source identifier for a second source, a third source identifier for a third source, and a destination identifier for a destination; decoding, by decode circuitry, the encoded parity instruction to generate a decoded parity instruction; and executing, by execution circuitry, the decoded parity instruction to retrieve operands representing a first register from the first source, a second register from the second source, a third register from the third source, and an index from the third source, perform an XOR operation of four words of data from the first register and a single word of data from the second register in a position represented by the index to generate a parity value, and store the parity value in the first register in a position represented by the index.
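
One possible reading of the XOR step in the abstract above is sketched below in Python. The abstract leaves several details open, so the following are assumptions rather than the instruction's defined semantics: registers are modeled as lists of 64-bit words, the four words are taken to be the first four words of the first register, and the function name `parity_step` is hypothetical.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit word size, as used for Keccak lanes


def parity_step(first_reg, second_reg, index):
    """XOR four words of first_reg with second_reg[index]; store at index.

    Returns a new list so the caller's input register is left unchanged.
    """
    if len(first_reg) < 4:
        raise ValueError("first_reg must hold at least four words")
    parity = 0
    for word in first_reg[:4]:   # four words from the first register
        parity ^= word
    parity ^= second_reg[index]  # single word selected by the index
    result = list(first_reg)
    result[index] = parity & MASK64
    return result


first = [0x1, 0x2, 0x4, 0x8]
second = [0x10, 0x20, 0x40, 0x80]
updated = parity_step(first, second, index=2)
print([hex(word) for word in updated])
```

Five words combined by XOR matches the shape of the column-parity computation in the Keccak theta step, which is presumably what such an instruction would accelerate.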

6.

Granted invention patent
Power logic for memory address conversion Active

Publication (announcement) number: US09330022B2

Publication (announcement) date: 2016-05-03

Application number: US13926564

Application date: 2013-06-25

Applicant: Intel Corporation

Inventor: James E Phillips, Wing Shek Wong, Charles Vitu

IPC: G06F12/10, G06F9/30, G06F9/32

CPC classification number: G06F12/1036, G06F9/3001, G06F9/32

Abstract: In an embodiment, a processor includes a plurality of cores. Each core includes conversion power logic to receive an instruction including an untranslated memory address, determine whether a code segment (CS) base address is equal to zero, and in response to a determination that the CS base address is equal to zero, execute the instruction using the untranslated memory address. Other embodiments are described and claimed.
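
The zero-base check in the abstract above can be illustrated with a short Python sketch. The abstract only states what happens when the CS base is zero; the non-zero branch here assumes the usual x86 segmentation rule that a linear address is the segment base plus the offset, and the function name `resolve_address` and its return shape are hypothetical.

```python
def resolve_address(cs_base: int, untranslated_address: int):
    """Return (address_to_use, conversion_performed).

    When the code segment base is zero, adding it would change nothing,
    so the untranslated address is used directly and the conversion
    step is skipped.
    """
    if cs_base == 0:
        return untranslated_address, False
    # Assumed behavior for a non-zero base: base + offset.
    return cs_base + untranslated_address, True


flat_result = resolve_address(0x0, 0x4000)
segmented_result = resolve_address(0x1000, 0x4000)
print(flat_result, segmented_result)
```

Skipping the addition when the base is zero is where a power saving could come from, since the conversion logic does not need to be exercised in that case.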

