Patent search ap:("INTEL CORPORATION") AND inv:"GIRKAR Page Milind B."

1.

Invention application
SYSTEMS, METHODS, AND APPARATUSES FOR TILE STORE (under examination, published)

Publication No.: WO2018174932A1

Publication date: 2018-09-27

Application No.: PCT/US2017/040543

Application date: 2017-07-01

Applicant: INTEL CORPORATION

Inventor: VALENTINE, Robert; ADELMAN, Menachem; OULD-AHMED-VALL, Elmoustapha; TOLL, Bret L.; GIRKAR, Milind B.; SPERBER, Zeev; CHARNEY, Mark J.; RAPPOPORT, Rinat; CORBAL, Jesus; SHWARTSMAN, Stanislav; YANOVER, Igor; HEINECKE, Alexander F.; ZIV, Barukh; BAUM, Daniel; GEBIL, Yuri

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.

2.

Invention application
FLOATING POINT (FP) ADD LOW INSTRUCTIONS FUNCTIONAL UNIT (under examination, published)

Publication No.: WO2017112308A1

Publication date: 2017-06-29

Application No.: PCT/US2016/063634

Application date: 2016-11-23

Applicant: INTEL CORPORATION

Inventor: ANDERSON, Cristina S.; CORNEA-HASEGAN, Marius A.; OULD-AHMED-VALL, Elmoustapha; VALENTINE, Robert; CORBAL, Jesus; ASTAFEV, Nikita; CHARNEY, Mark J.; GIRKAR, Milind B.; GRADSTEIN, Amit; RUBANOVICH, Simon; SPERBER, Zeev

IPC: G06F9/30

CPC classification number: G06F7/485

Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.
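
The sequence of steps in this abstract can be modeled in software. What follows is a minimal Python sketch, not the hardware described in the publication: it assumes IEEE 754 double precision with round-to-nearest (Python's default `float` behavior) and finite inputs whose sum does not overflow. The function name `add_low` is hypothetical, and the sketch uses only the two FP values that the abstract's steps actually operate on.

```python
from fractions import Fraction


def add_low(a: float, b: float) -> float:
    """Return the low-order part of a + b (hypothetical helper, finite inputs only)."""
    # Add and round: Python floats round the sum to the nearest double.
    add_value = a + b
    # Form the same sum again, this time exactly (no rounding).
    exact_sum = Fraction(a) + Fraction(b)
    # Subtract the rounded ADD value from the exact sum.
    difference = exact_sum - Fraction(add_value)
    # Converting back to float normalizes and rounds the difference.
    return float(difference)


# 2**-60 is too small to change 1.0 in double precision, so it is
# lost from the rounded sum and recovered as the ADD low value.
high = 1.0 + 2.0 ** -60
low = add_low(1.0, 2.0 ** -60)
```

Under these assumptions the pair (`high`, `low`) together represents the exact sum, which is the usual reason for exposing the low part to an application.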

3.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR DATA SPECULATION EXECUTION (under examination, published)

Publication No.: WO2016105800A1

Publication date: 2016-06-30

Application No.: PCT/US2015/062299

Application date: 2015-11-24

Applicant: INTEL CORPORATION

Inventor: HUGHES, Christopher J.; OULD-AHMED-VALL, Elmoustapha; VALENTINE, Robert; GIRKAR, Milind B.

IPC: G06F9/38, G06F9/30

CPC classification number: G06F9/3834, G06F9/3842

Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for DSX comprises execution hardware to execute instructions to begin and end a data speculative execution (DSX) and speculative instructions during the DSX, and DSX tracking hardware to track speculative memory accesses and detect ordering violations in a DSX of speculative instructions using a sequence number, addresses of instruction accesses, and whether an instruction being tracked is a write, and to trigger a mis-speculation upon an ordering violation.

4.

Invention application
SYSTEMS, METHODS, AND APPARATUSES FOR TILE LOAD (under examination, published)

Publication No.: WO2018174933A1

Publication date: 2018-09-27

Application No.: PCT/US2017/040544

Application date: 2017-07-01

Applicant: INTEL CORPORATION

Inventor: VALENTINE, Robert; ADELMAN, Menachem; GIRKAR, Milind B.; SPERBER, Zeev; CHARNEY, Mark J.; TOLL, Bret L.; RAPPOPORT, Rinat; CORBAL, Jesus; SHWARTSMAN, Stanislav; BAUM, Dan; YANOVER, Igor; HEINECKE, Alexander F.; ZIV, Barukh; OULD-AHMED-VALL, Elmoustapha; GEBIL, Yuri; SADE, Raanan

IPC: G06F9/345, G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.
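
The strided row load described in this abstract can be illustrated with a small software model. This is a sketch under assumptions, not the claimed instruction: memory is modeled as a flat Python sequence, and the names `tile_load`, `base`, `stride`, `rows`, and `cols` are hypothetical stand-ins for the source memory information and the tile configuration.

```python
from typing import List, Sequence


def tile_load(memory: Sequence[int], base: int, stride: int,
              rows: int, cols: int) -> List[List[int]]:
    """Copy `rows` groups of `cols` elements, `stride` apart, into a tile."""
    tile = []
    for r in range(rows):
        # Each configured row starts one stride after the previous one.
        start = base + r * stride
        if start + cols > len(memory):
            raise IndexError("row extends past end of memory")
        tile.append(list(memory[start:start + cols]))
    return tile


# A 3x4 tile read from a buffer whose rows are 5 elements apart.
tile = tile_load(list(range(20)), base=2, stride=5, rows=3, cols=4)
```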

5.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR FUSED MULTIPLY ADD (under examination, published)

Publication No.: WO2018075052A1

Publication date: 2018-04-26

Application No.: PCT/US2016/057991

Application date: 2016-10-20

Applicant: INTEL CORPORATION

Inventor: VALENTINE, Robert; RYVCHIN, Galina; MAJCHER, Piotr; CHARNEY, Mark J.; OULD-AHMED-VALL, Elmoustapha; CORBAL, Jesus; GIRKAR, Milind B.; SPERBER, Zeev; RUBANOVICH, Simon; GRADSTEIN, Amit

IPC: G06F9/30

Abstract: In some embodiments, packed data elements of first and second packed data source operands are of a first, different size than a second size of packed data elements of a third packed data operand. Execution circuitry executes decoded single instruction to perform, for each packed data element position of a destination operand, a multiplication of a M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, add of results from these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element divided by N.

6.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR GETTING EVEN AND ODD DATA ELEMENTS (under examination, published)

Publication No.: WO2017117387A1

Publication date: 2017-07-06

Application No.: PCT/US2016/069199

Application date: 2016-12-29

Applicant: INTEL CORPORATION

Inventor: VALENTINE, Robert; OULD-AHMED-VALL, Elmoustapha; BRANDT, Jason W.; CHARNEY, Mark J.; JHA, Ashish; GIRKAR, Milind B.; TOLL, Bret L.; STUPACHENKO, Evgeny V.; OSTANEVICH, Sergey Y.

IPC: G06F9/30

CPC classification number: G06F9/3016, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/30101, G06F9/30192

Abstract: Embodiments of systems, apparatuses, and method for getting even or odd data elements are described. For example, in some embodiments, an apparatus includes a decoder to decode an instruction, wherein the instruction to include fields for a first source operand, a second source operand, and a destination operand; and execution circuitry to execute the decoded instruction to extract data elements from even data element positions of the first and second source operands and store the extracted data elements into the destination operand.
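
The even-position extraction in this abstract can be sketched in a few lines of Python. The abstract does not say how the extracted elements are ordered in the destination, so the layout below (even elements of the first source, then even elements of the second) is an assumption, and the function name `get_even` is hypothetical.

```python
from typing import List, Sequence


def get_even(src1: Sequence[int], src2: Sequence[int]) -> List[int]:
    """Gather elements at even positions (0, 2, 4, ...) of both sources."""
    if len(src1) != len(src2):
        raise ValueError("source operands must have the same element count")
    # Assumed destination layout: first source's even elements in the
    # lower half, second source's even elements in the upper half.
    return list(src1[0::2]) + list(src2[0::2])


dest = get_even([0, 1, 2, 3], [4, 5, 6, 7])
```

Because each source contributes half of its elements, the destination holds as many elements as one source operand. An odd-position variant would slice with `[1::2]` instead.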

7.

Invention application
VECTOR STORE/LOAD INSTRUCTIONS FOR ARRAY OF STRUCTURES (under examination, published)

Publication No.: WO2017112227A1

Publication date: 2017-06-29

Application No.: PCT/US2016/063173

Application date: 2016-11-21

Applicant: INTEL CORPORATION

Inventor: JHA, Ashish; OULD-AHMED-VALL, Elmoustapha; VALENTINE, Robert; CHARNEY, Mark J.; GIRKAR, Milind B.

IPC: G06F9/30

CPC classification number: G06F9/30036, G06F9/30043, G06F9/30109, G06F9/3455

Abstract: A processor comprises a plurality of vector registers, and an execution unit, operatively coupled to the plurality of vector registers, the execution unit comprising a logic circuit implementing a load instruction for loading, into two or more vector registers, two or more data items associated with a data structure stored in a memory, wherein each one of the two or more vector registers is to store a data item associated with a certain position number within the data structure.
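
The load described in this abstract amounts to de-interleaving an array of structures so that each register collects the item at one position number. The Python sketch below models that idea only; memory is a flat list, registers are plain lists, and the names `load_aos`, `num_fields`, and `num_structs` are hypothetical rather than taken from the publication.

```python
from typing import List, Sequence


def load_aos(memory: Sequence[int], base: int,
             num_fields: int, num_structs: int) -> List[List[int]]:
    """Return one register (list) per field position within the structure."""
    registers = []
    for field in range(num_fields):
        # Register `field` gathers item number `field` of every structure.
        registers.append([memory[base + s * num_fields + field]
                          for s in range(num_structs)])
    return registers


# Four {x, y, z} structures stored back to back: x0 y0 z0 x1 y1 z1 ...
regs = load_aos(list(range(12)), base=0, num_fields=3, num_structs=4)
```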

8.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR DATA SPECULATION EXECUTION (under examination, published)

Publication No.: WO2016105799A1

Publication date: 2016-06-30

Application No.: PCT/US2015/062293

Application date: 2015-11-24

Applicant: INTEL CORPORATION

Inventor: OULD-AHMED-VALL, Elmoustapha; HUGHES, Christopher J.; VALENTINE, Robert; GIRKAR, Milind B.

IPC: G06F9/38, G06F9/30

CPC classification number: G06F9/3016, G06F9/3013, G06F9/3842, G06F12/0875, G06F2212/452

Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode, and execution hardware to execute the decoded instruction to continue a data speculative execution (DSX) and to determine that a DSX loop iteration is to be committed, commit speculative stores associated with the DSX loop iteration, and start a new DSX loop iteration.

9.

Invention application
VECTORIZE STORE INSTRUCTIONS METHOD AND APPARATUS (under examination, published)

Publication No.: WO2018004372A1

Publication date: 2018-01-04

Application No.: PCT/RU2016/000410

Application date: 2016-07-01

Applicant: INTEL CORPORATION; PLOTNIKOV, Mikhail; IDO, Hideki; TIAN, Xinmin; PREIS, Sergey; GIRKAR, Milind B.; SHUTOV, Maxim

Inventor: PLOTNIKOV, Mikhail; IDO, Hideki; TIAN, Xinmin; PREIS, Sergey; GIRKAR, Milind B.; SHUTOV, Maxim

IPC: G06F9/45

Abstract: Methods, apparatus, and system to optimize compilation of source code into vectorized compiled code, notwithstanding the presence of output dependencies which might otherwise preclude vectorization.

10.

Invention application
AGGREGATE SCATTER INSTRUCTIONS (under examination, published)

Publication No.: WO2017112194A1

Publication date: 2017-06-29

Application No.: PCT/US2016/062936

Application date: 2016-11-18

Applicant: INTEL CORPORATION

Inventor: JHA, Ashish; OULD-AHMED-VALL, Elmoustapha; VALENTINE, Robert; CHARNEY, Mark J.; GIRKAR, Milind B.

IPC: G06F9/38, G06F15/80

CPC classification number: G06F15/8007, G06F9/30, G06F9/30098, G06F9/3016

Abstract: An Aggregate Scatter instruction is described. A processor may include a memory interface and a register to store data elements of a data structure. The data elements may be contiguously stored in a first location in a memory accessible via the memory interface. The processor may further include a decoder to decode an aggregate scatter instruction specifying a store operation for the data structure and an execution unit to contiguously store the data elements to a second storage location in the memory in response to the decoded aggregate scatter instruction. The second storage location may be identified by a starting memory address of the second storage location.
