Patent search ipc:"G06F9/302" Page 1

1.

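Invention application
OPERATION METHOD AND APPARATUS FOR MATRIX MULTIPLICATION (Pending - Published)

Publication No.: WO2023078364A1

Publication date: 2023-05-11

Application No.: PCT/CN2022/129619

Filing date: 2022-11-03

Applicant: 深圳市中兴微电子技术有限公司

Inventor: 雷洪 , 甄德根 , 吴桐庆 , 孔德辉 , 徐科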

IPC: G06F9/302 , G06F7/487 , G06F7/485

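Abstract: Embodiments of the present invention provide a matrix multiplication method and apparatus. The method comprises: splitting each of two 2N-bit floating-point data items into a corresponding sign bit, precision bits, and exponent bits, and splitting each of four N-bit integer data items into a corresponding sign bit and precision bits; performing a matrix multiplication operation on the two floating-point data items by adding the exponent bits, XORing the sign bits, and multiplying the precision bits, and performing matrix multiplication operations on the four integer data items in pairs by XORing the sign bits and multiplying the precision bits, wherein multiplication units and addition units are reused across the floating-point and integer matrix multiplication operations. In the present invention, splitting input data of different data types allows the multiplication and addition resources of the accelerator to be reused during matrix multiplication, which greatly reduces the accelerator's chip area and lowers its cost.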
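
The field-wise multiplication described in this abstract can be illustrated with a minimal Python sketch. It is not the applicant's circuit: it assumes 2N = 16 (IEEE 754 binary16) and N = 8, handles only normal floating-point values, truncates instead of rounding, and uses a hypothetical `shared_multiplier` helper to stand in for the multiplier that both data paths would reuse.

```python
import struct

def shared_multiplier(x, y):
    # Hypothetical stand-in for the one hardware multiplier both paths reuse.
    return x * y

def split_fp16(value):
    # Encode as binary16 and split into sign, exponent and precision (mantissa) fields.
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF

def mul_fp16_normal(a, b):
    # Assumption: normal inputs and a normal result; the product is truncated.
    sa, ea, ma = split_fp16(a)
    sb, eb, mb = split_fp16(b)
    sign = sa ^ sb                    # sign bits are XORed
    exponent = ea + eb - 15           # exponents are added, one bias removed
    product = shared_multiplier(ma | 0x400, mb | 0x400)  # implicit leading 1 restored
    if product & (1 << 21):           # product in [2, 4): renormalize
        product >>= 1
        exponent += 1
    mantissa = (product >> 10) & 0x3FF
    bits = (sign << 15) | (exponent << 10) | mantissa
    (result,) = struct.unpack("<e", struct.pack("<H", bits))
    return result

def mul_int8_sign_magnitude(a, b):
    # Integer path: XOR the signs, multiply the magnitudes on the same multiplier.
    negative = (a < 0) ^ (b < 0)
    magnitude = shared_multiplier(abs(a), abs(b))
    return -magnitude if negative else magnitude

print(mul_fp16_normal(1.5, -2.5))       # -3.75
print(mul_int8_sign_magnitude(-7, 12))  # -84
```

Both functions call the same multiplier helper, which is the resource-sharing idea the abstract credits for the area saving; a full matrix multiplication would add an accumulation step on top.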

2.

Invention application
APPARATUS AND METHOD FOR ENERGY-EFFICIENT AND ACCELERATED PROCESSING OF AN ARITHMETIC OPERATION (Pending - Published)

Publication No.: WO2023000110A1

Publication date: 2023-01-26

Application No.: PCT/CA2022/051140

Filing date: 2022-07-22

Applicant: SOLID STATE OF MIND

Inventor: DUMESNIL, Etienne , JULIEN, Maxime

IPC: G06F7/57 , G06F9/302

Abstract: An apparatus and a method for accelerated processing of an arithmetic operation. The apparatus comprises an operand pre-arithmetic status register configured to generate a status notification that flags that one of a set of predetermined combinatory conditions between a first operand and a second operand is met; and a modified arithmetic logic unit. The modified arithmetic logic unit comprises an electronic logic circuit configured to, in response to receiving the status notification from the operand pre-arithmetic status register, readdress execution of the arithmetic operation towards an expedited routine within the modified arithmetic logic unit if the status notification comprises one or more flags, or to a conventional routine if the status notification is a blank status notification, the expedited routine requiring fewer calculation cycles to output an operation result than the conventional routine.
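
As a rough illustration of the dispatch this abstract describes, the Python sketch below checks the operands before the arithmetic and routes flagged cases to a shortcut. The specific conditions (an operand equal to 0 or 1), the flag names, and the step counts are assumptions made for the example; the abstract does not list the conditions or cycle counts.

```python
def status_flags(a, b):
    # Hypothetical pre-arithmetic check; an empty set plays the role of a blank notification.
    flags = set()
    if a == 0 or b == 0:
        flags.add("ZERO")
    if a == 1 or b == 1:
        flags.add("ONE")
    return flags

def shift_add_multiply(a, b):
    # Conventional routine: unsigned shift-and-add, one step per bit of b.
    result, steps = 0, 0
    while b:
        if b & 1:
            result += a
        a <<= 1
        b >>= 1
        steps += 1
    return result, steps

def multiply(a, b):
    # Returns (product, steps) for non-negative integers.
    flags = status_flags(a, b)
    if "ZERO" in flags:
        return 0, 1                        # expedited: result known without multiplying
    if "ONE" in flags:
        return (b if a == 1 else a), 1     # expedited: pass the other operand through
    return shift_add_multiply(a, b)        # blank notification: conventional routine

print(multiply(0, 12345))  # (0, 1)
print(multiply(6, 7))      # (42, 3)
```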

3.

Invention application
PROCESSOR UNIT FOR MULTIPLY AND ACCUMULATE OPERATIONS (Pending - Published)

Publication No.: WO2021111272A1

Publication date: 2021-06-10

Application No.: PCT/IB2020/061262

Filing date: 2020-11-30

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION , IBM (CHINA) INVESTMENT COMPANY LTD. , IBM DEUTSCHLAND GMBH

Inventor: LEENSTRA, Jentje , WAGNER, Andreas , MOREIRA, Jose , THOMPTO, Brian

IPC: G06F9/302

Abstract: A processor unit for multiply and accumulate ("MAC") operations is provided, the processor unit comprising: a plurality of MAC units for performing a set of MAC operations, wherein each MAC unit of the plurality of MAC units includes an execution unit and a one-write one-read ("1W/1R") register file, the 1W/1R register file having at least one accumulator; and another register file, wherein the execution unit of each MAC unit is configured to perform a subset of the MAC operations by computing a product of a set of values received from the other register file and adding the computed product to a content of the at least one accumulator, and wherein each MAC unit is configured to perform the subset of MAC operations in a single clock cycle.
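
A small software model can make the accumulator arrangement in this abstract concrete. The sketch below is only an illustration under stated assumptions: `MacUnit` and `matmul_with_macs` are hypothetical names, each unit holds a single accumulator, and the units are arranged as a grid so that feeding one column of A and one row of B per step accumulates a matrix product. The abstract itself does not prescribe this arrangement.

```python
class MacUnit:
    """Hypothetical model of one MAC unit with a single local accumulator."""

    def __init__(self):
        self.accumulator = 0  # stands in for an accumulator in the 1W/1R register file

    def mac(self, x, y):
        # One accumulator read and one accumulator write per step.
        self.accumulator += x * y

def matmul_with_macs(a, b):
    rows, inner, cols = len(a), len(b), len(b[0])
    units = [[MacUnit() for _ in range(cols)] for _ in range(rows)]
    for k in range(inner):              # one simulated step per k
        for i in range(rows):
            for j in range(cols):       # in hardware these units would work in parallel
                units[i][j].mac(a[i][k], b[k][j])
    return [[unit.accumulator for unit in row] for row in units]

print(matmul_with_macs([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Keeping the running sum inside each unit means the shared register file only has to supply the multiplicands, which is one plausible reading of why the accumulators get their own 1W/1R file.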

4.

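Invention application
APPARATUS AND METHOD FOR PERFORMING MATRIX ADDITION/SUBTRACTION OPERATIONS (Pending - Published)

Publication No.: WO2017185396A1

Publication date: 2017-11-02

Application No.: PCT/CN2016/081117

Filing date: 2016-05-05

Applicant: 北京中科寒武纪科技有限公司

Inventor: 张潇 , 刘少礼 , 陈天石 , 陈云霁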

IPC: G06F9/302

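Abstract: The present disclosure provides an apparatus for performing matrix addition and subtraction operations, comprising: a storage unit for storing matrix data associated with a matrix operation instruction; a register unit for storing scalar data associated with the matrix operation instruction; a control unit for decoding the matrix operation instruction and controlling its execution; and a matrix operation unit for performing matrix addition and subtraction operations on input matrices according to the decoded matrix operation instruction, wherein the matrix operation unit is a custom hardware circuit. The present disclosure also provides a method for performing matrix addition and subtraction operations.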

5.

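Invention application
APPARATUS AND METHOD FOR PERFORMING VECTOR MERGE OPERATIONS (Pending - Published)

Publication No.: WO2017185385A1

Publication date: 2017-11-02

Application No.: PCT/CN2016/080963

Filing date: 2016-05-04

Applicant: 北京中科寒武纪科技有限公司

Inventor: 李震 , 张潇 , 刘少礼 , 陈天石 , 陈云霁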

IPC: G06F9/302

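Abstract: An apparatus for performing vector merge operations, comprising: a storage unit for storing vector data associated with a vector merge operation instruction; a register unit for storing scalar data associated with the vector merge operation instruction; a control unit for decoding the vector merge operation instruction and controlling its execution; and a vector merge unit for performing a vector merge operation on two input vectors to be merged according to the decoded vector merge operation instruction, wherein the vector merge unit is a custom hardware circuit. The provided apparatus and method implement the complete process of a streamlined vector merge instruction in the custom hardware circuit; that is, a vector merge operation can be carried out with a single streamlined vector merge instruction.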

6.

Invention application
APPARATUS AND METHOD FOR SELECTING ELEMENTS OF A VECTOR COMPUTATION (Pending - Published)

Publication No.: WO2013147869A1

Publication date: 2013-10-03

Application No.: PCT/US2012/031596

Filing date: 2012-03-30

Applicant: INTEL CORPORATION , LEE, Victor W. , BHARADWAJ, Jayashankar , KIM, Daehyun , VASUDEVAN, Nalini , NGAI, Tin-Fook , HARTONO, Albert , BAGHSORKHI, Sara

Inventor: LEE, Victor W. , BHARADWAJ, Jayashankar , KIM, Daehyun , VASUDEVAN, Nalini , NGAI, Tin-Fook , HARTONO, Albert , BAGHSORKHI, Sara

IPC: G06F1/00 , G06F9/06 , G06F9/302

CPC classification number: G06F7/548 , G06F9/3001 , G06F9/30018 , G06F9/30036 , G06F9/30145 , G06F9/3877 , G06F9/3895

Abstract: An apparatus and method are described for performing a vector reduction. For example, an apparatus according to one embodiment comprises: a reduction logic tree comprised of a set of N-1 reduction logic blocks used to perform reduction in a single operation cycle for N vector elements; a first input vector register storing a first input vector communicatively coupled to the set of reduction logic blocks; a second input vector register storing a second input vector communicatively coupled to the set of reduction logic blocks; a mask register storing a mask value controlling a set of one or more multiplexers, each of the set of multiplexers selecting a value directly from the first input vector register or an output containing a processed value from one of the reduction logic blocks; and an output vector register coupled to outputs of the one or more multiplexers to receive values passed through by each of the multiplexers responsive to the control signals.

7.

Invention application
MULTI-ELEMENT INSTRUCTION WITH DIFFERENT READ AND WRITE MASKS (Pending - Published)

Publication No.: WO2013095659A1

Publication date: 2013-06-27

Application No.: PCT/US2011/067248

Filing date: 2011-12-23

Applicant: INTEL CORP , PLOTNIKOV MIKHAIL , NARAIKAN ANDREY , OULD-AHMED-VALL ELMOUSTAPHA , VALENTINE ROBERT , TOLL BRET L , CORBAL JESUS

Inventor: PLOTNIKOV MIKHAIL , NARAIKAN ANDREY , OULD-AHMED-VALL ELMOUSTAPHA , VALENTINE ROBERT , TOLL BRET L , CORBAL JESUS

IPC: G06F9/30 , G06F9/302 , G06F9/305

CPC classification number: G06F9/3013 , G06F7/764 , G06F9/3001 , G06F9/30014 , G06F9/30018 , G06F9/30029 , G06F9/30036

Abstract: A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation on the set of elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register.
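
The sequence of steps in this abstract can be sketched in Python as follows. This is an illustrative model, not the claimed instruction: the function name is hypothetical, the operation is assumed to be a sum, and destination elements whose write-mask bit is 0 are assumed to keep their previous value (the abstract does not say how unselected destination elements are treated).

```python
def masked_reduce_broadcast(src, read_mask, write_mask, dest, op=sum):
    # 1. Apply the read mask to pick the elements that take part in the operation.
    selected = [x for x, bit in zip(src, read_mask) if bit]
    # 2. Perform the operation on the selected elements (assumed here: a sum).
    result = op(selected)
    # 3. Build the output vector from multiple copies of the result.
    output = [result] * len(dest)
    # 4. Apply the (different) write mask; unselected lanes keep the old destination value.
    return [out if bit else old for out, old, bit in zip(output, dest, write_mask)]

dest = masked_reduce_broadcast(
    src=[1, 2, 3, 4],
    read_mask=[1, 0, 1, 0],
    write_mask=[0, 1, 1, 0],
    dest=[9, 9, 9, 9],
)
print(dest)  # [9, 4, 4, 9]
```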

8.

Invention application
APPARATUS AND METHOD FOR VECTOR INSTRUCTIONS FOR LARGE INTEGER ARITHMETIC (Pending - Published)

Publication No.: WO2013095629A1

Publication date: 2013-06-27

Application No.: PCT/US2011/067165

Filing date: 2011-12-23

Applicant: INTEL CORPORATION , WOLRICH, Gilbert M. , YAP, Kirk S. , GUILFORD, James D. , OZTURK, Erdinc , GOPAL, Vinodh , FEGHALI, Wajdi K. , GULLEY, Sean M. , DIXON, Martin G.

Inventor: WOLRICH, Gilbert M. , YAP, Kirk S. , GUILFORD, James D. , OZTURK, Erdinc , GOPAL, Vinodh , FEGHALI, Wajdi K. , GULLEY, Sean M. , DIXON, Martin G.

IPC: G06F9/30 , G06F9/302

CPC classification number: G06F9/3016 , G06F7/525 , G06F7/57 , G06F9/3001 , G06F9/30036 , G06F9/3893

Abstract: An apparatus is described that includes a semiconductor chip having an instruction execution pipeline having one or more execution units with respective logic circuitry to: a) execute a first instruction that multiplies a first input operand and a second input operand and presents a lower portion of the result, where, the first and second input operands are respective elements of first and second input vectors; b) execute a second instruction that multiplies a first input operand and a second input operand and presents an upper portion of the result, where, the first and second input operands are respective elements of first and second input vectors; and, c) execute an add instruction where a carry term of the add instruction's adding is recorded in a mask register.
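
The three instructions named in this abstract can be modeled element-wise in Python. The sketch assumes unsigned 64-bit lanes and uses hypothetical function names; the abstract does not state the lane width or the instruction mnemonics.

```python
LANE_BITS = 64                      # assumed lane width
LANE_MASK = (1 << LANE_BITS) - 1

def vmul_lo(a, b):
    # First instruction: lower portion of each element-wise product.
    return [(x * y) & LANE_MASK for x, y in zip(a, b)]

def vmul_hi(a, b):
    # Second instruction: upper portion of each element-wise product.
    return [(x * y) >> LANE_BITS for x, y in zip(a, b)]

def vadd_carry(a, b):
    # Add instruction: per-lane sums, with each lane's carry-out recorded as a mask bit.
    sums = [(x + y) & LANE_MASK for x, y in zip(a, b)]
    carry_mask = [int(x + y > LANE_MASK) for x, y in zip(a, b)]
    return sums, carry_mask

a, b = [LANE_MASK, 3], [2, 5]
print(vmul_lo(a, b))                     # [18446744073709551614, 15]
print(vmul_hi(a, b))                     # [1, 0]
print(vadd_carry([LANE_MASK, 1], [1, 1]))  # ([0, 2], [1, 0])
```

Splitting the product into low and high halves and keeping carries in a mask is what lets a multi-word (large integer) multiplication be assembled from fixed-width lanes.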

9.

Invention application
SUPER MULTIPLY ADD (SUPER MADD) INSTRUCTION WITH THREE SCALAR TERMS (Pending - Published)

Publication No.: WO2013095619A1

Publication date: 2013-06-27

Application No.: PCT/US2011/067096

Filing date: 2011-12-23

Applicant: INTEL CORP , CORBAL JESUS , FORSYTH ANDREW T , FLETCHER THOMAS D , WU LISA K , SPRANGLE ERIC

Inventor: CORBAL JESUS , FORSYTH ANDREW T , FLETCHER THOMAS D , WU LISA K , SPRANGLE ERIC

IPC: G06F9/30 , G06F9/302

CPC classification number: G06F9/3001 , G06F9/30014 , G06F9/30018 , G06F9/30036 , G06F9/30145 , G06F9/3893

Abstract: A processing core is described having execution unit logic circuitry having a first register to store a first vector input operand, a second register to store a second vector input operand, and a third register to store a packed data structure containing scalar input operands a, b, c. The execution unit logic circuitry further includes a multiplier to perform the operation (a*(first vector input operand)) + (b*(second vector input operand)) + c.
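
The operation in this abstract reduces to one expression per vector element. A minimal Python sketch, with a hypothetical function name and the three scalars passed as a tuple to mirror the packed third operand:

```python
def super_madd(v1, v2, scalars):
    # scalars models the packed data structure holding a, b and c.
    a, b, c = scalars
    return [a * x + b * y + c for x, y in zip(v1, v2)]

print(super_madd([1, 2, 3], [4, 5, 6], (2, 3, 1)))  # [15, 20, 25]
```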

10.

Invention application
MICROPROCESSOR AND METHOD FOR ENHANCED PRECISION SUM-OF-PRODUCTS CALCULATION ON A MICROPROCESSOR (Pending - Published)

Publication No.: WO2011063824A1

Publication date: 2011-06-03

Application No.: PCT/EP2009/008522

Filing date: 2009-11-30

Applicant: RAUBUCH, Martin

Inventor: RAUBUCH, Martin

IPC: G06F7/544 , G06F7/499 , G06F9/30 , G06F9/302

CPC classification number: G06F9/30014 , G06F7/49942 , G06F7/5443 , G06F9/30112 , G06F9/30138

Abstract: A microprocessor (10) comprises at least one general-purpose register (12) arranged to store and provide a number of destination bits to a multiply unit (14); a control unit (18) adapted to provide at least a multiply-high instruction (20) and a multiply-high-and-accumulate instruction (22) to the multiply unit. The multiply unit is further arranged to receive at least a first and a second source operand (24, 26), each having an associated number of source bits and a sum of the associated numbers of source bits exceeding the number of destination bits, connected to a register-extension cache (28) comprising at least one cache entry arranged to store and provide a number of precision-enhancement bits, and adapted to store a destination portion of a result operand in the general-purpose register and a precision-enhancement portion of the result operand in the cache entry. The result operand is generated by a multiply-high operation or by a multiply-high-and-accumulate operation, depending on the received instruction.
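
To show why keeping the extra bits helps a sum of products, the sketch below models a register paired with an extension entry. It assumes unsigned 32-bit source operands and a 32-bit destination, and the class and method names are hypothetical; the abstract does not fix the widths or the cache organization.

```python
BITS = 32                       # assumed destination width
LOW_MASK = (1 << BITS) - 1

class RegisterWithExtension:
    """Hypothetical model: a general-purpose register plus one extension cache entry."""

    def __init__(self):
        self.gpr = 0            # destination portion (high half of the result)
        self.extension = 0      # precision-enhancement portion (low half)

    def _store(self, wide):
        self.gpr = (wide >> BITS) & LOW_MASK
        self.extension = wide & LOW_MASK

    def multiply_high(self, x, y):
        self._store(x * y)

    def multiply_high_accumulate(self, x, y):
        # Rebuild the wide running sum from both portions before adding the new product.
        self._store(((self.gpr << BITS) | self.extension) + x * y)

reg = RegisterWithExtension()
reg.multiply_high(0xC0000000, 1)
reg.multiply_high_accumulate(0xC0000000, 1)
print(reg.gpr, hex(reg.extension))  # 1 0x80000000
```

Each product here has a high half of 0, so accumulating only the high halves would give 0; carrying the low halves through the extension entry yields the correct high half of 1.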