-
Publication No.: US10761757B2
Publication date: 2020-09-01
Application No.: US16024812
Filing date: 2018-06-30
Applicant: INTEL CORPORATION
Inventor: Krishnakumar Nair, Andrew Yang, Michael Rotzin, Nitin Garegrat, Tom Schebye, Tony Werner
Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the sets of one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
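Illustrative sketch (not taken from the patent): the block-wise conversion described in this abstract can be pictured as follows, assuming the first numeric representation is IEEE 754 binary32 and the second is bfloat16. The block layout (a dictionary keyed by block coordinates), the sorted traversal order, and the function names are all hypothetical choices made for this example. NaN inputs are not handled.

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Round a binary32 value to bfloat16 (round-to-nearest-even); return the 16-bit pattern.
    Assumption: NaN inputs are out of scope for this sketch."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)  # ties go to the even bfloat16 value
    return ((bits + rounding_bias) >> 16) & 0xFFFF

def bf16_bits_to_float(b: int) -> float:
    """Widen a bfloat16 bit pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

def convert_tensor(src_blocks):
    """Convert every source block and store the result under the same block
    coordinate, so the destination keeps the source's block arrangement.
    src_blocks: dict mapping (block_row, block_col) -> list of floats (hypothetical layout)."""
    dst_blocks = {}
    for coord in sorted(src_blocks):  # assumed fixed conversion order
        dst_blocks[coord] = [fp32_to_bf16_bits(v) for v in src_blocks[coord]]
    return dst_blocks

src = {(0, 0): [1.0, -2.0], (0, 1): [0.5, 1.00390625]}
dst = convert_tensor(src)
```

Each destination block lands at the coordinate of its source block, which is one simple way to keep the two structures coherent; the patent's "designated memory region" may be realized quite differently in hardware.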
-
Publication No.: US20250123843A1
Publication date: 2025-04-17
Application No.: US18990080
Filing date: 2024-12-20
Applicant: Intel Corporation
Inventor: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
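Illustrative sketch (not taken from the patent): one way to picture a strided read sequence is as a list of (start, stride, count) read operations over a flat memory. The tuple encoding and the function name are assumptions made for this example. Here a 2x3 matrix is stored column by column, and the sequence reads it back row by row.

```python
def strided_read(memory, sequence):
    """Apply each (start, stride, count) read operation in order and
    concatenate the elements that were read."""
    out = []
    for start, stride, count in sequence:
        out.extend(memory[start + i * stride] for i in range(count))
    return out

# Matrix [[1, 2, 3], [4, 5, 6]] stored column-major, i.e. "out of order"
# relative to the row-major order the consumer expects.
memory = [1, 4, 2, 5, 3, 6]

# One read operation per row: start at the row index, step over one column (2 elements).
read_sequence = [(0, 2, 3), (1, 2, 3)]

operand = strided_read(memory, read_sequence)
```

Storing the read sequence next to the operand means the data never has to be physically reordered before an instruction consumes it.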
-
Publication No.: US11520562B2
Publication date: 2022-12-06
Application No.: US16557959
Filing date: 2019-08-30
Applicant: Intel Corporation
Inventor: Brian J. Hickmann, Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin
Abstract: A method comprising storing a plurality of entries, each entry of the plurality of entries associated with a portion of a range of input values, each entry of the plurality of entries comprising a set of coefficients defining a power series approximation; selecting a first entry of the plurality of entries based on a determination that a floating point input value is within a portion of the range of input values that is associated with the first entry; and calculating an output value by evaluating the power series approximation defined by the set of coefficients of the first entry at the floating point input value.
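Illustrative sketch (not taken from the patent): a table-driven piecewise approximation in the spirit of this abstract, assuming the target function is exp(x) on [0, 1), four equal-width segments, and degree-3 Taylor coefficients expanded about each segment's midpoint. The function choice, segment count, degree, and the stored midpoint are all assumptions of this example.

```python
import math

LO, HI, SEGMENTS, DEGREE = 0.0, 1.0, 4, 3  # assumed range and table shape
WIDTH = (HI - LO) / SEGMENTS

def build_table():
    """One entry per segment: (center, coefficients of the series in (x - center))."""
    table = []
    for i in range(SEGMENTS):
        center = LO + (i + 0.5) * WIDTH
        coeffs = [math.exp(center) / math.factorial(k) for k in range(DEGREE + 1)]
        table.append((center, coeffs))
    return table

TABLE = build_table()

def approx_exp(x: float) -> float:
    """Select the entry whose segment contains x, then evaluate its series by Horner's rule."""
    if not (LO <= x < HI):
        raise ValueError("input outside the tabulated range")
    center, coeffs = TABLE[int((x - LO) / WIDTH)]
    d = x - center
    result = 0.0
    for c in reversed(coeffs):
        result = result * d + c
    return result
```

Narrower segments or higher-degree series trade table size against accuracy; with the values assumed here the truncation error stays below about 3e-5 across the range.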
-
Publication No.: US20190384575A1
Publication date: 2019-12-19
Application No.: US16557959
Filing date: 2019-08-30
Applicant: Intel Corporation
Inventor: Brian J. Hickmann, Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin
Abstract: A method comprising storing a plurality of entries, each entry of the plurality of entries associated with a portion of a range of input values, each entry of the plurality of entries comprising a set of coefficients defining a power series approximation; selecting a first entry of the plurality of entries based on a determination that a floating point input value is within a portion of the range of input values that is associated with the first entry; and calculating an output value by evaluating the power series approximation defined by the set of coefficients of the first entry at the floating point input value.
-
Publication No.: US12229560B2
Publication date: 2025-02-18
Application No.: US18199771
Filing date: 2023-05-19
Applicant: Intel Corporation
Inventor: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
-
Publication No.: US11687341B2
Publication date: 2023-06-27
Application No.: US16556223
Filing date: 2019-08-29
Applicant: Intel Corporation
Inventor: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
CPC classification number: G06F9/3455, G06F9/30032, G06F9/30036
Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
-
Publication No.: US20190391811A1
Publication date: 2019-12-26
Application No.: US16556223
Filing date: 2019-08-29
Applicant: Intel Corporation
Inventor: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
-
Publication No.: US12254061B2
Publication date: 2025-03-18
Application No.: US17256195
Filing date: 2018-09-27
Applicant: Intel Corporation
Inventor: Maciej Urbanski, Brian J. Hickmann, Michael Rotzin, Krishnakumar Nair, Andrew Yang, Brian S. Morris, Dennis Bradford
Abstract: Methods and apparatuses relating to performing vector multiplication are described. Hardware accelerators to perform vector multiplication are also described. In one embodiment, a combined fixed-point and floating-point vector multiplication circuit includes at least one switch to change the circuit between a first mode and a second mode, where in the first mode, each multiplier of a set of multipliers is to multiply mantissas from a same element position of a first floating-point vector and a second floating-point vector to produce a corresponding product, shift the corresponding products with a set of shift registers based on a maximum exponent of exponents for the corresponding products determined by a maximum exponent determiner to produce shifted products, perform a numeric conversion operation on the shifted products with a set of numeric conversion circuits based on sign bits from the same element position of the first floating-point vector and the second floating-point vector to produce signed representations of the shifted products, add the signed representations of the shifted products with a set of adders to produce a single product, and normalize the single product with a normalization circuit based on the maximum exponent into a single floating-point resultant, and in the second mode, each multiplier of the set of multipliers is to multiply values from a same element position of a first integer vector and a second integer vector to produce a corresponding product, and add each corresponding product with the set of adders to produce a single integer resultant.
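Illustrative sketch (not taken from the patent): a software model of the two modes described in this abstract. The 24-bit mantissa width, the `mode` string standing in for the hardware switch, and the helper names are assumptions of this example. Right shifts truncate low bits, so the float mode is only approximate in general.

```python
import math

MANT_BITS = 24  # assumed mantissa width for this sketch

def _split(x):
    """Return (sign bit, integer mantissa, exponent) so that
    |x| is approximately mantissa * 2**(exponent - MANT_BITS)."""
    frac, exp = math.frexp(abs(x))
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    return sign, int(frac * (1 << MANT_BITS)), exp

def dot(a, b, mode):
    """mode == "int": plain multiply-accumulate on integers.
    mode == "float": multiply mantissas, align to the maximum product exponent,
    apply signs, add, then normalize back to a float."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if mode == "int":
        return sum(x * y for x, y in zip(a, b))
    products = []
    for x, y in zip(a, b):
        sx, mx, ex = _split(x)
        sy, my, ey = _split(y)
        if mx and my:  # zero products do not take part in exponent selection
            products.append((sx ^ sy, mx * my, ex + ey))
    if not products:
        return 0.0
    max_exp = max(exp for _, _, exp in products)
    total = 0
    for sign, mant, exp in products:
        shifted = mant >> (max_exp - exp)        # align to the largest exponent
        total += -shifted if sign else shifted   # signed representation
    return math.ldexp(float(total), max_exp - 2 * MANT_BITS)  # normalization step

float_result = dot([1.5, 2.0, -0.5], [2.0, 0.25, 4.0], "float")
int_result = dot([1, 2, 3], [4, 5, 6], "int")
```

Both modes share the same multiply and add structure; only the alignment, sign handling, and normalization steps are specific to the floating-point path, which is what makes a combined circuit plausible.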
-
Publication No.: US20230333855A1
Publication date: 2023-10-19
Application No.: US18199771
Filing date: 2023-05-19
Applicant: Intel Corporation
Inventor: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
CPC classification number: G06F9/3455, G06F9/30032, G06F9/30036
Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.
-
Publication No.: US11169776B2
Publication date: 2021-11-09
Application No.: US16457318
Filing date: 2019-06-28
Applicant: Intel Corporation
Inventor: Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin, Brian J. Hickmann, Valentina Popescu
Abstract: Systems, apparatuses and methods may provide for technology that in response to an identification that one or more hardware units are to execute on a first type of data format, decomposes a first original floating point number to a plurality of first segmented floating point numbers that are to be equivalent to the first original floating point number. The technology may further in response to the identification, decompose a second original floating point number to a plurality of second segmented floating point numbers that are to be equivalent to the second original floating point number. The technology may further execute a multiplication operation on the first and second segmented floating point numbers to multiply the first segmented floating point numbers with the second segmented floating point numbers.
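Illustrative sketch (not taken from the patent): one common way to realize this kind of decomposition, assuming the original numbers are IEEE 754 binary32 and the hardware format is bfloat16. Each value is split by truncation into three bfloat16-representable segments whose sum equals the original, and the product is formed from all pairwise segment products. The formats, the segment count, and the function names are assumptions of this example; subnormal inputs are out of scope.

```python
import struct

def to_fp32(x: float) -> float:
    """Round a Python float to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def trunc_bf16(x: float) -> float:
    """Keep only the top 16 bits of the binary32 pattern (a bfloat16 value)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def decompose(x: float, parts: int = 3):
    """Split a binary32 value into `parts` bfloat16 segments that sum to it."""
    segments = []
    remainder = to_fp32(x)
    for _ in range(parts):
        seg = trunc_bf16(remainder)
        segments.append(seg)
        remainder -= seg  # exact: seg holds the leading bits of remainder
    return segments

def segmented_multiply(a: float, b: float) -> float:
    """Multiply via all pairwise products of the segments of a and b."""
    return sum(sa * sb for sa in decompose(a) for sb in decompose(b))

a, b = to_fp32(3.14159265), to_fp32(2.71828)
product = segmented_multiply(a, b)
```

Because a binary32 significand has 24 bits and each bfloat16 segment carries 8, three segments are enough to represent the original value exactly under these assumptions. A real design might drop the smallest cross terms to save multiplications at some cost in accuracy.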