-
公开(公告)号:US20240241693A1
公开(公告)日:2024-07-18
申请号:US18426504
申请日:2024-01-30
Applicant: Intel Corporation
CPC classification number: G06F7/483 , G06F1/03 , G06F7/4873 , G06F7/4988 , G06F7/544
Abstract: A processor to facilitate execution of a single-precision floating point operation on an operand is disclosed. The processor includes one or more execution units, each having a plurality of floating point units to execute one or more instructions to perform the single-precision floating point operation on the operand, including performing a floating point operation on an exponent component of the operand; and performing a floating point operation on a mantissa component of the operand, comprising dividing the mantissa component into a first sub-component and a second sub-component, determining a result of the floating point operation for the first sub-component and determining a result of the floating point operation for the second sub-component, and returning a result of the floating point operation.
-
公开(公告)号:US11593069B2
公开(公告)日:2023-02-28
申请号:US17477939
申请日:2021-09-17
Applicant: Intel Corporation
Abstract: Embodiments described herein are generally directed to an improved vector normalization instruction. An embodiment of a method includes responsive to receipt by a GPU of a single instruction specifying a vector normalization operation to be performed on V vectors: (i) generating V squared length values, N at a time, by a first processing unit, by, for each N sets of inputs, each representing multiple component vectors for N of the vectors, performing N parallel dot product operations on the N sets of inputs. Generating V sets of outputs representing multiple normalized component vectors of the V vectors, N at a time, by a second processing unit, by, for each N squared length values of the V squared length values, performing N parallel operations on the N squared length values, wherein each of the N parallel operations implement a combination of a reciprocal square root function and a vector scaling function.
-
公开(公告)号:US11574382B2
公开(公告)日:2023-02-07
申请号:US17466591
申请日:2021-09-03
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Eric G. Liskay , Prasoonkumar Surti , Sudhakar Kamma , Karthik Vaidyanathan , Rajasekhar Pantangi , Altug Koker , Abhishek Rhisheekesan , Shashank Lakshminarayana , Priyanka Ladda , Karol A. Szerszen
Abstract: Examples described herein relate to a decompression engine that can request compressed data to be transferred over a memory bus. In some cases, the memory bus is a width that requires multiple data transfers to transfer the requested data. In a case that requested data is to be presented in-order to the decompression engine, a re-order buffer can be used to store entries of data. When a head-of-line entry is received, the entry can be provided to the decompression engine. When a last entry in a group of one or more entries is received, all entries in the group are presented in-order to the decompression engine. In some examples, a decompression engine can borrow memory resources allocated for use by another memory client to expand a size of re-order buffer available for use. For example, a memory client with excess capacity and a slowest growth rate can be chosen to borrow memory resources from.
-
公开(公告)号:US20220147316A1
公开(公告)日:2022-05-12
申请号:US17477939
申请日:2021-09-17
Applicant: Intel Corporation
Abstract: Embodiments described herein are generally directed to an improved vector normalization instruction. An embodiment of a method includes responsive to receipt by a GPU of a single instruction specifying a vector normalization operation to be performed on V vectors: (i) generating V squared length values, N at a time, by a first processing unit, by, for each N sets of inputs, each representing multiple component vectors for N of the vectors, performing N parallel dot product operations on the N sets of inputs. Generating V sets of outputs representing multiple normalized component vectors of the V vectors, N at a time, by a second processing unit, by, for each N squared length values of the V squared length values, performing N parallel operations on the N squared length values, wherein each of the N parallel operations implement a combination of a reciprocal square root function and a vector scaling function.
-
公开(公告)号:US11157238B2
公开(公告)日:2021-10-26
申请号:US16685561
申请日:2019-11-15
Applicant: Intel Corporation
Abstract: Embodiments described herein are generally directed to an improved vector normalization instruction. An embodiment of a method includes responsive to receipt by a GPU of a single instruction specifying a vector normalization operation to be performed on V vectors: (i) generating V squared length values, N at a time, by a first processing unit, by, for each N sets of inputs, each representing multiple component vectors for N of the vectors, performing N parallel dot product operations on the N sets of inputs. Generating V sets of outputs representing multiple normalized component vectors of the V vectors, N at a time, by a second processing unit, by, for each N squared length values of the V squared length values, performing N parallel operations on the N squared length values, wherein each of the N parallel operations implement a combination of a reciprocal square root function and a vector scaling function.
-
公开(公告)号:US11113783B2
公开(公告)日:2021-09-07
申请号:US16683024
申请日:2019-11-13
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Eric G. Liskay , Prasoonkumar Surti , Sudhakar Kamma , Karthik Vaidyanathan , Rajasekhar Pantangi , Altug Koker , Abhishek Rhisheekesan , Shashank Lakshminarayana , Priyanka Ladda , Karol A. Szerszen
Abstract: Examples described herein relate to a decompression engine that can request compressed data to be transferred over a memory bus. In some cases, the memory bus is a width that requires multiple data transfers to transfer the requested data. In a case that requested data is to be presented in-order to the decompression engine, a re-order buffer can be used to store entries of data. When a head-of-line entry is received, the entry can be provided to the decompression engine. When a last entry in a group of one or more entries is received, all entries in the group are presented in-order to the decompression engine. In some examples, a decompression engine can borrow memory resources allocated for use by another memory client to expand a size of re-order buffer available for use. For example, a memory client with excess capacity and a slowest growth rate can be chosen to borrow memory resources from.
-
公开(公告)号:US20200319851A1
公开(公告)日:2020-10-08
申请号:US16375307
申请日:2019-04-04
Applicant: Intel Corporation
Abstract: A processor to facilitate execution of a single-precision floating point operation on an operand is disclosed. The processor includes one or more execution units, each having a plurality of floating point units to execute one or more instructions to perform the single-precision floating point operation on the operand, including performing a floating point operation on an exponent component of the operand; and performing a floating point operation on a mantissa component of the operand, comprising dividing the mantissa component into a first sub-component and a second sub-component, determining a result of the floating point operation for the first sub-component and determining a result of the floating point operation for the second sub-component, and returning a result of the floating point operation.
-
公开(公告)号:US11934797B2
公开(公告)日:2024-03-19
申请号:US16375307
申请日:2019-04-04
Applicant: Intel Corporation
CPC classification number: G06F7/483 , G06F1/03 , G06F7/4873 , G06F7/4988 , G06F7/544
Abstract: A processor to facilitate execution of a single-precision floating point operation on an operand is disclosed. The processor includes one or more execution units, each having a plurality of floating point units to execute one or more instructions to perform the single-precision floating point operation on the operand, including performing a floating point operation on an exponent component of the operand; and performing a floating point operation on a mantissa component of the operand, comprising dividing the mantissa component into a first sub-component and a second sub-component, determining a result of the floating point operation for the first sub-component and determining a result of the floating point operation for the second sub-component, and returning a result of the floating point operation.
-
公开(公告)号:US20210149635A1
公开(公告)日:2021-05-20
申请号:US16685561
申请日:2019-11-15
Applicant: Intel Corporation
Abstract: Embodiments described herein are generally directed to an improved vector normalization instruction. An embodiment of a method includes responsive to receipt by a GPU of a single instruction specifying a vector normalization operation to be performed on V vectors: (i) generating V squared length values, N at a time, by a first processing unit, by, for each N sets of inputs, each representing multiple component vectors for N of the vectors, performing N parallel dot product operations on the N sets of inputs. Generating V sets of outputs representing multiple normalized component vectors of the V vectors, N at a time, by a second processing unit, by, for each N squared length values of the V squared length values, performing N parallel operations on the N squared length values, wherein each of the N parallel operations implement a combination of a reciprocal square root function and a vector scaling function.
-
-
-
-
-
-
-
-