-
Publication No.: EP3394729A1
Publication date: 2018-10-31
Application No.: EP16879791.8
Filing date: 2016-11-23
Applicant: Intel Corporation
Inventors: ANDERSON, Cristina S.; CORNEA-HASEGAN, Marius A.; OULD-AHMED-VALL, Elmoustapha; VALENTINE, Robert; CORBAL, Jesus; ASTAFEV, Nikita; CHARNEY, Mark J.; GIRKAR, Milind B.; GRADSTEIN, Amit; RUBANOVICH, Simon; SPERBER, Zeev
CPC classification: G06F7/4876, G06F7/485, G06F7/49915
Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation and, in response: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product value to the third FP value to generate a first result value; rounds the first result value to generate a first FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value to the third FP value to generate a second result value; subtracts the first FMA value from the second result value to obtain a third result value, which can then be normalized and rounded to give the FMA low result; and sends the FMA low result to an application.
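
The steps in this abstract amount to returning the rounding error of a fused multiply-add. The Python sketch below is only a software model of that idea, not the patented hardware: it assumes IEEE 754 double precision with round-to-nearest, finite inputs, and uses exact rational arithmetic in place of the unrounded intermediate values. The function name `fma_low` is hypothetical.

```python
from fractions import Fraction

def fma_low(a: float, b: float, c: float) -> float:
    """Model of the abstract's steps: rounded (exact a*b+c minus rounded FMA)."""
    # Unrounded a*b + c (Fraction converts each float exactly).
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    # Rounding the exact value to a double models the ordinary FMA result.
    fma_value = float(exact)
    # Subtract the FMA value from the unrounded result, then round the difference.
    return float(exact - Fraction(fma_value))

x = 1.0 + 2.0**-30
low = fma_low(x, x, 0.0)  # the part of x*x that the rounded FMA result drops
```

Here `x * x` equals 1 + 2^-29 + 2^-60 exactly; the last term does not fit in a double next to 1, so it is what the low operation recovers.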
-
Publication No.: EP3391206A1
Publication date: 2018-10-24
Application No.: EP16879661.3
Filing date: 2016-11-18
Applicant: Intel Corporation
CPC classification: G06F9/30036, G06F9/30101, G06F9/3016, G06F12/084, G06F12/0855, G06F12/0862, G06F12/0875, G06F12/1027, G06F2212/452
Abstract: A processor includes a front end to decode an instruction and an allocator to assign the instruction to an execution unit to execute the instruction to permute vector data into a destination register for storing elements. The execution unit includes logic to compute an element count, logic to compute an index size, logic to compute a byte count, a temporary destination, an index from an index vector, an offset, logic to determine a subset of the temporary destination, and logic to store the subset in one element in the destination register.
-
Publication No.: EP3394719A1
Publication date: 2018-10-31
Application No.: EP16879850.2
Filing date: 2016-12-02
Applicant: Intel Corporation
CPC classification: G06F9/3001, G06F7/483, G06F7/49942, G06F9/30145, G06F2207/483
Abstract: In an embodiment, a processor includes a plurality of cores, with at least one core including a cancellation monitor unit. The cancellation monitor unit comprises circuitry to: detect an execution of a floating point (FP) instruction in the core, wherein the execution of the FP instruction uses a set of FP inputs and generates an FP output; determine a maximum exponent value associated with the set of FP inputs to the FP instruction; subtract an exponent value of the FP output from the maximum exponent value to obtain an exponent difference; and in response to a determination that the exponent difference meets or exceeds a threshold level, increment a cancellation event count. Other embodiments are described and claimed.
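
The exponent comparison described in this abstract can be modelled in a few lines of software. The sketch below is an illustration under stated assumptions, not the circuitry itself: the class name `CancellationMonitor` and method `observe` are hypothetical, exponents are taken from `math.frexp`, and zero values are skipped because the abstract does not say how they are handled.

```python
import math

class CancellationMonitor:
    """Software model: count FP results whose exponent drops far below the inputs'."""

    def __init__(self, threshold: int):
        self.threshold = threshold   # exponent difference that counts as an event
        self.event_count = 0         # cancellation event count

    def observe(self, inputs, output: float) -> None:
        nonzero = [v for v in inputs if v != 0.0]
        # Assumption: zero inputs or a zero output are ignored in this sketch.
        if not nonzero or output == 0.0:
            return
        max_exp = max(math.frexp(v)[1] for v in nonzero)  # maximum input exponent
        out_exp = math.frexp(output)[1]                   # exponent of the output
        if max_exp - out_exp >= self.threshold:
            self.event_count += 1

monitor = CancellationMonitor(threshold=20)
a, b = 1.0, -(1.0 - 2.0**-40)
monitor.observe((a, b), a + b)        # result 2**-40: 40 binades below the inputs
monitor.observe((1.0, 2.0), 1.0 + 2.0)  # no cancellation
```

Only the first call raises the count, since subtracting two nearly equal values loses most of the leading bits.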
-
Publication No.: EP3391204A1
Publication date: 2018-10-24
Application No.: EP16879683.7
Filing date: 2016-11-18
Applicant: Intel Corporation
CPC classification: G06F9/30043, G06F9/30021, G06F9/30036, G06F9/30098, G06F9/3016, G06F9/345, G06F9/3455, G06F9/3824, G06F9/383, G06F9/3889, G06F12/0862, G06F12/0875, G06F2212/1016
Abstract: A processor includes a front end to decode an instruction and an allocator to assign the instruction to an execution unit to execute the instruction to gather scattered data from a memory into a destination register, and a cache with cache lines. The execution unit includes logic to compute the number of elements to gather and the address in memory for an element, and logic to fetch a cache line corresponding to the computed address into the cache, and logic to load the destination register from the cache.
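
As a rough illustration of the gather flow in this abstract (compute each element's address, bring its cache line into the cache, load the destination from the cache), here is a small Python model. Everything in it is an assumption made for the sketch: the 64-byte line size, the dictionary used as a cache, little-endian elements that do not straddle a line, and the function name `gather`.

```python
CACHE_LINE_BYTES = 64  # assumed line size, not taken from the abstract

def gather(memory: bytes, base: int, indices, elem_size: int, cache: dict):
    """Load one element per index through a toy cache keyed by line number."""
    dest = []
    for idx in indices:                      # number of elements = len(indices)
        addr = base + idx * elem_size        # address in memory for this element
        line_no = addr // CACHE_LINE_BYTES
        if line_no not in cache:             # fetch the line containing the address
            start = line_no * CACHE_LINE_BYTES
            cache[line_no] = memory[start:start + CACHE_LINE_BYTES]
        offset = addr % CACHE_LINE_BYTES
        raw = cache[line_no][offset:offset + elem_size]
        dest.append(int.from_bytes(raw, "little"))  # load destination from cache
    return dest

memory = bytes(range(256))
cache = {}
values = gather(memory, 0, [5, 70, 200], 1, cache)
```

After the call, `cache` holds the three lines that were touched, one per gathered element in this example.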
-
Publication No.: EP3394730A1
Publication date: 2018-10-31
Application No.: EP16879792.6
Filing date: 2016-11-23
Applicant: Intel Corporation
Inventors: ANDERSON, Cristina S.; CORNEA-HASEGAN, Marius A.; OULD-AHMED-VALL, Elmoustapha; VALENTINE, Robert; CORBAL, Jesus; ASTAFEV, Nikita; CHARNEY, Mark J.; GIRKAR, Milind B.; GRADSTEIN, Amit; RUBANOVICH, Simon; SPERBER, Zeev
IPC classification: G06F9/30
CPC classification: G06F7/485
Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value to the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value to the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.
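
Read as arithmetic, the ADD low steps return the part of a sum that rounding discards. The following Python sketch models that under assumptions: IEEE 754 doubles with round-to-nearest, finite inputs, and exact rational arithmetic standing in for the unrounded second sum. The name `add_low` is hypothetical and this is not the hardware design.

```python
from fractions import Fraction

def add_low(a: float, b: float) -> float:
    """Model of the abstract's steps: rounded (exact a+b minus rounded sum)."""
    exact = Fraction(a) + Fraction(b)        # unrounded sum
    add_value = float(exact)                 # rounded sum (the ADD value)
    # Difference between the unrounded sum and the ADD value, then rounded.
    return float(exact - Fraction(add_value))

tiny = 2.0**-60
lost = add_low(1.0, tiny)  # 1.0 + tiny rounds to 1.0, so tiny is what was lost
```

Adding the ADD value and the ADD low value back together in exact arithmetic reproduces the true sum in this example.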
-
Publication No.: EP3391198A1
Publication date: 2018-10-24
Application No.: EP16879681.1
Filing date: 2016-11-18
Applicant: Intel Corporation
IPC classification: G06F9/30
CPC classification: G06F9/30029, G06F7/485, G06F7/499, G06F7/4991, G06F9/30014, G06F9/30021, G06F9/3016, G06F9/3861, G06F11/00
Abstract: A processor includes a front end to decode an instruction and an allocator to assign the instruction to an execution unit to execute the instruction to compute a floating point result subject to a cancellation effect. The execution unit includes a threshold to control notification of the cancellation effect, logic to compute the maximum exponent from a source value, logic to compute the floating point exponent, logic to compute the detected cancellation value, and logic to compare the detected cancellation value to the threshold.
-
Publication No.: EP3391197A1
Publication date: 2018-10-24
Application No.: EP16879680.3
Filing date: 2016-11-18
Applicant: Intel Corporation
IPC classification: G06F9/30, G06F12/084
CPC classification: G06F9/30036, G06F9/30101, G06F9/3016, G06F12/084, G06F12/0855, G06F12/0862, G06F12/0875, G06F12/1027, G06F2212/452
Abstract: A processor includes a front end to decode an instruction, a temporary destination, and an allocator to assign the instruction to an execution unit to execute the instruction to get a selected column of data into a destination register. The execution unit includes an element counter, logic to determine an index from an index vector based on the element count, logic to compute an address of the data, a row to be loaded into the temporary destination, and a data processing unit to copy a portion of the temporary destination into the element of the destination register.
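
One way to picture the column-extraction flow in this abstract is the loop below. It is a guess at the data movement, written under assumptions the abstract does not confirm: memory is a flat row-major list, each index-vector entry selects a row, and the "portion" copied is the single element at the chosen column. The names `get_column`, `row_len`, and `column` are hypothetical.

```python
def get_column(memory, base: int, row_len: int, index_vector, column: int):
    """Copy one column element per selected row into a destination list."""
    dest = [0] * len(index_vector)
    # element_count plays the role of the element counter.
    for element_count, row_index in enumerate(index_vector):
        address = base + row_index * row_len       # address of the row's data
        temp = memory[address:address + row_len]   # row loaded into the temporary
        dest[element_count] = temp[column]         # copy the selected portion
    return dest

memory = list(range(12))  # 3 rows x 4 columns, stored row by row
column_values = get_column(memory, 0, 4, [0, 1, 2], 2)
```

With this layout, column 2 of the three rows holds the values 2, 6, and 10.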