-
Publication No.: US11095313B2
Publication date: 2021-08-17
Application No.: US16659241
Filing date: 2019-10-21
Abstract: Single error correction (“SEC”) code and triple error detection (“TED”) code are used to optimize bandwidth and resilience under multiple bit failures. One or more errors in data stored in duplicated registers are detected and corrected using the SEC code and TED code where simultaneous read operations are produced with two copies of data for each of the duplicated registers for a multi-port banked memory array. The SEC code and TED code may be included in each of the two data copies of the simultaneous read operations.
-
Publication No.: US11775257B2
Publication date: 2023-10-03
Application No.: US16840847
Filing date: 2020-04-06
Inventors: Silvia Melitta Mueller, Ankur Agrawal, Bruce Fleischer, Kailash Gopalakrishnan, Dongsoo Lee
IPC classification: G06F7/499
CPC classification: G06F7/49915, G06F7/49968
Abstract: Techniques for operating on and calculating binary floating-point numbers using an enhanced floating-point number format are presented. The enhanced format can comprise a single sign bit, six bits for the exponent, and nine bits for the fraction. Using six bits for the exponent can provide an enhanced exponent range that facilitates desirably fast convergence of computing-intensive algorithms and low error rates for computing-intensive applications. The enhanced format can employ a specified definition for the lowest binade that enables the lowest binade to be used for zero and normal numbers; and a specified definition for the highest binade that enables it to be structured to have one data point used for a merged Not-a-Number (NaN)/infinity symbol and remaining data points used for finite numbers. The signs of zero and merged NaN/infinity can be “don't care” terms. The enhanced format employs only one rounding mode, which is for rounding toward nearest up.
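Illustrative sketch (not part of the patent record): the bit layout described in the abstract above (1 sign bit, 6 exponent bits, 9 fraction bits, no subnormals in the lowest binade, one merged NaN/infinity code point in the highest binade) might be decoded roughly as follows. The function name `decode_enhanced_fp16`, the exponent bias of 31, and the choice of which bit patterns encode zero and NaN/infinity are assumptions made for this example; the abstract does not state them. Rounding is not modeled.

```python
import math


def decode_enhanced_fp16(bits: int, bias: int = 31) -> float:
    """Decode a 16-bit word laid out as sign(1) | exponent(6) | fraction(9).

    Assumptions (not taken from the abstract): bias = 31, zero is the
    all-zero exponent/fraction pattern, and the merged NaN/infinity symbol
    is the all-ones exponent/fraction pattern.
    """
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 9) & 0x3F
    fraction = bits & 0x1FF

    # Lowest binade: a single code point is zero; its sign is "don't care".
    if exponent == 0 and fraction == 0:
        return 0.0

    # Highest binade: a single code point is the merged NaN/infinity symbol.
    if exponent == 0x3F and fraction == 0x1FF:
        return math.nan

    # Every other code point is a normal number with an implicit leading 1.
    magnitude = (1 + fraction / 512) * 2.0 ** (exponent - bias)
    return -magnitude if sign else magnitude
```

Under these assumptions, the remaining code points of the lowest binade decode as ordinary normal numbers rather than subnormals, which is the property the abstract highlights.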
-
Publication No.: US11314482B2
Publication date: 2022-04-26
Application No.: US16684081
Filing date: 2019-11-14
Abstract: Methods and systems for division operation are described. A processor can initialize an estimated quotient between the dividend and the divisor separately from a floating-point unit (FPU) pipeline. The processor can implement the FPU pipeline to execute a refinement process that can include at least a first iteration of operations and a second iteration of operations. The refinement process can include, in the first iteration of operations, generating a first unnormalized floating-point value using the initialized estimated quotient. The refinement process can include, in the second iteration of operations, generating a second unnormalized floating-point value using the first unnormalized floating-point value. The processor can determine a final quotient based on the second unnormalized floating-point value.
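Illustrative sketch (not part of the patent record): the general idea of seeding a quotient estimate and then refining it in two iterations can be shown with a textbook residual-correction scheme. This is a swapped-in, generic technique, not the patented method: the function name `divide_refined`, the 8-bit reciprocal seed, and the update rule `q += (a - b*q) * y` are all assumptions, and the unnormalized intermediate values mentioned in the abstract are not modeled.

```python
import math


def divide_refined(dividend: float, divisor: float) -> float:
    """Approximate dividend / divisor with a coarse seed plus two refinements."""
    # Seed: a reciprocal of the divisor accurate to about 8 fractional bits.
    # The division below only stands in for a small lookup table.
    mantissa, exponent = math.frexp(divisor)       # divisor = mantissa * 2**exponent
    coarse_recip = round((1.0 / mantissa) * 256) / 256
    recip = math.ldexp(coarse_recip, -exponent)    # recip is roughly 1 / divisor

    # Initial quotient estimate, computed before the refinement loop.
    quotient = dividend * recip

    # Two refinement iterations: correct the estimate by the scaled residual.
    for _ in range(2):
        residual = dividend - divisor * quotient
        quotient = quotient + residual * recip
    return quotient
```

With a seed whose relative error is about 2^-9, each pass multiplies the remaining relative error by roughly that factor, so two passes leave an error on the order of 2^-27 in this sketch. A zero divisor is not handled.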
-
Publication No.: US20190179639A1
Publication date: 2019-06-13
Application No.: US15834403
Filing date: 2017-12-07
Abstract: Aspects of the invention include receiving, by a processor, a plurality of instructions at an instruction pipeline. The processor can further determine an operand bit field size for each of the received plurality of instructions. The processor can further compare the operand bit field size of at least a subset of the received instructions to a predetermined threshold. The processor can further fuse at least two of the received instructions that have an operand bit field size that meets the predetermined threshold. The processor can further perform an execution stage within the instruction pipeline to execute the received instructions, including the fused instructions.
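Illustrative sketch (not part of the patent record): a toy model of the fusion decision described in the abstract above. The `Instr` type, the function name, the interpretation of "meets the threshold" as "operand size at or below the threshold", and the restriction to adjacent pairs are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    opcode: str
    operand_bits: int  # hypothetical operand bit field size


def fuse_small_operand_instrs(
    instrs: List[Instr], threshold_bits: int
) -> List[Tuple[Instr, ...]]:
    """Group adjacent instructions whose operand sizes both meet the threshold.

    Returns a list of issue groups: a 2-tuple is a fused pair, a 1-tuple is
    an instruction that executes on its own.
    """
    groups: List[Tuple[Instr, ...]] = []
    i = 0
    while i < len(instrs):
        current = instrs[i]
        following = instrs[i + 1] if i + 1 < len(instrs) else None
        if (
            following is not None
            and current.operand_bits <= threshold_bits
            and following.operand_bits <= threshold_bits
        ):
            groups.append((current, following))  # fuse the pair
            i += 2
        else:
            groups.append((current,))            # leave unfused
            i += 1
    return groups
```

For instance, with a 16-bit threshold, two adjacent 8-bit operations would be grouped into one fused entry while a neighbouring 32-bit operation stays on its own.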
-
Publication No.: US11281745B2
Publication date: 2022-03-22
Application No.: US16542447
Filing date: 2019-08-16
Abstract: Methods and systems of matrix multiplication are described. In an example, a processor can multiply a first entry of a first vector of a first data array with a second vector of a second data array to generate a third vector of a third data array. The processor can store the third vector of the third data array in the second register file. The processor can multiply a second entry of the first vector with the second vector to generate a fourth vector of the third data array. The processor can store the fourth vector of the third data array in the second register file. The processor can combine vectors of the third data array that are stored in the second register file to produce the third data array.
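Illustrative sketch (not part of the patent record): the scalar-times-vector steps in the abstract above resemble building a matrix product from outer products, where each entry of a column of the first array scales a row of the second array to give one row of a partial result. The function names and the accumulation over the inner dimension are assumptions for this example; register files are not modeled.

```python
from typing import List

Matrix = List[List[float]]


def scale_rows(column: List[float], row: List[float]) -> Matrix:
    """Multiply each entry of `column` by the whole `row` vector.

    Entry i of `column` yields row i of the partial result (an outer product).
    """
    return [[entry * value for value in row] for entry in column]


def matmul_by_outer_products(a: Matrix, b: Matrix) -> Matrix:
    """Compute a @ b by summing one outer product per inner index k."""
    n_rows, n_inner, n_cols = len(a), len(b), len(b[0])
    result = [[0] * n_cols for _ in range(n_rows)]
    for k in range(n_inner):
        column_k = [a[i][k] for i in range(n_rows)]   # a vector of the first array
        partial = scale_rows(column_k, b[k])          # vectors of the result array
        for i in range(n_rows):
            for j in range(n_cols):
                result[i][j] += partial[i][j]         # combine stored vectors
    return result
```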
-
Publication No.: US11223703B2
Publication date: 2022-01-11
Application No.: US16358356
Filing date: 2019-03-19
Abstract: Various embodiments are provided for implementing instruction initialization in a dataflow architecture in a computing environment. A data packet may be transmitted from a selected node to one or more of a plurality of nodes using one or more existing data paths in an initialization network. A determination operation is performed to determine whether one or more of a plurality of nodes is a target node intended for the data packet. Those of the plurality of nodes determined to be a target node initialize one or more components of the target node using the data packet. The data packet may be forwarded by each of the one or more of a plurality of nodes to a subsequent node in the initialization network.
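Illustrative sketch (not part of the patent record): the check-then-forward behaviour in the abstract above can be modeled with nodes linked in a simple chain. The `Packet` and `Node` types, the chain topology, and the use of a dictionary as the node's configurable state are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Packet:
    targets: FrozenSet[int]   # ids of the nodes this packet is intended for
    config: Dict[str, int]    # hypothetical initialization payload


@dataclass
class Node:
    node_id: int
    next_node: Optional["Node"] = None
    state: Dict[str, int] = field(default_factory=dict)

    def receive(self, packet: Packet) -> None:
        # Determine whether this node is a target of the packet.
        if self.node_id in packet.targets:
            self.state.update(packet.config)   # initialize local components
        # Forward the packet to the subsequent node, target or not.
        if self.next_node is not None:
            self.next_node.receive(packet)


def build_chain(count: int) -> List[Node]:
    """Create `count` nodes linked head to tail along one data path."""
    nodes = [Node(node_id=i) for i in range(count)]
    for upstream, downstream in zip(nodes, nodes[1:]):
        upstream.next_node = downstream
    return nodes
```

A packet injected at the head of a four-node chain with targets {1, 3} would configure only nodes 1 and 3 while still passing through every node.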
-
Publication No.: US11182127B2
Publication date: 2021-11-23
Application No.: US16363611
Filing date: 2019-03-25
Abstract: Techniques facilitating binary floating-point multiply and scale operation for compute-intensive numerical applications and apparatuses are provided. An embodiment relates to a system that can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a receiver component that receives an instruction to perform a multiply and scale operation of the first floating point operand value, the second floating point operand value, and the integer operand value, wherein the multiplication component obtains the floating-point product in response to the instruction to perform the multiply and scale operation. The multiplication can be performed as a single instruction.
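Illustrative sketch (not part of the patent record): assuming that "multiply and scale" means multiplying two floating-point operands and scaling the product by a power of two given by the integer operand, the arithmetic can be written with the standard library's `math.ldexp`. The function name and this reading of the semantics are assumptions; the sketch also rounds the product before scaling, which a single fused instruction would not necessarily do.

```python
import math


def multiply_and_scale(a: float, b: float, n: int) -> float:
    """Return (a * b) * 2**n, the assumed meaning of multiply-and-scale."""
    product = a * b                  # floating-point product of the two operands
    return math.ldexp(product, n)    # scale by 2**n using the integer operand
```

For example, `multiply_and_scale(1.5, 2.0, 3)` evaluates 3.0 * 2^3.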
-
Publication No.: US20200310755A1
Publication date: 2020-10-01
Application No.: US16363611
Filing date: 2019-03-25
Abstract: Techniques facilitating binary floating-point multiply and scale operation for compute-intensive numerical applications and apparatuses are provided. An embodiment relates to a system that can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a receiver component that receives an instruction to perform a multiply and scale operation of the first floating point operand value, the second floating point operand value, and the integer operand value, wherein the multiplication component obtains the floating-point product in response to the instruction to perform the multiply and scale operation. The multiplication can be performed as a single instruction.
-
Publication No.: US10656913B2
Publication date: 2020-05-19
Application No.: US16000435
Filing date: 2018-06-05
Inventors: Silvia Melitta Mueller, Ankur Agrawal, Bruce Fleischer, Kailash Gopalakrishnan, Dongsoo Lee
Abstract: Techniques for operating on and calculating binary floating-point numbers using an enhanced floating-point number format are presented. The enhanced format can comprise a single sign bit, six bits for the exponent, and nine bits for the fraction. Using six bits for the exponent can provide an enhanced exponent range that facilitates desirably fast convergence of computing-intensive algorithms and low error rates for computing-intensive applications. The enhanced format can employ a specified definition for the lowest binade that enables the lowest binade to be used for zero and normal numbers; and a specified definition for the highest binade that enables it to be structured to have one data point used for a merged Not-a-Number (NaN)/infinity symbol and remaining data points used for finite numbers. The signs of zero and merged NaN/infinity can be “don't care” terms. The enhanced format employs only one rounding mode, which is for rounding toward nearest up.
-
Publication No.: US11669489B2
Publication date: 2023-06-06
Application No.: US17490830
Filing date: 2021-09-30
Inventors: Swagath Venkataramani, Sanchari Sen, Vijayalakshmi Srinivasan, Ankur Agrawal, Sunil K Shukla, Bruce Fleischer, Kailash Gopalakrishnan
CPC classification: G06F15/8046, G06F7/50, G06F7/523, G06F7/5443, G06F9/3001, G06F9/30069
Abstract: A systolic array can be configured to skip distributed operands that have zero-values, resulting in improved resource efficiency. A skip module is introduced to receive operands from memory, identify whether they have a zero value or not, and, if they are nonzero, generate an operand vector including an index before sending the operand vector to a processing element.
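Illustrative sketch (not part of the patent record): the zero-skipping step in the abstract above can be modeled as a filter that pairs each nonzero operand with its original index, so a downstream consumer still knows which weight it belongs to. The function names and the dot-product consumer standing in for a processing element are assumptions for this example.

```python
from typing import List, Sequence, Tuple


def skip_zero_operands(operands: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (index, value) pairs for the nonzero operands only."""
    return [(index, value) for index, value in enumerate(operands) if value != 0]


def dot_with_skipping(weights: Sequence[float], activations: Sequence[float]) -> float:
    """Dot product that only performs multiplies for nonzero activations."""
    return sum(
        value * weights[index]
        for index, value in skip_zero_operands(activations)
    )
```

The index carried with each operand is what lets the consumer select the matching weight even though zero-valued positions were never sent.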