Methods for using a multiplier circuit to support multiple sub-multiplications using bit correction and extension

    公开(公告)号:US10732932B2

    公开(公告)日:2020-08-04

    申请号:US16231170

    申请日:2018-12-21

    Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit such as an 18×18 multiplier circuit may be used to support two or more smaller multiplication operations such as two 8×8 integer multiplications or two 9×9 integer multiplications. To implement the two 8×8 or 9×9 unsigned/signed multiplications, the 18×18 multiplier may be configured to support two 8×8 multiplications with one shared operand, two 6×6 multiplications without any shared operand, or two 7×7 multiplications without any shared operand. Any potential overlap of partial product terms may be subtracted out using correction logic. The multiplication of the remaining most significant bits can be computed using associated multiplier extension logic and appended to the other least significant bits using merging logic.

    METHODS FOR USING A MULTIPLIER TO SUPPORT MULTIPLE SUB-MULTIPLICATION OPERATIONS

    公开(公告)号:US20190042198A1

    公开(公告)日:2019-02-07

    申请号:US16144999

    申请日:2018-09-27

    Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit (e.g., an 18×18 or 18×19 multiplier circuit) may be used to support two or more smaller multiplication operations sharing one or two sets of multiplier operands, a complex multiplication, and a sum of two multiplications. If the multiplier products overflow and interfere with one another, correction operations can be performed. Partial products from two or more larger multiplier circuits can be used to combine decomposed partial products. A large multiplier circuit can also be used to support two floating-point mantissa multipliers.

    Logic circuits with simultaneous dual function capability

    公开(公告)号:US10790829B2

    公开(公告)日:2020-09-29

    申请号:US16144558

    申请日:2018-09-27

    Abstract: Integrated circuits with programmable logic regions are provided. The programmable logic regions may be organized into smaller logic units sometimes referred to as a logic element. A logic element may include four lookup tables coupled to an adder carry chain. At least some of the lookup tables are configured to output combinatorial outputs, whereas the adder carry chain are used to output sum outputs. Both the combinatorial outputs and the sum outputs may be used simultaneously to support a multiplication operation, three or more logic operations, or arithmetic and combinatorial operations in parallel.

    LOGIC CIRCUITS WITH AUGMENTED ARITHMETIC DENSITIES

    公开(公告)号:US20190288688A1

    公开(公告)日:2019-09-19

    申请号:US16434088

    申请日:2019-06-06

    Abstract: Integrated circuits with programmable logic regions are provided. The programmable logic regions may be organized into smaller logic units sometimes referred to as a logic cell. A logic cell may include four 4-input lookup tables (LUTs) coupled to an adder carry chain. Each of the four 4-input LUTs may include two 3-input LUTs and a selector multiplexer. The carry chain may include at three or more full adder circuits. The outputs of the 3-input LUTs may be directly connected to inputs of the full adder circuits in the carry chain. By providing at least the same or more number of full adder circuits as the total number of 4-input LUTs in the logic cell, the arithmetic density of the logic is enhanced.

    RAM-based shift register with embedded addressing

    公开(公告)号:US10102892B1

    公开(公告)日:2018-10-16

    申请号:US15611070

    申请日:2017-06-01

    Inventor: Sergey Gribok

    Abstract: Unlike prior RAM-based shift register circuits, the presently-disclosed shift register circuit does not require control circuits to generate write and read address signals. Instead, the presently-disclosed shift register circuit utilizes a portion of RAM to store and provide the write and read address signals. The write and read addresses are output from the data output port of the RAM, and received by the write and read address ports of the RAM. Advantageously, the presently-disclosed shift register circuit requires less area to implement because the need for write and read control circuits is eliminated.

    Method and apparatus for performing multiplier regularization

    公开(公告)号:US11436399B2

    公开(公告)日:2022-09-06

    申请号:US16218179

    申请日:2018-12-12

    Abstract: A method for implementing a multiplier on a programmable logic device (PLD) is disclosed. Partial product bits of the multiplier are identified and how the partial product bits are to be summed to generate a final product from a multiplier and multiplicand are determined. Chains of PLD cells and cells in the chains of PLD cells for generating and summing the partial product bits are assigned. It is determined whether a bit in an assigned cell in an assigned chain of PLD cells is under-utilized. In response to determining that a bit is under-utilized, the assigning of the chains of PLD cells and cells for generating and summing the partial product bits are changed to improve an overall utilization of the chains of PLD cells and cells in the chains of PLD cells.

    Machine learning training architecture for programmable devices

    公开(公告)号:US11210063B2

    公开(公告)日:2021-12-28

    申请号:US16585857

    申请日:2019-09-27

    Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.

    High Performance Systems And Methods For Modular Multiplication

    公开(公告)号:US20230026331A1

    公开(公告)日:2023-01-26

    申请号:US17952085

    申请日:2022-09-23

    Abstract: A circuit system for performing modular reduction of a modular multiplication includes multiplier circuits that receive a first subset of coefficients that are generated by summing partial products of a multiplication operation that is part of the modular multiplication. The multiplier circuits multiply the coefficients in the first subset by constants that equal remainders of divisions to generate products. Adder circuits add a second subset of the coefficients and segments of bits of the products that are aligned with respective ones of the second subset of the coefficients to generate sums.

Patent Agency Ranking