Power saving floating point Multiplier-Accumulator with a high precision accumulation detection mode

    Publication No.: US12079593B2

    Publication date: 2024-09-03

    Application No.: US17352373

    Filing date: 2021-06-21

    Inventor: Dylan Finch

    Abstract: A floating point multiplier-accumulator (MAC) multiplies and accumulates N pairs of floating point values using N MAC processors operating simultaneously, each pair of values comprising an input value and a coefficient value to be multiplied and accumulated. The pairs of floating point values are simultaneously processed by the plurality of MAC processors, each of which outputs a signed integer form fraction with a first bitwidth and a second bitwidth, along with a maximum exponent. The first bitwidth signed integer form fractions are summed by an adder tree using the first bitwidth to form a first sum, and when an excess leading 0 condition is detected, a second adder tree operative on the second bitwidth integer form fractions forms a second sum. The first sum or second sum, along with the maximum exponent, is converted into a floating point result.
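The two-bitwidth accumulation described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function names, the 12-bit and 24-bit widths, and the `zero_threshold` used to decide what counts as "excess" leading zeros are all hypothetical, and the products are formed in ordinary floating point before being aligned to the maximum exponent.

```python
import math

def to_int_fraction(value, max_exp, bits):
    """Align value to max_exp and truncate it to a signed integer with `bits` fraction bits."""
    return int(math.ldexp(value, bits - max_exp))

def mac(inputs, coeffs, narrow_bits=12, wide_bits=24, zero_threshold=6):
    """Multiply-accumulate with a narrow sum and a wide fallback (illustrative widths)."""
    products = [x * w for x, w in zip(inputs, coeffs)]
    nonzero = [p for p in products if p != 0.0]
    if not nonzero:
        return 0.0, "narrow"
    # Maximum exponent shared by all aligned fractions.
    max_exp = max(math.frexp(p)[1] for p in nonzero)
    narrow_sum = sum(to_int_fraction(p, max_exp, narrow_bits) for p in products)
    # Many leading zeros in the narrow sum indicate cancellation and lost precision.
    leading_zeros = narrow_bits - abs(narrow_sum).bit_length()
    if leading_zeros >= zero_threshold:
        wide_sum = sum(to_int_fraction(p, max_exp, wide_bits) for p in products)
        return math.ldexp(wide_sum, max_exp - wide_bits), "wide"
    return math.ldexp(narrow_sum, max_exp - narrow_bits), "narrow"
```

In this model, `mac([1.0, -1.0, 1.0], [1.0, 1.0, 2**-10 + 2**-20])` cancels the two large terms, so the narrow sum keeps only a couple of significant bits and the wide path is used to recover the small residual.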

    Look up table (LUT) based chiplet to chiplet secure communication

    Publication No.: US12041159B2

    Publication date: 2024-07-16

    Application No.: US17683087

    Filing date: 2022-02-28

    Applicant: Ceremorphic, Inc.

    IPC classification: H04L9/06 H04L9/14

    CPC classification: H04L9/0618 H04L9/14

    Abstract: A cryptographic method includes (1) with a first chiplet, parsing a message into one or more message blocks; (2) dynamically generating a first target value that is associated with a first key; (3) dynamically generating a second target value that is associated with a second key; (4) encrypting at least one message block of the one or more message blocks to generate ciphertext, the encryption being performed with at least one operation that includes at least one XOR operation, the at least one XOR operation performed at least in part with the first target value and with at least the second target value, the first target value and the second target value being accessed via the first and second keys, respectively; and (5) with at least one processing device associated with the first chiplet, transmitting the ciphertext to a second chiplet.
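The five steps in this abstract can be sketched as a toy model. Everything below is an assumption made for illustration: the 4-byte block size, the seeded `make_lut` generator standing in for "dynamically generating" target values, the zero padding, and the use of integer keys as table indices are hypothetical, and the scheme as written here offers no real security.

```python
import random

BLOCK = 4  # illustrative block size in bytes

def make_lut(seed, entries=16):
    """Generate a lookup table of target values (hypothetical stand-in for dynamic generation)."""
    rng = random.Random(seed)
    return [bytes(rng.randrange(256) for _ in range(BLOCK)) for _ in range(entries)]

def parse_blocks(message):
    """Split a message into fixed-size blocks, zero-padding the last one."""
    padded = message + b"\x00" * (-len(message) % BLOCK)
    return [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def crypt_block(block, lut1, key1, lut2, key2):
    """XOR a block with the two target values that the keys select from the tables."""
    return xor_bytes(xor_bytes(block, lut1[key1]), lut2[key2])

# Both chiplets are assumed to hold the same tables and keys.
lut_a, lut_b = make_lut(seed=1), make_lut(seed=2)
blocks = parse_blocks(b"hello!")
cipher = [crypt_block(b, lut_a, 3, lut_b, 7) for b in blocks]
plain = [crypt_block(c, lut_a, 3, lut_b, 7) for c in cipher]
```

Because XOR is its own inverse, applying `crypt_block` again with the same table entries on the receiving side recovers the original blocks.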

    Chopper stabilized analog multiplier accumulator with binary weighted charge transfer capacitors

    Publication No.: US12032926B2

    Publication date: 2024-07-09

    Application No.: US17334887

    Filing date: 2021-05-31

    Abstract: An architecture for a chopper stabilized multiplier-accumulator (MAC) uses a chop clock and common Unit Element (UE), the MAC formed as a plurality of MAC UEs receiving X and W values and a sign bit exclusive ORed with the chop clock, a plurality of Bias UEs receiving an E value and a sign bit exclusive ORed with the chop clock, and a plurality of Analog to Digital Conversion (ADC) UEs which collectively perform a scalable MAC operation and generate a binary result. Each MAC UE, Bias UE and ADC UE comprises groups of NAND gates with complementary outputs arranged in NAND-groups, each NAND gate coupled to a differential charge transfer bus through a binary weighted charge transfer capacitor. The analog charge transfer bus is coupled to groups of ADC UEs with an ADC controller which enables and disables the ADC UEs using successive approximation to determine the accumulated multiplication result.
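The role of XORing the sign bit with the chop clock can be illustrated numerically. The sketch below is a behavioral assumption, not the circuit in the abstract: it models the analog path as an ideal sum plus a fixed `offset` error, evaluates the sum once per chop phase, and takes half the difference, which is the general principle of chopper stabilization. The function name and the offset model are hypothetical.

```python
def chopped_mac(xs, ws, offset):
    """Accumulate x*w in two chop phases so that a constant analog offset cancels."""
    def analog_sum(chop):
        total = 0.0
        for x, w in zip(xs, ws):
            sign = 1 if (x < 0) != (w < 0) else 0  # sign bit of the product
            flipped = sign ^ chop                  # sign bit XORed with the chop clock
            total += (-1 if flipped else 1) * abs(x) * abs(w)
        return total + offset                      # offset does not follow the chop clock
    # The signal inverts between phases while the offset does not.
    return (analog_sum(0) - analog_sum(1)) / 2
```

For example, `chopped_mac([1, -2, 3], [4, 5, -6], offset=0.75)` returns the ideal sum of products regardless of the offset value.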

    System and method for nanomagnet based logic device

    Publication No.: US11962298B1

    Publication date: 2024-04-16

    Application No.: US17829091

    Filing date: 2022-05-31

    Applicant: Ceremorphic, Inc.

    Abstract: A system and method for a logic device is disclosed. A first substrate and a second substrate are provided, which are spaced apart from each other and manifest a spin orbit torque effect. A nanomagnet is disposed over the first substrate and the second substrate. A first charge current is passed through the first substrate and a second charge current is passed through the second substrate. A direction of flow of the first charge current and the second charge current defines an input value of either a first value or a second value. A spin in the nanomagnet is selectively oriented based on the direction of flow of the first charge current and the second charge current. The spin in the nanomagnet is selectively read to determine an output value as the first value or the second value. The logic device is configured as XOR logic.

    System for a decision feedback equalizer

    Publication No.: US11936504B1

    Publication date: 2024-03-19

    Application No.: US17829070

    Filing date: 2022-05-31

    Applicant: Ceremorphic, Inc.

    IPC classification: H04L25/03

    Abstract: A decision feedback equalizer includes a summer, a slicer, and a feedback circuit. The summer is configured to receive an input signal and a correction signal from the feedback circuit and generate a summer output signal. The slicer includes a first slicer and a second slicer, both of which are configured to receive the summer output signal as an input and to output a slicer output signal. The feedback circuit is configured to receive the slicer output signal and, based on the slicer output signal, generate the correction signal. The input signal is received at a first clock rate. The first slicer and the second slicer sample the input signal at a second clock rate that is about half the first clock rate.
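The summer, slicer, and feedback loop described here can be modeled behaviorally. The following is a minimal sketch, assuming a single feedback tap, binary symbols of +1 and -1, and two slicers that alternate on even and odd symbols so that each runs at half the symbol rate. The interleaving scheme, the tap value, and the function name are illustrative assumptions rather than details from the abstract.

```python
def dfe_half_rate(samples, tap):
    """One-tap decision feedback equalizer with two interleaved slicers."""
    decisions = []
    slicer_used = []
    prev = 0  # no decision is available before the first symbol
    for n, x in enumerate(samples):
        summed = x - tap * prev           # summer: input plus correction signal
        decision = 1 if summed >= 0 else -1
        slicer_used.append(n % 2)         # slicer 0 takes even symbols, slicer 1 odd
        decisions.append(decision)
        prev = decision                   # feedback circuit input
    return decisions, slicer_used

# Channel model (assumed): each symbol leaks 1.2x into the next sample.
symbols = [1, -1, -1, 1, 1, -1]
received = [s + 1.2 * (symbols[i - 1] if i else 0) for i, s in enumerate(symbols)]
decisions, slicer_used = dfe_half_rate(received, tap=1.2)
naive = [1 if r >= 0 else -1 for r in received]
```

With this much inter-symbol interference, slicing the received samples directly (`naive`) misreads several symbols, while the feedback correction recovers the transmitted sequence.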

    Fast recovery for dual core lock step

    Publication No.: US11928475B2

    Publication date: 2024-03-12

    Application No.: US17519588

    Filing date: 2021-11-05

    Applicant: Ceremorphic, Inc.

    Inventor: Heonchul Park

    IPC classification: G06F9/30 G06F9/38 G06F11/16

    Abstract: An exemplary fault-tolerant computing system comprises a secondary processor configured to execute in delayed lock step with a primary processor from a common program store, comparators in the store data and writeback paths to detect a fault based on comparing primary and secondary processor states, and a writeback path delay permitting aborting execution when a fault is detected, before writeback of invalid data. The secondary processor execution and the primary processor store data and writeback may be delayed a predetermined number of cycles, permitting fault detection before writing invalid data. Store data and writeback paths may include triple module redundancy configured to pass only majority data through the store data and writeback path delay stages. Some implementations may forward data from the store data path delay stages to the writeback stage or memory if the load data address matches the address of data in a store data path delay stage.
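The delayed lock step idea can be sketched as a small cycle-level model. This is a simplified illustration under assumptions: each "instruction" is a single value, each processor is a Python callable, the delay is two cycles, and only the writeback comparison is modeled (no triple redundancy or store forwarding). The name `run_lockstep` and its return convention are hypothetical.

```python
from collections import deque

def run_lockstep(program, primary, secondary, delay=2):
    """Commit primary results only after the delayed secondary reproduces them."""
    pending = deque()  # writeback delay stages holding primary results
    committed = []
    for cycle in range(len(program) + delay):
        if cycle < len(program):
            pending.append(primary(program[cycle]))
        if cycle >= delay:
            check = secondary(program[cycle - delay])  # secondary runs `delay` cycles behind
            held = pending.popleft()
            if held != check:
                # Fault detected before writeback: abort with the failing index.
                return committed, cycle - delay
            committed.append(held)
    return committed, None

def square(x):
    return x * x

def faulty_square(x):
    return -1 if x == 3 else x * x  # injected fault on one instruction
```

Calling `run_lockstep([1, 2, 3, 4], faulty_square, square)` stops at the faulty instruction with only the earlier, verified results committed, which is the property the writeback delay is meant to provide.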

    Unit element for asynchronous analog multiplier accumulator

    Publication No.: US11922240B2

    Publication date: 2024-03-05

    Application No.: US17139226

    Filing date: 2020-12-31

    Applicant: Ceremorphic, Inc.

    Abstract: A multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to unit elements comprising groups of AND gates coupled to charge transfer lines through a capacitor Cu. The number of bits in the B input is the number of AND-groups, and the number of bits in A is the number of AND gates in an AND-group. Each unit element receives one bit of the B input applied to all of the AND gates of the unit element, and the bits of A are applied to the associated AND gate inputs of each unit element. The AND gates are coupled to charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary weighted charge summing capacitors which sum and scale the charges from the charge transfer lines, with the resulting charge coupled to an analog to digital converter which forms the dot product output. The charge transfer lines may span multiple unit elements.
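The arithmetic behind the AND-gate unit elements can be checked with a digital model. The sketch below is an interpretation, not the analog circuit: it assumes unsigned inputs, treats each AND gate output as one unit of charge, assumes one charge transfer line per bit weight `i + j` shared across all input pairs, and applies the binary weighting only when the lines are summed. The function name and the 4-bit default widths are hypothetical.

```python
def dot_product(a_values, b_values, a_bits=4, b_bits=4):
    """Dot product built from single-bit ANDs accumulated on weighted lines."""
    # One line per combined bit weight; each holds a count of unit charges.
    lines = [0] * (a_bits + b_bits - 1)
    for a, b in zip(a_values, b_values):
        for j in range(b_bits):          # one AND-group per bit of B
            b_j = (b >> j) & 1
            for i in range(a_bits):      # one AND gate per bit of A
                a_i = (a >> i) & 1
                lines[i + j] += a_i & b_j
    # Binary weighted summing: the line for weight k contributes count * 2**k.
    return sum(count << k for k, count in enumerate(lines))
```

For inputs that fit in the stated bit widths, the result equals the ordinary integer dot product, for example `dot_product([3, 5], [2, 7])` gives `3*2 + 5*7`.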

    Cascade multiplier using unit element analog multiplier-accumulator

    Publication No.: US11886835B2

    Publication date: 2024-01-30

    Application No.: US17139242

    Filing date: 2020-12-31

    IPC classification: G06F7/544 H03M1/12

    CPC classification: G06F7/5443 H03M1/12

    Abstract: A multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to unit elements comprising groups of AND gates coupled to charge transfer lines through a capacitor Cu. The number of bits in the B input is the number of AND-groups, and the number of bits in A is the number of AND gates in an AND-group. Each unit element receives one bit of the B input applied to all of the AND gates of the unit element, and the bits of A are applied to the associated AND gate inputs of each unit element. The AND gates are coupled to charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary weighted charge summing capacitors which sum and scale the charges from the charge transfer lines, with the resulting charge coupled to an analog to digital converter which forms the dot product output. The charge transfer lines may span multiple unit elements.