Power saving floating point multiplier-accumulator with precision-aware accumulation

    Publication No.: US12106069B2

    Publication date: 2024-10-01

    Application No.: US17352370

    Filing date: 2021-06-21

    Applicant: Ceremorphic, Inc.

    Inventor: Dylan Finch

    IPC classification: G06F7/544 G06F7/487

    Abstract: A floating point multiplier-accumulator (MAC) multiplies and accumulates N pairs of floating point values using N MAC processors operating simultaneously, each pair of values comprising an input value and a coefficient value to be multiplied and accumulated. The pairs of floating point values are simultaneously processed by the plurality of MAC processors, each of which outputs a signed integer form fraction and a maximum exponent. A range estimator forms a possible range of values from the exponent differences and determines an adder precision. The integer form fractions are summed using the adder precision, a sign bit is extracted, and a floating point value is output. Each MAC processor provides its integer form fraction with a precision determined by the MAC processor's exponent difference.
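
The alignment-to-maximum-exponent idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented circuit: the function name `mac_aligned` and the `frac_bits` parameter are hypothetical, products are formed with ordinary float multiplication, and the range estimator and per-processor precision selection are not modelled. Each product is converted to a signed integer fraction, shifted by its exponent difference from the maximum exponent, and summed as an integer before the sign and floating point value are recovered.

```python
import math

def mac_aligned(inputs, coeffs, frac_bits=24):
    """Accumulate products as integer fractions aligned to the maximum exponent.

    frac_bits is an assumed fixed adder precision (hypothetical parameter).
    """
    products = [x * w for x, w in zip(inputs, coeffs)]
    # frexp gives (m, e) with p == m * 2**e and 0.5 <= |m| < 1 for nonzero p.
    parts = [math.frexp(p) for p in products if p != 0.0]
    if not parts:
        return 0.0
    max_exp = max(e for _, e in parts)
    total = 0
    for m, e in parts:
        shift = max_exp - e  # exponent difference for this product
        # Signed integer fraction, scaled down by the exponent difference;
        # int() truncates bits that fall below the adder precision.
        total += int(math.ldexp(m, frac_bits - shift))
    # Extract the sign, then rebuild a floating point value from the magnitude.
    sign = -1.0 if total < 0 else 1.0
    return sign * math.ldexp(abs(total), max_exp - frac_bits)
```

With small `frac_bits`, products whose exponent difference exceeds the available precision contribute nothing to the sum, which is the situation the abstract's precision-aware accumulation is concerned with.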

    System and method for skyrmion based logic device

    Publication No.: US12081213B1

    Publication date: 2024-09-03

    Application No.: US17829127

    Filing date: 2022-05-31

    Applicant: Ceremorphic, Inc.

    IPC classification: H03K19/00 H03K19/08

    CPC classification: H03K19/0008 H03K19/08

    Abstract: A system and method for a logic device is disclosed. A substrate is provided. Three nanotracks are disposed over the substrate and intersect in a central portion. Two nanotracks are disposed about a first axis and one nanotrack is disposed about a second axis perpendicular to the first axis. A ground pad is disposed in the central portion. The nanotrack along the second axis extends beyond the central portion to define an output portion. An input value is set by nucleating a skyrmion about a first end of the nanotracks. Presence of the skyrmion indicates a first value and absence indicates a second value. A charge current is passed in the substrate along the first axis and the second axis to move the nucleated skyrmions towards the central portion. Presence of the skyrmion is sensed in the output portion and indicates a first value when the skyrmion is present.

    Multiplier-accumulator unit element with binary weighted charge transfer capacitors

    Publication No.: US12014152B2

    Publication date: 2024-06-18

    Application No.: US17334816

    Filing date: 2021-05-31

    Abstract: A Unit Element (UE) has a digital X input and a digital W input, and comprises groups of NAND gates generating complementary outputs which are coupled to a differential charge transfer bus comprising a positive charge transfer line and a negative charge transfer line. The number of bits in the X input determines the number of NAND gates in a NAND-group and the number of bits in the W input determines the number of NAND-groups. Each NAND-group receives one bit of the W input, applied to all of the NAND gates of the NAND-group, and the bits of the X input are applied to the associated NAND gate inputs of each unit element. The NAND gate outputs are coupled through binary weighted charge transfer capacitors to the positive charge transfer line and the negative charge transfer line.
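
As a rough behavioral illustration only, the sketch below models an idealized unit element in Python. The abstract does not give the capacitor weights or say which gate output drives which line, so the weight `2**(i + j)`, the assignment of outputs to lines, and the name `unit_element` are all assumptions. Under those assumptions the charge on one line is proportional to the unsigned product X·W, because X·W equals the sum of `x_i AND w_j` weighted by `2**(i + j)`.

```python
def unit_element(x, w, x_bits=4, w_bits=4):
    """Idealized charge totals (in unit-capacitor counts) for unsigned X and W."""
    pos = neg = 0
    for j in range(w_bits):            # one NAND-group per bit of W
        wj = (w >> j) & 1
        for i in range(x_bits):        # one NAND gate per bit of X
            xi = (x >> i) & 1
            nand = 1 - (xi & wj)       # NAND gate output
            weight = 1 << (i + j)      # assumed binary capacitor weight
            neg += weight * nand       # assumed: NAND output drives this line
            pos += weight * (1 - nand) # assumed: complement drives the other
    return pos, neg
```

In this model the two line totals always sum to `(2**x_bits - 1) * (2**w_bits - 1)`, so the difference between the lines carries the product information.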

    Reconfigurable SIMD engine
    Granted invention patent

    Publication No.: US11940945B2

    Publication date: 2024-03-26

    Application No.: US17566848

    Filing date: 2021-12-31

    Applicant: Ceremorphic, Inc.

    Inventor: Heonchul Park

    Abstract: An exemplary SIMD computing system comprises a SIMD processing element (SPE) configured to perform a selected operation on a portion of a processor input data word, with the operation selected by control signals read from a control memory location addressed by a decoded instruction. The SPE may comprise one or more adders, multipliers, or multiplexers coupled to the control signals. The control signals may comprise one or more bits read from the control memory. The control memory may be an M×N (M rows by N columns) memory having M possible SIMD operations and N control signals. Each decoded instruction may select an SPE operation from among the M rows. A plurality of SPEs may receive the same control signals. The control memory may be rewritable, advantageously permitting customizable SIMD operations that are reconfigurable by storing, in the control memory locations, control signals designed to cause the SPE to perform selected operations.
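
The control-memory scheme can be sketched in Python as follows. This is an illustrative model under assumptions, not the patented design: the three control bits (`use_mul`, `negate_b`, `pass_a`), the 4-row table, and the names `spe`, `control_memory`, and `simd_execute` are hypothetical. A decoded opcode addresses one row of the rewritable table, and every lane's SPE receives the same control signals.

```python
def spe(a, b, ctrl, width=8):
    """One SIMD processing element operating on one lane of the data word."""
    use_mul, negate_b, pass_a = ctrl    # assumed meaning of the N = 3 control bits
    mask = (1 << width) - 1
    if pass_a:                          # multiplexer selects operand a unchanged
        return a & mask
    if use_mul:                         # multiplier path
        return (a * b) & mask
    operand = -b if negate_b else b     # adder path, optionally subtracting
    return (a + operand) & mask

# M = 4 rows, one per SIMD operation; rewritable at run time.
control_memory = [
    (0, 0, 0),  # row 0: lane-wise add
    (0, 1, 0),  # row 1: lane-wise subtract
    (1, 0, 0),  # row 2: lane-wise multiply
    (0, 0, 1),  # row 3: pass operand a
]

def simd_execute(opcode, word_a, word_b, lanes=4, width=8):
    """Apply the operation stored in row `opcode` to every lane."""
    ctrl = control_memory[opcode]       # all SPEs share the same control signals
    mask = (1 << width) - 1
    out = 0
    for i in range(lanes):
        a = (word_a >> (i * width)) & mask
        b = (word_b >> (i * width)) & mask
        out |= spe(a, b, ctrl, width) << (i * width)
    return out
```

Overwriting a row, for example `control_memory[0] = (1, 0, 0)`, changes what opcode 0 does without changing the decoder, which is the reconfigurability the abstract describes.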

    Multi-threaded secure processor with control flow attack detection

    Publication No.: US11921843B2

    Publication date: 2024-03-05

    Application No.: US17485471

    Filing date: 2021-09-26

    Applicant: Ceremorphic, Inc.

    IPC classification: G06F11/16 G06F9/38 G06F21/52

    Abstract: A fault-detecting multi-thread pipeline processor is operative with a single pipeline stage which generates branch status comprising at least one of branch taken/not_taken, branch direction, and branch target. A first thread has control and data instructions, the control instructions comprising loop instructions including unconditional and conditional branch instructions, loop initialization instructions, loop arithmetic instructions, and no operation (NOP) instructions. A second thread has only control instructions, with the non-control instructions either replaced with NOP instructions or removed entirely. A fault detector compares the branch status of the first thread and second thread and asserts a fault output when they do not match.
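
The comparison of branch status between a full thread and a control-only thread can be sketched with a toy interpreter. Everything here is an assumption for illustration: the four-opcode instruction set (`set`, `dec`, `add`, `bnez`), the `flip_branch` fault-injection parameter, and the function names are hypothetical, and both "threads" are simply run one after the other rather than interleaved in a pipeline.

```python
CONTROL_OPS = {"set", "dec", "bnez", "nop"}  # loop init, loop arithmetic, branch, NOP

def control_only(program):
    """Build the second thread: non-control instructions become NOPs."""
    return [(op, arg) if op in CONTROL_OPS else ("nop", None) for op, arg in program]

def branch_trace(program, flip_branch=None, max_steps=1000):
    """Run a toy program and return its list of branch status tuples.

    flip_branch (hypothetical) inverts the outcome of the k-th branch to
    imitate a control flow fault; max_steps bounds runaway loops.
    """
    counter = acc = pc = 0
    trace = []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, arg = program[pc]
        next_pc = pc + 1
        if op == "set":
            counter = arg
        elif op == "dec":
            counter -= 1
        elif op == "add":              # data instruction: never affects branches
            acc += arg
        elif op == "bnez":
            taken = counter != 0
            if flip_branch == len(trace):
                taken = not taken
            direction = "backward" if arg <= pc else "forward"
            trace.append((taken, direction, arg))
            if taken:
                next_pc = arg
        pc = next_pc
    return trace

def fault_detected(program, flip_branch=None):
    """Assert a fault when the two threads' branch status differs."""
    first = branch_trace(program, flip_branch)
    second = branch_trace(control_only(program))
    return first != second
```

For a loop such as `[("set", 3), ("add", 5), ("dec", None), ("bnez", 1)]`, the two traces agree in normal operation and diverge as soon as one branch outcome in the first thread is corrupted.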

    System for error detection and correction in a multi-thread processor

    Publication No.: US20230409329A1

    Publication date: 2023-12-21

    Application No.: US17829050

    Filing date: 2022-05-31

    Applicant: Ceremorphic, Inc.

    IPC classification: G06F9/38

    CPC classification: G06F9/3802 G06F9/3851

    Abstract: A master processor is configured to execute a first thread and a second thread designated to run a program in sequence. A slave processor is configured to execute a third thread to run the program in sequence. An instruction fetch compare engine is provided. The first thread initiates a first thread instruction fetch for the program, which is stored in an instruction fetch storage. Retrieved data associated with the fetched first thread instruction is stored in a retrieved data storage. The second thread initiates a second thread instruction fetch for the program. The instruction fetch compare engine compares the second thread instruction fetch for the program with the first thread instruction fetch stored in the instruction fetch storage for a match. When there is a match, the retrieved data associated with the fetched first thread instruction is presented from the retrieved data storage in response to the second thread instruction fetch.
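
A minimal Python sketch of the fetch-compare behavior follows. The class name `FetchCompare`, the single-entry storage, and the fallback to a normal memory read on a mismatch are assumptions; the abstract only describes what happens on a match.

```python
class FetchCompare:
    """Toy model of an instruction fetch compare engine with one stored fetch."""

    def __init__(self, memory):
        self.memory = memory          # instruction memory, indexed by address
        self.memory_reads = 0         # counts actual memory accesses
        self.fetch_storage = None     # address of the first thread's fetch
        self.data_storage = None      # data retrieved for that fetch

    def first_thread_fetch(self, addr):
        self.memory_reads += 1
        self.fetch_storage = addr
        self.data_storage = self.memory[addr]
        return self.data_storage

    def second_thread_fetch(self, addr):
        if addr == self.fetch_storage:
            # Match: present the stored data instead of reading memory again.
            return self.data_storage
        # Assumed fallback: the abstract does not describe the mismatch case.
        self.memory_reads += 1
        return self.memory[addr]
```

When both threads run the same program in sequence, the second thread's fetch normally matches, so each instruction is read from memory once.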

    System for error detection and correction in a multi-thread processor

    Publication No.: US11720436B1

    Publication date: 2023-08-08

    Application No.: US17536195

    Filing date: 2021-11-29

    Applicant: Ceremorphic, Inc.

    Inventor: Heonchul Park

    IPC classification: G06F11/00 G06F11/07

    Abstract: A system for detecting and correcting errors in a multi-thread processor is disclosed. The multi-thread processor includes a first processor and a second processor. The first processor executes a first thread and a second thread. The second processor executes a third thread and a fourth thread. An instruction execution is initiated in all four threads. Outputs of the instruction execution from all four threads are compared for a match by a data compare engine to detect an error in execution of the instruction. When the output of the instruction execution from one of the four threads does not match, an error in execution is detected and the output is replaced by that of one of the other three threads whose outputs do match. When the outputs of the instruction execution from two or more threads do not match, an error is detected but not corrected.
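
The detect-and-correct rule in this abstract amounts to a vote over four outputs, which can be sketched in Python. The function name `compare_outputs` and the returned tuple layout are hypothetical, and the outputs are assumed to be hashable values such as integers.

```python
from collections import Counter

def compare_outputs(outputs):
    """Compare four thread outputs; return (value, error_detected, corrected)."""
    if len(outputs) != 4:
        raise ValueError("expected outputs from exactly four threads")
    value, votes = Counter(outputs).most_common(1)[0]
    if votes == 4:
        return value, False, False   # all four match: no error
    if votes == 3:
        return value, True, True     # one mismatch: replace with the agreeing output
    return None, True, False         # two or more mismatches: detected, not corrected
```

For example, `compare_outputs([7, 7, 9, 7])` reports a corrected error with value 7, while `compare_outputs([7, 9, 9, 7])` reports an error without a corrected value.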

    Process for Generation of Addresses in Multi-Level Data Access

    Publication No.: US20230244600A1

    Publication date: 2023-08-03

    Application No.: US17588238

    Filing date: 2022-01-29

    Applicant: Ceremorphic, Inc.

    IPC classification: G06F12/06

    CPC classification: G06F12/06

    Abstract: A process for iterating through a multi-dimensional array has an iteration process and an address generation process. In one example of the invention, an input address process, a coefficient address process, and an output address process generate addresses for a convolutional neural network (CNN). Each of the input address process, coefficient address process, and output address process is coupled to a plurality of iteration variables generated by an iteration variable process, each iteration variable having an associated bound and stride, thereby generating an input address, a coefficient address, and an output address.
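
One way to picture the iteration and address generation processes is the Python sketch below, written for a single-channel 2D convolution with row-major arrays. The address formulas, the choice of four iteration variables, and the names `iterate` and `conv_addresses` are assumptions for illustration; the abstract states only that each iteration variable has a bound and a stride and that three address processes consume them.

```python
from itertools import product

def iterate(specs):
    """Yield tuples of iteration variables; each spec is (bound, stride)."""
    return product(*(range(0, bound, stride) for bound, stride in specs))

def conv_addresses(in_w, k, out_h, out_w, conv_stride=1):
    """Yield (input, coefficient, output) addresses for a k x k convolution."""
    specs = [
        (out_h * conv_stride, conv_stride),  # input row of the window origin
        (out_w * conv_stride, conv_stride),  # input column of the window origin
        (k, 1),                              # kernel row
        (k, 1),                              # kernel column
    ]
    for y, x, ky, kx in iterate(specs):
        in_addr = (y + ky) * in_w + (x + kx)
        coef_addr = ky * k + kx
        out_addr = (y // conv_stride) * out_w + (x // conv_stride)
        yield in_addr, coef_addr, out_addr
```

For a 3-wide input, a 2×2 kernel, and a 2×2 output, the generator yields 16 address triples, starting with `(0, 0, 0)` and ending with `(8, 3, 3)`.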