CAMS FOR LOW LATENCY COMPLEX DISTRIBUTION SAMPLING

    Publication No.: US20240153555A1

    Publication date: 2024-05-09

    Application No.: US18411222

    Filing date: 2024-01-12

    IPC classification: G11C13/00 G06N7/01 G11C15/04

    Abstract: Systems and methods are provided for employing analog content addressable memories (aCAMs) to achieve low-latency sampling of complex distributions. For example, an aCAM core circuit can include an aCAM array. Amplitudes of a probability distribution function are mapped to the widths of one or more aCAM cells in each row of the aCAM array. The aCAM core circuit can also include a resistive random access memory (RRAM) storing lookup information, such as information used for processing a model. By randomly selecting columns of the aCAM array to search, the mapped probability distribution function is sampled with low latency. The aCAM core circuit can accelerate the sampling step in methods that rely on sampling from arbitrary probability distributions, such as particle filter techniques. A hardware architecture for an aCAM Particle Filter that uses the aCAM core circuit as its central structure is also described.
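
The mapping of amplitudes to cell widths can be illustrated with a software analogy. The sketch below is not the patented circuit: it assumes each row stores one range whose width is proportional to a probability amplitude, and it replaces the random column selection described in the abstract with a single random search value. The names `build_rows`, `search`, and `sample` are hypothetical.

```python
import random

def build_rows(pdf):
    """Map each amplitude to a [low, high) range whose width equals its probability."""
    total = sum(pdf)
    rows, low = [], 0.0
    for amp in pdf:
        high = low + amp / total
        rows.append((low, high))
        low = high
    return rows

def search(rows, value):
    """Emulate a range search: return the index of the row whose range contains value.

    Hardware would compare all rows in parallel; this loop is sequential.
    """
    for i, (low, high) in enumerate(rows):
        if low <= value < high:
            return i
    return len(rows) - 1  # guard against floating-point round-off at the top edge

def sample(rows, rng):
    """Draw one sample: a uniform random search value selects a row."""
    return search(rows, rng.random())

rows = build_rows([1, 3, 4, 2])   # unnormalized amplitudes (illustrative values)
rng = random.Random(0)
counts = [0] * len(rows)
for _ in range(10_000):
    counts[sample(rows, rng)] += 1
print(counts)  # counts should be roughly proportional to 1 : 3 : 4 : 2
```

Because a wider range is hit more often by a uniform search value, the row index returned by the search is distributed according to the stored amplitudes.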

    Enhanced k-SAT solver using analog content addressable memory

    Publication No.: US11899965B2

    Publication date: 2024-02-13

    Application No.: US17691642

    Filing date: 2022-03-10

    IPC classification: G06F3/06 G06N3/063

    Abstract: A system for facilitating an enhanced k-SAT solver is provided. The system can include a set of analog content addressable memory (aCAM) modules that can represent an expression in conjunctive normal form (CNF), wherein a respective aCAM module corresponds to a clause of the expression. The system can also include a set of data lines that can provide input candidate values to the set of aCAM modules. A controller of the system can program the set of aCAM modules with respective analog values to represent the expression. The system can also include a sensing logic block to determine a distance of a current solution from a target solution based on a combination of respective outputs from the set of aCAM modules. The controller can then iteratively modify an input value for a subset of data lines until the current solution converges based on a convergence condition.
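
The iterate-until-converged loop in this abstract resembles a stochastic local search, which can be sketched in software. The example below is an assumption-laden analogy, not the claimed hardware: it takes the "distance" to be the number of unsatisfied clauses and flips one randomly chosen variable per step, in the style of a plain random walk. The function names and the example formula are hypothetical.

```python
import random

def unsatisfied(clauses, assignment):
    """Return the clauses that the assignment does not satisfy.

    A literal is a nonzero int: v means variable v is True, -v means it is False.
    The length of the returned list plays the role of the distance to a solution.
    """
    return [c for c in clauses
            if not any(assignment[abs(lit)] == (lit > 0) for lit in c)]

def solve(clauses, num_vars, max_flips=10_000, seed=0):
    """Random-walk search: flip a variable from a random unsatisfied clause."""
    rng = random.Random(seed)
    assignment = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
    for _ in range(max_flips):
        unsat = unsatisfied(clauses, assignment)
        if not unsat:              # convergence condition: distance is zero
            return assignment
        var = abs(rng.choice(rng.choice(unsat)))
        assignment[var] = not assignment[var]
    return None                    # no solution found within the flip budget

# Illustrative 3-SAT instance over variables 1..3.
clauses = [(1, -2, 3), (-1, 2, 3), (1, 2, -3), (-1, -2, -3)]
solution = solve(clauses, num_vars=3)
print(solution)
```

In the described system each clause would be evaluated by its own aCAM module at the same time, so the per-step clause check that this loop performs sequentially would take a single search.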

    ANALOG ERROR DETECTION AND CORRECTION IN ANALOG IN-MEMORY CROSSBARS

    Publication No.: US20240039562A1

    Publication date: 2024-02-01

    Application No.: US18482964

    Filing date: 2023-10-09

    IPC classification: H03M13/00 H03M13/15 G11C13/00

    Abstract: An analog error correction circuit is disclosed that implements an analog error correction code. The analog circuit includes a crossbar array of memristors or other non-volatile tunable resistive memory devices. The crossbar array includes a first crossbar array portion programmed with values of a target computation matrix and a second crossbar array portion programmed with values of an encoder matrix for correcting computation errors in the matrix multiplication of an input vector with the computation matrix. The first and second crossbar array portions share the same row lines and are connected to a third crossbar array portion that is programmed with values of a decoder matrix, thereby enabling single-cycle error detection. A computation error is detected based on the output of the decoder matrix circuitry, and the location of the error is determined via an inverse matrix multiplication operation in which the decoder matrix output is fed back to the decoder matrix.
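
The idea of computing check values alongside the result can be sketched numerically. The example below is a simplified illustration under stated assumptions, not the disclosed code or circuit: the generator matrix `G`, the decoder layout `H = [G; -I]`, and the parallel-row location test are all hypothetical choices, and the location step here is not the feedback-based inverse multiplication named in the abstract.

```python
def matvec(x, M):
    """Row vector x times matrix M (given as a list of rows)."""
    return [sum(xi * row[j] for xi, row in zip(x, M)) for j in range(len(M[0]))]

A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]            # computation matrix: 2 inputs x 3 outputs
G = [[1.0, 1.0],
     [1.0, 2.0],
     [1.0, 3.0]]                 # hypothetical generator: 3 data columns -> 2 check columns

# Encoder portion stores A @ G, so check outputs appear in the same read as A's outputs.
AG = [matvec(row, G) for row in A]
crossbar = [a + p for a, p in zip(A, AG)]    # shared row lines: [A | A G]

# Decoder H = [G; -I]: every error-free output y satisfies y @ H == 0.
H = G + [[-1.0, 0.0], [0.0, -1.0]]

def syndrome(y):
    """Decoder output; all zeros when no error is present."""
    return matvec(y, H)

def locate(s, tol=1e-9):
    """Single-error location: index of the row of H parallel to the syndrome."""
    if all(abs(v) < tol for v in s):
        return None                           # nothing to locate
    for j, h in enumerate(H):
        if abs(s[0] * h[1] - s[1] * h[0]) < tol:
            return j
    return None

x = [2.0, 1.0]
y = matvec(x, crossbar)          # data outputs followed by check outputs
faulty = list(y)
faulty[1] += 0.5                 # inject an error into output column 1
print(syndrome(y), syndrome(faulty), locate(syndrome(faulty)))
```

The rows of `H` are chosen to be pairwise non-parallel, which is what lets a single erroneous column be identified from the direction of the syndrome.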

    Adjustable precision for multi-stage compute processes

    Publication No.: US11385863B2

    Publication date: 2022-07-12

    Application No.: US16052218

    Filing date: 2018-08-01

    IPC classification: G06F7/483 G06N3/08 G06N3/063

    Abstract: Disclosed techniques provide for dynamically changing the precision of a multi-stage compute process, for example by changing neural network (NN) parameters on a per-layer basis depending on properties of incoming data streams and the per-layer performance of an NN, among other considerations. NNs include multiple layers that may each be calculated with a different degree of accuracy and, therefore, a different compute resource overhead (e.g., memory, processor resources, etc.). NNs are usually trained with 32-bit or 16-bit floating-point numbers. Once trained, an NN may be deployed in production. One approach to reducing compute overhead is to reduce the parameter precision of NNs to 16 or 8 bits for deployment. The conversion to an acceptable lower precision is usually determined manually before deployment, and precision levels are fixed while deployed. Disclosed techniques and implementations address automatic rather than manual determination of precision levels for different stages, and dynamic adjustment of the precision for each stage at run-time.
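
Automatic per-stage precision selection can be illustrated with a small sketch. It assumes a simple criterion that the abstract does not specify: pick the smallest bit width whose worst-case quantization error stays under a per-layer tolerance. The functions `quantize` and `pick_bits`, the candidate bit widths, and the layer data are hypothetical.

```python
def quantize(values, bits):
    """Uniformly quantize values onto a signed grid with the given bit width."""
    scale = max(abs(v) for v in values) or 1.0
    levels = 2 ** (bits - 1) - 1
    return [round(v / scale * levels) / levels * scale for v in values]

def pick_bits(values, max_error, candidates=(4, 8, 16)):
    """Return the smallest candidate bit width whose worst-case error is within max_error."""
    for bits in candidates:
        q = quantize(values, bits)
        if max(abs(a - b) for a, b in zip(values, q)) <= max_error:
            return bits
    return 32  # fall back to full precision

# Hypothetical layers: (weights, error tolerance for that layer).
layers = {
    "layer1": ([0.5, -0.25, 0.125, 1.0], 0.1),
    "layer2": ([0.5, -0.25, 0.125, 1.0], 0.01),
}
chosen = {name: pick_bits(w, tol) for name, (w, tol) in layers.items()}
print(chosen)
```

Re-running `pick_bits` whenever a layer's tolerance changes gives a crude software picture of adjusting precision per stage at run-time.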

    Memristor spiking architecture
    Granted patent

    Publication No.: US11232352B2

    Publication date: 2022-01-25

    Application No.: US16037060

    Filing date: 2018-07-17

    IPC classification: G06N3/04 G06N3/063 G06N3/08

    Abstract: A circuit for a neuron of a multi-stage compute process is disclosed. The circuit comprises a weighted charge packet (WCP) generator. The circuit may also include a voltage divider controlled by a programmable resistance component (e.g., a memristor). The WCP generator may also include a current mirror controlled via the voltage divider and the arrival of an input spike signal at the neuron. WCPs may be created to represent the multiply function of a multiply-accumulate processor. The WCPs may be supplied to a capacitor to accumulate and represent the accumulate function. The value of a WCP may be controlled by the length of the input spike signal multiplied by the current supplied through the current mirror. Spikes may be asynchronous. Memristive components may be electrically isolated from input spike signals so that their programmed conductance is not affected. Positive and negative spikes and WCPs for accumulation may be supported.
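
The multiply and accumulate roles described here follow from basic circuit relations: a packet carries charge Q = I · t, and a capacitor integrating packets reaches V = V0 + ΣQ / C. The sketch below is an idealized numerical illustration only; the function names and all component values are assumptions, and leakage and other non-ideal effects are ignored.

```python
def weighted_charge_packet(current_amps, spike_seconds, sign=1):
    """Charge delivered by one spike: Q = I * t, with sign for positive or negative packets."""
    return sign * current_amps * spike_seconds

def accumulate(packets, capacitance_farads, v0=0.0):
    """Capacitor voltage after integrating all packets: V = V0 + sum(Q) / C."""
    return v0 + sum(packets) / capacitance_farads

# Illustrative values: mirror currents act as weights, spike lengths as inputs.
packets = [
    weighted_charge_packet(2e-6, 10e-9),            # 2 uA for 10 ns
    weighted_charge_packet(1e-6, 20e-9),            # 1 uA for 20 ns
    weighted_charge_packet(1e-6, 10e-9, sign=-1),   # negative packet
]
voltage = accumulate(packets, capacitance_farads=1e-12)
print(voltage)
```

Each product of current and spike length stands in for one weight-times-input term, and the sum on the capacitor stands in for the accumulate step.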

    Content addressable memory circuits with threshold switching memristors

    Publication No.: US10896731B1

    Publication date: 2021-01-19

    Application No.: US16526455

    Filing date: 2019-07-30

    IPC classification: G11C15/04

    Abstract: A content addressable memory (CAM) structure is provided. The CAM comprises a plurality of CAM cells communicatively coupled to processing circuitry. A plurality of threshold switching (TS) memristors are included, each configured to connect to one of the plurality of CAM cells, with the first end connected to the CAM cell and the second end connected to a match line. A discharge transistor is included and configured to discharge any charge on the match line in response to the CAM receiving a command to perform a search.

    FAULT-TOLERANT ANALOG COMPUTING
    Patent application

    Publication No.: US20200382135A1

    Publication date: 2020-12-03

    Application No.: US16429983

    Filing date: 2019-06-03

    IPC classification: H03M13/11 G06F17/16

    Abstract: A fault-tolerant analog computing device includes a crossbar array having l rows and n columns that intersect the l rows to form l×n memory locations. The l rows of the crossbar array receive an input signal as a vector of length l. The n columns output an output signal as a vector of length n that is a dot product of the input signal and the matrix values defined in the l×n memory locations. Each memory location is programmed with a matrix value. A first set of k columns of the n columns is programmed with continuous analog target matrix values with which the input signal is to be multiplied, where k