Computational memory
    Granted Patent

    Publication No.: US12124530B2

    Publication Date: 2024-10-22

    Application No.: US17675729

    Filing Date: 2022-02-18

    Abstract: A processing device includes a two-dimensional array of processing elements, each including an arithmetic logic unit to perform an operation. The device further includes interconnections among the processing elements to provide direct communication among neighboring elements of the array. Each processing element is connected to a first neighbor that is immediately adjacent to it in the first dimension of the array, and to a second neighbor that is immediately adjacent to it in the second dimension.
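    The interconnection pattern is easiest to see in code. Below is a minimal Python sketch, under the stated description only: a grid of processing elements with direct north/south/east/west neighbor links and a stand-in per-element ALU. The names ProcessingElement, alu, and build_grid are illustrative, not taken from the patent.

        class ProcessingElement:
            def __init__(self, row, col):
                self.row, self.col = row, col
                self.neighbors = {}   # direction -> directly connected adjacent PE
                self.value = 0

            def alu(self, op, operand):
                # Stand-in for the per-element arithmetic logic unit.
                if op == "add":
                    self.value += operand
                elif op == "mul":
                    self.value *= operand

        def build_grid(rows, cols):
            grid = [[ProcessingElement(r, c) for c in range(cols)] for r in range(rows)]
            for r in range(rows):
                for c in range(cols):
                    pe = grid[r][c]
                    if r > 0:            # immediate neighbor in the first dimension
                        pe.neighbors["north"] = grid[r - 1][c]
                    if r < rows - 1:
                        pe.neighbors["south"] = grid[r + 1][c]
                    if c > 0:            # immediate neighbor in the second dimension
                        pe.neighbors["west"] = grid[r][c - 1]
                    if c < cols - 1:
                        pe.neighbors["east"] = grid[r][c + 1]
            return grid

    For example, build_grid(4, 4)[1][1].neighbors contains all four directions, while a corner element such as grid[0][0] has only two links.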

    ASYNCHRONOUS ACCUMULATOR USING LOGARITHMIC-BASED ARITHMETIC

    Publication No.: US20240311626A1

    Publication Date: 2024-09-19

    Application No.: US18674632

    Filing Date: 2024-05-24

    Abstract: Neural networks often include convolution layers configured to perform many convolution operations, which require multiplication and addition. Compared with multiplying integer, fixed-point, or floating-point values, multiplying logarithmic-format values is straightforward and energy-efficient, as the exponents are simply added. Performing addition on logarithmic-format values, however, is more complex. Conventionally, addition is performed by converting the logarithmic values to integers, computing the sum, and then converting the sum back into logarithmic format. Instead, logarithmic-format values may be added by decomposing the exponents into separate quotient and remainder components, sorting the quotient components by their remainder components, summing the sorted quotient components using an asynchronous accumulator to produce partial sums, and multiplying the partial sums by the remainder components to produce a sum. The sum may then be converted back into logarithmic format.
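    A minimal Python sketch of the quotient/remainder trick, in software rather than hardware. It assumes each value is stored as a fixed-point log2 exponent e with f fractional bits, i.e. value = 2 ** (e / 2**f), so that e = q * 2**f + r factors the value into (2**q) * 2**(r / 2**f). The name log_domain_sum and the dict-based grouping are illustrative stand-ins for the patent's asynchronous accumulator.

        from collections import defaultdict

        def log_domain_sum(exponents, f=2):
            # Group quotients by remainder: each 2**q addend is a power of
            # two, so accumulation per remainder bucket is shift-and-add,
            # with no general-purpose multiplier needed.
            partial = defaultdict(int)        # remainder r -> sum of 2**q terms
            for e in exponents:
                q, r = divmod(e, 2 ** f)      # quotient / remainder decomposition
                partial[r] += 1 << q

            # Scale each partial sum by its remainder factor and combine.
            # (A log2 of this result would convert back to logarithmic format.)
            return sum(p * 2 ** (r / 2 ** f) for r, p in partial.items())

        # Example: with f=2, exponents 4, 5, 6 encode 2**1.0, 2**1.25, 2**1.5.
        assert abs(log_domain_sum([4, 5, 6]) - (2**1.0 + 2**1.25 + 2**1.5)) < 1e-9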

    NEURAL NETWORKS FOR EMBEDDED DEVICES
    Published Application

    Publication No.: US20240296330A1

    Publication Date: 2024-09-05

    Application No.: US18664035

    Filing Date: 2024-05-14

    Applicant: Tesla, Inc.

    IPC Classes: G06N3/08 G06F7/575

    CPC Classes: G06N3/08 G06F7/575

    Abstract: A neural network architecture reduces the processing load of implementing the network, and may thus be used on reduced-bit processing devices. The architecture may limit the number of bits used for processing to prevent data overflow at individual calculations of the network. To implement this architecture, the number of bits used to represent inputs at each level of the network, and in the related filter masks, may be modified so that the bit width of each output does not exceed the capacity of the reduced-bit processor. To further reduce the load, the network may implement a "starconv" structure that permits the incorporation of nearby nodes in a layer, balancing processing requirements while permitting the network to learn from the context of other nodes.
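    A minimal Python sketch of the overflow-budget reasoning behind such bit limits. It uses the standard bit-growth rule for a sum of products (each product of an in_bits input and a w_bits weight needs in_bits + w_bits bits, and summing n_taps of them adds ceil(log2(n_taps)) carry bits); this formula and the function names are assumptions for illustration, not taken from the patent.

        import math

        def accumulator_bits_needed(in_bits, w_bits, n_taps):
            # Worst-case width of a dot product of n_taps terms.
            return in_bits + w_bits + math.ceil(math.log2(n_taps))

        def fits(in_bits, w_bits, n_taps, acc_bits=32):
            # True if the layer's accumulation cannot overflow acc_bits.
            return accumulator_bits_needed(in_bits, w_bits, n_taps) <= acc_bits

        # e.g. 8-bit inputs, 8-bit filter weights, a 3x3 kernel over 64
        # channels: 8 + 8 + ceil(log2(576)) = 26 bits, which fits in 32.
        print(fits(8, 8, 3 * 3 * 64))   # True

    Under this model, shrinking the input or filter-mask bit widths at a given layer directly shrinks the required accumulator width, which is the lever the abstract describes for reduced-bit processors.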