OPTIMIZING METHOD AND COMPUTING SYSTEM FOR DEEP LEARNING NETWORK

    Publication No.: US20240361988A1

    Publication Date: 2024-10-31

    Application No.: US18342661

    Filing Date: 2023-06-27

    Abstract: Disclosed are an optimizing method and a computing system for deep learning networks. First data is obtained and quantized through power-of-two quantization. The first data after power-of-two quantization is in a first format or a second format, and the number of first values differs between the first format and the second format. Second data is obtained and quantized through dynamic fixed-point quantization. A computation related to a deep learning network is performed on the first data after power-of-two quantization and the second data after dynamic fixed-point quantization. Accordingly, prediction precision can be increased and model complexity can be reduced.
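
The two quantization schemes named in this abstract can be illustrated with a minimal Python sketch. It is not the claimed method: treating the first data as weights and the second data as activations, the exponent range, the 8-bit width, and both function names are assumptions made for illustration, and the first/second format distinction is not modelled.

```python
import math

def quantize_power_of_two(x, min_exp=-8, max_exp=0):
    """Round x to the nearest signed power of two (nearest in the log2 domain).

    The exponent range [min_exp, max_exp] is an illustrative assumption.
    """
    if x == 0:
        return 0.0
    exp = round(math.log2(abs(x)))
    exp = max(min_exp, min(max_exp, exp))
    return math.copysign(2.0 ** exp, x)

def quantize_dynamic_fixed_point(values, bits=8):
    """Quantize a non-empty list to signed fixed point with a shared fractional length.

    The fractional length is chosen from the largest magnitude in the list,
    which is what makes the fixed-point format "dynamic".
    """
    max_abs = max(abs(v) for v in values)
    # Integer bits needed to represent the largest magnitude (sign bit excluded).
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1) if max_abs > 0 else 0
    frac_bits = bits - 1 - int_bits
    scale = 2.0 ** frac_bits
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    quantized = [max(lo, min(hi, round(v * scale))) / scale for v in values]
    return quantized, frac_bits

# Hypothetical data: weights as "first data", activations as "second data".
weights = [quantize_power_of_two(w) for w in [0.3, -0.75, 0.06]]
activations, frac_bits = quantize_dynamic_fixed_point([0.5, -1.25, 3.0])
dot = sum(w * a for w, a in zip(weights, activations))
```

Because every quantized weight is a signed power of two, each multiplication in the dot product could be carried out in hardware as a bit shift, which is one plausible reading of the reduced complexity the abstract mentions.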

    OPERATION METHOD OF MULTIPLIER, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication No.: US20240361985A1

    Publication Date: 2024-10-31

    Application No.: US18603806

    Filing Date: 2024-03-13

    IPC Classification: G06F7/53 G06F7/50 H03K19/21

    CPC Classification: G06F7/5318 G06F7/50 H03K19/21

    Abstract: Disclosed are an operation method of a multiplier, an electronic device, and a storage medium. The method includes: determining a plurality of input data sets of the multiplier and an encoding manner for the multiplier; determining at least one low-order input data set in the plurality of input data sets; determining a carry compensation term corresponding to the at least one low-order input data set based on the at least one low-order input data set and the encoding manner; determining a target partial product array based on the carry compensation term corresponding to the at least one low-order input data set and the plurality of input data sets; and determining a product operation result for each input data set based on the target partial product array. According to this disclosure, multiple-precision multiplication operations may be implemented with a single multiplier, thereby reducing hardware resource consumption and hardware area.
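
The idea of serving several precisions from one partial product array can be sketched in Python as follows. This is a simplified, unsigned illustration under stated assumptions, not the patented method: the 8-bit width, the dual 4-bit mode, and the function names are hypothetical, and the carry compensation term is not reproduced because it depends on the encoding manner, which the abstract does not specify.

```python
def partial_products(a, b, dual_4bit=False):
    """Build the eight partial product rows of an unsigned 8x8 multiplier.

    In dual 4-bit mode the cross terms (low nibble x high nibble) are masked
    out, so the same array holds two independent 4x4 products.
    """
    rows = []
    for i in range(8):
        if not (b >> i) & 1:
            rows.append(0)
            continue
        if dual_4bit:
            # Low multiplier bits see only the low nibble of a, and vice versa.
            operand = a & 0x0F if i < 4 else a & 0xF0
        else:
            operand = a & 0xFF
        rows.append(operand << i)
    return rows

def multiply(a, b, dual_4bit=False):
    """Sum the partial product array and split the result per input data set."""
    total = sum(partial_products(a, b, dual_4bit))
    if dual_4bit:
        return total & 0xFF, total >> 8  # (low-order product, high-order product)
    return total

# One 8-bit product, then two packed 4-bit products (5*3 and 7*9).
full = multiply(200, 123)
low, high = multiply((7 << 4) | 5, (9 << 4) | 3, dual_4bit=True)
```

With unsigned operands and no recoding, the two 4-bit products never overlap, so no correction is needed. With a signed or recoded multiplier, the low-order product can disturb the high-order one, which is presumably the situation the carry compensation term addresses.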

    Palettization of Kernel Vector in Neural Network Processor

    Publication No.: US20240232571A1

    Publication Date: 2024-07-11

    Application No.: US18094251

    Filing Date: 2023-01-06

    Applicant: Apple Inc.

    Inventor: Sung Hee Park

    IPC Classification: G06N3/04 G06F7/50 G06F7/523

    CPC Classification: G06N3/04 G06F7/50 G06F7/523

    Abstract: Embodiments of the present disclosure relate to decompressing a kernel for neural network operations in a neural processor circuit, using a look-up table (LUT) with each of its entries associated with a plurality of kernel coefficients. Index data in compressed kernel data includes indices that indicate entries in the LUT. During decompression, all kernel coefficients in the entries indicated by the indices of the index data are retrieved and assembled into the decompressed kernel. A block sparse mask may also be used to indicate a block of locations in the uncompressed kernel to be filled with zero values. Only the one or more blocks of locations that the block sparse mask indicates as including at least one non-zero kernel coefficient may be populated with kernel coefficients from the LUT, while the remaining blocks of locations are padded with zeros.
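
The decompression flow described in this abstract can be sketched in Python as below. The data layout is an assumption made for illustration: each LUT entry is taken to hold exactly one block of coefficients, the mask holds one flag per block, and the function name `decompress_kernel` is hypothetical.

```python
def decompress_kernel(lut, indices, block_mask):
    """Rebuild a flat kernel from a vector LUT, index data, and a block sparse mask.

    lut        : list of entries, each a list of kernel coefficients
    indices    : one LUT index per non-zero block, in kernel order
    block_mask : one flag per block; falsy means the block is all zeros
    """
    block_size = len(lut[0])  # assumption: one LUT entry fills one block
    next_index = iter(indices)
    kernel = []
    for has_nonzero in block_mask:
        if has_nonzero:
            # Copy every coefficient of the indicated LUT entry.
            kernel.extend(lut[next(next_index)])
        else:
            # Masked-out blocks consume no index and are padded with zeros.
            kernel.extend([0] * block_size)
    return kernel

# Hypothetical two-entry LUT with two coefficients per entry.
lut = [[1, 2], [3, 4]]
kernel = decompress_kernel(lut, indices=[1, 0], block_mask=[1, 0, 1])
```

Storing indices only for the non-zero blocks means the index data shrinks along with the sparsity of the kernel.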

    A METHOD AND ARCHITECTURE FOR PERFORMING MODULAR ADDITION AND MULTIPLICATION SEQUENCES

    Publication No.: US20240220201A1

    Publication Date: 2024-07-04

    Application No.: US17925367

    Filing Date: 2021-09-20

    IPC Classification: G06F7/523 G06F7/50

    CPC Classification: G06F7/523 G06F7/50

    Abstract: A computer processing system includes at least one arithmetic logic unit in a computer processing device, at least one addition circuit operably configured to receive two numerical inputs and compute a sum, and at least one modular multiplication circuit operably configured to receive the sum from the at least one addition circuit, receive at least one other numerical input, and receive a numerical modulus to perform a modular multiplication operation and generate a modular multiplication operation result.
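
The data path in this abstract, an adder feeding a modular multiplier, amounts to computing $(a + b) \cdot c \bmod m$. A minimal Python sketch, with a hypothetical function name and no claim about how the circuit performs the reduction:

```python
def add_then_modmul(a, b, c, modulus):
    """Model the adder followed by the modular multiplier.

    The sum is passed to the multiplier unreduced, as the abstract describes
    the multiplication circuit receiving the sum directly.
    """
    s = a + b                  # addition circuit: two numerical inputs, one sum
    return (s * c) % modulus   # modular multiplication circuit

# Hypothetical operands.
result = add_then_modmul(17, 25, 9, 97)
```

Since the final reduction absorbs any overflow of the sum, the result matches what would be obtained by reducing the sum first.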

    Differential unit element for multiply-accumulate operations on a shared charge transfer bus

    Publication No.: US12026479B2

    Publication Date: 2024-07-02

    Application No.: US17163494

    Filing Date: 2021-01-31

    Applicant: Ceremorphic, Inc.

    Abstract: A Unit Element (UE) has a digital X input and a digital W input and comprises groups of NAND gates generating complementary outputs, which are coupled to differential charge transfer lines through respective charge transfer capacitors Cu. The number of bits in the X input determines the number of NAND gates in a NAND-group, and the number of bits in the W input determines the number of NAND-groups. Each NAND-group receives one bit of the W input, applied to all of the NAND gates of the NAND-group, and the bits of X are applied to the associated NAND gate inputs of each unit element. The NAND gate outputs are coupled through a charge transfer capacitor Cu to charge transfer lines. Multiple Unit Elements may be placed in parallel to sum and scale the charges from the charge transfer lines, with the charges coupled to an analog-to-digital converter which forms the dot product output.
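
A purely digital, behavioural Python sketch of the unit element may help in reading this abstract. It is an idealized model, not the circuit: unsigned inputs, the 4-bit widths, binary weighting of each gate's contribution by 2^(i+j), and the function names are all assumptions, and the differential lines, capacitors, and converter are abstracted into an exact integer sum.

```python
def nand(a, b):
    """Two-input NAND on single bits."""
    return 0 if (a and b) else 1

def unit_element(x, w, x_bits=4, w_bits=4):
    """Model one unit element: one NAND-group per W bit, one gate per X bit."""
    total = 0
    for j in range(w_bits):            # one NAND-group per bit of W
        w_j = (w >> j) & 1
        for i in range(x_bits):        # one NAND gate per bit of X
            x_i = (x >> i) & 1
            out = nand(x_i, w_j)
            # A low NAND output marks x_i AND w_j; weight it by 2^(i+j)
            # (the binary weighting is an assumption of this sketch).
            total += (1 - out) << (i + j)
    return total

def dot_product(xs, ws):
    """Sum the contributions of unit elements placed in parallel."""
    return sum(unit_element(x, w) for x, w in zip(xs, ws))

# Hypothetical 4-bit inputs.
single = unit_element(5, 3)
dot = dot_product([1, 2, 3], [4, 5, 6])
```

Under these assumptions each unit element contributes exactly x times w, so the summed result equals the dot product that the converter would digitize.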