ACCURACY-BASED APPROXIMATION OF ACTIVATION FUNCTIONS WITH PROGRAMMABLE LOOK-UP TABLE HAVING AREA BUDGET

    Publication number: US20240111830A1

    Publication date: 2024-04-04

    Application number: US18534035

    Filing date: 2023-12-08

    CPC classification number: G06F17/17 G06F1/0307

    Abstract: A non-linear activation function in a deep neural network (DNN) may be approximated by one or more linear functions. The input range of the activation function may be divided into input segments, each of which corresponds to a different exponent in the input range and includes the input data elements having that exponent. Target accuracies may be assigned to the identified exponents based on a statistical analysis of the input data elements. The target accuracy of an input segment may be used to determine one or more linear functions that approximate the activation function for that segment, such that the approximation error for the segment is within the target accuracy. The parameters of the linear functions may be stored in a look-up table (LUT). During execution of the DNN, the LUT may be used to execute the activation function.
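The idea in this abstract can be illustrated with a minimal sketch, which is not the patented method. It assumes positive inputs, one segment per binary exponent `[2**e, 2**(e+1))`, chord (endpoint) interpolation, and equal-width pieces that are doubled until a sampled error estimate meets the target. The names `fit_segment`, `build_lut`, `lut_eval` and the `max_pieces` cap (a stand-in for a table-size budget) are hypothetical.

```python
import math

def fit_segment(f, lo, hi, target, max_pieces=64, samples=64):
    """Split [lo, hi) into equal pieces until each chord is within `target` of f.

    Returns (pieces, worst), where pieces is a list of (slope, intercept) and
    worst is the largest sampled error. Stops at max_pieces even if the target
    is not met (assumed stand-in for a size budget).
    """
    n = 1
    while True:
        width = (hi - lo) / n
        pieces, worst = [], 0.0
        for i in range(n):
            a, b = lo + i * width, lo + (i + 1) * width
            slope = (f(b) - f(a)) / (b - a)
            intercept = f(a) - slope * a
            pieces.append((slope, intercept))
            # Estimate the piece's error by sampling; not an exact maximum.
            for k in range(samples + 1):
                x = a + (b - a) * k / samples
                worst = max(worst, abs(f(x) - (slope * x + intercept)))
        if worst <= target or n >= max_pieces:
            return pieces, worst
        n *= 2

def build_lut(f, targets):
    """targets maps exponent e -> target accuracy for inputs in [2**e, 2**(e+1))."""
    return {e: fit_segment(f, 2.0 ** e, 2.0 ** (e + 1), t)[0]
            for e, t in targets.items()}

def lut_eval(lut, x):
    """Evaluate the approximation for a positive x whose exponent is in the LUT."""
    _, e = math.frexp(x)      # x = m * 2**e with 0.5 <= m < 1
    e -= 1                    # now 2**e <= x < 2**(e+1)
    pieces = lut[e]
    lo = 2.0 ** e             # segment start; segment width is also 2**e
    idx = min(int((x - lo) / lo * len(pieces)), len(pieces) - 1)
    slope, intercept = pieces[idx]
    return slope * x + intercept
```

For example, `build_lut(math.tanh, {0: 1e-2, 1: 1e-2})` would cover inputs in `[1, 4)`. In this sketch a tighter target for an exponent yields at least as many pieces, which is the accuracy-versus-table-size trade-off the title refers to.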

    FLOATING POINT MULTIPLY-ACCUMULATE UNIT FOR DEEP LEARNING

    Publication number: US20220188075A1

    Publication date: 2022-06-16

    Application number: US17688131

    Filing date: 2022-03-07

    Abstract: A floating-point multiply-accumulate (FPMAC) operation has two operands: an input operand and a weight operand. The operands may have a format of FP16, BF16, or INT8. Each operand is split into two portions, which are stored in separate storage units. The operands are then transferred to register files of a processing element (PE), with each register file storing the bits of an operand sequentially. The PE performs the FPMAC operation based on the operands. The PE may include an FPMAC unit configured to compute an individual partial sum of the PE. The PE may also include an FP adder to accumulate the individual partial sum with other data, such as an output from another PE or an output from another PE array. The FP adder may be fused with the FPMAC unit in a single circuit that can do speculative alignment and has separate critical paths for alignment and normalization.
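A software sketch of the data flow described in this abstract, under several assumptions the abstract does not state: FP16 operands, a split into a high byte and a low byte, and accumulation in Python's double precision rather than in a hardware accumulator. The function names are hypothetical, and the speculative alignment and normalization paths of the fused circuit are not modeled.

```python
import struct

def split_fp16(x):
    """Encode x as IEEE half precision and split it into (high byte, low byte)."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    return bits >> 8, bits & 0xFF

def join_fp16(hi, lo):
    """Rebuild the half-precision value from its two stored portions."""
    (x,) = struct.unpack("<e", struct.pack("<H", (hi << 8) | lo))
    return x

def fpmac(inputs, weights, acc_in=0.0):
    """Multiply-accumulate over split operands, then add an incoming sum.

    `inputs` and `weights` are lists of (high, low) portions. `acc_in` stands
    in for data from another PE or PE array (assumed to be a plain float here).
    """
    partial = 0.0
    for (ih, il), (wh, wl) in zip(inputs, weights):
        partial += join_fp16(ih, il) * join_fp16(wh, wl)
    return partial + acc_in
```

A call such as `fpmac([split_fp16(v) for v in xs], [split_fp16(w) for w in ws], acc_in=s)` mirrors the two stages in the text: the loop plays the role of the FPMAC unit producing a partial sum, and the final addition plays the role of the FP adder.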

    METHODS, SYSTEMS, ARTICLES OF MANUFACTURE, AND APPARATUS TO DECODE ZERO-VALUE-COMPRESSION DATA VECTORS

    Publication number: US20200228137A1

    Publication date: 2020-07-16

    Application number: US16832804

    Filing date: 2020-03-27

    Abstract: Methods, systems, articles of manufacture, and apparatus are disclosed to decode zero-value-compression data vectors. An example apparatus includes: a buffer monitor to monitor a buffer for a header including a value indicative of compressed data; a data controller to, when the buffer includes compressed data, determine a first value of a sparse select signal based on (1) a select signal and (2) a first position in a sparsity bitmap, the first value of the sparse select signal corresponding to a processing element that is to process a portion of the compressed data; and a write controller to, when the buffer includes compressed data, determine a second value of a write enable signal based on (1) the select signal and (2) a second position in the sparsity bitmap, the second value of the write enable signal corresponding to the processing element that is to process the portion of the compressed data.
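For readers unfamiliar with zero-value compression, here is a minimal sketch of the general encoding that such a decoder consumes. It assumes a sparsity bitmap with one bit per element (1 for a stored non-zero value, 0 for an omitted zero) followed by the packed non-zero values. It does not model the header, the select and write-enable signals, or the routing to processing elements described in the abstract, and the function names are hypothetical.

```python
def zvc_encode(values):
    """Return (bitmap, packed): bitmap marks non-zeros, packed holds them in order."""
    bitmap = [1 if v != 0 else 0 for v in values]
    packed = [v for v in values if v != 0]
    return bitmap, packed

def zvc_decode(bitmap, packed):
    """Expand packed non-zero values back to a dense vector using the bitmap."""
    if sum(bitmap) != len(packed):
        raise ValueError("bitmap and packed data disagree on the non-zero count")
    it = iter(packed)
    # Each set bit consumes the next packed value; each clear bit yields a zero.
    return [next(it) if bit else 0 for bit in bitmap]
```

The position of each set bit determines where a packed value lands in the dense vector, which is the same information a hardware decoder would use to choose a destination and enable a write.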

    LIGHTWEIGHT TRUSTED EXECUTION FOR INTERNET-OF-THINGS DEVICES

    Publication number: US20170372088A1

    Publication date: 2017-12-28

    Application number: US15190396

    Filing date: 2016-06-23

    Abstract: Lightweight trusted execution technologies for internet-of-things devices are described. In response to a memory request at a page unit from an application executing in a current domain, the page unit is to map a current virtual address (VA) to a current physical address (PA). The policy enforcement logic (PEL) reads, from a secure domain cache (SDC), a domain value (DID) and a VA value that correspond to the current PA. The PEL grants access when either (1) the current domain and the DID both correspond to the unprotected region, or (2) the current domain and the DID both correspond to the secure domain region, the current domain is equal to the DID, and the current VA is equal to the VA value. The PEL grants data access and denies code access when the current domain corresponds to the secure domain region and the DID corresponds to the unprotected region.
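The access rules in this abstract can be written as a small decision function. This is a sketch under stated assumptions, not the patented logic: domain 0 is taken to mean the unprotected region and any other value a secure domain, the SDC is modeled as a dictionary from PA to `(DID, VA)`, and a request from the unprotected region to a secure page is assumed to be denied because the abstract does not cover that case. All names are hypothetical.

```python
UNPROTECTED = 0  # assumed encoding: 0 = unprotected region, non-zero = secure domain

def pel_check(sdc, current_domain, current_va, current_pa, is_code):
    """Return True if the request is granted under the rules in the abstract.

    sdc maps a physical address to the (DID, VA) pair recorded for that page.
    is_code is True for a code (instruction) access, False for a data access.
    """
    did, va = sdc[current_pa]
    cur_secure = current_domain != UNPROTECTED
    page_secure = did != UNPROTECTED

    if not cur_secure and not page_secure:
        return True                      # unprotected code on an unprotected page
    if cur_secure and page_secure:
        # The same secure domain must own the page, at the recorded VA.
        return current_domain == did and current_va == va
    if cur_secure and not page_secure:
        return not is_code               # data access allowed, code access denied
    return False                         # assumption: unprotected -> secure is denied
```

With `sdc = {0x1000: (0, 0x4000), 0x2000: (1, 0x8000)}`, domain 1 may read data from the unprotected page at `0x1000` but may not execute from it, and domain 2 is refused at the page owned by domain 1.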
