Transposing Memory Layout of Weights in Deep Neural Networks (DNNs)

    Publication number: US20230229910A1

    Publication date: 2023-07-20

    Application number: US17937592

    Application date: 2022-10-03

    CPC classification number: G06N3/08 G06N3/0481 G06F13/28 G06F2213/28

    Abstract: A compute block includes a DMA engine that reads data from an external memory and writes the data into a local memory of the compute block. A MAC array in the compute block may use the data to perform convolutions. The external memory may store weights of one or more filters in a memory layout that comprises a sequence of sections for each filter. Each section may correspond to a channel of the filter and may store all the weights in the channel. The DMA engine may convert the memory layout to a different memory layout, which includes a sequence of new sections for each filter. Each new section may include a weight vector that includes a sequence of weights, each of which is from a different channel. The DMA engine may also compress the weights, e.g., by removing zero-valued weights, before the conversion of the memory layout.
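
The layout conversion described in this abstract can be illustrated with a minimal software sketch. It assumes the weights of each filter are stored as a flat list in channel-major order (one section per channel) and reorders them so that each new section holds one weight from every channel. The function names and the flat-list representation are hypothetical, chosen for illustration only; the patent describes a DMA engine performing this in hardware, and the optional zero-weight compression step is omitted here.

```python
def transpose_filter_layout(flat, num_channels, weights_per_channel):
    """Reorder one filter from channel-major to position-major order.

    Input:  [ch0 weights..., ch1 weights..., ...] (one section per channel)
    Output: [pos0 of every channel..., pos1 of every channel..., ...]
    """
    assert len(flat) == num_channels * weights_per_channel
    out = []
    for pos in range(weights_per_channel):
        # Each new section is a weight vector with one weight per channel.
        for ch in range(num_channels):
            out.append(flat[ch * weights_per_channel + pos])
    return out


def transpose_layout(flat, num_filters, num_channels, weights_per_channel):
    """Apply the per-filter reordering to a sequence of filters."""
    size = num_channels * weights_per_channel
    assert len(flat) == num_filters * size
    out = []
    for f in range(num_filters):
        one_filter = flat[f * size:(f + 1) * size]
        out.extend(
            transpose_filter_layout(one_filter, num_channels, weights_per_channel)
        )
    return out


# One filter, 3 channels, 2 weights per channel (hypothetical values).
original = [10, 11, 20, 21, 30, 31]
converted = transpose_layout(original, 1, 3, 2)
print(converted)
```

In the converted order, weights that share a kernel position but come from different channels sit next to each other, which is the arrangement the abstract attributes to the new sections.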

SYSTEMS, APPARATUS, AND METHODS TO DEBUG ACCELERATOR HARDWARE

    Publication number: US20240118992A1

    Publication date: 2024-04-11

    Application number: US18487490

    Application date: 2023-10-16

    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model and to compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, the data output, and the breakpoint for debugging the hardware accelerator.
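
The breakpoint behavior described in this abstract can be mimicked in software as a rough analogy. The sketch below assumes a model represented as an ordered list of named steps and a breakpoint identified by step name; when the breakpoint triggers, execution stops and the step's input, its output, and the breakpoint are reported. The function `run_with_breakpoint` and the dictionary fields are hypothetical and do not reflect the patented debug circuitry, which operates in hardware.

```python
def run_with_breakpoint(layers, data_input, breakpoint_name=None):
    """Run named steps in order, stopping after the step that matches the breakpoint.

    layers: list of (name, function) pairs, a stand-in for compiled model code.
    Returns a report with the breakpoint, the data input, and the data output.
    """
    value = data_input
    for name, fn in layers:
        layer_input = value
        value = fn(layer_input)
        if name == breakpoint_name:
            # Breakpoint triggered: stop execution and output debug data.
            return {
                "breakpoint": name,
                "data_input": layer_input,
                "data_output": value,
                "completed": False,
            }
    # No breakpoint hit: report the end-to-end input and output.
    return {
        "breakpoint": None,
        "data_input": data_input,
        "data_output": value,
        "completed": True,
    }


# Hypothetical three-step "model" used only to exercise the sketch.
model = [
    ("scale", lambda x: x * 2),
    ("shift", lambda x: x + 1),
    ("square", lambda x: x * x),
]
report = run_with_breakpoint(model, 3, breakpoint_name="shift")
print(report)
```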

    Pedestrian crossing and/or traffic light control method and apparatus

    Publication number: US10269239B2

    Publication date: 2019-04-23

    Application number: US15661801

    Application date: 2017-07-27

    Abstract: Apparatuses, methods and storage media associated with controlling a pedestrian crossing or traffic light are disclosed herein. In embodiments, an apparatus may include a control unit to extend a duration of a pedestrian crossing state of the pedestrian crossing or traffic light. The control unit does so when, while the pedestrian crossing or traffic light is in the pedestrian crossing state, it has received sensor data that convey detection of at least one commence-crossing event of one or more pedestrians, but has yet to receive sensor data that convey detection of all corresponding end-of-crossing events of the one or more pedestrians prior to expiration of the duration of the pedestrian crossing state. The control unit may extend the duration of the pedestrian crossing state until receipt of sensor data that convey detection of all corresponding end-of-crossing events of the one or more pedestrians, or until a timeout threshold is reached.
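
The extension rule in this abstract can be sketched as a small simulation. The sketch assumes that time is measured from the start of the pedestrian crossing state, that the timeout threshold is measured from the same origin and is at least the base duration, and that pedestrians who start crossing during an extension also count. The function `crossing_end_time` and the event tuples are hypothetical and are not taken from the patent.

```python
def crossing_end_time(base_duration, timeout, events):
    """Return the time at which the pedestrian crossing state ends.

    events: (time, kind, pedestrian_id) tuples, kind being "start" or "end".
    Assumption: times and the timeout are measured from the start of the state.
    """
    crossing = set()           # pedestrians who started but have not finished
    end_time = base_duration
    for time, kind, ped in sorted(events):
        if time > end_time:
            if not crossing:
                break          # nobody is mid-crossing at expiry: state ends
            if time >= timeout:
                return timeout  # timeout threshold caps the extension
            end_time = time    # state stays extended up to this event
        if kind == "start":
            crossing.add(ped)
        else:
            crossing.discard(ped)
    # If someone never finished, the extension runs until the timeout.
    return timeout if crossing else end_time


# Pedestrian "a" starts at t=2 and finishes at t=14, after the base 10 s.
print(crossing_end_time(10, 30, [(2, "start", "a"), (14, "end", "a")]))
```

With a base duration of 10 and a timeout of 30, the state in this example is held until the pedestrian finishes at time 14; a pedestrian who never finishes would hold it only until the timeout.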
