Image processing accelerator
    1.
    发明授权

    公开(公告)号:US12111778B2

    公开(公告)日:2024-10-08

    申请号:US17558252

    申请日:2021-12-21

    CPC classification number: G06F13/1668 G06F13/28 G06T1/20 H04N5/765

    Abstract: A processing accelerator includes a shared memory, and a stream accelerator, a memory-to-memory accelerator, and a common DMA controller coupled to the shared memory. The stream accelerator is configured to process a real-time data stream, and to store stream accelerator output data generated by processing the real-time data stream in the shared memory. The memory-to-memory accelerator is configured to retrieve input data from the shared memory, to process the input data, and to store, in the shared memory, memory-to-memory accelerator output data generated by processing the input data. The common DMA controller is configured to retrieve stream accelerator output data from the shared memory and transfer the stream accelerator output data to memory external to the processing accelerator; and to retrieve the memory-to-memory accelerator output data from the shared memory and transfer the memory-to-memory accelerator output data to memory external to the processing accelerator.

    NEURAL NETWORK PROCESSOR
    2.
    发明公开

    公开(公告)号:US20240104361A1

    公开(公告)日:2024-03-28

    申请号:US18355749

    申请日:2023-07-20

    CPC classification number: G06N3/063 G06N3/048

    Abstract: In one example, a neural network processor comprises a computing engine and a post-processing engine, the post-processing engine configurable to perform different post-processing operations for a range of output precisions and a range of weight precisions. The neural network processor further comprises a controller configured to: receive a first indication of a particular output precision, a second indication of the particular weight precision, and post-processing parameters; and configure the post-processing engine based on the first and second indications and the first and second post-processing parameters. The controller is further configured to, responsive to a first instruction, perform, using the computing engine, multiplication and accumulation operations between input data elements and weight elements to generate intermediate data elements. The controller is further configured to, responsive to a second instruction, perform, using the configured post-processing engine, post-processing operations on the intermediate data elements to generate output data elements.

    NEURAL NETWORK PROCESSOR
    3.
    发明公开

    公开(公告)号:US20240103875A1

    公开(公告)日:2024-03-28

    申请号:US18355689

    申请日:2023-07-20

    CPC classification number: G06F9/3814 G06F9/3004 G06N3/063

    Abstract: In one example, a neural network processor comprises a memory interface, an instruction buffer, a weights buffer, an input data register, a weights register, an output data register, a computing engine, and a controller. The controller is configured to: receive a first instruction from the instruction buffer; responsive to the first instruction, fetch input data elements from the memory interface to the input data register, and fetch weight elements from the weights buffer to the weights register. The controller is also configured to: receive a second instruction from the instruction buffer; and responsive to the second instruction: fetch the input data elements and the weight elements from, respectively, the input data register and the weights register to the computing engine; and perform, using the computing engine, computation operations between the input data elements and the weight elements to generate output data elements.

    Debug for multi-threaded processing

    公开(公告)号:US11789836B2

    公开(公告)日:2023-10-17

    申请号:US17462046

    申请日:2021-08-31

    Abstract: A system to implement debugging for a multi-threaded processor is provided. The system includes a hardware thread scheduler configured to schedule processing of data, and a plurality of schedulers, each configured to schedule a given pipeline for processing instructions. The system further includes a debug control configured to control at least one of the plurality of schedulers to halt, step, or resume the given pipeline of the at least one of the plurality of schedulers for the data to enable debugging thereof. The system further includes a plurality of hardware accelerators configured to implement a series of tasks in accordance with a schedule provided by a respective scheduler in accordance with a command from the debug control. Each of the plurality of hardware accelerators is coupled to at least one of the plurality of schedulers to execute the instructions for the given pipeline and to a shared memory.

    Bandwidth Controlled Data Synchronization for Image and Vision Processor

    公开(公告)号:US20230259402A1

    公开(公告)日:2023-08-17

    申请号:US18302945

    申请日:2023-04-19

    CPC classification number: G06F9/5044 G06F9/4887 G06F11/0757 G06F9/52

    Abstract: Systems include data processors to process a set of image data in parallel, and thread schedulers coupled to the data processors. Each of the thread schedulers provides a respective task start signal for a respective data processor. Such systems also include a bandwidth controller coupled to one or more data processors. The bandwidth controller is configured to, for each of the data processor(s): maintain a respective token count, and determine whether to stall or propagate the respective task start signal from the respective thread scheduler to the data processor based on the respective token count. Other aspects include pattern adaptors respectively provided in the schedulers to allow mixing of multiple data patterns across blocks of data, transaction aggregators that allow re-using the same image data by multiple threads of execution while the image data remains in a given data buffer, and timers to detect failure and hang events.

    Scheduling of external block based data processing tasks on a hardware thread scheduler

    公开(公告)号:US10908946B2

    公开(公告)日:2021-02-02

    申请号:US15396172

    申请日:2016-12-30

    Abstract: A data processing device is provided that includes a plurality of hardware data processing nodes, wherein each hardware data processing node performs a task, and a hardware thread scheduler including a plurality of hardware task schedulers configured to control execution of a respective task on a respective hardware data processing node of the plurality of hardware data processing nodes, and a proxy hardware task scheduler coupled to a data processing node external to the data processing device, wherein the proxy hardware task scheduler is configured to control execution of a task by the external data processing device, wherein the hardware thread scheduler is configurable to execute a thread of tasks, the tasks including the task controlled by the proxy hardware task scheduler and a first task controlled by a first hardware task scheduler of the plurality of hardware task schedulers.

    Hierarchical data organization for dense optical flow processing in a computer vision system

    公开(公告)号:US10824877B2

    公开(公告)日:2020-11-03

    申请号:US15638142

    申请日:2017-06-29

    Abstract: A computer vision system is provided that includes an image generation device configured to capture consecutive two dimensional (2D) images of a scene, a first memory configured to store the consecutive 2D images, a second memory configured to store a growing window of consecutive rows of a reference image and a growing window of consecutive rows of a current image, wherein the reference image and the current image are a pair of consecutive 2D images stored in the first memory, a third memory configured to store a sliding window of pixels fetched from the growing window of the reference image, wherein the pixels in the sliding window are stored in tiles, and a dense optical flow engine (DOFE) configured to determine a dense optical flow map for the pair of consecutive 2D images, wherein the DOFE uses the sliding window as a search window for pixel correspondence searches.

Patent Agency Ranking