Thread scheduling for multithreaded data processing environments

    Publication No.: US12164956B2

    Publication Date: 2024-12-10

    Application No.: US17349310

    Application Date: 2021-06-16

    Abstract: Methods, apparatus, systems and articles of manufacture (e.g., physical storage media) to implement thread scheduling for multithreaded data processing environments are disclosed. Example thread schedulers disclosed herein for a data processing system include a buffer manager to determine availability of respective buffers to be acquired for respective processing threads implementing respective functional nodes of a processing flow, and to identify first ones of the processing threads as stalled due to unavailability of at least one buffer in the respective buffers to be acquired for the first ones of the processing threads. Disclosed example thread schedulers also include a thread execution manager to initiate execution of second ones of the processing threads that are not identified as stalled.
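
    For orientation only, the sketch below models in plain C the buffer-gated scheduling pass the abstract describes: one pass over the functional-node threads marks a thread as stalled when its buffers cannot be acquired and initiates execution of the rest. The node names, buffer pool, and counts are illustrative assumptions, not the patented implementation.

        /* Illustrative sketch (assumed names and sizes, not the patented design):
         * one scheduling pass that marks threads as stalled when their buffers
         * cannot be acquired and executes only the threads that are not stalled. */
        #include <stdbool.h>
        #include <stdio.h>

        #define NUM_NODES 4

        typedef struct {
            const char *name;
            int needed_buffers;   /* buffers the node must acquire before running */
        } node_t;

        static int free_buffers = 3;  /* shared buffer pool */

        static bool try_acquire(int count) {
            if (free_buffers < count)
                return false;          /* would stall: leave the thread for later */
            free_buffers -= count;
            return true;
        }

        int main(void) {
            node_t nodes[NUM_NODES] = {
                {"capture", 1}, {"scale", 2}, {"filter", 2}, {"output", 1}
            };

            /* One pass: identify stalled nodes, initiate execution of the others. */
            for (int i = 0; i < NUM_NODES; i++) {
                if (!try_acquire(nodes[i].needed_buffers)) {
                    printf("%s: stalled (buffer unavailable)\n", nodes[i].name);
                    continue;
                }
                printf("%s: running\n", nodes[i].name);
                free_buffers += nodes[i].needed_buffers;  /* release on completion */
            }
            return 0;
        }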

    Low latency streaming remapping engine

    Publication No.: US12112399B2

    Publication Date: 2024-10-08

    Application No.: US17520861

    Application Date: 2021-11-08

    CPC classification number: G06T1/60 G06F9/5027 G06F9/5066 G06F9/544

    Abstract: A lens distortion correction function operates by backmapping output images to the uncorrected, distorted input images. As a vision image processor completes processing on the image data lines needed for the lens distortion correction function to operate on a group of output, undistorted image lines, the lens distortion correction function begins processing the image data. This improves image processing pipeline delays by overlapping the operations. The vision image processor provides output image data to a circular buffer in SRAM, rather than providing it to DRAM. The lens distortion correction function operates from the image data in the circular buffer. By operating from the SRAM circular buffer, access to the DRAM for the highly fragmented backmapping image data read operations is removed, improving available DRAM bandwidth. By using a circular buffer, less space is needed in the SRAM. The improved memory operations further improve the image processing pipeline delays.
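
    A rough software analogue of the SRAM circular-buffer arrangement (line width, buffer depth, and function names are invented for illustration): the producer writes finished input lines into a small ring, and the backmapping consumer can read a distorted input pixel only while the needed line is still inside the ring.

        /* Hedged sketch: a line-granular circular buffer standing in for the
         * SRAM window from which backmapping reads distorted input lines. */
        #include <stdio.h>
        #include <string.h>

        #define LINE_WIDTH 640
        #define BUF_LINES   32            /* ring holds only a window of lines */

        static unsigned char sram_buf[BUF_LINES][LINE_WIDTH];
        static int newest_line = -1;      /* most recent input line produced */

        /* Producer: vision processor writes one finished input line into the ring. */
        static void produce_line(int line, const unsigned char *data) {
            memcpy(sram_buf[line % BUF_LINES], data, LINE_WIDTH);
            newest_line = line;
        }

        /* Consumer: backmapping fetches an input pixel if its line is still buffered. */
        static int read_input_pixel(int in_line, int in_col, unsigned char *out) {
            if (in_line > newest_line || in_line <= newest_line - BUF_LINES)
                return -1;                /* not produced yet, or already overwritten */
            *out = sram_buf[in_line % BUF_LINES][in_col];
            return 0;
        }

        int main(void) {
            unsigned char line[LINE_WIDTH];
            for (int l = 0; l < 40; l++) {            /* produce 40 input lines */
                memset(line, l, sizeof line);
                produce_line(l, line);
            }
            unsigned char px;
            if (read_input_pixel(38, 100, &px) == 0)  /* recent line: still in window */
                printf("pixel = %d\n", px);
            if (read_input_pixel(2, 100, &px) != 0)   /* old line: overwritten */
                printf("line 2 no longer buffered\n");
            return 0;
        }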

    Image processing accelerator
    Invention Grant

    Publication No.: US12111778B2

    Publication Date: 2024-10-08

    Application No.: US17558252

    Application Date: 2021-12-21

    CPC classification number: G06F13/1668 G06F13/28 G06T1/20 H04N5/765

    Abstract: A processing accelerator includes a shared memory, and a stream accelerator, a memory-to-memory accelerator, and a common DMA controller coupled to the shared memory. The stream accelerator is configured to process a real-time data stream, and to store stream accelerator output data generated by processing the real-time data stream in the shared memory. The memory-to-memory accelerator is configured to retrieve input data from the shared memory, to process the input data, and to store, in the shared memory, memory-to-memory accelerator output data generated by processing the input data. The common DMA controller is configured to retrieve stream accelerator output data from the shared memory and transfer the stream accelerator output data to memory external to the processing accelerator; and to retrieve the memory-to-memory accelerator output data from the shared memory and transfer the memory-to-memory accelerator output data to memory external to the processing accelerator.
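
    The data flow can be pictured with a small software model in which arrays and functions stand in for the hardware blocks; all names and sizes below are assumptions, not the patented interfaces.

        /* Schematic sketch of the described data flow: stream accelerator and
         * memory-to-memory accelerator write shared memory; a common DMA drains
         * both output regions to memory external to the processing accelerator. */
        #include <stdio.h>

        #define N 8

        static int shared_mem_stream[N];   /* stream accelerator output region */
        static int shared_mem_m2m[N];      /* memory-to-memory accelerator output region */
        static int external_mem[2 * N];    /* external memory (e.g. DDR) */

        /* Stream accelerator: processes a real-time stream into shared memory. */
        static void stream_accel(const int *rt_stream) {
            for (int i = 0; i < N; i++)
                shared_mem_stream[i] = rt_stream[i] * 2;
        }

        /* Memory-to-memory accelerator: reads shared memory, writes shared memory. */
        static void m2m_accel(void) {
            for (int i = 0; i < N; i++)
                shared_mem_m2m[i] = shared_mem_stream[i] + 1;
        }

        /* Common DMA controller: transfers both output regions to external memory. */
        static void common_dma(void) {
            for (int i = 0; i < N; i++) {
                external_mem[i]     = shared_mem_stream[i];
                external_mem[N + i] = shared_mem_m2m[i];
            }
        }

        int main(void) {
            int rt_stream[N] = {1, 2, 3, 4, 5, 6, 7, 8};
            stream_accel(rt_stream);
            m2m_accel();
            common_dma();
            printf("external_mem[0]=%d external_mem[%d]=%d\n",
                   external_mem[0], N, external_mem[N]);
            return 0;
        }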

    Methods and apparatus for image frame freeze detection

    Publication No.: US11863713B2

    Publication Date: 2024-01-02

    Application No.: US16669138

    Application Date: 2019-10-30

    CPC classification number: H04N17/004 H04N5/144 H04N7/183

    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for image frame freeze detection. An example hardware accelerator includes a core logic circuit to generate second image data based on first image data associated with a first image frame, the second image data corresponding to at least one of processed image data, transformed image data, or one or more image data statistics, a load/store engine (LSE) coupled to the core logic circuit, the LSE to determine a first CRC value based on the second image data obtained from the core logic circuit, and a first interface coupled to a second interface, the second interface coupled to memory, the first interface to transmit the first CRC value obtained from the memory to a host device.
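
    As a hedged illustration of how per-frame CRCs can expose a frozen feed, the sketch below computes a CRC-32 for each simulated frame and flags a freeze once consecutive CRCs match a few times. The CRC routine, frame data, and threshold are generic stand-ins; in the patent the CRC is computed by the load/store engine and reported to a host device.

        /* Illustrative freeze detection: identical CRCs over several consecutive
         * frames suggest the image source has stopped updating. */
        #include <stdint.h>
        #include <stdio.h>
        #include <string.h>

        static uint32_t crc32_simple(const uint8_t *data, size_t len) {
            uint32_t crc = 0xFFFFFFFFu;
            for (size_t i = 0; i < len; i++) {
                crc ^= data[i];
                for (int b = 0; b < 8; b++)
                    crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
            }
            return ~crc;
        }

        #define FRAME_BYTES   64
        #define FREEZE_FRAMES  3   /* declare a freeze after this many repeats */

        int main(void) {
            uint8_t frame[FRAME_BYTES];
            uint32_t prev_crc = 0;
            int identical = 0;

            for (int f = 0; f < 6; f++) {
                /* Frames 0-1 change; frames 2-5 are identical (simulated freeze). */
                memset(frame, f < 2 ? f : 2, sizeof frame);

                uint32_t crc = crc32_simple(frame, sizeof frame);
                identical = (f > 0 && crc == prev_crc) ? identical + 1 : 0;
                prev_crc = crc;

                if (identical >= FREEZE_FRAMES)
                    printf("frame %d: freeze detected (CRC unchanged %d times)\n",
                           f, identical);
            }
            return 0;
        }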

    Data processing pipeline
    Invention Publication

    Publication No.: US20230385114A1

    Publication Date: 2023-11-30

    Application No.: US18175333

    Application Date: 2023-02-27

    CPC classification number: G06F9/5027 G06F9/4881 G06F9/544

    Abstract: A data processing device includes a plurality of hardware accelerators, a scheduler circuit, and a blocking circuit. The scheduler circuit is coupled to the plurality of hardware accelerators, and includes a plurality of hardware task schedulers. Each hardware task scheduler is coupled to a corresponding hardware accelerator, and is configured to control execution of a task by that hardware accelerator. The blocking circuit is coupled to the plurality of hardware accelerators and is configured to inhibit communication between a first hardware accelerator and a second hardware accelerator of the plurality of hardware accelerators.
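
    A minimal software model of the blocking idea (the table and accelerator indices are illustrative, not the hardware interface): a lookup decides whether a producer accelerator's completion signal reaches a consumer accelerator or is inhibited.

        /* Sketch: a blocking table consulted before delivering an
         * accelerator-to-accelerator notification. */
        #include <stdbool.h>
        #include <stdio.h>

        #define NUM_ACCEL 3

        /* block[p][c] == true means accelerator p may not notify accelerator c. */
        static bool block[NUM_ACCEL][NUM_ACCEL];

        static void notify(int producer, int consumer) {
            if (block[producer][consumer]) {
                printf("accel %d -> accel %d: blocked\n", producer, consumer);
                return;
            }
            printf("accel %d -> accel %d: task-complete signal delivered\n",
                   producer, consumer);
        }

        int main(void) {
            block[0][2] = true;   /* inhibit direct communication from 0 to 2 */

            notify(0, 1);         /* allowed */
            notify(1, 2);         /* allowed */
            notify(0, 2);         /* inhibited by the blocking table */
            return 0;
        }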

    Debug for multi-threaded processing

    Publication No.: US11789836B2

    Publication Date: 2023-10-17

    Application No.: US17462046

    Application Date: 2021-08-31

    Abstract: A system to implement debugging for a multi-threaded processor is provided. The system includes a hardware thread scheduler configured to schedule processing of data, and a plurality of schedulers, each configured to schedule a given pipeline for processing instructions. The system further includes a debug control configured to control at least one of the plurality of schedulers to halt, step, or resume the given pipeline of the at least one of the plurality of schedulers for the data to enable debugging thereof. The system further includes a plurality of hardware accelerators configured to implement a series of tasks in accordance with a schedule provided by a respective scheduler in accordance with a command from the debug control. Each of the plurality of hardware accelerators is coupled to at least one of the plurality of schedulers to execute the instructions for the given pipeline and to a shared memory.
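
    The halt/step/resume behavior can be sketched as a small state machine; the command strings and task names are invented for the example, and the patent realizes this in a hardware debug control block rather than software.

        /* Sketch: halting a pipeline scheduler, single-stepping one task while
         * halted, then resuming free-running execution. */
        #include <stdio.h>
        #include <string.h>

        typedef enum { RUNNING, HALTED } dbg_state_t;

        typedef struct {
            dbg_state_t state;
            int pc;                  /* index of the next task in the pipeline */
        } pipeline_t;

        static const char *tasks[] = {"scale", "ldc", "filter", "output"};
        #define NUM_TASKS 4

        static void run_one(pipeline_t *p) {
            if (p->pc < NUM_TASKS)
                printf("executing task %s\n", tasks[p->pc++]);
        }

        static void debug_cmd(pipeline_t *p, const char *cmd) {
            if (!strcmp(cmd, "halt"))        p->state = HALTED;
            else if (!strcmp(cmd, "resume")) p->state = RUNNING;
            else if (!strcmp(cmd, "step") && p->state == HALTED) run_one(p);
        }

        int main(void) {
            pipeline_t p = {RUNNING, 0};
            run_one(&p);                /* free-running */
            debug_cmd(&p, "halt");      /* debugger halts the pipeline */
            debug_cmd(&p, "step");      /* single-step one task while halted */
            debug_cmd(&p, "resume");    /* let the pipeline free-run again */
            while (p.state == RUNNING && p.pc < NUM_TASKS)
                run_one(&p);
            return 0;
        }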

    Bandwidth Controlled Data Synchronization for Image and Vision Processor

    Publication No.: US20230259402A1

    Publication Date: 2023-08-17

    Application No.: US18302945

    Application Date: 2023-04-19

    CPC classification number: G06F9/5044 G06F9/4887 G06F11/0757 G06F9/52

    Abstract: Systems include data processors to process a set of image data in parallel, and thread schedulers coupled to the data processors. Each of the thread schedulers provides a respective task start signal for a respective data processor. Such systems also include a bandwidth controller coupled to one or more data processors. The bandwidth controller is configured to, for each of the data processor(s): maintain a respective token count, and determine whether to stall or propagate the respective task start signal from the respective thread scheduler to the data processor based on the respective token count. Other aspects include pattern adaptors respectively provided in the schedulers to allow mixing of multiple data patterns across blocks of data, transaction aggregators that allow re-using the same image data by multiple threads of execution while the image data remains in a given data buffer, and timers to detect failure and hang events.
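
    The token-count decision can be sketched as a per-processor token bucket (the refill rate, cap, and one-token cost below are assumptions): a task start signal is propagated only when a token is available and is otherwise stalled until the next refill.

        /* Sketch: bandwidth control via a token count gating task-start signals. */
        #include <stdbool.h>
        #include <stdio.h>

        typedef struct {
            int tokens;
            int max_tokens;
            int refill_per_tick;
        } bw_ctrl_t;

        /* Called periodically: add tokens up to the cap. */
        static void bw_tick(bw_ctrl_t *c) {
            c->tokens += c->refill_per_tick;
            if (c->tokens > c->max_tokens)
                c->tokens = c->max_tokens;
        }

        /* Returns true if the task-start signal may be propagated now. */
        static bool bw_try_start(bw_ctrl_t *c) {
            if (c->tokens <= 0)
                return false;          /* stall the task-start signal */
            c->tokens--;
            return true;
        }

        int main(void) {
            bw_ctrl_t ctrl = { .tokens = 2, .max_tokens = 2, .refill_per_tick = 1 };

            for (int cycle = 0; cycle < 6; cycle++) {
                if (bw_try_start(&ctrl))
                    printf("cycle %d: task start propagated\n", cycle);
                else
                    printf("cycle %d: task start stalled\n", cycle);
                if (cycle % 2 == 1)
                    bw_tick(&ctrl);    /* refill every other cycle */
            }
            return 0;
        }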
