METHOD FOR PERFORMING RANDOM READ ACCESS TO A BLOCK OF DATA USING PARALLEL LUT READ INSTRUCTION IN VECTOR PROCESSORS

    公开(公告)号:US20210255869A1

    公开(公告)日:2021-08-19

    申请号:US17306350

    申请日:2021-05-03

    Abstract: This disclosure is directed to the problem of paralleling random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory to each plural parallel look up table and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. This vector destination register load can be repeated if the tables store more used data. New data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory. The look up table memory is stored in the directly addressable memory.

    Method for performing random read access to a block of data using parallel LUT read instruction in vector processors

    公开(公告)号:US10996955B2

    公开(公告)日:2021-05-04

    申请号:US16451330

    申请日:2019-06-25

    Abstract: This disclosure is directed to the problem of paralleling random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory to each plural parallel look up table and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. This vector destination register load can be repeated if the tables store more used data. New data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory. The look up table memory is stored in the directly addressable memory.

    Method for performing random read access to a block of data using parallel LUT read instruction in vector processors

    公开(公告)号:US10331347B2

    公开(公告)日:2019-06-25

    申请号:US15991653

    申请日:2018-05-29

    Abstract: This disclosure is directed to the problem of paralleling random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory to each plural parallel look up table and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. This vector destination register load can be repeated if the tables store more used data. New data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory. The look up table memory is stored in the directly addressable memory.

    Optical flow determination using pyramidal block matching

    公开(公告)号:US09681150B2

    公开(公告)日:2017-06-13

    申请号:US14737904

    申请日:2015-06-12

    CPC classification number: H04N19/53 H04N19/56

    Abstract: An image processing system includes a processor and optical flow determination logic. The optical flow determination logic is to quantify relative motion of a feature present in a first frame of video and a second frame of video with respect to the two frames of video. The optical flow determination logic configures the processor to convert each of the frames of video into a hierarchical image pyramid. The image pyramid comprises a plurality of image levels. Image resolution is reduced at each higher one of the image levels. For each image level and for each pixel in the first frame, the processor is configured to establish an initial estimate of a location of the pixel in the second frame and to apply a plurality of sequential searches, starting from the initial estimate, that establish refined estimates of the location of the pixel in the second frame.

    Optimized fast feature detection for vector processors

    公开(公告)号:US09652686B2

    公开(公告)日:2017-05-16

    申请号:US15345523

    申请日:2016-11-08

    Abstract: This invention enables effective corner detection of pixels of an image using the FAST algorithm using a vector SIMD processor. This invention loads an 8×8 pixel block that includes four 7×7 pixel blocks including the 16 peripheral pixels to be tested for each of four center pixels. This invention rearranges the 64 pixels of the 8×8 block to form a 16 element array for each center pixel preferably using a vector permutation instruction. This invention uses vector SIMD subtraction and compare and vector SIMD addition and compare to make the FAST algorithm comparisons. The N consecutive pixels determinations of the FAST algorithm are made from the results of plural shift and AND operations. The corresponding center pixel is marked a corner or not a corner dependent upon of the results of plural shift and AND operations.

    Processor Instructions for Accelerating Video Coding
    7.
    发明申请
    Processor Instructions for Accelerating Video Coding 审中-公开
    加速视频编码的处理器说明

    公开(公告)号:US20150296212A1

    公开(公告)日:2015-10-15

    申请号:US14684334

    申请日:2015-04-11

    Abstract: A control processor for a video encode-decode engine is provided that includes an instruction pipeline. The instruction pipeline includes an instruction fetch stage coupled to an instruction memory to fetch instructions, an instruction decoding stage coupled to the instruction fetch stage to receive the fetched instructions, and an execution stage coupled to the instruction decoding stage to receive and execute decoded instructions. The instruction decoding stage and the instruction execution stage are configured to decode and execute a set of instructions in an instruction set of the control processor that are designed specifically for accelerating video sequence encoding and encoded video bit stream decoding.

    Abstract translation: 提供了一种用于视频编码解码引擎的控制处理器,其包括指令流水线。 指令流水线包括与指令存储器耦合以取指令的指令提取级,耦合到指令提取级以接收所取指令的指令解码级,以及耦合到指令解码级的接收和执行解码指令的执行级。 指令解码级和指令执行级被配置为解码和执行专门用于加速视频序列编码和编码视频位流解码的控制处理器的指令集中的一组指令。

Patent Agency Ranking