Convolutional Neural Network On Programmable Two Dimensional Image Processor

    Publication No.: US20180005074A1

    Publication Date: 2018-01-04

    Application No.: US15201204

    Filing Date: 2016-07-01

    Applicant: Google Inc.

    Abstract: A method is described that includes executing a convolutional neural network layer on an image processor having an array of execution lanes and a two-dimensional shift register. The two-dimensional shift register provides local respective register space for the execution lanes. The executing of the convolutional neural network includes loading a plane of image data of a three-dimensional block of image data into the two-dimensional shift register. The executing of the convolutional neural network also includes performing a two-dimensional convolution of the plane of image data with an array of coefficient values by sequentially: concurrently multiplying within the execution lanes respective pixel and coefficient values to produce an array of partial products; concurrently summing within the execution lanes the partial products with respective accumulations of partial products being kept within the two dimensional register for different stencils within the image data; and, effecting alignment of values for the two-dimensional convolution within the execution lanes by shifting content within the two-dimensional shift register array.
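The multiply, accumulate, and shift sequence in this abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the patented implementation: the function names `shift` and `conv2d_shift_register` are hypothetical, each array position stands in for one execution lane, the shift wraps around at the edges, and every alignment is computed from the original plane, whereas real hardware would more likely shift its content incrementally.

```python
def shift(plane, dx, dy):
    """Model a 2D shift register move by (dx, dy), wrapping at the edges.

    After the shift, position (r, c) holds the value that was at
    (r - dy, c - dx). Wrap-around is an assumption of this sketch.
    """
    rows, cols = len(plane), len(plane[0])
    return [[plane[(r - dy) % rows][(c - dx) % cols] for c in range(cols)]
            for r in range(rows)]


def conv2d_shift_register(plane, coeffs):
    """Convolve one image plane with a k x k coefficient array.

    Each array position models one execution lane that keeps its own
    accumulation of partial products for the stencil anchored there.
    """
    rows, cols = len(plane), len(plane[0])
    k = len(coeffs)
    acc = [[0] * cols for _ in range(rows)]
    for i in range(k):
        for j in range(k):
            # Align: bring pixel (r + i, c + j) to lane (r, c) by shifting.
            aligned = shift(plane, -j, -i)
            for r in range(rows):
                for c in range(cols):
                    # All lanes multiply by the same coefficient, then
                    # add the partial product to their local accumulator.
                    acc[r][c] += coeffs[i][j] * aligned[r][c]
    # Keep only lanes whose stencil never wrapped around the edge.
    return [row[:cols - k + 1] for row in acc[:rows - k + 1]]
```

In this model the two inner loops over `r` and `c` stand for work that the lanes would perform concurrently; only the loop over coefficient positions is sequential.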

Multi-Functional Execution Lane for Image Processor

    Publication No.: US20170161064A1

    Publication Date: 2017-06-08

    Application No.: US14960334

    Filing Date: 2015-12-04

    Applicant: Google Inc.

    CPC classification number: G06F9/3001 G06F7/57 G06F9/30014 G06F15/80

    Abstract: An apparatus is described that includes an execution unit having a multiply add computation unit, a first ALU logic unit and a second ALU logic unit. The ALU unit is to perform first, second, third and fourth instructions. The first instruction is a multiply add instruction. The second instruction is to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction. The third instruction is to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction. The fourth instruction is to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second division resultant digit values.
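The four instruction types in this abstract can be mimicked in software to make the differences between them concrete. This is a rough sketch under assumptions: every function name here is hypothetical, the ALU supports only `add` and `sub`, and the divide uses plain restoring binary division with one quotient bit produced per ALU step. The abstract does not state which division algorithm or digit width the hardware uses.

```python
def alu(op, a, b):
    """A toy ALU with two operations (assumed set, for illustration)."""
    if op == "add":
        return a + b
    if op == "sub":
        return a - b
    raise ValueError(f"unsupported op: {op}")


def multiply_add(a, b, c):
    """First instruction type: multiply add."""
    return a * b + c


def parallel_alu(op1, a1, b1, op2, a2, b2):
    """Second type: both ALUs work side by side, giving two results."""
    return alu(op1, a1, b1), alu(op2, a2, b2)


def sequential_alu(op1, a, b, op2, c):
    """Third type: the second ALU consumes the first ALU's output."""
    return alu(op2, alu(op1, a, b), c)


def divide_step(remainder, dividend, divisor, bit):
    """One restoring-division step; returns (new remainder, quotient bit)."""
    remainder = (remainder << 1) | ((dividend >> bit) & 1)
    if remainder >= divisor:
        return remainder - divisor, 1
    return remainder, 0


def iterative_divide(dividend, divisor, bits=16):
    """Fourth type: each iteration yields two quotient digits,
    the first from one ALU step and the second from the other."""
    if divisor <= 0 or bits % 2 or not 0 <= dividend < (1 << bits):
        raise ValueError("need divisor > 0, even bits, dividend in range")
    quotient, remainder = 0, 0
    for bit in range(bits - 1, 0, -2):
        remainder, d1 = divide_step(remainder, dividend, divisor, bit)      # first ALU
        remainder, d2 = divide_step(remainder, dividend, divisor, bit - 1)  # second ALU
        quotient = (quotient << 2) | (d1 << 1) | d2
    return quotient, remainder
```

For example, `iterative_divide(100, 7)` returns the quotient and remainder as the pair `(14, 2)` after eight iterations of two steps each.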

Compiler Managed Memory for Image Processor

    Publication No.: US20170249717A1

    Publication Date: 2017-08-31

    Application No.: US15427374

    Filing Date: 2017-02-08

    Applicant: Google Inc.

    CPC classification number: G06T1/60 G06F9/3887 G06T1/20

    Abstract: A method is described. The method includes repeatedly loading a next sheet of image data from a first location of a memory into a two-dimensional shift register array. The memory is locally coupled to the two-dimensional shift register array and an execution lane array having a smaller dimension than the two-dimensional shift register array along at least one array axis. The loaded next sheet of image data stays within an image area of the two-dimensional shift register array. The method also includes repeatedly determining output values for the next sheet of image data through execution of program code instructions along respective lanes of the execution lane array, wherein a stencil size used in determining the output values encompasses only pixels that reside within the two-dimensional shift register array. The method also includes repeatedly moving a next sheet of image data to be fully loaded into the two-dimensional shift register array from a second location of the memory to the first location of the memory.

    Shift Register With Reduced Wiring Complexity

    Publication No.: US20170163931A1

    Publication Date: 2017-06-08

    Application No.: US15352260

    Filing Date: 2016-11-15

    Applicant: Google Inc.

    Abstract: A shift register is described. The shift register includes a plurality of cells and register space. The shift register includes circuitry having inputs to receive shifted data and outputs to transmit shifted data, wherein: i) circuitry of cells physically located between first and second logically ordered cells is configured to not perform any logical shift; ii) circuitry of cells coupled to receive shifted data transmitted by an immediately preceding logically ordered cell comprises circuitry for writing into local register space data received at an input assigned an amount of shift specified in a shift command being executed by the shift register; and iii) circuitry of cells coupled to transmit shifted data to an immediately following logically ordered cell comprises circuitry to transmit data from an output assigned an incremented shift amount from a shift amount of an input that the data was received on.
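Items ii) and iii) of this abstract describe data hopping from cell to cell, with each hop moving to a port labeled with the next higher shift amount until the label matches the commanded shift. A minimal model of that idea follows. It is a sketch under assumptions, not the circuit itself: the name `shift_ring` is hypothetical, the cells form a one-dimensional ring that shifts in a single direction, and item i) (physically intermediate cells that perform no logical shift) is not modeled.

```python
def shift_ring(values, amount):
    """Shift a logical ring of cells using only neighbor-to-neighbor links.

    values[i] is the content of logical cell i. Data travels one cell per
    hop; the port label `hop` counts how far it has moved so far.
    """
    n = len(values)
    if not 0 <= amount < n:
        raise ValueError("amount must be between 0 and len(values) - 1")
    registers = list(values)
    in_flight = list(values)  # what each cell currently drives on its output
    for hop in range(1, amount + 1):
        # Cell i receives, on its input labeled `hop`, the data sent by
        # the immediately preceding logical cell.
        received = [in_flight[(i - 1) % n] for i in range(n)]
        if hop == amount:
            # Input label equals the commanded shift: write to local register.
            registers = received
        else:
            # Otherwise forward on the output labeled hop + 1.
            in_flight = received
    return registers
```

With this wiring each cell needs links only to its logical neighbors, however large the commanded shift is. For example, `shift_ring([1, 2, 3, 4, 5], 2)` yields `[4, 5, 1, 2, 3]`.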
