IMAGE PROCESSOR WITH CONFIGURABLE NUMBER OF ACTIVE CORES AND SUPPORTING INTERNAL NETWORK

    Publication number: US20180329864A1

    Publication date: 2018-11-15

    Application number: US15594502

    Filing date: 2017-05-12

    Applicant: Google Inc.

    Abstract: A method is described. The method includes configuring a first instance of object code to execute on a processor. The processor has multiple cores and an internal network. The internal network is configured in a first configuration that enables a first number of the cores to be communicatively coupled. The method also includes configuring a second instance of the object code to execute on a second instance of the processor. A respective internal network of the second instance of the processor is configured in a second configuration that enables a different number of cores to be communicatively coupled, wherein, same positioned cores on the processor and the second instance of the processor have same network addresses for the first and second configurations. A processor is also described having an internal network designed to enable the above method.

    MULTI-FUNCTIONAL EXECUTION LANE FOR IMAGE PROCESSOR

    Publication number: US20170161064A1

    Publication date: 2017-06-08

    Application number: US14960334

    Filing date: 2015-12-04

    Applicant: Google Inc.

    CPC classification number: G06F9/3001 G06F7/57 G06F9/30014 G06F15/80

    Abstract: An apparatus is described that includes an execution unit having a multiply add computation unit, a first ALU logic unit and a second ALU logic unit. The ALU unit is to perform first, second, third and fourth instructions. The first instruction is a multiply add instruction. The second instruction is to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction. The third instruction is to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction. The fourth instruction is to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate during an iteration to determine first and second division resultant digit values.
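
    Illustration: The four instruction types in the abstract above can be modeled loosely in software. The following is a minimal sketch, not the patented circuit: the function names, the 16-bit operand width, and the choice of restoring division that yields two quotient bits per loop pass are all assumptions made for illustration.

```python
import operator

def multiply_add(a, b, c):
    # First instruction type: a fused multiply add.
    return a * b + c

def parallel_alu(op1, op2, a, b, c, d):
    # Second instruction type: both ALUs work at once on separate
    # operand pairs and produce two independent results.
    return op1(a, b), op2(c, d)

def sequential_alu(op1, op2, a, b, c):
    # Third instruction type: the second ALU consumes the first ALU's output.
    return op2(op1(a, b), c)

def iterative_divide(dividend, divisor, bits=16):
    # Fourth instruction type (assumed form): unsigned restoring division
    # in which each loop pass yields two quotient bits, one per modeled ALU.
    # `bits` must be even and dividend must fit in `bits` bits.
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    quotient, remainder = 0, 0
    for i in range(bits - 1, -1, -2):
        for bit in (i, i - 1):  # first ALU handles bit i, second handles bit i-1
            remainder = (remainder << 1) | ((dividend >> bit) & 1)
            digit = 1 if remainder >= divisor else 0
            if digit:
                remainder -= divisor
            quotient = (quotient << 1) | digit
    return quotient, remainder

if __name__ == "__main__":
    print(multiply_add(2, 3, 4))
    print(parallel_alu(operator.add, operator.sub, 5, 3, 10, 4))
    print(sequential_alu(operator.add, operator.mul, 2, 3, 4))
    print(iterative_divide(100, 7))
```

    In hardware the two digit computations of one pass would be carried out by the two ALU logic units; the inner Python loop only stands in for that pairing.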

    Convolutional Neural Network On Programmable Two Dimensional Image Processor

    Publication number: US20180005074A1

    Publication date: 2018-01-04

    Application number: US15201204

    Filing date: 2016-07-01

    Applicant: Google Inc.

    Abstract: A method is described that includes executing a convolutional neural network layer on an image processor having an array of execution lanes and a two-dimensional shift register. The two-dimensional shift register provides local respective register space for the execution lanes. The executing of the convolutional neural network includes loading a plane of image data of a three-dimensional block of image data into the two-dimensional shift register. The executing of the convolutional neural network also includes performing a two-dimensional convolution of the plane of image data with an array of coefficient values by sequentially: concurrently multiplying within the execution lanes respective pixel and coefficient values to produce an array of partial products; concurrently summing within the execution lanes the partial products with respective accumulations of partial products being kept within the two dimensional register for different stencils within the image data; and, effecting alignment of values for the two-dimensional convolution within the execution lanes by shifting content within the two-dimensional shift register array.
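
    Illustration: The shift, multiply, accumulate pattern in the abstract above can be sketched with NumPy, where each array element stands for one execution lane and its local register. This is a minimal sketch under assumptions: `shift_register_conv2d` is a hypothetical name, `np.roll` wraps around at the edges (a real shift register would more likely use a halo region), the kernel is applied without flipping (cross-correlation), and only a single plane of the three-dimensional block is processed.

```python
import numpy as np

def shift_register_conv2d(plane, coeffs):
    """Apply `coeffs` to `plane` using whole-array shifts and lane-wise multiply-adds."""
    kh, kw = coeffs.shape
    # One accumulator per lane; each holds the running sum for that lane's stencil.
    acc = np.zeros_like(plane, dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            # Shift the whole plane so that lane (y, x) sees pixel (y + dy, x + dx).
            shifted = np.roll(plane, shift=(-dy, -dx), axis=(0, 1))
            # All lanes multiply by the same coefficient and accumulate concurrently.
            acc += coeffs[dy, dx] * shifted
    return acc

if __name__ == "__main__":
    plane = np.arange(25, dtype=np.float64).reshape(5, 5)
    coeffs = np.ones((3, 3))
    print(shift_register_conv2d(plane, coeffs))
```

    Lanes whose stencil reaches past the array edge see wrapped values in this sketch, so only results whose stencil stays inside the plane correspond to an ordinary convolution output.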

    Compiler Techniques for Mapping Program Code to a High Performance, Power Efficient, Programmable Image Processing Hardware Platform

    Publication number: US20170249716A1

    Publication date: 2017-08-31

    Application number: US15389113

    Filing date: 2016-12-22

    Applicant: Google Inc.

    CPC classification number: G06T1/20 G06F8/447 G06F9/5077

    Abstract: A method is described. The method includes compiling program code targeted for an image processor having programmable stencil processors composed of respective two-dimensional execution lane and shift register circuit structures. The program code is to implement a directed acyclic graph and is composed of multiple kernels that are to execute on respective ones of the stencil processors, wherein the compiling includes any of: recognizing there are a different number of kernels in the program code than stencil processors in the image processor; recognizing that at least one of the kernels is more computationally intensive than another one of the kernels; and, recognizing that the program code has resource requirements that exceed the image processor's memory capacity. The compiling further includes in response to any of the recognizing above performing any of: horizontal fusion of kernels; vertical fusion of kernels; fission of one of the kernels into multiple kernels; spatial partitioning of a kernel into multiple spatially partitioned kernels; splitting the directed acyclic graph into smaller graphs.

    Statistics Operations On Two Dimensional Image Processor

    Publication number: US20180005059A1

    Publication date: 2018-01-04

    Application number: US15201134

    Filing date: 2016-07-01

    Applicant: Google Inc.

    CPC classification number: G06K9/00986 G06T1/20 G11C19/00

    Abstract: A method is described that includes loading an array of content into a two-dimensional shift register. The two-dimensional shift register is coupled to an execution lane array. The method includes repeatedly performing a first sequence that includes: shifting with the shift register first content residing along a particular row or column into another parallel row or column where second content resides and performing mathematical operations with a particular corresponding row or column of the execution lane array on the first and second content. The method also includes repeatedly performing a second sequence that includes: shifting with the shift register content from a set of first locations along a resultant row or column that is parallel with the rows or columns of the first sequence into a corresponding set of second locations along the resultant row or column. The resultant row or column has values determined at least in part from the mathematical operations of the first sequence. The second sequence further includes performing mathematical operations on items of content from the set of first locations and respective items of content from the set of second locations with the execution lane array.
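
    Illustration: One way to read the two sequences in the abstract above is as a two-phase reduction: rows are first folded into a single resultant row, and that row is then reduced by shifting it against itself. The sketch below assumes the mathematical operation is addition and that the second sequence uses doubling shift distances with zero fill; neither detail is stated in the abstract, and `shift_reduce_sum` is a hypothetical name.

```python
import numpy as np

def shift_reduce_sum(array):
    """Sum all elements using row shifts, then shifts within the resultant row."""
    reg = np.array(array, dtype=np.float64)  # stands in for the 2D shift register
    rows, cols = reg.shape

    # First sequence: shift each row into the next parallel row and add there.
    for r in range(rows - 1):
        reg[r + 1] += reg[r]
    result_row = reg[rows - 1].copy()  # now holds the per-column sums

    # Second sequence: shift content along the resultant row and add it to
    # the content already at the destination, doubling the distance each pass.
    step = 1
    while step < cols:
        shifted = np.zeros_like(result_row)
        shifted[:-step] = result_row[step:]  # shift left by `step`, zero fill
        result_row = result_row + shifted
        step *= 2
    return result_row[0]  # leftmost location ends up with the grand total

if __name__ == "__main__":
    print(shift_reduce_sum(np.arange(12).reshape(3, 4)))
```

    The second phase needs only about log2(cols) passes because every pass doubles the number of original values folded into the leftmost location.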
