PROGRAM CODE TRANSFORMATIONS TO IMPROVE IMAGE PROCESSOR RUNTIME EFFICIENCY

    Publication No.: US20180329745A1

    Publication date: 2018-11-15

    Application No.: US15594517

    Filing date: 2017-05-12

    Applicant: Google Inc.

    CPC classification number: G06F9/5005 G06F9/4881 G06T1/20 G06T1/60

    Abstract: A method is described. The method includes constructing an image processing software data flow in which a buffer stores and forwards image data being transferred from a producing kernel to one or more consuming kernels. The method also includes recognizing that the buffer has insufficient resources to store and forward the image data. The method also includes modifying the image processing software data flow to include multiple buffers that store and forward the image data during the transfer of the image data from the producing kernel to the one or more consuming kernels.
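The buffer-splitting idea in this abstract can be sketched as follows. The abstract does not say what resource runs short or how the replacement buffers are arranged, so this sketch assumes the limit is the number of consumers one buffer can serve; `Buffer`, `split_buffer`, and `max_consumers_per_buffer` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Buffer:
    """A store-and-forward buffer between one producer and some consumers."""
    name: str
    consumers: List[str]


def split_buffer(producer: str, consumers: List[str],
                 max_consumers_per_buffer: int) -> List[Buffer]:
    """Return the buffers needed to feed every consumer of `producer`.

    Assumption (not from the abstract): a buffer has "insufficient
    resources" when it would serve more consumers than it can track.
    """
    if max_consumers_per_buffer < 1:
        raise ValueError("max_consumers_per_buffer must be at least 1")

    buffers = []
    # Walk the consumer list in chunks; each chunk gets its own buffer.
    for start in range(0, len(consumers), max_consumers_per_buffer):
        chunk = consumers[start:start + max_consumers_per_buffer]
        buffers.append(Buffer(f"{producer}_buf{len(buffers)}", chunk))
    return buffers


if __name__ == "__main__":
    for buf in split_buffer("K1", ["K2", "K3", "K4"], 2):
        print(buf.name, "->", buf.consumers)
```

When the single buffer is sufficient, the function returns exactly one buffer, so the unmodified data flow is just the special case of one chunk.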

    COMPILER MANAGED MEMORY FOR IMAGE PROCESSOR

    Publication No.: US20170249717A1

    Publication date: 2017-08-31

    Application No.: US15427374

    Filing date: 2017-02-08

    Applicant: Google Inc.

    CPC classification number: G06T1/60 G06F9/3887 G06T1/20

    Abstract: A method is described. The method includes repeatedly loading a next sheet of image data from a first location of a memory into a two-dimensional shift register array. The memory is locally coupled to the two-dimensional shift register array and an execution lane array having a smaller dimension than the two-dimensional shift register array along at least one array axis. The loaded next sheet of image data keeps within an image area of the two-dimensional shift register array. The method also includes repeatedly determining output values for the next sheet of image data through execution of program code instructions along respective lanes of the execution lane array, wherein a stencil size used in determining the output values encompasses only pixels that reside within the two-dimensional shift register array. The method also includes repeatedly moving a next sheet of image data to be fully loaded into the two-dimensional shift register array from a second location of the memory to the first location of the memory.
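The relationship between the larger shift register array and the smaller execution lane array can be illustrated with a small software emulation. This is only a sketch under stated assumptions: the stencil is a square box sum, the register array is modeled as a list of lists, and `process_sheet`, `lanes_h`, `lanes_w`, and `radius` are hypothetical names that do not appear in the abstract. The memory-to-memory sheet movement that the abstract also describes is not modeled.

```python
def process_sheet(registers, lanes_h, lanes_w, radius=1):
    """Box-sum stencil over one sheet held in a 2D register array.

    The register array must be larger than the lane array by the stencil
    border, so that every pixel a lane reads resides inside the registers.
    """
    reg_h, reg_w = len(registers), len(registers[0])
    if reg_h < lanes_h + 2 * radius or reg_w < lanes_w + 2 * radius:
        raise ValueError("stencil would reach outside the register array")

    out = [[0] * lanes_w for _ in range(lanes_h)]
    # Each (dy, dx) pair stands for one shift of the register array; after
    # the shift, every lane accumulates the value positioned beneath it.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            for r in range(lanes_h):
                for c in range(lanes_w):
                    out[r][c] += registers[r + dy][c + dx]
    return out


if __name__ == "__main__":
    sheet = [[row * 4 + col for col in range(4)] for row in range(4)]
    print(process_sheet(sheet, lanes_h=2, lanes_w=2))
```

With a 4x4 register array and a 2x2 lane array, a 3x3 stencil fits exactly; asking for a wider stencil raises an error rather than reading outside the array.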

    Compiler Techniques for Mapping Program Code to a High Performance, Power Efficient, Programmable Image Processing Hardware Platform

    Publication No.: US20170249716A1

    Publication date: 2017-08-31

    Application No.: US15389113

    Filing date: 2016-12-22

    Applicant: Google Inc.

    CPC classification number: G06T1/20 G06F8/447 G06F9/5077

    Abstract: A method is described. The method includes compiling program code targeted for an image processor having programmable stencil processors composed of respective two-dimensional execution lane and shift register circuit structures. The program code is to implement a directed acyclic graph and is composed of multiple kernels that are to execute on respective ones of the stencil processors, wherein the compiling includes any of: recognizing there are a different number of kernels in the program code than stencil processors in the image processor; recognizing that at least one of the kernels is more computationally intensive than another one of the kernels; and recognizing that the program code has resource requirements that exceed the image processor's memory capacity. The compiling further includes, in response to any of the recognizing above, performing any of: horizontal fusion of kernels; vertical fusion of kernels; fission of one of the kernels into multiple kernels; spatial partitioning of a kernel into multiple spatially partitioned kernels; and splitting the directed acyclic graph into smaller graphs.
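One of the listed transformations, vertical fusion, can be sketched for the case where a program has more kernels than there are stencil processors. The abstract does not tie any particular transformation to any particular condition, so the pairing used here is an assumption, as is the restriction to a linear pipeline; `fuse_chain` and `num_processors` are hypothetical names, and kernels are modeled as plain Python functions.

```python
def fuse_chain(kernels, num_processors):
    """Vertically fuse adjacent kernels until they fit the processor count.

    `kernels` is a list of (name, function) pairs in pipeline order, where
    each function consumes the previous function's output. Fusing the last
    two kernels first is an arbitrary choice made for this sketch.
    """
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")

    fused = list(kernels)
    while len(fused) > num_processors:
        (name_a, fn_a), (name_b, fn_b) = fused[-2], fused[-1]

        # Bind the current functions as defaults so later loop iterations
        # cannot change what this fused kernel calls.
        def composed(x, fn_a=fn_a, fn_b=fn_b):
            return fn_b(fn_a(x))

        fused[-2:] = [(f"{name_a}+{name_b}", composed)]
    return fused


if __name__ == "__main__":
    pipeline = [("a", lambda x: x + 1),
                ("b", lambda x: x * 2),
                ("c", lambda x: x - 3)]
    for name, _ in fuse_chain(pipeline, num_processors=2):
        print(name)
```

The fused pipeline computes the same result as the original one; only the number of kernels that need a processor changes.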

    CONFIGURATION OF APPLICATION SOFTWARE ON MULTI-CORE IMAGE PROCESSOR

    Publication No.: US20180329746A1

    Publication date: 2018-11-15

    Application No.: US15594529

    Filing date: 2017-05-12

    Applicant: Google Inc.

    Abstract: A method is described. The method includes calculating data transfer metrics for kernel-to-kernel connections of a program having a plurality of kernels that is to execute on an image processor. The image processor includes a plurality of processing cores and a network connecting the plurality of processing cores. Each of the kernel-to-kernel connections includes a producing kernel that is to execute on one of the processing cores and a consuming kernel that is to execute on another one of the processing cores. The consuming kernel is to operate on data generated by the producing kernel. The method also includes assigning kernels of the plurality of kernels to respective ones of the processing cores based on the calculated data transfer metrics.
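A metric-driven assignment of this kind can be sketched with a brute-force search. The abstract does not define the data transfer metric or the network topology, so this sketch assumes a ring network and a cost equal to data volume multiplied by hop count; `ring_hops`, `assign_kernels`, and the `edges` dictionary are hypothetical names introduced for illustration. Exhaustive search is only practical for small programs.

```python
from itertools import permutations


def ring_hops(core_a, core_b, num_cores):
    """Shortest hop count between two cores on a ring network."""
    distance = abs(core_a - core_b)
    return min(distance, num_cores - distance)


def assign_kernels(kernels, edges, num_cores):
    """Place each kernel on its own core so total transfer cost is lowest.

    `edges` maps (producer, consumer) to a data volume. The cost of a
    placement is the sum of volume * hops over all connections, which is
    an assumed metric rather than the one used in the patent.
    """
    if len(kernels) > num_cores:
        raise ValueError("this sketch needs one core per kernel")

    best_mapping, best_cost = None, None
    # Try every way of giving the kernels distinct cores.
    for cores in permutations(range(num_cores), len(kernels)):
        mapping = dict(zip(kernels, cores))
        cost = sum(volume * ring_hops(mapping[p], mapping[c], num_cores)
                   for (p, c), volume in edges.items())
        if best_cost is None or cost < best_cost:
            best_mapping, best_cost = mapping, cost
    return best_mapping, best_cost


if __name__ == "__main__":
    edges = {("A", "B"): 10, ("B", "C"): 1, ("A", "C"): 5}
    mapping, cost = assign_kernels(["A", "B", "C"], edges, num_cores=4)
    print(mapping, cost)
```

In the usage example the search places the lightest connection on the longest path, so the heavily used connections travel a single hop.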
