-
Publication No.: US20210042875A1
Publication Date: 2021-02-11
Application No.: US16976316
Filing Date: 2019-02-21
Applicant: GOOGLE LLC
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for supporting large lookup tables on an image processor. One of the methods includes receiving an input kernel program for an image processor having a two-dimensional array of execution lanes, a shift-register array, and a plurality of memory banks. If the kernel program has an instruction that reads a lookup table value for a lookup table partitioned across the plurality of memory banks, the instruction in the kernel program is replaced with a sequence of instructions that, when executed by an execution lane, causes the execution lane to read a first value from a local memory bank and a second value from the local memory bank on behalf of another execution lane belonging to a different group of execution lanes.
-
Publication No.: US10754654B2
Publication Date: 2020-08-25
Application No.: US16368288
Filing Date: 2019-03-28
Applicant: Google LLC
Inventors: Albert Meixner, Jason Rupert Redgrave, Ofer Shacham, Daniel Frederic Finchelstein, Qiuling Zhu
Abstract: An apparatus that includes a program controller to fetch and issue instructions is described. The apparatus includes an execution lane having at least one execution unit to execute the instructions. The execution lane is part of an execution lane array that is coupled to a two-dimensional shift register array structure, wherein execution lanes of the execution lane array are located at respective array locations and are coupled to dedicated registers at same respective array locations in the two-dimensional shift register array.
-
Publication No.: US20200258190A1
Publication Date: 2020-08-13
Application No.: US16779257
Filing Date: 2020-01-31
Applicant: Google LLC
Inventor: Albert Meixner
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for supporting complex transfer functions on an image processor. One of the methods includes traversing, by each execution lane of an image processor using a shift-register array, a respective local support region and storing input pixels encountered during the traversal into local memory of the image processor. Each execution lane obtains from the local memory of the image processor one or more input pixels according to a complex transfer function. Each execution lane computes a respective output pixel for the kernel program using one or more input pixels obtained from the local memory according to the complex transfer function.
-
Publication No.: US10685422B2
Publication Date: 2020-06-16
Application No.: US16272819
Filing Date: 2019-02-11
Applicant: Google LLC
Abstract: A method is described. The method includes repeatedly loading a next sheet of image data from a first location of a memory into a two-dimensional shift register array. The memory is locally coupled to the two-dimensional shift register array and an execution lane array having a smaller dimension than the two-dimensional shift register array along at least one array axis. The loaded next sheet of image data keeps within an image area of the two-dimensional shift register array. The method also includes repeatedly determining output values for the next sheet of image data through execution of program code instructions along respective lanes of the execution lane array, wherein a stencil size used in determining the output values encompasses only pixels that reside within the two-dimensional shift register array.
-
Publication No.: US20200154072A1
Publication Date: 2020-05-14
Application No.: US16735050
Filing Date: 2020-01-06
Applicant: Google LLC
Inventors: Albert Meixner, Daniel Frederic Finchelstein, David Patterson, William R. Mark, Jason Rupert Redgrave, Ofer Shacham
Abstract: A method is described that includes, on an image processor having a two-dimensional execution lane array and a two-dimensional shift register array, repeatedly shifting first content of multiple rows or columns of the two-dimensional shift register array and repeatedly executing at least one instruction between shifts that operates on the shifted first content and/or second content that is resident in respective locations of the two-dimensional shift register array that the shifted first content has been shifted into.
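The shift-then-execute pattern in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the shift register array is modeled as a list of lists, every array location stands in for one execution lane with its own accumulator, and the function names `shift_left` and `horizontal_window_sum` as well as the zero fill at the array edge are hypothetical choices made for the illustration.

```python
def shift_left(plane, fill=0):
    """Shift every row one position to the left; the vacated column gets `fill`."""
    return [row[1:] + [fill] for row in plane]


def horizontal_window_sum(image, width):
    """Sum `width` horizontally adjacent pixels at every array location.

    All lanes execute the same add instruction between successive shifts
    of the whole array, so no lane ever addresses a neighbor directly.
    """
    shifted = [list(row) for row in image]      # the "first content" being shifted
    acc = [[0] * len(row) for row in image]     # per-lane resident accumulator
    for step in range(width):
        # Instruction executed between shifts: add the value now in place.
        acc = [
            [a + s for a, s in zip(acc_row, shifted_row)]
            for acc_row, shifted_row in zip(acc, shifted)
        ]
        if step < width - 1:
            shifted = shift_left(shifted)
    return acc


result = horizontal_window_sum([[1, 2, 3, 4], [5, 6, 7, 8]], width=2)
print(result)
```

Each lane ends up with the sum of its own pixel and the pixel to its right, with the assumed zero fill contributing at the right edge.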
-
Publication No.: US20200120287A1
Publication Date: 2020-04-16
Application No.: US16659702
Filing Date: 2019-10-22
Applicant: Google LLC
Inventors: Qiuling Zhu, Ofer Shacham, Jason Rupert Redgrave, Daniel Frederic Finchelstein, Albert Meixner
Abstract: In a general aspect, an apparatus can include image processing logic (IPL) configured to perform an image processing operation on pixel data corresponding with an image having a width of W pixels and a height of H pixels to produce output pixel data in vertical slices of K pixels using K vertically overlapping stencils of S×S pixels, K being greater than 1 and less than H, S being greater than or equal to 2, and W being greater than S. The apparatus can also include a linebuffer operationally coupled with the IPL, the linebuffer configured to buffer the pixel data for the IPL. The linebuffer can include a full-size buffer having a width of W and a height of (S−1). The linebuffer can also include a sliding buffer having a width of SB and a height of K, SB being greater than or equal to S and less than W.
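The dimensional constraints and buffer sizes stated in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `linebuffer_pixels` and the sample values are hypothetical, and sizes are counted in pixels because the abstract does not specify a pixel bit depth.

```python
def linebuffer_pixels(W, H, K, S, SB):
    """Return (full_size_pixels, sliding_pixels) for the two buffers.

    Enforces the constraints given in the abstract:
    1 < K < H, S >= 2, W > S, and S <= SB < W.
    """
    if not (1 < K < H):
        raise ValueError("K must be greater than 1 and less than H")
    if S < 2:
        raise ValueError("S must be greater than or equal to 2")
    if W <= S:
        raise ValueError("W must be greater than S")
    if not (S <= SB < W):
        raise ValueError("SB must be at least S and less than W")
    full_size = W * (S - 1)   # full-size buffer: width W, height S - 1
    sliding = SB * K          # sliding buffer: width SB, height K
    return full_size, sliding


# Hypothetical example: a 1920x1080 image, 3x3 stencils, slices of 4 pixels.
full_size, sliding = linebuffer_pixels(W=1920, H=1080, K=4, S=3, SB=8)
print(full_size, sliding)
```

With these sample values the full-size buffer holds 3840 pixels and the sliding buffer holds 32.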
-
Publication No.: US20200050488A1
Publication Date: 2020-02-13
Application No.: US16658989
Filing Date: 2019-10-21
Applicant: Google LLC
Inventors: Hyunchul Park, Albert Meixner
Abstract: A method is described. The method includes constructing an image processing software data flow in which a buffer stores and forwards image data being transferred from a producing kernel to one or more consuming kernels. The method also includes recognizing that the buffer has insufficient resources to store and forward the image data. The method also includes modifying the image processing software data flow to include multiple buffers that store and forward the image data during the transfer of the image data from the producing kernel to the one or more consuming kernels.
-
Publication No.: US20200020069A1
Publication Date: 2020-01-16
Application No.: US16529633
Filing Date: 2019-08-01
Applicant: Google LLC
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for restructuring an image processing pipeline. The method includes compiling program code targeted for an image processor having programmable stencil processors composed of respective two-dimensional execution lane and shift register circuit structures. The program code is to implement a directed acyclic graph and is composed of multiple kernels that are to execute on respective ones of the stencil processors, wherein the compiling includes performing any of: horizontal fusion of kernels; vertical fusion of kernels; fission of one of the kernels into multiple kernels; spatial partitioning of a kernel into multiple spatially partitioned kernels; or splitting the directed acyclic graph into smaller graphs.
-
Publication No.: US10467056B2
Publication Date: 2019-11-05
Application No.: US15594529
Filing Date: 2017-05-12
Applicant: Google LLC
Inventors: Hyunchul Park, Albert Meixner
Abstract: A method is described. The method includes calculating data transfer metrics for kernel-to-kernel connections of a program having a plurality of kernels that is to execute on an image processor. The image processor includes a plurality of processing cores and a network connecting the plurality of processing cores. Each of the kernel-to-kernel connections includes a producing kernel that is to execute on one of the processing cores and a consuming kernel that is to execute on another one of the processing cores. The consuming kernel is to operate on data generated by the producing kernel. The method also includes assigning kernels of the plurality of kernels to respective ones of the processing cores based on the calculated data transfer metrics.
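One way to picture metric-driven kernel placement is an exhaustive search over assignments. The sketch below is not the method claimed in the patent; it rests on assumptions that the abstract does not state: cores are arranged in a linear chain so the hop count between cores `i` and `j` is `abs(i - j)`, the transfer metric for a connection is its data volume multiplied by that hop count, and each core runs at most one kernel. The names `placement_cost` and `assign_kernels` are hypothetical.

```python
from itertools import permutations


def placement_cost(placement, connections):
    """Total of (data volume x hop count) over all kernel-to-kernel connections."""
    return sum(
        volume * abs(placement[producer] - placement[consumer])
        for producer, consumer, volume in connections
    )


def assign_kernels(kernels, num_cores, connections):
    """Try every one-kernel-per-core placement and keep the cheapest.

    Brute force is only practical for a handful of kernels; it is used
    here purely to make the role of the metric visible.
    """
    best_placement, best_cost = None, None
    for cores in permutations(range(num_cores), len(kernels)):
        placement = dict(zip(kernels, cores))
        cost = placement_cost(placement, connections)
        if best_cost is None or cost < best_cost:
            best_placement, best_cost = placement, cost
    return best_placement, best_cost


# Hypothetical program: A feeds B heavily, B feeds C lightly.
connections = [("A", "B", 100), ("B", "C", 1)]
placement, cost = assign_kernels(["A", "B", "C"], 3, connections)
print(placement, cost)
```

Under these assumptions the heavily connected kernels A and B land on adjacent cores, with B in the middle so that its light connection to C is also a single hop.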
-
Publication No.: US10380969B2
Publication Date: 2019-08-13
Application No.: US15389168
Filing Date: 2016-12-22
Applicant: Google LLC
Abstract: An image processor is described. The image processor includes an I/O unit to read input image data from external memory for processing by the image processor and to write output image data from the image processor into the external memory. The I/O unit includes multiple logical channel units. Each logical channel unit is to form a logical channel between the external memory and a respective producing or consuming component within the image processor. Each logical channel unit is designed to utilize reformatting circuitry and addressing circuitry. The addressing circuitry is to control addressing schemes applied to the external memory and reformatting of image data between external memory and the respective producing or consuming component. The reformatting circuitry is to perform the reformatting.
-