-
公开(公告)号:US10791284B2
公开(公告)日:2020-09-29
申请号:US16659702
申请日:2019-10-22
Applicant: Google LLC
Inventor: Qiuling Zhu , Ofer Shacham , Jason Rupert Redgrave , Daniel Frederic Finchelstein , Albert Meixner
Abstract: In a general aspect, an apparatus can include image processing logic (IPL) configured to perform an image processing operation on pixel data corresponding with an image having a width of W pixels and a height of H pixels to produce output pixel data in vertical slices of K pixels using K vertically overlapping stencils of S×S pixels, K being greater than 1 and less than H, S being greater than or equal to 2, and W being greater than S. The apparatus can also include a linebuffer operationally coupled with the IPL, the linebuffer configured to buffer the pixel data for the IPL. The linebuffer can include a full-size buffer having a width of W and a height of (S−1). The linebuffer can also include a sliding buffer having a width of SB and a height of K, SB being greater than or equal to S and less than W.
-
公开(公告)号:US20190238758A1
公开(公告)日:2019-08-01
申请号:US16376479
申请日:2019-04-05
Applicant: Google LLC
Inventor: Qiuling Zhu , Ofer Shacham , Jason Rupert Redgrave , Daniel Frederic Finchelstein , Albert Meixner
Abstract: In a general aspect, an apparatus can include image processing logic (IPL) configured to perform an image processing operation on pixel data corresponding with an image having a width of W pixels and a height of H pixels to produce output pixel data in vertical slices of K pixels using K vertically overlapping stencils of S×S pixels, K being greater than 1 and less than H, S being greater than or equal to 2, and W being greater than S. The apparatus can also include a linebuffer operationally coupled with the IPL, the linebuffer configured to buffer the pixel data for the IPL. The linebuffer can include a full-size buffer having a width of W and a height of (S−1). The linebuffer can also include a sliding buffer having a width of SB and a height of K, SB being greater than or equal to S and less than W.
-
公开(公告)号:US10334194B2
公开(公告)日:2019-06-25
申请号:US15946095
申请日:2018-04-05
Applicant: Google LLC
Inventor: Albert Meixner , Daniel Frederic Finchelstein , David Patterson , William R. Mark , Jason Rupert Redgrave , Ofer Shacham
Abstract: A method is described that includes, on an image processor having a two dimensional execution lane array and a two dimensional shift register array, repeatedly shifting first content of multiple rows or columns of the two dimensional shift register array and repeatedly executing at least one instruction between shifts that operates on the shifted first content and/or second content that is resident in respective locations of the two dimensional shift register array that the shifted first content has been shifted into.
-
公开(公告)号:US11153464B2
公开(公告)日:2021-10-19
申请号:US16526063
申请日:2019-07-30
Applicant: Google LLC
Inventor: Ofer Shacham , Jason Rupert Redgrave , Albert Meixner , Qiuling Zhu , Daniel Frederic Finchelstein , David Patterson , Donald Stark
Abstract: An apparatus is described. The apparatus includes an execution lane array coupled to a two dimensional shift register array structure. Locations in the execution lane array are coupled to same locations in the two-dimensional shift register array structure such that different execution lanes have different dedicated registers.
-
公开(公告)号:US11140293B2
公开(公告)日:2021-10-05
申请号:US16786359
申请日:2020-02-10
Applicant: Google LLC
Inventor: Albert Meixner , Jason Rupert Redgrave , Ofer Shacham , Qiuling Zhu , Daniel Frederic Finchelstein
Abstract: A sheet generator circuit is described. The sheet generator includes electronic circuitry to receive a line group of image data including multiple rows of data from a frame of image data. The multiple rows are sufficient in number to encompass multiple neighboring overlapping stencils. The electronic circuitry is to parse the line group into a smaller sized sheet. The electronic circuitry is to load the sheet into a data computation unit having a two dimensional shift array structure coupled to an array of processors.
-
公开(公告)号:US20210004633A1
公开(公告)日:2021-01-07
申请号:US17028097
申请日:2020-09-22
Applicant: Google LLC
Inventor: Ofer Shacham , David Patterson , William R. Mark , Albert Meixner , Daniel Frederic Finchelstein , Jason Rupert Redgrave
Abstract: A method is described that includes executing a convolutional neural network layer on an image processor having an array of execution lanes and a two-dimensional shift register. The two-dimensional shift register provides local respective register space for the execution lanes. The executing of the convolutional neural network includes loading a plane of image data of a three-dimensional block of image data into the two-dimensional shift register. The executing of the convolutional neural network also includes performing a two-dimensional convolution of the plane of image data with an array of coefficient values by sequentially: concurrently multiplying within the execution lanes respective pixel and coefficient values to produce an array of partial products; concurrently summing within the execution lanes the partial products with respective accumulations of partial products being kept within the two dimensional register for different stencils within the image data; and, effecting alignment of values for the two-dimensional convolution within the execution lanes by shifting content within the two-dimensional shift register array.
-
公开(公告)号:US10719905B2
公开(公告)日:2020-07-21
申请号:US16547801
申请日:2019-08-22
Applicant: Google LLC
Inventor: Qiuling Zhu , Ofer Shacham , Albert Meixner , Jason Rupert Redgrave , Daniel Frederic Finchelstein , David Patterson , Neeti Desai , Donald Stark , Edward Chang , William R. Mark
Abstract: An apparatus is described. The apparatus includes an image processing unit. The image processing unit includes a plurality of stencil processor circuits each comprising an array of execution unit lanes coupled to a two-dimensional shift register array structure to simultaneously process multiple overlapping stencils through execution of program code. The image processing unit includes a plurality of sheet generators respectively coupled between the plurality of stencil processors and the network. The sheet generators are to parse input line groups of image data into input sheets of image data for processing by the stencil processors, and, to form output line groups of image data from output sheets of image data received from the stencil processors. The image processing unit includes a plurality of line buffer units coupled to the network to pass line groups in a direction from producing stencil processors to consuming stencil processors to implement an overall program flow.
-
公开(公告)号:US10417732B2
公开(公告)日:2019-09-17
申请号:US15599348
申请日:2017-05-18
Applicant: Google LLC
Inventor: Qiuling Zhu , Ofer Shacham , Albert Meixner , Jason Rupert Redgrave , Daniel Frederic Finchelstein , David Patterson , Neeti Desai , Donald Stark , Edward Chang , William Mark
Abstract: An apparatus is described. The apparatus includes an image processing unit. The image processing unit includes a plurality of stencil processor circuits each comprising an array of execution unit lanes coupled to a two-dimensional shift register array structure to simultaneously process multiple overlapping stencils through execution of program code. The image processing unit includes a plurality of sheet generators respectively coupled between the plurality of stencil processors and the network. The sheet generators are to parse input line groups of image data into input sheets of image data for processing by the stencil processors, and, to form output line groups of image data from output sheets of image data received from the stencil processors. The image processing unit includes a plurality of line buffer units coupled to the network to pass line groups in a direction from producing stencil processors to consuming stencil processors to implement an overall program flow.
-
公开(公告)号:US10387989B2
公开(公告)日:2019-08-20
申请号:US15628480
申请日:2017-06-20
Applicant: Google LLC
Inventor: Albert Meixner , Hyunchul Park , William R. Mark , Daniel Frederic Finchelstein , Ofer Shacham
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for restructuring an image processing pipeline. The method includes compiling program code targeted for an image processor having programmable stencil processors composed of respective two-dimensional execution lane and shift register circuit structures. The program code is to implement a directed acyclic graph and is composed of multiple kernels that are to execute on respective ones of the stencil processors, wherein the compiling includes performing any of: horizontal fusion of kernels; vertical fusion of kernels; fission of one of the kernels into multiple kernels; spatial partitioning of a kernel into multiple spatially partitioned kernels; or splitting the directed acyclic graph into smaller graphs.
-
公开(公告)号:US10387988B2
公开(公告)日:2019-08-20
申请号:US15389113
申请日:2016-12-22
Applicant: Google LLC
Inventor: Albert Meixner , Hyunchul Park , William R. Mark , Daniel Frederic Finchelstein , Ofer Shacham
Abstract: A method is described. The method includes compiling program code targeted for an image processor having programmable stencil processors composed of respective two-dimensional execution lane and shift register circuit structures. The program code is to implement a directed acyclic graph and is composed of multiple kernels that are to execute on respective ones of the stencil processors, wherein the compiling includes any of: recognizing there are a different number of kernels in the program code than stencil processors in the image processor; recognizing that at least one of the kernels is more computationally intensive than another one of the kernels; and, recognizing that the program code has resource requirements that exceed the image processor's memory capacity. The compiling further includes in response to any of the recognizing above performing any of: horizontal fusion of kernels; vertical fusion of kernels; fission of one of the kernels into multiple kernels; spatial partitioning of a kernel into multiple spatially partitioned kernels; splitting the directed acyclic graph into smaller graphs.
-
-
-
-
-
-
-
-
-