-
Publication No.: US11153464B2
Publication Date: 2021-10-19
Application No.: US16526063
Application Date: 2019-07-30
Applicant: Google LLC
Inventor: Ofer Shacham, Jason Rupert Redgrave, Albert Meixner, Qiuling Zhu, Daniel Frederic Finchelstein, David Patterson, Donald Stark
Abstract: An apparatus is described. The apparatus includes an execution lane array coupled to a two-dimensional shift register array structure. Locations in the execution lane array are coupled to same locations in the two-dimensional shift register array structure such that different execution lanes have different dedicated registers.
-
Publication No.: US11140293B2
Publication Date: 2021-10-05
Application No.: US16786359
Application Date: 2020-02-10
Applicant: Google LLC
Inventor: Albert Meixner, Jason Rupert Redgrave, Ofer Shacham, Qiuling Zhu, Daniel Frederic Finchelstein
Abstract: A sheet generator circuit is described. The sheet generator includes electronic circuitry to receive a line group of image data including multiple rows of data from a frame of image data. The multiple rows are sufficient in number to encompass multiple neighboring overlapping stencils. The electronic circuitry is to parse the line group into a smaller sized sheet. The electronic circuitry is to load the sheet into a data computation unit having a two-dimensional shift array structure coupled to an array of processors.
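The parsing step in this abstract can be illustrated with a minimal software sketch. It assumes, purely for illustration, that a line group is a list of equal-length rows and that sheets are non-overlapping column tiles spanning the full height of the line group. The function name `parse_line_group` and the parameter `sheet_width` are hypothetical and do not come from the patent.

```python
def parse_line_group(line_group, sheet_width):
    """Split a line group into sheets of at most `sheet_width` columns.

    Assumption: sheets are non-overlapping column tiles that keep every
    row of the line group; real hardware may tile differently.
    """
    width = len(line_group[0])
    return [
        [row[x:x + sheet_width] for row in line_group]
        for x in range(0, width, sheet_width)
    ]


# A line group of 2 rows by 5 pixels, parsed into sheets 2 pixels wide.
line_group = [
    [10, 11, 12, 13, 14],
    [20, 21, 22, 23, 24],
]
sheets = parse_line_group(line_group, sheet_width=2)
```

Each resulting sheet is small enough to be loaded as one unit, which is the role the abstract assigns to the sheet before it enters the data computation unit.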
-
Publication No.: US20210255972A1
Publication Date: 2021-08-19
Application No.: US17139750
Application Date: 2020-12-31
Applicant: Google LLC
Inventor: Vinod Chamarty, Xiaoyu Ma, Hongil Yoon, Keith Robert Pflederer, Weiping Liao, Benjamin Dodge, Albert Meixner, Allan Douglas Knies, Manu Gulati, Rahul Jagdish Thakur, Jason Rupert Redgrave
IPC: G06F13/16, G06F12/0811, G06F12/0877, G06F12/0815
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for a system-level cache to allocate cache resources by a way-partitioning process. One of the methods includes maintaining a mapping between partitions and priority levels and allocating primary ways to respective enabled partitions in an order corresponding to the respective priority levels assigned to the enabled partitions.
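The priority-ordered allocation described in this abstract can be sketched as follows. This is a simplified model under stated assumptions, not the claimed method: each partition is assumed to request a fixed number of primary ways, a lower `priority` value is assumed to mean higher priority, and ways are handed out contiguously until the cache runs out. The names `Partition`, `requested_ways`, and `allocate_primary_ways` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    priority: int        # assumption: lower value means higher priority
    enabled: bool
    requested_ways: int  # hypothetical: primary ways the partition asks for


def allocate_primary_ways(partitions, total_ways):
    """Return {partition name: list of way indices}, in priority order."""
    allocation = {}
    next_way = 0
    enabled = [p for p in partitions if p.enabled]
    for p in sorted(enabled, key=lambda q: q.priority):
        # Grant as many ways as requested, capped by what is still free.
        granted = min(p.requested_ways, total_ways - next_way)
        allocation[p.name] = list(range(next_way, next_way + granted))
        next_way += granted
    return allocation


partitions = [
    Partition("gpu", priority=3, enabled=True, requested_ways=4),
    Partition("camera", priority=0, enabled=True, requested_ways=4),
    Partition("audio", priority=2, enabled=False, requested_ways=2),
    Partition("display", priority=1, enabled=True, requested_ways=3),
]
allocation = allocate_primary_ways(partitions, total_ways=8)
```

In this example the disabled partition receives nothing, and the lowest-priority enabled partition gets only the single way left over after the higher-priority partitions are served.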
-
Publication No.: US10998070B2
Publication Date: 2021-05-04
Application No.: US16659695
Application Date: 2019-10-22
Applicant: Google LLC
Inventor: Jason Rupert Redgrave
IPC: G11C19/38, G11C19/00, G06T1/20, H04N5/907, G06F15/173, G06F15/80, G11C7/10, H01P5/12, H03K19/17736, H04N9/04, G11C19/28
Abstract: A shift register is described. The shift register includes a plurality of cells and register space. The shift register includes circuitry having inputs to receive shifted data and outputs to transmit shifted data, wherein: i) circuitry of cells physically located between first and second logically ordered cells are configured to not perform any logical shift; ii) circuitry of cells coupled to receive shifted data transmitted by an immediately preceding logically ordered cell comprises circuitry for writing into local register space data received at an input assigned an amount of shift specified in a shift command being executed by the shift register, and, iii) circuitry of cells coupled to transmit shifted data to an immediately following logically ordered cell comprises circuitry to transmit data from an output assigned an incremented shift amount from a shift amount of an input that the data was received on.
-
Publication No.: US20210004633A1
Publication Date: 2021-01-07
Application No.: US17028097
Application Date: 2020-09-22
Applicant: Google LLC
Inventor: Ofer Shacham, David Patterson, William R. Mark, Albert Meixner, Daniel Frederic Finchelstein, Jason Rupert Redgrave
Abstract: A method is described that includes executing a convolutional neural network layer on an image processor having an array of execution lanes and a two-dimensional shift register. The two-dimensional shift register provides local respective register space for the execution lanes. The executing of the convolutional neural network includes loading a plane of image data of a three-dimensional block of image data into the two-dimensional shift register. The executing of the convolutional neural network also includes performing a two-dimensional convolution of the plane of image data with an array of coefficient values by sequentially: concurrently multiplying within the execution lanes respective pixel and coefficient values to produce an array of partial products; concurrently summing within the execution lanes the partial products with respective accumulations of partial products being kept within the two-dimensional register for different stencils within the image data; and, effecting alignment of values for the two-dimensional convolution within the execution lanes by shifting content within the two-dimensional shift register array.
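The multiply, accumulate, and shift sequence in this abstract can be modeled in software roughly as below. This is only a sketch under several assumptions: the "lanes" are entries of a Python list of lists, positions outside the plane read as zero, the coefficients are applied without flipping (a correlation, as is common in neural network layers), and each step shifts from the original plane rather than incrementally as hardware would. The names `shift` and `conv2d_shift` are hypothetical.

```python
def shift(grid, dy, dx, fill=0):
    """Return a copy where position (y, x) holds grid[y + dy][x + dx].

    Positions that would read outside the grid get `fill` (assumption).
    """
    h, w = len(grid), len(grid[0])
    return [
        [grid[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else fill
         for x in range(w)]
        for y in range(h)
    ]


def conv2d_shift(plane, coeffs):
    """Accumulate one partial product per lane for each coefficient."""
    h, w = len(plane), len(plane[0])
    kh, kw = len(coeffs), len(coeffs[0])
    acc = [[0] * w for _ in range(h)]  # one accumulator per lane
    for ky in range(kh):
        for kx in range(kw):
            # Align the needed pixel with each lane by shifting the plane.
            shifted = shift(plane, ky - kh // 2, kx - kw // 2)
            c = coeffs[ky][kx]
            # "Concurrent" multiply and sum, modeled here as a loop.
            for y in range(h):
                for x in range(w):
                    acc[y][x] += shifted[y][x] * c
    return acc


plane = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = conv2d_shift(plane, box)
```

Every lane runs the same instruction on its own local value, so the only data movement between steps is the shift, which is the property the abstract attributes to the two-dimensional shift register.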
-
Publication No.: US10733956B2
Publication Date: 2020-08-04
Application No.: US16685388
Application Date: 2019-11-15
Applicant: Google LLC
Inventor: Albert Meixner, Neeti Desai, Dilan Manatunga, Jason Rupert Redgrave, William R. Mark
Abstract: An image processor is described. The image processor includes an I/O unit to read input image data from external memory for processing by the image processor and to write output image data from the image processor into the external memory. The I/O unit includes multiple logical channel units. Each logical channel unit is to form a logical channel between the external memory and a respective producing or consuming component within the image processor. Each logical channel unit is designed to utilize reformatting circuitry and addressing circuitry. The addressing circuitry is to control addressing schemes applied to the external memory and reformatting of image data between external memory and the respective producing or consuming component. The reformatting circuitry is to perform the reformatting.
-
Publication No.: US10719905B2
Publication Date: 2020-07-21
Application No.: US16547801
Application Date: 2019-08-22
Applicant: Google LLC
Inventor: Qiuling Zhu, Ofer Shacham, Albert Meixner, Jason Rupert Redgrave, Daniel Frederic Finchelstein, David Patterson, Neeti Desai, Donald Stark, Edward Chang, William R. Mark
Abstract: An apparatus is described. The apparatus includes an image processing unit. The image processing unit includes a plurality of stencil processor circuits each comprising an array of execution unit lanes coupled to a two-dimensional shift register array structure to simultaneously process multiple overlapping stencils through execution of program code. The image processing unit includes a plurality of sheet generators respectively coupled between the plurality of stencil processors and the network. The sheet generators are to parse input line groups of image data into input sheets of image data for processing by the stencil processors, and, to form output line groups of image data from output sheets of image data received from the stencil processors. The image processing unit includes a plurality of line buffer units coupled to the network to pass line groups in a direction from producing stencil processors to consuming stencil processors to implement an overall program flow.
-
Publication No.: US20200159494A1
Publication Date: 2020-05-21
Application No.: US16687488
Application Date: 2019-11-18
Applicant: Google LLC
Inventor: Artem Vasilyev, Albert Meixner, Jason Rupert Redgrave
Abstract: An execution unit is described. The execution unit includes an arithmetic logic unit (ALU) circuit having a first input to receive a first value and a second input to receive a second value. The ALU circuit includes circuitry to determine an absolute value of the first value and to add the absolute value to the second value. The first input is coupled to a first data path having register space and an output of another ALU of the execution unit circuit as alternative sources of the first value. The second input is coupled to a second data path having the register space as a source for the second value.
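The absolute-value-and-add operation in this abstract, with its first input fed by another ALU, can be illustrated by a small sketch. The chaining shown here, a subtract feeding the absolute-value-add to compute a sum of absolute differences, is an assumed use case chosen for illustration; the abstract does not name it. The function names are hypothetical.

```python
def alu_sub(a, b):
    """Stand-in for the other ALU whose output feeds the first input."""
    return a - b


def alu_abs_add(first, second):
    """Take the absolute value of `first` and add it to `second`."""
    return abs(first) + second


def sum_abs_diff(xs, ys):
    """Sum of absolute differences built from the two ALU operations."""
    acc = 0  # accumulator, standing in for register space
    for x, y in zip(xs, ys):
        # First input: output of the subtract ALU. Second input: register.
        acc = alu_abs_add(alu_sub(x, y), acc)
    return acc


total = sum_abs_diff([1, 5, 3], [4, 2, 3])
```

Fusing the absolute value with the add lets each element of the accumulation take one pass through the pair of ALUs instead of needing a separate absolute-value instruction.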
-
Publication No.: US10417732B2
Publication Date: 2019-09-17
Application No.: US15599348
Application Date: 2017-05-18
Applicant: Google LLC
Inventor: Qiuling Zhu, Ofer Shacham, Albert Meixner, Jason Rupert Redgrave, Daniel Frederic Finchelstein, David Patterson, Neeti Desai, Donald Stark, Edward Chang, William Mark
Abstract: An apparatus is described. The apparatus includes an image processing unit. The image processing unit includes a plurality of stencil processor circuits each comprising an array of execution unit lanes coupled to a two-dimensional shift register array structure to simultaneously process multiple overlapping stencils through execution of program code. The image processing unit includes a plurality of sheet generators respectively coupled between the plurality of stencil processors and the network. The sheet generators are to parse input line groups of image data into input sheets of image data for processing by the stencil processors, and, to form output line groups of image data from output sheets of image data received from the stencil processors. The image processing unit includes a plurality of line buffer units coupled to the network to pass line groups in a direction from producing stencil processors to consuming stencil processors to implement an overall program flow.
-
Publication No.: US20190188824A1
Publication Date: 2019-06-20
Application No.: US16272819
Application Date: 2019-02-11
Applicant: Google LLC
Inventor: Albert Meixner, Hyunchul Park, Qiuling Zhu, Jason Rupert Redgrave
Abstract: A method is described. The method includes repeatedly loading a next sheet of image data from a first location of a memory into a two-dimensional shift register array. The memory is locally coupled to the two-dimensional shift register array and an execution lane array having a smaller dimension than the two-dimensional shift register array along at least one array axis. The loaded next sheet of image data keeps within an image area of the two-dimensional shift register array. The method also includes repeatedly determining output values for the next sheet of image data through execution of program code instructions along respective lanes of the execution lane array, wherein, a stencil size used in determining the output values encompasses only pixels that reside within the two-dimensional shift register array.