-
Publication number: US20240103813A1
Publication date: 2024-03-28
Application number: US17934145
Application date: 2022-09-21
Applicant: Amazon Technologies, Inc.
Inventor: Xiaodan Tan, Paul Gilbert Meyer, Sheng Xu, Ron Diamant
Abstract: An integrated circuit that combines transpose and compute operations may include a transpose circuit coupled to a set of compute channels. Each compute channel may include multiple arithmetic logic unit (ALU) circuits coupled in series. The transpose circuit is operable to receive an input tensor, transpose the input tensor, and output a transposed tensor to the set of compute channels. The set of compute channels is operable to generate outputs in parallel, with each of the outputs being generated from a corresponding vector of the transposed tensor.
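The dataflow in this abstract can be sketched in software. The following is a minimal illustration only, not the patented circuit: it assumes a 2-D input tensor, models each ALU stage as a hypothetical elementwise function, and runs the channels in a sequential loop where the hardware would run them in parallel. The names `transpose`, `run_channel`, and `transpose_and_compute` are invented for this sketch.

```python
from typing import Callable, List, Sequence

# Hypothetical model of one ALU stage: a function applied to one element.
AluOp = Callable[[float], float]


def transpose(tensor: Sequence[Sequence[float]]) -> List[List[float]]:
    """Swap the rows and columns of a rectangular 2-D tensor."""
    return [list(column) for column in zip(*tensor)]


def run_channel(vector: Sequence[float], alu_chain: Sequence[AluOp]) -> List[float]:
    """Model one compute channel: every element passes through the ALU stages in series."""
    out = list(vector)
    for alu in alu_chain:
        out = [alu(x) for x in out]
    return out


def transpose_and_compute(
    tensor: Sequence[Sequence[float]], alu_chain: Sequence[AluOp]
) -> List[List[float]]:
    """Transpose the input, then feed one transposed vector to each channel."""
    transposed = transpose(tensor)
    # Assumption: a sequential loop stands in for channels that operate in parallel.
    return [run_channel(vector, alu_chain) for vector in transposed]


# Two ALU stages in series: multiply by 2, then add 1.
result = transpose_and_compute(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [lambda x: x * 2.0, lambda x: x + 1.0],
)
print(result)
```

Each inner list of `result` is the output of one channel, computed from one vector (one column of the original input) of the transposed tensor.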
-
Publication number: US20240111528A1
Publication date: 2024-04-04
Application number: US17934147
Application date: 2022-09-21
Applicant: Amazon Technologies, Inc.
Inventor: Xiaodan Tan, Paul Gilbert Meyer, Sheng Xu, Ron Diamant
CPC classification number: G06F9/30036, G06F9/30145, G06F9/3555
Abstract: A technique to execute transpose and compute operations may include retrieving a set of machine instructions from an instruction buffer of a data processor. The instruction buffer has multiple entries, and each entry stores one machine instruction. A machine instruction from the set of machine instructions is executed to transpose a submatrix of an input tensor and perform computations on column elements of the submatrix. The machine instruction combines the transpose operation with computational operations into a single machine instruction.
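The idea of a single instruction that fuses a transpose with computation can be illustrated with a small software model. This is a sketch under stated assumptions, not the instruction set described in the application: the `TransposeCompute` fields, the `execute` function, and the choice of a per-column reduction as the "computation" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class TransposeCompute:
    """Hypothetical fused instruction: select a submatrix, transpose it, compute per column."""
    row_start: int
    col_start: int
    rows: int
    cols: int
    column_op: Callable[[List[float]], float]  # assumed computation on column elements


def execute(instr: TransposeCompute, tensor: Sequence[Sequence[float]]) -> List[float]:
    """Execute one fused instruction against a 2-D input tensor."""
    submatrix = [
        row[instr.col_start:instr.col_start + instr.cols]
        for row in tensor[instr.row_start:instr.row_start + instr.rows]
    ]
    # Transposing turns each column of the submatrix into one contiguous vector.
    columns = [list(column) for column in zip(*submatrix)]
    return [instr.column_op(column) for column in columns]


# The instruction buffer is modeled as a list; each entry holds one instruction.
instruction_buffer = [
    TransposeCompute(row_start=0, col_start=1, rows=2, cols=2, column_op=sum),
    TransposeCompute(row_start=1, col_start=0, rows=2, cols=2, column_op=max),
]

tensor = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]

outputs = [execute(instr, tensor) for instr in instruction_buffer]
print(outputs)
```

In this model, one buffer entry does the work that would otherwise take a separate transpose instruction followed by one or more compute instructions.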
-
Publication number: US12008368B2
Publication date: 2024-06-11
Application number: US17934147
Application date: 2022-09-21
Applicant: Amazon Technologies, Inc.
Inventor: Xiaodan Tan, Paul Gilbert Meyer, Sheng Xu, Ron Diamant
CPC classification number: G06F9/30036, G06F9/30145, G06F9/3555
Abstract: A technique to execute transpose and compute operations may include retrieving a set of machine instructions from an instruction buffer of a data processor. The instruction buffer has multiple entries, and each entry stores one machine instruction. A machine instruction from the set of machine instructions is executed to transpose a submatrix of an input tensor and perform computations on column elements of the submatrix. The machine instruction combines the transpose operation with computational operations into a single machine instruction.
-
-