-
Publication number: US20240103879A1
Publication date: 2024-03-28
Application number: US17952270
Application date: 2022-09-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Bin He, Michael John Mantor, Brian Emberling, Liang Huang, Chao Liu
CPC classification number: G06F9/3887, G06F9/3001, G06F9/30043, G06F9/30098
Abstract: Techniques for block data load with transpose are described. In one example, an input is received, at a control unit, specifying an instruction to load a block of data to at least one memory module using a transpose operation. Responsive to receiving the input, the control unit causes the block of data to be loaded to the at least one memory module by transposing the block of data to form a transposed block of data and storing the transposed block of data in the at least one memory module.
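The load-with-transpose flow described in the abstract can be illustrated with a minimal software sketch. It is not the patented hardware design: the function name `load_block_transposed`, the flat list standing in for a memory module, and the `base` address parameter are all assumptions introduced for illustration.

```python
def load_block_transposed(block, memory, base=0):
    """Transpose a row-major block and store the result in `memory`.

    Hypothetical model: `memory` is a flat list standing in for a memory
    module, and `base` is the first address written.
    """
    rows = len(block)
    cols = len(block[0]) if rows else 0

    # Form the transposed block: input element (r, c) moves to (c, r).
    transposed = [[block[r][c] for r in range(rows)] for c in range(cols)]

    # Store the transposed block contiguously, row by row.
    addr = base
    for row in transposed:
        for value in row:
            memory[addr] = value
            addr += 1
    return transposed


memory = [0] * 8
result = load_block_transposed([[1, 2, 3], [4, 5, 6]], memory, base=1)
```

In this sketch a single call plays the role of the single instruction: the caller supplies the block once and the transpose happens as part of the load rather than as a separate pass.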
-
Publication number: US11789732B2
Publication date: 2023-10-17
Application number: US17574026
Application date: 2022-01-12
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Bin He, Jiasheng Chen, Jian Huang
CPC classification number: G06F9/3001, G06F7/57, G06F9/3009, G06F9/30101, G06F9/4806
Abstract: A graphics processing unit (GPU) sequences provision of operands to a set of operand registers, thereby allowing the GPU to share at least one of the operand registers between processing. The GPU includes a plurality of arithmetic logic units (ALUs) with at least one of the ALUs configured to perform double precision operations. The GPU further includes a set of operand registers configured to store single precision operands. For a plurality of executing threads that request double precision operations, the GPU stores the corresponding operands at the operand registers. Over a plurality of execution cycles, the GPU sequences transfer of operands from the set of operand registers to a designated double precision operand register. During each execution cycle, the double-precision ALU executes a double precision operation using the operand stored at the double precision operand register.
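The sequencing described in the abstract can be modeled as a loop that moves one thread's operands per cycle into a single shared register feeding the double precision ALU. This is a rough behavioral sketch only: the names `run_double_precision`, `operand_registers`, and `dp_operand_register` are hypothetical, and the model ignores register widths and timing.

```python
def run_double_precision(thread_operands, dp_op):
    """Sequence per-thread operands through one shared DP operand register.

    Hypothetical model: `thread_operands` holds one operand tuple per
    requesting thread, and `dp_op` stands in for the double precision ALU.
    """
    # All requesting threads first park their operands in the operand registers.
    operand_registers = list(thread_operands)

    results = []
    # One execution cycle per thread: transfer, then execute.
    for operands in operand_registers:
        dp_operand_register = operands               # transfer for this cycle
        results.append(dp_op(*dp_operand_register))  # ALU executes on it
    return results


products = run_double_precision([(1.5, 2.0), (3.0, 4.0)], lambda a, b: a * b)
```

The point of the sketch is that only one designated register feeds the ALU, so threads are served over successive cycles rather than each needing its own double precision operand storage.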
-