-
1.
Publication No.: EP4446870A1
Publication date: 2024-10-16
Application No.: EP23892911.1
Filing date: 2023-03-06
Inventor: ZUO, Hang
IPC classification: G06F7/485
Abstract: A tensor calculation unit and a use method, and a data processing apparatus and an operation method. The tensor calculation unit comprises a first multiply-add operator and a second multiply-add operator, which are cascaded. The first multiply-add operator comprises a first input port, a second input port, a third input port and a first output port; the first, second and third input ports respectively receive parameters A0, B0 and C, and the first multiply-add operator is configured to calculate D0 = A0 × B0 + C and output the result D0 at the first output port. The second multiply-add operator comprises a fourth input port, a fifth input port, a sixth input port and a second output port; the fourth and fifth input ports respectively receive parameters A1 and B1, the sixth input port is coupled to the first output port so as to receive the result D0, and the second multiply-add operator is configured to calculate D1 = A1 × B1 + D0 and output the result D1 at the second output port. The tensor calculation unit can be infinitely extended and stacked.
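The cascade described in the abstract above can be modeled in a few lines of Python. This is only an illustrative software sketch, not the patented hardware: the function names `multiply_add` and `cascaded_multiply_add` and the sample values are assumptions introduced here.

```python
from typing import List, Sequence


def multiply_add(a: float, b: float, c: float) -> float:
    """One multiply-add operator: returns a * b + c."""
    return a * b + c


def cascaded_multiply_add(a: Sequence[float], b: Sequence[float], c: float) -> List[float]:
    """Chain one multiply-add stage per (a[i], b[i]) pair.

    Stage 0 adds the external parameter c; every later stage adds the
    output of the stage before it, as in D1 = A1 * B1 + D0.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    outputs = []
    acc = c
    for a_i, b_i in zip(a, b):
        acc = multiply_add(a_i, b_i, acc)
        outputs.append(acc)
    return outputs


# Two cascaded stages with hypothetical inputs A = (2, 4), B = (3, 5), C = 1.
d0, d1 = cascaded_multiply_add([2.0, 4.0], [3.0, 5.0], 1.0)
```

Adding further (A, B) pairs extends the chain by one stage each, which mirrors the abstract's statement that the unit can be extended and stacked.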
-
2.
Publication No.: EP4432210A1
Publication date: 2024-09-18
Application No.: EP23892910.3
Filing date: 2023-03-06
Inventor: ZUO, Hang
Abstract: A data processing method, a data processing apparatus, an electronic device, and a computer-readable storage medium. The data processing method is applied to a data processing apparatus. The data processing apparatus comprises a plurality of computing modules, wherein each computing module comprises a plurality of thread execution units and a shared memory shared by those thread execution units, and the plurality of computing modules comprise a first computing module and a second computing module. The data processing method comprises: by means of a data transmission channel established between the shared memory of the first computing module and the shared memory of the second computing module, directly transmitting data to be exchanged between a first workgroup run by the first computing module and a second workgroup run by the second computing module. This method can shorten the data reading path, increase the data reading speed, and reduce the bandwidth required on the global memory reading path.
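To make the difference between the two data paths in the abstract above concrete, the following toy Python model counts global-memory accesses for an exchange routed through global memory and for one sent over a direct channel between two shared memories. All class and function names are hypothetical, and the model ignores synchronization, timing, and the actual channel hardware.

```python
class GlobalMemory:
    """Toy global memory that counts every read and write."""

    def __init__(self):
        self.data = {}
        self.accesses = 0

    def write(self, key, value):
        self.accesses += 1
        self.data[key] = value

    def read(self, key):
        self.accesses += 1
        return self.data[key]


class ComputingModule:
    """Toy computing module; its workgroup sees only this shared memory."""

    def __init__(self, name):
        self.name = name
        self.shared_memory = {}


def exchange_via_global(src, dst, key, gmem):
    """Conventional path: source writes to global memory, destination reads it back."""
    gmem.write(key, src.shared_memory[key])
    dst.shared_memory[key] = gmem.read(key)


def exchange_via_channel(src, dst, key):
    """Direct path: copy from one shared memory to the other."""
    dst.shared_memory[key] = src.shared_memory[key]


gmem = GlobalMemory()
first, second = ComputingModule("first"), ComputingModule("second")
first.shared_memory["tile"] = [1, 2, 3]

exchange_via_global(first, second, "tile", gmem)
global_path_accesses = gmem.accesses   # two global-memory accesses

gmem.accesses = 0
exchange_via_channel(first, second, "tile")
direct_path_accesses = gmem.accesses   # no global-memory accesses
```

In this model the direct channel removes both global-memory accesses, which is the effect the abstract attributes to the shorter reading path.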
-
3.
Publication No.: EP4446891A1
Publication date: 2024-10-16
Application No.: EP23869325.3
Filing date: 2023-09-14
Inventors: YU, Chen; ZUO, Hang; YUAN, Qing; PAN, Yu
IPC classification: G06F9/50
Abstract: The embodiments of the present disclosure provide a processing system for a thread block, a method, and a related device. The processing system includes a first computing unit for running a first sub-thread block and a second computing unit for running a second sub-thread block. The first computing unit obtains the data to be processed by the thread block, and the second computing unit executes the processing task of the thread block on the data obtained by the first computing unit. The processing system can effectively reduce the latency of loading the data to be processed, especially when loading from external storage space, and improve the processing efficiency of thread blocks.
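The division of labor in the abstract above resembles a producer/consumer pipeline: one sub-thread block fetches data while the other computes on data already fetched. The Python sketch below illustrates that pattern with two ordinary threads and a bounded queue. It is an analogy under assumed names (`run_thread_block`, `loader`, `worker`), not the disclosed hardware design, and summing each tile stands in for the real processing task.

```python
import queue
import threading


def run_thread_block(tiles):
    """Process tiles with one loading thread and one computing thread."""
    buffer = queue.Queue(maxsize=2)  # small staging buffer between the two roles
    results = []

    def loader():
        # Stands in for the first sub-thread block: fetch data to be processed.
        for tile in tiles:
            buffer.put(list(tile))
        buffer.put(None)  # sentinel: no more data

    def worker():
        # Stands in for the second sub-thread block: run the processing task.
        while True:
            tile = buffer.get()
            if tile is None:
                break
            results.append(sum(tile))

    threads = [threading.Thread(target=loader), threading.Thread(target=worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


sums = run_thread_block([[1, 2], [3, 4], [5]])
```

Because loading and computing run concurrently, the worker can process one tile while the loader fetches the next, which is how such a split can hide load latency.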
-
4.
Publication No.: EP4332781A1
Publication date: 2024-03-06
Application No.: EP22894185.2
Filing date: 2022-05-16
Inventors: ZHAI, Haifeng; ZUO, Hang; WANG, Sen; PAN, Yu; MEI, Chengqiang
IPC classification: G06F13/16
Abstract: A data processing method and apparatus, and a cache, a processor and an electronic device. The data processing method comprises: receiving a data processing request, wherein the data requested by the data processing request comprises data suitable for being stored in at least two cache units, and the main memory addresses of the data in each of the cache units are consecutive; and when the main memory address information of the cache units that satisfy a mapping relationship comprises all of the main memory addresses, simultaneously performing data processing on the data corresponding to those main memory addresses. The method increases bandwidth and thereby improves the efficiency of data transmission.
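One way to picture the condition in the abstract above is a cache lookup that serves a request spanning several consecutive lines in a single operation, but only when every mapped cache unit holds the requested address. The Python sketch below assumes a direct-mapped cache and line-granular addresses. The mapping rule, `NUM_LINES`, and all names are assumptions for illustration; the abstract does not specify them.

```python
NUM_LINES = 8  # hypothetical number of cache units (lines)


class Cache:
    """Toy direct-mapped cache addressed by main-memory line address."""

    def __init__(self):
        self.tags = [None] * NUM_LINES   # line address held by each cache unit
        self.lines = [None] * NUM_LINES  # data held by each cache unit

    def fill(self, line_addr, data):
        index = line_addr % NUM_LINES    # assumed mapping relationship
        self.tags[index] = line_addr
        self.lines[index] = list(data)

    def read_multi(self, first_line_addr, count):
        """Return `count` consecutive lines at once, or None on any miss."""
        wanted = [first_line_addr + i for i in range(count)]
        if all(self.tags[addr % NUM_LINES] == addr for addr in wanted):
            return [self.lines[addr % NUM_LINES] for addr in wanted]
        return None


cache = Cache()
cache.fill(10, [0, 1, 2, 3])
cache.fill(11, [4, 5, 6, 7])

both_lines = cache.read_multi(10, 2)  # lines 10 and 11 are both present
partial = cache.read_multi(11, 2)     # line 12 is absent, so the request misses
```

Serving both lines in one lookup rather than two is the sense in which such a scheme could raise effective bandwidth.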