Publication No.: US20250060940A1
Publication Date: 2025-02-20
Application No.: US18931973
Application Date: 2024-10-30
Applicant: Intel Corporation
Inventor: Arnab Raha, Michael Wu, Deepak Abraham Mathaikutty, Daksha Sharma, Martin Langhammer
Abstract: A data processing unit may include a memory, processing elements (PEs), and a control unit. The memory may store weight blocks within a weight tensor of a neural network operation. Each weight block has an input channel (IC) dimension and an output channel (OC) dimension and includes subblocks. A subblock includes one or more weights having a first data precision and one or more other weights having a second data precision. The second data precision is lower than the first data precision. The control unit may distribute different ones of the subblocks to different ones of the PEs. A PE may receive a subblock and perform a first MAC operation on a weight having a first data precision and a second MAC operation on a weight having a second data precision. The first MAC operation may consume more computation cycles or more multipliers than the second MAC operation.
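The mixed-precision MAC behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed hardware design. The 8-bit and 4-bit precisions, the 4-bit multiplier width, the nibble-splitting scheme, and the names `Weight` and `mac_subblock` are all assumptions chosen for illustration; the abstract only states that the higher-precision MAC may consume more computation cycles or more multipliers than the lower-precision one.

```python
from dataclasses import dataclass


@dataclass
class Weight:
    value: int  # signed integer weight
    bits: int   # assumed precision: 8 (first, higher) or 4 (second, lower)


def mac_subblock(weights, activations, mult_bits=4):
    """Model one PE processing one subblock.

    Returns (accumulator, cycles). Assumption: the PE has a single
    multiplier that is `mult_bits` wide on the weight side, so a wider
    weight is split into two partial products and takes two cycles.
    """
    acc = 0
    cycles = 0
    for w, a in zip(weights, activations):
        if w.bits <= mult_bits:
            # Lower-precision weight: one multiply, one cycle.
            acc += w.value * a
            cycles += 1
        else:
            # Higher-precision weight: split into an unsigned low nibble
            # and a signed high part, so value == (high << 4) + low.
            low = w.value & 0xF
            high = w.value >> 4  # arithmetic shift preserves the sign
            acc += low * a            # cycle 1: low partial product
            acc += (high * a) << 4    # cycle 2: high partial product, shifted
            cycles += 2
    return acc, cycles


# Hypothetical subblock mixing both precisions.
subblock = [Weight(-37, 8), Weight(5, 4), Weight(100, 8), Weight(-3, 4)]
activations = [2, 3, -1, 4]
acc, cycles = mac_subblock(subblock, activations)
print(acc, cycles)
```

In this model the result equals the ordinary dot product of the weights and activations, while the cycle count grows with the number of higher-precision weights in the subblock. A design that spends extra multipliers instead of extra cycles would compute both partial products in parallel.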