-
Publication No.: US20210081201A1
Publication Date: 2021-03-18
Application No.: US17107823
Filing Date: 2020-11-30
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran, Jorge Parra, Ashutosh Garg, Chandra Gurram, Chunhui Mei, Durgesh Borkar, Shubra Marwaha, Supratim Pal, Varghese George, Wei Xiong, Yan Li, Yongsheng Liu, Dipankar Das, Sasikanth Avancha, Dharma Teja Vooturi, Naveen K. Mellempudi
Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
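The data flow described in this abstract (packed sparse operand, metadata, selection from the unpacked operand, multiply) can be illustrated in software. The following is only a minimal sketch under stated assumptions: the 2-of-4 pattern (`GROUP`, `KEEP`), the use of in-group positions as metadata, and the function names `pack_structured` and `sparse_dot` are all hypothetical and do not come from the patent, which describes hardware rather than Python.

```python
from typing import List, Tuple

GROUP = 4  # assumed block width (hypothetical; the abstract fixes no pattern)
KEEP = 2   # assumed number of values stored per block (hypothetical)


def pack_structured(row: List[float]) -> Tuple[List[float], List[int]]:
    """Pack a row holding at most KEEP non-zeros in every GROUP-wide block.

    Returns the packed values plus metadata: the in-block position of each
    packed value.
    """
    if len(row) % GROUP != 0:
        raise ValueError("row length must be a multiple of GROUP")
    values: List[float] = []
    meta: List[int] = []
    for base in range(0, len(row), GROUP):
        block = row[base:base + GROUP]
        idx = [i for i, v in enumerate(block) if v != 0]
        if len(idx) > KEEP:
            raise ValueError("block violates the assumed sparsity pattern")
        # Pad with zero-valued positions so every block packs to KEEP slots.
        for i in range(GROUP):
            if len(idx) == KEEP:
                break
            if i not in idx:
                idx.append(i)
        idx.sort()
        for i in idx:
            values.append(block[i])
            meta.append(i)
    return values, meta


def sparse_dot(dense: List[float], values: List[float], meta: List[int]) -> float:
    """Multiply packed values with the dense elements the metadata selects."""
    acc = 0.0
    for slot, (v, pos) in enumerate(zip(values, meta)):
        base = (slot // KEEP) * GROUP  # start of the block this slot belongs to
        acc += dense[base + pos] * v
    return acc
```

With this layout only `KEEP` multiplications are performed per block of `GROUP` elements, and the result matches the ordinary dot product of the dense operand with the original sparse row.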
-
Publication No.: US10747923B2
Publication Date: 2020-08-18
Application No.: US16365032
Filing Date: 2019-03-26
Applicant: Intel Corporation
Inventor: Jing Zhang, Yan Li
IPC: G06F30/331, G01R31/3177, G06F7/58, G06F30/34
Abstract: Programmable integrated circuits may be used to perform hardware emulation of an application-specific integrated circuit (ASIC) design. The ASIC design may be loaded onto the programmable integrated circuit as a circuit under test (CUT). During hardware emulation operations, an emulation host may be used to coordinate testing of the CUT on the programmable device. To help emulate a power gating event for the ASIC design, the programmable device may be provided with an encoder at the input of the CUT, a decoder at the output of the CUT, and a pseudorandom number generator (PRNG) that outputs a value for adjusting the encoder and decoder. The value output from the PRNG stays fixed when there is no power loss, but will change to a new value during a power gating event. Operated in this way, the data read out from the CUT after the power gating event is effectively corrupted.
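The encoder/decoder/PRNG arrangement in this abstract can be mimicked with a small behavioral model. This is a sketch only, assuming XOR as the encoding and decoding function and a 16-bit data width; the abstract specifies neither, and the class `PowerGatedCutModel` and its methods are hypothetical names introduced for illustration.

```python
import random


class PowerGatedCutModel:
    """Toy model: the CUT stores encoded data; a PRNG value keys encode/decode."""

    def __init__(self, seed: int = 0, width: int = 16) -> None:
        self._rng = random.Random(seed)
        self._width = width
        self._mask = (1 << width) - 1
        self._key = self._rng.getrandbits(width)  # fixed while power is on
        self._state = {}  # stands in for state held inside the CUT

    def write(self, addr: int, data: int) -> None:
        # Encoder at the CUT input (XOR is an assumption, not from the source).
        self._state[addr] = (data ^ self._key) & self._mask

    def read(self, addr: int) -> int:
        # Decoder at the CUT output uses the current PRNG value.
        return (self._state[addr] ^ self._key) & self._mask

    def power_gate(self) -> None:
        # A power gating event advances the PRNG to a new, different value.
        old = self._key
        while self._key == old:
            self._key = self._rng.getrandbits(self._width)
```

Reads return the written data as long as the key is unchanged. After `power_gate()` the stored bits are decoded with a different key, so the value read back no longer matches what was written, which models the "effectively corrupted" read-out.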
-
Publication No.: US20250045582A1
Publication Date: 2025-02-06
Application No.: US18804720
Filing Date: 2024-08-14
Applicant: Intel Corporation
Inventor: Anbang Yao, Yiwen Guo, Yan Li, Yurong Chen
Abstract: Techniques related to compressing a pre-trained dense deep neural network to a sparsely connected deep neural network for efficient implementation are discussed. Such techniques may include iteratively pruning and splicing available connections between adjacent layers of the deep neural network and updating weights corresponding to both currently disconnected and currently connected connections between the adjacent layers.
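One iteration of pruning and splicing might look like the sketch below. It assumes a magnitude criterion with two thresholds (`prune_below`, `splice_above`) and a plain gradient step; the abstract states neither, so these, along with the function names and the caller-supplied `grads`, are hypothetical. The point illustrated is that weights of disconnected connections keep being updated, so a connection can later be spliced back in.

```python
from typing import List, Tuple


def update_mask(weights: List[float], mask: List[int],
                prune_below: float, splice_above: float) -> List[int]:
    """Prune small weights, splice large ones back, leave the rest unchanged."""
    new_mask = []
    for w, m in zip(weights, mask):
        if abs(w) < prune_below:
            new_mask.append(0)   # prune: disconnect
        elif abs(w) > splice_above:
            new_mask.append(1)   # splice: reconnect
        else:
            new_mask.append(m)   # in between: keep current state
    return new_mask


def surgery_step(weights: List[float], mask: List[int], grads: List[float],
                 lr: float, prune_below: float,
                 splice_above: float) -> Tuple[List[float], List[int]]:
    """Update every weight (connected or not), then refresh the mask."""
    weights = [w - lr * g for w, g in zip(weights, grads)]
    mask = update_mask(weights, mask, prune_below, splice_above)
    return weights, mask


def effective_weights(weights: List[float], mask: List[int]) -> List[float]:
    """Weights actually used by the sparse layer."""
    return [w * m for w, m in zip(weights, mask)]
```

For example, a weight of 0.05 is pruned under a 0.1 threshold; if later updates grow it past 0.2, the same connection is restored instead of being lost permanently.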
-
Publication No.: US11977885B2
Publication Date: 2024-05-07
Application No.: US17107823
Filing Date: 2020-11-30
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran, Jorge Parra, Ashutosh Garg, Chandra Gurram, Chunhui Mei, Durgesh Borkar, Shubra Marwaha, Supratim Pal, Varghese George, Wei Xiong, Yan Li, Yongsheng Liu, Dipankar Das, Sasikanth Avancha, Dharma Teja Vooturi, Naveen K. Mellempudi
CPC classification number: G06F9/30036, G06F9/3001, G06F9/30101, G06F9/3893, G06F15/8046
Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
-
Publication No.: US20250053613A1
Publication Date: 2025-02-13
Application No.: US18812251
Filing Date: 2024-08-22
Applicant: Intel Corporation
Inventor: Chunhui Mei, Hong Jiang, Jiasheng Chen, Yongsheng Liu, Yan Li
Abstract: Matrix multiply units can take advantage of input sparsity by zero gating ALUs, which saves power consumption, but compute throughput does not increase. To improve compute throughput from sparsity, processing resources in a matrix accelerator can skip computation with zero involved in input or output. If zeros in input can be skipped, the processing units can focus calculations on generating meaningful non-zero output.
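The contrast between zero gating and zero skipping can be shown with a toy cycle count. This sketch assumes a single multiply-accumulate unit that issues one operand pair per cycle; that cost model and the function names are illustrative assumptions, not details taken from the patent.

```python
from typing import List, Tuple


def dot_zero_gated(a: List[int], b: List[int]) -> Tuple[int, int]:
    """Zero gating: the slot is still consumed, the ALU is merely idled."""
    acc, cycles = 0, 0
    for x, y in zip(a, b):
        cycles += 1                 # cycle spent even when the product is zero
        if x != 0 and y != 0:
            acc += x * y            # ALU active only for non-zero pairs
    return acc, cycles


def dot_zero_skipped(a: List[int], b: List[int]) -> Tuple[int, int]:
    """Zero skipping: pairs with a zero operand are never issued."""
    acc, cycles = 0, 0
    for x, y in zip(a, b):
        if x == 0 or y == 0:
            continue                # skipped, no cycle consumed
        cycles += 1
        acc += x * y
    return acc, cycles
```

Both functions return the same sum. In this model gating always takes as many cycles as there are elements (saving only power), while skipping takes one cycle per non-zero pair, which is where the throughput gain comes from.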
-
Publication No.: US20240320000A1
Publication Date: 2024-09-26
Application No.: US18621539
Filing Date: 2024-03-29
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran, Jorge Parra, Ashutosh Garg, Chandra Gurram, Chunhui Mei, Durgesh Borkar, Shubra Marwaha, Supratim Pal, Varghese George, Wei Xiong, Yan Li, Yongsheng Liu, Dipankar Das, Sasikanth Avancha, Dharma Teja Vooturi, Naveen K. Mellempudi
CPC classification number: G06F9/30036, G06F9/3001, G06F9/30101, G06F9/3893, G06F15/8046
Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
-
Publication No.: US12093813B2
Publication Date: 2024-09-17
Application No.: US16328689
Filing Date: 2016-09-30
Applicant: Intel Corporation
Inventor: Anbang Yao, Yiwen Guo, Yan Li, Yurong Chen
Abstract: Techniques related to compressing a pre-trained dense deep neural network to a sparsely connected deep neural network for efficient implementation are discussed. Such techniques may include iteratively pruning and splicing available connections between adjacent layers of the deep neural network and updating weights corresponding to both currently disconnected and currently connected connections between the adjacent layers.
-
Publication No.: US20190188567A1
Publication Date: 2019-06-20
Application No.: US16328689
Filing Date: 2016-09-30
Applicant: Intel Corporation
Inventor: Anbang Yao, Yiwen Guo, Yan Li, Yurong Chen
CPC classification number: G06N3/08, G06N3/04, G06N3/0454, G06N3/082
Abstract: Techniques related to compressing a pre-trained dense deep neural network to a sparsely connected deep neural network for efficient implementation are discussed. Such techniques may include iteratively pruning and splicing available connections between adjacent layers of the deep neural network and updating weights corresponding to both currently disconnected and currently connected connections between the adjacent layers.
-
Publication No.: US12086205B2
Publication Date: 2024-09-10
Application No.: US17211627
Filing Date: 2021-03-24
Applicant: Intel Corporation
Inventor: Chunhui Mei, Hong Jiang, Jiasheng Chen, Yongsheng Liu, Yan Li
CPC classification number: G06F17/16, G06F7/5443, G06F9/3001, G06F9/30043, G06F15/8046, G06F17/11
Abstract: Matrix multiply units can take advantage of input sparsity by zero gating ALUs, which saves power consumption, but compute throughput does not increase. To improve compute throughput from sparsity, processing resources in a matrix accelerator can skip computation with zero involved in input or output. If zeros in input can be skipped, the processing units can focus calculations on generating meaningful non-zero output.
-
Publication No.: US20190220559A1
Publication Date: 2019-07-18
Application No.: US16365032
Filing Date: 2019-03-26
Applicant: Intel Corporation
Inventor: Jing Zhang, Yan Li
IPC: G06F17/50, G06F7/58, G01R31/3177
Abstract: Programmable integrated circuits may be used to perform hardware emulation of an application-specific integrated circuit (ASIC) design. The ASIC design may be loaded onto the programmable integrated circuit as a circuit under test (CUT). During hardware emulation operations, an emulation host may be used to coordinate testing of the CUT on the programmable device. To help emulate a power gating event for the ASIC design, the programmable device may be provided with an encoder at the input of the CUT, a decoder at the output of the CUT, and a pseudorandom number generator (PRNG) that outputs a value for adjusting the encoder and decoder. The value output from the PRNG stays fixed when there is no power loss, but will change to a new value during a power gating event. Operated in this way, the data read out from the CUT after the power gating event is effectively corrupted.