Computing efficient cross channel operations in parallel computing machines using systolic arrays

    Publication No.: US12093213B2

    Publication Date: 2024-09-17

    Application No.: US18310129

    Application Date: 2023-05-01

    Applicant: Intel Corporation

    IPC Classification: G06F15/80 G06F17/16 G06N20/00

    Abstract: An apparatus to facilitate computing efficient cross channel operations in parallel computing machines using systolic arrays is disclosed. The apparatus includes a plurality of registers and one or more processing elements communicably coupled to the plurality of registers. The one or more processing elements include a systolic array circuit to perform cross-channel operations on source data received from a single source register of the plurality of registers, wherein the systolic array circuit is modified to: receive inputs from the single source register at different stages of the systolic array circuit; perform cross-channel operations at channels of the systolic array circuit; bypass disabled channels of the systolic array circuit, the disabled channels not used to compute the cross-channel operations; and broadcast a final result of a final stage of the systolic array circuit to all channels of a destination register.

    Computing efficient cross channel operations in parallel computing machines using systolic arrays

    Publication No.: US11669490B2

    Publication Date: 2023-06-06

    Application No.: US17518202

    Application Date: 2021-11-03

    Applicant: Intel Corporation

    IPC Classification: G06F15/80 G06N20/00 G06F17/16

    Abstract: An apparatus to facilitate computing efficient cross channel operations in parallel computing machines using systolic arrays is disclosed. The apparatus includes a plurality of registers and one or more processing elements communicably coupled to the plurality of registers. The one or more processing elements include a systolic array circuit to perform cross-channel operations on source data received from a single source register of the plurality of registers, wherein the systolic array circuit is modified to: receive inputs from the single source register at different stages of the systolic array circuit; perform cross-channel operations at channels of the systolic array circuit; bypass disabled channels of the systolic array circuit, the disabled channels not used to compute the cross-channel operations; and broadcast a final result of a final stage of the systolic array circuit to all channels of a destination register.
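
The data flow described in the abstracts above can be loosely illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `cross_channel_reduce`, the `enabled` mask, and the choice of one stage per channel are hypothetical and do not come from the patent text. It shows a single source register feeding one value into each stage, disabled channels being bypassed, and the final-stage result being broadcast to every channel of the destination register.

```python
import operator
from typing import Callable, List, Sequence


def cross_channel_reduce(
    src: Sequence[float],
    enabled: Sequence[bool],
    op: Callable[[float, float], float],
) -> List[float]:
    """Hypothetical software model of a cross-channel reduction.

    Assumption: stage i of the modeled array receives channel i of the
    single source register and combines it with the partial result
    handed on by the previous stage.
    """
    if len(src) != len(enabled):
        raise ValueError("src and enabled must have the same number of channels")

    partial = None  # partial result passed from stage to stage
    for value, is_enabled in zip(src, enabled):
        if not is_enabled:
            # Bypass: a disabled channel forwards the partial result unchanged.
            continue
        partial = value if partial is None else op(partial, value)

    if partial is None:
        raise ValueError("at least one channel must be enabled")

    # Broadcast the final-stage result to every destination channel.
    return [partial] * len(src)


if __name__ == "__main__":
    src_register = [1, 2, 3, 4]
    mask = [True, False, True, True]
    print(cross_channel_reduce(src_register, mask, operator.add))
    print(cross_channel_reduce(src_register, mask, max))
```

In this model the operation is passed in as a parameter, so the same loop covers a sum, a maximum, or any other associative cross-channel operation; which operations the actual hardware supports is not stated in the abstract.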