-
Publication No.: US10970081B2
Publication date: 2021-04-06
Application No.: US15637629
Filing date: 2017-06-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Jiasheng Chen, Bin He, Mohammad Reza Hakami, Timothy Lottes, Justin David Smith, Michael J. Mantor, Derek Carson
Abstract: Systems, apparatuses, and methods for implementing a decoupled crossbar for a stream processor are disclosed. In one embodiment, a system includes at least a multi-lane execution pipeline, a vector register file, and a crossbar. The system is configured to determine if a given instruction in an instruction stream requires a permutation on data operands retrieved from the vector register file. The system conveys the data operands to the multi-lane execution pipeline on a first path which includes the crossbar responsive to determining the given instruction requires a permutation on the data operands. The crossbar then performs the necessary permutation to route the data operands to the proper processing lanes. Otherwise, the system conveys the data operands to the multi-lane execution pipeline on a second path which bypasses the crossbar responsive to determining the given instruction does not require a permutation on the input operands.
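The two operand paths described in the abstract can be illustrated with a minimal software sketch. This is not the patented hardware design: the function name `route_operands` and the `lane_select` argument are hypothetical, and encoding the permutation as a list of source-lane indices is an assumption made only for illustration. The abstract does not say how the permutation is specified or detected.

```python
from typing import List, Optional, Sequence


def route_operands(
    operands: Sequence[int],
    lane_select: Optional[Sequence[int]] = None,
) -> List[int]:
    """Deliver one operand per processing lane.

    `lane_select` is a hypothetical encoding: lane_select[i] names the
    source position whose operand should arrive at lane i. None means the
    instruction needs no permutation.
    """
    if lane_select is None:
        # Second path: bypass the crossbar, operands go straight to lanes.
        return list(operands)

    # First path: the crossbar reorders operands before they reach the lanes.
    n = len(operands)
    if sorted(lane_select) != list(range(n)):
        raise ValueError("lane_select must be a permutation of the lane indices")
    return [operands[src] for src in lane_select]


if __name__ == "__main__":
    data = [10, 20, 30, 40]
    print(route_operands(data))                # bypass path: [10, 20, 30, 40]
    print(route_operands(data, [3, 2, 1, 0]))  # crossbar path: [40, 30, 20, 10]
```

In this sketch the choice of path depends only on whether a permutation is requested, which mirrors the decoupling the abstract describes: instructions that need no permutation never pass through the crossbar.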
-
Publication No.: US20190004814A1
Publication date: 2019-01-03
Application No.: US15637629
Filing date: 2017-06-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Jiasheng Chen, Bin He, Mohammad Reza Hakami, Timothy Lottes, Justin David Smith, Michael J. Mantor, Derek Carson
Abstract: Systems, apparatuses, and methods for implementing a decoupled crossbar for a stream processor are disclosed. In one embodiment, a system includes at least a multi-lane execution pipeline, a vector register file, and a crossbar. The system is configured to determine if a given instruction in an instruction stream requires a permutation on data operands retrieved from the vector register file. The system conveys the data operands to the multi-lane execution pipeline on a first path which includes the crossbar responsive to determining the given instruction requires a permutation on the data operands. The crossbar then performs the necessary permutation to route the data operands to the proper processing lanes. Otherwise, the system conveys the data operands to the multi-lane execution pipeline on a second path which bypasses the crossbar responsive to determining the given instruction does not require a permutation on the input operands.
-