-
1.
Publication No.: US10678724B1
Publication date: 2020-06-09
Application No.: US16236423
Application date: 2018-12-29
Applicant: Intel Corporation
Inventor: Kermin ChoFleming , Simon Steely, Jr. , Kent Glossop
IPC: G06F3/00 , G06F13/24 , H04L29/08 , H04L12/933
Abstract: Systems, methods, and apparatuses relating to in-network storage for a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a plurality of processing elements; a circuit switched interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the circuit switched interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements; and an in-network storage element of the circuit switched interconnect network comprising a queue coupled to an output queue of a first processing element, and a controller that switches the in-network storage element into a first mode that provides a value stored in the queue of the in-network storage element by the output queue of the first processing element to an input queue of a second processing element when a configuration value is a first value, and into a second mode that bypasses the queue of the in-network storage element and provides a value from the output queue of the first processing element to the input queue of the second processing element when the configuration value is a second value.
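The two modes described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class name `InNetworkStorageElement`, the constants `BUFFERED` and `BYPASS`, the `capacity` parameter, the one-value-per-cycle timing, and the helper `run` are all hypothetical, and backpressure from the consumer is ignored for brevity.

```python
from collections import deque

# Hypothetical encodings of the "first value" and "second value" of the
# configuration value; the abstract does not specify concrete encodings.
BUFFERED = 0
BYPASS = 1


class InNetworkStorageElement:
    """Toy model of a storage element sitting between two processing elements."""

    def __init__(self, config_value, capacity=2):
        if config_value not in (BUFFERED, BYPASS):
            raise ValueError("unknown configuration value")
        self.config_value = config_value
        self.capacity = capacity  # assumed queue depth, not from the source
        self.queue = deque()

    def step(self, producer_out, consumer_in):
        """Advance one cycle, moving at most one value per hop."""
        if self.config_value == BYPASS:
            # Second mode: skip the internal queue entirely.
            if producer_out:
                consumer_in.append(producer_out.popleft())
            return
        # First mode: the consumer is fed from the internal queue,
        # which is then refilled from the producer's output queue.
        if self.queue:
            consumer_in.append(self.queue.popleft())
        if producer_out and len(self.queue) < self.capacity:
            self.queue.append(producer_out.popleft())


def run(config_value, values, cycles):
    """Drive the element for a number of cycles and record consumer occupancy."""
    producer_out = deque(values)
    consumer_in = deque()
    element = InNetworkStorageElement(config_value)
    occupancy = []
    for _ in range(cycles):
        element.step(producer_out, consumer_in)
        occupancy.append(len(consumer_in))
    return list(consumer_in), occupancy
```

In this model both modes deliver the same values in the same order. The buffered mode adds one cycle of latency in exchange for extra storage on the path.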
-
2.
Publication No.: US11106438B2
Publication date: 2021-08-31
Application No.: US16832797
Application date: 2020-03-27
Applicant: Intel Corporation
Inventor: Dounia Khaldi , Rakesh Krishnaiyer , Rajiv Deodhar , Daniel Woodworth , Joshua Cranmer , Kent Glossop
Abstract: Various embodiments are generally directed to optimizing dataflow in automated transformation frameworks (e.g., compiler, runtime, etc.) for spatial architectures (e.g., Configurable Spatial Accelerator) that translate high-level user code into forms that use “streams” (e.g., Latency Insensitive Channels, line buffers) to reduce overhead, eliminate or improve the efficiency of redundant memory accesses, and improve overall throughput.
-
3.
Publication No.: US10402176B2
Publication date: 2019-09-03
Application No.: US15855964
Application date: 2017-12-27
Applicant: Intel Corporation
Inventor: Kent Glossop , Kermin Fleming , Yongzhi Zhang , Simon Steely, Jr. , Jim Sukha , Uma Srinivasan
Abstract: Methods, apparatus, systems and articles of manufacture to compile code to generate dataflow code are described. An example compiler apparatus includes an intermediate representation transformer to transform input software code to intermediate representation code; an instruction selector to insert machine instructions of a target execution platform in the intermediate representation code to generate machine intermediate representation code; and a target machine transformer to: convert a portion of the machine intermediate representation code to dataflow code to generate dataflow intermediate representation code; and allocate registers within the dataflow intermediate representation code.
-
4.
Publication No.: US20190042217A1
Publication date: 2019-02-07
Application No.: US15855964
Application date: 2017-12-27
Applicant: Intel Corporation
Inventor: Kent Glossop , Kermin Fleming , Yongzhi Zhang , Simon Steely, Jr. , James Sukha , Uma Srinivasan
IPC: G06F8/41
CPC classification number: G06F8/433 , G06F8/441 , G06F8/443 , G06F8/4441 , G06F8/447
Abstract: Methods, apparatus, systems and articles of manufacture to compile code to generate dataflow code are described. An example compiler apparatus includes an intermediate representation transformer to transform input software code to intermediate representation code; an instruction selector to insert machine instructions of a target execution platform in the intermediate representation code to generate machine intermediate representation code; and a target machine transformer to: convert a portion of the machine intermediate representation code to dataflow code to generate dataflow intermediate representation code; and allocate registers within the dataflow intermediate representation code.
-
5.
Publication No.: US11366647B2
Publication date: 2022-06-21
Application No.: US16863315
Application date: 2020-04-30
Applicant: Intel Corporation
Inventor: Rajiv Deodhar , Sergey Dmitriev , Daniel Woodworth , Rakesh Krishnaiyer , Kent Glossop , Arvind Sudarsanam
Abstract: Systems, apparatuses and methods may provide for technology that detects one or more local variables in source code, wherein the local variable(s) lack dependencies across iterations of a loop in the source code, automatically generates pipeline execution code for the local variable(s), and incorporates the pipeline execution code into an output of a compiler. In one example, the pipeline execution code includes an initialization of a pool of buffer storage for the local variable(s).
-
6.
Publication No.: US20200257510A1
Publication date: 2020-08-13
Application No.: US16863315
Application date: 2020-04-30
Applicant: Intel Corporation
Inventor: Rajiv Deodhar , Sergey Dmitriev , Daniel Woodworth , Rakesh Krishnaiyer , Kent Glossop , Arvind Sudarsanam
Abstract: Systems, apparatuses and methods may provide for technology that detects one or more local variables in source code, wherein the local variable(s) lack dependencies across iterations of a loop in the source code, automatically generates pipeline execution code for the local variable(s), and incorporates the pipeline execution code into an output of a compiler. In one example, the pipeline execution code includes an initialization of a pool of buffer storage for the local variable(s).
-
7.
Publication No.: US20200210358A1
Publication date: 2020-07-02
Application No.: US16236423
Application date: 2018-12-29
Applicant: Intel Corporation
Inventor: Kermin ChoFleming , Simon Steely, Jr. , Kent Glossop
IPC: G06F13/24 , H04L12/933 , H04L29/08
Abstract: Systems, methods, and apparatuses relating to in-network storage for a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a plurality of processing elements; a circuit switched interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the circuit switched interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements; and an in-network storage element of the circuit switched interconnect network comprising a queue coupled to an output queue of a first processing element, and a controller that switches the in-network storage element into a first mode that provides a value stored in the queue of the in-network storage element by the output queue of the first processing element to an input queue of a second processing element when a configuration value is a first value, and into a second mode that bypasses the queue of the in-network storage element and provides a value from the output queue of the first processing element to the input queue of the second processing element when the configuration value is a second value.