-
公开(公告)号:US10929323B2
公开(公告)日:2021-02-23
申请号:US16601137
申请日:2019-10-14
Applicant: Intel Corporation
Inventor: Ren Wang , Yipeng Wang , Andrew Herdrich , Jr-Shian Tsai , Tsung-Yuan C. Tai , Niall D. McDonnell , Hugh Wilkinson , Bradley A. Burres , Bruce Richardson , Namakkal N. Venkatesan , Debra Bernstein , Edwin Verplanke , Stephen R. Van Doren , An Yan , Andrew Cunningham , David Sonnier , Gage Eads , James T. Clee , Jamison D. Whitesell , Jerry Pirog , Jonathan Kenny , Joseph R. Hasting , Narender Vangati , Stephen Miller , Te K. Ma , William Burroughs
IPC: G06F13/37 , G06F9/54 , G06F12/0868 , G06F12/0811 , G06F13/16 , G06F12/04 , G06F9/38
Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate in which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
-
公开(公告)号:US20200042479A1
公开(公告)日:2020-02-06
申请号:US16601137
申请日:2019-10-14
Applicant: Intel Corporation
Inventor: Ren Wang , Yipeng Wang , Andrew Herdrich , Jr-Shian Tsai , Tsung-Yuan C. Tai , Niall D. McDonnell , Hugh Wilkinson , Bradley A. Burres , Bruce Richardson , Namakkal N. Venkatesan , Debra Bernstein , Edwin Verplanke , Stephen R. Van Doren , An Yan , Andrew Cunningham , David Sonnier , Gage Eads , James T. Clee , Jamison D. Whitesell , Jerry Pirog , Jonathan Kenny , Joseph R. Hasting , Narender Vangati , Stephen Miller , Te K. Ma , William Burroughs
IPC: G06F13/37 , G06F12/0811 , G06F13/16 , G06F9/54 , G06F12/0868
Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate in which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
-
公开(公告)号:US11709702B2
公开(公告)日:2023-07-25
申请号:US16777102
申请日:2020-01-30
Applicant: Intel Corporation
Inventor: Joseph R. Hasting , William G. Burroughs
CPC classification number: G06F9/4843 , G06F9/505
Abstract: A system and method are described for work conserving, load balancing, and scheduling by a network processor. For example, one embodiment of a system includes a plurality of processing cores, including a scheduling circuit, at least one source processing core that generates at least one task and at least one destination processing core that receives and processes the at least one task, and generates a response. The scheduling circuit of the exemplary system receives the at least one task and conducts a load balancing to select the at least one destination processing core. In an embodiment, the scheduling circuit further detects a critical sequences of tasks, schedules those tasks to be processed by a single destination processing core, and, upon completion of the critical sequence, conducts another load balancing to potentially select a different processing core to process more tasks.
-
公开(公告)号:US20200241915A1
公开(公告)日:2020-07-30
申请号:US16777102
申请日:2020-01-30
Applicant: Intel Corporation
Inventor: Joseph R. Hasting , William G. Burroughs
Abstract: A system and method are described for work conserving, load balancing, and scheduling by a network processor. For example, one embodiment of a system includes a plurality of processing cores, including a scheduling circuit, at least one source processing core that generates at least one task and at least one destination processing core that receives and processes the at least one task, and generates a response. The scheduling circuit of the exemplary system receives the at least one task and conducts a load balancing to select the at least one destination processing core. In an embodiment, the scheduling circuit further detects a critical sequences of tasks, schedules those tasks to be processed by a single destination processing core, and, upon completion of the critical sequence, conducts another load balancing to potentially select a different processing core to process more tasks.
-
5.
公开(公告)号:US20180365053A1
公开(公告)日:2018-12-20
申请号:US15626806
申请日:2017-06-19
Applicant: Intel Corporation
Inventor: William G. Burroughs , Jerry Pirog , Joseph R. Hasting , Te K. Ma
IPC: G06F9/48
CPC classification number: G06F9/4881
Abstract: Apparatus and method for multi-core dynamically-balanced task processing while maintaining task order in chip multiprocessor platforms. One embodiment of an apparatus includes: a distribution circuitry to distribute, among a plurality of processing units, tasks from one or more workflows; a history list to track all tasks distributed by the distribution circuitry; an ordering queue to store one or more sub-tasks received from a first processing unit as a result of the first processing unit processing a first task; and wherein, responsive to a detection that all sub-tasks of the first task have been received and that the first task is the oldest task for a given parent workflow tracked by the history list, all sub-tasks associated with the first task are to be placed in a replay queue to be replayed in the order in which each sub-task was received.
-
公开(公告)号:US10552205B2
公开(公告)日:2020-02-04
申请号:US15089522
申请日:2016-04-02
Applicant: Intel Corporation
Inventor: Joseph R. Hasting , William G. Burroughs
Abstract: A system and method are described for work conserving, load balancing, and scheduling by a network processor. For example, one embodiment of a system includes a plurality of processing cores, including a scheduling circuit, at least one source processing core that generates at least one task and at least one destination processing core that receives and processes the at least one task, and generates a response. The scheduling circuit of the exemplary system receives the at least one task and conducts a load balancing to select the at least one destination processing core. In an embodiment, the scheduling circuit further detects a critical sequences of tasks, schedules those tasks to be processed by a single destination processing core, and, upon completion of the critical sequence, conducts another load balancing to potentially select a different processing core to process more tasks.
-
公开(公告)号:US10445271B2
公开(公告)日:2019-10-15
申请号:US14987676
申请日:2016-01-04
Applicant: Intel Corporation
Inventor: Ren Wang , Namakkal N. Venkatesan , Debra Bernstein , Edwin Verplanke , Stephen R. Van Doren , An Yan , Andrew Cunningham , David Sonnier , Gage Eads , James T. Clee , Jamison D. Whitesell , Yipeng Wang , Jerry Pirog , Jonathan Kenny , Joseph R. Hasting , Narender Vangati , Stephen Miller , Te K. Ma , William Burroughs , Andrew J. Herdrich , Jr-Shian Tsai , Tsung-Yuan C. Tai , Niall D. McDonnell , Hugh Wilkinson , Bradley A. Burres , Bruce Richardson
IPC: G06F13/37 , G06F12/0811 , G06F13/16 , G06F12/0868 , G06F12/04 , G06F9/38
Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate in which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
-
8.
公开(公告)号:US10437638B2
公开(公告)日:2019-10-08
申请号:US15626806
申请日:2017-06-19
Applicant: Intel Corporation
Inventor: William G. Burroughs , Jerry Pirog , Joseph R. Hasting , Te K. Ma
Abstract: Apparatus and method for multi-core dynamically-balanced task processing while maintaining task order in chip multiprocessor platforms. One embodiment of an apparatus includes: a distribution circuitry to distribute, among a plurality of processing units, tasks from one or more workflows; a history list to track all tasks distributed by the distribution circuitry; an ordering queue to store one or more sub-tasks received from a first processing unit as a result of the first processing unit processing a first task; and wherein, responsive to a detection that all sub-tasks of the first task have been received and that the first task is the oldest task for a given parent workflow tracked by the history list, all sub-tasks associated with the first task are to be placed in a replay queue to be replayed in the order in which each sub-task was received.
-
公开(公告)号:US20170286157A1
公开(公告)日:2017-10-05
申请号:US15089522
申请日:2016-04-02
Applicant: Intel Corporation
Inventor: Joseph R. Hasting , William G. Burroughs
Abstract: A system and method are described for work conserving, load balancing, and scheduling by a network processor. For example, one embodiment of a system includes a plurality of processing cores, including a scheduling circuit, at least one source processing core that generates at least one task and at least one destination processing core that receives and processes the at least one task, and generates a response. The scheduling circuit of the exemplary system receives the at least one task and conducts a load balancing to select the at least one destination processing core. In an embodiment, the scheduling circuit further detects a critical sequences of tasks, schedules those tasks to be processed by a single destination processing core, and, upon completion of the critical sequence, conducts another load balancing to potentially select a different processing core to process more tasks.
-
-
-
-
-
-
-
-