-
1.
Publication Number: US10158578B2
Publication Date: 2018-12-18
Application Number: US15269295
Application Date: 2016-09-19
Applicant: INTEL CORPORATION
Inventor: Cristian Florin Dumitrescu , Andrey Chilikin , Pierre Laurent , Kannan Babu Ramia , Sravanthi Tangeda
IPC: H04L12/14 , H04L12/869 , H04L12/873 , H04L12/815 , H04L12/863 , H04L12/819 , H04L12/801 , H04L12/813 , H04L12/865 , H04L12/803 , H04L12/851
Abstract: One embodiment provides a network device. The network device includes a processor including at least one processor core; a network interface configured to transmit and receive packets at a line rate; a memory configured to store a scheduler hierarchical data structure; and a scheduler module. The scheduler module is configured to prefetch a next active pipe structure, the next active pipe structure included in the hierarchical data structure, update credits for a current pipe and an associated subport, identify a next active traffic class within the current pipe based, at least in part, on a current pipe data structure, select a next queue associated with the identified next active traffic class, and schedule a next packet from the selected next queue for transmission by the network interface if available traffic shaping token bucket credits and available traffic class credits are greater than or equal to the next packet's credits.
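The following is a minimal C sketch of the credit check this abstract describes, not the patented implementation: the subport/pipe structures, the field names, and the fixed count of four traffic classes are illustrative assumptions. A packet is scheduled only when the token-bucket (traffic shaping) credits and the per-traffic-class credits at both levels cover the packet's credit cost, and the same amount is then consumed at every level.

```c
/* Illustrative sketch only: structure layout, field names, and the four
 * traffic classes are assumptions, not the patent's definitions. */
#include <stdint.h>
#include <stdbool.h>

struct subport { uint32_t tb_credits; uint32_t tc_credits[4]; };
struct pipe    { uint32_t tb_credits; uint32_t tc_credits[4]; };

/* A next packet may be scheduled only when the traffic shaping token bucket
 * credits and the traffic class credits at both the subport and pipe levels
 * are greater than or equal to the packet's credit cost. */
static bool
credits_allow(const struct subport *sp, const struct pipe *p,
              uint32_t tc, uint32_t pkt_credits)
{
    return sp->tb_credits     >= pkt_credits &&
           sp->tc_credits[tc] >= pkt_credits &&
           p->tb_credits      >= pkt_credits &&
           p->tc_credits[tc]  >= pkt_credits;
}

/* When the packet is scheduled, the consumed credits are subtracted at
 * every level of the hierarchy. */
static void
credits_consume(struct subport *sp, struct pipe *p,
                uint32_t tc, uint32_t pkt_credits)
{
    sp->tb_credits     -= pkt_credits;
    sp->tc_credits[tc] -= pkt_credits;
    p->tb_credits      -= pkt_credits;
    p->tc_credits[tc]  -= pkt_credits;
}
```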
-
2.
Publication Number: US10789176B2
Publication Date: 2020-09-29
Application Number: US16059147
Application Date: 2018-08-09
Applicant: Intel Corporation
Inventor: Ren Wang , Yipeng Wang , Tsung-Yuan Tai , Cristian Florin Dumitrescu , Xiangyang Guo
IPC: G06F12/123 , G06F12/126 , G06F12/128 , G06F12/0864 , G06F12/0891 , G06F9/30 , G06F12/0871
Abstract: Technologies for least recently used (LRU) cache replacement include a computing device with a processor with vector instruction support. The computing device retrieves a bucket of an associative cache from memory that includes multiple entries arranged from front to back. The bucket may be a 256-bit array including eight 32-bit entries. For lookups, a matching entry is located at a position in the bucket. The computing device executes a vector permutation processor instruction that moves the matching entry to the front of the bucket while preserving the order of other entries of the bucket. For insertion, an inserted entry is written at the back of the bucket. The computing device executes a vector permutation processor instruction that moves the inserted entry to the front of the bucket while preserving the order of other entries. The permuted bucket is stored to the memory. Other embodiments are described and claimed.
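As a rough illustration of the vector permutation this abstract describes, the sketch below uses the AVX2 intrinsic _mm256_permutevar8x32_epi32 to move a matching 32-bit entry to the front of an eight-entry, 256-bit bucket while preserving the order of the other entries. The index construction and the front-at-slot-0 layout are assumptions, not the patented encoding; compile with -mavx2.

```c
#include <immintrin.h>
#include <stdint.h>

/* Move the entry at position pos to the front of the 8 x 32-bit bucket,
 * shifting entries 0..pos-1 back by one slot and leaving later entries in
 * place, so the relative order of all other entries is preserved. */
static __m256i
bucket_move_to_front(__m256i bucket, int pos)
{
    int32_t idx[8];

    idx[0] = pos;                      /* matching entry becomes the front  */
    for (int j = 1; j <= pos; j++)     /* earlier entries shift back by one */
        idx[j] = j - 1;
    for (int j = pos + 1; j < 8; j++)  /* later entries keep their slots    */
        idx[j] = j;

    __m256i perm = _mm256_loadu_si256((const __m256i *)idx);
    return _mm256_permutevar8x32_epi32(bucket, perm);
}
```

Insertion follows the same pattern: the new entry is written at the back (slot 7) and the same permutation with pos = 7 moves it to the front, after which the permuted bucket is stored back to memory.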
-
3.
Publication Number: US20250117673A1
Publication Date: 2025-04-10
Application Number: US18982209
Application Date: 2024-12-16
Applicant: Intel Corporation
Inventor: Anjali Singhai Jain , Tamar Bar-Kanarik , Marcos Carranza , Karthik Kumar , Cristian Florin Dumitrescu , Keren Guy , Patrick Connor
Abstract: Techniques described herein address challenges that arise when using host-executed software to manage vector databases by providing a vector database accelerator and shard management offload logic that is implemented within hardware and by software executed on device processors and programmable data planes of a programmable network interface device. In one embodiment, a programmable network interface device includes infrastructure management circuitry configured to facilitate data access for a neural network inference engine having a distributed data model via dynamic management of a node associated with the neural network inference engine, the node including a database shard of a vector database.
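Purely as a sketch of the shard-management idea the abstract refers to, the fragment below maps a vector identifier to the node currently serving its database shard so a query can be steered to that node. The structures and the modulo placement policy are hypothetical, and the patent places this logic on the programmable network interface device rather than in host code.

```c
#include <stdint.h>

/* Hypothetical shard map: which node currently hosts each shard of the
 * vector database. */
struct shard_map {
    uint32_t        num_shards;
    const uint32_t *shard_to_node;  /* shard index -> node identifier */
};

/* Resolve the node serving the shard that owns a given vector; updating
 * shard_to_node models the dynamic management of nodes mentioned in the
 * abstract. */
static uint32_t
shard_node_for(const struct shard_map *m, uint64_t vector_id)
{
    uint32_t shard = (uint32_t)(vector_id % m->num_shards);
    return m->shard_to_node[shard];
}
```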
-
4.
Publication Number: US20190042471A1
Publication Date: 2019-02-07
Application Number: US16059147
Application Date: 2018-08-09
Applicant: Intel Corporation
Inventor: Ren Wang , Yipeng Wang , Tsung-Yuan Tai , Cristian Florin Dumitrescu , Xiangyang Guo
IPC: G06F12/123 , G06F12/128 , G06F12/126 , G06F12/0891 , G06F12/0871 , G06F12/0864 , G06F9/30
Abstract: Technologies for least recently used (LRU) cache replacement include a computing device with a processor with vector instruction support. The computing device retrieves a bucket of an associative cache from memory that includes multiple entries arranged from front to back. The bucket may be a 256-bit array including eight 32-bit entries. For lookups, a matching entry is located at a position in the bucket. The computing device executes a vector permutation processor instruction that moves the matching entry to the front of the bucket while preserving the order of other entries of the bucket. For insertion, an inserted entry is written at the back of the bucket. The computing device executes a vector permutation processor instruction that moves the inserted entry to the front of the bucket while preserving the order of other entries. The permuted bucket is stored to the memory. Other embodiments are described and claimed.
-
5.
Publication Number: US20250139040A1
Publication Date: 2025-05-01
Application Number: US18988607
Application Date: 2024-12-19
Applicant: Intel Corporation
Inventor: Anjali Singhai Jain , Naren Mididaddi , Arunkumar Balakrishnan , Tamar Bar-Kanarik , Ji Li , Cristian Florin Dumitrescu , Shweta Shrivastava , Patrick Connor
Abstract: An apparatus includes a host interface; a network interface; hardware storage to store a flow table; and programmable circuitry comprising processors to implement network interface functionality and to: implement a hash table and an age context table, wherein the hash table and the age context table are to reference flow rules maintained in the flow table; process a synchronization packet for a flow by adding a flow rule for the flow to the flow table, adding a hash entry corresponding to the flow rule to the hash table, and adding an age context entry for the flow to the age context table; and process subsequent packets for the flow by performing a first lookup at the hash table to access the flow rule at the flow table and by performing a second lookup at the age context table to apply aging rules to the flow rule in the flow table.
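A minimal C sketch of the two-table flow handling this abstract describes follows; the direct-mapped tables, the hash function, the timestamp-based aging, and the omission of collision handling are all simplifying assumptions, and the patent implements this logic in programmable circuitry on the network interface device.

```c
#include <stdint.h>
#include <time.h>

#define TABLE_SIZE 1024u

struct flow_rule { uint64_t flow_key; uint32_t action; int valid; };
struct age_entry { uint64_t flow_key; time_t last_seen; int valid; };

static struct flow_rule flow_table[TABLE_SIZE];  /* flow rules              */
static uint32_t         hash_table[TABLE_SIZE];  /* hash slot -> flow index */
static struct age_entry age_table[TABLE_SIZE];   /* per-flow aging context  */

static uint32_t flow_hash(uint64_t key)
{
    return (uint32_t)((key * 0x9E3779B97F4A7C15ull) % TABLE_SIZE);
}

/* Synchronization packet: add the flow rule, then reference it from both
 * the hash table and the age context table. */
static void flow_syn(uint64_t key, uint32_t action)
{
    uint32_t h = flow_hash(key);

    flow_table[h] = (struct flow_rule){ .flow_key = key, .action = action, .valid = 1 };
    hash_table[h] = h;
    age_table[h]  = (struct age_entry){ .flow_key = key, .last_seen = time(NULL), .valid = 1 };
}

/* Subsequent packets: the first lookup resolves the flow rule, the second
 * refreshes the aging context so idle flows can later be expired. */
static const struct flow_rule *flow_lookup(uint64_t key)
{
    uint32_t h = flow_hash(key);

    if (!flow_table[hash_table[h]].valid || flow_table[hash_table[h]].flow_key != key)
        return NULL;
    age_table[h].last_seen = time(NULL);
    return &flow_table[hash_table[h]];
}
```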
-
6.
Publication Number: US10354033B2
Publication Date: 2019-07-16
Application Number: US15711740
Application Date: 2017-09-21
Applicant: Intel Corporation
Inventor: Cristian Florin Dumitrescu , Jasvinder Singh , Patrick Lu
Abstract: One embodiment provides a system to identify a “best” usage of a given set of CPU cores to maximize performance of a given application. The given application is parsed into a number of functional blocks, and the system maps the functional blocks to the given set of CPU cores to maximize the performance of the given application. The system determines and then tests various mappings to determine the performance, generally preferring mappings that maximize throughput per physical core. Before testing a mapping, the system determines whether the mapping is redundant with any previously tested mappings. In addition, given a performance target for the given application, the system determines a minimum number of CPU cores needed for the application to meet the application performance target.
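As an illustration of the search this abstract describes, the C sketch below exhaustively assigns functional blocks to cores and returns the smallest core count whose best mapping reaches a performance target. measure_throughput() is a hypothetical stand-in for actually running and profiling a candidate mapping, and the redundancy pruning mentioned in the abstract is omitted.

```c
/* Hypothetical hook: the real system would run the application with each
 * functional block pinned to its assigned core and measure throughput. */
static double measure_throughput(const int *block_to_core, int num_blocks,
                                 int num_cores)
{
    (void)block_to_core; (void)num_blocks; (void)num_cores;
    return 0.0;  /* placeholder measurement */
}

/* Exhaustively assign each functional block to one of num_cores cores and
 * return the best throughput observed (redundant-mapping pruning omitted). */
static double best_mapping(int *map, int block, int num_blocks, int num_cores)
{
    if (block == num_blocks)
        return measure_throughput(map, num_blocks, num_cores);

    double best = 0.0;
    for (int core = 0; core < num_cores; core++) {
        map[block] = core;
        double t = best_mapping(map, block + 1, num_blocks, num_cores);
        if (t > best)
            best = t;
    }
    return best;
}

/* Smallest number of cores whose best mapping meets the target, or -1. */
static int min_cores_for_target(int num_blocks, int max_cores, double target)
{
    int map[64];

    if (num_blocks > 64)
        return -1;
    for (int cores = 1; cores <= max_cores; cores++)
        if (best_mapping(map, 0, num_blocks, cores) >= target)
            return cores;
    return -1;
}
```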
-
7.
Publication Number: US20170149678A1
Publication Date: 2017-05-25
Application Number: US15396488
Application Date: 2016-12-31
Applicant: INTEL CORPORATION
Inventor: Cristian Florin Dumitrescu , Andrey Chilikin , Pierre Laurent , Kannan Babu Ramia , Sravanthi Tangeda
IPC: H04L12/869 , H04L12/803 , H04L12/819 , H04L12/813 , H04L12/815 , H04L12/851
CPC classification number: H04L47/60 , H04L12/1439 , H04L47/10 , H04L47/125 , H04L47/20 , H04L47/21 , H04L47/215 , H04L47/22 , H04L47/2408 , H04L47/2433 , H04L47/2441 , H04L47/39 , H04L47/50 , H04L47/527 , H04L47/623 , H04L47/6255 , H04L47/627 , H04L47/6275
Abstract: One embodiment provides a network device. The network device includes a processor including at least one processor core; a network interface configured to transmit and receive packets at a line rate; a memory configured to store a scheduler hierarchical data structure; and a scheduler module. The scheduler module is configured to prefetch a next active pipe structure, the next active pipe structure included in the hierarchical data structure, update credits for a current pipe and an associated subport, identify a next active traffic class within the current pipe based, at least in part, on a current pipe data structure, select a next queue associated with the identified next active traffic class, and schedule a next packet from the selected next queue for transmission by the network interface if available traffic shaping token bucket credits and available traffic class credits are greater than or equal to the next packet's credits.
-
8.
Publication Number: US10091122B2
Publication Date: 2018-10-02
Application Number: US15396488
Application Date: 2016-12-31
Applicant: INTEL CORPORATION
Inventor: Cristian Florin Dumitrescu , Andrey Chilikin , Pierre Laurent , Kannan Babu Ramia , Sravanthi Tangeda
IPC: H04L12/869 , H04L12/815 , H04L12/851 , H04L12/819 , H04L12/813 , H04L12/803
Abstract: One embodiment provides a network device. The network device includes a processor including at least one processor core; a network interface configured to transmit and receive packets at a line rate; a memory configured to store a scheduler hierarchical data structure; and a scheduler module. The scheduler module is configured to prefetch a next active pipe structure, the next active pipe structure included in the hierarchical data structure, update credits for a current pipe and an associated subport, identify a next active traffic class within the current pipe based, at least in part, on a current pipe data structure, select a next queue associated with the identified next active traffic class, and schedule a next packet from the selected next queue for transmission by the network interface if available traffic shaping token bucket credits and available traffic class credits are greater than or equal to the next packet's credits.
-
9.
Publication Number: US20170070356A1
Publication Date: 2017-03-09
Application Number: US15269295
Application Date: 2016-09-19
Applicant: INTEL CORPORATION
Inventor: Cristian Florin Dumitrescu , Andrey Chilikin , Pierre Laurent , Kannan Babu Ramia , Sravanthi Tangeda
IPC: H04L12/14 , H04L12/801 , H04L12/863 , H04L12/865 , H04L12/819 , H04L12/815
CPC classification number: H04L47/60 , H04L12/1439 , H04L47/10 , H04L47/125 , H04L47/20 , H04L47/21 , H04L47/215 , H04L47/22 , H04L47/2408 , H04L47/2433 , H04L47/2441 , H04L47/39 , H04L47/50 , H04L47/527 , H04L47/623 , H04L47/6255 , H04L47/627 , H04L47/6275
Abstract: One embodiment provides a network device. The network device includes a processor including at least one processor core; a network interface configured to transmit and receive packets at a line rate; a memory configured to store a scheduler hierarchical data structure; and a scheduler module. The scheduler module is configured to prefetch a next active pipe structure, the next active pipe structure included in the hierarchical data structure, update credits for a current pipe and an associated subport, identify a next active traffic class within the current pipe based, at least in part, on a current pipe data structure, select a next queue associated with the identified next active traffic class, and schedule a next packet from the selected next queue for transmission by the network interface if available traffic shaping token bucket credits and available traffic class credits are greater than or equal to the next packet's credits.
-