Patent search: ap:("Kshitij A. Doshi" OR "Namakkal N. Venkatesan" OR "Ren Wang" OR "Andrew J. Herdrich") AND inv:"Ren Wang" (page 1)

1.

Invention application
PROCESSORS AND METHODS FOR MANAGING CACHE TIERING WITH GATHER-SCATTER VECTOR SEMANTICS [Pending, published]

Publication number: US20180095880A1

Publication date: 2018-04-05

Application number: US15282483

Filing date: 2016-09-30

Applicants: Kshitij A. Doshi, Namakkal N. Venkatesan, Ren Wang, Andrew J. Herdrich

Inventors: Kshitij A. Doshi, Namakkal N. Venkatesan, Ren Wang, Andrew J. Herdrich

IPC classification: G06F12/0811, G06F9/30

CPC classification: G06F12/0811, G06F9/3004, G06F12/128, G06F2212/283, Y02D10/13

Abstract: Processors and methods implementing a machine instruction to perform cache line demotion on multiple cache lines to enable efficient sharing of cache lines between processor cores. One general aspect includes a processor comprising a plurality of hardware processor cores, where each of the hardware processor cores is to include a first cache. The processor also includes a second cache, communicatively coupled to and shared by the plurality of hardware processor cores. The processor is to support a first machine instruction, the first machine instruction to include a vector register operand identifying a vector register which contains a plurality of data elements, each used to identify a cache line. An execution of the first machine instruction by one of the plurality of hardware processor cores is to cause a plurality of identified cache lines to be demoted, such that the demoted cache lines are moved from the first cache to the second cache.
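
As a rough illustration of the abstract above, the following toy Python model treats each core's first cache and the shared second cache as dictionaries and demotes every line named by an element of a vector operand. The names `ToyProcessor`, `store`, and `demote_lines`, and the 64-byte line size, are assumptions made for this sketch; it does not represent the claimed instruction or any real instruction set.

```python
LINE_SIZE = 64  # assumed cache line size in bytes (not stated in the abstract)


class ToyProcessor:
    """Toy model: one private first cache per core plus one shared second cache."""

    def __init__(self, num_cores):
        # Per-core first cache: line address -> data
        self.first_cache = [dict() for _ in range(num_cores)]
        # Shared second cache: line address -> data
        self.second_cache = {}

    def store(self, core, address, data):
        """Place data in the first cache of the given core."""
        self.first_cache[core][address - address % LINE_SIZE] = data

    def demote_lines(self, core, vector_register):
        """Demote every cache line identified by an element of the vector operand.

        Each element is treated as an address; the line containing it is moved
        from the core's first cache to the shared second cache.
        """
        demoted = []
        for element in vector_register:
            line = element - element % LINE_SIZE
            if line in self.first_cache[core]:
                self.second_cache[line] = self.first_cache[core].pop(line)
                demoted.append(line)
        return demoted


cpu = ToyProcessor(num_cores=2)
cpu.store(0, 0x1000, "packet A")
cpu.store(0, 0x2040, "packet B")
# 0x3000 is not cached by core 0, so it is skipped.
demoted = cpu.demote_lines(0, [0x1008, 0x2040, 0x3000])
```

After the call, both lines live only in the shared cache, where another core could pick them up without snooping core 0's private cache.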

2.

Invention application
HARDWARE/SOFTWARE CO-OPTIMIZATION TO IMPROVE PERFORMANCE AND ENERGY FOR INTER-VM COMMUNICATION FOR NFVS AND OTHER PRODUCER-CONSUMER WORKLOADS [In force]

Publication number: US20210004328A1

Publication date: 2021-01-07

Application number: US17027248

Filing date: 2020-09-21

Applicants: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-Shian Tsai, Alexander W. Min, Tsung-Yuan C. Tai, Christian Maciocco, Rajesh Sankaran

Inventors: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-Shian Tsai, Alexander W. Min, Tsung-Yuan C. Tai, Christian Maciocco, Rajesh Sankaran

IPC classification: G06F12/0842, G06F12/0831, G06F12/0893, G06F12/109, G06F12/0813, G06F9/455

Abstract: Methods and apparatus implementing hardware/software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in a multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.

3.

Invention application
SOFTWARE-TRANSPARENT HARDWARE PREDICTOR FOR CORE-TO-CORE DATA TRANSFER OPTIMIZATION [Pending, published]

Publication number: US20190102303A1

Publication date: 2019-04-04

Application number: US15721249

Filing date: 2017-09-29

Applicants: Ren Wang, Joseph Nuzman, Samantika S. Sury, Andrew J. Herdrich, Namakkal N. Venkatesan, Anil Vasudevan, Tsung-Yuan C. Tai, Niall D. McDonnell

Inventors: Ren Wang, Joseph Nuzman, Samantika S. Sury, Andrew J. Herdrich, Namakkal N. Venkatesan, Anil Vasudevan, Tsung-Yuan C. Tai, Niall D. McDonnell

IPC classification: G06F12/0831, G06F12/084, G06F12/0811

Abstract: Apparatus, method, and system for implementing a software-transparent hardware predictor for core-to-core data communication optimization are described herein. An embodiment of the apparatus includes a plurality of hardware processor cores each including a private cache; a shared cache that is communicatively coupled to and shared by the plurality of hardware processor cores; and a predictor circuit. The predictor circuit is to track activities relating to a plurality of monitored cache lines in the private cache of a producer hardware processor core (producer core) and to enable a cache line push operation upon determining a target hardware processor core (target core) based on the tracked activities. An execution of the cache line push operation is to cause a plurality of unmonitored cache lines in the private cache of the producer core to be moved to the private cache of the target core.

4.

Invention application
MULTI-CORE COMMUNICATION ACCELERATION USING HARDWARE QUEUE DEVICE [Pending, published]

Publication number: US20170192921A1

Publication date: 2017-07-06

Application number: US14987676

Filing date: 2016-01-04

Applicants: Ren Wang, Yipeng Wang, Andrew J. Herdrich, Jr-Shian Tsai, Tsung-Yuan C. Tai, Niall D. McDonnell, Hugh Wilkinson, Bradley A. Burres, Bruce Richardson, Namakkal N. Venkatesan, Debra Bernstein, Edwin Verplanke, Stephen R. Van Doren, An Yan, Andrew Cunningham, David Sonnier, Gage Eads, James T. Clee, Jamison D. Whitesell, Jerry Pirog, Jonathan Kenny, Joseph R. Hasting, Narender Vangati, Stephen Miller, Te K. Ma, William Burroughs

Inventors: Ren Wang, Yipeng Wang, Andrew J. Herdrich, Jr-Shian Tsai, Tsung-Yuan C. Tai, Niall D. McDonnell, Hugh Wilkinson, Bradley A. Burres, Bruce Richardson, Namakkal N. Venkatesan, Debra Bernstein, Edwin Verplanke, Stephen R. Van Doren, An Yan, Andrew Cunningham, David Sonnier, Gage Eads, James T. Clee, Jamison D. Whitesell, Jerry Pirog, Jonathan Kenny, Joseph R. Hasting, Narender Vangati, Stephen Miller, Te K. Ma, William Burroughs

IPC classification: G06F13/37, G06F13/16, G06F12/08

CPC classification: G06F13/37, G06F9/3004, G06F9/46, G06F12/04, G06F12/0811, G06F12/0868, G06F13/1642, G06F13/1673, G06F2212/283, G06F2212/6046

Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache ("LLC"), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate at which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.

5.

Invention application
TECHNOLOGIES FOR NETWORK PACKET CACHE MANAGEMENT [In force]

Publication number: US20160182351A1

Publication date: 2016-06-23

Application number: US14580792

Filing date: 2014-12-23

Applicants: Ren Wang, Sameh Gobriel, Christian Maciocco, Tsung-Yuan C. Tai, Ben-Zion Friedman, Hang T. Nguyen, Namakkal N. Venkatesan, Michael A. O'Hanlon, Shrikant M. Shah, Sanjeev Jain

Inventors: Ren Wang, Sameh Gobriel, Christian Maciocco, Tsung-Yuan C. Tai, Ben-Zion Friedman, Hang T. Nguyen, Namakkal N. Venkatesan, Michael A. O'Hanlon, Shrikant M. Shah, Sanjeev Jain

IPC classification: H04L12/757, H04L12/721, H04L12/727

CPC classification: H04L67/2852, H04L41/0893, H04L45/38, H04L45/745, H04L45/7453, H04L49/00

Abstract: Technologies for identifying a cache line of a network packet for eviction from an on-processor cache of a network device communicatively coupled to a network controller. The network device is configured to determine whether a cache line of the cache corresponding to the network packet is to be evicted from the cache based on a determination that the network packet is not needed subsequent to processing the network packet, and provide an indication that the cache line is to be evicted from the cache based on an eviction policy received from the network controller.

6.

Invention application
TECHNOLOGIES FOR NETWORK DEVICE FLOW LOOKUP MANAGEMENT [Pending, published]

Publication number: US20160182373A1

Publication date: 2016-06-23

Application number: US14580801

Filing date: 2014-12-23

Applicants: Ren Wang, Namakkal N. Venkatesan, Aamer Jaleel, Tsung-Yuan C. Tai, Sameh Gobriel, Christian Maciocco

Inventors: Ren Wang, Namakkal N. Venkatesan, Aamer Jaleel, Tsung-Yuan C. Tai, Sameh Gobriel, Christian Maciocco

IPC classification: H04L12/743, H04L12/747, H04L29/06

CPC classification: H04L45/7453, H04L45/742, H04L69/22

Abstract: Technologies for managing network flow lookups of a network device include a network controller and a target device, each communicatively coupled to the network device. The network device includes a cache for a processor of the network device and a main memory. The network device additionally includes a multi-level hash table having a first-level hash table stored in the cache of the network device and a second-level hash table stored in the main memory of the network device. The network device is configured to determine whether to store a network flow hash corresponding to a network flow indicating the target device in the first-level or second-level hash table based on a priority of the network flow provided to the network device by the network controller.
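
The priority-based placement described in this abstract can be sketched as a small Python model. The class name `TwoLevelFlowTable`, the fixed first-level capacity, and the numeric priority threshold are all assumptions for illustration; the abstract does not say how the placement decision is actually made beyond its dependence on a controller-provided priority.

```python
class TwoLevelFlowTable:
    """Toy multi-level flow table.

    first_level stands in for the cache-resident hash table and
    second_level for the table held in main memory.
    """

    def __init__(self, first_level_capacity, priority_threshold):
        self.first_level = {}
        self.second_level = {}
        self.first_level_capacity = first_level_capacity  # assumed size limit
        self.priority_threshold = priority_threshold      # assumed placement rule

    def insert(self, flow_hash, target_device, priority):
        """Store a flow and return the level (1 or 2) it was placed in."""
        # Drop any stale copy so a flow never lives in both levels.
        self.first_level.pop(flow_hash, None)
        self.second_level.pop(flow_hash, None)
        if (priority >= self.priority_threshold
                and len(self.first_level) < self.first_level_capacity):
            self.first_level[flow_hash] = target_device
            return 1
        self.second_level[flow_hash] = target_device
        return 2

    def lookup(self, flow_hash):
        """Check the first level, then fall back to the second."""
        if flow_hash in self.first_level:
            return self.first_level[flow_hash]
        return self.second_level.get(flow_hash)


table = TwoLevelFlowTable(first_level_capacity=1, priority_threshold=5)
flow_a, flow_b, flow_c = ("10.0.0.1", 80), ("10.0.0.2", 80), ("10.0.0.3", 80)
levels = [
    table.insert(flow_a, "port-1", priority=9),  # high priority, room available
    table.insert(flow_b, "port-2", priority=1),  # low priority
    table.insert(flow_c, "port-3", priority=9),  # high priority, first level full
]
```

In this sketch a full first level simply spills to the second level; a real design might instead evict a lower-priority entry.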

7.

Invention application
MECHANISM TO SUPPORT MULTIPLE-WRITER/MULTIPLE-READER CONCURRENCY FOR SOFTWARE FLOW/PACKET CLASSIFICATION ON GENERAL PURPOSE MULTI-CORE SYSTEMS [Pending, published]

Publication number: US20170163575A1

Publication date: 2017-06-08

Application number: US14960993

Filing date: 2015-12-07

Applicants: Ren Wang, Christian Maciocco, Namakkal N. Venkatesan, Tsung-Yuan C. Tai

Inventors: Ren Wang, Christian Maciocco, Namakkal N. Venkatesan, Tsung-Yuan C. Tai

IPC classification: H04L12/861, H04L12/743, H04L12/741, H04L12/721

CPC classification: H04L49/9094, H04L45/38, H04L45/54, H04L45/7453

Abstract: Methods and apparatus to support multiple-writer/multiple-reader concurrency for software flow/packet classification on general purpose multi-core systems. A flow table with rows mapped to respective hash buckets with multiple entry slots is implemented in memory of a host platform with multiple cores, with each bucket being associated with a version counter. Multiple writer and reader threads are run on the cores, with writers providing updates to the flow table data. In connection with inserting new key data, a determination is made as to which buckets will be changed, and access rights to those buckets are acquired prior to making any changes. For example, under a flow table employing cuckoo hashing, access rights are acquired to buckets along a full cuckoo path. Once the access rights are obtained, a writer is enabled to update data in the applicable buckets to effect entry of the new key data, while other writer threads are prevented from changing any of these buckets, but may concurrently insert or modify key data in other buckets.

8.

Granted invention patent
Mechanism to support multiple-writer/multiple-reader concurrency for software flow/packet classification on general purpose multi-core systems [In force]

Publication number: US10218647B2

Publication date: 2019-02-26

Application number: US14960993

Filing date: 2015-12-07

Applicants: Ren Wang, Christian Maciocco, Namakkal N. Venkatesan, Tsung-Yuan C. Tai

Inventors: Ren Wang, Christian Maciocco, Namakkal N. Venkatesan, Tsung-Yuan C. Tai

IPC classification: H04L12/861, H04L12/721, H04L12/743, H04L12/741

Abstract: Methods and apparatus to support multiple-writer/multiple-reader concurrency for software flow/packet classification on general purpose multi-core systems. A flow table with rows mapped to respective hash buckets with multiple entry slots is implemented in memory of a host platform with multiple cores, with each bucket being associated with a version counter. Multiple writer and reader threads are run on the cores, with writers providing updates to the flow table data. In connection with inserting new key data, a determination is made as to which buckets will be changed, and access rights to those buckets are acquired prior to making any changes. For example, under a flow table employing cuckoo hashing, access rights are acquired to buckets along a full cuckoo path. Once the access rights are obtained, a writer is enabled to update data in the applicable buckets to effect entry of the new key data, while other writer threads are prevented from changing any of these buckets, but may concurrently insert or modify key data in other buckets.

9.

Granted invention patent
Packet processing approach to improve performance and energy efficiency for software routers [In force]

Publication number: US09450780B2

Publication date: 2016-09-20

Application number: US13559992

Filing date: 2012-07-27

Applicants: Ren Wang, Jr-Shian Tsai, Maziar H. Manesh, Tsung-Yuan C. Tai, Ahmad Samih

Inventors: Ren Wang, Jr-Shian Tsai, Maziar H. Manesh, Tsung-Yuan C. Tai, Ahmad Samih

IPC classification: H04L12/26, H04L12/58, H04L12/54, H04J3/24, G06F13/28, G06F12/00, H04L12/801, H04L12/721, H04L12/741, G06F17/30

CPC classification: H04L12/54, G06F13/28, G06F17/30949, H04L45/38, H04L45/54, H04L47/10, Y02D10/14

Abstract: Methods, apparatus and systems for improved performance and energy efficiency of software-based routers. A software router running on a host computer system employing multiple Network Interface Controllers (NICs) maintains a routing table wherein packet flows are classified as managed flows (MFs), under which packets are received at and forwarded from the same NIC, and unmanaged flows (UFs), under which packets are received at and forwarded from different NICs. Forwarding table data is employed by a NIC to facilitate packet identification and flow classification operations under which the NIC determines whether a received packet is an MF, UF, or an unclassified flow. Under various schemes, packet forwarding for MFs is handled by the software router architecture such that either only the packet header is copied into memory in the host or the entire packet forwarding is handled by the NIC.
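
The MF/UF distinction in this abstract reduces to comparing the NIC a packet arrives on with the NIC it leaves from. A minimal Python sketch follows, assuming (hypothetically) that the forwarding table data is a mapping from a flow key to an egress NIC name; the function name `classify_flow` and the table layout are illustrative, not taken from the patent.

```python
def classify_flow(forwarding_table, flow_key, ingress_nic):
    """Classify a received packet as 'MF', 'UF', or 'unclassified'.

    forwarding_table maps a flow key to the NIC the flow is forwarded from
    (an assumed layout for this sketch).
    """
    egress_nic = forwarding_table.get(flow_key)
    if egress_nic is None:
        return "unclassified"  # no entry yet for this flow
    # Managed flow: received at and forwarded from the same NIC.
    # Unmanaged flow: received at and forwarded from different NICs.
    return "MF" if egress_nic == ingress_nic else "UF"


forwarding_table = {
    ("10.0.0.1", "10.0.1.1"): "nic0",
    ("10.0.0.2", "10.0.2.1"): "nic1",
}
results = [
    classify_flow(forwarding_table, ("10.0.0.1", "10.0.1.1"), "nic0"),
    classify_flow(forwarding_table, ("10.0.0.2", "10.0.2.1"), "nic0"),
    classify_flow(forwarding_table, ("10.0.0.3", "10.0.3.1"), "nic0"),
]
```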

10.

Invention application
TECHNOLOGIES FOR CONCURRENCY OF CUCKOO HASHING FLOW LOOKUP [In force]

Publication number: US20160241475A1

Publication date: 2016-08-18

Application number: US14750921

Filing date: 2015-06-25

Applicants: Ren Wang, Dong Zhou, Bruce Richardson, George W. Kennedy, Christian Maciocco, Sameh Gobriel, Tsung-Yuan C. Tai

Inventors: Ren Wang, Dong Zhou, Bruce Richardson, George W. Kennedy, Christian Maciocco, Sameh Gobriel, Tsung-Yuan C. Tai

IPC classification: H04L12/743, H04L12/851

CPC classification: H04L45/7453, H04L47/21, H04L47/2483

Abstract: Technologies for supporting concurrency of a flow lookup table at a network device. The flow lookup table includes a plurality of candidate buckets that each includes one or more entries. The network device includes a flow lookup table write module configured to perform a displacement operation of a key/value pair to move the key/value pair from one bucket to another bucket via an atomic instruction and increment a version counter associated with the buckets affected by the displacement operation. The network device additionally includes a flow lookup table read module to check the version counters during a lookup operation on the flow lookup table to determine whether a displacement operation is affecting the presently read value of the buckets. Other embodiments are described herein and claimed.
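
The interplay between the write module's version counters and the read module's check can be sketched in Python as an optimistic read that retries when a counter changes. This is a single-threaded toy under stated assumptions: the class `VersionedBuckets`, its method names, and the retry limit are hypothetical, and plain Python assignments stand in for the atomic instruction the abstract mentions.

```python
class VersionedBuckets:
    """Toy flow lookup table with one version counter per bucket."""

    def __init__(self, num_buckets):
        self.buckets = [dict() for _ in range(num_buckets)]
        self.versions = [0] * num_buckets

    def displace(self, key, src, dst):
        """Writer: move a key/value pair from bucket src to bucket dst,
        then increment the version counter of each affected bucket."""
        value = self.buckets[src][key]
        # Insert at the destination before removing from the source so the
        # key is never absent from both buckets.
        self.buckets[dst][key] = value
        del self.buckets[src][key]
        self.versions[src] += 1
        self.versions[dst] += 1

    def lookup(self, key, candidates, max_retries=8):
        """Reader: snapshot the counters, search the candidate buckets, and
        accept the result only if no counter changed in the meantime."""
        for _ in range(max_retries):
            before = [self.versions[b] for b in candidates]
            found = None
            for b in candidates:
                if key in self.buckets[b]:
                    found = self.buckets[b][key]
                    break
            after = [self.versions[b] for b in candidates]
            if before == after:
                return found
        raise RuntimeError("lookup kept overlapping with displacements")


table = VersionedBuckets(num_buckets=4)
table.buckets[0]["flow-a"] = "port-1"
table.displace("flow-a", src=0, dst=3)
result = table.lookup("flow-a", candidates=[0, 3])
```

Readers take no lock in this scheme; they pay only for an occasional retry when a displacement overlaps the lookup.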
