-
Publication No.: US20210004328A1
Publication date: 2021-01-07
Application No.: US17027248
Application date: 2020-09-21
Applicant: Ren Wang , Andrew J. Herdrich , Yen-cheng Liu , Herbert H. Hum , Jong Soo Park , Christopher J. Hughes , Namakkal N. Venkatesan , Adrian C. Moga , Aamer Jaleel , Zeshan A. Chishti , Mesut A. Ergin , Jr-shian Tsai , Alexander W. Min , Tsung-yuan C. Tai , Christian Maciocco , Rajesh Sankaran
Inventor: Ren Wang , Andrew J. Herdrich , Yen-cheng Liu , Herbert H. Hum , Jong Soo Park , Christopher J. Hughes , Namakkal N. Venkatesan , Adrian C. Moga , Aamer Jaleel , Zeshan A. Chishti , Mesut A. Ergin , Jr-shian Tsai , Alexander W. Min , Tsung-yuan C. Tai , Christian Maciocco , Rajesh Sankaran
IPC: G06F12/0842 , G06F12/0831 , G06F12/0893 , G06F12/109 , G06F12/0813 , G06F9/455
Abstract: Methods and apparatus implementing hardware/software co-optimization to improve performance and energy efficiency for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies, including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in a multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
-
Publication No.: US20190102303A1
Publication date: 2019-04-04
Application No.: US15721249
Application date: 2017-09-29
Applicant: Ren Wang , Joseph Nuzman , Samantika S. Sury , Andrew J. Herdrich , Namakkal N. Venkatesan , Anil Vasudevan , Tsung-Yuan C. Tai , Niall D. McDonnell
Inventor: Ren Wang , Joseph Nuzman , Samantika S. Sury , Andrew J. Herdrich , Namakkal N. Venkatesan , Anil Vasudevan , Tsung-Yuan C. Tai , Niall D. McDonnell
IPC: G06F12/0831 , G06F12/084 , G06F12/0811
Abstract: Apparatus, method, and system for implementing a software-transparent hardware predictor for core-to-core data communication optimization are described herein. An embodiment of the apparatus includes a plurality of hardware processor cores each including a private cache; a shared cache that is communicatively coupled to and shared by the plurality of hardware processor cores; and a predictor circuit. The predictor circuit is to track activities relating to a plurality of monitored cache lines in the private cache of a producer hardware processor core (producer core) and to enable a cache line push operation upon determining a target hardware processor core (target core) based on the tracked activities. An execution of the cache line push operation is to cause a plurality of unmonitored cache lines in the private cache of the producer core to be moved to the private cache of the target core.
-
Publication No.: US10218647B2
Publication date: 2019-02-26
Application No.: US14960993
Application date: 2015-12-07
Applicant: Ren Wang , Christian Maciocco , Namakkal N. Venkatesan , Tsung-Yuan C. Tai
Inventor: Ren Wang , Christian Maciocco , Namakkal N. Venkatesan , Tsung-Yuan C. Tai
IPC: H04L12/861 , H04L12/721 , H04L12/743 , H04L12/741
Abstract: Methods and apparatus to support multiple-writer/multiple-reader concurrency for software flow/packet classification on general purpose multi-core systems. A flow table with rows mapped to respective hash buckets with multiple entry slots is implemented in memory of a host platform with multiple cores, with each bucket being associated with a version counter. Multiple writer and reader threads are run on the cores, with writers providing updates to the flow table data. In connection with inserting new key data, a determination is made to which buckets will be changed, and access rights to those buckets are acquired prior to making any changes. For example, under a flow table employing cuckoo hashing, access rights are acquired to buckets along a full cuckoo path. Once the access rights are obtained, a writer is enabled to update data in the applicable buckets to effect entry of the new key data, while other writer threads are prevented from changing any of these buckets, but may concurrently insert or modify key data in other buckets.
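A minimal sketch of the per-bucket access rights and version counters described in this abstract. It is not the patented implementation: the class names, the even/odd version convention, and the single-hash `insert` are illustrative assumptions, and cuckoo displacement is omitted. A writer that needs to change several buckets (for example, all buckets along a cuckoo path) would pass all of their indices to `update_buckets`.

```python
import threading


class Bucket:
    def __init__(self, slots=4):
        self.slots = [None] * slots    # each slot holds (key, value) or None
        self.version = 0               # assumed convention: odd means update in progress
        self.lock = threading.Lock()   # writer access right for this bucket


class FlowTable:
    def __init__(self, num_buckets=8):
        self.buckets = [Bucket() for _ in range(num_buckets)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def update_buckets(self, indices, mutate):
        """Acquire access rights to every affected bucket, then apply mutate()."""
        # Locking in a fixed (sorted) order keeps two writers from deadlocking.
        held = [self.buckets[i] for i in sorted(set(indices))]
        for b in held:
            b.lock.acquire()
        try:
            for b in held:
                b.version += 1         # odd: concurrent readers will retry
            try:
                mutate(self.buckets)
            finally:
                for b in held:
                    b.version += 1     # even again: buckets are stable
        finally:
            for b in reversed(held):
                b.lock.release()

    def insert(self, key, value):
        i = self._index(key)

        def mutate(buckets):
            slots = buckets[i].slots
            for s, entry in enumerate(slots):      # overwrite an existing key first
                if entry is not None and entry[0] == key:
                    slots[s] = (key, value)
                    return
            for s, entry in enumerate(slots):      # otherwise take a free slot
                if entry is None:
                    slots[s] = (key, value)
                    return
            raise RuntimeError("bucket full; a cuckoo table would relocate entries")

        self.update_buckets([i], mutate)

    def lookup(self, key):
        b = self.buckets[self._index(key)]
        while True:
            before = b.version
            if before % 2:             # a writer is changing this bucket
                continue
            snapshot = list(b.slots)
            if b.version == before:    # no writer interfered with the read
                for entry in snapshot:
                    if entry is not None and entry[0] == key:
                        return entry[1]
                return None
```

Writers touching disjoint sets of buckets hold different locks, so they can proceed concurrently, while readers never take a lock and instead re-read when a version counter changes under them.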
-
Publication No.: US20190042602A1
Publication date: 2019-02-07
Application No.: US16105031
Application date: 2018-08-20
Applicant: Ren Wang , Bruce Richardson , Tsung-Yuan Tai , Yipeng Wang , Pablo De Lara Guarch
Inventor: Ren Wang , Bruce Richardson , Tsung-Yuan Tai , Yipeng Wang , Pablo De Lara Guarch
IPC: G06F17/30
Abstract: Techniques and apparatus for dynamic data access mode processes are described. In one embodiment, for example, an apparatus may include a processor and at least one memory coupled to the processor, the at least one memory comprising an indication of a database and instructions, the instructions, when executed by the processor, to cause the processor to determine a database utilization value for a database, perform a comparison of the database utilization value to at least one utilization threshold, and set an active data access mode to one of a low-utilization data access mode or a high-utilization data access mode based on the comparison. Other embodiments are described.
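The comparison step in this abstract can be sketched as follows. The definition of utilization as a fill ratio, the default threshold of 0.5, and all names here are assumptions made for illustration; the abstract does not specify them.

```python
LOW_MODE = "low-utilization"
HIGH_MODE = "high-utilization"


def select_access_mode(entries_used, capacity, threshold=0.5):
    """Pick the active data access mode from a database utilization value."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    utilization = entries_used / capacity   # assumed definition: fill ratio
    return HIGH_MODE if utilization >= threshold else LOW_MODE
```

A caller would re-evaluate `select_access_mode` as the database fills or drains and switch its access path when the returned mode changes.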
-
Publication No.: US20180373632A1
Publication date: 2018-12-27
Application No.: US16056315
Application date: 2018-08-06
Applicant: Christopher WILKERSON , Ren WANG , Antoine KAUFMANN , Anil VASUDEVAN , Robert G. BLANKENSHIP , Venkata KRISHNAN , Tsung-Yuan C. Tai
Inventor: Christopher WILKERSON , Ren WANG , Antoine KAUFMANN , Anil VASUDEVAN , Robert G. BLANKENSHIP , Venkata KRISHNAN , Tsung-Yuan C. Tai
IPC: G06F12/0808 , G06F12/0891 , G06F12/0862
Abstract: An apparatus and method are described for a triggered prefetch operation. For example, one embodiment of a processor comprises: a first core comprising a first cache to store a first set of cache lines; a second core comprising a second cache to store a second set of cache lines; a cache management circuit to maintain coherency between one or more cache lines in the first cache and the second cache, the cache management circuit to allocate a lock on a first cache line to the first cache; a prefetch circuit comprising a prefetch request buffer to store a plurality of prefetch request entries including a first prefetch request entry associated with the first cache line, the prefetch circuit to cause the first cache line to be prefetched to the second cache in response to an invalidate command detected for the first cache line.
-
Publication No.: US20180083866A1
Publication date: 2018-03-22
Application No.: US15270377
Application date: 2016-09-20
Applicant: Sameh Gobriel , Ren Wang , Eric K. Mann , Christian Maciocco , Tsung-Yuan C. Tai
Inventor: Sameh Gobriel , Ren Wang , Eric K. Mann , Christian Maciocco , Tsung-Yuan C. Tai
IPC: H04L12/721 , H04L12/851 , H04L12/863 , H04L12/751 , H04L12/715
CPC classification number: H04L47/6215 , H04L47/2441 , H04L49/205
Abstract: Methods and apparatus for facilitating efficient Quality of Service (QoS) support for software-based packet processing by offloading QoS rate-limiting to NIC hardware. Software-based packet processing is performed on packet flows received at a compute platform, such as a general purpose server, and/or packet flows generated by local applications running on the compute platform. The packet processing includes packet classification that associates packets with packet flows using flow IDs, and identifying a QoS class for the packet and packet flow. NIC Tx queues are dynamically configured or pre-configured to effect rate limiting for forwarding packets enqueued in the NIC Tx queues. New packet flows are detected, and mapping data is created to map flow IDs associated with flows to the NIC Tx queues used to forward the packets associated with the flows.
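One way to picture the flow-to-queue mapping in this abstract is the sketch below. It assumes each QoS class has one NIC Tx queue whose rate limit is already configured in hardware; the class name `TxQueueMapper`, the QoS class labels, and the use of a 5-tuple as the flow ID are all hypothetical.

```python
class TxQueueMapper:
    def __init__(self, queue_by_class):
        # QoS class -> NIC Tx queue index (rate limiting is done by the NIC)
        self.queue_by_class = dict(queue_by_class)
        # flow ID -> NIC Tx queue index, filled in as new flows are detected
        self.queue_by_flow = {}

    def queue_for(self, flow_id, qos_class):
        """Return the Tx queue for a packet, creating a mapping for new flows."""
        if flow_id not in self.queue_by_flow:
            self.queue_by_flow[flow_id] = self.queue_by_class[qos_class]
        return self.queue_by_flow[flow_id]
```

Software only classifies the packet and enqueues it; pacing is left to the queue's hardware rate limiter, which is the offload the abstract describes.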
-
Publication No.: US09860175B2
Publication date: 2018-01-02
Application No.: US13994416
Application date: 2011-12-22
Applicant: Ren Wang , Sanjay Rungta
Inventor: Ren Wang , Sanjay Rungta
IPC: H04L12/803 , H04L29/06 , H04L29/08
CPC classification number: H04L47/125 , H04L67/1023 , H04L69/18
Abstract: A system for processing a packet may include, for each of a network interface controller and a central processing unit, a measurement of the processing time, a determination of the amount of energy consumed to process a unit of information in the packet, and a measurement of the load. A user may provide the system with signals to perform networking processes for the packet in a manner to reduce the processing time of the system or in a manner to reduce the amount of energy consumed by the system for processing the packet. A portion of the system may receive at least one of the measurements, determinations, and signals and may cause one of the network interface controller and the central processing unit to perform networking processes for the packet. The networking processes may include establishing a connection to a network.
-
Publication No.: US09847936B2
Publication date: 2017-12-19
Application No.: US14750085
Application date: 2015-06-25
Applicant: Nrupal Jani , Dinesh Kumar , Christian Maciocco , Ren Wang , Neerav Parikh , John Fastabend , Iosif Gasparakis , David J. Harriman , Patrick L. Connor , Sanjeev Jain
Inventor: Nrupal Jani , Dinesh Kumar , Christian Maciocco , Ren Wang , Neerav Parikh , John Fastabend , Iosif Gasparakis , David J. Harriman , Patrick L. Connor , Sanjeev Jain
IPC: H04L12/721 , H04L12/26 , H04L12/725 , H04L12/803 , H04L12/911
CPC classification number: H04L45/44 , H04L43/026 , H04L43/0817 , H04L43/0876 , H04L43/16 , H04L45/306 , H04L47/125 , H04L47/781
Abstract: Devices and techniques for hardware accelerated packet processing are described herein. A device can communicate with one or more hardware switches. The device can detect characteristics of a plurality of packet streams. The device may distribute the plurality of packet streams between the one or more hardware switches and software data plane components based on the detected characteristics of the plurality of packet streams, such that at least one packet stream is designated to be processed by the one or more hardware switches. Other embodiments are also described.
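As a rough illustration of distributing streams by detected characteristics, the sketch below uses packet rate as the only characteristic and a fixed number of hardware slots. Both choices, and the function name `distribute_streams`, are assumptions; the abstract does not say which characteristics are used.

```python
def distribute_streams(stream_rates, hw_slots):
    """Split streams between a hardware switch and the software data plane.

    stream_rates maps a stream name to its measured packet rate; the
    hw_slots busiest streams are designated for hardware processing.
    """
    if hw_slots < 1:
        raise ValueError("at least one stream must be designated for hardware")
    ranked = sorted(stream_rates, key=stream_rates.get, reverse=True)
    return {"hardware": ranked[:hw_slots], "software": ranked[hw_slots:]}
```

For example, `distribute_streams({"a": 100, "b": 5, "c": 50}, hw_slots=1)` places stream `a` in hardware and leaves `c` and `b` to software.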
-
Publication No.: US20170192921A1
Publication date: 2017-07-06
Application No.: US14987676
Application date: 2016-01-04
Applicant: Ren Wang , Yipeng Wang , Andrew J. Herdrich , Jr-Shian Tsai , Tsung-Yuan C. Tai , Niall D. McDonnell , Hugh Wilkinson , Bradley A. Burres , Bruce Richardson , Namakkal N. Venkatesan , Debra Bernstein , Edwin Verplanke , Stephen R. Van Doren , An Yan , Andrew Cunningham , David Sonnier , Gage Eads , James T. Clee , Jamison D. Whitesell , Jerry Pirog , Jonathan Kenny , Joseph R. Hasting , Narender Vangati , Stephen Miller , Te K. Ma , William Burroughs
Inventor: Ren Wang , Yipeng Wang , Andrew J. Herdrich , Jr-Shian Tsai , Tsung-Yuan C. Tai , Niall D. McDonnell , Hugh Wilkinson , Bradley A. Burres , Bruce Richardson , Namakkal N. Venkatesan , Debra Bernstein , Edwin Verplanke , Stephen R. Van Doren , An Yan , Andrew Cunningham , David Sonnier , Gage Eads , James T. Clee , Jamison D. Whitesell , Jerry Pirog , Jonathan Kenny , Joseph R. Hasting , Narender Vangati , Stephen Miller , Te K. Ma , William Burroughs
CPC classification number: G06F13/37 , G06F9/3004 , G06F9/46 , G06F12/04 , G06F12/0811 , G06F12/0868 , G06F13/1642 , G06F13/1673 , G06F2212/283 , G06F2212/6046
Abstract: Apparatus and methods implementing a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus include multi-core processors, a shared L3 or last-level cache (“LLC”), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate at which the cores may submit requests to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
-
Publication No.: US20170107413A1
Publication date: 2017-04-20
Application No.: US14887091
Application date: 2015-10-19
Applicant: Liang Wang , Viktoria Ren Wang
Inventor: Liang Wang , Viktoria Ren Wang
IPC: C09K3/18
CPC classification number: C09D5/00 , C03C17/34 , C03C2217/445 , C03C2217/48 , C09D183/04 , C08K3/22
Abstract: The present invention relates to a self-renewing, anti-icing composition driven by a dehydrogenative reaction of a reactive hydrogen-rich compound catalyzed by nanoparticle-immobilized catalysts, which is active under subzero temperatures. The disclosed coating displays a variety of properties including, but not limited to, hydrophobicity, anti-wetting, and resistance to ice formation and ice adhesion. The novel anti-icing coating can be used on glass surfaces requiring optical clarity and transparency and can also be applied to a variety of smooth, roughened, or porous surfaces.