Abstract:
A system and method for maintaining connectivity between a host system running an Always-On-Always-Connected (AOAC) application and an associated remote application server. The system further includes circuitry configured to establish a communication link between the host system and the remote application server. The circuitry is configured periodically transmit keep-alive messages to the remote application server after the host system transitions to and remains in a low-power state. The keep-alive messages are configured to maintain connectivity and presence of the AOAC application with the remote application server while the host system is in the low-power state.
Abstract:
Technologies for managing network flow lookups of a network device include a network controller and a target device, each communicatively coupled to the network device. The network device includes a cache for a processor of the network device and a main memory. The network device additionally includes a multi-level hash table having a first-level hash table stored in the cache of the network device and a second-level hash table stored in the main memory of the network device. The network device is configured to determine whether to store a network flow hash corresponding to a network flow indicating the target device in the first-level or second-level hash table based on a priority of the network flow provided to the network device by the network controller.
Abstract:
A computing apparatus for providing a node within a distributed network function, including: a hardware platform; a network interface to communicatively couple to at least one other peer node of the distributed network function; a distributor function including logic to operate on the hardware platform, including a hashing module configured to receive an incoming network packet via the network interface and perform on the incoming network packet a first-level hash of a two-level hash, the first level hash being a lightweight hash with respect to a second-level hash, the first level hash to deterministically direct a packet to one of the nodes of the distributed network function as a directed packet; and a denial of service (DoS) mitigation engine to receive notification of a DoS attack, identify a DoS packet via the first-level hash, and prevent the DoS packet from reaching the second-level hash.
Abstract:
Apparatus and method to facilitate networked compute node cluster routing are disclosed herein. In some embodiments, a compute node for cluster compute may include one or more input ports to receive data packets from first selected ones of a cluster of compute nodes; one or more output ports to route data packets to second selected ones of the cluster of computer nodes; and one or more processors, wherein the one or more processors includes logic to determine a particular output port, of the one or more output ports, to which a data packet received at the one or more input ports is to be routed, and wherein the logic is to exclude output ports associated with links indicated in fault status information as having a fault status to be the particular output port to which the data packet is to be routed.
Abstract:
Technologies for distributed table lookup via a distributed router includes an ingress computing node, an intermediate computing node, and an egress computing node. Each computing node of the distributed router includes a forwarding table to store a different set of network routing entries obtained from a routing table of the distributed router. The ingress computing node generates a hash key based on the destination address included in a received network packet. The hash key identifies the intermediate computing node of the distributed router that stores the forwarding table that includes a network routing entry corresponding to the destination address. The ingress computing node forwards the received network packet to the intermediate computing node for routing. The intermediate computing node receives the forwarded network packet, determines a destination address of the network packet, and determines the egress computing node for transmission of the network packet from the distributed router.
Abstract:
The present disclosure provides techniques for cache management. A data block may be received from an IO interface. After receiving the data block, the occupancy level of a cache memory may be determined. The data block may be directed to a main memory if the occupancy level exceeds a threshold. The data block may be directed to a cache memory if the occupancy level is below a threshold.
Abstract:
Embodiments of methods, systems, and storage medium associated with are disclosed herein. In one instance, the method may include: first determining whether the computing device is connected to a network, based on a result of the first determining, monitoring data traffic between the computing device and the network, wherein the data traffic is associated with at least one application residing on the computing device, based on the monitoring, second determining whether the at least one application has been updated, and initiating a transition of the computing device to a sleep mode upon a result of the second determining that indicates that the at least one application has been updated. Other embodiments may be described and/or claimed.
Abstract:
Examples described herein include a device interface; a first set of one or more processing units; and a second set of one or more processing units. In some examples, the first set of one or more processing units are to perform heavy flow detection for packets of a flow and the second set of one or more processing units are to perform processing of packets of a heavy flow. In some examples, the first set of one or more processing units and second set of one or more processing units are different. In some examples, the first set of one or more processing units is to allocate pointers to packets associated with the heavy flow to a first set of one or more queues of a load balancer and the load balancer is to allocate the packets associated with the heavy flow to one or more processing units of the second set of one or more processing units based, at least in part on a packet receive rate of the packets associated with the heavy flow.
Abstract:
Key-value (KV) caching accelerates inference in large language models (LLMs) by allowing the attention operation to scale linearly rather than quadratically with the total sequence length. Due to large context lengths in modern LLMs, KV cache size can exceed the model size, which can negatively impact throughput. To address this issue, KVCrush, which stands for KEY-VALUE CACHE SIZE REDUCTION USING SIMILARITY IN HEAD-BEHAVIOR, is implemented. KVCrush involves using binary vectors to represent tokens, where the vector indicates which attention heads attend to the token and which attention heads disregard the token. The binary vectors are used in a hardware-efficient, low-overhead process to produce representatives for unimportant tokens to be pruned, without having to implement k-means clustering techniques.
Abstract:
A computing apparatus for providing a node within a distributed network function, including: a hardware platform; a network interface to communicatively couple to at least one other peer node of the distributed network function; a distributor function including logic to operate on the hardware platform, including a hashing module configured to receive an incoming network packet via the network interface and perform on the incoming network packet a first-level hash of a two-level hash, the first level hash being a lightweight hash with respect to a second-level hash, the first level hash to deterministically direct a packet to one of the nodes of the distributed network function as a directed packet; and a denial of service (DoS) mitigation engine to receive notification of a DoS attack, identify a DoS packet via the first-level hash, and prevent the DoS packet from reaching the second-level hash.