Admission control for latency-critical remote procedure calls in datacenters

    Publication No.: US12081442B2

    Publication date: 2024-09-03

    Application No.: US17579989

    Filing date: 2022-01-20

    Applicant: Google LLC

    CPC classification number: H04L47/2433 H04L43/0852 H04L47/629 H04L67/133

    Abstract: A distributed, sender-driven Admission Control System (ACS) is described herein. It leverages Weighted-Fair Quality of Service (QoS) queues, found in standard NICs and switches, to guarantee RPC-level latency service level objectives (SLOs) through a judicious selection of QoS weights and traffic mix across QoS queues. ACS installs cluster-wide RPC latency SLOs by mapping LS RPCs to higher-weight QoS queues, and copes with overloads by adaptively apportioning LS RPCs amongst QoS queues based on measured completion times for each queue. When network demand spikes unexpectedly to a predetermined threshold percentage of provisioned capacity, ACS achieves a latency SLO that is significantly lower than state-of-the-art congestion control at the 99.9th percentile, and admits significantly more RPCs meeting the SLO target when RPC sizes are not aligned with priorities.
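The adaptive apportioning step can be illustrated with a minimal sender-side sketch. The abstract does not say how the per-queue admit rate is adjusted, so the additive-increase/multiplicative-decrease rule, the class name `AdmissionController`, and all parameter values below are assumptions made for illustration only.

```python
import random

class AdmissionController:
    """Hypothetical sender-side controller: one admit probability per SLO-bearing queue."""

    def __init__(self, slo_targets, fallback, increase=0.01, decrease=0.5,
                 floor=0.01, rng=None):
        self.slo = dict(slo_targets)          # queue name -> latency target (seconds)
        self.fallback = fallback              # lower-weight queue used on downgrade
        self.p_admit = {q: 1.0 for q in self.slo}
        self.increase, self.decrease, self.floor = increase, decrease, floor
        self.rng = rng or random.Random()

    def choose_queue(self, requested):
        """Admit on the requested queue with probability p_admit, else downgrade."""
        if requested not in self.slo:
            return requested                  # queues without an SLO are never gated
        if self.rng.random() < self.p_admit[requested]:
            return requested
        return self.fallback

    def on_completion(self, queue, latency):
        """Update the admit probability from a measured RPC completion time."""
        if queue not in self.slo:
            return
        if latency > self.slo[queue]:
            # SLO miss: back off multiplicatively (assumed rule, not from the source).
            self.p_admit[queue] = max(self.floor, self.p_admit[queue] * self.decrease)
        else:
            # SLO met: probe for more capacity additively.
            self.p_admit[queue] = min(1.0, self.p_admit[queue] + self.increase)
```

In this sketch each sender acts only on its own measurements, which matches the "distributed, sender-driven" description; the choice of AIMD is one plausible control rule among several.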

    Fault Tolerant Design For Clock-Synchronization Systems

    Publication No.: US20210320736A1

    Publication date: 2021-10-14

    Application No.: US17091158

    Filing date: 2020-11-06

    Applicant: Google LLC

    Abstract: A system is provided for synchronizing clocks. The system includes a plurality of devices in a network, each device having a local clock. The system is configured to synchronize the local clocks according to a primary spanning tree, where the primary spanning tree has a plurality of nodes connected through a plurality of primary links, each node of the plurality of nodes representing a respective device of the plurality of devices. The system is also configured to compute a backup spanning tree before a failure is detected in the primary spanning tree, wherein the backup spanning tree includes one or more backup links that are different from the primary links. As such, upon detection of a failure in the primary spanning tree, the system reconfigures the plurality of devices such that clock synchronization is performed according to the backup spanning tree.
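As a rough illustration of computing backup trees ahead of time, the sketch below builds a primary spanning tree with breadth-first search and then, for each primary link, precomputes a tree that avoids that link. The use of BFS, the per-link backup table, and the function names are assumptions; the abstract only states that a backup tree is computed before a failure is detected. The input graph is assumed to be connected.

```python
from collections import deque

def spanning_tree(adj, root, banned=frozenset()):
    """BFS spanning tree as a set of links; None if some node is unreachable."""
    seen, links, queue = {root}, set(), deque([root])
    while queue:
        node = queue.popleft()
        for nbr in sorted(adj[node]):
            link = frozenset((node, nbr))
            if nbr not in seen and link not in banned:
                seen.add(nbr)
                links.add(link)
                queue.append(nbr)
    return links if len(seen) == len(adj) else None

def precompute_backups(adj, root):
    """Return the primary tree and, per primary link, a backup tree avoiding it."""
    primary = spanning_tree(adj, root)
    backups = {link: spanning_tree(adj, root, banned={link}) for link in primary}
    return primary, backups
```

On a failure of a primary link, a system following this sketch would look up the stored tree for that link and reconfigure the devices, rather than recomputing a tree at failure time.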

    Weighted load balancing using scaled parallel hashing

    Publication No.: US11075986B2

    Publication date: 2021-07-27

    Application No.: US15396512

    Filing date: 2016-12-31

    Applicant: Google LLC

    Abstract: A method for weighted data traffic routing can include receiving a data packet at a data switch, where the data switch includes a plurality of egress ports. The method can also include, for each of the egress ports, generating an independent hash value based on one or more fields of the data packet and generating a weighted hash value by scaling the hash value using a scaling factor. The scaling factor can be based on at least two traffic routing weights of a plurality of respective traffic routing weights associated with the plurality of egress ports. The method can further include selecting an egress port of the plurality of egress ports based on the weighted hash value for each of the egress ports and transmitting the data packet using the selected egress port.
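The per-port hash, scale, and select steps can be sketched as follows. The abstract does not give the scaling-factor formula, so the two-port factor used here is an assumption derived for this example: if the heavier port keeps scale 1 and the lighter port is scaled by s, the lighter port wins with probability s/2 under uniform hashes, so s = 2·w_light / (w_heavy + w_light) yields a split proportional to the weights. The BLAKE2b hash, the choice of the largest weighted hash as the winner, and the function names are likewise illustrative.

```python
import hashlib

def port_hash(packet_key: bytes, port: int) -> float:
    """Independent hash per egress port, mapped to [0, 1)."""
    digest = hashlib.blake2b(packet_key, digest_size=8,
                             salt=port.to_bytes(2, "big")).digest()
    return int.from_bytes(digest, "big") / 2**64

def two_port_scales(w0: float, w1: float) -> list[float]:
    """Scaling factors for two ports so that wins are proportional to w0:w1."""
    if w0 >= w1:
        return [1.0, 2 * w1 / (w0 + w1)]
    return [2 * w0 / (w0 + w1), 1.0]

def select_port(packet_key: bytes, scales: list[float]) -> int:
    """Scale each port's hash and pick the port with the largest weighted hash."""
    weighted = [s * port_hash(packet_key, p) for p, s in enumerate(scales)]
    return max(range(len(scales)), key=weighted.__getitem__)
```

Because the hash depends only on the packet fields and the port index, packets of the same flow always map to the same port. Extending the scale computation to more than two ports requires a different derivation and is not attempted here.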

    Switch proxy controller for switch virtualization

    Publication No.: US20190173807A1

    Publication date: 2019-06-06

    Application No.: US16253645

    Filing date: 2019-01-22

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for switch virtualization by a switch proxy controller. In an aspect, a method includes receiving, by a switch proxy controller, a first request from a first switch fabric, where the first request indicates a first identifier that identifies the first request from other requests from the first switch fabric, generating a second request that indicates a second identifier that identifies the second request from other requests sent from the switch proxy controller to a switch, providing the second request to the switch, receiving, by the switch proxy controller, a first reply that indicates the second identifier indicated in the second request, generating, based on the second identifier indicated in the first reply, a second reply that indicates the first identifier, and selecting the first switch fabric to receive the second reply based on the second identifier.
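The identifier rewriting described in the abstract can be sketched as a small mapping table. The class name, the dictionary-based message format, and the counter used to generate second identifiers are assumptions for illustration; the abstract specifies only that the second identifier is unique among requests sent from the proxy to the switch and is used to route the reply back.

```python
import itertools

class SwitchProxyController:
    """Hypothetical proxy that multiplexes several switch fabrics onto one switch."""

    def __init__(self):
        self._next_id = itertools.count(1)   # source of proxy-unique second identifiers
        self._pending = {}                   # second id -> (fabric, first id)

    def forward_request(self, fabric, first_id, body):
        """Rewrite a fabric's request with a second identifier before sending it on."""
        second_id = next(self._next_id)
        self._pending[second_id] = (fabric, first_id)
        return {"id": second_id, "body": body}

    def forward_reply(self, reply):
        """Use the second identifier in a reply to pick the fabric and restore its id."""
        fabric, first_id = self._pending.pop(reply["id"])
        return fabric, {"id": first_id, "body": reply["body"]}
```

The rewrite matters because two fabrics may independently pick the same first identifier; the proxy-assigned identifier keeps their requests distinct at the switch.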

    Fault tolerant disaggregated memory

    Publication No.: US12174701B2

    Publication date: 2024-12-24

    Application No.: US18075526

    Filing date: 2022-12-06

    Applicant: Google LLC

    Abstract: Aspects of the disclosure are directed to a low-latency, low-overhead fault tolerant remote memory framework, which packs similar-size in-memory objects into individual page-aligned spans and applies erasure coding on these spans. The framework fully utilizes efficient one-sided remote memory accesses (RMAs) to swap spans in and out using minimal network input/outputs (I/Os), with compaction techniques that reduce remote memory fragmentation. The framework can achieve lower tail latency and higher application performance compared to other fault tolerance solutions, at the cost of potentially more memory usage.
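A toy sketch of the pack-then-encode idea follows. The abstract does not name the erasure code, the span size, or the packing policy, so everything here is an assumption: spans are 4096 bytes, objects are packed greedily after sorting by size, and a single XOR parity span stands in for the erasure code (it tolerates the loss of one span only). A real system would also need an index of object offsets within each span, which is omitted.

```python
PAGE = 4096  # assumed span size in bytes

def pack_spans(objects: list[bytes], page: int = PAGE) -> list[bytes]:
    """Greedily pack size-sorted objects into zero-padded, page-sized spans."""
    spans, cur = [], b""
    for obj in sorted(objects, key=len):
        if len(obj) > page:
            raise ValueError("object larger than a span")
        if len(cur) + len(obj) > page:
            spans.append(cur.ljust(page, b"\0"))
            cur = b""
        cur += obj
    if cur:
        spans.append(cur.ljust(page, b"\0"))
    return spans

def xor_parity(spans: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length spans."""
    parity = bytearray(len(spans[0]))
    for span in spans:
        for i, byte in enumerate(span):
            parity[i] ^= byte
    return bytes(parity)

def recover(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing span from the survivors and the parity span."""
    return xor_parity(surviving + [parity])
```

Coding over whole spans rather than individual objects means one parity computation covers many small objects, which is where the trade of extra memory for fault tolerance comes from in this sketch.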
