Abstract:
A method for supporting recovery from failure of a path in a network of nodes interconnected by links comprises: (a) selecting an intermediate node between an ingress point and an egress point of the network, wherein the intermediate node minimizes the sum of (i) a capacity constraint between the ingress point and the intermediate node and (ii) a capacity constraint between the intermediate node and the egress point; wherein the selection identifies a first link-disjoint path set between the ingress point and the intermediate node, and a second link-disjoint path set between the intermediate node and the egress point, each link-disjoint path set comprising a backup path and at least one primary path; (b) implementing, during a first routing phase, a first routing method for routing a fraction of a service level between the ingress point and the intermediate node along each of the one or more primary paths of the first link-disjoint path set; and (c) implementing, during a second routing phase, a second routing method for routing a fraction of the service level between the intermediate node and the egress point along each of the one or more primary paths of the second link-disjoint path set.
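The selection in step (a) can be illustrated with a minimal sketch. It assumes the two capacity constraints are available as a lookup table of per-pair costs; the function name `select_intermediate` and the `cost` dictionary are hypothetical and do not come from the abstract. Computing the link-disjoint path sets is not shown.

```python
def select_intermediate(nodes, ingress, egress, cost):
    """Pick the node minimizing cost(ingress, n) + cost(n, egress).

    `cost` is an assumed dict keyed by (source, destination) pairs; it stands
    in for whatever capacity constraint the network operator computes.
    """
    candidates = [n for n in nodes if n not in (ingress, egress)]
    return min(candidates, key=lambda n: cost[(ingress, n)] + cost[(n, egress)])


# Example data (made up for illustration only).
example_cost = {
    ("A", "B"): 3, ("B", "D"): 4,
    ("A", "C"): 2, ("C", "D"): 2,
}
chosen = select_intermediate(["A", "B", "C", "D"], "A", "D", example_cost)
```

With these illustrative numbers, node C has the smaller sum (4 versus 7) and is selected.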
Abstract:
A method of networking a plurality of servers together within a data center is disclosed. The method includes the step of addressing a data packet for delivery to a destination server by providing the destination server address as a flat address. The method further includes the step of obtaining the routing information required to route the packet to the destination server. This routing information may be obtained from a directory service servicing the plurality of servers. Once the routing information is obtained, the data packet may be routed to the destination server according to the flat address of the destination server and the routing information obtained from the directory service.
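The lookup-then-route flow can be sketched as follows. This is only an illustration: the abstract does not say what the routing information contains, so the sketch assumes it is the address of the switch the destination server is attached to, and that the packet is encapsulated with that address. The names `DirectoryService`, `register`, `lookup`, and `route_packet` are hypothetical.

```python
class DirectoryService:
    """Assumed in-memory stand-in for the directory service."""

    def __init__(self):
        self._locations = {}  # flat server address -> switch address

    def register(self, flat_address, switch_address):
        self._locations[flat_address] = switch_address

    def lookup(self, flat_address):
        return self._locations[flat_address]


def route_packet(payload, dest_flat_address, directory):
    """Build an encapsulated packet using routing info from the directory."""
    switch_address = directory.lookup(dest_flat_address)
    return {
        "outer_dst": switch_address,     # used by the network to reach the switch
        "inner_dst": dest_flat_address,  # flat address of the destination server
        "payload": payload,
    }


directory = DirectoryService()
directory.register("server-42", "switch-7")
packet = route_packet(b"hello", "server-42", directory)
```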
Abstract:
The subject disclosure is directed towards a multi-tiered cache having cache tiers with different access properties. Objects are written to a selected tier of the cache based upon object-related properties and/or cache-related properties. In one aspect, objects are stored in an active log among a plurality of logs. The active log is sealed upon reaching a target size, and a new active log is opened. Garbage collection is performed on a sealed log, such as the sealed log with the most garbage therein.
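The log behavior described above (seal on reaching a target size, then collect the sealed log with the most garbage) can be sketched for a single tier. This is a simplified illustration under stated assumptions: the target size is counted in records rather than bytes, "garbage" means records superseded by a later write of the same key, and live records are copied back into the active log. The class and method names are hypothetical.

```python
class LogStructuredCache:
    def __init__(self, target_size):
        self.target_size = target_size  # records per log before sealing (assumed unit)
        self.logs = {0: []}             # log id -> list of (key, value) records
        self.active_id = 0
        self.sealed_ids = set()
        self.index = {}                 # key -> (log id, position) of the live copy

    def put(self, key, value):
        log = self.logs[self.active_id]
        log.append((key, value))
        self.index[key] = (self.active_id, len(log) - 1)
        if len(log) >= self.target_size:
            self._seal_active()

    def _seal_active(self):
        # Seal the full log and open a new active log.
        self.sealed_ids.add(self.active_id)
        self.active_id += 1
        self.logs[self.active_id] = []

    def get(self, key):
        log_id, pos = self.index[key]
        return self.logs[log_id][pos][1]

    def garbage_count(self, log_id):
        # A record is garbage if the index no longer points at it.
        return sum(
            1
            for pos, (key, _) in enumerate(self.logs[log_id])
            if self.index.get(key) != (log_id, pos)
        )

    def collect_garbage(self):
        """Reclaim the sealed log holding the most garbage; return its id."""
        if not self.sealed_ids:
            return None
        victim = max(self.sealed_ids, key=self.garbage_count)
        live = [
            (k, v)
            for pos, (k, v) in enumerate(self.logs[victim])
            if self.index.get(k) == (victim, pos)
        ]
        self.sealed_ids.discard(victim)
        del self.logs[victim]
        for k, v in live:
            self.put(k, v)  # rewrite live records into the active log
        return victim
```

A real multi-tier cache would also choose the tier per object; that selection is left out here.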
Abstract:
Described is using flash memory (or other secondary storage), RAM-based data structures and mechanisms to access key-value pairs stored in the flash memory using only a low RAM space footprint. A mapping (e.g., hash) function maps key-value pairs to a slot in a RAM-based index. The slot includes a pointer that points to a bucket of records on flash memory, each of which has a key that mapped to the slot. The bucket of records is arranged as a linear-chained linked list, e.g., with pointers from the most-recently written record to the earliest written record. Also described are compacting non-contiguous records of a bucket onto a single flash page, and garbage collection. Still further described are load balancing to reduce variation in bucket sizes, using a Bloom filter per slot to avoid unnecessary searching, and splitting a slot into sub-slots.
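The slot-to-bucket chaining can be sketched as below. The sketch makes several assumptions: an append-only Python list stands in for flash, a list offset stands in for a flash pointer, and Python's built-in `hash` is the mapping function. Compaction, garbage collection, Bloom filters, and sub-slots are omitted. The class name `FlashKV` is hypothetical.

```python
class FlashKV:
    def __init__(self, num_slots):
        self.flash = []                  # append-only stand-in for flash storage
        self.slots = [None] * num_slots  # RAM index: slot -> offset of newest record

    def _slot(self, key):
        return hash(key) % len(self.slots)

    def put(self, key, value):
        s = self._slot(key)
        # Each new record points back at the previous head of its bucket,
        # so the chain runs from most recently written to earliest written.
        self.flash.append((key, value, self.slots[s]))
        self.slots[s] = len(self.flash) - 1

    def get(self, key):
        offset = self.slots[self._slot(key)]
        while offset is not None:
            k, v, prev = self.flash[offset]
            if k == key:
                return v                 # newest matching record wins
            offset = prev
        return None
```

Because only one offset per slot is kept in RAM, the RAM cost is independent of how many records share a bucket; the trade-off is that a lookup may walk several records on flash.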
Abstract:
A system for commoditizing data center networking is disclosed. The system includes an interconnection topology for a data center having a plurality of servers and a plurality of nodes of a network in the data center through which data packets may be routed. The system uses a routing scheme in which the routing is oblivious to the traffic pattern between nodes in the network, and in which the interconnection topology contains a plurality of paths between one or more servers. The multipath routing may be Valiant load balancing. The system disaggregates the function of load balancing into a group of regular servers, so that load-balancing server hardware can be distributed among racks in the data center, leading to greater agility and less fragmentation. The architecture creates a huge, flexible switching domain, supporting any server/any service, full mesh agility, and unregimented server capacity at low cost.
Abstract:
An ISP-friendly rate allocation system and method that reduces network traffic across ISP boundaries in a peer-to-peer (P2P) network. Embodiments of the system and method continuously solve a global optimization problem and dictate accordingly how much bandwidth is allocated on each connection. Embodiments of the system and method minimize load on a server in communication with the P2P network, minimize ISP-unfriendly traffic while keeping the minimum server load unaffected, and maximize peer prefetching. Two different techniques are used to compute rate allocation: a utility function optimization technique and a minimum cost flow formulation technique. The utility function optimization technique constructs a utility function and optimizes that utility function. The minimum cost flow formulation technique generates a minimum cost flow formulation using a bipartite graph having a vertex set and an edge set. A distributed minimum cost flow formulation is solved using Lagrangian multipliers.
Abstract:
The subject disclosure is directed towards a data deduplication technology in which a hash index service maintains a hash index in a secondary storage device such as a hard drive, along with a compact index table and look-ahead cache in RAM that operate to reduce the I/O needed to access the secondary storage device during deduplication operations. Also described is a session cache for maintaining data during a deduplication session, and encoding of a read-only compact index table for efficiency.
Abstract:
Described is using flash memory, RAM-based data structures and mechanisms to provide a flash store for caching data items (e.g., key-value pairs) in flash pages. A RAM-based index maps data items to flash pages, and a RAM-based write buffer maintains data items to be written to the flash store, e.g., when a full page can be written. A recycle mechanism makes used pages in the flash store available by destaging a data item to a hard disk or reinserting it into the write buffer, based on its access pattern. The flash store may be used in a data deduplication system, in which the data items comprise (chunk-identifier, metadata) pairs, and each chunk-identifier corresponds to a hash of a chunk of data. The RAM and flash are accessed with the chunk-identifier (e.g., as a key) to determine whether a chunk is a new chunk or a duplicate.
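The write buffer, RAM index, and duplicate check can be sketched together. The sketch assumes SHA-256 as the chunk hash, counts a flash page in items rather than bytes, and uses a Python list of dicts as a stand-in for flash pages; none of these choices comes from the abstract. The recycle mechanism (destaging to hard disk) is not shown, and the names `ChunkStore`, `is_known`, and `add_chunk` are hypothetical.

```python
import hashlib


class ChunkStore:
    def __init__(self, page_size):
        self.page_size = page_size  # items per flash page (assumed unit)
        self.write_buffer = {}      # RAM: chunk id -> metadata awaiting a page write
        self.index = {}             # RAM: chunk id -> flash page number
        self.flash_pages = []       # stand-in for flash: one dict per written page

    def _flush(self):
        # Write the buffered items as one full page, then index them.
        self.flash_pages.append(dict(self.write_buffer))
        page_no = len(self.flash_pages) - 1
        for chunk_id in self.write_buffer:
            self.index[chunk_id] = page_no
        self.write_buffer.clear()

    def is_known(self, chunk_id):
        return chunk_id in self.write_buffer or chunk_id in self.index

    def add_chunk(self, data, metadata):
        """Return (chunk id, True if the chunk is new, False if a duplicate)."""
        chunk_id = hashlib.sha256(data).hexdigest()
        if self.is_known(chunk_id):
            return chunk_id, False
        self.write_buffer[chunk_id] = metadata
        if len(self.write_buffer) >= self.page_size:
            self._flush()
        return chunk_id, True
```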
Abstract:
Disclosed is a scheme for routing packets of traffic to their destination after ensuring that they pass through one or more pre-determined intermediate nodes, thereby permitting all permissible traffic patterns to be handled without knowledge of the traffic matrix, subject to edge-link capacity constraints. In one embodiment, a request for a path with a service demand for routing data between an ingress point and an egress point is received. A set of two or more intermediate nodes between the ingress point and the egress point is selected. Based on a bandwidth of the network, respective fractions of the data to send from the ingress point to each node of the set of intermediate nodes are determined. The data is routed in the determined respective fractions from the ingress point to each node of the set of intermediate nodes, and routed from each node of the set of intermediate nodes to the egress point.
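The two-phase split can be sketched as below. The abstract says only that the fractions are determined "based on a bandwidth of the network"; this sketch assumes, purely for illustration, that each intermediate node has a known capacity and that the demand is split in proportion to it. The function names and the `capacities` dictionary are hypothetical.

```python
def split_demand(demand, capacities):
    """Split `demand` across intermediate nodes in proportion to capacity."""
    total = sum(capacities.values())
    return {node: demand * cap / total for node, cap in capacities.items()}


def two_phase_routes(ingress, egress, demand, capacities):
    """Return (phase-1 hop, phase-2 hop, traffic share) per intermediate node."""
    shares = split_demand(demand, capacities)
    return [
        ((ingress, node), (node, egress), share)
        for node, share in shares.items()
    ]


# Illustrative numbers only.
routes = two_phase_routes("in", "out", 8, {"m1": 10, "m2": 30})
```

Here a quarter of the demand goes through `m1` and three quarters through `m2`, regardless of which egress point the traffic is ultimately bound for.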
Abstract:
Improved p-cycle restoration techniques using a signaling protocol are disclosed. For example, a technique for use in at least one node of a data communication network for recovering from a failure, wherein the data communication network includes multiple nodes and multiple links for connecting the multiple nodes, comprises the following steps/operations. Notification of the failure is obtained at the at least one node. A determination is made whether the failure is a single link failure or one of a node failure and a multiple link failure. A pre-configured protection cycle (p-cycle) plan is implemented when the failure is a single link failure but not when the failure is one of a node failure and a multiple link failure, such that two independent paths in the network are not connected when implementing the pre-configured protection cycle plan. Implementation of the pre-configured protection cycle plan may further comprise the node sending at least one message to another node in the data communication network and/or receiving at least one message from another node in the data communication network.
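The decision step (apply the pre-configured plan only for a single link failure) can be sketched as follows. This is an illustration under assumptions: the failure notification is modeled as a set of failed links plus a set of failed nodes, and the p-cycle plan is a dictionary from a link to a pre-computed detour. The signaling messages between nodes are not modeled, and all names are hypothetical.

```python
def classify_failure(failed_links, failed_nodes):
    """Classify a failure notification as single link, multiple link, or node."""
    if not failed_links and not failed_nodes:
        raise ValueError("no failure reported")
    if failed_nodes:
        return "node"
    if len(failed_links) == 1:
        return "single_link"
    return "multiple_link"


def handle_failure(failed_links, failed_nodes, pcycle_plan):
    """Return the pre-configured detour for a single link failure, else None.

    For node or multiple link failures the plan is deliberately not applied,
    so that two independent paths are not connected by mistake.
    """
    if classify_failure(failed_links, failed_nodes) != "single_link":
        return None
    (link,) = failed_links
    return pcycle_plan.get(link)


# Illustrative plan: if link A-B fails, detour around the cycle A-C-B.
plan = {("A", "B"): ["A", "C", "B"]}
```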