Abstract:
Techniques are described for sharing content among peers. Locality domains are treated as first-order network units. Content is located at the level of a locality domain using a hierarchical DHT in which nodes correspond to locality domains. A peer searches for a given piece of content in a proximity-guided manner, and the search terminates at the earliest locality domain in the hierarchy that has the content. Locality domains are organized into hierarchical clusters based on their proximity.
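The proximity-guided search can be illustrated with a minimal sketch. It assumes a simple tree in which each leaf is one locality domain and each inner node is a cluster of domains, and it uses a plain dictionary as a stand-in for the DHT index at each level. The names `Cluster`, `publish`, and `locate` are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cluster:
    """A leaf is one locality domain; an inner node is a cluster of domains."""
    name: str
    parent: Optional["Cluster"] = None
    # Stand-in for the DHT index at this level: content id -> holder domains.
    index: dict = field(default_factory=dict)


def publish(domain, content_id):
    """Register the content in the domain and in every enclosing cluster."""
    node = domain
    while node is not None:
        node.index.setdefault(content_id, []).append(domain.name)
        node = node.parent


def locate(domain, content_id):
    """Search outward from the peer's own domain and stop at the first
    (closest) level that knows a holder of the content."""
    node = domain
    while node is not None:
        holders = node.index.get(content_id)
        if holders:
            return node.name, holders[0]
        node = node.parent
    return None


root = Cluster("root")
region_a = Cluster("A", parent=root)
region_b = Cluster("B", parent=root)
a1 = Cluster("a1", parent=region_a)
a2 = Cluster("a2", parent=region_a)
b1 = Cluster("b1", parent=region_b)

publish(a2, "video-42")
print(locate(a1, "video-42"))  # resolved inside cluster A
print(locate(b1, "video-42"))  # resolved only at the root
```

A peer in `a1` finds the content within its own cluster, while a peer in `b1` has to widen the search to the root, which is the behavior the abstract describes as terminating at the earliest locality domain that has the content.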
Abstract:
Described is the use of flash memory together with RAM-based data structures and mechanisms to provide a flash store that caches data items (e.g., key-value pairs) in flash pages. A RAM-based index maps data items to flash pages, and a RAM-based write buffer holds data items to be written to the flash store, e.g., once a full page can be written. A recycle mechanism makes used pages in the flash store available by destaging a data item to a hard disk or reinserting it into the write buffer, based on its access pattern. The flash store may be used in a data deduplication system in which the data items are (chunk-identifier, metadata) pairs and each chunk-identifier corresponds to a hash of a chunk of data. The RAM and flash are accessed with the chunk-identifier (e.g., as a key) to determine whether a chunk is a new chunk or a duplicate.
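The interplay of the index, the write buffer, and the recycle mechanism can be sketched as follows. This is an in-memory simulation under stated assumptions: flash pages and the hard disk are modeled as Python containers, the "access pattern" is reduced to a simple hit counter, and the class name `FlashStore` and all of its parameters are hypothetical.

```python
class FlashStore:
    def __init__(self, items_per_page=2):
        self.items_per_page = items_per_page
        self.pages = []         # simulated flash: one dict per page
        self.index = {}         # RAM index: key -> page number
        self.write_buffer = {}  # RAM buffer of items not yet on flash
        self.disk = {}          # simulated hard disk for destaged items
        self.hits = {}          # assumed access-pattern signal: read counts

    def put(self, key, value):
        self.write_buffer[key] = value
        if len(self.write_buffer) >= self.items_per_page:
            self._flush()

    def _flush(self):
        """Write the buffered items to flash as one full page."""
        page_no = len(self.pages)
        self.pages.append(dict(self.write_buffer))
        for key in self.write_buffer:
            self.index[key] = page_no
        self.write_buffer.clear()

    def get(self, key):
        if key in self.write_buffer:
            value = self.write_buffer[key]
        elif key in self.index:
            value = self.pages[self.index[key]][key]
        else:
            return self.disk.get(key)
        self.hits[key] = self.hits.get(key, 0) + 1
        return value

    def recycle(self, page_no):
        """Free a page: re-buffer items that were read, destage the rest."""
        page, self.pages[page_no] = self.pages[page_no], {}
        for key, value in page.items():
            if self.index.get(key) != page_no:
                continue  # a newer copy lives elsewhere; drop the stale one
            del self.index[key]
            if self.hits.pop(key, 0) > 0:
                self.put(key, value)
            else:
                self.disk[key] = value


store = FlashStore(items_per_page=2)
store.put("a", 1)
store.put("b", 2)   # buffer is full, so page 0 is written
store.get("a")      # "a" is now considered recently used
store.recycle(0)    # "a" returns to the write buffer, "b" goes to disk
```

In a deduplication setting, the key would be the chunk-identifier and the value its metadata; a lookup that returns nothing indicates a new chunk.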
Abstract:
The subject disclosure is directed towards a data deduplication technology in which a hash index service maintains a hash index in a secondary storage device such as a hard drive, along with a compact index table and a look-ahead cache in RAM that operate to reduce the I/O needed to access the secondary storage device during deduplication operations. Also described are a session cache for maintaining data during a deduplication session and the encoding of a read-only compact index table for efficiency.
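One way to picture how a compact index table and a look-ahead cache cut down on secondary-storage reads is the sketch below. It is not taken from the disclosure: the on-disk index is simulated as an append-only list, the compact signature is assumed to be the first two bytes of the chunk hash, and the look-ahead depth and the name `HashIndex` are hypothetical.

```python
import hashlib


class HashIndex:
    def __init__(self, lookahead=2):
        self.log = []        # simulated on-disk index: (chunk_hash, metadata)
        self.compact = {}    # RAM: short signature -> positions in the log
        self.cache = {}      # RAM look-ahead cache: chunk_hash -> metadata
        self.lookahead = lookahead
        self.disk_reads = 0  # counts simulated accesses to secondary storage

    def add(self, chunk_hash, metadata):
        self.compact.setdefault(chunk_hash[:2], []).append(len(self.log))
        self.log.append((chunk_hash, metadata))

    def lookup(self, chunk_hash):
        # 1. The look-ahead cache answers without touching the disk.
        if chunk_hash in self.cache:
            return self.cache[chunk_hash]
        # 2. The compact table narrows the search to a few log positions.
        for pos in self.compact.get(chunk_hash[:2], []):
            self.disk_reads += 1
            if self.log[pos][0] == chunk_hash:
                # 3. On a hit, prefetch the entries that follow, on the
                #    assumption that neighboring chunks are looked up next.
                for h, m in self.log[pos:pos + 1 + self.lookahead]:
                    self.cache[h] = m
                return self.log[pos][1]
        return None  # unknown hash: the chunk is new


idx = HashIndex(lookahead=2)
hashes = [hashlib.sha256(bytes([i])).digest() for i in range(4)]
for i, h in enumerate(hashes):
    idx.add(h, i)

idx.lookup(hashes[0])  # one simulated disk read, prefetches entries 1 and 2
idx.lookup(hashes[1])  # served from the look-ahead cache
```

The sketch relies on the observation that duplicate chunks tend to arrive in the same order in which they were first stored, so a single disk access can satisfy several subsequent lookups.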
Abstract:
The subject disclosure is directed towards partitioning a file into chunks that satisfy a chunk size restriction, such as maximum and minimum chunk sizes, using a sliding window. For file positions within the chunk size restriction, a signature representative of a window fingerprint is compared with a target pattern, with a chunk boundary candidate identified if matched. Other signatures and patterns are then checked to determine a highest ranking signature (corresponding to a lowest-numbered rule) to associate with that chunk boundary candidate, or to set an actual boundary if the highest ranked signature is matched. If the maximum chunk size is reached without matching the highest ranked signature, the chunking mechanism regresses to set the boundary based on the candidate with the next highest ranked signature (if there are no candidates, the boundary is set at the maximum). Also described is setting chunk boundaries based upon pattern detection (e.g., runs of zeros).
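The regression behavior can be sketched with two ranked rules. The sketch makes several assumptions that are not in the abstract: `zlib.adler32` over the window stands in for the window fingerprint (recomputed at every position for clarity rather than rolled), the highest-ranked rule requires the low `bits` bits of the fingerprint to be zero, the lower-ranked rule requires one bit fewer, and the latest lower-ranked candidate is the one kept. The function name and all parameter values are hypothetical.

```python
import zlib


def chunk_boundaries(data, min_size=16, max_size=64, window=8, bits=5):
    """Return the end offsets of the chunks of `data`."""
    strong_mask = (1 << bits) - 1       # highest-ranked rule
    weak_mask = (1 << (bits - 1)) - 1   # lower-ranked fallback rule
    boundaries = []
    start = 0
    while start < len(data):
        end_limit = min(start + max_size, len(data))
        candidate = None
        cut = None
        # Only positions that respect the minimum chunk size are examined.
        for pos in range(start + min_size, end_limit + 1):
            fp = zlib.adler32(data[max(pos - window, 0):pos])
            if fp & strong_mask == 0:
                cut = pos               # highest-ranked match: cut here
                break
            if fp & weak_mask == 0:
                candidate = pos         # remember a lower-ranked candidate
        if cut is None:
            # Maximum reached: regress to the candidate, else cut at the max.
            cut = candidate if candidate is not None else end_limit
        boundaries.append(cut)
        start = cut
    return boundaries


print(chunk_boundaries(b"\0" * 200))
```

For an input of 200 zero bytes neither rule ever matches under these assumptions, so every boundary falls at the maximum chunk size, which is the degenerate case that the abstract's separate pattern detection (e.g., runs of zeros) is meant to handle.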
Abstract:
In various embodiments, methods and systems are disclosed for a hybrid rate-plus-window-based congestion protocol that controls the rate of packet transmission into the network and provides low queuing delay, practically zero packet loss, fair allocation of network resources amongst multiple flows, and full link utilization. In one embodiment, a congestion window may be used to control the maximum number of outstanding bits; a transmission rate may be used to control the rate of packets entering the network (packet pacing); a queuing-delay-based rate update may be used to keep queuing delay within tolerated bounds and minimize packet loss; aggressive ramp-up and graceful back-off may be used to fully utilize the link capacity; and additive-increase, multiplicative-decrease (AIMD) rate control may be used to provide fairness amongst multiple flows.
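The abstract gives no formulas, so the following is only an illustrative sketch of how a queuing-delay-based rate update, a congestion window, and packet pacing might fit together. Every threshold, factor, and function name here is an assumption chosen for illustration, and queuing delay is estimated as the measured round-trip time minus the smallest round-trip time seen so far.

```python
def update_rate(rate_bps, rtt_s, base_rtt_s,
                low_delay_s=0.005, high_delay_s=0.020,
                ramp_factor=1.5, add_step_bps=50_000, backoff=0.8):
    """Pick the next sending rate from the estimated queuing delay."""
    queuing_delay = max(rtt_s - base_rtt_s, 0.0)
    if queuing_delay < low_delay_s:
        # Queue is nearly empty: ramp up aggressively to fill the link.
        return rate_bps * ramp_factor
    if queuing_delay <= high_delay_s:
        # Delay within tolerated bounds: additive increase.
        return rate_bps + add_step_bps
    # Delay above the bound: multiplicative decrease.
    return rate_bps * backoff


def congestion_window_bits(rate_bps, rtt_s, headroom=1.25):
    """Cap on outstanding bits: roughly one round trip of data plus headroom."""
    return rate_bps * rtt_s * headroom


def pacing_interval_s(packet_bits, rate_bps):
    """Gap between packet departures so that sending matches the rate."""
    return packet_bits / rate_bps


rate = 1_000_000
for rtt in (0.050, 0.060, 0.100):
    print(rtt, update_rate(rate, rtt_s=rtt, base_rtt_s=0.050))
```

The rate governs when packets leave the sender, while the window bounds how much data can be in flight if acknowledgments stall, which is the hybrid aspect the abstract refers to.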
Abstract:
Difficulties associated with choosing advantageous network routes between a server and its clients are mitigated by a routing system that is devised to use many routing path sets, where each set comprises a number of routing paths covering all of the clients, including paths through other clients. A server may then apportion a data stream among all of the routing path sets. The server may also detect the performance of the computer network while sending the data stream between clients, and may adjust the apportionment of the routing path sets including the route. The clients may also be configured to operate as servers of other data streams, such as in a videoconferencing session, and may be configured to send detected route performance information along with the portions of the various data streams.
Abstract:
Various object de-duplication techniques may be applied to object systems (such as to files in a file store) to identify similar or identical objects or portions thereof, so that duplicate objects or object portions may be associated with one copy, and the duplicate copies may be removed. However, an object de-duplication technique that is suitable for de-duplicating one type of object may be inefficient for de-duplicating another type of object; e.g., a de-duplication method that significantly condenses sets of small objects may achieve very little condensation among sets of large objects, and vice versa. A multimodal approach to object de-duplication may be devised that analyzes an object to be stored and chooses a de-duplication technique that is likely to be effective for storing the object. The object index may be configured to support several de-duplication schemes for indexing and storing many types of objects in a space-economizing manner.
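A minimal sketch of the multimodal idea follows. It assumes just two schemes selected by object size: whole-object hashing for small objects and fixed-size chunk hashing for large ones. The size threshold, the chunk size, and the `ObjectStore` class are hypothetical; the abstract does not specify which techniques are chosen or how.

```python
import hashlib


class ObjectStore:
    def __init__(self, small_limit=1024, chunk_size=256):
        self.small_limit = small_limit
        self.chunk_size = chunk_size
        self.blobs = {}    # digest -> bytes; each distinct payload stored once
        self.objects = {}  # object index: name -> (scheme, list of digests)

    def _store(self, payload):
        digest = hashlib.sha256(payload).hexdigest()
        self.blobs.setdefault(digest, payload)
        return digest

    def put(self, name, data):
        if len(data) <= self.small_limit:
            # Small object: de-duplicate it as a whole.
            self.objects[name] = ("whole", [self._store(data)])
        else:
            # Large object: de-duplicate at the level of fixed-size chunks.
            pieces = [data[i:i + self.chunk_size]
                      for i in range(0, len(data), self.chunk_size)]
            self.objects[name] = ("chunked", [self._store(p) for p in pieces])

    def get(self, name):
        _, digests = self.objects[name]
        return b"".join(self.blobs[d] for d in digests)


store = ObjectStore(small_limit=8, chunk_size=4)
store.put("a", b"tiny")
store.put("b", b"tiny")             # identical small object: no new blob
store.put("big1", b"AAAABBBBCCCC")
store.put("big2", b"AAAABBBBDDDD")  # shares two chunks with big1
print(len(store.blobs))
```

The object index records which scheme was used for each object, so objects stored under different schemes can coexist and be reassembled through the same `get` call.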
Abstract:
The subject disclosure is directed towards a data deduplication technology in which a hash index service's index and/or indexing operations are adaptable to balance deduplication performance savings, throughput and resource consumption. The indexing service may employ hierarchical chunking using different levels of granularity corresponding to chunk size, a sampled compact index table that contains compact signatures for less than all of the hash index's (or subspace's) hash values, and/or selective subspace indexing based on similarity of a subspace's data to another subspace's data and/or to incoming data chunks.