Adaptive index for data deduplication

    Publication No.: US09639543B2

    Publication date: 2017-05-02

    Application No.: US12979681

    Filing date: 2010-12-28

    CPC classification number: G06F17/30097 G06F17/3007 G06F17/30159

    Abstract: The subject disclosure is directed towards a data deduplication technology in which a hash index service's index and/or indexing operations are adaptable to balance deduplication performance savings, throughput and resource consumption. The indexing service may employ hierarchical chunking using different levels of granularity corresponding to chunk size, a sampled compact index table that contains compact signatures for less than all of the hash index's (or subspace's) hash values, and/or selective subspace indexing based on similarity of a subspace's data to another subspace's data and/or to incoming data chunks.
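As a rough illustration of the sampled compact index table mentioned in the abstract, the Python sketch below keeps a short signature only for those chunk hashes that pass a sampling test, so the table covers fewer than all hash values. The class name, the 2-byte signature width, the modulo-based sampling rule, and the use of SHA-256 are assumptions made for this example; the abstract does not specify any of them.

```python
import hashlib


class SampledCompactIndex:
    """Hypothetical sketch: compact signatures for a sample of chunk hashes."""

    def __init__(self, sample_rate=4, signature_bytes=2):
        self.sample_rate = sample_rate          # keep roughly 1 in `sample_rate` hashes
        self.signature_bytes = signature_bytes  # width of the compact signature
        self.table = {}                         # signature -> locations in the full index

    def _is_sampled(self, digest):
        # Assumed sampling rule: keep hashes whose last 4 bytes are divisible by the rate.
        return int.from_bytes(digest[-4:], "big") % self.sample_rate == 0

    def add(self, digest, location):
        """Record the hash if it is sampled; return whether it was kept."""
        if not self._is_sampled(digest):
            return False
        signature = digest[: self.signature_bytes]
        self.table.setdefault(signature, []).append(location)
        return True

    def candidates(self, digest):
        """Return possible full-index locations; the caller must verify the full hash."""
        return self.table.get(digest[: self.signature_bytes], [])


def chunk_hash(data):
    """Hash a chunk's bytes (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()
```

Because a compact signature is only a prefix of the hash, a lookup can return false positives; in this sketch the returned locations are hints that still need to be checked against the full hash index.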

    3. Adaptive Index for Data Deduplication
    Patent application (status: in force)

    Publication No.: US20120166448A1

    Publication date: 2012-06-28

    Application No.: US12979681

    Filing date: 2010-12-28

    CPC classification number: G06F17/30097 G06F17/3007 G06F17/30159

    Abstract: The subject disclosure is directed towards a data deduplication technology in which a hash index service's index and/or indexing operations are adaptable to balance deduplication performance savings, throughput and resource consumption. The indexing service may employ hierarchical chunking using different levels of granularity corresponding to chunk size, a sampled compact index table that contains compact signatures for less than all of the hash index's (or subspace's) hash values, and/or selective subspace indexing based on similarity of a subspace's data to another subspace's data and/or to incoming data chunks.

    4. PROXIMITY GUIDED DATA DISCOVERY
    Patent application (status: in force)

    Publication No.: US20100332579A1

    Publication date: 2010-12-30

    Application No.: US12490811

    Filing date: 2009-06-24

    CPC classification number: G06F17/30864

    Abstract: Techniques are described for sharing content among peers. Locality domains are treated as first order network units. Content is located at the level of a locality domain using a hierarchical DHT in which nodes correspond to locality domains. A peer searches for a given piece of content in a proximity guided manner and terminates at the earliest locality domain (in the hierarchy) which has the content. Locality domains are organized into hierarchical clusters based on their proximity.
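The proximity-guided search in the abstract can be pictured with the minimal Python sketch below: a peer starts in its own locality domain and walks outward through the enclosing clusters, stopping at the first domain that knows a holder of the content. The `LocalityDomain` class, the `publish` and `find_content` names, and the rule that a published item is registered at every enclosing cluster are assumptions for illustration, and a plain dictionary per domain stands in for the hierarchical DHT.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LocalityDomain:
    """Hypothetical node in the hierarchy; `parent` is the next-larger cluster."""
    name: str
    parent: Optional["LocalityDomain"] = None
    # Stand-in for this domain's part of the DHT: content id -> peers holding it.
    holders: dict = field(default_factory=dict)

    def publish(self, content_id, peer):
        # Assumption: register the peer here and at every enclosing cluster.
        node = self
        while node is not None:
            node.holders.setdefault(content_id, set()).add(peer)
            node = node.parent


def find_content(start, content_id):
    """Search outward from the peer's own domain and stop at the first hit."""
    node = start
    while node is not None:
        peers = node.holders.get(content_id)
        if peers:
            return node.name, sorted(peers)
        node = node.parent
    return None
```

In this sketch a lookup that succeeds in the peer's own domain never touches the wider clusters, which is the point of terminating at the earliest domain that has the content.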

    5. RATE-CONTROLLABLE PEER-TO-PEER DATA STREAM ROUTING
    Patent application (status: in force)

    Publication No.: US20100146108A1

    Publication date: 2010-06-10

    Application No.: US12612395

    Filing date: 2009-11-04

    CPC classification number: H04L67/104 H04L65/80 H04L67/1085

    Abstract: Difficulties associated with choosing advantageous network routes between server and clients are mitigated by a routing system that is devised to use many routing path sets, where respective sets comprise a number of routing paths covering all of the clients, including through other clients. A server may then apportion a data stream among all of the routing path sets. The server may also detect the performance of the computer network while sending the data stream between clients, and may adjust the apportionment of the routing path sets including the route. The clients may also be configured to operate as servers of other data streams, such as in a videoconferencing session, for example, and may be configured to send detected route performance information along with the portions of the various data streams.
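One simple way to picture the apportionment step is the Python sketch below, which splits a stream's rate across routing path sets in proportion to their most recently measured throughput. The proportional rule, the even-split fallback, and the `apportion` function name are assumptions for illustration; the abstract only states that the server apportions the stream and adjusts the apportionment based on detected performance.

```python
def apportion(total_rate, measured):
    """Split `total_rate` across path sets in proportion to measured throughput.

    `measured` maps a (hypothetical) path-set name to its observed throughput.
    """
    total = sum(measured.values())
    if total <= 0:
        # No usable measurements yet: fall back to an even split.
        share = total_rate / len(measured)
        return {name: share for name in measured}
    return {name: total_rate * value / total for name, value in measured.items()}
```

Calling the function again with fresh measurements yields the adjusted shares, so a path set whose measured throughput drops receives a smaller portion of the stream.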

    10. Content Aware Chunking for Achieving an Improved Chunk Size Distribution
    Patent application (status: in force)

    Publication No.: US20130054544A1

    Publication date: 2013-02-28

    Application No.: US13222198

    Filing date: 2011-08-31

    Abstract: The subject disclosure is directed towards partitioning a file into chunks that satisfy a chunk size restriction, such as maximum and minimum chunk sizes, using a sliding window. For file positions within the chunk size restriction, a signature representative of a window fingerprint is compared with a target pattern, with a chunk boundary candidate identified if matched. Other signatures and patterns are then checked to determine a highest ranking signature (corresponding to a lowest numbered Rule) to associate with that chunk boundary candidate, or set an actual boundary if the highest ranked signature is matched. If the maximum chunk size is reached without matching the highest ranked signature, the chunking mechanism regresses to set the boundary based on the candidate with the next highest ranked signature (if no candidates, the boundary is set at the maximum). Also described is setting chunk boundaries based upon pattern detection (e.g., runs of zeros).
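The ranked-rule chunking described in the abstract can be sketched in Python as follows. The sketch is a simplification under stated assumptions: a CRC-32 of the last `window` bytes stands in for a true rolling fingerprint, each rule is a bit mask in which Rule 0 is the strictest, and when several candidates share the best rank the earliest one is kept. The function name, the parameter names, and the default sizes are hypothetical and do not come from the source.

```python
import zlib


def chunk_boundaries(data, min_size=64, max_size=256, window=16, bits=6, n_rules=3):
    """Return chunk end offsets; each chunk is min_size..max_size bytes long
    (only the final chunk may be shorter than min_size)."""
    # Rule 0 is the highest ranked (strictest); each later rule tests one bit fewer.
    masks = [(1 << (bits - r)) - 1 for r in range(n_rules)]
    boundaries = []
    start = 0
    while start < len(data):
        limit = min(start + max_size, len(data))
        cut = limit                          # default: cut at the maximum size
        best_rank, best_pos = None, None
        for pos in range(start + min_size, limit + 1):
            # Stand-in for a rolling fingerprint of the sliding window.
            fp = zlib.crc32(data[max(pos - window, 0):pos])
            rank = next((r for r, m in enumerate(masks) if (fp & m) == 0), None)
            if rank == 0:
                cut = pos                    # highest-ranked signature: actual boundary
                break
            if rank is not None and (best_rank is None or rank < best_rank):
                best_rank, best_pos = rank, pos   # best boundary candidate so far
        else:
            # Maximum size reached without a Rule 0 match: regress to the candidate.
            if limit == start + max_size and best_pos is not None:
                cut = best_pos
        boundaries.append(cut)
        start = cut
    return boundaries
```

Recomputing the CRC for every position is quadratic in the window size and is used here only for clarity; a production chunker would update the fingerprint incrementally as the window slides.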
