Adaptive Index for Data Deduplication
    12.
    Patent application (status: in force)

    Publication number: US20120166448A1

    Publication date: 2012-06-28

    Application number: US12979681

    Filing date: 2010-12-28

    CPC classification number: G06F17/30097 G06F17/3007 G06F17/30159

    Abstract: The subject disclosure is directed towards a data deduplication technology in which a hash index service's index and/or indexing operations are adaptable to balance deduplication performance savings, throughput and resource consumption. The indexing service may employ hierarchical chunking using different levels of granularity corresponding to chunk size, a sampled compact index table that contains compact signatures for less than all of the hash index's (or subspace's) hash values, and/or selective subspace indexing based on similarity of a subspace's data to another subspace's data and/or to incoming data chunks.
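
    The sampled compact index table mentioned in the abstract can be pictured with a small sketch. This is an illustration under stated assumptions, not the patented implementation: the sampling rule (hash value modulo a rate), the 2-byte signature size, and all names such as SampledCompactIndex are hypothetical.

```python
import hashlib


class SampledCompactIndex:
    """Hypothetical RAM table holding compact signatures for a sample of chunk hashes."""

    def __init__(self, sample_rate=4, signature_bytes=2):
        self.sample_rate = sample_rate          # keep roughly 1 in sample_rate hashes
        self.signature_bytes = signature_bytes  # size of the truncated hash kept in RAM
        self.signatures = set()                 # compact signatures only, not full hashes

    def _sampled(self, digest):
        # Assumed sampling rule: keep hashes whose leading 4 bytes are divisible by the rate.
        return int.from_bytes(digest[:4], "big") % self.sample_rate == 0

    def _signature(self, digest):
        return digest[-self.signature_bytes:]

    def add(self, chunk):
        """Index the chunk if its hash is sampled; return True when it was indexed."""
        digest = hashlib.sha256(chunk).digest()
        if not self._sampled(digest):
            return False
        self.signatures.add(self._signature(digest))
        return True

    def maybe_duplicate(self, chunk):
        """True means 'consult the full index'; compact signatures can collide."""
        digest = hashlib.sha256(chunk).digest()
        return self._sampled(digest) and self._signature(digest) in self.signatures
```

    A larger sample_rate shrinks the RAM table at the cost of missing some duplicates, which is the savings versus resource trade-off the abstract describes.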

    PROXIMITY GUIDED DATA DISCOVERY
    13.
    Patent application (status: in force)

    Publication number: US20100332579A1

    Publication date: 2010-12-30

    Application number: US12490811

    Filing date: 2009-06-24

    CPC classification number: G06F17/30864

    Abstract: Techniques are described for sharing content among peers. Locality domains are treated as first order network units. Content is located at the level of a locality domain using a hierarchical DHT in which nodes correspond to locality domains. A peer searches for a given piece of content in a proximity guided manner and terminates at the earliest locality domain (in the hierarchy) which has the content. Locality domains are organized into hierarchical clusters based on their proximity.
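
    The proximity guided search can be sketched as a walk up a tree of locality domains that stops at the first level holding the content. The sketch below is a simplification: it replaces the hierarchical DHT with an in-memory set per domain, and LocalityDomain, publish, and find_nearest are hypothetical names.

```python
class LocalityDomain:
    """Hypothetical node in the locality hierarchy (a domain or a cluster of domains)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.content = set()  # stand-in for this level's DHT entries

    def publish(self, content_id):
        # Register the content at this domain and at every enclosing cluster.
        node = self
        while node is not None:
            node.content.add(content_id)
            node = node.parent


def find_nearest(start, content_id):
    """Search outward from the peer's own domain; stop at the earliest level with a hit."""
    node = start
    while node is not None:
        if content_id in node.content:
            return node.name
        node = node.parent
    return None


# Usage: two labs share a campus cluster; a remote site hangs off the root.
root = LocalityDomain("root")
campus = LocalityDomain("campus", root)
lab_a = LocalityDomain("lab_a", campus)
lab_b = LocalityDomain("lab_b", campus)
remote = LocalityDomain("remote", root)
lab_b.publish("video-1")
```

    With this layout a peer in lab_a resolves "video-1" at the campus level, while a peer at the remote site only finds it at the root.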

    RATE-CONTROLLABLE PEER-TO-PEER DATA STREAM ROUTING
    14.
    Patent application (status: in force)

    Publication number: US20100146108A1

    Publication date: 2010-06-10

    Application number: US12612395

    Filing date: 2009-11-04

    CPC classification number: H04L67/104 H04L65/80 H04L67/1085

    Abstract: Difficulties associated with choosing advantageous network routes between server and clients are mitigated by a routing system that is devised to use many routing path sets, where respective sets comprise a number of routing paths covering all of the clients, including through other clients. A server may then apportion a data stream among all of the routing path sets. The server may also detect the performance of the computer network while sending the data stream between clients, and may adjust the apportionment of the routing path sets including the route. The clients may also be configured to operate as servers of other data streams, such as in a videoconferencing session, for example, and may be configured to send detected route performance information along with the portions of the various data streams.
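
    As a rough illustration of apportioning a stream among routing path sets and adjusting the split from measured performance, the sketch below divides the rate in proportion to a per-set capacity estimate. The proportional rule, the exponential smoothing, and the function names are assumptions; the abstract does not specify them.

```python
def apportion(total_rate, capacity):
    """Split total_rate across routing path sets in proportion to estimated capacity."""
    total = sum(capacity.values())
    if total <= 0:
        raise ValueError("at least one routing path set needs positive capacity")
    return {name: total_rate * cap / total for name, cap in capacity.items()}


def update_capacity(capacity, name, measured, alpha=0.5):
    """Blend a new measurement into one set's estimate (assumed smoothing rule)."""
    updated = dict(capacity)
    updated[name] = (1 - alpha) * capacity[name] + alpha * measured
    return updated


# Usage: path set "a" initially looks three times as capable as "b".
estimates = {"a": 3, "b": 1}
shares = apportion(100, estimates)
# A poor measurement on "a" shifts part of the stream toward "b".
estimates = update_capacity(estimates, "a", measured=1)
new_shares = apportion(90, estimates)
```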

    Low RAM space, high-throughput persistent key-value store using secondary memory

    Publication number: US10558705B2

    Publication date: 2020-02-11

    Application number: US12908153

    Filing date: 2010-10-20

    Abstract: Described is using flash memory (or other secondary storage), RAM-based data structures and mechanisms to access key-value pairs stored in the flash memory using only a low RAM space footprint. A mapping (e.g. hash) function maps key-value pairs to a slot in a RAM-based index. The slot includes a pointer that points to a bucket of records on flash memory that each had keys that mapped to the slot. The bucket of records is arranged as a linear-chained linked list, e.g., with pointers from the most-recently written record to the earliest written record. Also described are compacting non-contiguous records of a bucket onto a single flash page, and garbage collection. Still further described is load balancing to reduce variation in bucket sizes, using a bloom filter per slot to avoid unnecessary searching, and splitting a slot into sub-slots.
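
    The slot and bucket layout described above can be sketched as follows. This is a minimal model, not the patented design: a Python list stands in for append-only flash, CRC32 is an assumed mapping function, and FlashChainedStore is a hypothetical name. Compaction, garbage collection, bloom filters, and sub-slots are left out.

```python
import zlib


class FlashChainedStore:
    """Hypothetical sketch: one RAM pointer per slot, records chained on 'flash'."""

    def __init__(self, num_slots=8):
        self.slots = [None] * num_slots  # RAM: address of the newest record per slot
        self.log = []                    # stand-in for flash; list index = record address

    def _slot(self, key):
        return zlib.crc32(key) % len(self.slots)

    def put(self, key, value):
        slot = self._slot(key)
        # Each new record points back to the previously written record of its bucket.
        self.log.append((key, value, self.slots[slot]))
        self.slots[slot] = len(self.log) - 1

    def get(self, key):
        addr = self.slots[self._slot(key)]
        while addr is not None:          # walk from most recent to earliest record
            stored_key, value, prev = self.log[addr]
            if stored_key == key:
                return value
            addr = prev
        return None
```

    RAM cost is one pointer per slot regardless of how many keys share it, and an update is just a newer record earlier in the chain, so the most recent value is found first.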

    Content aware chunking for achieving an improved chunk size distribution
    17.
    Granted patent (status: in force)

    Publication number: US08918375B2

    Publication date: 2014-12-23

    Application number: US13222198

    Filing date: 2011-08-31

    Abstract: The subject disclosure is directed towards partitioning a file into chunks that satisfy a chunk size restriction, such as maximum and minimum chunk sizes, using a sliding window. For file positions within the chunk size restriction, a signature representative of a window fingerprint is compared with a target pattern, with a chunk boundary candidate identified if matched. Other signatures and patterns are then checked to determine a highest ranking signature (corresponding to a lowest numbered Rule) to associate with that chunk boundary candidate, or set an actual boundary if the highest ranked signature is matched. If the maximum chunk size is reached without matching the highest ranked signature, the chunking mechanism regresses to set the boundary based on the candidate with the next highest ranked signature (if no candidates, the boundary is set at the maximum). Also described is setting chunk boundaries based upon pattern detection (e.g., runs of zeros).
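
    The ranked-rule boundary selection can be sketched as below. Several details are assumptions made for illustration: CRC32 over a fixed window stands in for the window fingerprint, the rules are nested modulus tests (Rule 1 is moduli[0]), and ties between equally ranked candidates keep the earliest position. Pattern-based boundaries such as runs of zeros are not modeled.

```python
import zlib


def chunk_boundaries(data, min_size=16, max_size=64, window=8, moduli=(32, 16, 8)):
    """Return chunk end offsets; moduli[0] is the highest ranked rule (Rule 1)."""
    if not 1 <= window <= min_size <= max_size:
        raise ValueError("need 1 <= window <= min_size <= max_size")
    boundaries, start = [], 0
    while start < len(data):
        limit = min(start + max_size, len(data))
        cut, best = None, None  # best = (rule index, offset) of the top candidate so far
        for end in range(start + min_size, limit + 1):
            fp = zlib.crc32(data[end - window:end])
            # Highest ranked rule matched at this position, if any.
            rule = next((r for r, m in enumerate(moduli) if fp % m == 0), None)
            if rule == 0:
                cut = end       # Rule 1 matched: set the actual boundary now
                break
            if rule is not None and (best is None or rule < best[0]):
                best = (rule, end)
        if cut is None:
            reached_max = limit - start == max_size
            # Regress to the best lower ranked candidate, else cut at the maximum
            # (or at end of data for the final chunk).
            cut = best[1] if (best and reached_max) else limit
        boundaries.append(cut)
        start = cut
    return boundaries
```

    Every chunk except possibly the last falls between min_size and max_size, and the fallback candidates reduce how often a boundary is forced at exactly max_size.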

    FLASH MEMORY CACHE INCLUDING FOR USE WITH PERSISTENT KEY-VALUE STORE
    18.
    Patent application (status: in force)

    Publication number: US20130282965A1

    Publication date: 2013-10-24

    Application number: US13919738

    Filing date: 2013-06-17

    Abstract: Described is using flash memory, RAM-based data structures and mechanisms to provide a flash store for caching data items (e.g., key-value pairs) in flash pages. A RAM-based index maps data items to flash pages, and a RAM-based write buffer maintains data items to be written to the flash store, e.g., when a full page can be written. A recycle mechanism makes used pages in the flash store available by destaging a data item to a hard disk or reinserting it into the write buffer, based on its access pattern. The flash store may be used in a data deduplication system, in which the data items comprise chunk-identifier, metadata pairs, in which each chunk-identifier corresponds to a hash of a chunk of data that indicates. The RAM and flash are accessed with the chunk-identifier (e.g., as a key) to determine whether a chunk is a new chunk or a duplicate.
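
    A toy model of the write buffer, RAM index, and recycle mechanism is sketched below. Dictionaries stand in for flash pages and the hard disk, "read since written" is an assumed stand-in for the access pattern test, and the class and parameter names are hypothetical.

```python
class FlashCache:
    """Hypothetical sketch of a flash store with a RAM write buffer and RAM index."""

    def __init__(self, page_items=2, max_pages=2):
        self.page_items = page_items  # items that fill one flash page
        self.max_pages = max_pages    # flash capacity in pages
        self.buffer = {}              # RAM write buffer: key -> value
        self.pages = {}               # stand-in for flash: page id -> {key: value}
        self.index = {}               # RAM index: key -> page id
        self.hot = set()              # keys read since they reached flash
        self.disk = {}                # stand-in for the hard disk
        self.next_page = 0

    def put(self, key, value):
        self.buffer[key] = value
        self.index.pop(key, None)     # any older flash copy is now stale
        self.hot.discard(key)
        if len(self.buffer) >= self.page_items:
            self._flush()

    def get(self, key):
        if key in self.buffer:
            return self.buffer[key]
        page_id = self.index.get(key)
        if page_id is not None:
            self.hot.add(key)
            return self.pages[page_id][key]
        return self.disk.get(key)

    def _flush(self):
        # Write a full page to flash, then recycle the oldest page if over capacity.
        page_id, self.next_page = self.next_page, self.next_page + 1
        self.pages[page_id] = self.buffer
        for key in self.buffer:
            self.index[key] = page_id
        self.buffer = {}
        if len(self.pages) > self.max_pages:
            self._recycle(min(self.pages))

    def _recycle(self, page_id):
        for key, value in self.pages.pop(page_id).items():
            if self.index.get(key) != page_id:
                continue              # superseded by a newer write
            del self.index[key]
            if key in self.hot:
                self.put(key, value)  # recently read: reinsert into the write buffer
            else:
                self.disk[key] = value  # cold: destage to disk
```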

    Optimized transport protocol for delay-sensitive data
    19.
    Granted patent (status: in force)

    Publication number: US08228800B2

    Publication date: 2012-07-24

    Application number: US12364520

    Filing date: 2009-02-03

    Abstract: Transmission delays are minimized when packets are transmitted from a source computer over a network to a destination computer. The source computer measures the network's available bandwidth, forms a sequence of output packets from a sequence of data packets, and transmits the output packets over the network to the destination computer, where the transmission rate is ramped up to the measured bandwidth. In conjunction with the transmission, the source computer monitors a transmission delay indicator which it computes using acknowledgement packets it receives from the destination computer. Whenever the indicator specifies that the transmission delay is increasing, the source computer reduces the transmission rate until the indicator specifies that the delay is unchanged. The source computer dynamically decides whether each output packet will be a forward error correction packet or a single data packet, where the decision is based on minimizing the expected transmission delays.
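
    The rate adjustment loop can be sketched as a single update step. The multiplicative ramp and backoff factors, and the use of the difference between the last two round-trip samples as the delay indicator, are assumptions chosen for illustration; the abstract does not give the actual formulas. The forward error correction decision is not modeled.

```python
def delay_trend(rtt_samples):
    """Assumed delay indicator: change between the two latest acknowledgement RTTs."""
    if len(rtt_samples) < 2:
        return 0.0
    return rtt_samples[-1] - rtt_samples[-2]


def next_rate(rate, bandwidth, trend, ramp=1.25, backoff=0.5):
    """Back off while delay is rising; otherwise ramp up, capped at measured bandwidth."""
    if trend > 0:
        return rate * backoff
    return min(bandwidth, rate * ramp)


# Usage: a rising RTT triggers a reduction, a flat RTT resumes the ramp.
rate = next_rate(100, bandwidth=1000, trend=delay_trend([40.0, 55.0]))
rate_after = next_rate(rate, bandwidth=1000, trend=delay_trend([55.0, 55.0]))
```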

    Using Index Partitioning and Reconciliation for Data Deduplication
    20.
    Patent application (status: in force)

    Publication number: US20120166401A1

    Publication date: 2012-06-28

    Application number: US12979748

    Filing date: 2010-12-28

    Abstract: The subject disclosure is directed towards a data deduplication technology in which a hash index service's index is partitioned into subspace indexes, with less than the entire hash index service's index cached to save memory. The subspace index is accessed to determine whether a data chunk already exists or needs to be indexed and stored. The index may be divided into subspaces based on criteria associated with the data to index, such as file type, data type, time of last usage, and so on. Also described is subspace reconciliation, in which duplicate entries in subspaces are detected so as to remove entries and chunks from the deduplication system. Subspace reconciliation may be performed at off-peak time, when more system resources are available, and may be interrupted if resources are needed. Subspaces to reconcile may be based on similarity, including via similarity of signatures that each compactly represents the subspace's hashes.
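
    Subspace indexing and reconciliation can be pictured with the sketch below. It is a simplified model: each subspace is a plain dictionary, deduplication at ingest only looks inside the chunk's own subspace, and reconcile removes cross-subspace duplicates later. The names are hypothetical, and a real system would also redirect file references to the surviving chunk.

```python
import hashlib


def chunk_hash(chunk):
    return hashlib.sha256(chunk).hexdigest()


class SubspaceIndex:
    """Hypothetical sketch: one small hash index per subspace (e.g. per file type)."""

    def __init__(self):
        self.subspaces = {}  # subspace name -> {chunk hash: chunk}

    def add(self, subspace, chunk):
        """Index the chunk within one subspace; return True if it was new there."""
        index = self.subspaces.setdefault(subspace, {})
        digest = chunk_hash(chunk)
        if digest in index:
            return False
        index[digest] = chunk
        return True

    def reconcile(self, keep, other):
        """Drop entries of `other` that duplicate entries of `keep`; return the count."""
        duplicates = self.subspaces[keep].keys() & self.subspaces[other].keys()
        for digest in duplicates:
            del self.subspaces[other][digest]
        return len(duplicates)
```

    Only the subspace being written needs to be cached during ingest; the duplicates this misses are recovered when reconcile runs, for example at off-peak time.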
