Method and apparatus for storing data with reduced redundancy using data clusters
    1.
    Granted invention patent
    Status: in force

    Publication number: US08255434B2

    Publication date: 2012-08-28

    Application number: US12876396

    Filing date: 2010-09-07

    IPC classification: G06F7/00 G06F17/30

    Abstract: Method and apparatus for storing data in a reduced redundancy form. Binary Large Objects (BLOBs) are partitioned into subblocks according to a partitioning method, and the subblocks are stored in subblock clusters. Each BLOB is represented as a list of spans of subblocks, each of which identifies a contiguous sequence of subblocks within a cluster. Storage redundancy can be reduced because the spans of two different BLOBs can refer to the same subblocks. An index may be used to map subblock hashes to subblock cluster numbers.

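The span representation described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the fixed-size partitioning, the tiny cluster capacity, and the names `Span` and `ClusterStore` are all assumptions made for illustration, and the index here maps a subblock hash to a cluster number plus a position rather than to a cluster number alone.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    cluster: int  # cluster number
    start: int    # position of the first subblock within the cluster
    count: int    # number of contiguous subblocks covered


class ClusterStore:
    """Toy store: fixed-size subblocks (an assumption), grouped into clusters."""

    def __init__(self, subblock_size=4, cluster_capacity=8):
        self.subblock_size = subblock_size
        self.cluster_capacity = cluster_capacity
        self.clusters = [[]]  # each cluster is a list of subblock bytes
        self.index = {}       # subblock hash -> (cluster number, position)

    def _locate_or_add(self, subblock):
        key = hashlib.sha256(subblock).digest()
        if key not in self.index:
            if len(self.clusters[-1]) >= self.cluster_capacity:
                self.clusters.append([])
            self.clusters[-1].append(subblock)
            self.index[key] = (len(self.clusters) - 1, len(self.clusters[-1]) - 1)
        return self.index[key]

    def store(self, blob):
        """Store a BLOB and return its list of spans."""
        spans = []
        for i in range(0, len(blob), self.subblock_size):
            cluster, pos = self._locate_or_add(blob[i:i + self.subblock_size])
            last = spans[-1] if spans else None
            if last and last.cluster == cluster and last.start + last.count == pos:
                # Extend the current span by one contiguous subblock.
                spans[-1] = Span(cluster, last.start, last.count + 1)
            else:
                spans.append(Span(cluster, pos, 1))
        return spans

    def retrieve(self, spans):
        """Rebuild a BLOB by concatenating the subblocks its spans refer to."""
        return b"".join(
            b"".join(self.clusters[s.cluster][s.start:s.start + s.count])
            for s in spans
        )
```

Storing `b"AAAABBBBCCCC"` and then `b"BBBBCCCCDDDD"` in one `ClusterStore` adds only one new subblock for the second BLOB, because its span overlaps subblocks already held in the cluster.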

    Method and apparatus for detecting the presence of subblocks in a reduced redundancy storing system
    2.
    Granted invention patent
    Status: in force

    Publication number: US08650368B2

    Publication date: 2014-02-11

    Application number: US13486408

    Filing date: 2012-06-01

    IPC classification: G06F12/00

    CPC classification: G06F17/30097 G06F17/30153

    Abstract: This application concerns determining whether a particular subblock of data is present in a reduced-redundancy storage system. One embodiment achieves this by hashing subblocks in the storage system into a bitfilter that contains a ‘1’ bit for each position to which at least one subblock hashes. This bitfilter provides a fast way to determine whether a subblock is in the storage system. In another embodiment, index entries for new subblocks may be buffered in a subblock index write buffer to convert a large number of random access read and write operations into a single sequential read and a single sequential write operation. The combination of the bitfilter and the write buffer yields a reduced-redundancy storage system that uses significantly less high speed random access memory than other systems that store the entire subblock index in memory.

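The bitfilter idea in this abstract can be sketched as a bit array with a single hash position per subblock. This is only an illustration under assumptions: the class name `BitFilter`, the filter size, and the use of SHA-256 are not taken from the patent. A `0` bit proves a subblock is absent; a `1` bit only means it may be present, so the full index still has to be consulted in that case.

```python
import hashlib


class BitFilter:
    """One bit per hash position; num_bits is assumed to be a multiple of 8."""

    def __init__(self, num_bits=1 << 16):
        self.num_bits = num_bits
        self.bits = bytearray(num_bits // 8)

    def _position(self, subblock):
        digest = hashlib.sha256(subblock).digest()
        return int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, subblock):
        """Set the bit for the position this subblock hashes to."""
        pos = self._position(subblock)
        self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, subblock):
        """False means definitely absent; True means possibly present."""
        pos = self._position(subblock)
        return bool(self.bits[pos // 8] & (1 << (pos % 8)))
```

Because most subblocks presented to a store are either new or cluster with known data, a cheap negative answer of this kind avoids many index lookups, which is the memory saving the abstract points to.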

    Method for partitioning a block of data into subblocks and for storing and communcating such subblocks
    3.
    Granted invention patent
    Status: lapsed

    Publication number: US5990810A

    Publication date: 1999-11-23

    Application number: US894091

    Filing date: 1997-08-15

    Abstract: This invention provides a method and apparatus for detecting common spans within one or more data blocks by partitioning the blocks (FIG. 4) into subblocks and searching the group of subblocks (FIG. 12) (or their corresponding hashes (FIG. 13)) for duplicates. Blocks can be partitioned into subblocks using a variety of methods, including methods that place subblock boundaries at fixed positions (FIG. 3), methods that place subblock boundaries at data-dependent positions (FIG. 3), and methods that yield multiple overlapping subblocks (FIG. 6). By comparing the hashes of subblocks, common spans of one or more blocks can be identified without ever having to compare the blocks or subblocks themselves (FIG. 13). This leads to several applications including an incremental backup system that backs up changes rather than changed files (FIG. 25), a utility that determines the similarities and differences between two files (FIG. 13), a file system that stores each unique subblock at most once (FIG. 26), and a communications system that eliminates the need to transmit subblocks already possessed by the receiver (FIG. 19).

    PCT information: PCT No. PCT/AU96/00081; Sec. 371 date: 1997-08-15; Sec. 102(e) date: 1997-08-15; PCT filed: 1996-02-15; PCT Pub. No. WO96/25801; PCT Pub. date: 1996-08-22.
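Placing subblock boundaries at data-dependent positions, as this abstract describes, can be sketched as follows. The boundary rule used here (cut wherever the CRC-32 of the last `WINDOW` bytes is divisible by `DIVISOR`) and both constant values are assumptions chosen for illustration, not the rule claimed in the patent, and the window hash is recomputed at every position rather than rolled.

```python
import hashlib
import zlib

WINDOW = 8    # bytes examined at each position (assumed value)
DIVISOR = 64  # a cut is expected about once every 64 positions (assumed value)


def partition(data: bytes) -> list[bytes]:
    """Split data wherever the hash of the preceding WINDOW bytes hits the target."""
    subblocks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        if zlib.crc32(data[end - WINDOW:end]) % DIVISOR == 0:
            subblocks.append(data[start:end])
            start = end
    if start < len(data):
        subblocks.append(data[start:])  # trailing bytes after the last cut
    return subblocks


def common_subblock_hashes(a: bytes, b: bytes) -> set[bytes]:
    """Find shared subblocks by comparing hashes only, never the data itself."""
    hashes_a = {hashlib.sha256(s).digest() for s in partition(a)}
    hashes_b = {hashlib.sha256(s).digest() for s in partition(b)}
    return hashes_a & hashes_b
```

Because each cut depends only on nearby bytes, inserting a byte at the front of a block shifts the later boundaries along with the data, so every subblock after the first one reappears unchanged. Fixed-position boundaries would not have this property.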

    METHOD FOR IDENTIFYING POTENTIAL DEFECTS IN A BLOCK OF TEXT USING SOCIALLY CONTRIBUTED PATTERN/MESSAGE RULES
    4.
    Invention patent application
    Status: under examination (published)

    Publication number: US20140047315A1

    Publication date: 2014-02-13

    Application number: US14112158

    Filing date: 2012-04-18

    IPC classification: G06F17/24

    Abstract: This invention provides a method and apparatus for identifying potential errors in a block of text using rules contributed by a plurality of users. Each rule consists of a pattern (which matches parts of a block of text) and a message (which provides helpful information). A group of rules is applied to a block of text to generate a report that binds messages with sites in the text where the corresponding rule patterns matched. Users can create, organise, edit, publish, rate, and combine rules and groups of rules. User ratings are used to generate better reports. The invention has many potential embodiments, with a web interface being an exemplary embodiment.

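A pattern/message rule set of the kind this abstract describes might look like the sketch below. The use of regular expressions as the pattern language, the `rating` field, and the choice to list higher-rated rules first are assumptions; the abstract does not say how patterns are written or how ratings affect the report.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str         # regular expression (assumed pattern language)
    message: str         # helpful text shown at each match site
    rating: float = 0.0  # hypothetical aggregate user rating


def check(text, rules):
    """Return (offset, matched text, message) entries, highest-rated rules first."""
    report = []
    for rule in sorted(rules, key=lambda r: r.rating, reverse=True):
        for match in re.finditer(rule.pattern, text):
            report.append((match.start(), match.group(0), rule.message))
    return report


rules = [
    Rule(r"\b(\w+) \1\b", "Repeated word", rating=3.0),
    Rule(r"\bteh\b", "Did you mean 'the'?", rating=4.5),
]
report = check("Fix teh bug in in the parser", rules)
```

Each report entry binds a message to the site in the text where its rule matched, which is the structure the abstract describes.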

    Method for mapping a file specification to a sequence of actions
    5.
    Granted invention patent
    Status: lapsed

    Publication number: US5822746A

    Publication date: 1998-10-13

    Application number: US525280

    Filing date: 1995-07-05

    IPC classification: G06F17/30

    Abstract: A method for mapping a file specification to a sequence of zero or more actions. A file specification consists of any finite string of bits that provides information about the identity of a file in a computer's file system. An action can be any process that effects a change upon the computer system containing the file. The mapping method involves sequentially applying a list of pattern/action rules, with each rule's action being executed if the rule's pattern matches the file specification. Upon completion, the series of actions that has been executed is the sequence of actions corresponding to the file specification.

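The sequential rule application described in this abstract can be sketched as a loop over (pattern, action) pairs. Shell-style wildcards via `fnmatch.fnmatchcase` are an assumed pattern language, and the two actions are hypothetical stand-ins that only record what they would do.

```python
from fnmatch import fnmatchcase

log = []  # stands in for changes made to the computer system


def compress(file_spec):
    log.append(("compress", file_spec))


def backup(file_spec):
    log.append(("backup", file_spec))


def map_to_actions(file_spec, rules):
    """Apply each rule in order; run and record the action when its pattern matches."""
    executed = []
    for pattern, action in rules:
        if fnmatchcase(file_spec, pattern):
            action(file_spec)
            executed.append(action.__name__)
    return executed


rules = [("*.log", compress), ("reports/*", backup), ("*.tmp", compress)]
sequence = map_to_actions("reports/app.log", rules)
```

A file specification that matches no pattern maps to an empty sequence, which corresponds to the "zero or more actions" wording in the abstract.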

    Method and apparatus for indexing in a reduced-redundancy storage system
    6.
    Granted invention patent
    Status: in force

    Publication number: US08356021B2

    Publication date: 2013-01-15

    Application number: US11372603

    Filing date: 2006-03-10

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30067

    Abstract: Method and apparatus for indexing subblocks in a reduced-redundancy storage system. Each subblock is hashed to a K-bit key, and an entry for the subblock is added to an index data structure comprising a tree of hash tables. In a further aspect, by replacing the top of the tree with an array, the data structure can achieve O(1) access time for random keys while still providing relatively smooth growth.

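One way to read "a tree of hash tables" is a binary tree whose leaves are small hash tables and whose internal nodes branch on successive key bits. The sketch below follows that reading; the split policy, the key width `K`, the leaf capacity, and the class names are assumptions, and the patent's actual structure may differ. `ArrayTopIndex` replaces the top levels of the tree with an array indexed by the leading key bits, in the spirit of the abstract's second aspect.

```python
import hashlib

K = 64             # key width in bits (assumed)
LEAF_CAPACITY = 4  # entries a leaf holds before splitting (tiny, for demonstration)


def key_of(subblock: bytes) -> int:
    """Hash a subblock to a K-bit integer key."""
    return int.from_bytes(hashlib.sha256(subblock).digest()[:K // 8], "big")


class Node:
    def __init__(self, depth):
        self.depth = depth    # number of leading key bits consumed above this node
        self.table = {}       # leaf hash table: key -> value
        self.children = None  # [zero-bit child, one-bit child] after a split

    def _child(self, key):
        return self.children[(key >> (K - 1 - self.depth)) & 1]

    def _leaf_for(self, key):
        node = self
        while node.children is not None:
            node = node._child(key)
        return node

    def insert(self, key, value):
        leaf = self._leaf_for(key)
        leaf.table[key] = value
        if len(leaf.table) > LEAF_CAPACITY and leaf.depth < K:
            # Only the overflowing leaf is redistributed, so growth stays gradual.
            leaf.children = [Node(leaf.depth + 1), Node(leaf.depth + 1)]
            for k, v in leaf.table.items():
                leaf._child(k).table[k] = v
            leaf.table = {}

    def lookup(self, key):
        return self._leaf_for(key).table.get(key)


class ArrayTopIndex:
    """Array of subtrees selected directly by the top bits of the key."""

    def __init__(self, top_bits=4):
        self.top_bits = top_bits
        self.roots = [Node(top_bits) for _ in range(1 << top_bits)]

    def _root(self, key):
        return self.roots[key >> (K - self.top_bits)]

    def insert(self, key, value):
        self._root(key).insert(key, value)

    def lookup(self, key):
        return self._root(key).lookup(key)
```

Splitting one leaf at a time avoids the pause of rehashing a single large table, which is one plausible interpretation of the "relatively smooth growth" the abstract mentions.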

    Method and apparatus for storing data with reduced redundancy using data clusters
    7.
    Granted invention patent
    Status: in force

    Publication number: US07814129B2

    Publication date: 2010-10-12

    Application number: US11373420

    Filing date: 2006-03-10

    IPC classification: G06F7/00 G06F17/30

    Abstract: Method and apparatus for storing data in a reduced redundancy form. Binary Large Objects (BLOBs) are partitioned into subblocks according to a partitioning method, and the subblocks are stored in subblock clusters. Each BLOB is represented as a list of spans of subblocks, each of which identifies a contiguous sequence of subblocks within a cluster. Storage redundancy can be reduced because the spans of two different BLOBs can refer to the same subblocks. An index may be used to map subblock hashes to subblock cluster numbers.


    Method and apparatus for detecting the presence of subblocks in a reduced-redundancy storage system
    8.
    Granted invention patent
    Status: in force

    Publication number: US08214607B2

    Publication date: 2012-07-03

    Application number: US13177799

    Filing date: 2011-07-07

    CPC classification: G06F17/30097 G06F17/30153

    Abstract: Method and apparatus for rapidly determining whether a particular subblock of data is present in a reduced-redundancy storage system. An aspect of the invention achieves this by hashing each subblock in the storage system into a bitfilter that contains a ‘1’ bit for each position to which at least one subblock hashes. This bitfilter provides an extremely fast way to determine whether a subblock is in the storage system. In a further aspect of the invention, index entries for new subblocks may be buffered in a subblock index write buffer so as to convert a large number of random access read and write operations into a single sequential read and a single sequential write operation. The combination of the bitfilter and the write buffer yields a reduced-redundancy storage system that uses significantly less high speed random access memory than is used by systems that store the entire subblock index in memory.

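The write-buffer aspect of this abstract can be sketched as follows. It assumes the on-disk index is a file of entries sorted by key, modelled here as a sorted Python list, and that only new (previously unseen) keys are added; the class name `BufferedIndex` and the buffer limit are hypothetical.

```python
import bisect
import heapq


class BufferedIndex:
    def __init__(self, buffer_limit=4):
        self.buffer_limit = buffer_limit
        self.buffer = {}   # in-memory pending entries: key -> cluster number
        self.on_disk = []  # stands in for a sorted on-disk index file
        self.flushes = 0

    def add(self, key, cluster):
        """Buffer an entry for a new subblock; flush when the buffer is full."""
        self.buffer[key] = cluster
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        """Merge the buffer into the index in one sequential pass."""
        pending = sorted(self.buffer.items())
        # One sequential read of the old index, merged with the sorted buffer,
        # followed by one sequential write of the merged result.
        self.on_disk = list(heapq.merge(self.on_disk, pending))
        self.buffer.clear()
        self.flushes += 1

    def lookup(self, key):
        if key in self.buffer:
            return self.buffer[key]
        i = bisect.bisect_left(self.on_disk, (key,))
        if i < len(self.on_disk) and self.on_disk[i][0] == key:
            return self.on_disk[i][1]
        return None
```

Without the buffer, each new entry would cost a random read and a random write against the index; with it, many entries share the cost of one sequential merge.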

    Method and apparatus for detecting the presence of subblocks in a reduced-redundancy storage system
    9.
    Granted invention patent
    Status: in force

    Publication number: US08051252B2

    Publication date: 2011-11-01

    Application number: US11373569

    Filing date: 2006-03-10

    CPC classification: G06F17/30097 G06F17/30153

    Abstract: Method and apparatus for rapidly determining whether a particular subblock of data is present in a reduced-redundancy storage system. An aspect of the invention achieves this by hashing each subblock in the storage system into a bitfilter that contains a ‘1’ bit for each position to which at least one subblock hashes. This bitfilter provides an extremely fast way to determine whether a subblock is in the storage system. In a further aspect of the invention, index entries for new subblocks may be buffered in a subblock index write buffer so as to convert a large number of random access read and write operations into a single sequential read and a single sequential write operation. The combination of the bitfilter and the write buffer yields a reduced-redundancy storage system that uses significantly less high speed random access memory than is used by systems that store the entire subblock index in memory.


    Method for matching elements of two groups
    10.
    Granted invention patent
    Status: lapsed

    Publication number: US5737594A

    Publication date: 1998-04-07

    Application number: US525281

    Filing date: 1995-07-05

    IPC classification: G06F17/30

    Abstract: A method for matching elements of two groups of data objects whose elements do not necessarily exactly match. The method consists of examining successively more abstract projections of the two groups until exact matches occur within elements of the same group, or between elements of the different groups, or until there are no longer any more abstract projections to apply. Both random access and sequential embodiments are described.

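Matching through successively more abstract projections can be sketched as below. This is a simplified reading that only pairs elements across the two groups and leaves out the abstract's within-group case; the function name and the three example projections (identity, lower-casing, stripping non-alphanumeric characters) are assumptions.

```python
def match_groups(group_a, group_b, projections):
    """Pair elements whose projections are equal, trying the least abstract first."""
    matches = []
    left_a, left_b = list(group_a), list(group_b)
    for project in projections:
        by_projection = {}
        for b in left_b:
            by_projection.setdefault(project(b), []).append(b)
        still_a = []
        for a in left_a:
            candidates = by_projection.get(project(a))
            if candidates:
                matches.append((a, candidates.pop(0)))
            else:
                still_a.append(a)
        # Elements not matched at this level move on to the next projection.
        left_a = still_a
        left_b = [b for bs in by_projection.values() for b in bs]
    return matches, left_a, left_b


projections = [
    lambda s: s,                                            # exact
    str.lower,                                              # ignore case
    lambda s: "".join(c for c in s.lower() if c.isalnum()), # ignore punctuation too
]
matches, only_a, only_b = match_groups(
    ["README.txt", "Main.c", "data-set.csv", "only_a"],
    ["main.c", "dataset.CSV", "README.txt", "only_b"],
    projections,
)
```

In this example `README.txt` pairs at the exact level, `Main.c` at the case-insensitive level, and `data-set.csv` only once punctuation is ignored, while `only_a` and `only_b` remain unmatched after the last projection.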