Search index format optimizations
    1.
    Granted invention patent
    Search index format optimizations (status: in force)

    Publication No.: US08914380B2

    Publication date: 2014-12-16

    Application No.: US13424137

    Filing date: 2012-03-19

    IPC classification: G06F17/30

    Abstract: A search index structure which extends a typical composite index by incorporating an index which is optimized for fast retrieval from storage and which eliminates data which is specific to phrase searching. Other data is represented in a manner which allows it to be calculated rather than stored. Associating variable length entries with logical categories allows their length to be inferred from the category rather than stored. Using delta values between document IDs rather than the ID itself generates a compact, dense symbol set which is efficiently compressed by Huffman encoding or a similar compression method. Using an upper threshold to remove large, and thus rare, delta values from the symbol set prior to encoding further improves the encoding performance.
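The delta-plus-Huffman idea in this abstract can be illustrated with a small sketch. It is not the patented format: the `ESCAPE` symbol, the default threshold of 16, the side list of escaped deltas, and all function names are assumptions chosen for illustration. Document IDs are assumed to be sorted, positive integers.

```python
import heapq
from collections import Counter

ESCAPE = -1  # hypothetical symbol standing in for any delta above the threshold


def to_deltas(doc_ids):
    """Turn a sorted list of document IDs into gaps between neighbours."""
    deltas, prev = [], 0
    for doc_id in doc_ids:
        deltas.append(doc_id - prev)
        prev = doc_id
    return deltas


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(doc_ids, threshold=16):
    """Huffman-code small deltas; large deltas become ESCAPE plus a raw value."""
    deltas = to_deltas(doc_ids)
    symbols = [d if d <= threshold else ESCAPE for d in deltas]
    code = huffman_code(symbols)
    bits = "".join(code[s] for s in symbols)
    escaped = [d for d in deltas if d > threshold]  # kept outside the symbol set
    return bits, code, escaped


def decode(bits, code, escaped):
    """Invert encode(): walk the bit string and rebuild the document IDs."""
    inverse = {b: s for s, b in code.items()}
    escaped = iter(escaped)
    doc_ids, prev, buf = [], 0, ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            sym = inverse[buf]
            prev += next(escaped) if sym == ESCAPE else sym
            doc_ids.append(prev)
            buf = ""
    return doc_ids
```

For example, `encode([3, 4, 5, 7, 8, 1000, 1001, 1003])` works on the deltas `[3, 1, 1, 2, 1, 992, 1, 2]`. The single large gap of 992 is routed through the escape path, so the Huffman alphabet holds only the small, frequent values.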

    TECHNIQUES FOR FACILITATING COPY CREATION
    2.
    Invention application
    TECHNIQUES FOR FACILITATING COPY CREATION (status: under examination, published)

    Publication No.: US20100191707A1

    Publication date: 2010-07-29

    Application No.: US12358263

    Filing date: 2009-01-23

    IPC classification: G06F17/30

    Abstract: Various techniques are disclosed for creating a snapshot of application data. A snapshot is taken by pausing parts of the application over time. Modifications are paused to a first part of data and the first part is copied into a snapshot. After the first part has finished copying, modifications are paused to remaining data, and the remaining data is copied. The application is unpaused. A snapshot can be taken by unpausing parts of the application over time. Modifications to data in an application are paused. A first part of data is copied, and after the first part has finished copying, modifications to the first part are unpaused. The final part of data is copied, and after the final part has finished copying, modifications to the final part are unpaused. Techniques for creating a snapshot of data residing in multiple locations are described.
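The two orderings described in this abstract (pausing parts over time, and unpausing parts over time) can be sketched as follows. This is only an illustration under stated assumptions: the `PartitionedStore` class, its method names, and the use of one `threading.Lock` per part to model "pausing modifications" are all hypothetical and are not taken from the application.

```python
import threading


class PartitionedStore:
    """Hypothetical application data split into named parts, one lock each."""

    def __init__(self, parts):
        self._data = {name: dict(values) for name, values in parts.items()}
        self._locks = {name: threading.Lock() for name in parts}

    def write(self, part, key, value):
        # A writer blocks while its part is paused (its lock is held).
        with self._locks[part]:
            self._data[part][key] = value

    def snapshot_pausing_over_time(self, order):
        """Pause parts one by one, copy each, then unpause everything."""
        snapshot, held = {}, []
        try:
            for name in order:
                self._locks[name].acquire()  # pause modifications to this part
                held.append(name)
                snapshot[name] = dict(self._data[name])
        finally:
            for name in held:
                self._locks[name].release()  # unpause the application
        return snapshot

    def snapshot_unpausing_over_time(self, order):
        """Pause everything up front, then copy and unpause part by part."""
        for name in order:
            self._locks[name].acquire()  # pause all modifications
        snapshot = {}
        for name in order:
            try:
                snapshot[name] = dict(self._data[name])
            finally:
                self._locks[name].release()  # this part may change again
        return snapshot
```

In the first variant, writers to later parts keep running while earlier parts are copied. In the second, writers to earlier parts resume while later parts are still being copied. Whether either ordering yields a snapshot the application considers consistent depends on how its parts relate to each other, which this sketch does not model.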

    Search index format optimizations
    3.
    Granted invention patent
    Search index format optimizations (status: in force)

    Publication No.: US08166041B2

    Publication date: 2012-04-24

    Application No.: US12139213

    Filing date: 2008-06-13

    IPC classification: G06F7/00 G06F17/30

    Abstract: A search index structure which extends a typical composite index by incorporating an index which is optimized for fast retrieval from storage and which eliminates data which is specific to phrase searching. Other data is represented in a manner which allows it to be calculated rather than stored. Associating variable length entries with logical categories allows their length to be inferred from the category rather than stored. Using delta values between document IDs rather than the ID itself generates a compact, dense symbol set which is efficiently compressed by Huffman encoding or a similar compression method. Using an upper threshold to remove large, and thus rare, delta values from the symbol set prior to encoding further improves the encoding performance.
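One way to read the statement that entry length can be "inferred from the category rather than stored" is sketched below. The abstract does not specify a layout, so everything here is an assumption made for illustration: the three categories, their byte widths in `CATEGORY_WIDTH`, the 4-byte per-category count header, and the function names. Values are assumed to be non-negative integers below 2^32, and entries are regrouped by category rather than kept in their original order.

```python
# Hypothetical categories: each maps to a fixed entry width in bytes.
CATEGORY_WIDTH = {0: 1, 1: 2, 2: 4}


def category_for(value):
    """Pick the smallest category whose width can hold the value."""
    for cat, width in sorted(CATEGORY_WIDTH.items()):
        if value < 1 << (8 * width):
            return cat
    raise ValueError("value too large for any category")


def pack(values):
    """Group entries by category; store one count per category, no lengths."""
    groups = {cat: [] for cat in sorted(CATEGORY_WIDTH)}
    for v in values:
        groups[category_for(v)].append(v)
    out = bytearray()
    for members in groups.values():
        out += len(members).to_bytes(4, "big")  # header: entries per category
    for cat, members in groups.items():
        for v in members:
            out += v.to_bytes(CATEGORY_WIDTH[cat], "big")
    return bytes(out)


def unpack(blob):
    """Recover the entries; each width comes from the category table."""
    cats = sorted(CATEGORY_WIDTH)
    counts = [int.from_bytes(blob[4 * i:4 * i + 4], "big")
              for i in range(len(cats))]
    pos = 4 * len(cats)
    groups = {}
    for cat, count in zip(cats, counts):
        width = CATEGORY_WIDTH[cat]
        groups[cat] = [
            int.from_bytes(blob[pos + j * width:pos + (j + 1) * width], "big")
            for j in range(count)
        ]
        pos += count * width
    return groups
```

The saving comes from replacing a length field on every entry with a fixed-size header: the reader knows how many bytes each entry occupies purely from the category it is currently reading.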

    SEARCH INDEX FORMAT OPTIMIZATIONS
    4.
    Invention application
    SEARCH INDEX FORMAT OPTIMIZATIONS (status: in force)

    Publication No.: US20090313238A1

    Publication date: 2009-12-17

    Application No.: US12139213

    Filing date: 2008-06-13

    IPC classification: G06F7/06 G06F17/30

    Abstract: A search index structure which extends a typical composite index by incorporating an index which is optimized for fast retrieval from storage and which eliminates data which is specific to phrase searching. Other data is represented in a manner which allows it to be calculated rather than stored. Associating variable length entries with logical categories allows their length to be inferred from the category rather than stored. Using delta values between document IDs rather than the ID itself generates a compact, dense symbol set which is efficiently compressed by Huffman encoding or a similar compression method. Using an upper threshold to remove large, and thus rare, delta values from the symbol set prior to encoding further improves the encoding performance.