TRANSACTION LOG LAYOUT FOR EFFICIENT RECLAMATION AND RECOVERY

    Publication No.: US20170097771A1

    Publication date: 2017-04-06

    Application No.: US14872793

    Filing date: 2015-10-01

    Applicant: NetApp, Inc.

    Abstract: A layout of a transaction log enables efficient logging of metadata into entries of the log, as well as efficient reclamation and recovery of the log entries by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The transaction log is illustratively a two-stage, append-only logging structure, wherein the first stage is non-volatile random access memory (NVRAM) embodied as an NV log and the second stage is disk, e.g., solid state drive (SSD). The layout of the logging structure facilitates steady-state logging of metadata managed by the volume layer and crash recovery. Steady-state logging of metadata into the log entries occurs while the storage I/O stack of a node actively processes I/O requests, while crash recovery of the log entries occurs after an unexpected shutdown of the node.
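The two-stage, append-only structure described in this abstract can be pictured with a minimal sketch. It is not the patented layout: the class name `TwoStageLog`, the `nv_capacity` parameter, and the use of in-memory lists to stand in for NVRAM and SSD are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TwoStageLog:
    """Toy two-stage, append-only log: a bounded NV stage that spills to a disk stage."""
    nv_capacity: int = 4                            # hypothetical NV stage size, in entries
    nv_stage: list = field(default_factory=list)    # stands in for the NVRAM-backed NV log
    disk_stage: list = field(default_factory=list)  # stands in for the on-SSD log
    next_seq: int = 0

    def append(self, payload: str) -> int:
        """Append one metadata entry; existing entries are never modified in place."""
        if len(self.nv_stage) >= self.nv_capacity:
            self.flush()
        seq = self.next_seq
        self.nv_stage.append((seq, payload))
        self.next_seq += 1
        return seq

    def flush(self) -> None:
        """Move the NV stage to the disk stage in order, then reclaim the NV stage."""
        self.disk_stage.extend(self.nv_stage)
        self.nv_stage.clear()

    def entries(self) -> list:
        """All entries, oldest first: disk stage followed by NV stage."""
        return self.disk_stage + self.nv_stage
```

In this sketch, reclaiming the first stage is cheap because a flush moves whole runs of entries in sequence order, so the combined log stays sorted by sequence number.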

    SNAPSHOT AND/OR CLONE COPY-ON-WRITE
    Patent application (status: under examination, published)

    Publication No.: US20170032005A1

    Publication date: 2017-02-02

    Application No.: US14814804

    Filing date: 2015-07-31

    Applicant: NetApp, Inc.

    CPC classification number: G06F16/128

    Abstract: A technique improves efficiency of a copy-on-write (COW) operation used to create a snapshot and/or clone by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The snapshot/clone may be represented as an independent volume, and embodied as a respective read-only copy (snapshot) or read-write copy (clone) of a parent volume. Volume metadata managed by the volume layer is organized as one or more multi-level dense tree metadata structures, wherein each level of the dense tree includes volume metadata entries for storing the metadata. The volume metadata entries may be organized as metadata pages having associated metadata page keys. Each metadata page is rendered distinct or “unique” from other metadata pages in an extent store layer of the storage I/O stack through the use of a multi-component uniqifier contained in a header of each metadata page. To improve the efficiency of the COW operation, the technique allows the use of reference count operations on the metadata page keys of the “unique” metadata pages so as to allow sharing of the metadata pages individually between the parent volume and the snapshot/clone.
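A minimal sketch of the sharing idea in this abstract, under stated assumptions: the `ExtentStore` class, the SHA-256 page key, and the three-component uniqifier tuple are hypothetical stand-ins, not the structures claimed in the application. The point is only that a snapshot can take references on existing metadata page keys instead of copying pages.

```python
import hashlib


class ExtentStore:
    """Toy extent store: page key -> [page bytes, reference count]."""

    def __init__(self):
        self.pages = {}

    def put(self, uniqifier: tuple, body: bytes) -> str:
        # The uniqifier is placed in the page header, so two pages with equal
        # bodies still hash to different keys (assumed key scheme).
        page = repr(uniqifier).encode() + b"|" + body
        key = hashlib.sha256(page).hexdigest()
        self.pages[key] = [page, 1]
        return key

    def incref(self, key: str) -> None:
        self.pages[key][1] += 1

    def decref(self, key: str) -> None:
        self.pages[key][1] -= 1
        if self.pages[key][1] == 0:
            del self.pages[key]   # last reference gone: the page can be reclaimed


def snapshot(store: ExtentStore, parent_keys: list) -> list:
    """Share every parent page with the snapshot by bumping its refcount."""
    for key in parent_keys:
        store.incref(key)
    return list(parent_keys)
```

Because each page is shared individually, the parent volume can later drop or replace one page without affecting the others that the snapshot still references.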

    LOW-OVERHEAD RESTARTABLE MERGE OPERATION WITH EFFICIENT CRASH RECOVERY
    Patent application (status: under examination, published)

    Publication No.: US20160070714A1

    Publication date: 2016-03-10

    Application No.: US14483012

    Filing date: 2014-09-10

    Applicant: NetApp, Inc.

    CPC classification number: G06F16/1748 G06F11/1471 G06F16/2246

    Abstract: A low-overhead merge technique enables restart of a merge operation with minimal logging of state information relating to progress of the merge operation by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The technique enables restart of the merge operation by ensuring that metadata, i.e., metadata pages, generated during the merge operation is not subject to de-duplication by providing a unique value in each metadata page that distinguishes the page, i.e., renders the page distinct or “unique”, from other metadata pages in an extent store. In addition, the technique ensures that a reference count on each metadata page is a value denoting a lack of de-duplication. To that end, the extent store layer is configured to not increment the reference count for a metadata page if, during the merge operation, the page is identical (and thus subject to deduplication) to an existing metadata page in the extent store.
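One way to read the reference-count rule in this abstract is that writing a metadata page is idempotent, so a restarted merge can simply write its pages again. The sketch below illustrates that reading only; the class and method names are hypothetical, and the content-hash key is an assumption.

```python
import hashlib


class ExtentStore:
    """Toy extent store that tracks only reference counts per key."""

    def __init__(self):
        self.refcount = {}

    def put_data(self, extent: bytes) -> str:
        key = hashlib.sha256(extent).hexdigest()
        # Ordinary user data: an identical extent is de-duplicated and counted.
        self.refcount[key] = self.refcount.get(key, 0) + 1
        return key

    def put_metadata_page(self, page: bytes) -> str:
        key = hashlib.sha256(page).hexdigest()
        # Metadata page: an identical page leaves the count at 1, so replaying
        # the same write after a restart changes nothing.
        self.refcount.setdefault(key, 1)
        return key
```

Under this assumption, the merge does not need to log which pages it has already written before a crash, since repeating a write has no side effect on the count.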

    Technique for recovery of trapped storage space in an extent store

    Publication No.: US09830103B2

    Publication date: 2017-11-28

    Application No.: US14988435

    Filing date: 2016-01-05

    Applicant: NetApp, Inc.

    CPC classification number: G06F3/0644 G06F3/0608 G06F3/067

    Abstract: A technique enables recovery of storage space trapped in an extent store from overlapping write requests associated with metadata describing volume logical storage addresses for data in the extent store. The metadata is organized as metadata entries in a multi-level dense tree metadata structure. When a level of the dense tree is full, the metadata entries of the level are merged with a next lower level of the dense tree in accordance with a dense tree merge operation. The technique may be invoked during the merge operation to process the metadata entries associated with the overlapping write requests involved in the merge operation. Processing of the overlapping write requests during the merge operation may partially overwrite extents which, in turn, may result in logical storage space being trapped in the extent store. The technique may perform read-modify-write (RMW) operations on the partially overwritten extents to recapture that trapped space.
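The read-modify-write step mentioned in this abstract can be sketched as follows, assuming half-open `(start, end)` address ranges and byte-granular extents. The function names are hypothetical, and the sketch ignores the dense tree entirely; it shows only how the still-live parts of a partially overwritten extent could be re-written so the old extent can be released.

```python
def live_ranges(old: tuple, new: tuple) -> list:
    """Return the parts of half-open range `old` that `new` does not cover."""
    (o_start, o_end), (n_start, n_end) = old, new
    parts = []
    if n_start > o_start:
        parts.append((o_start, min(n_start, o_end)))
    if n_end < o_end:
        parts.append((max(n_end, o_start), o_end))
    return [(s, e) for s, e in parts if s < e]


def read_modify_write(old_range: tuple, old_data: bytes, new_range: tuple) -> dict:
    """Re-write only the still-live bytes of a partially overwritten extent."""
    rewritten = {}
    for start, end in live_ranges(old_range, new_range):
        offset = start - old_range[0]
        rewritten[(start, end)] = old_data[offset:offset + (end - start)]
    return rewritten
```

For an 8-byte extent at `(0, 8)` overwritten at `(2, 5)`, the sketch re-writes `(0, 2)` and `(5, 8)`, and the 3 overwritten bytes no longer need to be kept.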

    DEFERRED REFERENCE COUNT UPDATE TECHNIQUE FOR LOW OVERHEAD VOLUME METADATA
    Patent application (status: under examination, published)

    Publication No.: US20160077744A1

    Publication date: 2016-03-17

    Application No.: US14484061

    Filing date: 2014-09-11

    Applicant: NetApp, Inc.

    Abstract: A deferred refcount update technique efficiently frees storage space for metadata (associated with data) to be deleted during a merge operation managed by a volume layer of a node. The metadata is illustratively volume metadata embodied as mappings from logical block addresses (LBAs) of a logical unit (LUN) to extent keys maintained by an extent store layer of the node. One or more requests to delete (or overwrite) an LBA range within a LUN may be captured as page keys associated with metadata pages during the merge operation and the storage space associated with those metadata pages may be freed in an out-of-band fashion. The page keys of the metadata pages may be persistently recorded in a reference count (refcount) log to thereby allow the merge operation to complete without resolving deletion of the keys. A batch of page keys may be organized as one or more delete requests and, once the merge completes, the keys may be inserted into the refcount log. Subsequently, a deferred reference count update process may be spawned (instantiated) to walk through the page keys stored in the refcount log and delete each key, e.g., from the extent store layer, independently and out-of-band from the merge operation.
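The deferral described in this abstract can be sketched with a queue standing in for the persistent refcount log. Everything named here (`RefcountLog`, `merge`, `drain`, and the dict used as an extent store) is an assumption for illustration, and persistence is not modeled.

```python
from collections import deque


class RefcountLog:
    """Toy refcount log: a queue of batches of page keys awaiting deletion."""

    def __init__(self):
        self.pending = deque()

    def record(self, page_keys) -> None:
        self.pending.append(list(page_keys))   # one batch = one delete request


def merge(level_keys: list, obsolete_keys: set, log: RefcountLog) -> list:
    """Complete the merge without deleting anything; obsolete keys are only logged."""
    survivors = [k for k in level_keys if k not in obsolete_keys]
    log.record(sorted(obsolete_keys))
    return survivors


def drain(log: RefcountLog, extent_store: dict) -> int:
    """Deferred update: walk the log and drop one reference per logged key."""
    freed = 0
    while log.pending:
        for key in log.pending.popleft():
            extent_store[key] -= 1
            if extent_store[key] == 0:
                del extent_store[key]
                freed += 1
    return freed
```

The merge returns as soon as the keys are logged; `drain` can run later, out-of-band, without the merge waiting on it.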

    SCHEDULING TECHNIQUE TO SMOOTH METADATA PROCESSING LATENCY

    Publication No.: US20170212891A1

    Publication date: 2017-07-27

    Application No.: US15005884

    Filing date: 2016-01-25

    Applicant: NetApp, Inc.

    CPC classification number: G06F9/50 G06F3/06 G06F3/0659 G06F3/067

    Abstract: A technique schedules processing of metadata managed by a volume layer of a storage input/output (I/O) stack executing on a node of a cluster in a manner that reduces bursty activity associated with metadata processing and maintains smooth, i.e., bounded, processing latency on the node. Operations on the metadata managed by the volume layer manifest as modifications to metadata entries of data structures, i.e., dense trees, at offset ranges of the regions. The operations are dense tree merge operations that are processed by threads of execution, i.e., uni-processor services, on one or more central processing units of the node. The scheduling technique distributes the bursty activity of the dense tree merge operations by (i) controlling concurrency of the merge operations, (ii) distributing initiation of the merge operations (i.e., staggering the merge operations), and (iii) pacing execution of merge messages to limit the continuous runtime of the merge operations.
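The three controls listed in this abstract (a concurrency cap, staggered starts, and paced execution) can be illustrated with a small cooperative scheduler. This is a sketch under assumptions: the tick-based loop, the generator-per-merge model, and the parameter names `max_concurrent`, `stagger`, and `batch` are hypothetical and do not come from the application.

```python
from collections import deque


def merge_task(name: str, total_entries: int, batch: int):
    """Generator that merges at most `batch` entries per step, then yields control."""
    done = 0
    while done < total_entries:
        done = min(done + batch, total_entries)
        yield (name, done)          # pacing point: bounded work per step


def run(merges: list, max_concurrent: int, stagger: int, batch: int) -> list:
    """Round-robin the merges with a concurrency cap and staggered start ticks."""
    waiting = deque(merges)         # (name, total_entries) pairs not yet started
    active = deque()
    trace, tick, next_start = [], 0, 0
    while waiting or active:
        # (i) concurrency cap and (ii) staggered initiation
        if waiting and len(active) < max_concurrent and tick >= next_start:
            name, total = waiting.popleft()
            active.append(merge_task(name, total, batch))
            next_start = tick + stagger
        # (iii) pacing: each tick runs one bounded step of one merge
        if active:
            task = active.popleft()
            try:
                trace.append(next(task))
                active.append(task)
            except StopIteration:
                pass
        tick += 1
    return trace
```

With `max_concurrent=1` the merges run back to back; with a higher cap their steps interleave, but no single merge runs for more than one bounded step at a time.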

    HIGH PERFORMANCE AND MEMORY EFFICIENT METADATA CACHING

    Publication No.: US20170192892A1

    Publication date: 2017-07-06

    Application No.: US14989392

    Filing date: 2016-01-06

    Applicant: NetApp, Inc.

    Abstract: A technique provides memory efficient caching of metadata managed by a volume layer of a storage input/output stack executing on one or more nodes of a cluster. Efficient caching of the metadata in a memory of a node may be realized through the use of a caching data structure, i.e., a page cache, configured to store a key-value pair, wherein the key is an extent key and the value is a metadata page containing the index entries. The page cache illustratively includes two data structures configured to maintain the properties of Least Recently Used (LRU) and Least Frequently Used (LFU) for the cache. The first data structure is a hash table that stores a dense tree metadata page (value) indexed by the extent key. The second data structure is a recycle queue that controls the metadata page stored in the hash table based on spatial and temporal locality of the page.
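The pairing of a hash table with a recycle queue can be sketched as below. The eviction policy shown is a plain second-chance scheme chosen for brevity, not the policy of the application: a page that was hit since it last reached the head of the queue is re-queued once instead of evicted. The class name `PageCache` and its methods are hypothetical.

```python
from collections import deque


class PageCache:
    """Toy metadata page cache: hash table for lookup, recycle queue for eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity   # assumed to be at least 1
        self.table = {}            # extent key -> [metadata page, hit count]
        self.recycle = deque()     # extent keys; eviction candidates at the left

    def get(self, key):
        entry = self.table.get(key)
        if entry is None:
            return None
        entry[1] += 1              # remember that the page was reused
        return entry[0]

    def put(self, key, page) -> None:
        if key in self.table:
            self.table[key][0] = page
            return
        while len(self.table) >= self.capacity:
            self._evict_one()
        self.table[key] = [page, 0]
        self.recycle.append(key)

    def _evict_one(self) -> None:
        victim = self.recycle.popleft()
        if self.table[victim][1] > 0:
            self.table[victim][1] = 0      # reused since last pass: second chance
            self.recycle.append(victim)
        else:
            del self.table[victim]
```

Queue position approximates recency and the hit count approximates frequency, which is the general flavor of combining LRU and LFU properties that the abstract describes.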

    TRANSACTION LOG LAYOUT FOR EFFICIENT RECLAMATION AND RECOVERY

    Publication No.: US20170097873A1

    Publication date: 2017-04-06

    Application No.: US14876572

    Filing date: 2015-10-06

    Applicant: NetApp, Inc.

    Abstract: A layout of a transaction log enables efficient logging of metadata into entries of the log, as well as efficient reclamation and recovery of the log entries by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The transaction log is illustratively a two-stage, append-only logging structure, wherein the first stage is non-volatile random access memory (NVRAM) embodied as an NVlog and the second stage is disk, e.g., solid state drive (SSD). During crash recovery, the log entries are examined for consistency and scanned to identify those entries that have completed and those that are active, which require replay. The log entries are walked from oldest to newest (using sequence numbers) searching for the highest sequence number. Partially complete log entries (e.g., log entries in-progress when a crash occurs) may be discarded for failing a checksum (e.g., a CRC error). Old value/new value logs may be used to implement roll-forward or roll-back semantics to replay the log entries and fix any on-disk data structures, first from NVRAM and then from on-disk logs.
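The recovery scan in this abstract can be sketched as a single pass over each stage. The entry format (a dict with `seq`, `payload`, `done`, and `crc` fields) and the function names are assumptions; `zlib.crc32` stands in for whatever checksum the log actually uses, and roll-forward or roll-back of old/new values is not modeled.

```python
import zlib


def make_entry(seq: int, payload: bytes, done: bool) -> dict:
    """Build a log entry whose checksum covers the payload."""
    return {"seq": seq, "payload": payload, "done": done,
            "crc": zlib.crc32(payload)}


def recover(nv_entries: list, disk_entries: list) -> tuple:
    """Return (highest valid sequence number, payloads to replay in order)."""
    highest, replay = -1, []
    for stage in (nv_entries, disk_entries):      # NVRAM first, then on-disk logs
        for entry in sorted(stage, key=lambda e: e["seq"]):   # oldest to newest
            if zlib.crc32(entry["payload"]) != entry["crc"]:
                continue                          # partially written: discard
            highest = max(highest, entry["seq"])
            if not entry["done"]:
                replay.append(entry["payload"])   # still active: needs replay
    return highest, replay
```

An entry that was only partly written when the node crashed fails its checksum and is skipped, so it contributes neither to the highest sequence number nor to the replay list.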

    High performance and memory efficient metadata caching

    Publication No.: US10108547B2

    Publication date: 2018-10-23

    Application No.: US14989392

    Filing date: 2016-01-06

    Applicant: NetApp, Inc.

    Abstract: A technique provides memory efficient caching of metadata managed by a volume layer of a storage input/output stack executing on one or more nodes of a cluster. Efficient caching of the metadata in a memory of a node may be realized through the use of a caching data structure, i.e., a page cache, configured to store a key-value pair, wherein the key is an extent key and the value is a metadata page containing the index entries. The page cache illustratively includes two data structures configured to maintain the properties of Least Recently Used (LRU) and Least Frequently Used (LFU) for the cache. The first data structure is a hash table that stores a dense tree metadata page (value) indexed by the extent key. The second data structure is a recycle queue that controls the metadata page stored in the hash table based on spatial and temporal locality of the page.
