Persistent key-value store and journaling system

    Publication number: US11940911B2

    Publication date: 2024-03-26

    Application number: US17553930

    Application date: 2021-12-17

    Applicant: NetApp Inc.

    Abstract: Techniques are provided for implementing a persistent key-value store for caching client data, journaling, and/or crash recovery. The persistent key-value store may be hosted as a primary cache that provides read and write access to key-value record pairs stored within the persistent key-value store. The key-value record pairs are stored within multiple chains in the persistent key-value store. Journaling is provided for the persistent key-value store such that incoming key-value record pairs are stored within active chains, and data within frozen chains is written in a distributed manner across distributed storage of a distributed cluster of nodes. If there is a failure within the distributed cluster of nodes, then the persistent key-value store may be reconstructed and used for crash recovery.
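
The active/frozen chain arrangement in this abstract can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class names, the `chain_capacity` freeze threshold, and the dictionary standing in for distributed storage are hypothetical and do not come from the patent, which does not say here when a chain is frozen.

```python
from dataclasses import dataclass, field


@dataclass
class Chain:
    """An ordered run of key-value records; a frozen chain takes no new writes."""
    records: list = field(default_factory=list)
    frozen: bool = False


class ChainedKVStore:
    """Toy model: writes land in the active chain, frozen chains get flushed."""

    def __init__(self, chain_capacity=2):
        self.chain_capacity = chain_capacity  # hypothetical freeze threshold
        self.active = Chain()
        self.frozen = []        # frozen chains waiting to be flushed
        self.distributed = {}   # stand-in for distributed storage

    def put(self, key, value):
        self.active.records.append((key, value))
        if len(self.active.records) >= self.chain_capacity:
            # Freeze the full chain and start a fresh active chain.
            self.active.frozen = True
            self.frozen.append(self.active)
            self.active = Chain()

    def get(self, key):
        # Newest data first: active chain, then frozen chains, then flushed data.
        for chain in [self.active] + self.frozen[::-1]:
            for k, v in reversed(chain.records):
                if k == key:
                    return v
        return self.distributed.get(key)

    def flush_frozen(self):
        # Oldest chain first, so later writes overwrite earlier ones.
        for chain in self.frozen:
            for k, v in chain.records:
                self.distributed[k] = v
        self.frozen.clear()
```

In this model the store keeps answering reads from the active chain while frozen chains are drained, which is the separation of incoming writes from flushing that the abstract describes.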

    Journal replay optimization
    Type: Granted patent

    Publication number: US11861198B2

    Publication date: 2024-01-02

    Application number: US17728441

    Application date: 2022-04-25

    Applicant: NetApp Inc.

    CPC classification number: G06F3/064 G06F3/067 G06F3/0619 G06F3/0656 G06F3/0659

    Abstract: Techniques are provided for journal replay optimization. A distributed storage architecture can implement a journal within memory for logging write operations into log records. Latency of executing the write operations is improved because the write operations can be responded back to clients as complete once logged within the journal without having to store the data to higher latency disk storage. If there is a failure, then a replay process is performed to replay the write operations logged within the journal in order to bring a file system up-to-date. The time to complete the replay of the write operations is significantly reduced by caching metadata (e.g., indirect blocks, checksums, buftree identifiers, file block numbers, and consistency point counts) directly into log records. Replay can quickly access this metadata for replaying the write operations because the metadata does not need to be retrieved from the higher latency disk storage into memory.
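
The benefit of caching metadata inside log records can be shown with a minimal replay loop. The sketch below assumes a hypothetical `LogRecord` layout and a `SlowDisk` stand-in that counts metadata reads; none of these names come from the patent, and a single checksum stands in for the richer metadata the abstract lists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    file_block_number: int
    data: bytes
    cached_metadata: Optional[dict] = None  # e.g. {"checksum": ...}


class SlowDisk:
    """Stand-in for higher-latency storage; counts metadata reads."""

    def __init__(self, metadata):
        self.metadata = metadata
        self.reads = 0

    def read_metadata(self, fbn):
        self.reads += 1
        return self.metadata[fbn]


def replay(journal, disk, file_system):
    """Apply logged writes, touching the disk only when metadata is not cached."""
    for record in journal:
        meta = record.cached_metadata
        if meta is None:
            meta = disk.read_metadata(record.file_block_number)
        file_system[record.file_block_number] = (record.data, meta["checksum"])
```

Records that carry their own metadata are replayed without any disk read, so the count in `disk.reads` reflects only the records that lacked a cached copy.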

    BYTE-ADDRESSABLE JOURNAL HOSTED USING BLOCK STORAGE DEVICE

    Publication number: US20230315695A1

    Publication date: 2023-10-05

    Application number: US17710638

    Application date: 2022-03-31

    Applicant: NetApp Inc.

    CPC classification number: G06F16/1815 G06F16/1824 G06F16/172 G06F16/178

    Abstract: Techniques are provided for implementing a journal using a block storage device for a plurality of clients. A journal may be hosted as a primary cache for a node, where I/O operations of a plurality of clients are logged within the journal. The node may be part of a distributed cluster of nodes hosted within a container orchestration platform. The journal may be stored in a storage device comprising a block storage device and a cache. Adaptive caching may be implemented to store some journal data of the journal in the cache. For example, a first set of journal data may be stored in the block storage device without storing the first set of journal data in the cache. A second set of journal data may be stored in the block storage device and the cache.
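
The split between journal data held only on the block device and journal data held in both the block device and the cache can be sketched as follows. The abstract does not state the caching policy, so the size threshold used here is purely an assumption for illustration, as are the class and attribute names.

```python
class AdaptiveJournal:
    """Toy journal: every entry is persisted, only some entries are cached."""

    def __init__(self, cache_threshold=64):
        self.cache_threshold = cache_threshold  # hypothetical policy knob
        self.block_device = {}  # stand-in for the block storage device
        self.cache = {}         # stand-in for the faster cache

    def log(self, seq, payload: bytes):
        self.block_device[seq] = payload  # always written to block storage
        if len(payload) <= self.cache_threshold:
            # Assumed policy: only small entries are also kept in the cache.
            self.cache[seq] = payload

    def read(self, seq):
        # Returns the payload and where it was served from.
        if seq in self.cache:
            return self.cache[seq], "cache"
        return self.block_device[seq], "block"
```

A real policy might depend on access frequency or client hints rather than size; the point of the sketch is only that the block device holds the full journal while the cache holds a subset.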

    NETWORK STORAGE FAILOVER SYSTEMS AND ASSOCIATED METHODS

    Publication number: US20210334179A1

    Publication date: 2021-10-28

    Application number: US16855837

    Application date: 2020-04-22

    Applicant: NETAPP, INC.

    Abstract: Failover methods and systems for a networked storage environment are provided. A filtering data structure and a metadata data structure are generated before starting a replay of a log stored in a non-volatile memory of a second storage node, during a failover operation initiated in response to a failure at a first storage node. The second storage node operates as a partner node of the first storage node to mirror at the log one or more write requests received by the first storage node prior to the failure, and data associated with the one or more write requests. The filtering data structure identifies each log entry and the metadata structure stores a metadata attribute of each log entry. The filtering data structure and the metadata structure are used for providing access to a logical storage object during the log replay from the second storage node.
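
One plausible reading of how a filtering structure and a metadata structure could serve reads during log replay is sketched below. The tuple layout of a log entry, the use of a set as the filter, and both function names are assumptions made for illustration; the abstract does not specify them.

```python
def build_replay_structures(log_entries):
    """Build the filter and metadata structures before replay starts.

    log_entries: list of (entry_id, block_id, size) tuples (hypothetical layout).
    """
    filter_set = set()  # which blocks have pending log entries
    metadata = {}       # block_id -> attributes of the newest log entry
    for entry_id, block_id, size in log_entries:
        filter_set.add(block_id)
        metadata[block_id] = {"entry_id": entry_id, "size": size}
    return filter_set, metadata


def read_during_replay(block_id, filter_set, metadata, log_data, disk):
    """Serve a read while replay is still in progress."""
    if block_id not in filter_set:
        return disk[block_id]          # no pending log entry touches this block
    entry_id = metadata[block_id]["entry_id"]
    return log_data[entry_id]          # newest logged write for this block
```

Under this reading, the filter answers "is this block affected by the log?" cheaply, and the metadata structure locates the relevant entry without scanning the log.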

    RECOVERY CONSUMER FRAMEWORK
    Type: Patent application
    Status: In force

    Publication number: US20150355985A1

    Publication date: 2015-12-10

    Application number: US14298344

    Application date: 2014-06-06

    Applicant: NetApp, Inc.

    CPC classification number: G06F11/2094 G06F11/00 G06F11/1666 G06F2201/84

    Abstract: A recovery consumer framework provides for execution of recovery actions by one or more recovery consumers to enable efficient recovery of information (e.g., data and metadata) in a storage system after a failure event (e.g., a power failure). The recovery consumer framework permits concurrent execution of recovery actions so as to reduce recovery time (i.e., duration) for the storage system. The recovery consumer framework may coordinate (e.g., notify) the recovery consumers to serialize execution of the recovery actions by those recovery consumers having a dependency while allowing concurrent execution between recovery consumers having no dependency relationship. Each recovery consumer may register with the framework to associate a dependency on one or more of the other recovery consumers. The dependency association may be represented as a directed graph where each vertex of the graph represents a recovery consumer and each directed edge of the graph represents a dependency. The framework may traverse (i.e., walk) the framework graph and for each vertex encountered, notify the associated recovery consumer to initiate its respective recovery actions.
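
The dependency graph walk described in this abstract can be sketched with Python's standard `graphlib` module. This is a minimal illustration, not the patented implementation: the `RecoveryFramework` class and its method names are hypothetical, every dependency is assumed to be a registered consumer, and consumers in the same wave are run one after another here although they could run concurrently.

```python
from graphlib import TopologicalSorter


class RecoveryFramework:
    """Toy framework: consumers register dependencies, recovery runs in waves."""

    def __init__(self):
        self.depends_on = {}  # consumer name -> names it must wait for
        self.actions = {}     # consumer name -> recovery callable

    def register(self, name, action, depends_on=()):
        self.depends_on[name] = set(depends_on)
        self.actions[name] = action

    def recover(self):
        sorter = TopologicalSorter(self.depends_on)
        sorter.prepare()  # raises graphlib.CycleError on circular dependencies
        waves = []
        while sorter.is_active():
            # Everything returned together has no unmet dependency, so these
            # consumers are independent of each other and could run in parallel.
            ready = sorted(sorter.get_ready())
            for name in ready:
                self.actions[name]()  # notify the consumer to start recovery
            sorter.done(*ready)
            waves.append(ready)
        return waves
```

Each wave corresponds to a set of vertices whose incoming dependencies are satisfied, which is how serialization between dependent consumers and concurrency between independent ones fall out of the same traversal.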

    PERSISTENT KEY-VALUE STORE AND JOURNALING SYSTEM

    Publication number: US20240232080A1

    Publication date: 2024-07-11

    Application number: US18615014

    Application date: 2024-03-25

    Applicant: NetApp, Inc.

    Abstract: Techniques are provided for implementing a persistent key-value store for caching client data, journaling, and/or crash recovery. The persistent key-value store may be hosted as a primary cache that provides read and write access to key-value record pairs stored within the persistent key-value store. The key-value record pairs are stored within multiple chains in the persistent key-value store. Journaling is provided for the persistent key-value store such that incoming key-value record pairs are stored within active chains, and data within frozen chains is written in a distributed manner across distributed storage of a distributed cluster of nodes. If there is a failure within the distributed cluster of nodes, then the persistent key-value store may be reconstructed and used for crash recovery.

    JOURNAL REPLAY OPTIMIZATION
    Type: Patent publication

    Publication number: US20240143210A1

    Publication date: 2024-05-02

    Application number: US18399555

    Application date: 2023-12-28

    Applicant: NetApp Inc.

    CPC classification number: G06F3/064 G06F3/0619 G06F3/0656 G06F3/0659 G06F3/067

    Abstract: Techniques are provided for journal replay optimization. A distributed storage architecture can implement a journal within memory for logging write operations into log records. Latency of executing the write operations is improved because the write operations can be responded back to clients as complete once logged within the journal without having to store the data to higher latency disk storage. If there is a failure, then a replay process is performed to replay the write operations logged within the journal in order to bring a file system up-to-date. The time to complete the replay of the write operations is significantly reduced by caching metadata (e.g., indirect blocks, checksums, buftree identifiers, file block numbers, and consistency point counts) directly into log records. Replay can quickly access this metadata for replaying the write operations because the metadata does not need to be retrieved from the higher latency disk storage into memory.

    JOURNAL REPLAY OPTIMIZATION
    Type: Patent publication

    Publication number: US20230342053A1

    Publication date: 2023-10-26

    Application number: US17728441

    Application date: 2022-04-25

    Applicant: NetApp Inc.

    CPC classification number: G06F3/064 G06F3/0659 G06F3/0656 G06F3/0619 G06F3/067

    Abstract: Techniques are provided for journal replay optimization. A distributed storage architecture can implement a journal within memory for logging write operations into log records. Latency of executing the write operations is improved because the write operations can be responded back to clients as complete once logged within the journal without having to store the data to higher latency disk storage. If there is a failure, then a replay process is performed to replay the write operations logged within the journal in order to bring a file system up-to-date. The time to complete the replay of the write operations is significantly reduced by caching metadata (e.g., indirect blocks, checksums, buftree identifiers, file block numbers, and consistency point counts) directly into log records. Replay can quickly access this metadata for replaying the write operations because the metadata does not need to be retrieved from the higher latency disk storage into memory.

    Network storage failover systems and associated methods

    Publication number: US11216350B2

    Publication date: 2022-01-04

    Application number: US16855853

    Application date: 2020-04-22

    Applicant: NETAPP, INC.

    Abstract: Failover methods and systems for a networked storage environment are provided. A metadata data structure is generated, before starting a replay of entries at a log stored in a non-volatile memory of a second storage node, during a failover operation initiated in response to a failure at a first storage node. The second storage node operates as a partner node of the first storage node, and the metadata structure stores a metadata attribute of each log entry. Furthermore, the metadata attribute of each log entry is persistently stored. The persistently stored metadata attribute is used to respond to a read request received during the replay by the second storage node, while a write request metadata attribute of a write request is used for executing the write request received by the second storage node during the replay.
