Clustered RAID data organization
    21.
    Granted invention patent
    Clustered RAID data organization (granted)

    Publication No.: US08832363B1

    Publication date: 2014-09-09

    Application No.: US14162047

    Filing date: 2014-01-23

    Applicant: NetApp, Inc.

    Abstract: In one embodiment, a clustered storage system is configured to reduce parity overhead of Redundant Array of Independent Disks (RAID) groups, as well as to facilitate distribution and servicing of the storage containers among storage systems (nodes) of the cluster. The storage containers may be stored on one or more storage arrays of storage devices, such as solid state drives (SSDs), connected to the nodes of the cluster. The RAID groups may be formed from slices (i.e., portions) of storage spaces of the SSDs instead of the entire storage spaces of the SSDs. That is, each RAID group may be formed “horizontally” across a set of SSDs as slices (i.e., one slice of storage space from each SSD in the set). Accordingly, a plurality of RAID groups may co-exist (i.e., be stacked) on the same set of SSDs.
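
    The "horizontal" slicing described in the abstract can be pictured with a minimal sketch. The names (Slice, build_raid_groups) and the block counts are hypothetical and chosen only for illustration; the patent does not prescribe this layout or any particular slice size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    ssd: int      # index of the SSD within the set
    start: int    # first block of the slice on that SSD
    length: int   # number of blocks in the slice


def build_raid_groups(num_ssds, ssd_blocks, slice_blocks):
    """Stack RAID groups on one SSD set: group g takes slice g of every SSD."""
    groups = []
    for g in range(ssd_blocks // slice_blocks):
        start = g * slice_blocks
        groups.append([Slice(ssd, start, slice_blocks) for ssd in range(num_ssds)])
    return groups


# Assumed example: 6 SSDs of 1000 blocks each, cut into 250-block slices.
groups = build_raid_groups(num_ssds=6, ssd_blocks=1000, slice_blocks=250)
```

    In this sketch every group spans all six SSDs but owns only a quarter of each, so four RAID groups co-exist on the same set of drives.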

    Flash optimized, log-structured layer of a file system

    Publication No.: US10042853B2

    Publication date: 2018-08-07

    Application No.: US15239125

    Filing date: 2016-08-17

    Applicant: NetApp, Inc.

    Abstract: A flash-optimized, log-structured layer of a file system of a storage input/output (I/O) stack executes on one or more nodes of a cluster. The log-structured layer of the file system provides sequential storage of data and metadata (i.e., a log-structured layout) on solid state drives (SSDs) of storage arrays in the cluster to reduce write amplification, while leveraging variable compression and variable length data features of the storage I/O stack. The data may be organized as an arbitrary number of variable-length extents of one or more host-visible logical units (LUNs) served by the nodes. The metadata may include mappings from host-visible logical block address ranges (i.e., offset ranges) of a LUN to extent keys, as well as mappings of the extent keys to SSD storage locations of the extents. The storage location of an extent on SSD is effectively “virtualized” by its mapped extent key (i.e., extent store layer mappings) such that relocation of the extent on SSD does not require update to volume layer metadata (i.e., the extent key sufficiently identifies the extent).
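
    The two levels of mapping in the abstract (offset range to extent key, extent key to SSD location) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class names, the SHA-256 key derivation, and the Python list standing in for sequential SSD writes are all hypothetical.

```python
import hashlib


class ExtentStore:
    """Extent store layer: maps extent keys to locations in an append-only log."""

    def __init__(self):
        self.log = []        # stand-in for sequential (log-structured) SSD writes
        self.location = {}   # extent key -> index in the log

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()  # assumed key derivation
        if key not in self.location:
            self.log.append(data)
            self.location[key] = len(self.log) - 1
        return key

    def get(self, key):
        return self.log[self.location[key]]

    def relocate(self, key):
        # Move the extent to the log head; only this layer's mapping changes.
        self.log.append(self.get(key))
        self.location[key] = len(self.log) - 1


class Volume:
    """Volume layer: maps (LUN, offset, length) ranges to extent keys."""

    def __init__(self, store):
        self.store = store
        self.ranges = {}

    def write(self, lun, offset, data):
        self.ranges[(lun, offset, len(data))] = self.store.put(data)

    def read(self, lun, offset, length):
        return self.store.get(self.ranges[(lun, offset, length)])


store = ExtentStore()
vol = Volume(store)
vol.write("lun0", 0, b"hello")
before = dict(vol.ranges)
key = before[("lun0", 0, 5)]
old_location = store.location[key]
store.relocate(key)
```

    After relocate, the volume-layer dictionary is unchanged and the read still succeeds, which is the "virtualized" location property the abstract describes.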

    TECHNIQUE FOR REDUCING METADATA STORED IN A MEMORY OF A NODE

    Publication No.: US20180173703A1

    Publication date: 2018-06-21

    Application No.: US15895593

    Filing date: 2018-02-13

    Applicant: NetApp, Inc.

    Abstract: A technique reduces an amount of metadata stored in a memory of a node in a cluster. An extent store layer of a storage input/output (I/O) stack executing on the node stores key-value pairs in a plurality of data structures, e.g., cuckoo hash tables, resident in the memory. The cuckoo hash table embodies metadata that describes an extent and, as such, may be organized to associate a location on disk with a value that identifies the location on disk. The value may be embodied as a locator that includes a reference count used to support deduplication functionality of the extent store layer with respect to the extent. The reference count is divided into two portions: a delta count portion stored in memory for each slot of the hash table and an overflow count portion stored on disk in a header of each extent. One bit of the delta count portion is reserved as an overflow bit that indicates whether the in-memory reference count has overflowed. Another bit of the delta count portion is reserved as a sign bit that indicates whether the value of the remaining delta count portion, which stores the “delta” of the reference count, is positive or negative. Overflow updates to the overflow count portion on disk are postponed until all of the bits of the delta count portion are consumed as negative/positive transitions.
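
    One way to read the split reference count is sketched below. The bit width (MAG_BITS), the class name, and the exact moment at which the delta is folded into the on-disk count are assumptions made for illustration; the abstract states only that the on-disk update is postponed until the in-memory delta bits are used up.

```python
MAG_BITS = 3                      # assumed number of magnitude bits in the delta
MAX_MAG = (1 << MAG_BITS) - 1     # largest delta magnitude held in memory


class RefCount:
    def __init__(self):
        self.delta = 0            # in-memory signed delta (sign bit + magnitude)
        self.overflowed = False   # overflow bit kept alongside the delta
        self.disk_count = 0       # overflow count portion in the extent header
        self.disk_writes = 0      # counts how often the header had to be updated

    def adjust(self, step):
        """Apply +1 (new reference) or -1 (dropped reference)."""
        new = self.delta + step
        if abs(new) > MAX_MAG:
            # Delta bits exhausted: fold everything into the on-disk count.
            self.disk_count += new
            self.disk_writes += 1
            self.delta = 0
            self.overflowed = True
        else:
            self.delta = new

    def value(self):
        return self.disk_count + self.delta

    def packed(self):
        """Pack overflow bit, sign bit and magnitude into one small integer."""
        sign = 1 if self.delta < 0 else 0
        return (int(self.overflowed) << (MAG_BITS + 1)) | (sign << MAG_BITS) | abs(self.delta)


rc = RefCount()
for _ in range(8):
    rc.adjust(+1)
for _ in range(2):
    rc.adjust(-1)
```

    Ten reference-count changes cost a single header update in this sketch; the remaining changes are absorbed by the in-memory delta.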

    NVRAM LOSS HANDLING
    24.
    Invention patent application
    NVRAM LOSS HANDLING (pending, published)

    Publication No.: US20170300388A1

    Publication date: 2017-10-19

    Application No.: US15130280

    Filing date: 2016-04-15

    Applicant: NetApp, Inc.

    CPC classification number: G06F11/1464

    Abstract: A technique restores a file system of a storage input/output (I/O) stack to a deterministic point-in-time state in the event of failure (loss) of non-volatile random access memory (NVRAM) of a node. The technique enables restoration of the file system to a safepoint stored on storage devices, such as solid state drives (SSDs), of the node with minimum data and metadata loss. The safepoint is a point-in-time during execution of I/O requests (e.g., write operations) at which data and related metadata of the write operations prior to the point-in-time are safely persisted on SSD such that the metadata relating to an image of the file system on SSD (on-disk) is consistent and complete. Upon reboot after NVRAM loss, the technique identifies (i) the most recent safepoint, as well as (ii) the inflight writes that were persistently stored on disk after the most recent safepoint. The data and metadata of those inflight writes are then deleted to return the on-disk file system to its state at the most recent safepoint.
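
    The recovery step can be sketched as a scan over an ordered on-disk record list. The record format (tuples tagged "write" or "safepoint") and the function name are hypothetical; the abstract does not describe how safepoints are encoded on SSD.

```python
def restore_to_safepoint(on_disk_log):
    """Return (kept, discarded) for a log ordered by persist time.

    on_disk_log holds ("write", payload) and ("safepoint", None) records.
    Everything after the most recent safepoint is treated as inflight.
    """
    last = max(
        (i for i, (kind, _) in enumerate(on_disk_log) if kind == "safepoint"),
        default=-1,
    )
    kept = on_disk_log[: last + 1]
    discarded = [rec for rec in on_disk_log[last + 1:] if rec[0] == "write"]
    return kept, discarded


log = [
    ("write", "a"),
    ("safepoint", None),
    ("write", "b"),
    ("safepoint", None),
    ("write", "c"),   # persisted after the last safepoint
    ("write", "d"),   # persisted after the last safepoint
]
kept, discarded = restore_to_safepoint(log)
```

    Writes "c" and "d" are dropped, leaving the on-disk image exactly as it was at the second safepoint.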

    Granular sync/semi-sync architecture
    25.
    Granted invention patent
    Granular sync/semi-sync architecture (granted)

    Publication No.: US09571575B2

    Publication date: 2017-02-14

    Application No.: US14473621

    Filing date: 2014-08-29

    Applicant: NetApp, Inc.

    Abstract: Data consistency and availability can be provided at the granularity of logical storage objects in storage solutions that use storage virtualization in clustered storage environments. To ensure consistency of data across different storage elements, synchronization is performed across the different storage elements. Changes to data are synchronized across storage elements in different clusters by propagating the changes from a primary logical storage object to a secondary logical storage object. To satisfy the strictest RPOs while maintaining performance, change requests are intercepted prior to being sent to a filesystem that hosts the primary logical storage object and propagated to a different managing storage element associated with the secondary logical storage object.

    FLASH OPTIMIZED, LOG-STRUCTURED LAYER OF A FILE SYSTEM
    26.
    Invention patent application
    FLASH OPTIMIZED, LOG-STRUCTURED LAYER OF A FILE SYSTEM (pending, published)

    Publication No.: US20160357776A1

    Publication date: 2016-12-08

    Application No.: US15239125

    Filing date: 2016-08-17

    Applicant: NetApp, Inc.

    Abstract: A flash-optimized, log-structured layer of a file system of a storage input/output (I/O) stack executes on one or more nodes of a cluster. The log-structured layer of the file system provides sequential storage of data and metadata (i.e., a log-structured layout) on solid state drives (SSDs) of storage arrays in the cluster to reduce write amplification, while leveraging variable compression and variable length data features of the storage I/O stack. The data may be organized as an arbitrary number of variable-length extents of one or more host-visible logical units (LUNs) served by the nodes. The metadata may include mappings from host-visible logical block address ranges (i.e., offset ranges) of a LUN to extent keys, as well as mappings of the extent keys to SSD storage locations of the extents. The storage location of an extent on SSD is effectively “virtualized” by its mapped extent key (i.e., extent store layer mappings) such that relocation of the extent on SSD does not require update to volume layer metadata (i.e., the extent key sufficiently identifies the extent).

    File system driven raid rebuild technique
    27.
    Granted invention patent
    File system driven raid rebuild technique (granted)

    Publication No.: US09454434B2

    Publication date: 2016-09-27

    Application No.: US14158448

    Filing date: 2014-01-17

    Applicant: NetApp, Inc.

    Abstract: In one embodiment, one or more storage arrays of solid state drives (SSDs) that include a plurality of segments are organized as one or more redundant array of independent disks (RAID) groups, where the RAID groups provide data redundancy for the segments. A node executing a layered file system of a storage input/output (I/O) stack performs segment cleaning to clean the segments. It further initiates rebuild of a RAID configuration of the SSDs on a segment-by-segment basis in response to the segment cleaning. In such a configuration, each segment includes one or more RAID stripes that provide a level of data redundancy as well as RAID organization for the segment.

    PERTURB KEY TECHNIQUE
    28.
    Invention patent application
    PERTURB KEY TECHNIQUE (pending, published)

    Publication No.: US20160248583A1

    Publication date: 2016-08-25

    Application No.: US15052332

    Filing date: 2016-02-24

    Applicant: NetApp, Inc.

    CPC classification number: G06F21/78

    Abstract: A technique perturbs an extent key to compute a candidate extent key in the event of a collision with metadata (i.e., two extents having different data that yield identical hash values) stored in a memory of a node in a cluster. The perturbing technique may be used to compute a candidate extent key that is not previously stored in an extent store instance. The candidate extent key may be computed from a hash value of an extent using a perturbing algorithm, i.e., a hash collision computation, which illustratively adds a perturb value to the hash value. The perturb value is illustratively sufficient to ensure that the candidate extent key resolves to a same hash bucket and node (extent store instance) as the original extent key. In essence, the technique ensures that the original extent key is perturbed in a deterministic manner to generate the candidate extent key, so that the original extent key and candidate extent key “decode” to the same hash bucket and extent store instance.
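
    A perturb value that leaves the routing bits untouched can be sketched as below. The 64-bit key, the bit layout (low 8 bits select the extent store instance, the next 16 select the hash bucket), and the truncated SHA-256 hash are assumptions for illustration only; the abstract does not specify which bits are used.

```python
import hashlib

KEY_BITS = 64                      # assumed key width
ROUTING_BITS = 24                  # assumed: low 24 bits pick instance and bucket
PERTURB = 1 << ROUTING_BITS        # added value never touches the routing bits
MASK = (1 << KEY_BITS) - 1


def extent_key(data):
    """Derive a 64-bit key from the extent data (assumed hash function)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def route(key):
    """Decode a key into (extent store instance, hash bucket)."""
    return key & 0xFF, (key >> 8) & 0xFFFF


def candidate_key(key, store, data):
    """Perturb deterministically until the key is free or holds the same data."""
    while key in store and store[key] != data:
        key = (key + PERTURB) & MASK
    return key


data = b"new extent"
original = extent_key(data)
# Simulate a collision: the original key already maps to different data.
store = {original: b"different extent, same hash (simulated)"}
candidate = candidate_key(original, store, data)
```

    Because only bits above the routing field change, the candidate key decodes to the same instance and bucket as the original key.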

    File system driven raid rebuild technique
    29.
    Granted invention patent
    File system driven raid rebuild technique (granted)

    Publication No.: US09389958B2

    Publication date: 2016-07-12

    Application No.: US14161184

    Filing date: 2014-01-22

    Applicant: NetApp, Inc.

    Abstract: In one embodiment, a file system driven RAID rebuild technique is provided. A layered file system may organize storage of data as segments spanning one or more sets of storage devices, such as solid state drives (SSDs), of a storage array, wherein each set of SSDs may form a RAID group configured to provide data redundancy for a segment. The file system may then drive (i.e., initiate) rebuild of a RAID configuration of the SSDs on a segment-by-segment basis in response to cleaning of the segment (i.e., segment cleaning). Each segment may include one or more RAID stripes that provide a level of data redundancy (e.g., single parity RAID 5 or double parity RAID 6) as well as RAID organization (i.e., distribution of data and parity) for the segment. Notably, the level of data redundancy and RAID organization may differ among the segments of the array.
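
    The segment-by-segment rebuild can be pictured with a simplified sketch. The Segment fields, the use of a replacement SSD, and the rule that each segment keeps its own parity level are assumptions made for illustration; reconstruction of lost data from parity is omitted.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    ssds: tuple    # SSDs spanned by the segment's RAID stripes
    parity: int    # 1 for single parity (RAID 5), 2 for double parity (RAID 6)
    live: list     # live extents still referenced in the segment


def clean_segment(seg, failed, replacement):
    """Copy live extents into a fresh segment laid out without the failed SSD."""
    new_ssds = tuple(replacement if s == failed else s for s in seg.ssds)
    return Segment(ssds=new_ssds, parity=seg.parity, live=list(seg.live))


def rebuild(segments, failed, replacement):
    """Rebuild only the segments that touch the failed SSD, one at a time."""
    return [
        clean_segment(seg, failed, replacement) if failed in seg.ssds else seg
        for seg in segments
    ]


segments = [
    Segment((0, 1, 2, 3), 1, ["a", "b"]),   # RAID 5 style segment
    Segment((0, 1, 2, 3), 2, ["c"]),        # RAID 6 style segment on the same SSDs
    Segment((4, 5, 6), 1, ["d"]),           # different SSD set, unaffected
]
rebuilt = rebuild(segments, failed=2, replacement=7)
```

    Each affected segment is rewritten independently and keeps its own redundancy level, which mirrors the abstract's point that redundancy and organization may differ among segments.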

    OFFSET RANGE OPERATION STRIPING TO IMPROVE CONCURRENCY OF EXECUTION AND REDUCE CONTENTION AMONG RESOURCES
    30.
    Invention patent application
    OFFSET RANGE OPERATION STRIPING TO IMPROVE CONCURRENCY OF EXECUTION AND REDUCE CONTENTION AMONG RESOURCES (pending, published)

    Publication No.: US20160070644A1

    Publication date: 2016-03-10

    Application No.: US14482957

    Filing date: 2014-09-10

    Applicant: NetApp, Inc.

    CPC classification number: G06F3/0688 G06F3/0611 G06F3/0644

    Abstract: An offset range striping technique increases concurrency of operation execution directed to metadata managed by a volume layer of a storage input/output (I/O) stack, while reducing contention among resources of one or more nodes of a cluster. A logical unit (LUN) may be apportioned into multiple volumes, each of which may be partitioned into multiple regions, wherein each region is represented by a dense tree. The technique increases concurrency of operation execution (e.g., modifications to the metadata at the offset ranges), while reducing contention among the resources (e.g., CPUs and NVLogs) by distributing the offset range operations among the regions and mapping the regions to services and NVLogs. Such increased concurrency and reduction of contention may be achieved by implementation of the technique to (i) apportion each region into disjoint chunks (i.e., stripes) of contiguous offset ranges; (ii) organize a plurality of regions into one or more zones and populate a first zone before allocating a second zone; and (iii) stagger the mapping of services to starting regions of the volumes.
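
    Points (i) and (iii) of the abstract can be sketched with simple modular arithmetic. The chunk size, the round-robin assignment of chunks to regions, and the (volume + region) stagger rule are all hypothetical choices for illustration; the abstract does not give the actual formulas, and zones (point ii) are left out.

```python
CHUNK = 1 << 20   # assumed stripe (chunk) size in bytes


def region_for(offset, num_regions):
    """Stripe contiguous offset ranges across regions in round-robin order."""
    return (offset // CHUNK) % num_regions


def service_for(volume, region, num_services):
    """Stagger the service mapping so volumes start on different services."""
    return (volume + region) % num_services


# Consecutive chunks of one volume land in different regions...
regions = [region_for(i * CHUNK, num_regions=4) for i in range(6)]
# ...and region 0 of each volume is handled by a different service.
starts = [service_for(v, region=0, num_services=4) for v in range(4)]
```

    With these assumptions, a sequential write stream touches regions 0, 1, 2, 3 in turn rather than hammering one region, and the first region of each volume is served by a different service.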
