Transferring and caching a cloud file in a distributed filesystem

    Publication No.: US09852149B1

    Publication date: 2017-12-26

    Application No.: US13769213

    Filing date: 2013-02-15

    Applicant: Panzura, Inc.

    IPC classification: G06F15/16 G06F17/30

    Abstract: The disclosed embodiments disclose techniques for transferring and caching a cloud file in a cloud controller. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers cache and ensure data consistency for the stored data. During operation, a cloud controller receives a client request for a data block of a target file that is stored in the distributed filesystem but not currently cached in the cloud controller. The cloud controller initiates a request to a cloud storage system for a cloud file containing the requested data block. While receiving the cloud file from the cloud storage system, the cloud controller uses a set of block metadata in the portion of the cloud file that has already been received to determine the portions of the cloud file that should be downloaded to and cached in the cloud controller.
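The idea of using already-received block metadata to decide what to keep can be illustrated with a minimal sketch. The cloud file layout used here (a 4-byte header length, a JSON metadata list, then block data) and the names `build_cloud_file` and `stream_and_cache` are assumptions made for illustration; the abstract does not specify a format.

```python
import json
import struct


def build_cloud_file(blocks):
    """Pack (file_id, payload) pairs as: 4-byte header length, JSON metadata, data.

    This layout is hypothetical and exists only to drive the example.
    """
    meta, body = [], b""
    for file_id, payload in blocks:
        meta.append({"file": file_id, "offset": len(body), "length": len(payload)})
        body += payload
    header = json.dumps(meta).encode()
    return struct.pack(">I", len(header)) + header + body


def stream_and_cache(chunks, wanted_file):
    """Read chunks until every block of wanted_file has arrived, then stop."""
    buf, meta, data_start, needed_end = b"", None, 0, 0
    for chunk in chunks:
        buf += chunk
        if meta is None and len(buf) >= 4:
            (header_len,) = struct.unpack(">I", buf[:4])
            if len(buf) >= 4 + header_len:
                # The metadata has fully arrived: pick the blocks worth caching.
                entries = json.loads(buf[4:4 + header_len])
                meta = [m for m in entries if m["file"] == wanted_file]
                data_start = 4 + header_len
                needed_end = max((m["offset"] + m["length"] for m in meta), default=0)
        if meta is not None and len(buf) >= data_start + needed_end:
            break  # abort the transfer; the rest of the cloud file is not needed
    if meta is None:
        raise ValueError("stream ended before the block metadata arrived")
    cached = {}
    for m in meta:
        start = data_start + m["offset"]
        cached[m["offset"]] = buf[start:start + m["length"]]
    return cached, len(buf)
```

In this sketch the controller stops reading as soon as the last wanted block is complete, so a cloud file whose tail belongs to other files is never fully transferred.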

    Facilitating the recovery of a virtual machine using a distributed filesystem

    Publication No.: US09613064B1

    Publication date: 2017-04-04

    Application No.: US13782729

    Filing date: 2013-03-01

    Applicant: Panzura, Inc.

    IPC classification: G06F15/16 G06F17/30

    Abstract: The disclosed embodiments disclose techniques that facilitate the recovery of a virtual machine using a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers ensure data consistency for the stored data, and each cloud controller caches portions of the distributed filesystem in a local storage pool. During operation, a host server executes program instructions for an application in a virtual machine (VM); data associated with this application and/or this virtual machine is stored in the distributed filesystem. Upon detecting a subsequent failure, the system can recover and resume the execution of the virtual machine and application using the previous application and virtual machine data that was stored in the distributed filesystem.

    PERFORMING ANTI-VIRUS CHECKS FOR A DISTRIBUTED FILESYSTEM

    Document type: Patent application

    Legal status: In force

    Publication No.: US20140007239A1

    Publication date: 2014-01-02

    Application No.: US14019212

    Filing date: 2013-09-05

    Applicant: PANZURA, INC.

    IPC classification: G06F21/56

    Abstract: The disclosed embodiments disclose techniques that facilitate the process of performing anti-virus checks for a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers ensure data consistency for the stored data, and each cloud controller caches portions of the distributed filesystem. During operation, a cloud controller receives a write request from a client system that seeks to store a target file in the distributed system. A scan is then performed for this target file. For instance, the scan may be an anti-virus scan that ensures that viruses are not spread to the distributed filesystem or the clients of the distributed filesystem.
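The scan-on-write flow can be sketched as follows. The hash-based `signature_scan`, the `KNOWN_BAD` set, and the `ScanningController` class are hypothetical stand-ins; the abstract does not say how the scan is implemented or where it runs.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-bad payloads.
KNOWN_BAD = {hashlib.sha256(b"EVIL-PAYLOAD").hexdigest()}


def signature_scan(data):
    """Return True when the data matches no known-bad signature."""
    return hashlib.sha256(data).hexdigest() not in KNOWN_BAD


class ScanningController:
    """Toy cloud controller that scans every target file before storing it."""

    def __init__(self, scanner):
        self.scanner = scanner
        self.store = {}  # stands in for the distributed filesystem

    def write(self, path, data):
        # Reject the write when the scan fails, so infected data is never
        # stored and cannot spread to other clients.
        if not self.scanner(data):
            return False
        self.store[path] = data
        return True
```

Passing the scanner in as a callable keeps the sketch neutral about whether a real deployment would scan locally or delegate to an external service.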

    Accessing cached data from a peer cloud controller in a distributed filesystem

    Document type: Granted patent

    Legal status: In force

    Publication No.: US08805968B2

    Publication date: 2014-08-12

    Application No.: US13725767

    Filing date: 2012-12-21

    Applicant: Panzura, Inc.

    IPC classification: G06F15/16

    CPC classification: G06F17/30194 G06F17/30132

    Abstract: The disclosed embodiments provide a system that archives data for a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers cache and ensure data consistency for the stored data. During operation, a cloud controller receives a request from a client for a data block of a file stored in the distributed filesystem. Upon determining that the requested data block is not currently cached in the cloud controller, the cloud controller sends a peer cache request for the requested data block to a peer cloud controller in the distributed filesystem.
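A minimal sketch of the peer cache request follows. The class names, the `peer_lookup` method, and the fallback to cloud storage when no peer holds the block are assumptions for illustration; the abstract only states that a peer cache request is sent on a local miss.

```python
class CloudStorage:
    """Toy cloud storage system that counts how often it is read."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0

    def fetch(self, key):
        self.reads += 1
        return self.blocks[key]


class PeerAwareController:
    """Toy cloud controller that asks its peers before going to the cloud."""

    def __init__(self, cloud):
        self.cloud = cloud
        self.cache = {}
        self.peers = []

    def peer_lookup(self, key):
        # Answer a peer cache request from the local cache only.
        return self.cache.get(key)

    def read(self, key):
        if key in self.cache:
            return self.cache[key], "local"
        for peer in self.peers:
            data = peer.peer_lookup(key)
            if data is not None:
                self.cache[key] = data
                return data, "peer"
        # Assumed fallback: no peer has the block, so fetch it from the cloud.
        data = self.cloud.fetch(key)
        self.cache[key] = data
        return data, "cloud"
```

Returning the source alongside the data makes it easy to see in a test which path served each request.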

    PROVIDING DISASTER RECOVERY FOR A DISTRIBUTED FILESYSTEM

    Document type: Patent application

    Legal status: In force

    Publication No.: US20130111262A1

    Publication date: 2013-05-02

    Application No.: US13725759

    Filing date: 2012-12-21

    Applicant: Panzura, Inc.

    IPC classification: G06F11/20

    Abstract: The disclosed embodiments provide a system that distributes data for a distributed filesystem across multiple cloud storage systems. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers cache and ensure data consistency for the stored data. Whenever each cloud controller receives new data from a client, it outputs an incremental metadata snapshot for the new data that is propagated to the other cloud controllers and an incremental data snapshot containing the new data that is sent to a cloud storage system. During operation, a backup cloud controller associated with the distributed filesystem is also configured to receive each (incremental) metadata snapshot, such that, upon determining the failure of a cloud controller, the backup cloud controller can immediately begin receiving data requests from clients associated with the failed cloud controller.
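The snapshot flow can be sketched as below. The `SnapshotController` class, the dictionary standing in for cloud storage, and the key naming are hypothetical; the sketch only shows why a backup controller that receives every incremental metadata snapshot can serve reads after a failure.

```python
class SnapshotController:
    """Toy cloud controller that emits incremental snapshots on each write."""

    def __init__(self, name):
        self.name = name
        self.metadata = {}     # path -> key of the data in cloud storage
        self.subscribers = []  # controllers that receive metadata snapshots

    def apply_snapshot(self, snapshot):
        self.metadata.update(snapshot)

    def write(self, path, data, cloud):
        # Incremental data snapshot: the new data goes to cloud storage.
        key = f"{self.name}/{len(cloud)}"
        cloud[key] = data
        # Incremental metadata snapshot: propagated to every other
        # controller, including the backup controller.
        snapshot = {path: key}
        self.apply_snapshot(snapshot)
        for controller in self.subscribers:
            controller.apply_snapshot(snapshot)

    def read(self, path, cloud):
        return cloud[self.metadata[path]]
```

Because the backup holds current metadata, redirecting clients to it requires no catch-up step in this model; the data itself is still read from cloud storage.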

    ARCHIVING DATA FOR A DISTRIBUTED FILESYSTEM

    Document type: Patent application

    Legal status: In force

    Publication No.: US20130110779A1

    Publication date: 2013-05-02

    Application No.: US13725751

    Filing date: 2012-12-21

    Applicant: Panzura, Inc.

    IPC classification: G06F17/30

    CPC classification: G06F17/30215 G06F17/30221

    Abstract: The disclosed embodiments provide a system that archives data for a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers cache and ensure data consistency for the stored data. During operation, a cloud controller determines that a cloud file in a previously stored data snapshot is no longer being actively referenced in the distributed filesystem. The cloud controller transfers this cloud file from the (first) cloud storage system to an archival cloud storage system, thereby reducing storage costs while preserving the data in the cloud file in case it is ever needed again.
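The archival step can be sketched in a few lines. The function name `archive_unreferenced` and the use of plain dictionaries for the two storage systems are assumptions; how the controller decides that a cloud file is no longer referenced is not described in the abstract, so the set of referenced names is simply passed in.

```python
def archive_unreferenced(primary, archive, referenced):
    """Move every cloud file that is no longer referenced into archival storage.

    primary and archive map cloud file names to contents; referenced is the
    set of names still in active use. Returns the names that were moved.
    """
    moved = []
    for name in sorted(primary):  # sorted() copies the keys, so popping is safe
        if name not in referenced:
            archive[name] = primary.pop(name)
            moved.append(name)
    return moved
```

The data is moved rather than deleted, which matches the stated goal of lowering storage cost while keeping the file recoverable.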

    Managing metadata and data storage for a cloud controller in a distributed filesystem

    Publication No.: US09792298B1

    Publication date: 2017-10-17

    Application No.: US13769211

    Filing date: 2013-02-15

    Applicant: Panzura, Inc.

    IPC classification: G06F15/16 G06F17/30

    Abstract: The disclosed embodiments disclose techniques for managing metadata and data storage for a cloud controller in a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems. More specifically, the cloud controllers cache and ensure data consistency for the data stored in the cloud storage systems, with each cloud controller maintaining (e.g., storing) in a local storage device: (1) one or more metadata regions containing a metadata hierarchy that reflects the current state of the distributed filesystem; and (2) cached data for the distributed filesystem. During operation, the cloud controller receives an incremental metadata snapshot that references new data written to the distributed filesystem. The cloud controller stores updated metadata from this incremental metadata snapshot in one of the metadata regions on the local storage device.

    AVOIDING CLIENT TIMEOUTS IN A DISTRIBUTED FILESYSTEM

    Document type: Patent application

    Legal status: In force

    Publication No.: US20130339407A1

    Publication date: 2013-12-19

    Application No.: US13971621

    Filing date: 2013-08-20

    Applicant: Panzura, Inc.

    IPC classification: G06F17/30

    Abstract: The disclosed embodiments disclose techniques that facilitate avoiding client timeouts in a distributed filesystem. Multiple cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers ensure data consistency for the stored data, and each cloud controller caches portions of the distributed filesystem in a local storage pool. During operation, a cloud controller receives from a client system a request for a data block in a target file that is stored in the distributed filesystem. Although the cloud controller is already caching the requested data block, the cloud controller delays transmission of the cached data block; this additional delay gives the cloud controller more time to access uncached data blocks for the target file from a cloud storage system, thereby ensuring that subsequent requests of such data blocks do not exceed a timeout interval on the client system.
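The effect of delaying cached replies can be shown with a small timing model. The function `worst_wait` and its simplifications are assumptions: the client reads blocks one at a time, each request is issued as soon as the previous reply arrives, the cloud download starts with the first request, and all uncached blocks become available together at `cloud_ready` seconds.

```python
def worst_wait(n_cached, cloud_ready, delay):
    """Longest time (in seconds) the client waits on any single request.

    n_cached    -- number of cached blocks requested before the first uncached one
    cloud_ready -- seconds after the first request when the uncached data arrives
    delay       -- deliberate delay applied to each cached reply
    """
    # The first uncached block is requested only after all cached replies.
    request_time = n_cached * delay
    # The client then waits for whatever remains of the cloud download.
    uncached_wait = max(0.0, cloud_ready - request_time)
    return max(delay, uncached_wait)
```

With three cached blocks, a 50-second cloud fetch, and a 30-second client timeout, answering immediately leaves the client waiting the full 50 seconds on the uncached block, while holding each cached reply for 10 seconds reduces the worst single wait to 20 seconds.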
