Abstract:
A set of operators on group-structured data enables the creation of efficient execution plans. The operators are of two distinct types but operate similarly. For each row that matches an input row type, an hkey is obtained. The hkey uniquely identifies a table row within a table group. The hkey is transformed into a modified hkey associated with an output row type. Starting with a row of interest associated with the modified hkey, the table group is probed to identify one or more additional rows. As the additional rows are identified, they are written to an output stream.
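The probe step described above can be sketched as follows. This is only an illustration under assumptions the abstract does not state: an hkey is modeled as a tuple, the table group as a list of rows sorted by hkey, and the transformation as truncation of the hkey to a shorter prefix. The names `Row`, `truncate_hkey`, and `probe_group` are hypothetical.

```python
from bisect import bisect_left
from typing import Callable, Iterable, Iterator, NamedTuple


class Row(NamedTuple):
    hkey: tuple      # assumed: hierarchical key, unique within the table group
    row_type: str
    data: dict


def truncate_hkey(hkey: tuple, depth: int) -> tuple:
    """Assumed transformation: keep only the first `depth` key segments."""
    return hkey[:depth]


def probe_group(
    group: list,                 # rows of one table group, sorted by hkey
    input_rows: Iterable[Row],
    input_type: str,
    output_type: str,
    transform: Callable[[tuple], tuple],
) -> Iterator[Row]:
    """For each input row of the given type, transform its hkey, locate the
    row of interest in the group, and yield output-type rows as found."""
    keys = [r.hkey for r in group]
    for row in input_rows:
        if row.row_type != input_type:
            continue
        target = transform(row.hkey)
        i = bisect_left(keys, target)
        # Rows that share the modified hkey as a prefix are adjacent in hkey order.
        while i < len(group) and group[i].hkey[:len(target)] == target:
            if group[i].row_type == output_type:
                yield group[i]
            i += 1
```

Because the function is a generator, each additional row is handed to the consumer as soon as it is identified, which mirrors writing rows to an output stream as they are found.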
Abstract:
Content platform management is enhanced by logically partitioning a physical cluster that comprises a redundant array of independent nodes. Using an interface, an administrator defines one or more “tenants” within the archive cluster, wherein a tenant has a set of attributes including, for example, namespaces, administrative accounts, data access accounts, and a permission mask. A namespace is a logical partition of the cluster that serves as a collection of objects typically associated with at least one defined application. Each namespace has a private file system, so access to one namespace (and its associated objects) does not enable a user to access objects in another namespace. A namespace has capabilities (e.g., read, write, delete, purge, and the like) that a namespace administrator can choose to enable or disable for a given data account. Using the interface, an administrator for the tenant creates and manages namespaces, so that the cluster is then logically partitioned into a set of namespaces, wherein one or more namespaces are associated with a given tenant. This approach enables a user to segregate cluster data into logical partitions. Using the administrative interface, a namespace associated with a given tenant is selectively configured without affecting the configuration of at least one other namespace in the set of namespaces. This architecture supports many top-level tenants, with multiple namespaces per tenant and with configuration effected at the namespace level.
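The tenant and namespace relationship can be illustrated with a small data model. This is a sketch under assumptions: the class names `Tenant` and `Namespace`, the method names, and the use of in-memory dictionaries are all hypothetical, and the tenant-level permission mask and administrative accounts are not modeled.

```python
from dataclasses import dataclass, field

# Capability names taken from the examples in the text.
CAPABILITIES = {"read", "write", "delete", "purge"}


@dataclass
class Namespace:
    name: str
    objects: dict = field(default_factory=dict)   # private to this namespace
    enabled: dict = field(default_factory=dict)   # data account -> set of capabilities

    def allow(self, account: str, capability: str) -> None:
        """Enable one capability for one data account (hypothetical API)."""
        if capability not in CAPABILITIES:
            raise ValueError(f"unknown capability: {capability}")
        self.enabled.setdefault(account, set()).add(capability)

    def _check(self, account: str, capability: str) -> None:
        if capability not in self.enabled.get(account, set()):
            raise PermissionError(f"{account} may not {capability} in {self.name}")

    def put(self, account: str, key: str, value: bytes) -> None:
        self._check(account, "write")
        self.objects[key] = value

    def get(self, account: str, key: str) -> bytes:
        self._check(account, "read")
        return self.objects[key]


@dataclass
class Tenant:
    name: str
    namespaces: dict = field(default_factory=dict)

    def create_namespace(self, name: str) -> Namespace:
        ns = Namespace(name)
        self.namespaces[name] = ns
        return ns
```

Since each `Namespace` holds its own object store and its own capability table, configuring one namespace leaves every other namespace untouched, which is the isolation property the abstract describes.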
Abstract:
An archival storage cluster of symmetric nodes includes a metadata management system that organizes metadata objects. Each metadata object may have a unique name, and metadata objects are organized into regions. A region is selected by hashing one or more object attributes and extracting a given number of bits of the resulting hash value. The number of bits may be controlled by a configuration parameter. Each region is stored redundantly. A region comprises a set of region copies. In particular, there is one authoritative copy of the region, and zero or more backup copies. The number of backup copies may be controlled by a configuration parameter. Region copies are distributed across the nodes of the cluster to balance the number of authoritative region copies per node, and the number of total region copies per node. Backup region copies are kept synchronized with their associated authoritative region copy.
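Region selection by hashing can be sketched in a few lines. The abstract does not say which hash function is used or which bits are extracted, so SHA-256 and the low-order bits are assumptions here, as are the names `region_for` and `region_bits`.

```python
import hashlib


def region_for(object_name: str, region_bits: int) -> int:
    """Map an object name to a region number in [0, 2**region_bits).

    Assumptions: SHA-256 as the hash, low-order bits as the extracted bits.
    `region_bits` stands in for the configuration parameter.
    """
    if region_bits < 0:
        raise ValueError("region_bits must be non-negative")
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return value & ((1 << region_bits) - 1)
```

Under these assumptions, raising `region_bits` by one splits every region into exactly two, because an object's old region number is the new one with the top bit masked off.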
Abstract:
An archive cluster application runs across a redundant array of independent nodes. Each node runs an archive cluster application instance comprising a set of software processes: a request manager, a storage manager, a metadata manager, and a policy manager. The request manager manages requests for data, the storage manager manages data read/write functions, and the metadata manager facilitates metadata transactions and recovery. The policy manager implements policies, which are operations that determine the behavior of an “archive object” within the cluster. The archive cluster application provides object-based storage. It associates metadata and policies with the raw archived data, which together comprise an archive object. Object policies govern the object's behavior in the archive. The archive manages itself independently of client applications, acting automatically to ensure that object policies are valid.
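The idea of an archive object that carries its own metadata and policies, with a policy manager acting on it automatically, might look like the following. Everything here is hypothetical: the abstract names no concrete policy, so `MinCopies` is an invented example, and its `enforce` method only updates a counter in place of real replication.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveObject:
    data: bytes                                   # raw archived data
    metadata: dict
    policies: list = field(default_factory=list)  # policies travel with the object


@dataclass
class MinCopies:
    """Hypothetical policy: the object must exist in at least `count` copies."""
    count: int

    def is_valid(self, obj: ArchiveObject) -> bool:
        return obj.metadata.get("copies", 0) >= self.count

    def enforce(self, obj: ArchiveObject) -> None:
        # Stand-in for real replication work.
        obj.metadata["copies"] = self.count


def run_policy_manager(objects: list) -> int:
    """Check every policy on every object; repair violations.

    Returns the number of violations that were repaired.
    """
    repaired = 0
    for obj in objects:
        for policy in obj.policies:
            if not policy.is_valid(obj):
                policy.enforce(obj)
                repaired += 1
    return repaired
```

No client application is involved in the loop: the policies attached to each object are enough for the archive to detect and correct a violation on its own.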
Abstract:
An archive cluster application runs in a distributed manner across a redundant array of independent nodes. Each node preferably runs a complete archive cluster application instance. A given node provides a data repository, which stores up to a large amount (e.g., a terabyte) of data, while also acting as a portal that enables access to archive files. Each symmetric node has a set of software processes, e.g., a request manager, a storage manager, a metadata manager, and a policy manager. The request manager manages requests to the node for data (i.e., file data), the storage manager manages data read/write functions from a disk associated with the node, and the metadata manager facilitates metadata transactions and recovery across the distributed database. The policy manager implements one or more policies, which are operations that determine the behavior of an “archive object” within the cluster. The archive cluster application provides object-based storage. Preferably, the application permanently associates metadata and policies with the raw archived data, which together comprise an archive object. Object policies govern the object's behavior in the archive. As a result, the archive manages itself independently of client applications, acting automatically to ensure that all object policies are valid.
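A node that is both a repository and a portal can be sketched as below. The abstract does not say how a node locates a file held elsewhere, so the shared name-to-node index standing in for the distributed database is an assumption, as are all class and method names. The policy manager is omitted for brevity.

```python
class StorageManager:
    """Reads and writes data on the disk associated with one node."""

    def __init__(self) -> None:
        self._disk = {}

    def write(self, name: str, data: bytes) -> None:
        self._disk[name] = data

    def read(self, name: str) -> bytes:
        return self._disk[name]


class MetadataManager:
    """Assumed: records which node holds each file in a shared index."""

    def __init__(self, index: dict) -> None:
        self._index = index

    def record(self, name: str, node_id: str) -> None:
        self._index[name] = node_id

    def locate(self, name: str) -> str:
        return self._index[name]


class Node:
    """Symmetric node: every instance has the same set of managers."""

    def __init__(self, node_id: str, cluster: dict, index: dict) -> None:
        self.node_id = node_id
        self.cluster = cluster              # node_id -> Node
        self.storage = StorageManager()
        self.metadata = MetadataManager(index)
        cluster[node_id] = self

    # The two methods below play the request manager role.
    def put(self, name: str, data: bytes) -> None:
        self.storage.write(name, data)
        self.metadata.record(name, self.node_id)

    def get(self, name: str) -> bytes:
        owner = self.cluster[self.metadata.locate(name)]
        return owner.storage.read(name)
```

A file written through one node can then be fetched through any other, for example `Node("n2", cluster, index).get("a.txt")` after `n1.put("a.txt", b"hello")`, which is what makes each node a portal to the whole archive.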
Abstract:
An archival storage cluster of preferably symmetric nodes includes a metadata management system that organizes and provides access to given metadata, preferably in the form of metadata objects. Each metadata object may have a unique name, and metadata objects are organized into regions. Preferably, a region is selected by hashing one or more object attributes (e.g., the object's name) and extracting a given number of bits of the resulting hash value. The number of bits may be controlled by a configuration parameter. Each region is stored redundantly. A region comprises a set of region copies. In particular, there is one authoritative copy of the region, and zero or more backup copies. The number of backup copies may be controlled by a configuration parameter. Region copies are distributed across the nodes of the cluster so as to balance the number of authoritative region copies per node, as well as the number of total region copies per node. Backup region copies are kept synchronized with their associated authoritative region copy.
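One way to picture the balancing of region copies is a simple greedy placement. The abstract does not describe the actual placement algorithm, so the greedy rule below is a swapped-in illustration, and `assign_region_copies` and its parameters are hypothetical names.

```python
def assign_region_copies(num_regions: int, nodes: list, backups: int) -> dict:
    """Place one authoritative copy and `backups` backup copies per region.

    Greedy rule (an assumption, not the source's method): the authoritative
    copy goes to the node with the fewest authoritative copies so far, and
    backups go to the other nodes holding the fewest copies in total.
    Returns {region: (authoritative_node, [backup_nodes])}.
    """
    if backups + 1 > len(nodes):
        raise ValueError("need a distinct node for every copy of a region")
    auth = {n: 0 for n in nodes}     # authoritative copies per node
    total = {n: 0 for n in nodes}    # all copies per node
    layout = {}
    for region in range(num_regions):
        primary = min(nodes, key=lambda n: (auth[n], total[n]))
        auth[primary] += 1
        total[primary] += 1
        others = sorted((n for n in nodes if n != primary), key=lambda n: total[n])
        chosen = others[:backups]
        for n in chosen:
            total[n] += 1
        layout[region] = (primary, chosen)
    return layout
```

The sketch tracks the two quantities the abstract says are balanced, authoritative copies per node and total copies per node, and never places two copies of the same region on one node.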