Abstract:
Content platform management is enhanced by logically partitioning a physical cluster that comprises a redundant array of independent nodes. Using an interface, an administrator defines one or more “tenants” within the archive cluster, wherein a tenant has a set of attributes including, for example, namespaces, administrative accounts, data access accounts, and a permission mask. A namespace is a logical partition of the cluster that serves as a collection of objects typically associated with at least one defined application. Each namespace has a private file system such that access to one namespace (and its associated objects) does not enable a user to access objects in another namespace. A namespace has capabilities (e.g., read, write, delete, purge, and the like) that a namespace administrator can choose to enable or disable for a given data account. Using the interface, an administrator for the tenant creates and manages namespaces such that the cluster then is logically partitioned into a set of namespaces, wherein one or more namespaces are associated with a given tenant. This approach enables a user to segregate cluster data into logical partitions. Using the administrative interface, a namespace associated with a given tenant is selectively configured without affecting a configuration of at least one other namespace in the set of namespaces. This architecture enables support for many top level tenants, with multiple namespaces per tenant, and wherein configuration is effected at the level of a namespace.
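The tenant, namespace, and per-account capability model described above can be sketched as follows. This is a minimal illustration, not the platform's actual interface: the class names `Tenant` and `Namespace`, the method names, and the in-memory dictionaries are all hypothetical. Only the capability names (read, write, delete, purge) come from the abstract.

```python
from dataclasses import dataclass, field

# Capability names taken from the abstract; everything else is illustrative.
CAPABILITIES = frozenset({"read", "write", "delete", "purge"})


@dataclass
class Namespace:
    """A logical partition with its own private object store (hypothetical model)."""
    name: str
    objects: dict = field(default_factory=dict)   # object name -> data
    enabled: dict = field(default_factory=dict)   # data account -> set of capabilities

    def set_capability(self, account, capability, on=True):
        """Enable or disable one capability for one data account."""
        if capability not in CAPABILITIES:
            raise ValueError(f"unknown capability: {capability}")
        caps = self.enabled.setdefault(account, set())
        if on:
            caps.add(capability)
        else:
            caps.discard(capability)

    def _check(self, account, capability):
        if capability not in self.enabled.get(account, set()):
            raise PermissionError(f"{account} may not {capability} in {self.name}")

    def write(self, account, key, data):
        self._check(account, "write")
        self.objects[key] = data

    def read(self, account, key):
        self._check(account, "read")
        return self.objects[key]


@dataclass
class Tenant:
    """A tenant owns one or more namespaces (hypothetical model)."""
    name: str
    namespaces: dict = field(default_factory=dict)

    def create_namespace(self, name):
        if name in self.namespaces:
            raise ValueError(f"namespace already exists: {name}")
        ns = Namespace(name)
        self.namespaces[name] = ns
        return ns
```

Because each `Namespace` holds its own object store and its own capability table, enabling a capability in one namespace leaves every other namespace of the same tenant unchanged, which mirrors the isolation property the abstract describes.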
Abstract:
A set of operators on group-structured data enables creation of efficient execution plans. The operators are of two (2) distinct types, but they operate similarly. For each row that matches an input row type, an hkey is obtained. The hkey uniquely identifies a table row within a table group. The hkey is transformed into a modified hkey associated with an output row type. Starting with a row of interest associated with the modified hkey, a table group is probed to identify one or more additional rows. As the additional rows are identified, they are written into an output stream.
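The obtain, transform, probe, and emit sequence can be illustrated with a small sketch. The abstract does not specify how an hkey is encoded or how it is transformed, so the following assumes, purely for illustration, that an hkey is a tuple of alternating table names and key values, that the table group is kept sorted by hkey, and that the transformation truncates the hkey at the segment of the output row type. The names `GROUP`, `transform`, and `lookup` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical table group: rows of several tables interleaved and sorted by hkey.
# Assumed hkey encoding: (table, key, table, key, ...), root table first.
GROUP = sorted(
    [
        (("customer", 1), {"name": "Ann"}),
        (("customer", 1, "order", 10), {"total": 25}),
        (("customer", 1, "order", 10, "item", 100), {"sku": "A"}),
        (("customer", 1, "order", 10, "item", 101), {"sku": "B"}),
        (("customer", 1, "order", 11), {"total": 40}),
        (("customer", 2), {"name": "Bob"}),
        (("customer", 2, "order", 20), {"total": 5}),
    ],
    key=lambda entry: entry[0],
)
HKEYS = [hkey for hkey, _ in GROUP]


def row_type(hkey):
    """The row type is the last table name in the hkey."""
    return hkey[-2]


def transform(hkey, output_type):
    """Truncate the hkey at the output type's segment (assumed transformation).

    Raises ValueError if the output type does not appear in the hkey.
    """
    i = hkey.index(output_type)
    return hkey[: i + 2]


def lookup(input_hkeys, input_type, output_type):
    """Yield (hkey, row) pairs found by probing the group from the modified hkey."""
    for hkey in input_hkeys:
        if row_type(hkey) != input_type:
            continue  # only rows matching the input row type are processed
        modified = transform(hkey, output_type)
        start = bisect_left(HKEYS, modified)  # probe: locate the row of interest
        for found_hkey, row in GROUP[start:]:
            if found_hkey[: len(modified)] != modified:
                break  # left the subtree rooted at the row of interest
            yield found_hkey, row  # write the row into the output stream
```

Written as a generator, `lookup` emits each additional row as soon as it is identified, in the spirit of the streaming behavior described above. For example, `list(lookup([("customer", 1, "order", 10, "item", 101)], "item", "order"))` starts at order 10 and returns that order row followed by its two item rows.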
Abstract:
Archive management is enhanced by logically partitioning a physical cluster. Using an interface, an administrator defines “tenants” within the cluster. A namespace is a logical partition of the cluster for a collection of objects. Each namespace has a private file system. This approach enables a user to segregate cluster data into logical partitions. Using the interface, a namespace for a tenant is configured without affecting a configuration of another namespace. One configuration option is “versioning,” by which an administrator can elect to enable multiple versions of a same data object to be stored in association with a namespace. Once versioning is enabled for a namespace, the administrator can set a configuration parameter identifying a time period for maintaining a version. Preferably, versioning is disabled for a data object under retention.
Abstract:
An archival storage cluster of symmetric nodes includes a metadata management system that organizes metadata objects. Each metadata object may have a unique name, and metadata objects are organized into regions. A region is selected by hashing one or more object attributes and extracting a given number of bits of the resulting hash value. The number of bits may be controlled by a configuration parameter. Each region is stored redundantly. A region comprises a set of region copies. In particular, there is one authoritative copy of the region, and zero or more backup copies. The number of backup copies may be controlled by a configuration parameter. Region copies are distributed across the nodes of the cluster to balance the number of authoritative region copies per node, and the number of total region copies per node. Backup region copies are maintained synchronized to their associated authoritative region copy.
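Region selection by hashing and bit extraction can be sketched in a few lines. The abstract names neither the hash function nor which bits are extracted, so this example assumes SHA-256 over the object's name and takes the low-order bits. The function name `region_for` and the parameter `region_bits` are hypothetical stand-ins for the configuration parameter mentioned above.

```python
import hashlib


def region_for(object_name: str, region_bits: int) -> int:
    """Map an object name to a region number in the range [0, 2**region_bits).

    Assumptions: SHA-256 as the hash and low-order bit extraction; the
    abstract only says that a given number of bits of the hash are used.
    """
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return value & ((1 << region_bits) - 1)  # keep the lowest region_bits bits
```

With `region_bits = 6` there are 64 regions. Under the low-order-bit assumption, raising the parameter by one splits each region in two, because the region number at the smaller setting is the low-order part of the number at the larger setting.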
Abstract:
Archive cluster management is enhanced by logically partitioning a physical cluster that comprises a redundant array of independent nodes. Using a web-based interface, an administrator defines one or more “tenants” within the archive cluster, wherein a tenant has a set of attributes: namespaces, administrative accounts, data access accounts, and a permission mask. A namespace is a logical partition of the cluster that serves as a collection of objects typically associated with at least one defined application. Each namespace has a private file system with respect to other namespaces. This approach enables a user to segregate cluster data into logical partitions. Using the administrative interface, a namespace associated with a given tenant is selectively configured without affecting a configuration of at least one other namespace in the set of namespaces. One configuration option is “versioning,” by which an administrator can elect to enable multiple versions of a same data object to be stored in association with a given namespace. Each version of the data object has associated therewith a time of storage attribute that uniquely identifies the version in the archive. Once versioning is enabled for a namespace, the administrator can set a configuration parameter identifying a time period for maintaining a version in the archive cluster, as well as a parameter for a time period for maintaining a version of the data object on a replica associated with the archive cluster. A current version of the data object is freely accessible in the archive, and a prior version may be browsed via an API. Preferably, versioning is disabled for a data object under retention.
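The versioning behavior can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the archive's actual API: the class `VersionedNamespace`, its methods, and the use of numeric timestamps are hypothetical. The sketch assumes that the maintenance period is measured from a version's time of storage, that the current version is never pruned, and that an existing object cannot be overwritten when versioning is off or when the object is under retention.

```python
class VersionedNamespace:
    """Hypothetical per-namespace version store keyed by time of storage."""

    def __init__(self, versioning=False, keep_seconds=0.0):
        self.versioning = versioning      # namespace-level configuration option
        self.keep_seconds = keep_seconds  # time period for maintaining a prior version
        self.versions = {}                # object name -> {stored_at: data}
        self.under_retention = set()      # names of objects under retention

    def put(self, name, data, stored_at):
        history = self.versions.setdefault(name, {})
        # Assumed rule: a new version is accepted only when versioning is
        # enabled and the object is not under retention.
        if history and (not self.versioning or name in self.under_retention):
            raise PermissionError(f"cannot store a new version of {name}")
        if stored_at in history:
            # The time of storage must uniquely identify the version.
            raise ValueError("time of storage already used for this object")
        history[stored_at] = data

    def current(self, name):
        """Return the data of the most recently stored version."""
        history = self.versions[name]
        return history[max(history)]

    def prior_versions(self, name):
        """Return storage times of all versions except the current one, oldest first."""
        return sorted(self.versions[name])[:-1]

    def prune(self, now):
        """Drop prior versions stored more than keep_seconds before `now`."""
        for history in self.versions.values():
            newest = max(history)
            expired = [t for t in history
                       if t != newest and now - t > self.keep_seconds]
            for t in expired:
                del history[t]
```

A second period for versions held on a replica, as the abstract mentions, could be modeled by giving the replica its own `VersionedNamespace` with a different `keep_seconds` value.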
Abstract:
An archival storage cluster of preferably symmetric nodes includes a metadata management system that organizes and provides access to given metadata, preferably in the form of metadata objects. Each metadata object may have a unique name, and metadata objects are organized into regions. Preferably, a region is selected by hashing one or more object attributes (e.g., the object's name) and extracting a given number of bits of the resulting hash value. The number of bits may be controlled by a configuration parameter. Each region is stored redundantly. A region comprises a set of region copies. In particular, there is one authoritative copy of the region, and zero or more backup copies. The number of backup copies may be controlled by a configuration parameter. Region copies are distributed across the nodes of the cluster so as to balance the number of authoritative region copies per node, as well as the number of total region copies per node. Backup region copies are maintained synchronized to their associated authoritative region copy.
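One way to balance authoritative and total region copies across nodes is a greedy assignment, sketched below. The abstract does not describe the placement algorithm, so this is only one plausible approach. The function `place_regions` and its parameters are hypothetical, and the sketch assumes that the copies of a given region must sit on distinct nodes.

```python
def place_regions(nodes, num_regions, backups):
    """Greedily assign one authoritative copy and `backups` backup copies per region.

    Returns (placement, auth_count, total_count), where placement maps a
    region number to (authoritative_node, [backup_nodes]).
    """
    if backups + 1 > len(nodes):
        raise ValueError("not enough nodes to hold every copy on a distinct node")
    auth_count = {n: 0 for n in nodes}    # authoritative copies per node
    total_count = {n: 0 for n in nodes}   # all copies per node
    placement = {}
    for region in range(num_regions):
        # Authoritative copy: node with the fewest authoritative copies,
        # breaking ties by fewest total copies.
        primary = min(nodes, key=lambda n: (auth_count[n], total_count[n]))
        auth_count[primary] += 1
        total_count[primary] += 1
        # Backup copies: the least-loaded nodes other than the primary.
        candidates = sorted((n for n in nodes if n != primary),
                            key=lambda n: total_count[n])
        chosen = candidates[:backups]
        for node in chosen:
            total_count[node] += 1
        placement[region] = (primary, chosen)
    return placement, auth_count, total_count
```

The sketch covers only initial placement. Keeping backup copies synchronized with their authoritative copy, and rebalancing when nodes join or leave, are separate concerns that it does not address.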