Abstract:
A system and method for data replication is described. A destination storage system receives a message from a source storage system as part of a replication process. The message includes an identity of a first file, information about where the first file is stored in the source storage system, a name of a first data being used by the first file and stored at a first location of the source storage system, and a fingerprint of the first data. The destination storage system determines that a mapping database is unavailable or inaccurate, and accesses a fingerprint database using the fingerprint of the first data received with the message to determine whether data stored in the destination storage system has a fingerprint identical to the fingerprint of the first data.
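The fallback described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Destination` class, the message keys, and the use of SHA-256 as the fingerprint are all assumptions made for the example.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hypothetical fingerprint: SHA-256 hex digest of the data."""
    return hashlib.sha256(data).hexdigest()


class Destination:
    """Toy destination storage system (all names are illustrative)."""

    def __init__(self):
        self.blocks = {}          # local location -> data
        self.mapping_db = {}      # source data name -> local location
        self.fingerprint_db = {}  # fingerprint -> local location
        self.mapping_db_ok = True

    def store(self, data, source_name=None):
        location = len(self.blocks)
        self.blocks[location] = data
        self.fingerprint_db[fingerprint(data)] = location
        if source_name is not None:
            self.mapping_db[source_name] = location
        return location

    def resolve(self, message):
        """Return the local location of the data named in the message,
        or None if the data is not present and must be transferred."""
        if self.mapping_db_ok:
            return self.mapping_db.get(message["data_name"])
        # Mapping database unavailable or inaccurate: fall back to
        # the fingerprint carried in the replication message.
        return self.fingerprint_db.get(message["fingerprint"])
```

With the mapping database flagged as unusable, `resolve` still finds data already stored at the destination, so the source does not need to resend it.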
Abstract:
Methods, systems, and computer executable instructions for performing distributed data analytics are provided. In one exemplary embodiment, a method of performing a distributed data analytics job includes collecting application-specific information in a processing node assigned to perform a task to identify data necessary to perform the task. The method also includes requesting a chunk of the necessary data from a storage server based on location information indicating one or more locations of the data chunk and prioritizing the request relative to other data requests associated with the job. The method also includes receiving the data chunk from the storage server in response to the request and storing the data chunk in a memory cache of the processing node which uses a same file system as the storage server.
Abstract:
Techniques to authenticate user requests involving multiple applications are described. An apparatus may comprise a logic circuit, and a user interface component operative on the logic circuit to present to a user content from a primary application, handle user commands directed to the primary application, and verify the user to a secondary application using an identifier value that is generated by the primary application for authenticating the user. In one embodiment, the user interface component submits the identifier value to the secondary application in a request for certain content. After determining whether the identifier value is valid, the secondary application provides the requested content or denies the user's request. Other embodiments are described and claimed.
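One way to picture the identifier exchange is sketched below. The abstract does not say how the identifier value is generated or validated; the HMAC over a shared secret, the function names, and the `user:signature` format are assumptions chosen only to make the flow concrete.

```python
import hashlib
import hmac

# Hypothetical secret shared by the primary and secondary applications.
SHARED_SECRET = b"demo-secret"


def _sign(user_id: str) -> str:
    return hmac.new(SHARED_SECRET, user_id.encode(), hashlib.sha256).hexdigest()


def issue_identifier(user_id: str) -> str:
    """Primary application: generate an identifier value for the user."""
    return f"{user_id}:{_sign(user_id)}"


def serve_content(identifier: str, content_id: str):
    """Secondary application: provide the content if the identifier
    value is valid, otherwise deny the request (returns None)."""
    user_id, _, signature = identifier.partition(":")
    if not hmac.compare_digest(signature, _sign(user_id)):
        return None
    return f"content {content_id} for {user_id}"
```

The user interface component would pass the value returned by `issue_identifier` along with each content request to the secondary application.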
Abstract:
A system and method integrate namespace management and storage management in a storage system environment. According to the invention, an integrated management framework provides an underlying infrastructure that supports various heterogeneous storage access protocols within a single, logical namespace service. The logical namespace service is based on extensions to underlying storage management processes that cooperate to create the integrated management framework. Notably, these extensions are embodied as novel library functionality.
Abstract:
Technology is disclosed for subpartitioning a namespace region. In various embodiments, the technology creates at least two subpartitions from a partitioned namespace, wherein the partitioned namespace corresponds to at least two different name nodes of a large scale data storage service; and stores data corresponding to each subpartition as a separate file, e.g., so that it can be easily mounted by an operating system executed by a different computing device.
Abstract:
A system and method for performing a backup operation is described. A source system determines a set of files to be backed up at a backup system. Based on one or more attributes of each file of the set of files, the source system determines an order in which to perform the backup operation for the set of files. The order specifies an individual file of the set of files to be backed up before another file of the set of files. The source system communicates with the backup system to perform the backup operation of the set of files in the determined order.
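The ordering step can be sketched as a sort over file attributes. The abstract does not name the attributes or the policy; the `FileInfo` fields and the rule used here (most recently modified first, smaller files first on ties) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    modified_at: float  # Unix timestamp


def backup_order(files):
    """Return the files in the order they should be backed up.

    Hypothetical policy: most recently modified first; among files
    with the same modification time, smaller files first.
    """
    return sorted(files, key=lambda f: (-f.modified_at, f.size_bytes))
```

The source system would then send the files to the backup system in the returned order.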
Abstract:
A storage device receives a write request from a disk controller to write data to a storage array. The storage device determines that one or more blocks are marked for deletion. In response to receiving the write request and determining that blocks are marked for deletion, the storage device issues a write command on a first media access channel for a first location of the storage array, and issues an erase command on a second media access channel for a different storage location of the storage array. Thus, the commands are issued concurrently on different channels.
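A small simulation of the concurrent issue is sketched below. Real media access channels are hardware paths; here two worker threads stand in for them, and the `StorageDevice` class, channel numbers, and choice of which marked block to erase are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


class StorageDevice:
    """Toy device: channel 0 carries writes, channel 1 carries erases."""

    def __init__(self, num_locations):
        self.array = [None] * num_locations
        self.marked_for_deletion = set()

    def _write(self, channel, location, data):
        self.array[location] = data
        return (channel, "write", location)

    def _erase(self, channel, location):
        self.array[location] = None
        self.marked_for_deletion.discard(location)
        return (channel, "erase", location)

    def handle_write(self, location, data):
        """Issue the write and, if another location is marked for
        deletion, an erase for it at the same time on a second channel."""
        candidates = self.marked_for_deletion - {location}
        if not candidates:
            return [self._write(0, location, data)]
        erase_location = min(candidates)
        with ThreadPoolExecutor(max_workers=2) as pool:
            write_job = pool.submit(self._write, 0, location, data)
            erase_job = pool.submit(self._erase, 1, erase_location)
            return [write_job.result(), erase_job.result()]
```

Because the two commands target different locations, neither has to wait for the other to finish.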
Abstract:
Examples are disclosed for identifying duplicated media content in a plurality of media files. In some examples, according to a media file format, media content sequences may be located and duplicated media content sequences identified. For these examples, at least a portion of the identified duplicated media content sequences may then be deleted or not stored at a storage system. Other examples are described and claimed.
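As a rough sketch of the idea, the function below stores each distinct media content sequence once and records which sequences each file uses. It assumes the sequences have already been located according to the media file format, and it treats byte-identical sequences as duplicates by comparing SHA-256 digests; both are assumptions, as are the names.

```python
import hashlib


def deduplicate(media_files):
    """media_files maps a file name to its list of content sequences (bytes).

    Returns (store, layout): store holds one copy of each distinct
    sequence keyed by digest; layout lists the digests each file uses.
    """
    store = {}
    layout = {}
    for name, sequences in media_files.items():
        refs = []
        for seq in sequences:
            digest = hashlib.sha256(seq).hexdigest()
            store.setdefault(digest, seq)  # duplicates are not stored again
            refs.append(digest)
        layout[name] = refs
    return store, layout
```

A file can be reassembled by looking up its digests in `store` in order.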
Abstract:
A method, non-transitory computer readable medium, and device for prefetching include identifying a candidate data block from among one or more immediate successor data blocks. The identified candidate data block has a historical access probability value from an initially accessed data block that is higher than the historical access probability value for each of the other immediate successor data blocks and is above a prefetch threshold value. The identifying is repeated until a next identified candidate data block has a historical access probability value below the prefetch threshold value. In each repetition, the next immediate successor data blocks are those of the previously identified candidate data block, and the historical access probability value for each of them is determined from the originally accessed data block. Each identified candidate data block with a historical access probability value above the prefetch threshold value is fetched.
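The repeated identification can be sketched as a walk over a successor graph. The abstract does not say how the probability from the originally accessed block is computed; this sketch assumes it is the product of per-step transition probabilities along the path, and the function name and data layout are likewise hypothetical.

```python
def prefetch_chain(successors, start, threshold):
    """Return the blocks to prefetch after `start` is accessed.

    successors maps a block to {successor: historical probability of
    that successor being accessed next}. The probability from `start`
    is taken as the product along the path (an assumption).
    """
    chain = []
    visited = {start}
    current, prob_from_start = start, 1.0
    while True:
        options = {
            block: prob_from_start * p
            for block, p in successors.get(current, {}).items()
            if block not in visited  # avoid looping on cyclic histories
        }
        if not options:
            break
        best = max(options, key=options.get)
        if options[best] <= threshold:
            break  # best candidate is no longer above the threshold
        chain.append(best)
        visited.add(best)
        current, prob_from_start = best, options[best]
    return chain
```

Measuring each candidate from the original block, rather than from its immediate predecessor, makes the chain end once the combined likelihood of reaching a block becomes too small to justify fetching it.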
Abstract:
Systems and methods are disclosed for rapid provisioning of virtual storage objects, whereby such rapid provisioning does not require clearing of physical storage resources when initialized for use in a virtual storage object. Accordingly, a virtual storage object of embodiments of the invention is provisioned without the time-intensive process of clearing (e.g., writing zeroes to) data blocks of the physical storage resources.
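One possible way to skip the clearing step is sketched below. The abstract does not describe the mechanism; this example assumes the virtual storage object tracks which blocks have been written and answers reads of unwritten blocks with zeroes, so stale data in reused physical blocks is never exposed. The class and its fields are hypothetical.

```python
class VirtualVolume:
    """Toy virtual storage object backed by uncleared physical blocks."""

    BLOCK_SIZE = 4096

    def __init__(self, physical_blocks):
        # Physical blocks are taken as-is and may still hold stale data.
        self.physical = physical_blocks
        self.written = set()  # indexes written since provisioning

    def read(self, index):
        if index not in self.written:
            return bytes(self.BLOCK_SIZE)  # zeroes, without touching the disk
        return self.physical[index]

    def write(self, index, data):
        if len(data) != self.BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.physical[index] = data
        self.written.add(index)
```

Provisioning then costs only the creation of an empty `written` set, regardless of how many physical blocks back the volume.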