Abstract:
Techniques are provided for small file aggregation in a parallel computing system. An exemplary method for storing a plurality of files generated by a plurality of processes in a parallel computing system comprises aggregating the plurality of files into a single aggregated file; and generating metadata for the single aggregated file. The metadata comprises an offset and a length of each of the plurality of files in the single aggregated file. The metadata can be used to unpack one or more of the files from the single aggregated file.
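The aggregation and unpacking steps can be illustrated with a minimal sketch. It assumes the files are available as in-memory byte strings keyed by name; the function names `aggregate` and `unpack` and the dictionary-based metadata layout are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical metadata layout: file name -> (offset, length) in the aggregate.
Index = Dict[str, Tuple[int, int]]


def aggregate(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate the files into one blob and record each file's offset and length."""
    blob = bytearray()
    index: Index = {}
    for name, data in files.items():
        index[name] = (len(blob), len(data))  # offset is the current end of the blob
        blob.extend(data)
    return bytes(blob), index


def unpack(blob: bytes, index: Index, name: str) -> bytes:
    """Recover one file from the aggregate using only its metadata entry."""
    offset, length = index[name]
    return blob[offset:offset + length]
```

Because each entry stores both an offset and a length, any single file can be extracted without scanning the rest of the aggregated file.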
Abstract:
A data processing system includes host data processors, a data storage system including data storage shared among the host data processors, and a data switch coupling the host data processors to the data storage system. The data storage system has host adapter ports coupled to the data switch. The data switch is programmed for distributing block I/O requests from the host data processors over the operable host adapter ports for load balancing of the block I/O requests among the operable host adapter ports. The shared data storage can be a file system striped across RAID sets of disk drives for load balancing upon disk director ports of the data storage system. The data processing system can be expanded by adding more data storage systems, switches for the additional data storage systems, and switches for routing block I/O requests from the host processors to the data storage systems.
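The distribution of block I/O requests over the operable host adapter ports can be sketched as follows. The abstract does not name a balancing policy, so round-robin is assumed here; the class `PortBalancer` and its method names are hypothetical.

```python
class PortBalancer:
    """Round-robin distribution of requests over ports that are currently operable.

    Round-robin is an assumed policy; the source only states that requests are
    distributed over the operable ports for load balancing.
    """

    def __init__(self, ports):
        self.ports = list(ports)
        self.operable = set(self.ports)
        self._next = 0  # position of the next candidate port

    def mark_failed(self, port):
        self.operable.discard(port)

    def mark_restored(self, port):
        if port in self.ports:
            self.operable.add(port)

    def route(self, request):
        """Return the port that should carry this request."""
        if not self.operable:
            raise RuntimeError("no operable host adapter ports")
        # Visit each port at most once, skipping any that are not operable.
        for _ in range(len(self.ports)):
            port = self.ports[self._next]
            self._next = (self._next + 1) % len(self.ports)
            if port in self.operable:
                return port
        raise RuntimeError("no operable host adapter ports")
```

When a port is marked failed, later requests skip it and continue to rotate over the remaining ports.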
Abstract:
An application is included in a virtual machine sent to a cloud computing server. The cloud computing server has a remote access layer that fetches data blocks of the private dataset of the application from private data storage as the data blocks are requested by the application, so that the application in the public cloud begins execution without waiting for the entire application dataset to be transferred to the public cloud, and the data blocks are transferred from the private dataset to the public cloud only when the data blocks are accessed by the application. The application's private data is kept in the public cloud only when it is currently being used. If there are security concerns, the application's private data is transferred over the public network in an encrypted form and stored in the public cloud in an encrypted form.
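The on-demand transfer of data blocks can be sketched as a small fetch-on-access layer. The class `RemoteAccessLayer`, the `fetch_block` callable, and the `release` method are hypothetical names; encryption in transit and at rest, which the abstract mentions as an option, is omitted from this sketch.

```python
class RemoteAccessLayer:
    """Fetch blocks from private storage only when the application reads them."""

    def __init__(self, fetch_block):
        # fetch_block is a hypothetical callable: block number -> bytes,
        # standing in for a transfer from the private data storage.
        self._fetch = fetch_block
        self._resident = {}   # blocks currently held in the public cloud
        self.fetch_count = 0  # number of transfers performed so far

    def read(self, block_no):
        """Return the block, transferring it first if it is not resident."""
        if block_no not in self._resident:
            self._resident[block_no] = self._fetch(block_no)
            self.fetch_count += 1
        return self._resident[block_no]

    def release(self, block_no):
        """Drop a block that is no longer in use from the public cloud."""
        self._resident.pop(block_no, None)
```

The application can start reading immediately; only the blocks it actually touches are ever transferred, and released blocks are fetched again if they are needed later.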
Abstract:
Three contiguous segments of video data are kept in video cache memory for streaming video data to a host application from a video file in data storage. For example, three buffers are allocated in the cache memory for each video stream, and at any given time during sequential access, a particular one of the three buffers is a middle buffer from which pre-fetched data is streamed to the host application. For forward or backward streaming, the buffers also include a backward buffer and a forward buffer on opposite sides of the middle buffer. To simplify the assembling of the buffers, a shift or rotation of the roles of the buffers and an asynchronous pre-fetch, either for continuance of a stream or for a switched direction of a stream, are triggered by the cache state of the offset requested by the video application.
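The rotation of buffer roles can be sketched as follows. This is a simplified model under stated assumptions: segments are fixed-size and indexed from zero, the pre-fetch is performed synchronously rather than asynchronously, and the names `TripleBufferCache` and `read_segment` are hypothetical.

```python
class TripleBufferCache:
    """Keep the segment being streamed plus its backward and forward neighbours."""

    def __init__(self, read_segment, segment_size):
        # read_segment is a hypothetical callable: segment index -> bytes.
        self.read_segment = read_segment
        self.size = segment_size
        self.middle = None   # index of the segment acting as the middle buffer
        self.buffers = {}    # segment index -> cached bytes (at most three)

    def _load(self, idx):
        # Pre-fetch a segment unless it is already cached or lies before the file.
        if idx >= 0 and idx not in self.buffers:
            self.buffers[idx] = self.read_segment(idx)

    def read(self, offset):
        """Return the byte at offset, rotating buffer roles as the stream moves."""
        idx = offset // self.size
        # The requested segment becomes the middle buffer. If it was the forward
        # or backward buffer, its data is reused and only one new segment is read.
        self.middle = idx
        for cached in list(self.buffers):
            if abs(cached - idx) > 1:
                del self.buffers[cached]  # no longer adjacent to the middle
        for wanted in (idx, idx - 1, idx + 1):
            self._load(wanted)
        return self.buffers[idx][offset % self.size]
```

Moving one segment forward or backward costs a single segment read, since the other two buffers simply change roles; a jump to a distant offset reloads all three.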
Abstract:
In a data processing system, a first processor pre-allocates data blocks for use in a file system at a later time when a second processor needs data blocks for extending the file system. The second processor selectively maps the logical addresses of the pre-allocated blocks so that when the pre-allocated blocks are used in the file system, the layout of the file system on disk is improved to avoid block scatter and enhance I/O performance. The selected mapping can be done at a program layer between a conventional file system manager and a conventional logical volume layer so that there is no need to modify the data block mapping mechanism of the file system manager or the logical volume layer. The data blocks can be pre-allocated adaptively in accordance with the allocation history of the file system.
Abstract:
An access control agent is advantageously deployed at a host device to prevent malicious use of a storage system by unauthorized hosts and users. In one embodiment, the access control agent is disposed in a processing path between the application and the storage device. An application is mounted as an image file by a loop device to provide a virtual file system. The virtual file system is populated with access control information for each block of the file. Application I/O requests are mapped to physical blocks of the storage by the loop device, and the access control information is used to filter the access requests so that unauthorized requests are not forwarded to the storage client (and consequently to the storage devices). With such an arrangement, access rights can be determined for each user at I/O access, file, and block granularity.
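The filtering step can be sketched as a per-block permission check placed in front of the storage client. The class `AccessControlAgent`, the shape of the `acl` table, and the `forward` callable are all hypothetical; the loop-device mapping from file offsets to physical blocks is not modelled here.

```python
class AccessControlAgent:
    """Forward an I/O request only if the user holds the right for that block."""

    def __init__(self, acl, forward):
        # acl is a hypothetical table: block number -> {user: set of operations}.
        self.acl = acl
        # forward is a hypothetical callable standing in for the storage client.
        self.forward = forward

    def submit(self, user, op, block_no):
        allowed = self.acl.get(block_no, {}).get(user, set())
        if op not in allowed:
            # Unauthorized requests are rejected here and never reach storage.
            raise PermissionError(f"{user} may not {op} block {block_no}")
        return self.forward(op, block_no)
```

Because the check runs before the request is forwarded, a denied request produces no traffic to the storage devices at all.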
Abstract:
Read-only and read-write snapshot copies of a production file in a Unix-based file system are organized as a version set of file inodes and shared file blocks. Version pointers and branch pointers link the inodes. Initially the production file can have all its blocks preallocated or it can be a sparse file having only an inode and its last data block. A protocol is provided for creating read-only and read-write snapshots, deleting snapshots, restoring the production file with a specified snapshot, refreshing a specified snapshot, and naming the snapshots. Block pointers are marked with a flag indicating whether or not the pointed-to block is owned by the parent inode. A non-owner marking is inherited by all of the block's descendants. The block ownership controls the copying of indirect blocks when writing to the production file, and also controls deallocation and passing of blocks when deleting a read-only snapshot.
Abstract:
To permit multiple unsynchronized processors to update the file-modification time attribute of a file during concurrent asynchronous writes to the file, a primary processor having a clock manages access to metadata of the file. A number of secondary processors service client requests for access to the file. Each secondary processor has a timer. When the primary processor grants a range lock upon the file to a secondary processor, it returns its clock time (m). Upon receipt, the secondary processor starts a local timer (t). When the secondary processor modifies the file data, it determines a file-modification time that is a function of the clock time and the timer interval, such as a sum (m+t). When the secondary processor receives an updated file-modification time (mp) from the primary processor, if mp>m+t, then the secondary processor updates the clock time (m) to (mp) and resets its local timer.
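The timestamp rule for a secondary processor can be sketched as follows, using the sum m+t as the file-modification time. The class and method names are hypothetical, and the injectable `now` parameter is an assumption added so the local timer can be driven deterministically.

```python
import time


class SecondaryProcessor:
    """Derive file-modification times from the primary's clock plus a local timer."""

    def __init__(self, now=time.monotonic):
        self._now = now       # local time source; only intervals are used
        self.m = None         # last clock time received from the primary
        self._started = None  # local reading taken when the timer was started

    def on_lock_granted(self, primary_clock_time):
        """The primary grants a range lock and returns its clock time m."""
        self.m = primary_clock_time
        self._started = self._now()  # start the local timer t

    def modification_time(self):
        """File-modification time for a write performed now: m + t."""
        t = self._now() - self._started
        return self.m + t

    def on_primary_update(self, mp):
        """Adopt the primary's time mp only if it is ahead of m + t."""
        if mp > self.modification_time():
            self.m = mp
            self._started = self._now()  # reset the local timer
```

Only the elapsed interval of the local timer is used, so the secondary processors need no clock synchronization with the primary or with each other.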
Abstract:
A system for producing multiple concurrent real-time video streams from stored MPEG video clips includes a video server and at least one MPEG decoder array. The decoder array has multiple decoder pairs, each pair having a video switch for switching from one decoder in the pair to the other at a specified time. Switching may occur from a specified Out-point frame to a specified In-point frame, and the specified frames can be any frame type at any location in the group of pictures (GOP) structure. In a preferred construction, the video server has a controller server linked to a series of data mover computers, each controlling one or more respective decoder arrays. The data mover computers use a control protocol to control the decoder arrays, and each decoder uses a data protocol to request data from a respective data mover computer.