Abstract:
A network storage server implements a method to perform fast crawling of a hierarchical storage structure. The hierarchical storage structure contains data entities stored by the network storage server and can be recursively divided into a plurality of sections. A plurality of parallel-processing threads can be used to process the plurality of sections. Each thread selects and processes one of the plurality of sections at a time to generate a sorted list of metadata corresponding to that section of the hierarchical storage structure. The sorted lists generated by the plurality of threads are merged into a baseline list. The baseline list contains sorted metadata for the entities managed by the hierarchical storage structure. The baseline list can then be output as a representation of the state of the data stored by the network storage server.
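The divide, crawl, and merge steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the claimed implementation: the storage structure is modeled as a nested dictionary (directories are dicts, files map to a size), and the names `split_sections`, `crawl_section`, and `build_baseline` are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def split_sections(tree, path="", depth=1):
    """Recursively divide the tree into (path, subtree) sections, `depth` levels down."""
    rest = {}
    for name, node in tree.items():
        if isinstance(node, dict) and depth > 0:
            # A directory above the cut-off depth is divided further.
            yield from split_sections(node, f"{path}/{name}", depth - 1)
        else:
            # Files, and directories at the cut-off depth, stay in this section.
            rest[name] = node
    if rest:
        yield path, rest

def crawl_section(section):
    """Walk one section and return its (path, size) records sorted by path."""
    path, subtree = section
    records = []
    stack = [(path, subtree)]
    while stack:
        base, node = stack.pop()
        for name, child in node.items():
            full = f"{base}/{name}"
            if isinstance(child, dict):
                stack.append((full, child))
            else:
                records.append((full, child))
    records.sort()
    return records

def build_baseline(tree, workers=4, depth=1):
    """Crawl all sections in parallel, then merge the sorted lists into one baseline."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread takes one section at a time from the pool's queue.
        sorted_lists = list(pool.map(crawl_section, split_sections(tree, depth=depth)))
    # heapq.merge combines already-sorted inputs without re-sorting everything.
    return list(heapq.merge(*sorted_lists))
```

Because every per-section list is already sorted, the final merge is a k-way merge rather than a full sort, which is what makes it reasonable to hand sections to independent threads. The sketch records only files; a fuller version would also record directory metadata.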
Abstract:
A system and method accelerate updates to a metadata search database using persistent consistency point image (PCPI) differencing. After first populating the search database, a search agent generates a PCPI and uses a PCPI differencing technique to quickly identify changes between the inode files of a first and a second PCPI. The differences are recorded as modified metadata and written to a log file, which the search agent later reads to update the search database.
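The differencing and log-replay flow can be illustrated with a minimal Python sketch. It is an assumption-laden model rather than the described system: each inode file is represented as a dictionary from inode number to a metadata dictionary, the log is a list of JSON lines, and `diff_inode_files` and `apply_log` are hypothetical names. The sketch compares entries directly, whereas a real differencing technique would work on the on-disk inode files.

```python
import json

def diff_inode_files(old, new):
    """Compare two inode-file snapshots and return log lines describing the changes."""
    log = []
    for ino in sorted(old.keys() | new.keys()):
        if ino not in old:
            log.append(json.dumps({"op": "add", "inode": ino, "meta": new[ino]}))
        elif ino not in new:
            log.append(json.dumps({"op": "delete", "inode": ino}))
        elif old[ino] != new[ino]:
            log.append(json.dumps({"op": "update", "inode": ino, "meta": new[ino]}))
        # Unchanged inodes produce no log entry, so the log stays proportional
        # to the amount of change, not to the size of the file system.
    return log

def apply_log(search_db, log_lines):
    """Replay the log against the search database (here, a plain dict keyed by inode)."""
    for line in log_lines:
        entry = json.loads(line)
        if entry["op"] == "delete":
            search_db.pop(entry["inode"], None)
        else:
            search_db[entry["inode"]] = entry["meta"]
    return search_db
```

Replaying the log against a database populated from the first snapshot should leave it matching the second snapshot, without rescanning unchanged entries.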
Abstract:
One embodiment of the present invention provides a distributed file system that is able to use direct connections between clients and disks to perform file system operations. Upon receiving a request at a client to access a file, the client performs a lookup in a local cache to determine which physical disk blocks are associated with the request. If the lookup cannot be satisfied from the local cache, the client forwards the request to a server. In response to the forwarded request, the client receives a block map for the file from the server. This block map includes location information specifying the physical disk blocks containing the file. The client uses this block map to determine which physical disk blocks are involved in the request and then accesses the file directly from the disk without going through the server.
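The client-side read path described above can be sketched as follows. This is a simplified model under stated assumptions, not the patented design: the disk is a dictionary from physical block number to bytes, the block size is a tiny 4 bytes for readability, and the `Server` and `Client` classes and their methods are hypothetical.

```python
BLOCK_SIZE = 4  # deliberately tiny so the example data is easy to follow

class Server:
    """Holds the authoritative block map for each file."""
    def __init__(self, block_maps):
        self.block_maps = block_maps
        self.requests = 0  # counts how often clients had to ask the server

    def get_block_map(self, filename):
        self.requests += 1
        return list(self.block_maps[filename])

class Client:
    """Reads file data straight from the disk, using cached block maps."""
    def __init__(self, server, disk):
        self.server = server
        self.disk = disk
        self.cache = {}  # filename -> list of physical block numbers

    def read(self, filename, offset, length):
        block_map = self.cache.get(filename)
        if block_map is None:
            # Cache miss: forward the request to the server and keep the map.
            block_map = self.server.get_block_map(filename)
            self.cache[filename] = block_map
        # Work out which logical blocks the byte range touches.
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        # Fetch those physical blocks directly from the disk, bypassing the server.
        data = b"".join(self.disk[block_map[i]] for i in range(first, last + 1))
        start = offset - first * BLOCK_SIZE
        return data[start:start + length]
```

Only the first access to a file involves the server; later reads of the same file are resolved from the cached block map and go directly to the disk. The sketch omits cache invalidation, which a real system would need when the server changes a file's block map.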