1.
Publication No.: US20200081868A1
Publication Date: 2020-03-12
Application No.: US16247014
Filing Date: 2019-01-14
Applicant: NetApp, Inc.
IPC: G06F16/174, G06F16/14
Abstract: Methods, non-transitory machine-readable media, and computing devices that compare a hash value to a predefined value for sliding windows in parallel for segments partitioned from an input data stream. A bit array is populated based on a result of the comparison. The bit array is then parsed according to minimum and maximum chunk sizes to identify chunk boundaries for the input data stream, with portions of the bit array parsed in parallel. Unique chunks of the input data stream defined by the chunk boundaries are stored in a storage device. Accordingly, this technology utilizes parallel processing in two stages. In a first stage, rolling-window-based hashing is performed concurrently to identify potential chunk boundaries. In a second stage, actual chunk boundaries are selected based on minimum and maximum chunk size constraints. This technology advantageously facilitates significant deduplication ratio improvement as well as improved parallel chunking performance.
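
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The window length, hash mask, predefined value, chunk-size limits, and all function names are hypothetical. Each window is hashed from scratch with BLAKE2b as a stand-in for a true rolling hash. Stage two is shown sequentially for brevity, whereas the abstract describes parsing portions of the bit array in parallel. Threads are used only to show the structure; real speedups in Python would need processes or native code.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# All constants below are illustrative assumptions, not values from the patent.
WINDOW = 16      # sliding-window length in bytes
MASK = 0x3F      # compare only the low 6 bits of the hash
TARGET = 0       # the "predefined value" the masked hash is compared to
MIN_CHUNK = 32   # minimum chunk size in bytes
MAX_CHUNK = 256  # maximum chunk size in bytes


def window_hash(window: bytes) -> int:
    """Stand-in for a rolling hash: hashes the whole window each time."""
    return int.from_bytes(hashlib.blake2b(window, digest_size=4).digest(), "big")


def mark_candidates(data: bytes, start: int, end: int) -> List[int]:
    """Stage 1 for one segment: bit i is 1 if the window ending at byte i matches."""
    bits = []
    for i in range(start, end):
        if i + 1 < WINDOW:
            bits.append(0)  # no full window is available yet
        else:
            h = window_hash(data[i + 1 - WINDOW:i + 1])
            bits.append(1 if (h & MASK) == TARGET else 0)
    return bits


def select_boundaries(bits: List[int], min_chunk: int, max_chunk: int) -> List[int]:
    """Stage 2: pick chunk end offsets (exclusive) honoring min/max chunk sizes."""
    boundaries = []
    start, n = 0, len(bits)
    while start < n:
        limit = min(start + max_chunk, n)
        cut = limit  # forced cut at the maximum size or at end of data
        for pos in range(start + min_chunk, limit + 1):
            if bits[pos - 1]:  # first candidate at or beyond the minimum size
                cut = pos
                break
        boundaries.append(cut)
        start = cut
    return boundaries


def chunk_and_store(data: bytes, store: Dict[str, bytes],
                    segment_size: int = 1024, workers: int = 4) -> List[str]:
    """Chunk data, keep only unique chunks in store, and return the chunk recipe."""
    segments = [(s, min(s + segment_size, len(data)))
                for s in range(0, len(data), segment_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda seg: mark_candidates(data, seg[0], seg[1]), segments)
        bits = [b for part in parts for b in part]  # concatenated bit array
    recipe, prev = [], 0
    for cut in select_boundaries(bits, MIN_CHUNK, MAX_CHUNK):
        chunk = data[prev:cut]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # a duplicate chunk is stored only once
        recipe.append(digest)
        prev = cut
    return recipe
```

Because each bit depends only on the bytes of its own window, the bit array comes out the same however the stream is partitioned into segments. That property is what allows stage one to run concurrently in this sketch. Joining `store[d]` for each digest `d` in the returned recipe reconstructs the original input.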