Hybridized storage optimization for genomic workloads

    Publication No.: US12210904B2

    Publication date: 2025-01-28

    Application No.: US16023091

    Filing date: 2018-06-29

    Abstract: A method for more efficiently storing genomic data includes designating multiple different data storage techniques for storing genomic data generated by a genomic pipeline. The method further identifies a file, made up of multiple blocks, generated by the genomic pipeline. The method determines which data storage technique is optimal for storing each block of the file. In doing so, the method may consider the type of the file, the stage of the genomic pipeline that generated the file, the access frequency for blocks of the file, the most accessed blocks of the file, and the like. After completion of a designated stage of the genomic pipeline, the method stores each block using the data storage technique determined to be optimal for it, such that blocks of the file are stored using several different data storage techniques. A corresponding system and computer program product are also disclosed.
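
    Illustrative sketch (not taken from the patent): the per-block decision described in this abstract could look roughly like the Python below. The technique names ("replication", "compression", "erasure_coding"), the file type and stage strings, and the hot_threshold parameter are all assumptions made for the example; the abstract does not name specific techniques or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    index: int         # position of the block within the file
    access_count: int  # observed reads of this block (hypothetical metric)


def choose_technique(file_type, stage, block, hot_threshold=100):
    """Pick a storage technique for one block (all rules are assumed)."""
    if block.access_count >= hot_threshold:
        # Frequently read blocks: favor fast reads over space.
        return "replication"
    if file_type == "fastq" and stage == "alignment":
        # Assumption: raw reads are rarely re-read once aligned.
        return "compression"
    # Default for cold blocks: space-efficient redundancy.
    return "erasure_coding"


def plan_storage(file_type, stage, blocks, hot_threshold=100):
    """Map each block index to its chosen technique.

    Meant to be called once the designated pipeline stage has completed,
    so that one file may end up spread over several techniques.
    """
    return {
        b.index: choose_technique(file_type, stage, b, hot_threshold)
        for b in blocks
    }
```

    With a hot block and a cold block in the same file, plan_storage assigns them different techniques, which is the "hybridized" outcome the abstract describes.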

    Increased parallelization efficiency in tiering environments

    Publication No.: US11194727B2

    Publication date: 2021-12-07

    Application No.: US16732638

    Filing date: 2020-01-02

    Abstract: A computer-implemented method, according to one embodiment, includes: identifying block addresses which are associated with a given object, and combining the block addresses to a first set in response to determining that at least one token is currently issued on one or more of the identified block addresses. A first portion of the block addresses is transitioned to a second set, where the first portion includes ones of the block addresses determined as having a token currently issued thereon. Moreover, a second portion of the block addresses is divided into equal chunks, where the second portion includes the block addresses remaining in the first set. The chunks in the first set are allocated across two or more parallelization units. Furthermore, the block addresses in the second set are divided into equal chunks, and the chunks in the second set are allocated to at least one dedicated parallelization unit.
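
    Illustrative sketch (not taken from the patent): the two-set partitioning in this abstract can be modeled as below. The function names, the chunk_size and n_units parameters, and the round-robin allocation are assumptions; the abstract only says that chunks are equal and are allocated across parallelization units. In this sketch the final chunk of a set may be shorter when the set size is not a multiple of chunk_size.

```python
def chunk(items, size):
    """Split a list into consecutive chunks of `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def plan_parallel_transfer(block_addresses, tokened, chunk_size, n_units):
    """Partition an object's block addresses for parallel processing.

    block_addresses: addresses associated with the object
    tokened: set of addresses that currently have a token issued
    Returns (general, dedicated), where `general` maps each general
    parallelization unit to its chunks and `dedicated` lists the chunks
    reserved for the dedicated unit(s).
    """
    first_set = list(block_addresses)
    # Move addresses with an issued token into the second set.
    second_set = [a for a in first_set if a in tokened]
    first_set = [a for a in first_set if a not in tokened]

    # Spread token-free chunks round-robin over the general units.
    general = {unit: [] for unit in range(n_units)}
    for i, c in enumerate(chunk(first_set, chunk_size)):
        general[i % n_units].append(c)

    # Tokened chunks go to dedicated unit(s) so they cannot stall the rest.
    dedicated = chunk(second_set, chunk_size)
    return general, dedicated
```

    Keeping tokened addresses apart means any wait on token revocation is confined to the dedicated unit, while the general units work only on addresses with no token issued.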

    Application restore time from cloud gateway optimization using storlets

    Publication No.: US10983826B2

    Publication date: 2021-04-20

    Application No.: US16529201

    Filing date: 2019-08-01

    Abstract: A method, computer system, and a computer program product for designing and executing at least one storlet is provided. The present invention may include receiving a plurality of restore operations based on a plurality of data. The present invention may also include identifying a plurality of blocks corresponding to the received plurality of restore operations from the plurality of data. The present invention may then include identifying a plurality of grain packs corresponding with the identified plurality of blocks. The present invention may further include generating a plurality of grain pack index identifications corresponding with the identified plurality of grain packs. The present invention may also include generating at least one storlet based on the generated plurality of grain pack index identifications. The present invention may then include returning a plurality of consolidated objects by executing the generated storlet.
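
    Illustrative sketch (not taken from the patent): the flow from restore operations to blocks, grain packs, index identifications, and consolidated objects might be modeled as follows. The fixed blocks_per_pack layout, the integer-division index, and the run_storlet stand-in (a plain function over an in-memory dict rather than code running inside an object store) are all assumptions for the example.

```python
def plan_restore(restore_ops, blocks_per_pack):
    """Group the blocks needed by all restore operations by grain pack.

    restore_ops: iterable of iterables of block numbers
    Assumes a grain pack bundles `blocks_per_pack` consecutive blocks,
    so the pack index is block // blocks_per_pack.
    """
    wanted = {}
    for op in restore_ops:
        for block in op:
            wanted.setdefault(block // blocks_per_pack, set()).add(block)
    # Sorted, de-duplicated block lists keyed by grain pack index.
    return {pack: sorted(blocks) for pack, blocks in sorted(wanted.items())}


def run_storlet(plan, store):
    """Stand-in for a storlet: build one consolidated object per pack.

    store: {pack_index: {block_number: bytes}} (hypothetical layout)
    """
    return {
        pack: b"".join(store[pack][b] for b in blocks)
        for pack, blocks in plan.items()
    }
```

    Because duplicate block requests collapse in plan_restore, each grain pack is read once even when several restore operations touch it.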

    Using merged snapshots to increase operational efficiency for network caching based disaster recovery

    Publication No.: US10936240B2

    Publication date: 2021-03-02

    Application No.: US16209804

    Filing date: 2018-12-04

    Abstract: A computer-implemented method, according to one embodiment, includes: selecting two previously captured snapshots and calculating a checksum for each file in each of the two snapshots. The checksums are used to determine whether the two snapshots are sufficiently similar to each other. In response to determining that the two snapshots are sufficiently similar to each other, important ones of the files in each of the two snapshots are identified. The identified important files which are located in a lower performance tier of a multi-tier data storage system are transitioned to a higher performance tier of the multi-tier data storage system. Moreover, a merged snapshot is created by merging the two snapshots, and the merged snapshot is provided for additional operations. Other systems, methods, and computer program products are described in additional embodiments.
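
    Illustrative sketch (not taken from the patent): the checksum comparison, tier promotion, and merge in this abstract could be approximated as below. Modeling a snapshot as a dict of path to bytes, using SHA-256, the 0.8 similarity threshold, the "low"/"high" tier labels, and the rule that the newer snapshot wins on conflicts are all assumptions; the abstract does not define "sufficiently similar" or "important".

```python
import hashlib


def checksums(snapshot):
    """Return {path: sha256 hex digest} for a {path: bytes} snapshot."""
    return {p: hashlib.sha256(d).hexdigest() for p, d in snapshot.items()}


def similarity(a, b):
    """Fraction of paths (over the union) whose checksums match."""
    ca, cb = checksums(a), checksums(b)
    paths = set(ca) | set(cb)
    if not paths:
        return 1.0
    same = sum(1 for p in paths if ca.get(p) == cb.get(p))
    return same / len(paths)


def merge_if_similar(older, newer, tiers, important, threshold=0.8):
    """Promote important files and merge two sufficiently similar snapshots.

    tiers: {path: "low" | "high"}, updated in place
    important: set of paths considered important (hypothetical input)
    Returns the merged snapshot, or None if the snapshots differ too much.
    """
    if similarity(older, newer) < threshold:
        return None
    for path in (set(older) | set(newer)) & important:
        if tiers.get(path) == "low":
            tiers[path] = "high"  # move to the higher performance tier
    # Assumption: on conflicting paths the newer snapshot's content wins.
    return {**older, **newer}
```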
