MOLECULAR DATA STORAGE SYSTEMS AND METHODS

    Publication No.: US20230040158A1

    Publication date: 2023-02-09

    Application No.: US17780404

    Filing date: 2019-11-27

    Abstract: A molecular data storage system is presented for encoding one or more data-blocks. The system includes one or more populations of molecular sequences, each population encoding a respective one of the data-blocks. Each molecular sequence comprises a data encoding section made up of a sequence of short k-mers, and within each population the data encoding sections of all molecular sequences have a similar predetermined length N. The short k-mers serve as the data encoding building blocks of the data encoding sections. The valid short k-mers form a subset of a building-block-set consisting of a number Z of different preselected short k-mers, each presenting a unique combination of a number k of bases from a preselected set of bases, characterized in that all Z types of short k-mers in said building-block-set have a similar predetermined size of k≥2 (a plurality of) bases. The data encoding sections collectively encode a sequence of encoded alphabet letters S=(π1, π2, ..., πn, ..., πN−1, πN). Each valid encoded alphabet letter πn at location n of the sequence S is characterized by the occurrence of a predetermined plurality of different types of short k-mers of the building-block-set at the corresponding location n along the data encoding sections of the plurality of molecular sequences of said population.
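
The abstract above describes each letter as a set of different k-mer types observed at the same location across a population. The following is a minimal, noise-free sketch of that idea and not the patented method itself. The choices k = 2, a building-block set of all 16 two-base k-mers, and two k-mer types per letter are illustrative assumptions, as are all function and variable names.

```python
from itertools import combinations, product

BASES = "ACGT"
K = 2            # assumed k-mer size (the abstract only requires k >= 2)
SUBSET_SIZE = 2  # assumed number of distinct k-mer types per letter

# Assumed building-block set: all 16 two-base k-mers, so Z = 16.
BUILDING_BLOCKS = ["".join(p) for p in product(BASES, repeat=K)]

# Each alphabet letter is one unordered subset of SUBSET_SIZE k-mer types.
ALPHABET = [frozenset(c) for c in combinations(BUILDING_BLOCKS, SUBSET_SIZE)]
LETTER_INDEX = {letter: i for i, letter in enumerate(ALPHABET)}


def split_kmers(section, k=K):
    """Cut a data encoding section into consecutive, non-overlapping k-mers."""
    return [section[i:i + k] for i in range(0, len(section), k)]


def encode_population(letters):
    """Return SUBSET_SIZE molecules; molecule j carries the j-th k-mer of each letter."""
    members = [sorted(ALPHABET[i]) for i in letters]
    return ["".join(m[j] for m in members) for j in range(SUBSET_SIZE)]


def decode_population(sections, k=K):
    """Recover letter indices from the set of k-mers seen at each location n."""
    columns = zip(*(split_kmers(s, k) for s in sections))
    # Raises KeyError if a location does not show a valid k-mer subset.
    return [LETTER_INDEX[frozenset(col)] for col in columns]


if __name__ == "__main__":
    message = [0, 5, 119]
    population = encode_population(message)
    print(population)
    print(decode_population(population))
```

Under these assumptions the alphabet has C(16, 2) = 120 letters, compared with 16 letters if each location carried a single k-mer type. A practical decoder would also need to tolerate sequencing errors and uneven sampling, which this sketch ignores.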

    MOLECULAR DATA STORAGE SYSTEMS AND METHODS

    Publication No.: US20210141568A1

    Publication date: 2021-05-13

    Application No.: US17101824

    Filing date: 2020-11-23

    Abstract: A data storage system and method are provided, as well as systems and methods for fabrication, and for writing and reading of data therein. The data storage system includes at least one population of molecular sequences including chains of basic molecular building-blocks, and defining at least one respective data-block encoding data in the data storage system. The data of the data-block is encoded in a sequence S=(π1, π2, ..., πk, ..., πK−1, πK) of encoded letters {πk} associated with an alphabet Σ≡{σm}|m=1 to M, which are encoded according to the types of basic molecular building-blocks appearing at the respective locations k along storage segments of the molecular sequences of the population. The molecular sequences include a number Z of different types of basic molecular building-blocks {En}|n=1 to Z, while the alphabet Σ has a size M strictly greater than the number Z of types of building-blocks. Each alphabet letter σm is associated with a vector {Pmn}|n=1 to Z indicative of occurrences of the basic molecular building-block En of type n in the alphabet letter σm. Accordingly, each encoded letter πk at location k in the storage segments of the molecular sequences of the data-block/population is mapped to a corresponding alphabet letter σm by determining a match between the occurrence of basic molecular building-blocks of different types at that location k of the molecular sequences of the population and the vector {Pmn}|n=1 to Z associated with the alphabet letter σm. In some implementations, the component Pmn of the vector {Pmn}|n=1 to Z associated with alphabet letter σm is indicative of a probability that a basic molecular building-block En of type n, 1≤n≤Z, appears at the location k of the storage segment of a molecular strand of the at least one population in case the letter πk encoded at that location k corresponds to the alphabet letter σm.
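
The matching step in the abstract above, which maps the building-block occurrences at a location k to the alphabet letter σm whose vector {Pmn} fits best, can be sketched as follows. This is an illustration under stated assumptions and not the patent's procedure: the four building-block types, the six-letter alphabet (M = 6 > Z = 4), the letter names, and the use of squared Euclidean distance as the match criterion are all hypothetical.

```python
from collections import Counter

BLOCKS = ["A", "C", "G", "T"]  # assumed Z = 4 building-block types E_n

# Hypothetical alphabet with M = 6 letters; each value is the vector P_mn.
ALPHABET = {
    "s1": (1.0, 0.0, 0.0, 0.0),
    "s2": (0.0, 1.0, 0.0, 0.0),
    "s3": (0.0, 0.0, 1.0, 0.0),
    "s4": (0.0, 0.0, 0.0, 1.0),
    "s5": (0.5, 0.5, 0.0, 0.0),
    "s6": (0.0, 0.0, 0.5, 0.5),
}


def observed_frequencies(column):
    """Fraction of each building-block type seen at one location k."""
    counts = Counter(column)
    total = len(column)
    return tuple(counts[b] / total for b in BLOCKS)


def closest_letter(freqs):
    """Pick the letter whose vector is nearest (assumed squared Euclidean distance)."""
    def distance(name):
        return sum((f - p) ** 2 for f, p in zip(freqs, ALPHABET[name]))
    return min(ALPHABET, key=distance)


def decode(reads):
    """Decode equal-length reads of one population, one letter per location."""
    return [closest_letter(observed_frequencies(col)) for col in zip(*reads)]


if __name__ == "__main__":
    reads = ["AGA", "CTA", "AGA", "CTA"]
    print(decode(reads))
```

A likelihood-based match could replace the distance criterion when the components Pmn are read as probabilities, as in the implementations the abstract mentions.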
