Abstract:
The present invention relates to a computationally efficient method of finding patterns in any data that can be expressed as arrays of binary features or arrays of categorical features. This includes data represented by continuous-valued attributes that can be transformed to a categorical representation. Applications include the discovery of patterns of genetic variability that may be causally related to diseases or traits, and the discovery of patterns of protein biomarkers that may be used for medical diagnostics, prognostics, and therapeutics. The invention further relates to a program storage device having instructions for controlling a computer system to perform the methods, and to a program storage device containing data structures used in the practice of the methods.
Abstract:
The present invention relates to methods for representing multidimensional data. The methods are well suited to, but not limited to, representing multidimensional data in a way that enables the comparison and differentiation of data sets. For example, the invention may be applied to the representation of flow cytometric data. The invention further relates to a program storage device having instructions for controlling a computer system to perform the methods, and to a program storage device containing data structures used in the practice of the methods.
Abstract:
A method of discovering one or more patterns in two sequences of symbols, S1 and S2, includes forming, for each sequence, a master offset table that groups, for each symbol, the positions in the sequence occupied by each occurrence of that symbol. The difference in position between each occurrence of a symbol in one of the sequences and each occurrence of that same symbol in the other sequence is determined, and a Pattern Map is formed. For each value of the difference in position, the Pattern Map lists the position in the first sequence of each symbol that appears in the second sequence at that difference in position. The collection of symbols tabulated for each value of the difference in position thereby defines a parent pattern in the first sequence that is repeated in the second sequence. A computer readable medium having instructions for controlling a computer system to perform the method and a computer readable medium containing a data structure used in the practice of the method are also disclosed.
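The master offset table and Pattern Map described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (master_offset_table, pattern_map, render_pattern), the zero-based positions, the sign convention (position in S2 minus position in S1), and the "." placeholder for unmatched positions are all assumptions made for illustration.

```python
from collections import defaultdict


def master_offset_table(seq):
    """Group, for each symbol, the (zero-based) positions where it occurs."""
    table = defaultdict(list)
    for pos, sym in enumerate(seq):
        table[sym].append(pos)
    return table


def pattern_map(s1, s2):
    """Map each position difference d to the sorted positions i in s1
    whose symbol also appears in s2 at position i + d.

    Assumed convention: d = (position in s2) - (position in s1).
    """
    mot1 = master_offset_table(s1)
    mot2 = master_offset_table(s2)
    pmap = defaultdict(list)
    for sym, positions1 in mot1.items():
        # .get avoids creating empty entries for symbols absent from s2
        for j in mot2.get(sym, ()):
            for i in positions1:
                pmap[j - i].append(i)
    return {d: sorted(ps) for d, ps in pmap.items()}


def render_pattern(s1, positions):
    """Show the parent pattern for one difference value, using '.' for
    positions of s1 (between the first and last match) that do not match."""
    chosen = set(positions)
    lo, hi = min(chosen), max(chosen)
    return "".join(s1[i] if i in chosen else "." for i in range(lo, hi + 1))
```

For example, with S1 = "ABCD" and S2 = "XABYD", pattern_map returns a single entry for difference 1 listing positions 0, 1, and 3 of S1, and render_pattern("ABCD", [0, 1, 3]) shows the corresponding parent pattern as "AB.D". This sketch compares every pair of occurrences of a shared symbol, so its cost grows with the product of the occurrence counts.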