Abstract:
The present invention utilizes a set of D descriptors for each of N items. A value K′ representing a number of descriptors, and a value K representing the number of items that should support a hypothesis, are generated, preferably via user input. Collections involving K or more items for which there is an association involving a selection of values across at least K′ of the D descriptors are identified, and preferably reported to the user.
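The search described above can be illustrated with a brute-force sketch. This is not the patented procedure; it assumes each item is a tuple of D discrete descriptor values, and the function name `find_associations` and its parameters are hypothetical. It enumerates exactly K′ descriptors at a time, which is sufficient because any association across more than K′ descriptors also holds on each K′-sized subset of them.

```python
from collections import defaultdict
from itertools import combinations


def find_associations(items, k_items, k_desc):
    """Return (descriptor_indices, shared_values, item_indices) triples.

    items   : list of equal-length tuples, one value per descriptor (assumed layout)
    k_items : K, the minimum number of items that must share the values
    k_desc  : K', the number of descriptors the shared values must span
    """
    n_desc = len(items[0]) if items else 0
    results = []
    # Try every choice of K' descriptors (exponential in D; illustration only).
    for cols in combinations(range(n_desc), k_desc):
        groups = defaultdict(list)
        for idx, item in enumerate(items):
            # Items with identical values on the chosen descriptors fall together.
            groups[tuple(item[c] for c in cols)].append(idx)
        for values, members in groups.items():
            if len(members) >= k_items:
                results.append((cols, values, members))
    return results


if __name__ == "__main__":
    data = [("a", "x", 1), ("a", "x", 2), ("a", "y", 1), ("b", "x", 1)]
    for cols, values, members in find_associations(data, k_items=2, k_desc=2):
        print(cols, values, members)
```

The cost grows combinatorially with D, so a practical system would need pruning that this sketch does not attempt.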
Abstract:
The present invention provides a method for predicting the risk of a patient for developing adverse drug reactions, particularly drug-induced prolonged QT interval or TdP. The invention also provides a method of identifying a subject afflicted with, or at risk of, developing TdP. In some aspects, the methods comprise analyzing at least one genetic marker, wherein the presence of the at least one genetic marker indicates that the subject is afflicted with, or at risk of, developing TdP.
Abstract:
In a dictionary formation aspect of the invention, a computer-based method of processing a plurality of sequences in a database comprises the following steps. First, the method includes evaluating each of the plurality of sequences including characters which form each sequence. Then, at least one pattern of characters is generated representing at least a subset of the sequences in the database. The pattern has a statistical significance associated therewith, the statistical significance of the pattern being determined by a value representing a minimum number of sequences that the pattern supports in the database.
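The idea of keeping only patterns supported by a minimum number of sequences can be sketched as follows. This is a deliberately simplified stand-in: it treats a "pattern" as a literal fixed-length substring with no wildcards, which the abstract does not specify, and the names `frequent_substrings`, `length`, and `min_support` are hypothetical.

```python
from collections import defaultdict


def frequent_substrings(sequences, length, min_support):
    """Map each substring of the given length to the sequences containing it,
    keeping only substrings found in at least `min_support` distinct sequences."""
    support = defaultdict(set)
    for seq_id, seq in enumerate(sequences):
        for i in range(len(seq) - length + 1):
            # A set counts each sequence once, however often the substring repeats.
            support[seq[i:i + length]].add(seq_id)
    return {
        pattern: sorted(ids)
        for pattern, ids in support.items()
        if len(ids) >= min_support
    }


if __name__ == "__main__":
    db = ["ACGT", "TACG", "GGAC"]
    print(frequent_substrings(db, length=2, min_support=2))
```

Support is counted per sequence rather than per occurrence, matching the abstract's phrase "minimum number of sequences that the pattern supports".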
Abstract:
The method of the present invention aligns a set of N sequences, where N is large. The alignment brings out the best commonality of the N sequences. The method is performed in two stages: a first stage involving motif discovery, and a second stage involving motif pruning and sequence alignment. The present invention also provides an additional constraint, K, as a user-defined control parameter. This parameter constrains the alignment of the N sequences so that at least K of the N sequences agree on a character, whenever possible, in the alignment. The alignment number, K, provides a natural constraint for dealing with a large number of sequences, in that a commonality across most, if not all, sequences is required to be detected.
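The role of K can be illustrated by checking an existing alignment against it. The sketch below does not perform motif discovery, pruning, or alignment; it only assumes the sequences are already aligned to equal length with `-` as the gap symbol, and reports which columns have at least K sequences agreeing on one character. The function name and the gap convention are assumptions.

```python
from collections import Counter


def columns_meeting_k(aligned, k, gap="-"):
    """For each alignment column, report whether at least k sequences
    share the same non-gap character."""
    flags = []
    for column in zip(*aligned):
        counts = Counter(ch for ch in column if ch != gap)
        # Size of the largest group of agreeing sequences in this column.
        best = counts.most_common(1)[0][1] if counts else 0
        flags.append(best >= k)
    return flags


if __name__ == "__main__":
    alignment = ["AC-T", "ACGT", "TCGA"]
    print(columns_meeting_k(alignment, k=2))
    print(columns_meeting_k(alignment, k=3))
```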
Abstract:
The present invention provides a method for predicting the risk of a patient for developing adverse drug reactions, particularly Serious Skin Rash (SSR), including severe adverse reactions such as Stevens-Johnson Syndrome (SJS) and Toxic Epidermal Necrolysis (TEN). The invention also provides a method of identifying a subject afflicted with, or at risk of, developing SSR. In some aspects, the methods comprise analyzing at least one genetic marker, wherein the presence of the at least one genetic marker indicates that the subject is afflicted with, or at risk of, developing SSR. Genetic markers useful in accordance with the methods of the invention are disclosed.
Abstract:
The present invention provides a method for predicting the risk of a patient for developing adverse drug reactions, particularly Drug-Induced Liver Injury (DILI) or hepatotoxicity. The invention also provides a method of identifying a subject afflicted with, or at risk of, developing DILI. In some aspects, the methods comprise analyzing at least one genetic marker, wherein the presence of the at least one genetic marker indicates that the subject is afflicted with, or at risk of, developing DILI.
Abstract:
In a sequence homology detection aspect of the invention, a computer-based method of detecting homologies between a plurality of sequences in a database and a query sequence comprises the following steps. First, the method includes accessing patterns associated with the database, each pattern representing at least a portion of one or more sequences in the database. Next, the query sequence is compared to the patterns to detect whether one or more portions of the query sequence are homologous to portions of the sequences of the database represented by the patterns. Then, a score is generated for each sequence detected to be homologous to the query sequence, wherein the sequence score is based on individual scores generated in accordance with each homologous portion of the sequence detected, and the sequence score represents a degree of homology between the query sequence and the detected sequence.
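The three steps above (access patterns, compare the query, score each detected sequence) can be sketched as follows. The data layout and scoring rule are assumptions, not taken from the abstract: patterns are strings of literal characters with `.` as a single-position wildcard, `pattern_index` maps each pattern to the identifiers of database sequences it represents, and each matching pattern contributes its count of literal characters to the sequence score.

```python
import re
from collections import defaultdict


def score_homologs(query, pattern_index):
    """Return {sequence_id: score} for sequences whose patterns match the query.

    pattern_index : dict mapping a pattern string to a list of sequence ids
                    (hypothetical layout; '.' matches any one character)
    """
    scores = defaultdict(int)
    for pattern, seq_ids in pattern_index.items():
        if re.search(pattern, query):
            # Individual score for this homologous portion: literal positions only.
            weight = sum(1 for ch in pattern if ch != ".")
            for seq_id in seq_ids:
                # The sequence score accumulates the individual scores.
                scores[seq_id] += weight
    return dict(scores)


if __name__ == "__main__":
    index = {"AC.T": ["s1", "s2"], "GGA": ["s2"], "TTT": ["s3"]}
    print(score_homologs("GGACGT", index))
```

Summing per-pattern weights is only one way to combine individual scores; the abstract requires that the sequence score be based on them but does not fix the rule.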
Abstract:
An algorithm which detects tandem repeats (TR) is provided. In an illustrative embodiment, a set of repeating units contained in an input sequence is identified, wherein each given repeating unit satisfies at least the following conditions: (a) a first measure of similarity between adjacent repeating units in the set is greater than a first user-defined threshold, and (b) the given repeating unit includes at least one unit having a second measure of similarity with any other unit in the set that is greater than a second user-defined threshold. The method then provides for reporting positions in the input sequence that are covered by the set of repeating units.
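A reduced sketch of condition (a) and the position reporting is given below. It makes several assumptions the abstract does not state: the unit length (`period`) is fixed and supplied by the caller, similarity is the fraction of identical characters between two equal-length units, and condition (b) is omitted entirely. The function names are hypothetical.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length units agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def tandem_repeat_positions(seq, period, threshold):
    """Return sorted positions covered by adjacent units of length `period`
    whose identity is strictly greater than `threshold` (condition (a) only)."""
    covered = set()
    for start in range(len(seq) - 2 * period + 1):
        unit = seq[start:start + period]
        next_unit = seq[start + period:start + 2 * period]
        if identity(unit, next_unit) > threshold:
            # Both adjacent units count as covered by the repeat.
            covered.update(range(start, start + 2 * period))
    return sorted(covered)


if __name__ == "__main__":
    print(tandem_repeat_positions("TTACGACGACGTT", period=3, threshold=0.9))
```

Lowering the threshold admits approximate repeats; a fuller treatment would also search over unit lengths and apply the second similarity test.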