Abstract:
A reference storage process populates a data structure so that the data structure contains all of the molecular structures and/or rigid substructures in the database classified according to attributes of tuples. In a preferred embodiment, the tuples are derived from sites (e.g. atomic sites) of the molecular structures and the attributes can be derived from geometric (and other) information related to the tuples. The attributes are used to define indices in the data structure that are associated with invariant vector information (e.g. information about rotatable bond(s) in skewed local coordinate frames created from tuples). These representations are invariant with respect to the rotation and translation of molecular structures and/or the rotation of substructures about attached rotatable bond(s). Accordingly, the invariant vector information is classified in the data structure with the respective tuple attributes in locations determined by the index derived from the respective tuple. A matching process creates one or more tuples, skewed local reference frames, and indices (called test frame tuple indices) for the structure (substructures) of a test molecule using the same technique that was used to populate the data structure. The test frame tuple index accesses the invariant vector information and tallies the frequency of matching in order to determine the identity of molecules/substructures in the database that are structurally similar to the test molecule. This identification can be achieved even in the presence of conformationally flexible molecules in the database.
Abstract:
The invention provides isolated nucleic acid sequences of newly discovered microRNAs that have been identified in normal human B cells and/or in tumor-related human B cells, using an integrated bioinformatics method and pipeline described herein.
Abstract:
Disclosed herein is a systems biology approach to the prediction of phenotypically relevant genes such as oncogenes and perturbation targets. Interactions from a comprehensive cellular network such as the B Cell Interactome (BCI) can be used to identify those that become affected, or dysregulated, by a phenotype (e.g., disease, tumor, and cancer) or perturbation (e.g., drug treatment) based on correlation changes between expression profiles of gene pairs in the interactions upon removal or addition of samples showing the phenotype or perturbation. Genes can be ranked based on the affected interactions involving the genes to predict phenotypically relevant genes and/or perturbation targets.
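The ranking idea in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the correlation measure and the summed absolute correlation change as the per-gene score; neither choice is stated in the abstract, and the names `pearson`, `correlation_change`, and `rank_genes` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_change(expr_a, expr_b, phenotype_mask):
    """Absolute change in correlation when phenotype samples are removed."""
    kept = [(a, b) for a, b, is_pheno in zip(expr_a, expr_b, phenotype_mask)
            if not is_pheno]
    kept_a, kept_b = zip(*kept)
    return abs(pearson(expr_a, expr_b) - pearson(kept_a, kept_b))


def rank_genes(interactions, expression, phenotype_mask):
    """Rank genes by the summed correlation change of their interactions.

    The summed score is an assumption made for this sketch.
    """
    scores = defaultdict(float)
    for gene_1, gene_2 in interactions:
        delta = correlation_change(expression[gene_1], expression[gene_2],
                                   phenotype_mask)
        scores[gene_1] += delta
        scores[gene_2] += delta
    return sorted(scores, key=scores.get, reverse=True)
```

Here `interactions` stands in for the edges of a network such as the BCI, and `phenotype_mask` marks which samples show the phenotype or perturbation.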
Abstract:
A method of processing semiotic data includes receiving semiotic data including at least one data set P, selecting a function h, and for at least one of each data set P to be collected, computing h(P), destroying data set P, and storing h(P) in a database, wherein data set P cannot be extracted from h(P). The method further includes selecting a private key/public key (K, k) once for all cases, one of destroying the private key K and sending the private key K to a trusted party, and choosing function h as the public encryption function corresponding to k.
Abstract:
The method and apparatus of the present invention provide for automatic recognition of fingerprint images. In an acquisition mode, subsets of the feature points for a given fingerprint image are generated in a deterministic fashion. One or more of the subsets of feature points for the given fingerprint image is selected. For each selected subset, a key is generated that characterizes the fingerprint in the vicinity of the selected subset. A multi-map entry corresponding to the selected subset of feature points is stored and labeled with the corresponding key. In the recognition mode, a query fingerprint image is supplied to the system. The processing of the acquisition mode is repeated in order to generate a plurality of keys associated with a plurality of subsets of feature points of the query fingerprint image. For each key generated in the recognition mode, all entries in the multi-map that are associated with this key are retrieved. For each item retrieved, a hypothesized match between the query fingerprint image and the reference fingerprint image is constructed. Hypothesized matches are accumulated in a vote table. The list of hypotheses and scores stored in the vote table is preferably used to determine whether a match to the query fingerprint image is stored by the system.
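The acquisition and recognition modes described above can be sketched as follows. The abstract does not say how keys are computed; this sketch assumes that each subset is a triplet of feature points and that the key is the sorted, quantized side lengths of the triangle they form, which is unchanged by rotation and translation. The names `triangle_keys`, `enroll`, and `recognize` and the `bin_size` parameter are hypothetical, and a hypothesized match is reduced here to a fingerprint label.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def triangle_keys(points, bin_size=5.0):
    """Yield one key per triplet of feature points (assumed key design)."""
    for a, b, c in combinations(points, 3):
        sides = sorted((dist(a, b), dist(b, c), dist(a, c)))
        yield tuple(int(side // bin_size) for side in sides)


def enroll(multimap, label, points):
    """Acquisition mode: store one multi-map entry per key."""
    for key in triangle_keys(points):
        multimap[key].append(label)


def recognize(multimap, query_points):
    """Recognition mode: retrieve entries for each query key and tally votes."""
    votes = defaultdict(int)
    for key in triangle_keys(query_points):
        for label in multimap.get(key, ()):
            votes[label] += 1
    return dict(votes)


# Usage: enroll two reference prints, then query with a rotated,
# shifted copy of the first one.
multimap = defaultdict(list)
enroll(multimap, "print_a", [(0, 0), (32, 0), (0, 41), (52, 53)])
enroll(multimap, "print_b", [(0, 0), (12, 0), (0, 12), (12, 12)])
query = [(7 - y, 3 + x) for x, y in [(0, 0), (32, 0), (0, 41), (52, 53)]]
vote_table = recognize(multimap, query)
```

A real system would also have to tolerate missing or spurious feature points and side lengths that fall near a bin boundary, which this sketch does not address.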
Abstract:
This method non-sequentially compares a reference sequence of tokens to an original sequence of tokens to determine subsequences of tokens that match exactly or approximately. The method takes a novel approach to creating a large number of indexes by partitioning strings of tokens into substrings, appending noncontiguous substrings together to form tuples, and creating indexes from the tuples. Indexes are created in this manner for both the original and reference strings. Techniques are also provided to approximately or exactly locate, in the original sequence of tokens, the substrings that were used to create the tuples and indexes. Original and reference indexes are compared and matches are tracked. Higher numbers of matches result in higher scores (votes) in a table and indicate a stronger similarity between the sequences in the original and reference strings. Using this method, the degree of similarity can also be determined. The method is useful when comparing a reference sequence of tokens to a large database of original strings of tokens. It has applications in the biological sciences (human genome mapping or analyzing proteins) and in image, speech, and music recognition.
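A minimal sketch of the indexing and voting steps follows. It assumes fixed-length substrings of `k` tokens and tuples made of two substrings separated by a gap of 1 to `max_gap` tokens; the abstract fixes none of these details, and the names `tuple_indexes`, `build_table`, and `vote` are hypothetical.

```python
from collections import defaultdict


def tuple_indexes(tokens, k=2, max_gap=3):
    """Yield (index, position) pairs built from two noncontiguous substrings."""
    n = len(tokens)
    for i in range(n - k + 1):
        first = tuple(tokens[i:i + k])
        for gap in range(1, max_gap + 1):
            j = i + k + gap          # start of the second substring
            if j + k > n:
                break
            second = tuple(tokens[j:j + k])
            yield (first, gap, second), i


def build_table(originals, k=2, max_gap=3):
    """Map each index to the original strings and positions that produced it."""
    table = defaultdict(list)
    for name, tokens in originals.items():
        for index, position in tuple_indexes(tokens, k, max_gap):
            table[index].append((name, position))
    return table


def vote(table, reference, k=2, max_gap=3):
    """Tally one vote per matching index for each original string."""
    votes = defaultdict(int)
    for index, _ in tuple_indexes(reference, k, max_gap):
        for name, _position in table.get(index, ()):
            votes[name] += 1
    return dict(votes)


# Usage: index two original strings and score a short reference against them.
table = build_table({"s1": "ACGTACGTTT", "s2": "GGGGGGCCCC"})
scores = vote(table, "ACGTAC")
```

Storing the position alongside each index is what would allow the matching substrings to be located in the original sequence afterwards.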
Abstract:
Generally, the present invention applies a transformation to convert a probability distribution of gene expression signals in control samples to a uniform distribution. The uniform distribution allows better comparisons between expression levels for genes. The transformation is derived from gene expression signals of control data, and is applied to gene expression signals of phenotype data. The phenotype data can be represented in a matrix format. A number of gene expression patterns may be determined (in the form of submatrices) that will characterize the phenotype. The uniform distribution helps in this regard, as it allows better comparisons of patterns. The gene expression patterns can then be used to classify samples as belonging to the phenotype set. Preferably, a discriminant function is used to compare a sample with the gene expression patterns that characterize the phenotype. The discriminant function can determine a score that can be used to determine whether the sample belongs to the phenotype.
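One way to obtain such a transformation, offered as a sketch only, is the empirical cumulative distribution function of the control signals for a gene: applied to the control values themselves it gives approximately uniform values on [0, 1]. The abstract does not say which transformation is used, so the empirical CDF is an assumption here, and `make_uniform_transform` is a hypothetical name.

```python
from bisect import bisect_right


def make_uniform_transform(control_values):
    """Build a transform from control signals using their empirical CDF.

    Assumed approach: the abstract does not specify the transformation.
    """
    sorted_controls = sorted(control_values)
    n = len(sorted_controls)
    if n == 0:
        raise ValueError("at least one control value is required")

    def transform(signal):
        # Fraction of control values less than or equal to the signal.
        return bisect_right(sorted_controls, signal) / n

    return transform


# Usage: derive the transform from control data, apply it to phenotype data.
to_uniform = make_uniform_transform([1.0, 2.0, 3.0, 4.0])
phenotype_row = [to_uniform(value) for value in [0.5, 2.0, 2.5, 10.0]]
```

After the transform, a value such as 0.5 has the same meaning for every gene (the median of that gene's control signals), which is what makes expression levels comparable across genes.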
Abstract:
The systems and methods provide a dynamic process for obtaining and managing informed consent documentation. In general, the dynamic informed consent process (DICP) makes use of an intermediary organization, e.g., a trusted intermediary, which: (a) provides informed consent forms (ICFs) that have been dynamically generated for a specified trial or medical procedure and based on particular state or federal requirements, if any; and (b) archives copies of signed ICFs. In certain preferred embodiments, there may also be a procedure to provide training materials, such as audio or video presentations, to be viewed by prospective participants. In certain preferred embodiments, the process also includes contacting subjects who have signed ICFs in the event that there is a change of circumstance which the subject may deem material to whether s/he would continue to consent, or whether the participant needs to provide a different type of consent to participate in a particular event or trial.
Abstract:
A method (as well as a system and signal-bearing medium) of processing biometric data includes receiving biometric data including a data set P, selecting a secure hash function h, and for each data set P to be collected, computing h(P), destroying the data set P, and storing h(P) in a database, wherein data set P cannot be extracted from h(P).
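The compute, destroy, and store steps can be sketched as below. SHA-256 is assumed as the secure hash function h; the abstract does not name one. The functions `enroll` and `verify` and the dictionary standing in for the database are hypothetical.

```python
import hashlib
import hmac


def enroll(database, subject_id, data_set):
    """Store h(P) for a subject, then destroy the data set P.

    `data_set` must be a mutable bytearray so it can be overwritten.
    SHA-256 is an assumed choice of secure hash function.
    """
    database[subject_id] = hashlib.sha256(bytes(data_set)).hexdigest()
    for i in range(len(data_set)):
        data_set[i] = 0  # overwrite P in place; only h(P) is kept


def verify(database, subject_id, candidate):
    """Check a candidate data set against the stored hash."""
    stored = database.get(subject_id, "")
    digest = hashlib.sha256(bytes(candidate)).hexdigest()
    return hmac.compare_digest(stored, digest)


# Usage: enroll a template, then check candidates against the stored hash.
database = {}
template = bytearray(b"sample-template")
enroll(database, "alice", template)
```

Because a secure hash changes completely when any bit of its input changes, `verify` succeeds only for a byte-identical data set. Overwriting the buffer in Python also does not guarantee that no other copy of P remains in memory, so this illustrates the data flow rather than a hardened implementation.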