-
Publication No.: US20170293714A1
Publication date: 2017-10-12
Application No.: US15485767
Filing date: 2017-04-12
Abstract: A method and system for determining interaction sites between biosequences are described herein. A dataset of contact data for a plurality of biomolecule pairs is obtained to account for their frequency of occurrence. Statistical weights are determined for each frequency of occurrence. Each vector of a statistical residual vector space (SRV) is decomposed through principal component decomposition. The vectors of the SRV are re-projected to a new SRV with a new set of coordinates. A feature vector is generated and input into a predictor that outputs a likelihood of an interaction site.
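The abstract does not say which statistical weight is attached to each frequency of occurrence. As a minimal sketch, assuming the weight is the adjusted standardized residual of a two-way contingency table (an assumption, not a detail confirmed by the abstract), the counting and weighting steps might look like the following. The function name `adjusted_residuals` and the toy residue pairs are hypothetical.

```python
from collections import Counter
from math import sqrt


def adjusted_residuals(pairs):
    """Return {(a, b): adjusted standardized residual} for every a/b combination.

    ASSUMPTION: the "statistical weight" is the adjusted standardized residual
    (observed - expected) / sqrt(expected * (1 - row_share) * (1 - col_share)).
    """
    pairs = list(pairs)
    n = len(pairs)
    pair_counts = Counter(pairs)                    # frequency of occurrence of each pair
    left_counts = Counter(a for a, _ in pairs)      # marginal counts, first element
    right_counts = Counter(b for _, b in pairs)     # marginal counts, second element

    residuals = {}
    for a, count_a in left_counts.items():
        for b, count_b in right_counts.items():
            observed = pair_counts[(a, b)]          # 0 if the pair never occurs
            expected = count_a * count_b / n        # expected count under independence
            variance = expected * (1 - count_a / n) * (1 - count_b / n)
            residuals[(a, b)] = (
                (observed - expected) / sqrt(variance) if variance > 0 else 0.0
            )
    return residuals


# Hypothetical contact data: residue on molecule 1, residue on molecule 2.
contacts = [("K", "D")] * 8 + [("K", "L")] * 2 + [("A", "D")] * 2 + [("A", "L")] * 8
sr = adjusted_residuals(contacts)
```

Under this assumption, a residual well above zero marks a pair that co-occurs more often than independence would predict, and a residual well below zero marks a pair that co-occurs less often.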
-
Publication No.: US20210304842A1
Publication date: 2021-09-30
Application No.: US17345699
Filing date: 2021-06-11
Abstract: A method and system for determining interaction sites between biosequences are described herein. A dataset of contact data for a plurality of biomolecule pairs is obtained to account for their frequency of occurrence. Statistical weights are obtained for each frequency of occurrence. A statistical residual vector space (SRV) is decomposed through principal component decomposition. The r-vectors of the SRV are re-projected to a new SRV with a new set of SR coordinates. A feature vector is generated and input into a predictor that outputs a likelihood of an interaction site. A method and system for determining significant attribute-value associations (AVAs) from relational datasets are also described. A frequency of occurrence of attribute-value pairs, and a statistical weight for each frequency of occurrence, may be obtained. Principal component decomposition and re-projection of AVA vectors may also be performed. The disentangled SR of AVAs can be used to identify AVAs related to subgroups/classes.
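The decomposition and re-projection steps are described only at a high level. The sketch below illustrates one standard reading: take the rows of an SR matrix as the r-vectors, find the first principal component of their covariance matrix, and re-project each row onto that component. Power iteration is used here only to keep the example dependency-free; it is a swapped-in technique, not something the abstract names. The function names and the toy matrix are hypothetical.

```python
from math import sqrt


def column_means(rows):
    """Mean of each column of a list-of-lists matrix."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def first_principal_component(rows, iterations=500):
    """Approximate the leading eigenvector of the column covariance matrix.

    ASSUMPTION: plain power iteration stands in for a full eigendecomposition.
    """
    means = column_means(rows)
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    d = len(means)
    cov = [
        [sum(c[i] * c[j] for c in centered) / (len(rows) - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [float(j + 1) for j in range(d)]  # arbitrary deterministic start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [x / norm for x in w]
    norm_v = sqrt(sum(x * x for x in v))
    return [x / norm_v for x in v]  # unit-length component


def reproject(rows, pc):
    """Re-project each centered row onto one component: score * pc."""
    means = column_means(rows)
    reprojected = []
    for row in rows:
        score = sum((x - m) * p for x, m, p in zip(row, means, pc))
        reprojected.append([score * p for p in pc])
    return reprojected


# Hypothetical 3 x 3 SR matrix; each row is the r-vector of one attribute value.
sr_matrix = [
    [4.0, -3.0, -1.0],
    [-3.0, 5.0, -2.0],
    [-1.0, -2.0, 3.0],
]
pc1 = first_principal_component(sr_matrix)
rsrv1 = reproject(sr_matrix, pc1)
```

Each row of `rsrv1` holds the part of the original SR values that is explained by the first component alone, in centered coordinates. Repeating the step for further components would give one re-projected space per component.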
-
Publication No.: US20200301949A1
Publication date: 2020-09-24
Application No.: US16823627
Filing date: 2020-03-19
Applicant: Andrew Ka-Ching WONG, Peiyuan ZHOU
Inventor: Andrew Ka-Ching WONG, Peiyuan ZHOU
Abstract: A system and method for processing relational datasets are provided. The method may include: retrieving a relational dataset containing a plurality of entities and a plurality of attribute values; constructing an entity address table, based on the relational dataset, wherein the entity address table contains the plurality of attribute values, and each of the plurality of attribute values is associated with one or more entity addresses in the relational dataset; generating a frequency table, based on the entity address table, wherein the frequency table contains one or more cardinality values; generating an SR vector space (SRV) table comprising a plurality of SR values for a plurality of pairs of attribute values; generating principal components (PCs) and their corresponding re-projected SRVs (RSRVs) by disentangling the SRV into a plurality of disentangled spaces (DS); selecting, from the plurality of DS, a subset of DS; and generating one or more patterns based on the plurality of DS.
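The first two table-building steps can be sketched as follows. This is an illustration under stated assumptions: an entity address is taken to be a row index, and a cardinality value is taken to be the size of the intersection of two address sets. The abstract confirms neither detail, and the function names and toy records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_address_table(records):
    """Map each (attribute, value) to the set of row indices that hold it.

    ASSUMPTION: an "entity address" is simply the record's row index.
    """
    table = defaultdict(set)
    for address, record in enumerate(records):
        for attribute, value in record.items():
            table[(attribute, value)].add(address)
    return dict(table)


def build_frequency_table(address_table):
    """Co-occurrence count for every pair of values from different attributes.

    ASSUMPTION: the "cardinality value" is the size of the intersection of the
    two address sets, so the raw records are never rescanned.
    """
    frequencies = {}
    for av1, av2 in combinations(sorted(address_table), 2):
        if av1[0] != av2[0]:  # skip pairs of values from the same attribute
            frequencies[(av1, av2)] = len(address_table[av1] & address_table[av2])
    return frequencies


# Hypothetical relational dataset: one dict per entity.
records = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "round"},
    {"color": "green", "shape": "long"},
    {"color": "red", "shape": "long"},
]
address_table = build_address_table(records)
frequency_table = build_frequency_table(address_table)
```

Under these assumptions, each frequency is obtained by a single set intersection, and the resulting counts are the input from which SR values for each attribute-value pair would then be computed.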
-
-