Indexing of large scale patient set

    Publication No.: US10242085B2

    Publication date: 2019-03-26

    Application No.: US15059392

    Application date: 2016-03-03

    Inventors: Fei Wang, Jun Wang

    IPC classification: G06F17/30 G16H10/60 G16H50/70

    Abstract: Systems and methods for indexing data include formulating an objective function to index a dataset, a portion of the dataset including supervision information. A data property component of the objective function is determined, which utilizes a property of the dataset to group data of the dataset. A supervised component of the objective function is determined, which utilizes the supervision information to group data of the dataset. The objective function is optimized using a processor based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
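
The abstract above does not give the form of the objective function. As a minimal sketch, assume the data property component is the variance of the points along a projection direction `w` and the supervised component rewards directions that agree with pairwise labels; both choices, the matrix `S`, the weight `lam`, and the function name `split_node` are illustrative assumptions, not the patented method. The sketch maximizes the combined objective with an eigendecomposition and splits the node into two children by the sign of the projection.

```python
import numpy as np

def split_node(X, S, lam=1.0):
    """Partition the rows of X into two child nodes.

    X   : (n, d) feature matrix for the points in the node.
    S   : (n, n) symmetric pairwise supervision matrix (assumed encoding):
          +1 same group, -1 different group, 0 no supervision.
    lam : weight of the supervised component (hypothetical parameter).
    """
    Xc = X - X.mean(axis=0)              # center the data
    data_term = Xc.T @ Xc                # data property: variance along w
    sup_term = Xc.T @ S @ Xc             # supervised: agreement with labels
    M = data_term + lam * sup_term
    M = (M + M.T) / 2.0                  # guard against round-off asymmetry
    _, eigvecs = np.linalg.eigh(M)       # eigenvalues in ascending order
    w = eigvecs[:, -1]                   # unit vector maximizing w^T M w
    proj = Xc @ w
    left = np.flatnonzero(proj <= 0)     # indices sent to the first child
    right = np.flatnonzero(proj > 0)     # indices sent to the second child
    return w, left, right

# Toy data: two clusters, with supervision on only a few pairs.
X = np.array([[-2.0, 0.0], [-2.1, 0.1], [2.0, 0.0], [2.1, -0.1]])
S = np.zeros((4, 4))
S[0, 1] = S[1, 0] = 1     # points 0 and 1 belong together
S[2, 3] = S[3, 2] = 1     # points 2 and 3 belong together
S[0, 2] = S[2, 0] = -1    # points 0 and 2 belong apart
w, left, right = split_node(X, S)
```

Applying `split_node` recursively to each child would yield a tree-shaped index. The sign of `w` is arbitrary, so which cluster ends up in `left` can differ between runs or platforms.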

    Indexing and searching heterogenous data entities

    Publication No.: US10216778B2

    Publication date: 2019-02-26

    Application No.: US15088526

    Application date: 2016-04-01

    Inventors: Fei Wang, Jun Wang

    IPC classification: G06F7/02 G06F17/30

    Abstract: A method of performing a search of heterogeneous data based on an input query includes generating an index including at least two hash tables, where each hash table corresponds to a different data domain of the heterogeneous data and includes hash code sets, and where at least one of the hash code sets is mapped to a hash code set of another one of the tables. The method further includes performing a hash on the input query to generate a hash code; determining, by referring to the index, a first hash code set that the generated hash code belongs to and a second hash code set that the determined first hash code set is mapped to; and providing at least one result based on the determined second hash code set.
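
The lookup path described above (hash the query, find its hash code set, follow the mapping into another domain's table, return results) can be sketched as follows. The sign-of-projection hash, the class `HeterogeneousIndex`, the domain names, and the item identifiers are all hypothetical; the abstract does not specify the hash function or how the mapping between hash code sets is learned.

```python
import numpy as np

def hash_code(vec, planes):
    """Toy hash (an assumption): one bit per hyperplane, set if the dot product is positive."""
    bits = planes @ np.asarray(vec, dtype=float) > 0
    return "".join("1" if b else "0" for b in bits)

class HeterogeneousIndex:
    def __init__(self):
        self.code_sets = {}   # domain -> set id -> frozenset of hash codes
        self.items = {}       # domain -> set id -> list of stored item ids
        self.links = {}       # (domain, set id) -> (other domain, set id)

    def add_code_set(self, domain, set_id, codes, items=()):
        self.code_sets.setdefault(domain, {})[set_id] = frozenset(codes)
        self.items.setdefault(domain, {})[set_id] = list(items)

    def link(self, src, dst):
        """Map a hash code set in one table to a hash code set in another."""
        self.links[src] = dst

    def find_set(self, domain, code):
        for set_id, codes in self.code_sets.get(domain, {}).items():
            if code in codes:
                return set_id
        return None

    def search(self, query_vec, planes, query_domain):
        code = hash_code(query_vec, planes)            # hash the input query
        first = self.find_set(query_domain, code)      # first hash code set
        target = self.links.get((query_domain, first))
        if first is None or target is None:
            return []
        domain, second = target                        # second hash code set
        return list(self.items[domain][second])

planes = np.eye(2)   # two axis-aligned hyperplanes, so codes have two bits
index = HeterogeneousIndex()
index.add_code_set("symptoms", "s0", {"10", "11"})
index.add_code_set("drugs", "d0", {"01"}, items=["drug-1", "drug-7"])
index.link(("symptoms", "s0"), ("drugs", "d0"))

hits = index.search([0.5, -1.0], planes, "symptoms")
misses = index.search([-1.0, -1.0], planes, "symptoms")
```

Here the query hashes to `"10"`, which belongs to set `s0` in the symptoms table; `s0` is mapped to `d0` in the drugs table, so the results come from a different data domain than the query.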

    Mapping relationships using electronic communications data

    Publication No.: US10127300B2

    Publication date: 2018-11-13

    Application No.: US14138799

    Application date: 2013-12-23

    Abstract: A pairwise relationship data set has multiple attributes (such as who, what, when, where, and how), with the what attribute (also called the topic attribute) having a word dimension and a people dimension. The data in the topic dimension of the what attribute relates to topics (including other people) relating to the specific, human, personal relationship between the first person and the second person of the pairwise pair. The what attribute data is derived by processing basis data, which includes correspondence data (that is, the substance of correspondence that the first and second persons participate in, including instant messaging and e-mail exchanges). Pairwise relationship data is displayed to a user in real time during a chat session.
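
One way to picture the data set described above is as a record per pair of people whose what attribute holds two counters, one for the word dimension and one for the people dimension. The sketch below is illustrative only: the class names, the message fields (`from`, `to`, `time`, `channel`, `text`), the whitespace tokenizer, and the `known_people` lookup are assumptions, and the where attribute is omitted for brevity.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class WhatAttribute:
    words: Counter = field(default_factory=Counter)    # word dimension
    people: Counter = field(default_factory=Counter)   # people dimension

@dataclass
class PairwiseRelationship:
    first: str                                         # "who": first person
    second: str                                        # "who": second person
    what: WhatAttribute = field(default_factory=WhatAttribute)
    when: list = field(default_factory=list)           # message timestamps
    how: Counter = field(default_factory=Counter)      # channel counts

def build_relationship(first, second, messages, known_people):
    """Derive the attributes from correspondence between the two people."""
    rel = PairwiseRelationship(first, second)
    for msg in messages:
        if {msg["from"], msg["to"]} != {first, second}:
            continue                                   # not this pair
        rel.when.append(msg["time"])
        rel.how[msg["channel"]] += 1
        for raw in msg["text"].lower().split():
            token = raw.strip(".,!?")
            if not token:
                continue
            if token in known_people:
                rel.what.people[token] += 1            # topic is a person
            else:
                rel.what.words[token] += 1             # topic is a word
    return rel

messages = [
    {"from": "alice", "to": "bob", "time": "2013-12-01T09:00",
     "channel": "im", "text": "Did Carol approve the budget?"},
    {"from": "bob", "to": "alice", "time": "2013-12-01T09:05",
     "channel": "email", "text": "Carol signed the budget."},
    {"from": "alice", "to": "dave", "time": "2013-12-01T12:00",
     "channel": "im", "text": "Lunch?"},
]
rel = build_relationship("alice", "bob", messages, known_people={"carol", "dave"})
```

A chat client could rebuild or incrementally update such a record as messages arrive and render `rel.what` alongside the conversation.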

    Visualizing conflicts in online messages

    Publication No.: US09779161B2

    Publication date: 2017-10-03

    Application No.: US14994456

    Application date: 2016-01-13

    IPC classification: G06F17/30 G06Q10/00 H04L12/58

    Abstract: Visualizing social media conflict is provided. Active users are selected from a set of human users, namely those who have authored more than a threshold number of textual messages regarding a particular topic. Keywords are selected that occur more than a threshold number of times within the textual messages regarding the particular topic. A sentiment score is computed for each of the keywords occurring more than the threshold number of times within the textual messages using a keyword co-occurrence graph. A sentiment of each of the active users is determined based on the computed sentiment scores of the selected keywords authored by that active user. Two distinct groups from the active users are selected based on at least one of a relationship between the two distinct groups and a determined degree of conflict between the two distinct groups with regard to the particular topic.
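
The pipeline above (select active users, select frequent keywords, score keywords over a co-occurrence graph, score users) can be sketched end to end. The abstract does not say how the graph produces sentiment scores; this sketch assumes a small seed lexicon whose scores are propagated to neighboring keywords by weighted averaging. The function names, the seed words, and the whitespace tokenizer are likewise assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

def active_users(messages, min_msgs):
    """Users who authored more than min_msgs messages."""
    counts = Counter(user for user, _ in messages)
    return {u for u, c in counts.items() if c > min_msgs}

def frequent_keywords(messages, min_count):
    """Words occurring more than min_count times across all messages."""
    counts = Counter(w for _, text in messages for w in text.lower().split())
    return {w for w, c in counts.items() if c > min_count}

def cooccurrence_graph(messages, keywords):
    """Edge weight = number of messages in which two keywords appear together."""
    graph = defaultdict(Counter)
    for _, text in messages:
        present = sorted(set(text.lower().split()) & keywords)
        for a, b in combinations(present, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def keyword_sentiment(graph, keywords, seeds, rounds=10):
    """Assumed scheme: seed scores stay fixed, others average their neighbors."""
    score = {w: seeds.get(w, 0.0) for w in keywords}
    for _ in range(rounds):
        updated = {}
        for w in keywords:
            total = sum(graph[w].values())
            if w in seeds or total == 0:
                updated[w] = score[w]
            else:
                updated[w] = sum(score[n] * c for n, c in graph[w].items()) / total
        score = updated
    return score

def user_sentiment(messages, scores, users):
    """Mean score of the selected keywords each active user wrote."""
    seen = defaultdict(list)
    for user, text in messages:
        if user in users:
            seen[user].extend(scores[w] for w in text.lower().split() if w in scores)
    return {u: sum(v) / len(v) for u, v in seen.items() if v}

messages = [("ann", "tax good"), ("ann", "tax good"), ("ann", "policy good"),
            ("bob", "tax bad"), ("bob", "tax bad"), ("bob", "policy bad")]
users = active_users(messages, min_msgs=2)
keywords = frequent_keywords(messages, min_count=1)
graph = cooccurrence_graph(messages, keywords)
scores = keyword_sentiment(graph, keywords, seeds={"good": 1.0, "bad": -1.0})
sentiment = user_sentiment(messages, scores, users)
```

Splitting the active users by the sign of `sentiment` gives one simple candidate for the two opposing groups; measuring the degree of conflict between them is left out of this sketch.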

    Evidence Boosting in Rational Drug Design and Indication Expansion by Leveraging Disease Association

    Publication No.: US20170124469A1

    Publication date: 2017-05-04

    Application No.: US14929995

    Application date: 2015-11-02

    IPC classification: G06N5/04 G06F19/12

    CPC classification: G16B5/00 G16C20/50

    Abstract: An embodiment of the invention receives input including a list of drugs, drug characteristics of each drug, and known drug-disease associations including a disease and a drug having a threshold efficacy for treating the disease. For each drug in the list of drugs, a processor predicts whether the drug meets a threshold efficacy for treating a first disease based on the drug characteristics and the drug-disease associations. For each drug in the list of drugs, the processor predicts whether the drug meets a threshold efficacy for treating a second disease based on the drug characteristics and the predicting of whether the drug meets the threshold efficacy for treating the first disease. Output is generated based on the predictions, the output including an identified drug-disease association, an identified disease-disease association, an identified chemical fingerprint for the first disease, and an identified chemical fingerprint for the second disease.
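
The two-stage prediction described above, where the first-disease prediction becomes an input to the second-disease prediction, resembles a classifier chain. The sketch below illustrates that chaining with a small logistic regression written in NumPy. The model choice, the binary feature encoding, the toy labels, and all function names are assumptions; the abstract does not name a specific learning method.

```python
import numpy as np

def _with_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Plain gradient-descent logistic regression (illustrative model choice)."""
    Xb = _with_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(X, w):
    return 1.0 / (1.0 + np.exp(-_with_bias(X) @ w))

def chained_predictions(features, known_d1, known_d2, threshold=0.5):
    """Predict disease 1 from drug features, then disease 2 from the
    features plus the disease-1 prediction."""
    w1 = fit_logistic(features, known_d1)
    p1 = predict_proba(features, w1)
    chained = np.hstack([features, p1[:, None]])   # append stage-1 output
    w2 = fit_logistic(chained, known_d2)
    p2 = predict_proba(chained, w2)
    return p1 >= threshold, p2 >= threshold

# Hypothetical drug characteristics: one row per drug, one column per
# binary chemical feature.
features = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])
known_d1 = np.array([1.0, 1.0, 0.0, 0.0])   # known associations, disease 1
known_d2 = np.array([1.0, 1.0, 0.0, 0.0])   # known associations, disease 2
meets_d1, meets_d2 = chained_predictions(features, known_d1, known_d2)
```

In this toy setup the learned weights of each stage give a rough picture of which features drive each disease's predictions, which is one loose reading of a per-disease chemical fingerprint; the weight on the appended column hints at how strongly the two diseases are associated.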