MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING

    Publication No.: US20230420085A1

    Publication Date: 2023-12-28

    Application No.: US17850763

    Application Date: 2022-06-27

    Abstract: This disclosure describes a machine learning system that includes a contrastive learning based two-tower model for retrieval of relevant chemical reaction procedures given a query chemical reaction. The two-tower model uses attention-based transformers and neural networks to convert tokenized representations of chemical reactions and chemical reaction procedures to embeddings in a shared embedding space. Each tower can include a transformer network, a pooling layer, a normalization layer, and a neural network. The model is trained with labeled data pairs that include a chemical reaction and the text of a chemical reaction procedure for that chemical reaction. New queries can locate chemical reaction procedures for performing a given chemical reaction as well as procedures for similar chemical reactions. The architecture and training of the model make it possible to perform semantic matching based on chemical structures. The model is highly accurate, providing an average recall at K=5 of 95.9%.
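As a rough illustration of the retrieval step and the recall-at-K metric mentioned in this abstract, the sketch below ranks procedure embeddings by cosine similarity to a reaction embedding in a shared space. It is not the patented model: the toy two-dimensional vectors stand in for tower outputs, and the names `l2_normalize`, `top_k`, and `recall_at_k` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (the role of a normalization layer)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, candidates, k):
    """Return indices of the k candidates most similar to the query (cosine)."""
    q = l2_normalize(query)
    scored = [(dot(q, l2_normalize(c)), idx) for idx, c in enumerate(candidates)]
    # Highest similarity first; ties broken by lower index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scored[:k]]


def recall_at_k(queries, candidates, gold, k):
    """Fraction of queries whose labeled candidate appears in the top k."""
    hits = sum(1 for q, g in zip(queries, gold) if g in top_k(q, candidates, k))
    return hits / len(queries)


# Toy stand-ins for tower outputs (assumed values, not real embeddings).
procedures = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
reactions = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]]
gold = [0, 1, 2]  # reaction i is paired with procedure gold[i]

print(top_k(reactions[0], procedures, 2))
print(recall_at_k(reactions, procedures, gold, 1))
```

Because both towers map into the same space, a single dot product over normalized vectors is enough to compare a reaction with a procedure; the neighbors just below the top hit correspond to procedures for similar reactions.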

    GENERATING REAL-TIME INFERRED NETWORK GRAPHS

    Publication No.: US20230394722A1

    Publication Date: 2023-12-07

    Application No.: US17833221

    Application Date: 2022-06-06

    IPC Classification: G06T11/20 G06F16/901

    Abstract: The present disclosure relates to systems, methods, and computer-readable media for utilizing an interactive graphing system to achieve improved dataset exploration utilizing an intelligent workflow and an interactive user interface. More specifically, the interactive graphing system facilitates generating updated network graphs that include inferred user influences based on implicit user action. Indeed, the interactive graphing system can automatically generate and present a user with an updated network graph that includes added, removed, or subsetted elements and relationships that are otherwise hidden from a user. Additionally, the interactive graphing system facilitates network graph exploration and processing of customized combined network graphs that join otherwise separate network graphs.

    METHODS AND SYSTEMS FOR AUTOMATICALLY PREDICTING CLINICAL STUDY OUTCOMES

    Publication No.: US20220344008A1

    Publication Date: 2022-10-27

    Application No.: US17377320

    Application Date: 2021-07-15

    Abstract: The methods and systems may improve the development of protocol documents used for clinical trials. The methods and systems may automatically estimate the likelihood of success or failure of executing a protocol document for a clinical study using a machine learning model that leverages several hundred thousand past protocol documents and the outcomes of the clinical studies. The methods and systems may highlight sections of the protocol document that may increase a likelihood of an unsuccessful execution of the protocol document and may provide one or more recommendations to improve the highlighted sections of the protocol document.

    SYSTEMS AND METHODS FOR PROCEDURE OPTIMIZATION

    Publication No.: US20220237422A1

    Publication Date: 2022-07-28

    Application No.: US17160188

    Application Date: 2021-01-27

    Abstract: Procedural optimization is facilitated by receiving user input for creating or modifying a body of text comprising a procedure, detecting one or more procedural steps associated with the procedure using a procedural step detection module, automatically searching within a corpus of references for one or more related procedural steps using a related procedural step extraction module, automatically identifying one or more outcomes within the corpus of references associated with the one or more related procedural steps using an outcome extraction module, automatically determining whether the one or more outcomes comprise detrimental results using an outcome analysis module, and, in response to determining a set of detrimental outcomes from the one or more outcomes that comprise detrimental results, presenting a detriment indicator within the user interface in association with the one or more procedural steps.

    MOLECULE EMBEDDING USING GRAPH NEURAL NETWORKS AND MULTI-TASK TRAINING

    Publication No.: US20220180201A1

    Publication Date: 2022-06-09

    Application No.: US17208110

    Application Date: 2021-03-22

    IPC Classification: G06N3/08 G06N3/04

    Abstract: An embedding model maps a graph representation of a molecule to an embedding space. The embedding model may include one or more graph neural network layers that use a message passing framework and one or more attention layers. The one or more attention layers may determine an edge weight for each message received by a receiving node from one or more sending nodes. The edge weight may be based on features of the receiving node and features of the one or more sending nodes. The one or more graph neural network layers may determine embedded features for the graph based on the messages and the edge weights. The embedding model may determine molecule features for the molecule based on the embedded features. The molecule features may map to an embedding space. The embedding model may be trained using multi-task training to generate a more generic embedding space.
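The message-passing step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented model: the edge score is a plain dot product of receiver and sender features, the weights are normalized with a softmax over each node's senders, and the molecule-level readout is a mean over nodes. All three choices, and the function names, are assumptions; no learned parameters are included.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_message_pass(features, neighbors):
    """One round of message passing with attention-derived edge weights.

    features:  dict mapping node -> feature vector (list of floats)
    neighbors: dict mapping receiving node -> list of sending nodes
    """
    updated = {}
    for node, senders in neighbors.items():
        if not senders:
            updated[node] = list(features[node])
            continue
        # Edge score depends on both receiver and sender features
        # (dot product is an assumed scoring function).
        scores = [sum(a * b for a, b in zip(features[node], features[s]))
                  for s in senders]
        weights = softmax(scores)
        dim = len(features[node])
        # Aggregate the incoming messages, weighted by their edge weights.
        updated[node] = [sum(w * features[s][d] for w, s in zip(weights, senders))
                         for d in range(dim)]
    return updated


def readout(node_features):
    """Mean-pool node features into one molecule-level vector (assumed readout)."""
    n = len(node_features)
    dim = len(next(iter(node_features.values())))
    return [sum(f[d] for f in node_features.values()) / n for d in range(dim)]


# Toy three-atom graph: one central atom bonded to two identical atoms.
features = {"O": [1.0, 0.0], "H1": [0.0, 1.0], "H2": [0.0, 1.0]}
neighbors = {"O": ["H1", "H2"], "H1": ["O"], "H2": ["O"]}

embedded = attention_message_pass(features, neighbors)
molecule_vector = readout(embedded)
print(embedded)
print(molecule_vector)
```

In a trained model the scoring function and the feature transforms would carry learned weights, and multi-task training would shape the resulting space; the sketch only shows how edge weights and messages combine.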

    DETERMINING CONCEPT RELATIONSHIPS IN DOCUMENT COLLECTIONS UTILIZING A SPARSE GRAPH RECOVERY MACHINE-LEARNING MODEL

    Publication No.: US20230394239A1

    Publication Date: 2023-12-07

    Application No.: US17833142

    Application Date: 2022-06-06

    IPC Classification: G06F40/295

    CPC Classification: G06F40/295

    Abstract: The present disclosure relates to systems, methods, and computer-readable media for utilizing a concept graphing system to determine and provide relationships between concepts within document collections or corpora. For example, the concept graphing system can generate and utilize machine-learning models, such as a sparse graph recovery machine-learning model, to identify less-obvious correlations between concepts, including positive and negative concept connections, as well as provide these connections within a visual concept graph. Additionally, the concept graphing system can provide a visual concept graph that determines and displays concept correlations based on the input of a single concept, multiple concepts, or no concepts.

    MACHINE-LEARNING OF DOCUMENT PORTION LAYOUT

    Publication No.: US20230074788A1

    Publication Date: 2023-03-09

    Application No.: US17469751

    Application Date: 2021-09-08

    Abstract: Machine learning is used to predict the layout type in which each of a plurality of portions of a document appears. This is done even though the computer-readable representation of the document does not contain information, at the granularity of the prediction to be made, that identifies which layout type each of the plurality of document portions belongs in. For each of a plurality of the portions, the machine-learning system predicts the layout type that the respective portion appears in, and indexes the document using the predictions so as to result in a computer-readable index. The index represents a predicted layout type associated with each of the plurality of portions of the document. Thus, the index can be used to search based on position of a searched term within the document.
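To make the indexing idea in this abstract concrete, the sketch below builds an inverted index keyed by (term, layout type), so that a search can be restricted to terms appearing in a given layout. The layout labels here are hard-coded stand-ins for model predictions, and `build_layout_index` and `search` are hypothetical names; the abstract does not specify the index structure.

```python
from collections import defaultdict


def build_layout_index(portions):
    """Index document portions by (term, predicted layout type).

    portions: list of (text, layout_type) pairs, where layout_type is
    assumed to come from a layout-prediction model.
    """
    index = defaultdict(set)
    for position, (text, layout) in enumerate(portions):
        for term in text.lower().split():
            index[(term, layout)].add(position)
    return index


def search(index, term, layout):
    """Return positions of portions containing term within the given layout type."""
    return sorted(index.get((term.lower(), layout), set()))


# Hard-coded labels stand in for the model's predictions.
portions = [
    ("Solar Panels", "title"),
    ("Efficiency of solar cells", "heading"),
    ("We measure solar output daily", "body"),
]
index = build_layout_index(portions)

print(search(index, "solar", "title"))
print(search(index, "solar", "body"))
```

Keying on the pair rather than the term alone is what lets a query distinguish a term in a title from the same term in body text.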

    CONTENT-BASED MULTIMEDIA RETRIEVAL WITH ATTENTION-ENABLED LOCAL FOCUS

    Publication No.: US20220382800A1

    Publication Date: 2022-12-01

    Application No.: US17332673

    Application Date: 2021-05-27

    Abstract: Examples of the present disclosure describe systems and methods for content-based multimedia retrieval with attention-enabled local focus. In aspects, a search query comprising multimedia content may be received by a search system. A first semantic embedding representation of the multimedia content may be generated. The first semantic embedding representation may be compared to a stored set of candidate semantic embedding representations of other multimedia content. Based on the comparison, one or more candidate representations that are visually similar to the first semantic embedding representation may be selected from the stored set of candidate semantic embedding representations. The candidate representations may be ranked, and top ‘N’ candidate representations (or corresponding multimedia items) may be retrieved and provided as search results for the search query.

    Facilitating Interaction with Plural BOTs Using a Master BOT Framework

    Publication No.: US20200259891A1

    Publication Date: 2020-08-13

    Application No.: US16269571

    Application Date: 2019-02-07

    Inventor: Robin ABRAHAM

    IPC Classification: H04L29/08 H04L12/58

    Abstract: A computer-implemented technique is described herein which uses a master BOT framework to facilitate a user's interaction with plural BOTs. The BOT framework includes a BOT registry that stores information regarding a plurality of BOTs that may be activated to handle different tasks (and associated intents). The BOT framework also includes various components that facilitate the transition from one BOT to another in the course of a multi-BOT transaction. According to one technical feature, the technique automatically invokes a new BOT without requiring the user to explicitly identify it. This provision simplifies the user's activation of a new BOT. According to another feature, the technique automatically forwards current state information to the new BOT. This provision expedites the user's transaction because it reduces the need for the user to repeat information that has already been supplied in one or more prior turns of the transaction.
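The two features highlighted in this abstract, automatic invocation of a new BOT and forwarding of current state, can be sketched as below. This is an illustrative toy, not the patented framework: the class names are hypothetical, the keyword-based `detect_intent` is an assumed stand-in for real intent detection, and the example city and date are made up.

```python
class Bot:
    """A task-specific BOT that reports the state it was handed."""

    def __init__(self, name, intents):
        self.name = name
        self.intents = set(intents)

    def handle(self, state):
        known = ", ".join(f"{k}={v}" for k, v in sorted(state.items()))
        return f"[{self.name}] known so far: {known}"


class MasterBotFramework:
    """Routes each turn to a registered BOT and forwards accumulated state."""

    def __init__(self):
        self.registry = {}   # intent keyword -> Bot
        self.state = {}      # information gathered across turns
        self.active = None   # BOT currently handling the transaction

    def register(self, bot):
        for intent in bot.intents:
            self.registry[intent] = bot

    def detect_intent(self, utterance):
        # Assumed stand-in: the first registered keyword found in the text.
        text = utterance.lower()
        for intent in self.registry:
            if intent in text:
                return intent
        return None

    def handle_turn(self, utterance, slots=None):
        self.state.update(slots or {})
        intent = self.detect_intent(utterance)
        if intent is not None:
            # Switch BOTs without the user naming the new BOT explicitly.
            self.active = self.registry[intent]
        if self.active is None:
            return "No BOT available for this request."
        # Forward a copy of the current state to whichever BOT is active.
        return self.active.handle(dict(self.state))


framework = MasterBotFramework()
framework.register(Bot("FlightBot", ["flight"]))
framework.register(Bot("HotelBot", ["hotel"]))

first = framework.handle_turn("Book a flight to Paris",
                              {"city": "Paris", "date": "2024-05-01"})
second = framework.handle_turn("I also need a hotel")
print(first)
print(second)
```

In the second turn the user never names a BOT or repeats the city and date, yet the hotel BOT receives both, which is the time saving the abstract describes.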

    EXERCISING ARTIFICIAL INTELLIGENCE BY REFINING MODEL OUTPUT

    Publication No.: US20190354632A1

    Publication Date: 2019-11-21

    Application No.: US15985415

    Application Date: 2018-05-21

    IPC Classification: G06F17/30 G06F15/18

    Abstract: The improved exercise of artificial intelligence. Raw output data is obtained by applying an input data set to an artificial intelligence (AI). Such raw output data is sometimes difficult to interpret. The principles defined herein provide a systematic way to refine the output for a wide variety of AI models. An AI model collection characterization structure is utilized for the purpose of refining AI model output so as to be more useful. The characterization structure represents, for each of multiple and perhaps numerous AI models, a refinement of output data that resulted from application of an AI model to input data. Upon obtaining output data from the AI model, the appropriate refinement may then be applied. The refined data may then be semantically indexed to provide a semantic index. The characterization structure may also provide tailored information to allow for intuitive querying against the semantic index.