System and method for performing a search in a vector space based search engine

    Publication No.: US20230138014A1

    Publication date: 2023-05-04

    Application No.: US17918127

    Filing date: 2021-04-10

    Abstract: The invention provides a relevance feedback system and computer-implemented method for performing a search in a vector space comprising a first number of target vectors. The method comprises forming a first search query vector, determining a second number of first search hit vectors among the first number of target vectors based on the first search query vector using a first distance function, determining a third number of flagged vectors, determining a vector subspace spanned by the flagged vectors and/or a second distance function by utilizing the flagged vectors, and determining a plurality of second hit vectors among the target vectors based on the first search query vector and the vector subspace and/or the second distance function.
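
    The two-pass search described in this abstract can be illustrated with a minimal sketch. It is one possible reading, not the claimed implementation: it assumes Euclidean distance as the first distance function, assumes the flagged vectors are picked from the first hits, and defines the second distance as the Euclidean distance between projections onto the subspace spanned by the flagged vectors. All function names and the sample data are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal basis of the subspace spanned by `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip vectors that add no new direction
            basis.append([wi / norm for wi in w])
    return basis

def project(v, basis):
    """Coordinates of v with respect to an orthonormal basis."""
    return [dot(v, b) for b in basis]

def top_k(query, targets, k, distance):
    """Indices of the k targets closest to the query under `distance`."""
    order = sorted(range(len(targets)), key=lambda i: distance(query, targets[i]))
    return order[:k]

def search_with_feedback(query, targets, flagged_ids, k):
    """Second pass: compare vectors only inside the flagged subspace (assumption)."""
    basis = orthonormal_basis([targets[i] for i in flagged_ids])

    def subspace_distance(a, b):
        return euclidean(project(a, basis), project(b, basis))

    return top_k(query, targets, k, subspace_distance)

# Hypothetical sample data: four target vectors in a 3-dimensional space.
targets = [
    [1.0, 0.0, 0.0],
    [0.9, 0.0, 5.0],
    [0.0, 1.0, 0.0],
    [3.0, 3.0, 0.1],
]
query = [1.0, 0.0, 0.2]

first_hits = top_k(query, targets, 2, euclidean)                   # [0, 2]
second_hits = search_with_feedback(query, targets, first_hits, 2)  # [0, 1]
```

    In this example the flagged vectors span the plane of the first two coordinates, so the third coordinate is ignored in the second pass and target 1 moves ahead of target 2.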

    Method of searching patent documents

    Publication No.: US20220004545A1

    Publication date: 2022-01-06

    Application No.: US17284797

    Filing date: 2019-10-13

    Abstract: A method of searching patent documents comprises reading a plurality of patent documents, each comprising a specification and a claim, which are converted into specification graphs and claim graphs. The graphs contain nodes each having a first natural language unit extracted from the specification or claim as a node value, and edges between the nodes determined based on at least one second natural language unit extracted from the specification or claim. A machine learning model is trained using an algorithm capable of travelling through the graphs according to the edges and utilizing said node values for forming a trained machine learning model. The method comprises reading a fresh graph and utilizing the trained machine learning model for determining a subset of patent documents.
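
    The graph structure in this abstract can be pictured with a small sketch. It assumes a tree-shaped graph in which each node value is a noun-like unit and each edge carries the relation word that produced it; the `Node` class, the relation labels, and the sample claim are hypothetical and only illustrate what "travelling through the graphs according to the edges" could look like.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    value: str                                  # first natural language unit (node value)
    edges: list = field(default_factory=list)   # (relation, child Node) pairs

def add_child(parent, relation, value):
    """Attach a child node; `relation` stands in for the second natural language unit."""
    child = Node(value)
    parent.edges.append((relation, child))
    return child

def walk(node, depth=0, relation=None):
    """Depth-first traversal along the edges, yielding (depth, relation, value)."""
    yield depth, relation, node.value
    for child_relation, child in node.edges:
        yield from walk(child, depth + 1, child_relation)

# Hypothetical claim: "A cup comprising a handle and a lid, the lid being made of plastic."
root = Node("cup")
add_child(root, "comprises", "handle")
lid = add_child(root, "comprises", "lid")
add_child(lid, "made of", "plastic")

visited = list(walk(root))
```

    A learning algorithm would consume the node values in an order like `visited`; the traversal order chosen here (depth-first) is an assumption.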

    System for searching natural language documents

    Publication No.: US20210350125A1

    Publication date: 2021-11-11

    Application No.: US17284796

    Filing date: 2019-10-13

    IPC classification: G06K9/00 G06F40/284 G06N3/04

    Abstract: The invention provides a natural language search system and method. The system comprises a digital data storage means for storing a plurality of blocks of natural language and data graphs corresponding to said blocks. First data processing means are adapted to convert said blocks to said graphs, which are stored in said storage means. The graphs contain a plurality of nodes each containing as node value a natural language unit extracted from said blocks. There are also provided second data processing means for executing a machine learning algorithm capable of travelling said graphs and reading the node values for forming a trained machine learning model based on nodal structures of the graphs and node values of the graphs, and third data processing means adapted to read a fresh graph and to utilize said model for determining a subset of said blocks of natural language based on the fresh graph.

    System and method for analyzing similarity of natural language data

    Publication No.: US20220207240A1

    Publication date: 2022-06-30

    Application No.: US17611204

    Filing date: 2020-04-11

    Abstract: The invention provides a system and method for analyzing similarity of natural language data. The system comprises a neural network subsystem adapted for reading graph format input data comprising a plurality of nodes having node values, and a similarity estimation subsystem utilizing the neural network subsystem and being trained for estimating similarity of a first graph and a second graph, the similarity estimation subsystem being capable of producing at least one similarity value. In addition, there is provided a similarity explainability subsystem adapted to calculate importance values for a plurality of nodes or subgraphs of the second graph, which are used to create a reduced second graph indicating sub-blocks of the second block of natural language.
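
    One common way to obtain per-node importance values is occlusion: remove a node, re-score the similarity, and record the change. The sketch below uses that idea under strong simplifications that are not taken from the abstract: graphs are reduced to lists of node values (edges are ignored), and a Jaccard overlap stands in for the trained neural similarity estimator. All names and sample data are hypothetical.

```python
def jaccard(a, b):
    """Stand-in similarity: overlap of node values (not a neural network)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def node_importances(first_nodes, second_nodes, similarity=jaccard):
    """Importance of a node = drop in similarity when it is removed from the second graph."""
    base = similarity(first_nodes, second_nodes)
    importances = {}
    for node in second_nodes:
        occluded = [n for n in second_nodes if n != node]
        importances[node] = base - similarity(first_nodes, occluded)
    return importances

def reduced_graph(first_nodes, second_nodes, keep):
    """Keep the `keep` most important nodes of the second graph."""
    importances = node_importances(first_nodes, second_nodes)
    ranked = sorted(second_nodes, key=lambda n: importances[n], reverse=True)
    return ranked[:keep]

# Hypothetical node values of two graphs.
first = ["cup", "handle", "lid"]
second = ["cup", "handle", "saucer", "spoon"]

importances = node_importances(first, second)
reduced = reduced_graph(first, second, keep=2)   # ["cup", "handle"]
```

    Nodes shared with the first graph receive positive importance, while nodes that only dilute the overlap receive negative importance, so the reduced graph keeps the parts of the second graph that explain the similarity value.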

    Method of training a natural language search system, search system and corresponding use

    Publication No.: US20210397790A1

    Publication date: 2021-12-23

    Application No.: US17284799

    Filing date: 2019-10-13

    Inventor: Sakari Arvela

    Abstract: The invention provides a method and system for training a machine learning-based patent search or novelty evaluation system. The method comprises providing a plurality of patent documents each having a computer-identifiable claim block and specification block, the specification block including at least part of the description of the patent document. The method also comprises providing a machine learning model and training the machine learning model using a training data set comprising data from said patent documents for forming a trained machine learning model. According to the invention, the training comprises using pairs of claim blocks and specification blocks originating from the same patent document as training cases of said training data set.
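
    The pairing rule in this abstract can be sketched as a small data-preparation step. The positive cases (claim block and specification block from the same document) follow the abstract; the negative cases built from different documents, the label values, and all names are assumptions added only to make the example a complete training set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PatentDocument:
    doc_id: str
    claim_block: str
    specification_block: str

def build_training_cases(documents):
    """Return (claim, specification, label) tuples.

    label 1: both blocks come from the same document (as in the abstract).
    label 0: blocks come from different documents (assumption, for contrast).
    """
    cases = []
    n = len(documents)
    for i, doc in enumerate(documents):
        cases.append((doc.claim_block, doc.specification_block, 1))
        if n > 1:
            other = documents[(i + 1) % n]  # deterministic choice of a different document
            cases.append((doc.claim_block, other.specification_block, 0))
    return cases

# Hypothetical documents.
docs = [
    PatentDocument("A", "A cup comprising a handle.", "The cup has a handle for gripping."),
    PatentDocument("B", "A lamp comprising a shade.", "The lamp includes a fabric shade."),
]
cases = build_training_cases(docs)
```

    Because each patent document supplies its own claim and specification, positive cases of this kind need no manual labelling.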

    System and method for generating blocks of natural language

    Publication No.: US10902210B2

    Publication date: 2021-01-26

    Application No.: US16234637

    Filing date: 2018-12-28

    IPC classification: G06F40/30 G06F40/56

    Abstract: The invention relates to a system and method for generating a block of natural language, the system comprising a digital data store capable of storing a data graph according to a data schema, an input sub-system for entering natural language data units to the data graph, and a data processor for generating a block of natural language based on the data graph. Further, the data schema allows storage of recursively nested natural language data units and relation data units associated with the natural language data units into the data graph, the relation data units being configured to define relations between natural language data units in the data graph. The data processor is adapted to generate said block of natural language utilizing a plurality of natural language data units and relations between the natural language data units as defined by the relation data units associated therewith.
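
    A minimal sketch of generating text from recursively nested units and their relations follows. The `Unit` schema, the sentence template, and the sample data are hypothetical; the patent's actual data schema and generation rules are not given in the abstract.

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    text: str                                     # natural language data unit
    relation: str = ""                            # relation to the parent unit
    children: list = field(default_factory=list)  # recursively nested units

def join_items(items):
    """['handle', 'lid'] -> 'a handle and a lid' (assumes singular count nouns)."""
    phrases = [f"a {item}" for item in items]
    if len(phrases) == 1:
        return phrases[0]
    return ", ".join(phrases[:-1]) + " and " + phrases[-1]

def render(unit):
    """Return one sentence per (unit, relation) group, parents before children."""
    sentences = []
    by_relation = {}
    for child in unit.children:
        by_relation.setdefault(child.relation, []).append(child.text)
    for relation, texts in by_relation.items():
        sentences.append(f"The {unit.text} {relation} {join_items(texts)}.")
    for child in unit.children:
        sentences.extend(render(child))
    return sentences

# Hypothetical data graph.
cup = Unit("cup", children=[
    Unit("handle", "comprises"),
    Unit("lid", "comprises", children=[Unit("hinge", "comprises")]),
])

block = " ".join(render(cup))
# "The cup comprises a handle and a lid. The lid comprises a hinge."
```

    Grouping children by relation lets one sentence cover several sibling units, while the recursion carries the nesting of the data graph into the order of the generated sentences.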