Abstract:
Embodiments include a method for reconciling a target data node with a data graph encoding a plurality of interconnected data nodes. The method comprises filtering an initial candidate set of data nodes from among the plurality of interconnected data nodes by performing a partial comparison process of a member of the initial candidate set with the target data node. The partial comparison process comprises using a first set of hash functions to compare a first set of features extracted from each of the member and the target data node, and, if the outcome of the partial comparison process satisfies one or more removal criteria, removing from the initial candidate set both the member and any other members assessed as having a semantic similarity with the member above a semantic similarity threshold. The partial comparison process further comprises repeating the performing, and the removing whenever the removal criteria are satisfied, until the partial comparison process with the target data node has been completed for each remaining member of the initial candidate set. The method further comprises performing full comparison processing between the target data node and each member remaining in the initial candidate set after the filtering, the full comparison processing comprising using a second set of hash functions to compare a second set of features extracted from both the remaining member and the target data node, wherein the second set of hash functions contains more hash functions than the first set of hash functions.
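The two-stage comparison described above can be illustrated with a minimal sketch. It assumes MinHash-style signatures as the hash-based comparison and plain Jaccard overlap as a stand-in for the semantic similarity assessment; the abstract names neither, and the function names, thresholds, and hash counts below are all hypothetical.

```python
import hashlib

def _hash(seed, feature):
    """Seeded hash of one feature (hypothetical stand-in for a hash function family)."""
    digest = hashlib.sha256(f"{seed}:{feature}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def signature(features, num_hashes):
    """MinHash signature: minimum hash value per seeded function (features must be non-empty)."""
    return [min(_hash(seed, f) for f in features) for seed in range(num_hashes)]

def estimated_similarity(a, b, num_hashes):
    """Fraction of signature positions on which the two feature sets agree."""
    sig_a, sig_b = signature(a, num_hashes), signature(b, num_hashes)
    return sum(x == y for x, y in zip(sig_a, sig_b)) / num_hashes

def jaccard(a, b):
    """Assumed stand-in for the semantic similarity between two candidates."""
    return len(a & b) / len(a | b) if a | b else 1.0

def reconcile(target, candidates, first_k=4, second_k=64,
              removal_threshold=0.25, semantic_threshold=0.8):
    """Filter with few hash functions, then fully compare survivors with more."""
    remaining = dict(candidates)
    for name in list(candidates):
        if name not in remaining:
            continue  # already removed together with a similar member
        if estimated_similarity(remaining[name], target, first_k) < removal_threshold:
            removed = remaining.pop(name)
            # Also drop members that are semantically close to the removed one.
            for other in list(remaining):
                if jaccard(remaining[other], removed) > semantic_threshold:
                    del remaining[other]
    # Full comparison: the second set of hash functions is larger than the first.
    return {name: estimated_similarity(feats, target, second_k)
            for name, feats in remaining.items()}
```

The cheap first pass spends only `first_k` hash evaluations per feature, and removing near-duplicates of a rejected member avoids even that cost for them.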
Abstract:
Embodiments include a method, apparatus, program, and system for distributing data items among a plurality of data storage units, the data items being an aggregation of data from a plurality of data sources. The method comprises generating a semantic description of each of the plurality of data sources; calculating, for each pair of data sources from among the plurality of data sources, a degree of similarity between the semantic descriptions of the pair of data sources; and allocating data items to data storage units in dependence upon the calculated degree of similarity between the data source of a data item being allocated and the or each data source of data items already allocated to the data storage units.
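As a rough illustration of the allocation step, the sketch below assumes that each semantic description is a set of terms, that similarity is Jaccard overlap, and that items from similar sources should be co-located. The abstract only says allocation depends on the similarity, so the direction of that dependence, the `empty_score` parameter, and all names are assumptions.

```python
from itertools import combinations

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

def pairwise_similarity(descriptions):
    """Degree of similarity for every pair of data sources."""
    sims = {(s, s): 1.0 for s in descriptions}
    for s, t in combinations(sorted(descriptions), 2):
        sims[(s, t)] = sims[(t, s)] = jaccard(descriptions[s], descriptions[t])
    return sims

def allocate(items, descriptions, num_units, empty_score=0.5):
    """Place each (item_id, source) in the unit whose stored sources are most similar.

    empty_score is a hypothetical score given to an empty unit, so that
    dissimilar data opens a new unit instead of joining an unrelated one.
    """
    sims = pairwise_similarity(descriptions)
    units = [[] for _ in range(num_units)]
    for item_id, source in items:
        def score(unit):
            if not unit:
                return empty_score
            return sum(sims[(source, other)] for _, other in unit) / len(unit)
        best = max(range(num_units), key=lambda i: score(units[i]))
        units[best].append((item_id, source))
    return units
```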
Abstract:
Embodiments of the invention provide a computer-implemented healthcare system arranged to produce personal advice for a user, the system comprising: a goal analyser arranged to receive a healthcare goal, to receive personal risk factors for a user and to provide atomic goals from a combination of the personal risk factors and the healthcare goal; an engine finder arranged to input atomic goals to one or more selected advice generation engines, which provide advice relating to the atomic goals; and an advice filter, which filters the advice from the selected advice generation engines against checking parameters to provide checked advice for output.
Abstract:
A method comprises carrying out a graph entity matching process between first and second graphs in which a first image representing the first graph, and a second image representing an arrangement of graph entities of the second graph, are obtained, a measure of similarity between the first and second images is computed, and a determination is made as to whether a first predetermined condition has been met. When the first predetermined condition has not been met, the graph entity matching process is repeated to compute a similarity measure for a different graph entity arrangement in respect of the second graph. The graph entity arrangement of the second graph which, on the basis of the computed similarity measure(s), provides the closest similarity between the first and second image data is identified as the closest match to the first graph.
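The matching loop can be sketched as follows, under several assumptions not stated in the abstract: the "image" of a graph is its adjacency matrix for a given node ordering, image similarity is the fraction of identical pixels, both graphs have the same number of nodes, and the predetermined condition is a similarity threshold (`stop_at`). All names are hypothetical.

```python
from itertools import permutations

def adjacency_image(edges, order):
    """Render an undirected graph as a 0/1 pixel grid for one node arrangement."""
    index = {node: i for i, node in enumerate(order)}
    n = len(order)
    image = [[0] * n for _ in range(n)]
    for u, v in edges:
        image[index[u]][index[v]] = image[index[v]][index[u]] = 1
    return image

def image_similarity(a, b):
    """Fraction of pixels that are identical in two equally sized images."""
    n = len(a)
    same = sum(a[i][j] == b[i][j] for i in range(n) for j in range(n))
    return same / (n * n)

def closest_arrangement(first_edges, first_nodes, second_edges, second_nodes, stop_at=1.0):
    """Try arrangements of the second graph until the stop condition is met."""
    first_image = adjacency_image(first_edges, first_nodes)
    best_order, best_score = None, -1.0
    for order in permutations(second_nodes):
        score = image_similarity(first_image, adjacency_image(second_edges, order))
        if score > best_score:
            best_order, best_score = order, score
        if best_score >= stop_at:
            break  # predetermined condition met, stop repeating
    return best_order, best_score
```

Exhaustive permutation is only practical for very small graphs; it is used here purely to make the repeat-until-condition structure visible.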
Abstract:
A system to align codes between two coding standards, comprising: an expert mapping module, a syntactical mapping module, and a case-based mapping module; a module adjustment unit; and an alignment unit; wherein the expert mapping module is configured to collect established mappings between pairs of codes of the two coding standards from the internet and/or from machine-readable publications; the syntactical mapping module is configured to access the two coding standards including descriptions for each code, and to find the similarity of pairs of codes of the two coding standards using the descriptions to provide syntactical mappings; the case-based mapping module is configured to access existing cases that are annotated with both coding standards and to find case-based mappings between pairs of codes of the two coding standards; the module adjustment unit is configured to aggregate the mappings from the modules; and the alignment unit is configured to accept input of codes from one of the coding standards and to use the aggregated mappings from the module adjustment unit to extract one or more suitable mappings from each input code to a code of the other coding standard.
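One way to picture the module adjustment unit and the alignment unit is a weighted combination of per-module confidences. This is a sketch only: the abstract does not say how the mappings are aggregated, so the weighting scheme, the module names used as dictionary keys, and the codes in the usage note are hypothetical.

```python
from collections import defaultdict

def aggregate(module_mappings, weights):
    """Combine per-module confidences for each (source_code, target_code) pair.

    module_mappings: {module_name: {(source_code, target_code): confidence}}
    weights: {module_name: weight}, an assumed form of module adjustment.
    """
    combined = defaultdict(float)
    for module, mappings in module_mappings.items():
        for pair, confidence in mappings.items():
            combined[pair] += weights[module] * confidence
    return dict(combined)

def align(code, aggregated, top_n=1):
    """Return the best-scoring target codes for one input code."""
    candidates = [(target, score)
                  for (source, target), score in aggregated.items()
                  if source == code]
    candidates.sort(key=lambda c: (-c[1], c[0]))  # highest score first, then by code
    return candidates[:top_n]
```

For example, with made-up codes `A1`, `B7`, and `B9`, an expert mapping `A1 -> B7` supported by a moderate syntactical match would outrank a purely syntactical `A1 -> B9` whenever the expert module carries the larger weight.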
Abstract:
Embodiments include a querying method for a database of graph data encoded as triples, the triples each comprising values of three triple elements and being stored on a plurality of storage servers, the method comprising: a dividing step comprising dividing a query into a plurality of result criteria, the result criteria comprising a plurality of triple patterns which some or all query results must match, each triple pattern being composed of three triple pattern elements each corresponding to a different one of the three triple elements; each triple pattern element being either: a single value triple pattern element specifying a single value of the corresponding triple element which triples must have to match the triple pattern; or a variable value triple pattern element specifying an ID of a variable, the ID being attributed to values of the corresponding triple elements of triples matching the triple pattern. The method further comprises a sub-query forming step comprising forming one or more sub-queries each comprising two or more triple patterns having the same single value triple pattern element or specifying the same ID of a variable as a variable value triple pattern element; a sub-query issuing step comprising issuing each formed sub-query to each of the plurality of storage servers; and a query result preparing step comprising receiving triples satisfying at least one formed sub-query as sub-query results from the plurality of storage servers and using the sub-query results to prepare query results as a response to the query.
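The sub-query forming step can be sketched as a grouping of triple patterns by shared elements. The sketch assumes that variables are written with a leading `?` (as in SPARQL) and that a shared single value must occur at the same position in both patterns; the abstract does not settle either point, and the function names are hypothetical.

```python
from collections import defaultdict

def pattern_key(position, element):
    """Grouping key: the variable ID, or the (position, value) of a single value."""
    if element.startswith("?"):
        return ("var", element)
    return ("const", position, element)

def form_subqueries(patterns):
    """Group triple patterns that share a variable ID or a single value.

    Only groups with two or more patterns become sub-queries.
    """
    groups = defaultdict(list)
    for pattern in patterns:
        # A set avoids adding a pattern twice when it repeats a variable.
        for key in {pattern_key(pos, el) for pos, el in enumerate(pattern)}:
            groups[key].append(pattern)
    return {key: group for key, group in groups.items() if len(group) >= 2}
```

Each resulting group would then be issued to every storage server, and the returned triples joined on the shared variable or value to prepare the final query results.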
Abstract:
Embodiments of the present invention include a method for assessing the similarity of two concepts from different ontologies, the method comprising: for each concept being assessed: obtaining a core list comprising a label of the concept and each concept that is a parent or ancestor of the concept; and performing a generalisation process. The generalisation process includes: querying a document corpus with each of the labels from a list, and obtaining a query result identifying the one or more most relevant documents; compiling a textual description of the concept by obtaining at least a portion of each of one or more of the most relevant documents; and identifying a set of most significant terms in the textual description, wherein the list in the performing of the generalisation process is the core list. The method further comprises: performing a comparison of the sets of most significant terms of the two concepts to obtain an indication of the similarity between the two concepts.
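A toy version of the generalisation and comparison steps is sketched below. It assumes single-word labels, an in-memory list of strings as the document corpus, term frequency as the measure of both relevance and significance, and Jaccard overlap for the final comparison. None of these choices comes from the abstract, and all names are hypothetical.

```python
import re
from collections import Counter

STOPWORDS = {"a", "an", "the", "of", "and", "is", "in", "to", "with"}

def tokens(text):
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def most_relevant(corpus, label, top_docs=1):
    """Documents mentioning the label, ranked by how often they mention it."""
    label = label.lower()
    ranked = sorted(corpus, key=lambda doc: -tokens(doc).count(label))
    return [doc for doc in ranked[:top_docs] if label in tokens(doc)]

def significant_terms(core_list, corpus, k=3):
    """Compile a textual description from the core list and keep its k top terms."""
    description = " ".join(doc for label in core_list
                           for doc in most_relevant(corpus, label))
    counts = Counter(tokens(description))
    ranked = sorted(counts, key=lambda t: (-counts[t], t))  # ties broken alphabetically
    return set(ranked[:k])

def concept_similarity(core_a, core_b, corpus, k=3):
    """Overlap between the most significant term sets of the two concepts."""
    a = significant_terms(core_a, corpus, k)
    b = significant_terms(core_b, corpus, k)
    return len(a & b) / len(a | b) if a | b else 0.0
```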
Abstract:
A computer system (2) operable to quantify the impact of a physical activity on a body, the computer system comprising: a data model (4, 6) in which one or more physical activities are each decomposed into a plurality of fundamental actions which make up the physical activity and parts of the body and their relationships with one another are defined; an impact value generator (8) having access to the data model (4, 6) and an information source (10); wherein, for a specified physical activity from among the one or more physical activities, the impact value generator (8) is configured to generate an individual impact value for the impact of each fundamental action of the specified physical activity defined in the physical activity data model (4) on each body part defined in the data model (6); wherein each individual impact value is generated based on a correlation between the fundamental action and the body part in the information source (10).
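The impact value generator can be illustrated with a simple co-occurrence measure. The sketch assumes that the information source is a list of text documents and that "correlation" means the share of documents mentioning an action that also mention the body part. The abstract does not define the correlation, the substring matching is deliberately naive, and all names are hypothetical.

```python
def impact_values(activity, actions_by_activity, body_parts, documents):
    """Individual impact value for each (fundamental action, body part) pair.

    The value is the fraction of documents mentioning the action that also
    mention the body part (an assumed, simplistic notion of correlation).
    """
    lowered = [doc.lower() for doc in documents]
    values = {}
    for action in actions_by_activity[activity]:
        mentioning = [doc for doc in lowered if action in doc]
        for part in body_parts:
            both = sum(part in doc for doc in mentioning)
            values[(action, part)] = both / len(mentioning) if mentioning else 0.0
    return values
```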