SYNONYM DETERMINATION SYSTEM AND SYNONYM DETERMINATION METHOD

    Publication No.: US20240242026A1

    Publication Date: 2024-07-18

    Application No.: US18289903

    Filing Date: 2022-04-26

    Applicant: HITACHI, LTD.

    CPC classification number: G06F40/247

    Abstract: Synonyms are extracted from document data efficiently and with high accuracy. Each synonym candidate is a combination of two words selected from a plurality of words extracted from the document data. A synonym determination system acquires, for a part of the synonym candidates, correct/incorrect information indicating whether or not the two constituent words of the candidate are synonyms. On the basis of a feature of the synonym candidates acquired from the document data and the correct/incorrect information, the system generates a synonym extraction rule, which is information for determining whether or not the two constituent words of a synonym candidate are synonyms. The system then applies the synonym extraction rule to the synonym candidates for which the correct/incorrect information has not been acquired and extracts those whose two constituent words are synonyms. The correct/incorrect information is acquired, for example, by being received from a user via a user interface.
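The flow described in this abstract (label a few candidate pairs, derive a rule from their features, apply the rule to the unlabeled pairs) can be sketched as follows. This is a minimal illustration only, not the patented method: the feature (Jaccard overlap of context words), the rule form (a single threshold), and all names and data such as `context_overlap`, `learn_threshold`, `contexts`, and `user_labels` are assumptions introduced here.

```python
from itertools import combinations

def context_overlap(word_a, word_b, contexts):
    """Hypothetical feature: Jaccard overlap of the two words' context-word sets."""
    a, b = contexts[word_a], contexts[word_b]
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def learn_threshold(labeled):
    """Assumed rule form: the feature threshold that classifies the most
    labeled pairs correctly. `labeled` is a list of (feature, is_synonym)."""
    best_t, best_correct = 0.0, -1
    for t in sorted({f for f, _ in labeled}):
        correct = sum((f >= t) == y for f, y in labeled)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

# Toy stand-in for context words extracted from document data.
contexts = {
    "car": {"drive", "road", "engine"},
    "automobile": {"drive", "road", "engine", "dealer"},
    "vehicle": {"drive", "road", "dealer"},
    "banana": {"eat", "fruit", "yellow"},
    "apple": {"eat", "fruit", "red"},
}

# Synonym candidates: every combination of two extracted words.
candidates = list(combinations(sorted(contexts), 2))
features = {pair: context_overlap(pair[0], pair[1], contexts) for pair in candidates}

# Correct/incorrect information for a part of the candidates (e.g. from a user).
user_labels = {
    ("automobile", "car"): True,
    ("banana", "car"): False,
    ("apple", "banana"): False,
}

# Generate the extraction rule from the features and the labels.
threshold = learn_threshold([(features[p], y) for p, y in user_labels.items()])

# Apply the rule only to candidates that have no label yet.
extracted = [p for p in candidates
             if p not in user_labels and features[p] >= threshold]
print(threshold, extracted)
```

A real system would likely combine several features and a richer rule representation; the single threshold is used here only to keep the three steps visible.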

    DATA EXTRACTION METHOD AND DATA EXTRACTION DEVICE

    Publication No.: US20210103699A1

    Publication Date: 2021-04-08

    Application No.: US17064683

    Filing Date: 2020-10-07

    Applicant: HITACHI, LTD.

    Abstract: A data extraction device includes: a label input part that receives, from a user, an input of the type of each component of at least one set of sentences and a designation of a topic portion in the component; a model creation part that creates a pre-trained model that has learned the type of each component and a feature of the topic portion in the component; a sentence-feature presuming part that inputs a specified set of sentences inputted by a user into the pre-trained model and presumes the type of each component and a topic portion in each component; and a word-vector calculation part that determines a relationship among each word in the specified set of sentences, the type of each presumed component, and the presumed topic portion to calculate a feature amount of each word. A relationship among the words is then extracted based on the calculated feature amounts.

    DATA CATALOG AUTOMATIC GENERATION SYSTEM AND DATA CATALOG AUTOMATIC GENERATION METHOD

    Publication No.: US20190310982A1

    Publication Date: 2019-10-10

    Application No.: US16379501

    Filing Date: 2019-04-09

    Applicant: HITACHI, LTD.

    Abstract: A technology is disclosed that makes it possible even for an analyst who has little knowledge of field data to select and use analysis data in analysis. A data catalog automatic generation system that generates a catalog tag to be used to select analysis data from collected field data is configured such that, based on a classification rule that has been set and input, a relationship between an objective variable as an analysis perspective relating to field data and an explanatory variable, or a causal relationship between a plurality of the explanatory variables, is extracted, and, based on a result of the extraction, a catalog tag of the objective variable and a catalog tag of the explanatory variable are specified and attached.

    RESEARCH VIEWPOINT PRESENTATION SYSTEM AND RESEARCH VIEWPOINT PRESENTATION METHOD

    Publication No.: US20240104303A1

    Publication Date: 2024-03-28

    Application No.: US18275086

    Filing Date: 2021-08-17

    Applicant: HITACHI, LTD.

    CPC classification number: G06F40/284 G06V30/413

    Abstract: A research viewpoint presentation system calculates, for a document group, a level of potential relevance between two words, using a co-occurrence rate determined based on a meaning of a word or a context in which the word appears, so as to take into consideration a potential relationship between the two words included in the document group. The system also calculates, for the document group, a level of existing relevance between two words, based on a frequency of actual appearance of the two words. From the pairs of two words extracted from the document group, the system selects a pair based on an index determined by comparing the potential relevance level with the existing relevance level, extracts recommended research viewpoint information concerning the selected pair from the document group, and outputs the extracted recommended research viewpoint information.
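The comparison of potential and existing relevance can be illustrated with a small sketch. The abstract does not specify the formulas, so everything below is an assumption made for illustration: potential relevance is taken as the cosine similarity of context-count vectors, existing relevance as the fraction of documents containing both words, and the index as their difference. The toy `docs` corpus and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Toy document group: each document is a list of extracted words.
docs = [
    ["battery", "lithium", "anode"],
    ["battery", "lithium", "cathode"],
    ["capacitor", "graphene", "anode"],
    ["capacitor", "graphene", "cathode"],
]
vocab = sorted({w for d in docs for w in d})

def context_vector(word):
    """Counts of the other words that share a document with `word`."""
    counts = Counter()
    for d in docs:
        if word in d:
            counts.update(w for w in d if w != word)
    return counts

def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def potential_relevance(a, b):
    """Assumed measure: words used in similar contexts are potentially related."""
    return cosine(context_vector(a), context_vector(b))

def existing_relevance(a, b):
    """Assumed measure: fraction of documents in which both words appear."""
    return sum(a in d and b in d for d in docs) / len(docs)

# Assumed index: large when two words look related by context
# but rarely or never appear together.
ranked = sorted(
    ((pair, potential_relevance(*pair) - existing_relevance(*pair))
     for pair in combinations(vocab, 2)),
    key=lambda item: item[1],
    reverse=True,
)
top_pair, top_index = ranked[0]
print(top_pair, round(top_index, 3))
```

In this toy corpus, "anode" and "cathode" never appear in the same document yet share identical contexts, so that pair ranks first under the assumed index.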
