TEXT PREDICTION METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM

    Publication No.: EP3907648A3

    Publication date: 2022-02-16

    Application No.: EP21193300.7

    Filing date: 2021-08-26

    Abstract: The present disclosure provides a text prediction method, a text prediction apparatus, an electronic device and a storage medium, and relates to the field of data processing technologies, especially deep learning, natural language processing and intelligent search technologies. The method includes: obtaining (S110, S210, S310) at least two sentences by segmenting a text to be predicted; obtaining (S120, S220) at least one sentence set by grouping the at least two sentences based on the number of Central Processing Unit (CPU) cores in a target device, in which the target device is a device configured to perform a prediction operation; assigning (S130, S240, S340) each sentence set to a corresponding CPU core of the target device, and predicting each sentence set sentence by sentence through the corresponding CPU core to obtain a prediction result of each sentence set; and determining (S140, S250, S350) a prediction result of the text to be predicted based on the prediction result of each sentence set.
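
The four steps above (segment, group by core count, predict per set, merge) can be sketched in Python as follows. This is only an illustration under stated assumptions: the regex sentence splitter, the contiguous near-equal grouping, and the word-count `predict_sentence` stand-in are hypothetical choices, not details taken from the abstract. A process pool with one worker per set approximates "one set per core" without pinning workers to specific cores.

```python
import os
import re
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_sentences(text):
    """Segment text into sentences after ., ! or ? (a deliberately naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def group_sentences(sentences, n_cores):
    """Split sentences into at most n_cores contiguous, near-equal sets."""
    n_sets = max(1, min(n_cores, len(sentences)))
    size, extra = divmod(len(sentences), n_sets)
    sets, start = [], 0
    for i in range(n_sets):
        end = start + size + (1 if i < extra else 0)
        sets.append(sentences[start:end])
        start = end
    return sets


def predict_sentence(sentence):
    """Hypothetical stand-in for a real model: returns the word count."""
    return len(sentence.split())


def predict_set(sentence_set):
    """Predict one set sentence by sentence (runs inside a single worker)."""
    return [predict_sentence(s) for s in sentence_set]


def predict_text(text, n_cores=None, executor_cls=ProcessPoolExecutor):
    """Segment, group by core count, predict in parallel, merge in order."""
    n_cores = n_cores or os.cpu_count() or 1
    sets = group_sentences(split_sentences(text), n_cores)
    with executor_cls(max_workers=len(sets)) as pool:
        per_set = list(pool.map(predict_set, sets))
    # Flatten the per-set results back into document order.
    return [result for results in per_set for result in results]
```

Because the sets are contiguous and `pool.map` preserves input order, the merged result lines up with the original sentence order. When calling `predict_text` with the default process pool from a script, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn worker processes.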

    KEYWORD EXTRACTION METHOD, APPARATUS AND MEDIUM

    Publication No.: EP3835993A3

    Publication date: 2021-08-04

    Application No.: EP20166998.3

    Filing date: 2020-03-31

    Abstract: The present invention discloses a keyword extraction method, a keyword extraction apparatus and a medium, belonging to the field of data processing. The method comprises operations of: receiving an original document (S10); extracting candidate words from the original document, the extracted candidate words forming a first word set (S11); acquiring a first association degree between each first word in the first word set and the original document (S12), and determining a second word set according to the first association degree, the second word set being a subset of the first word set (S13); for each second word in the second word set, querying, in a word association topology, at least one node word satisfying a condition of association with the second word, the at least one node word forming a third word set, the word association topology indicating an association relation among multiple node words in a predetermined field (S14); and determining a union set of the second word set and the third word set (S15), acquiring a second association degree between each candidate keyword in the union set and the original document (S16), and selecting, according to the second association degree, at least one candidate keyword from the union set to form a keyword set of the original document (S17). In accordance with the present invention, the computational complexity can be reduced and the calculation speed improved; the tendency of existing methods to preferentially select high-frequency words is avoided; and the range of keyword expressions is effectively enriched.
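
Operations S11 to S17 can be traced with a small Python sketch. The abstract does not say how either association degree is computed, so both scoring rules here are assumptions made for illustration: the first degree is relative term frequency, and the second is the share of a candidate's topology neighbourhood (the word plus its neighbours) that occurs in the document. The stopword list and the dictionary-based topology are likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list used only for this illustration.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to", "on"}


def extract_keywords(document, topology, first_k=2, final_k=3):
    """Toy walk-through of S11-S17; topology maps a word to its neighbours."""
    # S11: candidate words form the first word set.
    tokens = re.findall(r"[a-z]+", document.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    first_set = set(counts)

    # S12-S13: first association degree (assumed: relative frequency),
    # then keep the top first_k words as the second word set.
    total = sum(counts.values())
    first_degree = {w: counts[w] / total for w in first_set}
    by_first = sorted(first_set, key=lambda w: (-first_degree[w], w))
    second_set = set(by_first[:first_k])

    # S14: neighbours of each second word in the topology form the third set.
    third_set = {n for w in second_set for n in topology.get(w, ())}

    # S15: union of the second and third word sets.
    union = second_set | third_set

    # S16: second association degree (assumed: fraction of the candidate's
    # neighbourhood, itself included, that appears in the document).
    def second_degree(word):
        neighbourhood = {word, *topology.get(word, ())}
        return len(neighbourhood & first_set) / len(neighbourhood)

    # S17: select the best final_k candidates, ties broken alphabetically.
    ranked = sorted(union, key=lambda w: (-second_degree(w), w))
    return ranked[:final_k]


doc = ("Neural network training uses gradient descent. "
       "The network learns weights; gradient updates the network.")
topology = {
    "network": ["neural", "backpropagation"],
    "neural": ["network"],
    "gradient": ["descent", "backpropagation"],
    "descent": ["gradient"],
    "backpropagation": ["network", "gradient"],
}
keywords = extract_keywords(doc, topology)
```

With this toy input, `keywords` includes "backpropagation", a word that never occurs in the document and enters only through the topology lookup in S14. That is the kind of enrichment the abstract describes, although the ranking itself depends entirely on the assumed scoring rules.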

    TEXT KEY INFORMATION EXTRACTING METHOD, APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM, AND COMPUTER PROGRAM PRODUCT

    Publication No.: EP3896595A1

    Publication date: 2021-10-20

    Application No.: EP21163876.2

    Filing date: 2021-03-22

    Abstract: The present disclosure provides a method and an apparatus for extracting key information from text, an electronic device, a storage medium, and a computer program product, which relate to the technical field of artificial intelligence. A specific implementation is as follows: segmenting an original text according to a preset segmenting unit, and generating a unit sequence corresponding to the original text; extracting, according to the unit sequence and a pre-trained information extraction model, ID information of at least one target segment from the original text by using a segment-copying principle; and generating the key information based on the ID information of the at least one target segment. According to the technical solution of the present disclosure, a segment of consecutive words can be copied as a single target segment, which reduces the number of copy operations and the accumulated error, and thereby improves the speed and accuracy of key information extraction.
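
The final step, generating key information from the ID information of target segments, can be illustrated in Python. This sketch covers only the copying step: the pre-trained extraction model is not reproduced, and the spans are supplied by hand. It assumes, hypothetically, that the segmenting unit is a whitespace-delimited word and that each target segment is identified by an inclusive `(start, end)` pair of unit indices.

```python
def segment(text):
    """Split the original text into a unit sequence (assumed unit: one word)."""
    return text.split()


def copy_segments(units, spans):
    """Build key information by copying each (start, end) span of units.

    Indices are inclusive. Each span is copied in one step, however many
    consecutive units it covers.
    """
    pieces = []
    for start, end in spans:
        if not 0 <= start <= end < len(units):
            raise ValueError(f"span {(start, end)} is outside the unit sequence")
        pieces.append(" ".join(units[start:end + 1]))
    return " ".join(pieces)


units = segment("The city council approved the new transit budget on Tuesday")
# Hand-written spans standing in for the model's predicted ID information.
spans = [(1, 3), (6, 7)]
key_info = copy_segments(units, spans)
```

Here two copy steps produce a five-word result, "city council approved transit budget". Copying word by word would take five steps, which is the reduction in copy operations the abstract refers to.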

    DOCUMENT PAGE IDENTIFIERS FROM SELECTED PAGE REGION CONTENT

    Publication No.: EP4064112A1

    Publication date: 2022-09-28

    Application No.: EP22170924.9

    Filing date: 2015-01-27

    Applicant: Bluebeam, Inc.

    IPC classification: G06F40/258

    Abstract: A computer-implemented method of automatically indexing an electronic document comprising a plurality of pages, the method comprising: receiving, via a graphical user interface, a first selection of a page region within a first page of the electronic document and a second selection of a structure for content positioned in the page region of the first page, the page region represented by a set of boundary locations relative to the first page that encompasses the content; extracting a first text string from the content positioned in the page region; assigning the first text string to a page location index of the first page; formatting the first text string assigned to the page location index of the first page in accordance with the structure of the second selection, such that an arrangement of the first text string in the page location index is defined by the structure; generating subsequent page regions in subsequent pages of the electronic document by applying the set of boundary locations to each of the subsequent pages; extracting subsequent text strings from content positioned in the subsequent page regions in the subsequent pages; and assigning the subsequent text strings extracted from the subsequent page regions to corresponding page location indices of the subsequent pages.
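
The indexing flow can be sketched in Python under simplifying assumptions that are not part of the abstract: each page is modelled as a list of `(x, y, text)` words with coordinates given as fractions of the page size, the selected region is a `Region` rectangle, and the chosen structure is a `str.format` template applied to the words found in the region. No real document library is used. For brevity the sketch applies the structure to every page, whereas the abstract only states it for the first page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Boundary locations relative to the page (fractions from 0.0 to 1.0)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def extract_text(page, region):
    """Join the words inside the region, top to bottom then left to right."""
    inside = sorted((y, x, text) for x, y, text in page if region.contains(x, y))
    return " ".join(text for _, _, text in inside)


def index_pages(pages, region, structure):
    """Apply the same region to every page and format each extracted string.

    structure is a hypothetical template such as "{0}-{1}"; it raises
    IndexError if a page yields fewer words than the template expects.
    """
    return [structure.format(*extract_text(page, region).split())
            for page in pages]


pages = [
    [(0.10, 0.05, "Title"), (0.85, 0.95, "A"), (0.90, 0.95, "101")],
    [(0.85, 0.95, "A"), (0.90, 0.95, "102"), (0.50, 0.50, "body")],
]
title_block = Region(left=0.8, top=0.9, right=1.0, bottom=1.0)
page_index = index_pages(pages, title_block, "{0}-{1}")
```

The region is selected once and its boundary locations are reused for every page, so `page_index` holds one formatted label per page ("A-101" and "A-102" for this input).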