-
Publication No.: EP3907648A3
Publication date: 2022-02-16
Application No.: EP21193300.7
Filing date: 2021-08-26
Inventors: YANG, Chen; YANG, Tianxing; PENG, Bin; ZHANG, Yilin; SONG, Xunchao
IPC classification: G06F40/258, G06F8/41, G06F40/274
Abstract: The present disclosure provides a text prediction method, a text prediction apparatus, an electronic device and a storage medium, and relates to the field of data processing technologies, especially deep learning, natural language processing and intelligent search technologies. The method includes: obtaining (S110, S210, S310) at least two sentences by segmenting a text to be predicted; obtaining (S120, S220) at least one sentence set by grouping the at least two sentences based on the number of Central Processing Unit (CPU) cores in a target device, in which the target device is a device configured to perform a prediction operation; assigning (S130, S240, S340) each sentence set to a corresponding CPU core of the target device, and predicting each sentence set sentence by sentence through the corresponding CPU core to obtain a prediction result of each sentence set; and determining (S140, S250, S350) a prediction result of the text to be predicted based on the prediction result of each sentence set.
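The segment, group, predict and merge flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the sentence-splitting rule, the round-robin grouping, the word-count "model" and all function names are assumptions, and worker threads merely stand in for CPU cores (no core pinning is performed).

```python
import os
import re
from concurrent.futures import ThreadPoolExecutor


def split_sentences(text):
    """Segment text on ., ! or ? followed by whitespace (a simplistic, assumed rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def group_sentences(sentences, n_cores):
    """Distribute sentences round-robin into at most n_cores sentence sets."""
    n_sets = max(1, min(n_cores, len(sentences)))
    return [sentences[i::n_sets] for i in range(n_sets)]


def predict_sentence(sentence):
    """Placeholder for a real model: returns the number of words."""
    return len(sentence.split())


def predict_set(sentence_set):
    """Predict one sentence set sentence by sentence."""
    return [predict_sentence(s) for s in sentence_set]


def predict_text(text, n_cores=None):
    """Predict every sentence of the text and return results in original order."""
    n_cores = n_cores or os.cpu_count() or 1
    sentences = split_sentences(text)
    if not sentences:
        return []
    groups = group_sentences(sentences, n_cores)
    # One worker per sentence set; threads stand in for CPU cores here.
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        results = list(pool.map(predict_set, groups))
    # Undo the round-robin split: set g holds sentences g, g + n, g + 2n, ...
    merged = [None] * len(sentences)
    for g, set_result in enumerate(results):
        for j, value in enumerate(set_result):
            merged[g + j * len(groups)] = value
    return merged
```

For CPU-bound models in Python, a process pool would usually be a closer match to "one set per core" than threads; threads are used here only to keep the sketch short.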
-
Publication No.: EP4404083A1
Publication date: 2024-07-24
Application No.: EP24152425.5
Filing date: 2024-01-17
IPC classification: G06F16/93, G06F40/205, G06F40/258, G06F40/279, G06F40/284, G06F40/295, G06F40/30, G06V30/412, G06V30/416
CPC classification: G06F40/205, G06F40/258, G06F40/284, G06F40/295, G06V10/82, G06V30/416, G06F40/279, G06F40/30, G06V30/412, G06F16/93
Abstract: Systems and methods for automated indexing and extraction of information in digital documents are disclosed. A method may comprise selecting a page number of a digital document to identify a page containing targeted information; inputting an image of the page into a visual machine learning network (visual ML), wherein the visual ML is trained to recognize text associated with the targeted information in an image; identifying, by the visual ML, a section of the image that contains the targeted information; inputting the page number, the digital document, and coordinates of the section into an extraction module; and extracting, by the extraction module, the targeted information from the section.
-
Publication No.: EP3835993A3
Publication date: 2021-08-04
Application No.: EP20166998.3
Filing date: 2020-03-31
Inventors: GUO, Qun; LU, Xiao; MENG, Erli; WANG, Bin; SHI, Liang; JI, Hongxu; QI, Baoyuan
IPC classification: G06F40/258, G06F40/284, G06K9/62, G06F40/247
Abstract: The present invention discloses a keyword extraction method, a keyword extraction apparatus and a medium, belonging to the field of data processing. The method comprises operations of: receiving an original document (S10); extracting candidate words from the original document, the extracted candidate words forming a first word set (S11); acquiring a first association degree between each first word in the first word set and the original document (S12), and determining a second word set according to the first association degree, the second word set being a subset of the first word set (S13); for each second word in the second word set, querying, in a word association topology, at least one node word satisfying a condition of association with the second word, the at least one node word forming a third word set, the word association topology indicating an association relation among multiple node words in a predetermined field (S14); and determining a union set of the second word set and the third word set (S15), acquiring a second association degree between each candidate keyword in the union set and the original document (S16), and selecting, according to the second association degree, at least one candidate keyword from the union set to form a keyword set of the original document (S17). In accordance with the present invention, the computational complexity can be reduced and the calculation speed improved; the problem of existing methods preferentially selecting high-frequency words is solved; and the expression of keywords is effectively enriched.
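Steps S11 to S17 can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the abstract does not say how either association degree is computed, so relative term frequency is used for the first and a frequency-plus-neighbour score for the second, and the word association topology is modelled as a plain dictionary from a word to its associated node words.

```python
import re
from collections import Counter


def extract_keywords(document, topology, top_first=2, top_final=3):
    """Return up to top_final keywords following the S11-S17 flow (illustrative only)."""
    words = re.findall(r"[a-z]+", document.lower())
    if not words:
        return []
    total = len(words)
    counts = Counter(words)

    # S11: candidate words form the first word set.
    first_set = set(words)

    # S12: first association degree, assumed here to be relative frequency.
    first_degree = {w: counts[w] / total for w in first_set}

    # S13: the second word set is the top-scoring subset of the first set.
    ranked = sorted(first_set, key=lambda w: (-first_degree[w], w))
    second_set = set(ranked[:top_first])

    # S14: look up associated node words in the topology to form the third set.
    third_set = set()
    for w in second_set:
        third_set.update(topology.get(w, ()))

    # S15: union of the second and third word sets.
    union = second_set | third_set

    # S16: second association degree, assumed to be the word's own frequency
    # plus half the frequency of its topology neighbours in the document.
    def second_degree(w):
        own = counts[w] / total
        linked = sum(counts[n] for n in topology.get(w, ())) / total
        return own + 0.5 * linked

    # S17: select the best-scoring candidates as the keyword set.
    return sorted(union, key=lambda w: (-second_degree(w), w))[:top_final]


doc = "neural network training uses a neural network optimizer"
topo = {"neural": ["deep"], "network": ["graph"], "deep": ["neural"]}
print(extract_keywords(doc, topo))  # ['network', 'neural', 'deep']
```

Note that "deep" never occurs in the document yet is selected through the topology, which is the enrichment effect the abstract describes.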
-
Publication No.: EP4338089A1
Publication date: 2024-03-20
Application No.: EP21729390.1
Filing date: 2021-05-12
Inventors: ORBACH, Eyal; FAIZAKOF, Avraham; MAZZA, Arnon; HAIKIN, Lev
IPC classification: G06F40/258
-
Publication No.: EP4014229A1
Publication date: 2022-06-22
Application No.: EP20876032.2
Filing date: 2020-05-28
IPC classification: G10L15/08, G10L25/48, G06F40/258, G06F40/247, G06F40/30
-
Publication No.: EP3977332A1
Publication date: 2022-04-06
Application No.: EP20750451.5
Filing date: 2020-06-08
Inventors: XIONG, Li; HU, Chuan; OVERWIJK, Arnold; AHMED, Junaid; CAMPOS, Daniel Fernando; XIONG, Chenyan
IPC classification: G06F40/258
-
Publication No.: EP3896595A1
Publication date: 2021-10-20
Application No.: EP21163876.2
Filing date: 2021-03-22
Inventors: WANG, Xin; SUN, Mingming; LI, Ping
IPC classification: G06F40/284, G06F40/289, G06F40/258, G06N3/02, G06F16/34, G06F40/30
Abstract: The present disclosure provides a method for extracting key information of text, as well as a corresponding apparatus, electronic device, storage medium, and computer program product, and relates to the technical field of artificial intelligence. A specific implementation solution is as follows: segmenting an original text according to a preset segmenting unit, and generating a unit sequence corresponding to the original text; according to the unit sequence and a pre-trained information extraction model, extracting ID information of at least one target segment based on the original text by using a segment-copying principle; and generating the key information based on the ID information of the at least one target segment. According to the technical solution of the present disclosure, it is possible to copy a segment including consecutive words as a target segment, which reduces the number of copy operations and the accumulated errors, and thereby improves the speed and accuracy of key information extraction.
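The segment-copying idea can be shown with a toy sketch. It assumes that the "ID information" of a target segment is an inclusive (start, end) pair of unit indices and that units are whitespace-separated words; the abstract specifies neither, and the pre-trained extraction model is not modelled here, so the spans are supplied by hand.

```python
def to_units(text):
    """Segment the original text into units (assumed: whitespace-separated words)."""
    return text.split()


def copy_segments(units, spans):
    """Copy each (start, end) span, inclusive, out of the unit sequence."""
    pieces = []
    for start, end in spans:
        if not (0 <= start <= end < len(units)):
            raise ValueError(f"span {(start, end)} is outside the unit sequence")
        pieces.append(" ".join(units[start:end + 1]))
    return pieces


units = to_units("The quick brown fox jumps over the lazy dog")
# Hypothetical model output: two target segments identified by unit indices.
spans = [(1, 3), (7, 8)]
print(copy_segments(units, spans))  # ['quick brown fox', 'lazy dog']
```

Two span copies recover five words here, whereas word-by-word copying would need five separate steps; fewer steps is where the abstract's claimed reduction in accumulated error comes from.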
-
Publication No.: EP4309071A1
Publication date: 2024-01-24
Application No.: EP22716129.6
Filing date: 2022-03-04
IPC classification: G06F40/171, G06F3/04883, G06F40/117, G06F40/258
-
Publication No.: EP4064112A1
Publication date: 2022-09-28
Application No.: EP22170924.9
Filing date: 2015-01-27
Applicant: Bluebeam, Inc.
Inventors: HARTMANN, Brian; NOYES, Peter
IPC classification: G06F40/258
Abstract: A computer-implemented method of automatically indexing an electronic document comprising a plurality of pages, the method comprising receiving, via a graphical user interface, a first selection of a page region within a first page of the electronic document and a second selection of a structure for content positioned in the page region of the first page, the page region represented by a set of boundary locations relative to the first page that encompasses the content, extracting a first text string from the content positioned in the page region, assigning the first text string to a page location index of the first page, formatting the first text string assigned to the page location index of the first page in accordance with the structure of the second selection, such that an arrangement of the first text string in the page location index is defined by the structure, generating subsequent page regions in subsequent pages of the electronic document by applying the set of boundary locations to each of the subsequent pages, extracting subsequent text strings from content positioned in the subsequent page regions in the subsequent pages, and assigning the subsequent text strings extracted from the subsequent page regions to corresponding page location indices of the subsequent pages.
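The core of this method, reusing one set of boundary locations on every page, can be sketched as follows. The data model is hypothetical: a page is a list of positioned words, the region is a (left, top, right, bottom) tuple in page coordinates, and the "structure" is reduced to a format string. None of these names come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word with the page coordinates of its anchor point (assumed model)."""
    text: str
    x: float
    y: float


def text_in_region(page, region):
    """Extract the text of all words whose anchor lies inside the region."""
    left, top, right, bottom = region
    inside = [w for w in page if left <= w.x <= right and top <= w.y <= bottom]
    inside.sort(key=lambda w: (w.y, w.x))  # reading order: top-down, left-right
    return " ".join(w.text for w in inside)


def index_pages(pages, region, structure="{text}"):
    """Apply the same region to every page and format each extracted string."""
    return [structure.format(text=text_in_region(page, region)) for page in pages]


pages = [
    [Word("Floor", 10, 10), Word("A-101", 90, 95)],
    [Word("Roof", 12, 10), Word("A-102", 90, 95)],
]
title_block = (80, 90, 100, 100)  # region selected once, on the first page
print(index_pages(pages, title_block, structure="Sheet {text}"))
# ['Sheet A-101', 'Sheet A-102']
```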
-