Invention Application
- Patent Title: Method and Apparatus for Constructing Document Heading Tree, Electronic Device and Storage Medium
- Application No.: US17023721
- Application Date: 2020-09-17
- Publication No.: US20210303772A1
- Publication Date: 2021-09-30
- Inventor: Zhen Zhang, Yipeng Zhang, Minghao Liu, Jiangliang Guo
- Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.
- Applicant Address: CN Beijing
- Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
- Current Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
- Current Assignee Address: CN Beijing
- Priority: CN202010247461.4 20200331
- Main IPC: G06F40/14
- IPC: G06F40/14; G06K9/00; G06N3/04; G06F40/211

Abstract:
A method and apparatus for constructing a document heading tree, an electronic device, and a storage medium are provided. The method includes: performing rule matching between a text feature of each paragraph in a document to be processed and a paragraph feature in a predefined rule; in a case where the rule matching succeeds, determining a paragraph level of each paragraph in the document to be processed according to a result of the rule matching; in a case where the rule matching fails, determining the paragraph level of each paragraph in the document to be processed using a machine learning model; and constructing a document heading tree of the document to be processed based on the paragraph level of each of the paragraphs.
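
The flow described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the application: the regular-expression rules in `RULES`, the `BODY_LEVEL` sentinel for non-heading text, the `Node` class, and `toy_model`, which merely stands in for a trained machine learning model. The sketch tries the predefined rules first, falls back to the model when no rule matches, and then builds the tree with a stack keyed on paragraph level.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical predefined rules: (pattern on the paragraph text, heading level).
# More specific numbering patterns are listed before less specific ones.
RULES = [
    (re.compile(r"^Chapter\s+\d+\b"), 1),
    (re.compile(r"^\d+\.\d+\.\d+\s"), 3),
    (re.compile(r"^\d+\.\d+\s"), 2),
    (re.compile(r"^\d+\.?\s"), 1),
]

BODY_LEVEL = 99  # assumed sentinel: deeper than any heading level


@dataclass
class Node:
    text: str
    level: int
    children: List["Node"] = field(default_factory=list)


def match_rule(text: str) -> Optional[int]:
    """Return the level of the first matching rule, or None if matching fails."""
    for pattern, level in RULES:
        if pattern.match(text):
            return level
    return None


def toy_model(text: str) -> int:
    """Stand-in for a trained classifier; treats every paragraph as body text."""
    return BODY_LEVEL


def paragraph_level(text: str, model: Callable[[str], int]) -> int:
    """Use the rule result when matching succeeds, otherwise ask the model."""
    level = match_rule(text)
    return level if level is not None else model(text)


def build_heading_tree(paragraphs: List[str], model: Callable[[str], int]) -> Node:
    """Attach each paragraph under the nearest preceding paragraph of a smaller level."""
    root = Node("ROOT", 0)
    stack = [root]
    for text in paragraphs:
        level = paragraph_level(text, model)
        # Pop until the top of the stack is a strict ancestor of this level.
        while stack[-1].level >= level:
            stack.pop()
        node = Node(text, level)
        stack[-1].children.append(node)
        stack.append(node)
    return root
```

For example, calling `build_heading_tree(["1 Introduction", "Some body text.", "1.1 Scope", "2 Design"], toy_model)` places the two top-level headings under the root and nests the body text and "1.1 Scope" under "1 Introduction". A real system would replace `toy_model` with a classifier trained on paragraph features.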