IDENTIFYING SEQUENCE HEADINGS IN A DOCUMENT
    1.
    Invention application

    Publication No.: US20200311571A1

    Publication date: 2020-10-01

    Application No.: US16370724

    Filing date: 2019-03-29

    Inventor: Darrell Bellert

    Abstract: A method for processing an electronic document (ED) to infer a sequence of section headings in the ED. The method includes generating, by a computer processor, based on regular expression matching of a predetermined section heading pattern and a plurality of characters in the ED, a list of candidate headings in the ED; generating, by the computer processor and based on the list of candidate headings, a list of chain fragments for inferring a portion of the sequence of section headings; and generating, by the computer processor and based on predetermined criteria, the sequence of section headings by merging at least two chain fragments in the list of chain fragments.
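The three steps named in the abstract (regular-expression candidates, chain fragments, merging) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the heading pattern `HEADING_RE`, the function names, the rule that a fragment is a run of consecutively numbered candidates, and the merge criterion that a later fragment must continue the numbering of the sequence built so far.

```python
import re

# Hypothetical heading pattern: a decimal number, a period, then title text.
HEADING_RE = re.compile(r"^\s*(\d+)\.\s+(\S.*)$")

def candidate_headings(lines):
    """Return (line_index, number, title) for each line matching the pattern."""
    found = []
    for i, line in enumerate(lines):
        m = HEADING_RE.match(line)
        if m:
            found.append((i, int(m.group(1)), m.group(2).strip()))
    return found

def chain_fragments(candidates):
    """Group candidates into runs whose numbers increase by exactly one."""
    fragments = []
    for cand in candidates:
        if fragments and cand[1] == fragments[-1][-1][1] + 1:
            fragments[-1].append(cand)
        else:
            fragments.append([cand])
    return fragments

def merge_fragments(fragments):
    """Assumed criterion: start at the first fragment that begins with 1,
    then append any later fragment that continues the numbering."""
    sequence = []
    for frag in fragments:
        if not sequence:
            if frag[0][1] == 1:
                sequence = list(frag)
        elif frag[0][1] == sequence[-1][1] + 1:
            sequence.extend(frag)
    return sequence

sample = [
    "1. Introduction",
    "Body text.",
    "2. Background",
    "7. This numbered list item is not a heading.",
    "3. Method",
    "4. Results",
]

fragments = chain_fragments(candidate_headings(sample))
titles = [title for _, _, title in merge_fragments(fragments)]
print(titles)
```

In this sample the stray line numbered 7 forms a fragment of its own and is left out when the fragments for 1-2 and 3-4 are merged. The predetermined criteria in the actual method may differ from this greedy rule.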

    Identifying sequence headings in a document

    Publication No.: US11468346B2

    Publication date: 2022-10-11

    Application No.: US16370724

    Filing date: 2019-03-29

    Inventor: Darrell Bellert

    Abstract: A method for processing an electronic document (ED) to infer a sequence of section headings in the ED. The method includes generating, by a computer processor, based on regular expression matching of a predetermined section heading pattern and a plurality of characters in the ED, a list of candidate headings in the ED; generating, by the computer processor and based on the list of candidate headings, a list of chain fragments for inferring a portion of the sequence of section headings; and generating, by the computer processor and based on predetermined criteria, the sequence of section headings by merging at least two chain fragments in the list of chain fragments.

    IDENTIFYING SECTION HEADINGS IN A DOCUMENT
    3.
    Invention application

    Publication No.: US20200320170A1

    Publication date: 2020-10-08

    Application No.: US16675456

    Filing date: 2019-11-06

    Inventor: Darrell Bellert

    Abstract: A method, non-transitory computer readable medium, and system for inferring certain texts as stylized section headings in an electronic document (ED). Stylized section headings are section headings that have unique styling distinct from the body of text below each stylized heading. In particular, the stylized section headings are identified based on styling information in the ED. Identifying stylized section headings includes grouping candidate headings based on identification of dominant styling, locating high level fragments, and repeatedly locating nested fragments from within higher level fragments. The ED may or may not include explicitly identified headings in the document.
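As a rough illustration of grouping candidate headings by styling, the sketch below treats the most frequent style as the body (dominant) style and every other style as a heading group. The data model (a list of `(text, (font_size, bold))` pairs), the function name, and the rule that larger or bolder styles rank as higher levels are all assumptions made for this example. It approximates nesting by style rank and does not reproduce the repeated location of nested fragments within higher level fragments that the abstract describes.

```python
from collections import Counter, defaultdict

def infer_stylized_headings(runs):
    """runs: list of (text, (font_size, bold)) pairs in document order.
    Returns (level, text) for each line whose style differs from the body."""
    # Assumption: the most common style in the document is the body text.
    body_style = Counter(style for _, style in runs).most_common(1)[0][0]

    # Group the remaining lines by their style.
    groups = defaultdict(list)
    for idx, (text, style) in enumerate(runs):
        if style != body_style:
            groups[style].append((idx, text))

    # Assumed ranking: larger font first, bold before non-bold.
    ranked = sorted(groups, key=lambda s: (-s[0], not s[1]))
    level_of = {style: depth for depth, style in enumerate(ranked, start=1)}

    # Restore document order and attach the inferred level.
    headings = sorted(
        (idx, level_of[style], text)
        for style, items in groups.items()
        for idx, text in items
    )
    return [(level, text) for _, level, text in headings]

sample_runs = [
    ("Overview", (16, True)),
    ("Body a", (11, False)),
    ("Details", (13, True)),
    ("Body b", (11, False)),
    ("Body c", (11, False)),
    ("Summary", (16, True)),
    ("Body d", (11, False)),
]

print(infer_stylized_headings(sample_runs))
```

Here the 11-point regular style is the most common and is taken as body text, so the 16-point bold lines become level 1 and the 13-point bold line becomes level 2.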

    Identifying section headings in a document

    Publication No.: US11494555B2

    Publication date: 2022-11-08

    Application No.: US16675456

    Filing date: 2019-11-06

    Inventor: Darrell Bellert

    Abstract: A method, non-transitory computer readable medium, and system for inferring certain texts as stylized section headings in an electronic document (ED). Stylized section headings are section headings that have unique styling distinct from the body of text below each stylized heading. In particular, the stylized section headings are identified based on styling information in the ED. Identifying stylized section headings includes grouping candidate headings based on identification of dominant styling, locating high level fragments, and repeatedly locating nested fragments from within higher level fragments. The ED may or may not include explicitly identified headings in the document.
