Identification of changes between document versions

    Publication No.: US11630869B2

    Publication date: 2023-04-18

    Application No.: US16806438

    Filing date: 2020-03-02

    Abstract: One embodiment provides a method, including: obtaining at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; identifying, within each of the at least two documents, portions corresponding to groups of text containing a conceptual unit; assigning at least a subset of the identified portions to a category type corresponding to a topic of a given portion, wherein the assigning comprises (i) generating a semantic tag for the identified portions in the subset and (ii) tagging the identified portions in the subset with the semantic tag; and determining changes between the at least two documents, wherein the determining comprises (iii) aligning given portions across the at least two documents based upon a relationship between the given portions across the at least two documents, (iv) identifying semantic differences between the aligned portions, and (v) identifying any remaining unaligned portions.
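
    The steps in this abstract (segment, tag, align, diff, collect unaligned portions) can be illustrated with a minimal Python sketch. It is not the patented method: the keyword table `TOPIC_KEYWORDS`, the paragraph-based segmentation, the function names, and the use of `difflib.SequenceMatcher` string similarity as the "relationship" between portions are all assumptions made for illustration, and the "semantic difference" is reduced to a plain text inequality.

```python
from difflib import SequenceMatcher

# Hypothetical keyword lists standing in for a real semantic tagger.
TOPIC_KEYWORDS = {
    "payment": ("pay", "fee", "invoice"),
    "termination": ("terminate", "termination", "cancel"),
}

def split_portions(text):
    """Treat each blank-line-separated paragraph as one conceptual unit."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def semantic_tag(portion):
    """Return the first topic whose keyword occurs in the portion, else 'other'."""
    lowered = portion.lower()
    for topic, words in TOPIC_KEYWORDS.items():
        if any(w in lowered for w in words):
            return topic
    return "other"

def compare_versions(old_text, new_text, threshold=0.5):
    """Align same-tag portions by string similarity and report the changes."""
    old = [(semantic_tag(p), p) for p in split_portions(old_text)]
    new = [(semantic_tag(p), p) for p in split_portions(new_text)]
    changed, removed, used = [], [], set()
    for tag_o, p_o in old:
        best_j, best_score = None, threshold
        for j, (tag_n, p_n) in enumerate(new):
            if j in used or tag_n != tag_o:
                continue  # only align unused portions that share a tag
            score = SequenceMatcher(None, p_o, p_n).ratio()
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is None:
            removed.append((tag_o, p_o))  # unaligned portion of the old version
        else:
            used.add(best_j)
            if new[best_j][1] != p_o:
                changed.append((tag_o, p_o, new[best_j][1]))
    # Anything in the new version that was never matched is an addition.
    added = [new[j] for j in range(len(new)) if j not in used]
    return {"changed": changed, "removed": removed, "added": added}
```

    Restricting alignment to portions with the same tag keeps the comparison cheap and avoids pairing unrelated paragraphs that happen to share wording; a real system would likely use a learned tagger and a semantic similarity measure instead.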

    IDENTIFICATION OF CHANGES BETWEEN DOCUMENT VERSIONS

    Publication No.: US20210271718A1

    Publication date: 2021-09-02

    Application No.: US16806438

    Filing date: 2020-03-02

    Abstract: One embodiment provides a method, including: obtaining at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; identifying, within each of the at least two documents, portions corresponding to groups of text containing a conceptual unit; assigning at least a subset of the identified portions to a category type corresponding to a topic of a given portion, wherein the assigning comprises (i) generating a semantic tag for the identified portions in the subset and (ii) tagging the identified portions in the subset with the semantic tag; and determining changes between the at least two documents, wherein the determining comprises (iii) aligning given portions across the at least two documents based upon a relationship between the given portions across the at least two documents, (iv) identifying semantic differences between the aligned portions, and (v) identifying any remaining unaligned portions.

    Computer-implemented cognitive system for assessing subjective question-answers

    Publication No.: US09646250B1

    Publication date: 2017-05-09

    Application No.: US14943427

    Filing date: 2015-11-17

    IPC classification: G06N5/04 G06F17/30

    CPC classification: G06N5/04 G06F17/3043 G09B7/02

    Abstract: A cognitive system that automatically assesses subjective answers may be provided. A cognitive engine executing on one or more processors may determine, for each of a plurality of statements parsed from a subjective answer by a natural language processing technique, whether the statement is accurate or inaccurate, based on matching the statement with information associated with a domain of a question from a plurality of data sources, according to an accuracy threshold. An overall assessment of the answer may be automatically determined based on a number of statements determined to be accurate, a number of statements determined to be inaccurate, and a number of duplicate statements in the answer relative to a total number of statements in the answer. A visual graphic representing accurate and inaccurate statements may be presented or displayed, allowing a user to interact with the visual graphic to modify the assessment.
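
    The scoring idea in this abstract can be sketched in a few lines of Python. Everything specific here is an assumption for illustration only: the sentence splitter, the Jaccard token overlap used as the matching measure, the default threshold of 0.6, and the score formula (accurate statements divided by all statements, so that inaccurate and duplicate statements both dilute the result). The abstract does not disclose the actual matching technique or formula.

```python
import re

def tokens(text):
    """Lower-cased word set used for a crude overlap comparison."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def split_statements(answer):
    """Naive sentence split on terminal punctuation."""
    return [s.strip() for s in re.split(r"[.!?]+", answer) if s.strip()]

def is_accurate(statement, facts, threshold=0.6):
    """A statement is accurate if its Jaccard overlap with any fact meets the threshold."""
    st = tokens(statement)
    for fact in facts:
        ft = tokens(fact)
        union = st | ft
        if union and len(st & ft) / len(union) >= threshold:
            return True
    return False

def assess(answer, facts, threshold=0.6):
    """Count accurate, inaccurate, and duplicate statements and derive a score."""
    statements = split_statements(answer)
    seen = set()
    accurate = inaccurate = duplicates = 0
    for s in statements:
        key = frozenset(tokens(s))
        if key in seen:
            duplicates += 1  # repeated statements earn no extra credit
            continue
        seen.add(key)
        if is_accurate(s, facts, threshold):
            accurate += 1
        else:
            inaccurate += 1
    total = len(statements)
    # Hypothetical formula: share of all statements that are accurate and unique.
    score = accurate / total if total else 0.0
    return {"accurate": accurate, "inaccurate": inaccurate,
            "duplicates": duplicates, "score": score}
```

    For example, an answer with two correct statements, one repeat, and one wrong statement yields a score of 0.5 under this formula, because the repeat and the error each count toward the total without adding credit.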

    COMPUTER-IMPLEMENTED COGNITIVE SYSTEM FOR ASSESSING SUBJECTIVE QUESTION-ANSWERS

    Publication No.: US20170140277A1

    Publication date: 2017-05-18

    Application No.: US14943427

    Filing date: 2015-11-17

    IPC classification: G06N5/04 G06F17/30

    CPC classification: G06N5/04 G06F17/3043 G09B7/02

    Abstract: A cognitive system that automatically assesses subjective answers may be provided. A cognitive engine executing on one or more processors may determine, for each of a plurality of statements parsed from a subjective answer by a natural language processing technique, whether the statement is accurate or inaccurate, based on matching the statement with information associated with a domain of a question from a plurality of data sources, according to an accuracy threshold. An overall assessment of the answer may be automatically determined based on a number of statements determined to be accurate, a number of statements determined to be inaccurate, and a number of duplicate statements in the answer relative to a total number of statements in the answer. A visual graphic representing accurate and inaccurate statements may be presented or displayed, allowing a user to interact with the visual graphic to modify the assessment.

    Contrasting document-embedded structured data and generating summaries thereof

    Publication No.: US11500840B2

    Publication date: 2022-11-15

    Application No.: US16804399

    Filing date: 2020-02-28

    Abstract: Methods, systems, and computer program products for contrasting document-embedded structured data and generating summaries thereof are provided herein. A computer-implemented method includes extracting two or more tables from two or more input documents, wherein each of the two or more input documents comprises structured data and unstructured data; normalizing the two or more extracted tables using one or more alignment techniques; determining at least one of (i) one or more differences and (ii) one or more similarities across the two or more extracted tables by performing a comparison of the two or more normalized tables; deriving one or more insights from the comparison by applying at least one analytical model to the at least one of the one or more determined differences and one or more determined similarities; and outputting at least a portion of the one or more insights to at least one user.
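
    The normalize, compare, and summarize pipeline described in this abstract can be illustrated with a small Python sketch. It assumes the tables have already been extracted into lists of row dictionaries, which skips the extraction step entirely. The function names, the header normalization rule (trim and lower-case), and the "largest numeric change" rule standing in for the analytical model are hypothetical and are not taken from the patent.

```python
def normalize(table, key):
    """Trim and lower-case header names, then index rows by the key column."""
    rows = {}
    for row in table:
        clean = {k.strip().lower(): v for k, v in row.items()}
        rows[str(clean[key]).strip().lower()] = clean
    return rows

def contrast(table_a, table_b, key):
    """Compare cells shared by both normalized tables."""
    a, b = normalize(table_a, key), normalize(table_b, key)
    differences, similarities = [], []
    for k in sorted(a.keys() & b.keys()):
        for col in sorted(a[k].keys() & b[k].keys()):
            if col == key:
                continue  # the key column only identifies the row
            if a[k][col] == b[k][col]:
                similarities.append((k, col, a[k][col]))
            else:
                differences.append((k, col, a[k][col], b[k][col]))
    return differences, similarities

def summarize(differences):
    """Hypothetical 'analytical model': report the largest numeric change."""
    numeric = [d for d in differences
               if all(isinstance(v, (int, float)) for v in d[2:])]
    if not numeric:
        return "No numeric differences found."
    k, col, old, new = max(numeric, key=lambda d: abs(d[3] - d[2]))
    return f"Largest change: {col} for {k} went from {old} to {new}."
```

    Normalizing headers and row keys before comparing is what lets tables from differently formatted documents line up; without it, "Region" and "region " would be treated as unrelated columns.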

    Contrasting Document-Embedded Structured Data and Generating Summaries Thereof

    Publication No.: US20210271654A1

    Publication date: 2021-09-02

    Application No.: US16804399

    Filing date: 2020-02-28

    Abstract: Methods, systems, and computer program products for contrasting document-embedded structured data and generating summaries thereof are provided herein. A computer-implemented method includes extracting two or more tables from two or more input documents, wherein each of the two or more input documents comprises structured data and unstructured data; normalizing the two or more extracted tables using one or more alignment techniques; determining at least one of (i) one or more differences and (ii) one or more similarities across the two or more extracted tables by performing a comparison of the two or more normalized tables; deriving one or more insights from the comparison by applying at least one analytical model to the at least one of the one or more determined differences and one or more determined similarities; and outputting at least a portion of the one or more insights to at least one user.