Automated identification of concept labels for a text fragment

    Publication number: US11354513B2

    Publication date: 2022-06-07

    Application number: US16784000

    Application date: 2020-02-06

    Applicant: Adobe Inc.

    Abstract: A technique for intelligently identifying concept labels for a text fragment where the identified concept labels are representative of and semantically relevant to the information contained by the text fragment is provided. The technique includes determining, using a knowledge base storing information for a reference set of concept labels, a first subset of concept labels that are relevant to the information contained by the text fragment. The technique includes ordering the first subset of concept labels according to their relevance scores and performing dependency analysis on the ordered list of concept labels. Based on the dependency analysis, the technique includes identifying concept labels for a text fragment that are more independent (e.g., more distinct and non-overlapping) of each other, representative of and semantically relevant to the information represented by the text fragment.
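
The ordering and dependency-analysis steps described in this abstract can be illustrated with a small sketch. It is not the patented method: the function name `select_independent_labels`, the `related` overlap map, the `max_labels` cutoff, and the example scores are all assumptions made for illustration. The sketch orders labels by relevance score and greedily keeps a label only if it does not overlap with one already kept.

```python
from typing import Dict, List, Set


def select_independent_labels(
    relevance: Dict[str, float],
    related: Dict[str, Set[str]],
    max_labels: int = 3,
) -> List[str]:
    """Walk labels in descending relevance and skip any label that
    overlaps with (depends on) a label that was already chosen."""
    # Sort by score, highest first; ties are broken alphabetically.
    ordered = sorted(relevance, key=lambda label: (-relevance[label], label))
    chosen: List[str] = []
    for label in ordered:
        if len(chosen) == max_labels:
            break
        # Treat the hypothetical overlap relation as symmetric.
        overlaps = any(
            label in related.get(kept, set()) or kept in related.get(label, set())
            for kept in chosen
        )
        if not overlaps:
            chosen.append(label)
    return chosen


# Hypothetical relevance scores and overlap relations, for illustration only.
scores = {
    "Machine learning": 0.92,
    "Deep learning": 0.88,
    "Statistics": 0.75,
    "Cooking": 0.10,
}
overlap = {"Machine learning": {"Deep learning"}}
selected = select_independent_labels(scores, overlap, max_labels=2)
print(selected)
```

Here "Deep learning" is skipped despite its high score because it overlaps with the already selected "Machine learning", so the result favors labels that are more distinct from each other.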

    AUTOMATED IDENTIFICATION OF CONCEPT LABELS FOR A SET OF DOCUMENTS

    Publication number: US20210248323A1

    Publication date: 2021-08-12

    Application number: US16784145

    Application date: 2020-02-06

    Applicant: Adobe Inc.

    Abstract: Techniques are described for intelligently identifying concept labels for a set of multiple documents where the identified concept labels are representative of and semantically relevant to the information contained by the set of documents. The technique includes extracting semantic units (e.g., paragraphs) from the set of documents and determining concept labels applicable to the semantic units based on relevance scores computed for the concept labels. The technique includes determining an initial set of concept labels for the set of documents based on the applicable concept labels. The technique further includes obtaining a reference hierarchy associated with the reference set of concept labels and determining a final set of concept labels for the set of documents using a reference hierarchy, the initial set of concept labels, and the relevance scores. The technique includes outputting information identifying the final set of concept labels for the set of documents.

    CONSTRUCTING CONTENT BASED ON MULTI-SENTENCE COMPRESSION OF SOURCE CONTENT

    Publication number: US20190197184A1

    Publication date: 2019-06-27

    Application number: US15854320

    Application date: 2017-12-26

    Applicant: ADOBE INC.

    CPC classification number: G06F16/334 G06F16/338 G06F17/2705 G06F17/277

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed to facilitating corpus-based content generation, in particular, using graph-based multi-sentence compression to generate a final content output. In one embodiment, pre-existing source content is identified and retrieved from a corpus. The source content is then parsed into sentence tokens, mapped and weighted. The sentence tokens are further parsed into word tokens and weighted. The mapped word tokens are then compressed into candidate sentences to be used in a final content. The final content is assembled using ranked candidate sentences, such that the final content is organized to reduce information redundancy and optimize content cohesion.

    Systems for Generating Indications of Relationships between Electronic Documents

    Publication number: US20230162518A1

    Publication date: 2023-05-25

    Application number: US17534744

    Application date: 2021-11-24

    Applicant: Adobe Inc.

    CPC classification number: G06V30/413 G06V30/274 G06V30/414 G06V30/418

    Abstract: In implementations of systems for generating indications of relationships between electronic documents, a processing device implements a relationship system to segment text of electronic documents included in a document corpus into segments. The relationship system determines a subset of the electronic documents that includes electronic document pairs having a number of similar segments that is greater than a threshold number. The similar segments are identified using locality sensitive hashing. The electronic document pairs are classified as related documents or unrelated documents using a machine learning model that receives a pair of electronic documents as an input and generates an indication of a classification for the pair of electronic documents as an output. Indications of relationships between particular electronic documents included in the subset are generated based at least partially on the electronic document pairs that are classified as related documents.
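
The candidate-pair step in this abstract (finding document pairs that share more than a threshold number of similar segments via locality sensitive hashing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented system: MinHash with banding is one common LSH scheme chosen here, and the constants `NUM_HASHES`, `BANDS`, `THRESHOLD`, the word-shingle size, and the sample documents are all hypothetical. The machine learning classifier that the abstract applies afterwards is not modeled.

```python
import hashlib
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple

NUM_HASHES = 16  # signature length (assumed)
BANDS = 4        # signature is split into 4 bands of 4 rows (assumed)
THRESHOLD = 0    # pairs need more than this many similar segments (assumed)


def shingles(segment: str, k: int = 3) -> Set[str]:
    """Overlapping k-word shingles of a segment."""
    words = segment.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash(segment: str) -> Tuple[int, ...]:
    """MinHash signature: for each seed, the smallest seeded hash over all shingles."""
    sh = shingles(segment)
    return tuple(
        min(
            int.from_bytes(hashlib.sha1(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in sh
        )
        for seed in range(NUM_HASHES)
    )


def similar_segment_counts(docs: Dict[str, List[str]]) -> Dict[Tuple[str, str], int]:
    """Count, per document pair, the segment pairs that collide in at least one band."""
    rows = NUM_HASHES // BANDS
    buckets = defaultdict(set)  # (band index, band values) -> {(doc id, segment index)}
    for doc_id, segments in docs.items():
        for idx, seg in enumerate(segments):
            sig = minhash(seg)
            for b in range(BANDS):
                buckets[(b, sig[b * rows:(b + 1) * rows])].add((doc_id, idx))
    matched = set()
    for members in buckets.values():
        for (d1, i1), (d2, i2) in combinations(sorted(members), 2):
            if d1 != d2:  # only compare segments from different documents
                matched.add(((d1, i1), (d2, i2)))
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for (d1, _), (d2, _) in matched:
        counts[(d1, d2)] += 1
    return dict(counts)


# Hypothetical two-document corpus, already split into segments.
docs = {
    "a": ["the quick brown fox jumps over the lazy dog",
          "completely unrelated sentence about tax law"],
    "b": ["the quick brown fox jumps over the lazy dog",
          "another text concerning marine biology research"],
}
counts = similar_segment_counts(docs)
candidates = [pair for pair, n in counts.items() if n > THRESHOLD]
print(candidates)
```

Banding trades precision for recall: more bands with fewer rows each make collisions easier, so more near-duplicate segments are caught at the cost of more false candidates for the downstream classifier to reject.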

    Automated identification of concept labels for a set of documents

    Publication number: US11416684B2

    Publication date: 2022-08-16

    Application number: US16784145

    Application date: 2020-02-06

    Applicant: Adobe Inc.

    Abstract: Techniques are described for intelligently identifying concept labels for a set of multiple documents where the identified concept labels are representative of and semantically relevant to the information contained by the set of documents. The technique includes extracting semantic units (e.g., paragraphs) from the set of documents and determining concept labels applicable to the semantic units based on relevance scores computed for the concept labels. The technique includes determining an initial set of concept labels for the set of documents based on the applicable concept labels. The technique further includes obtaining a reference hierarchy associated with the reference set of concept labels and determining a final set of concept labels for the set of documents using a reference hierarchy, the initial set of concept labels, and the relevance scores. The technique includes outputting information identifying the final set of concept labels for the set of documents.

    Classifying and ranking changes between document versions

    Publication number: US10713432B2

    Publication date: 2020-07-14

    Application number: US15476640

    Application date: 2017-03-31

    Applicant: Adobe Inc.

    Abstract: This disclosure generally covers systems and methods that identify and differentiate types of changes made from one version of a document to another version of the document. In particular, the disclosed systems and methods identify changes between different document versions as factual changes or paraphrasing changes or (in some embodiments) as changes of a more specific revision category. Moreover, in some embodiments, the disclosed systems and methods also generate a comparison of the first and second versions that identifies changes as factual changes or paraphrasing changes or (in some embodiments) as changes of a more specific revision category. The disclosed systems and methods, in some embodiments, further rank sentences that include changes made between different document versions or group similar (or the same) type of changes within a comparison of document versions.

    EXPLOITING DOMAIN-SPECIFIC LANGUAGE CHARACTERISTICS FOR LANGUAGE MODEL PRETRAINING

    Publication number: US20240303496A1

    Publication date: 2024-09-12

    Application number: US18181044

    Application date: 2023-03-09

    Applicant: ADOBE INC.

    CPC classification number: G06N3/0895 G06F40/279

    Abstract: A method, apparatus, non-transitory computer readable medium, and system of training a domain-specific language model are described. One or more aspects of the method, apparatus, non-transitory computer readable medium, and system include obtaining domain-specific training data including a plurality of domain-specific documents having a document structure corresponding to a domain, and obtaining domain-agnostic training data including a plurality of documents outside of the domain. The domain-specific training data and the domain-agnostic training data are used to train a language model to perform a domain-specific task based on the domain-specific training data and to perform a domain agnostic task based on the domain-agnostic training data.

    AUTOMATED IDENTIFICATION OF CONCEPT LABELS FOR A TEXT FRAGMENT

    Publication number: US20210248322A1

    Publication date: 2021-08-12

    Application number: US16784000

    Application date: 2020-02-06

    Applicant: Adobe Inc.

    Abstract: A technique for intelligently identifying concept labels for a text fragment where the identified concept labels are representative of and semantically relevant to the information contained by the text fragment is provided. The technique includes determining, using a knowledge base storing information for a reference set of concept labels, a first subset of concept labels that are relevant to the information contained by the text fragment. The technique includes ordering the first subset of concept labels according to their relevance scores and performing dependency analysis on the ordered list of concept labels. Based on the dependency analysis, the technique includes identifying concept labels for a text fragment that are more independent (e.g., more distinct and non-overlapping) of each other, representative of and semantically relevant to the information represented by the text fragment.

    Anomaly detection for time series data having arbitrary seasonality

    Publication number: US11023577B2

    Publication date: 2021-06-01

    Application number: US15228570

    Application date: 2016-08-04

    Applicant: ADOBE INC.

    Abstract: In various implementations, a method includes receiving a set of time series data that corresponds to a metric. A seasonal pattern is extracted from the set of time series data and the extracted seasonal pattern is filtered from the set of time series data. A predictive model is generated from the filtered set of data. The extracted seasonal pattern is filtered from another set of time series data where the second set of time series data corresponds to the metric. The filtered second set of time series data is compared to the predictive model. An alert is generated to a user for a value within the filtered second set of time series data which falls outside of the predictive model.
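
The flow in this abstract (extract a seasonal pattern, filter it out, build a predictive model from the filtered data, then compare a second filtered series against the model) can be sketched roughly as below. Several simplifications are assumptions, not claims about the patent: the season length `period` is taken as known rather than detected, the seasonal pattern is a per-phase mean, and the "predictive model" is just a band of `k` standard deviations around the mean residual. All function names and sample values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def seasonal_pattern(series: List[float], period: int) -> List[float]:
    """Average value at each phase of the season (period assumed known)."""
    return [mean(series[phase::period]) for phase in range(period)]


def remove_season(series: List[float], pattern: List[float]) -> List[float]:
    """Filter the seasonal pattern out of a series, leaving residuals."""
    period = len(pattern)
    return [value - pattern[i % period] for i, value in enumerate(series)]


def fit_band(residuals: List[float], k: float = 3.0) -> Tuple[float, float]:
    """Stand-in predictive model: mean residual plus or minus k standard deviations."""
    mu, sigma = mean(residuals), pstdev(residuals)
    return mu - k * sigma, mu + k * sigma


def find_anomalies(
    series: List[float], pattern: List[float], band: Tuple[float, float]
) -> List[int]:
    """Indices whose deseasonalized value falls outside the band."""
    low, high = band
    residuals = remove_season(series, pattern)
    return [i for i, r in enumerate(residuals) if not low <= r <= high]


# Hypothetical metric with a season of length 3.
history = [10.0, 20.0, 30.0, 11.0, 19.0, 30.0, 9.0, 21.0, 30.0, 10.0, 20.0, 30.0]
pattern = seasonal_pattern(history, period=3)
band = fit_band(remove_season(history, pattern))

# Second set of data for the same metric; index 4 carries an injected spike.
new_data = [10.0, 20.0, 30.0, 10.0, 45.0, 30.0]
anomalies = find_anomalies(new_data, pattern, band)
print(anomalies)
```

Removing the seasonal component first means the regular swing between 10 and 30 does not trigger alerts, while the spike at the mid-season phase stands out clearly in the residuals.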
