Performing global image editing using editing operations determined from natural language requests

    Publication No.: US11570318B2

    Publication Date: 2023-01-31

    Application No.: US17374103

    Filing Date: 2021-07-13

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize a neural network having a long short-term memory encoder-decoder architecture to progressively modify a digital image in accordance with a natural language request. For example, in one or more embodiments, the disclosed systems utilize a language-to-operation decoding cell of a language-to-operation neural network to sequentially determine one or more image-modification operations to perform to modify a digital image in accordance with a natural language request. In some cases, the decoding cell determines an image-modification operation to perform partly based on the previously used image-modification operations. The disclosed systems further utilize the decoding cell to determine one or more operation parameters for each selected image-modification operation. The disclosed systems utilize the image-modification operation(s) and operation parameter(s) to modify the digital image (e.g., by generating one or more modified digital images) via the decoding cell.

    Utilizing logical-form dialogue generation for multi-turn construction of paired natural language queries and query-language representations

    Publication No.: US11561969B2

    Publication Date: 2023-01-24

    Application No.: US16834850

    Filing Date: 2020-03-30

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating pairs of natural language queries and corresponding query-language representations. For example, the disclosed systems can generate a contextual representation of a prior-generated dialogue sequence to compare with logical-form rules. In some implementations, the logical-form rules comprise trigger conditions and corresponding logical-form actions for constructing a logical-form representation of a subsequent dialogue sequence. Based on the comparison to logical-form rules indicating satisfaction of one or more trigger conditions, the disclosed systems can perform logical-form actions to generate a logical-form representation of a subsequent dialogue sequence. In turn, the disclosed systems can apply a natural-language-to-query-language (NL2QL) template to the logical-form representation to generate a natural language query and a corresponding query-language representation for the subsequent dialogue sequence.
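
    Illustrative sketch (not taken from the patent): the rule-matching flow in the abstract above could look roughly like the following. The `Rule` class, the context dictionary keys, the single example rule, and the SQL-style template are all hypothetical names chosen for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    """A hypothetical logical-form rule: a trigger condition plus an action."""
    trigger: Callable[[dict], bool]
    action: Callable[[dict], dict]


# One assumed rule: after a plain "select" turn, add a filter in the next turn.
RULES: List[Rule] = [
    Rule(
        trigger=lambda ctx: ctx.get("last_op") == "select" and "filter" not in ctx,
        action=lambda ctx: {**ctx, "filter": ("year", ">", 2000), "last_op": "filter"},
    ),
]


def next_logical_form(context: dict, rules: List[Rule]) -> Optional[dict]:
    """Compare the contextual representation with each rule; fire the first match."""
    for rule in rules:
        if rule.trigger(context):
            return rule.action(context)
    return None  # no trigger condition satisfied


def apply_template(logical_form: dict) -> Tuple[str, str]:
    """Assumed NL2QL template: render a natural language query and a query string."""
    column, table = logical_form["column"], logical_form["table"]
    natural = f"Show the {column} of each {table}"
    query = f"SELECT {column} FROM {table}"
    if "filter" in logical_form:
        field, op, value = logical_form["filter"]
        natural += f" where {field} {op} {value}"
        query += f" WHERE {field} {op} {value}"
    return natural, query


if __name__ == "__main__":
    prior_turn = {"table": "film", "column": "title", "last_op": "select"}
    form = next_logical_form(prior_turn, RULES)
    if form is not None:
        print(apply_template(form))
```

    Keeping triggers and actions as plain callables makes it easy to add rules without touching the matching loop; a real system would presumably use a richer logical-form representation than a dictionary.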

    SEMANTIC REASONING FOR TABULAR QUESTION ANSWERING

    Publication No.: US20220374426A1

    Publication Date: 2022-11-24

    Application No.: US17317052

    Filing Date: 2021-05-11

    Applicant: ADOBE INC.

    Abstract: Systems and methods for natural language processing are described. One or more embodiments of the present disclosure receive a query related to information in a table, compute an operation selector by combining the query with an operation embedding representing a plurality of table operations, compute a column selector by combining the query with a weighted operation embedding, compute a row selector based on the operation selector and the column selector, compute a probability value for a cell in the table based on the row selector and the column selector, where the probability value represents a probability that the cell provides an answer to the query, and transmit contents of the cell based on the probability value.
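
    Illustrative sketch (not taken from the patent): the selector pipeline in the abstract above can be mimicked with a few lines of plain Python. The dot-product scoring, the softmax normalization, the additive combination of the query with the weighted operation embedding, and the `row_scores` tensor are all assumptions made for this example; the abstract only states which quantity depends on which.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cell_probabilities(query, op_emb, col_emb, row_scores):
    """Return a rows x columns grid of answer probabilities.

    query:      query vector, length d
    op_emb:     one embedding (length d) per table operation
    col_emb:    one embedding (length d) per table column
    row_scores: assumed input, row_scores[op][col][row] = how well a row
                matches when operation `op` is applied to column `col`
    """
    # Operation selector: combine the query with the operation embeddings.
    op_sel = softmax([dot(query, e) for e in op_emb])

    # Weighted operation embedding: operation embeddings mixed by the selector.
    dim = len(query)
    weighted_op = [sum(w * e[k] for w, e in zip(op_sel, op_emb)) for k in range(dim)]

    # Column selector: combine the query with the weighted operation embedding.
    combined = [q + w for q, w in zip(query, weighted_op)]
    col_sel = softmax([dot(combined, c) for c in col_emb])

    # Row selector: depends on both the operation and the column selectors.
    n_rows = len(row_scores[0][0])
    mixed = [
        sum(
            op_sel[o] * col_sel[c] * row_scores[o][c][r]
            for o in range(len(op_emb))
            for c in range(len(col_emb))
        )
        for r in range(n_rows)
    ]
    row_sel = softmax(mixed)

    # Cell probability: product of the row and column selector weights.
    return [[row_sel[r] * col_sel[c] for c in range(len(col_emb))] for r in range(n_rows)]
```

    Because both selectors are normalized, the probabilities over all cells sum to one, so the most probable cell can be read off directly and its contents returned as the answer.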

    Utilizing a gated self-attention memory network model for predicting a candidate answer match to a query

    Publication No.: US11113479B2

    Publication Date: 2021-09-07

    Application No.: US16569513

    Filing Date: 2019-09-12

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that can determine an answer to a query based on matching probabilities for combinations of respective candidate answers. For example, the disclosed systems can utilize a gated self-attention mechanism (GSAM) to interpret inputs that include contextual information, a query, and candidate answers. The disclosed systems can also utilize a memory network in tandem with the GSAM to form a gated self-attention memory network (GSAMN) to refine outputs or predictions over multiple reasoning hops. Further, the disclosed systems can utilize transfer learning of the GSAM/GSAMN from an initial training dataset to a target training dataset.

    UTILIZING A GATED SELF-ATTENTION MEMORY NETWORK MODEL FOR PREDICTING A CANDIDATE ANSWER MATCH TO A QUERY

    Publication No.: US20210081503A1

    Publication Date: 2021-03-18

    Application No.: US16569513

    Filing Date: 2019-09-12

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that can determine an answer to a query based on matching probabilities for combinations of respective candidate answers. For example, the disclosed systems can utilize a gated self-attention mechanism (GSAM) to interpret inputs that include contextual information, a query, and candidate answers. The disclosed systems can also utilize a memory network in tandem with the GSAM to form a gated self-attention memory network (GSAMN) to refine outputs or predictions over multiple reasoning hops. Further, the disclosed systems can utilize transfer learning of the GSAM/GSAMN from an initial training dataset to a target training dataset.

    GENERATING CONTEXTUAL TAGS FOR DIGITAL CONTENT

    Publication No.: US20210034657A1

    Publication Date: 2021-02-04

    Application No.: US16525366

    Filing Date: 2019-07-29

    Applicant: Adobe Inc.

    Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for determining multi-term contextual tags for digital content and propagating the multi-term contextual tags to additional digital content. For instance, the disclosed systems can utilize search query supervision to determine and associate multi-term contextual tags (e.g., tags that represent a specific concept based on the order of the terms in the tag) with digital content. Furthermore, the disclosed systems can propagate the multi-term contextual tags determined for the digital content to additional digital content based on similarities between the digital content and additional digital content (e.g., utilizing clustering techniques). Additionally, the disclosed systems can provide digital content as search results based on the associated multi-term contextual tags.

    Bi-directional recurrent encoders with multi-hop attention for speech emotion recognition

    Publication No.: US12236975B2

    Publication Date: 2025-02-25

    Application No.: US17526810

    Filing Date: 2021-11-15

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for determining speech emotion. In particular, a speech emotion recognition system generates an audio feature vector and a textual feature vector for a sequence of words. Further, the speech emotion recognition system utilizes a neural attention mechanism that intelligently blends together the audio feature vector and the textual feature vector to generate attention output. Using the attention output, which includes consideration of both audio and text modalities for speech corresponding to the sequence of words, the speech emotion recognition system can apply attention methods to one of the feature vectors to generate a hidden feature vector. Based on the hidden feature vector, the speech emotion recognition system can generate a speech emotion probability distribution of emotions among a group of candidate emotions, and then select one of the candidate emotions as corresponding to the sequence of words.

    Methods and Systems for Determining Characteristics of A Dialog Between A Computer and A User

    Publication No.: US20230197081A1

    Publication Date: 2023-06-22

    Application No.: US18107620

    Filing Date: 2023-02-09

    Applicant: Adobe Inc.

    CPC classification number: G10L15/22 G10L15/02 G10L15/183

    Abstract: A computer-implemented method is disclosed for determining one or more characteristics of a dialog between a computer system and user. The method may comprise receiving a system utterance comprising one or more tokens defining one or more words generated by the computer system; receiving a user utterance comprising one or more tokens defining one or more words uttered by a user in response to the system utterance, the system utterance and the user utterance forming a dialog context; receiving one or more utterance candidates comprising one or more tokens; for each utterance candidate, generating an input sequence combining the one or more tokens of each of the system utterance, the user utterance, and the utterance candidate; and for each utterance candidate, evaluating the generated input sequence with a model to determine a probability that the utterance candidate is relevant to the dialog context.
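
    Illustrative sketch (not taken from the patent): the candidate-scoring loop in the abstract above might be organized as follows. The `[CLS]` and `[SEP]` marker tokens are an assumption borrowed from common transformer input conventions, and `toy_relevance_model` is a deliberately trivial stand-in (token overlap passed through a logistic function) for whatever trained model the method actually uses.

```python
import math
from typing import List, Tuple

CLS = "[CLS]"  # assumed start-of-sequence marker
SEP = "[SEP]"  # assumed segment separator


def build_input_sequence(system_tokens, user_tokens, candidate_tokens) -> List[str]:
    """Combine system utterance, user utterance, and candidate into one sequence."""
    return [CLS, *system_tokens, SEP, *user_tokens, SEP, *candidate_tokens, SEP]


def toy_relevance_model(sequence: List[str]) -> float:
    """Stand-in scorer: probability rises with candidate/context token overlap."""
    body = sequence[1:-1]  # drop the leading CLS and the trailing SEP
    last_sep = len(body) - 1 - body[::-1].index(SEP)  # separator before the candidate
    context = {t.lower() for t in body[:last_sep] if t != SEP}
    candidate = [t.lower() for t in body[last_sep + 1:]]
    overlap = sum(t in context for t in candidate) / max(len(candidate), 1)
    return 1.0 / (1.0 + math.exp(-(4.0 * overlap - 2.0)))


def score_candidates(system_tokens, user_tokens, candidates) -> List[Tuple[List[str], float]]:
    """For each utterance candidate, build its input sequence and evaluate it."""
    results = []
    for candidate in candidates:
        sequence = build_input_sequence(system_tokens, user_tokens, candidate)
        results.append((candidate, toy_relevance_model(sequence)))
    return results


if __name__ == "__main__":
    system = ["Which", "city", "are", "you", "visiting"]
    user = ["Paris"]
    for cand, prob in score_candidates(system, user, [["Paris", "hotels"], ["weather"]]):
        print(cand, round(prob, 3))
```

    Scoring each candidate in its own input sequence keeps the candidates independent of one another, which matches the per-candidate wording of the abstract; swapping in a real model only requires replacing `toy_relevance_model`.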
