-
Publication No.: US11170158B2
Publication date: 2021-11-09
Application No.: US15915775
Filing date: 2018-03-08
Applicant: Adobe Inc.
Inventor: Arman Cohan , Walter W. Chang , Trung Huu Bui , Franck Dernoncourt , Doo Soon Kim
IPC: G06F40/14 , G06F40/146 , G06N3/04 , G06F16/93 , G06F16/34 , G06F40/30 , G06F40/56 , G06F40/274 , G06F40/289
Abstract: Techniques are disclosed for an abstractive summarization process for summarizing documents, including long documents. A document is encoded using an encoder-decoder architecture with attentive decoding. In particular, an encoder for modeling documents generates both word-level and section-level representations of a document. A discourse-aware decoder then captures the information flow from all discourse sections of a document. In order to extend the robustness of the generated summarization, a neural attention mechanism considers both word-level and section-level representations of a document. The neural attention mechanism may utilize a set of weights that are applied to the word-level representations and section-level representations.
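A minimal sketch of how an attention step might combine the two levels. It assumes (the abstract does not say this) that each word's exponentiated score is scaled by the softmax weight of its section and the result is renormalized over the whole document. The function names and input scores are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def discourse_aware_attention(word_scores, section_scores):
    """word_scores: one list of raw word scores per section.
    section_scores: one raw score per section.
    Returns per-word weights that sum to 1 across the document."""
    section_weights = softmax(section_scores)
    flat_max = max(s for sec in word_scores for s in sec)
    # Assumption: scale each word's exponentiated score by its section weight.
    scaled = [[w * math.exp(s - flat_max) for s in sec]
              for w, sec in zip(section_weights, word_scores)]
    total = sum(sum(sec) for sec in scaled)
    return [[v / total for v in sec] for sec in scaled]

# Two sections of two words each, all with equal raw scores.
weights = discourse_aware_attention([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

With equal scores everywhere, every word receives the same weight; raising one section's score shifts attention mass toward all words in that section.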
-
Publication No.: US20200327884A1
Publication date: 2020-10-15
Application No.: US16383312
Filing date: 2019-04-12
Applicant: ADOBE INC.
Inventor: Trung Huu Bui , Subhadeep Dey , Franck Dernoncourt
Abstract: Methods and systems are provided for generating a customized speech recognition neural network system comprised of an adapted automatic speech recognition neural network and an adapted language model neural network. The automatic speech recognition neural network is first trained in a generic domain and then adapted to a target domain. The language model neural network is first trained in a generic domain and then adapted to a target domain. Such a customized speech recognition neural network system can be used to understand input vocal commands.
-
Publication No.: US20200160842A1
Publication date: 2020-05-21
Application No.: US16198302
Filing date: 2018-11-21
Applicant: Adobe Inc.
Inventor: Tzu-Hsiang Lin , Trung Huu Bui , Doo Soon Kim
Abstract: Dialog system training techniques using a simulated user system are described. In one example, a simulated user system supports multiple agents. The dialog system, for instance, may be configured for use with an application (e.g., digital image editing application). The simulated user system may therefore simulate user actions involving both the application and the dialog system which may be used to train the dialog system. Additionally, the simulated user system is not limited to simulation of user interactions by a single input mode (e.g., natural language inputs), but also supports multimodal inputs. Further, the simulated user system may also support use of multiple goals within a single dialog session.
-
Publication No.: US10586528B2
Publication date: 2020-03-10
Application No.: US15423429
Filing date: 2017-02-02
Applicant: Adobe Inc.
Abstract: Domain-specific speech recognizer generation with crowd sourcing is described. The domain-specific speech recognizers are generated for voice user interfaces (VUIs) configured to replace or supplement application interfaces. In accordance with the described techniques, the speech recognizers are generated for a respective such application interface and are domain-specific because they are each generated based on language data that corresponds to the respective application interface. This domain-specific language data is used to build a domain-specific language model. The domain-specific language data is also used to collect acoustic data for building an acoustic model. In particular, the domain-specific language data is used to generate user interfaces that prompt crowd-sourcing participants to say selected words represented by the language data for recording. The recordings of these selected words are then used to build the acoustic model. The domain-specific speech recognizers are generated by combining a respective domain-specific language model and crowd-sourced acoustic model.
-
Publication No.: US20190156822A1
Publication date: 2019-05-23
Application No.: US15820874
Filing date: 2017-11-22
Applicant: Adobe Inc.
Inventor: Ramesh Radhakrishna Manuvinakurike , Trung Huu Bui , Walter W. Chang
CPC classification number: G10L15/22 , G06F3/167 , G10L15/04 , G10L15/063 , G10L15/265 , G10L25/57 , G10L25/78 , G10L25/87 , G10L2015/223
Abstract: A technique for multiple turn conversational task assistance includes receiving data representing a conversation between a user and an agent. The conversation includes a digitally recorded video portion and a digitally recorded audio portion, where the audio portion corresponds to the video portion. Next, the audio portion is segmented into a plurality of audio chunks. For each of the audio chunks, a transcript of the respective audio chunk is received. Each of the audio chunks is grouped into one or more dialog acts, where each dialog act includes at least one of the respective audio chunks, the validated transcript corresponds to the respective audio chunks, and a portion of the video portion corresponds to the respective audio chunk. Each of the dialog acts is stored in a data corpus.
-
Publication No.: US11960843B2
Publication date: 2024-04-16
Application No.: US16401548
Filing date: 2019-05-02
Applicant: Adobe Inc.
Inventor: Zhe Lin , Trung Huu Bui , Scott Cohen , Mingyang Ling , Chenyun Wu
IPC: G06N20/00 , G06F40/30 , G06V10/25 , G06V10/764 , G06V10/82 , G06F18/21 , G06F40/205
CPC classification number: G06F40/30 , G06N20/00 , G06V10/25 , G06V10/764 , G06V10/82 , G06F18/217 , G06F40/205
Abstract: Techniques and systems are provided for training a machine learning model using different datasets to perform one or more tasks. The machine learning model can include a first sub-module configured to perform a first task and a second sub-module configured to perform a second task. The first sub-module can be selected for training using a first training dataset based on a format of the first training dataset. The first sub-module can then be trained using the first training dataset to perform the first task. The second sub-module can be selected for training using a second training dataset based on a format of the second training dataset. The second sub-module can then be trained using the second training dataset to perform the second task.
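A minimal sketch of format-based sub-module selection. The abstract does not say how a dataset's format is inspected, so this assumes the format is identified by the field names of the first example. The field names and sub-module names are hypothetical.

```python
def select_submodule(dataset):
    """Pick a sub-module name from the fields of the first example.

    Assumption: a dataset's format is identified by its field names.
    """
    fields = set(dataset[0].keys())
    if {"image", "boxes"} <= fields:
        return "region_submodule"      # hypothetical first task
    if {"image", "phrase"} <= fields:
        return "language_submodule"    # hypothetical second task
    raise ValueError(f"unrecognized dataset format: {sorted(fields)}")

first_dataset = [{"image": "a.png", "boxes": [(0, 0, 10, 10)]}]
second_dataset = [{"image": "b.png", "phrase": "red car on the left"}]

# Each dataset is routed to the sub-module that matches its format.
routing = [select_submodule(first_dataset), select_submodule(second_dataset)]
```

In a training loop, only the selected sub-module would be updated on batches drawn from the corresponding dataset.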
-
Publication No.: US20240020337A1
Publication date: 2024-01-18
Application No.: US17811963
Filing date: 2022-07-12
Applicant: ADOBE INC.
Inventor: Adyasha Maharana , Quan Hung Tran , Seunghyun Yoon , Franck Dernoncourt , Trung Huu Bui , Walter W. Chang
IPC: G06F16/738 , G10L13/08 , G06F40/284 , G06F16/783
CPC classification number: G06F16/739 , G10L13/08 , G06F40/284 , G06F16/7844
Abstract: Systems and methods for intent discovery and video summarization are described. Embodiments of the present disclosure receive a video and a transcript of the video, encode the video to obtain a sequence of video encodings, encode the transcript to obtain a sequence of text encodings, apply a visual gate to the sequence of text encodings based on the sequence of video encodings to obtain gated text encodings, and generate an intent label for the transcript based on the gated text encodings.
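A minimal sketch of a visual gate, assuming the gate is an elementwise sigmoid of the video encoding that multiplies the aligned text encoding. The abstract does not give the gate's form; a real gate would likely include learned parameters, which are omitted here. All names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def apply_visual_gate(text_encodings, video_encodings):
    """Gate each text vector by the video vector at the same step.

    Assumption: both sequences are aligned and share one dimension.
    """
    if len(text_encodings) != len(video_encodings):
        raise ValueError("text and video sequences must be aligned")
    gated = []
    for t_vec, v_vec in zip(text_encodings, video_encodings):
        # The gate value lies in (0, 1) and scales each text feature.
        gated.append([t * sigmoid(v) for t, v in zip(t_vec, v_vec)])
    return gated

# A zero video encoding gives a gate of 0.5 on every feature.
gated = apply_visual_gate([[2.0, -4.0]], [[0.0, 0.0]])
```

The gated text encodings would then feed a classifier that produces the intent label.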
-
Publication No.: US20230419164A1
Publication date: 2023-12-28
Application No.: US17846428
Filing date: 2022-06-22
Applicant: Adobe Inc.
Inventor: Khalil Mrini , Franck Dernoncourt , Seunghyun Yoon , Trung Huu Bui , Walter W. Chang , Emilia Farcas , Ndapandula T. Nakashole
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: Multitask machine-learning model training and training data augmentation techniques are described. In one example, training is performed for multiple tasks simultaneously as part of training a multitask machine-learning model using question pairs. Examples of the multiple tasks include question summarization and recognizing question entailment. Further, a loss function is described that incorporates a parameter sharing loss that is configured to adjust an amount that parameters are shared between corresponding layers trained for the first and second tasks, respectively. In an implementation, training data augmentation techniques are also employed by synthesizing question pairs, automatically and without user intervention, to improve accuracy in model training.
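A minimal sketch of a loss with a parameter sharing term. It assumes the sharing loss is a squared distance between corresponding layer parameters of the two tasks, scaled by a strength coefficient. The abstract does not specify the form of the term; the function and parameter names are hypothetical.

```python
def parameter_sharing_penalty(layers_a, layers_b):
    """Sum of squared differences between corresponding layer parameters."""
    return sum((pa - pb) ** 2
               for layer_a, layer_b in zip(layers_a, layers_b)
               for pa, pb in zip(layer_a, layer_b))

def multitask_loss(loss_a, loss_b, layers_a, layers_b, share_strength):
    """Task losses plus a weighted sharing penalty.

    A larger share_strength pulls the two tasks' layers closer together;
    zero lets them diverge freely (assumption about the penalty's role).
    """
    penalty = parameter_sharing_penalty(layers_a, layers_b)
    return loss_a + loss_b + share_strength * penalty

# One layer per task, two parameters each; they differ by 2.0 in one slot.
total = multitask_loss(1.0, 2.0, [[1.0, 2.0]], [[1.0, 4.0]], 0.5)
```

Here the penalty is 4.0, so the combined loss is 1.0 + 2.0 + 0.5 * 4.0 = 5.0.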
-
Publication No.: US20230418868A1
Publication date: 2023-12-28
Application No.: US17808599
Filing date: 2022-06-24
Applicant: ADOBE INC.
Inventor: Yeon Seonwoo , Seunghyun Yoon , Trung Huu Bui , Franck Dernoncourt , Roger K. Brooks , Mihir Naware
IPC: G06F16/901 , G06F16/903 , G06F16/9038 , G06F16/93
CPC classification number: G06F16/9024 , G06F16/90335 , G06F16/9038 , G06F16/93
Abstract: Systems and methods for text processing are described. Embodiments of the present disclosure receive a query comprising a natural language expression; extract a plurality of mentions from the query; generate a relation vector between a pair of the plurality of mentions using a relation encoder network, wherein the relation encoder network is trained using a contrastive learning process where mention pairs from a same document are labeled as positive samples and mention pairs from different documents are labeled as negative samples; combine the plurality of mentions with the relation vector to obtain a virtual knowledge graph of the query; identify a document corresponding to the query by comparing the virtual knowledge graph of the query to a virtual knowledge graph of the document; and transmit a response to the query, wherein the response includes a reference to the document.
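The contrastive labeling rule in the abstract (mention pairs from the same document are positive, pairs from different documents are negative) can be sketched as follows. The function name, data layout, and example mentions are hypothetical, and the relation encoder itself is not modeled.

```python
from itertools import combinations

def label_mention_pairs(doc_mentions):
    """doc_mentions maps a document id to its list of mentions.

    Returns (mention_a, mention_b, label) tuples, where label is 1 when
    both mentions come from the same document and 0 otherwise.
    """
    tagged = [(doc_id, mention)
              for doc_id, mentions in doc_mentions.items()
              for mention in mentions]
    pairs = []
    for (doc_a, m_a), (doc_b, m_b) in combinations(tagged, 2):
        pairs.append((m_a, m_b, 1 if doc_a == doc_b else 0))
    return pairs

pairs = label_mention_pairs({"d1": ["Adobe", "Photoshop"], "d2": ["Paris"]})
```

A training loop would encode each pair into a relation vector and push positive pairs together and negative pairs apart.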
-
Publication No.: US11769111B2
Publication date: 2023-09-26
Application No.: US16904881
Filing date: 2020-06-18
Applicant: ADOBE INC.
Inventor: Trung Huu Bui , Hung Hai Bui , Shawn Alan Gaither , Walter Wei-Tuh Chang , Michael Frank Kraley , Pranjal Daga
IPC: G06F17/00 , G06Q10/10 , G06Q10/06 , G06F40/10 , G06V30/148 , G06V30/413 , G06F40/103
CPC classification number: G06Q10/10 , G06F40/10 , G06F40/103 , G06Q10/06 , G06V30/153 , G06V30/413
Abstract: The present invention is directed towards providing automated workflows for the identification of a reading order from text segments extracted from a document. Ordering the text segments is based on trained natural language models. In some embodiments, the workflows are enabled to perform a method for identifying a sequence associated with a portable document. The method includes iteratively generating a probabilistic language model, receiving the portable document, and selectively extracting features (such as but not limited to text segments) from the document. The method may generate pairs of features (or feature pairs) from the extracted features. The method may further generate a score for each of the pairs based on the probabilistic language model and determine an order of the features based on the scores. The method may provide the extracted features in the determined order.
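A minimal sketch of ordering segments from pairwise scores. It assumes the score of a pair (a, b) estimates how likely b is to follow a, and that the order is the permutation with the highest sum of adjacent-pair scores. The abstract does not state how the order is derived from the scores. The toy score table stands in for the probabilistic language model and is hypothetical.

```python
from itertools import permutations

def order_segments(segments, score):
    """Return the ordering that maximizes the sum of adjacent-pair scores.

    Exhaustive search, so only practical for a handful of segments.
    """
    def total(order):
        return sum(score(a, b) for a, b in zip(order, order[1:]))
    return list(max(permutations(segments), key=total))

# Toy stand-in for language-model scores (hypothetical values).
follows = {
    ("Dear reader,", "thank you for"): 0.9,
    ("thank you for", "your order."): 0.8,
}

def toy_score(a, b):
    return follows.get((a, b), 0.0)

ordered = order_segments(
    ["your order.", "Dear reader,", "thank you for"], toy_score)
```

For longer documents a greedy or beam search over the same pairwise scores would replace the exhaustive search.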