Training text recognition systems

    Publication No.: US11810374B2

    Publication Date: 2023-11-07

    Application No.: US17240097

    Filing Date: 2021-04-26

    Applicant: Adobe Inc.

    Abstract: In implementations of recognizing text in images, text recognition systems are trained using noisy images that have nuisance factors applied, and corresponding clean images (e.g., without nuisance factors). Clean images serve as supervision at both feature and pixel levels, so that text recognition systems are trained to be feature invariant (e.g., by requiring features extracted from a noisy image to match features extracted from a clean image), and feature complete (e.g., by requiring that features extracted from a noisy image be sufficient to generate a clean image). Accordingly, text recognition systems generalize to text not included in training images, and are robust to nuisance factors. Furthermore, since clean images are provided as supervision at feature and pixel levels, training requires fewer training images than text recognition systems that are not trained with a supervisory clean image, thus saving time and resources.
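The two supervision levels described in this abstract can be illustrated with a toy training loss. This is a minimal sketch, not the patented system: the single-matrix encoder/decoder and the additive-noise "nuisance factor" are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy encoder/decoder: single linear maps standing in for a
# text-recognition feature extractor and a clean-image generator.
W_enc = rng.standard_normal((16, 64)) * 0.1  # 64-pixel image -> 16-dim features
W_dec = rng.standard_normal((64, 16)) * 0.1  # features -> reconstructed image

def encode(img):
    return W_enc @ img

def decode(feat):
    return W_dec @ feat

def supervision_loss(noisy_img, clean_img):
    """Clean-image supervision at both levels from the abstract:
    feature invariance (features of the noisy image must match features of
    the clean image) plus feature completeness (features of the noisy image
    must suffice to regenerate the clean image)."""
    f_noisy = encode(noisy_img)
    f_clean = encode(clean_img)
    invariance = np.mean((f_noisy - f_clean) ** 2)               # feature level
    completeness = np.mean((decode(f_noisy) - clean_img) ** 2)   # pixel level
    return invariance + completeness

clean = rng.random(64)
noisy = clean + 0.05 * rng.standard_normal(64)  # nuisance factors as additive noise
loss = supervision_loss(noisy, clean)
```

Minimizing both terms jointly is what pushes the extracted features to be both invariant to the noise and complete enough to reconstruct the clean image.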

    Digital content interaction prediction and training that addresses imbalanced classes

    公开(公告)号:US11676060B2

    公开(公告)日:2023-06-13

    申请号:US15002206

    申请日:2016-01-20

    Applicant: Adobe Inc.

    CPC classification number: G06N20/00 G06Q30/02

    Abstract: Digital content interaction prediction and training techniques that address imbalanced classes are described. In one or more implementations, a digital medium environment is described to predict user interaction with digital content while addressing an imbalance between the numbers of samples in first and second classes of the training data used to train a model using machine learning. The training data describing the first class and the second class is received. A model is trained using machine learning. The training includes sampling the training data to include at least one subset of the training data from the first class and at least one subset from the second class. Batches are iteratively selected from the sampled training data and iteratively processed by a classifier, implemented using machine learning, to train the model.
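The sampling step described above can be sketched as a class-balanced batch sampler. This is an illustrative reading, not the patented method: the specific 50/50 split and minority oversampling are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical imbalanced training data: class 0 (no interaction) vastly
# outnumbers class 1 (interaction with the digital content).
labels = np.array([0] * 990 + [1] * 10)
idx_class0 = np.flatnonzero(labels == 0)
idx_class1 = np.flatnonzero(labels == 1)

def sample_balanced_batch(batch_size=32):
    """Draw a batch containing a subset from each class, so the minority
    class is represented in every iteratively selected batch (a sketch of
    the abstract's sampling step under assumed proportions)."""
    half = batch_size // 2
    picked0 = rng.choice(idx_class0, size=half, replace=False)
    picked1 = rng.choice(idx_class1, size=half, replace=True)  # oversample minority
    batch = np.concatenate([picked0, picked1])
    rng.shuffle(batch)
    return batch

batch = sample_balanced_batch()
minority_fraction = np.mean(labels[batch] == 1)
```

A classifier trained on such batches sees both classes every iteration, instead of almost exclusively the majority class.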

    Generating responses to queries about videos utilizing a multi-modal neural network with attention

    公开(公告)号:US11615308B2

    公开(公告)日:2023-03-28

    申请号:US17563901

    申请日:2021-12-28

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating a response to a question received from a user during display or playback of a video segment by utilizing a query-response-neural network. The disclosed systems can extract a query vector from a question corresponding to the video segment using the query-response-neural network. The disclosed systems further generate context vectors representing both visual cues and transcript cues corresponding to the video segment using context encoders or other layers from the query-response-neural network. By utilizing additional layers from the query-response-neural network, the disclosed systems generate (i) a query-context vector based on the query vector and the context vectors, and (ii) candidate-response vectors representing candidate responses to the question from a domain-knowledge base or other source. To respond to a user's question, the disclosed systems further select a response from the candidate responses based on a comparison of the query-context vector and the candidate-response vectors.
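The final selection step, comparing the query-context vector against candidate-response vectors, can be sketched as follows. The vectors here are random stand-ins for the network's outputs, and cosine similarity is an assumed comparison (the abstract only says the vectors are compared).

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8

# Hypothetical outputs of the query-response-neural network: one fused
# query-context vector and one candidate-response vector per candidate.
query_context = rng.standard_normal(dim)
candidates = ["rewind 10 seconds", "open the chapter menu", "enable captions"]
candidate_vecs = rng.standard_normal((len(candidates), dim))

def select_response(qc_vec, cand_vecs, cand_texts):
    """Pick the candidate whose vector is most similar to the
    query-context vector (cosine similarity assumed)."""
    qc = qc_vec / np.linalg.norm(qc_vec)
    cv = cand_vecs / np.linalg.norm(cand_vecs, axis=1, keepdims=True)
    scores = cv @ qc                 # cosine similarity per candidate
    best = int(np.argmax(scores))
    return cand_texts[best], scores

response, scores = select_response(query_context, candidate_vecs, candidates)
```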

    DETERMINING FINE-GRAIN VISUAL STYLE SIMILARITIES FOR DIGITAL IMAGES BY EXTRACTING STYLE EMBEDDINGS DISENTANGLED FROM IMAGE CONTENT

    公开(公告)号:US20220092108A1

    公开(公告)日:2022-03-24

    申请号:US17025041

    申请日:2020-09-18

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for accurately and flexibly identifying digital images with similar style to a query digital image using fine-grain style determination via weakly supervised style extraction neural networks. For example, the disclosed systems can extract a style embedding from a query digital image using a style extraction neural network such as a novel two-branch autoencoder architecture or a weakly supervised discriminative neural network. The disclosed systems can generate a combined style embedding by combining complementary style embeddings from different style extraction neural networks. Moreover, the disclosed systems can search a repository of digital images to identify digital images with similar style to the query digital image. The disclosed systems can also learn parameters for one or more style extraction neural networks through weakly supervised training without a specifically labeled style ontology for sample digital images.
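The combination and search steps above can be sketched as follows. Normalize-and-concatenate is one plausible reading of "combined style embedding", and the random vectors stand in for the two extraction networks' outputs; none of this is the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

def combine_style_embeddings(emb_a, emb_b):
    """Combine complementary embeddings from two style extractors (e.g. the
    two-branch autoencoder and the discriminative network) by L2-normalizing
    each branch and concatenating -- an assumed combination scheme."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return np.concatenate([a, b])

# Hypothetical 16-dim embeddings for a query image and a 100-image repository.
query = combine_style_embeddings(rng.standard_normal(16), rng.standard_normal(16))
repository = np.stack([
    combine_style_embeddings(rng.standard_normal(16), rng.standard_normal(16))
    for _ in range(100)
])

# Style search: rank repository images by cosine similarity to the query.
sims = (repository / np.linalg.norm(repository, axis=1, keepdims=True)) @ (
    query / np.linalg.norm(query))
top5 = np.argsort(sims)[::-1][:5]  # indices of the 5 most style-similar images
```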
