-
Publication Number: US10565518B2
Publication Date: 2020-02-18
Application Number: US14748059
Filing Date: 2015-06-23
Applicant: Adobe Inc.
Inventor: Hailin Jin , Chen Fang , Jianchao Yang , Zhe Lin
Abstract: The present disclosure is directed to collaborative feature learning using social media data. For example, a machine learning system may identify social media data that includes user behavioral data, which indicates user interactions with content items. Using the identified user behavioral data, the machine learning system may determine latent representations of the content items. In some embodiments, the machine learning system may train a machine-learning model based on the latent representations. Further, the machine learning system may extract features of the content items from the trained machine-learning model.
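The abstract outlines a two-stage pipeline: learn latent item representations from behavioral data, then train a feature extractor against them. A minimal PyTorch sketch of that shape, assuming a binary user-item interaction matrix and random stand-in content features; all names, dimensions, and the plain matrix factorization used here are illustrative assumptions, not the patented method:

```python
# Minimal sketch: learn latent item representations from a user-item
# interaction matrix (collaborative filtering), then train a network to
# predict those latents from raw content features. Shapes and the
# factorization approach are illustrative assumptions.
import torch

n_users, n_items, feat_dim, latent_dim = 100, 50, 32, 8
interactions = (torch.rand(n_users, n_items) > 0.8).float()  # hypothetical behavior data
content = torch.randn(n_items, feat_dim)                     # hypothetical item features

# Step 1: factorize interactions into user and item latent factors.
U = torch.randn(n_users, latent_dim, requires_grad=True)
V = torch.randn(n_items, latent_dim, requires_grad=True)
opt = torch.optim.Adam([U, V], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = ((U @ V.T - interactions) ** 2).mean()
    loss.backward()
    opt.step()

# Step 2: train a feature extractor to map content to the learned latents.
net = torch.nn.Sequential(torch.nn.Linear(feat_dim, 64), torch.nn.ReLU(),
                          torch.nn.Linear(64, latent_dim))
opt2 = torch.optim.Adam(net.parameters(), lr=1e-3)
target = V.detach()
for _ in range(200):
    opt2.zero_grad()
    loss = torch.nn.functional.mse_loss(net(content), target)
    loss.backward()
    opt2.step()

features = net(content)  # extracted features for downstream use
```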
-
Publication Number: US10515295B2
Publication Date: 2019-12-24
Application Number: US15796213
Filing Date: 2017-10-27
Applicant: Adobe Inc.
Inventor: Yang Liu , Zhaowen Wang , Hailin Jin
Abstract: The present disclosure relates to a font recognition system that employs a multi-task learning framework to jointly improve font classification and remove negative side effects caused by intra-class variances of glyph content. For example, in one or more embodiments, the font recognition system can jointly train a font recognition neural network using a font classification loss model and a triplet loss model to generate a deep learning neural network that provides improved font classifications. In addition, the font recognition system can employ the trained font recognition neural network to efficiently recognize fonts within input images as well as provide other suggested fonts.
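A minimal PyTorch sketch of the joint objective the abstract describes: a font classification loss combined with a triplet loss over a shared embedding. The backbone, margin, and batch construction are illustrative assumptions, not the patented network:

```python
# Minimal sketch of multi-task training: cross-entropy font classification
# plus a triplet loss that pulls same-font glyphs together. All shapes,
# layers, and the margin value are illustrative assumptions.
import torch
import torch.nn as nn

n_fonts, emb_dim = 10, 64

backbone = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128), nn.ReLU(),
                         nn.Linear(128, emb_dim))
classifier = nn.Linear(emb_dim, n_fonts)
triplet = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(list(backbone.parameters()) + list(classifier.parameters()), lr=1e-3)

# Hypothetical batch: anchor/positive share a font, negative differs.
anchor, positive, negative = (torch.randn(16, 1, 32, 32) for _ in range(3))
labels = torch.randint(0, n_fonts, (16,))

ea, ep, en = backbone(anchor), backbone(positive), backbone(negative)
loss = nn.functional.cross_entropy(classifier(ea), labels) + triplet(ea, ep, en)
opt.zero_grad(); loss.backward(); opt.step()
```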
-
Publication Number: US20190130231A1
Publication Date: 2019-05-02
Application Number: US15796213
Filing Date: 2017-10-27
Applicant: Adobe Inc.
Inventor: Yang Liu , Zhaowen Wang , Hailin Jin
Abstract: The present disclosure relates to a font recognition system that employs a multi-task learning framework to jointly improve font classification and remove negative side effects caused by intra-class variances of glyph content. For example, in one or more embodiments, the font recognition system can jointly train a font recognition neural network using a font classification loss model and a triplet loss model to generate a deep learning neural network that provides improved font classifications. In addition, the font recognition system can employ the trained font recognition neural network to efficiently recognize fonts within input images as well as provide other suggested fonts.
-
Publication Number: US10268928B2
Publication Date: 2019-04-23
Application Number: US15616776
Filing Date: 2017-06-07
Applicant: Adobe Inc.
Inventor: Hailin Jin , John Philip Collomosse
Abstract: A combined structure and style network is described. Initially, a large set of training images, having a variety of different styles, is obtained. Each of these training images is associated with one of multiple different predetermined style categories indicating the image's style and one of multiple different predetermined semantic categories indicating objects depicted in the image. Groups of these images are formed, such that each group includes an anchor image having one of the styles, a positive-style example image having the same style as the anchor image, and a negative-style example image having a different style. Based on those groups, an image style network is generated to identify images having desired styling by recognizing visual characteristics of the different styles. The image style network is further combined, according to a unifying training technique, with an image structure network configured to recognize desired objects in images irrespective of image style.
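The grouping the abstract describes (anchor, positive-style example, negative-style example) is the standard triplet setup. A minimal sketch under toy assumptions: a style branch trained with style triplets, a structure branch trained on semantic categories, and the two embeddings concatenated into one descriptor. The branch architectures and dimensions are illustrative, not the patented unifying technique:

```python
# Minimal sketch: style branch learns from (anchor, positive, negative)
# style triplets; structure branch learns semantic categories; combined
# descriptor joins both. All layers and sizes are illustrative assumptions.
import torch
import torch.nn as nn

def branch(out_dim):
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

style_net, structure_net = branch(64), branch(64)
semantic_head = nn.Linear(64, 20)  # 20 hypothetical object categories
triplet = nn.TripletMarginLoss(margin=0.5)
params = (list(style_net.parameters()) + list(structure_net.parameters())
          + list(semantic_head.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

# Hypothetical group: anchor and positive share a style, negative does not.
anchor, pos_style, neg_style = (torch.randn(8, 3, 64, 64) for _ in range(3))
sem_labels = torch.randint(0, 20, (8,))

opt.zero_grad()
style_loss = triplet(style_net(anchor), style_net(pos_style), style_net(neg_style))
struct_loss = nn.functional.cross_entropy(semantic_head(structure_net(anchor)), sem_labels)
(style_loss + struct_loss).backward()
opt.step()

# Combined descriptor for retrieval by both style and depicted objects:
desc = torch.cat([style_net(anchor), structure_net(anchor)], dim=1)
```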
-
Publication Number: US20230386054A1
Publication Date: 2023-11-30
Application Number: US17804376
Filing Date: 2022-05-27
Applicant: Adobe Inc. , University of Surrey
Inventor: John Collomosse , Alexander Black , Van Tu Bui , Hailin Jin , Viswanathan Swaminathan
CPC classification number: G06T7/337 , G06T3/0093 , G06N3/04 , G06T2207/20221 , G06T2207/20084
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize deep learning to identify regions of an image that have been editorially modified. For example, the image comparison system includes a deep image comparator model that compares a pair of images and localizes regions that have been editorially manipulated relative to an original or trusted image. More specifically, the deep image comparator model generates and surfaces visual indications of the location of such editorial changes on the modified image. The deep image comparator model is robust and ignores discrepancies due to benign image transformations that commonly occur during electronic image distribution. The image comparison system optionally includes an image retrieval model that utilizes a visual search embedding robust to minor manipulations or benign modifications of images, enabling it to robustly identify near-duplicate images.
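A minimal sketch of the comparator's overall shape, assuming a shared toy encoder for both images and a decoder that turns the feature difference into a per-pixel change heatmap. Real robustness to benign transformations would require the alignment and training this sketch omits; everything here is an illustrative assumption:

```python
# Minimal sketch of a pairwise comparator: encode both images with a shared
# CNN, then decode the feature difference into an "edited region" heatmap.
# Layers and shapes are illustrative assumptions, not the patented model.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
decoder = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 1, 1),
                        nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False))

trusted = torch.randn(1, 3, 128, 128)   # hypothetical original image
modified = torch.randn(1, 3, 128, 128)  # hypothetical manipulated copy

diff = encoder(modified) - encoder(trusted)  # where do the features disagree?
heatmap = torch.sigmoid(decoder(diff))       # values near 1 = likely editorial change
print(heatmap.shape)  # torch.Size([1, 1, 128, 128])
```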
-
Publication Number: US11823322B2
Publication Date: 2023-11-21
Application Number: US17807337
Filing Date: 2022-06-16
Applicant: Adobe Inc.
Inventor: Tong He , John Collomosse , Hailin Jin
CPC classification number: G06T15/08 , G06T7/74 , G06V10/454 , G06V10/82 , G06V20/647 , G06T2200/08 , G06T2207/20084
Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for utilizing an encoder-decoder architecture to learn a volumetric 3D representation of an object using digital images of the object from multiple viewpoints to render novel views of the object. For instance, the disclosed systems can utilize patch-based image feature extraction to extract lifted feature representations from images corresponding to different viewpoints of an object. Furthermore, the disclosed systems can model view-dependent transformed feature representations using learned transformation kernels. In addition, the disclosed systems can recurrently and concurrently aggregate the transformed feature representations to generate a 3D voxel representation of the object. Furthermore, the disclosed systems can sample frustum features using the 3D voxel representation and transformation kernels. Then, the disclosed systems can utilize a patch-based neural rendering approach to render images from frustum feature patches to display a view of the object from various viewpoints.
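A heavily simplified sketch of the data flow only: per-view features are lifted into a voxel-shaped feature grid, recurrently aggregated across views with a GRU, and decoded into a rendered view. The geometric lifting, learned transformation kernels, frustum sampling, and patch-based rendering of the actual system are all replaced by plain learned layers here, purely as illustrative assumptions:

```python
# Highly simplified sketch of the pipeline shape: encode each view, "lift"
# to a voxel feature grid, recurrently aggregate across views, decode a
# rendered image. Every layer is a stand-in assumption.
import torch
import torch.nn as nn

V, feat_dim = 4, 8  # voxel grid resolution and feature size (assumptions)
grid = feat_dim * V * V * V

image_encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                              nn.Linear(16, feat_dim))
lift = nn.Linear(feat_dim, grid)       # stand-in for geometric feature lifting
aggregate = nn.GRUCell(grid, grid)     # recurrent aggregation across viewpoints
render = nn.Linear(grid, 3 * 32 * 32)  # stand-in for neural rendering

views = [torch.randn(1, 3, 64, 64) for _ in range(4)]  # 4 hypothetical viewpoints
state = torch.zeros(1, grid)
for img in views:
    lifted = lift(image_encoder(img))
    state = aggregate(lifted, state)  # fold each view into the 3D representation

voxels = state.view(1, feat_dim, V, V, V)      # learned volumetric representation
novel_view = render(state).view(1, 3, 32, 32)  # decoded image for a new viewpoint
```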
-
Publication Number: US20230075087A1
Publication Date: 2023-03-09
Application Number: US17466636
Filing Date: 2021-09-03
Applicant: Adobe Inc.
Inventor: Simon Jenni , Hailin Jin
Abstract: The disclosed invention includes systems and methods for training and employing equivariant models for generating representations (e.g., vector representations) of temporally-varying content, such as but not limited to video content. The trained models are equivariant to temporal transformations applied to the input content (e.g., video content). The trained models are additionally invariant to non-temporal transformations (e.g., spatial and/or color-space transformations) applied to the input content. Such representations are employed in various machine learning tasks, such as but not limited to video retrieval (e.g., video search engine applications), identification of actions depicted in video, and temporally ordering clips of the video.
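One way to make the equivariance/invariance split concrete: predict the applied temporal transformation from a pair of representations (so temporal structure stays recoverable from the embedding) while pulling spatially augmented copies together. A minimal sketch under those assumptions; the transforms and losses are illustrative, not the patented training scheme:

```python
# Minimal sketch: equivariance loss = classify which temporal transform was
# applied, from the pair of representations; invariance loss = match the
# representation of a spatially perturbed copy. All components are
# illustrative assumptions.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(16 * 3 * 32 * 32, 256), nn.ReLU(),
                    nn.Linear(256, 64))  # toy clip encoder: 16 frames of 32x32 RGB
transform_head = nn.Linear(128, 2)       # classify: 0 = identity, 1 = reversed
opt = torch.optim.Adam(list(enc.parameters()) + list(transform_head.parameters()), lr=1e-3)

clip = torch.randn(4, 16, 3, 32, 32)                # hypothetical clip batch
reversed_clip = torch.flip(clip, dims=[1])          # temporal transformation
spatial_aug = clip + 0.05 * torch.randn_like(clip)  # stand-in spatial augmentation
labels = torch.tensor([1, 1, 1, 1])                 # "reversed" was applied

z, z_t, z_s = enc(clip), enc(reversed_clip), enc(spatial_aug)
equiv_loss = nn.functional.cross_entropy(transform_head(torch.cat([z, z_t], dim=1)), labels)
inv_loss = nn.functional.mse_loss(z, z_s)  # invariance to non-temporal changes
opt.zero_grad(); (equiv_loss + inv_loss).backward(); opt.step()
```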
-
Publication Number: US20220414314A1
Publication Date: 2022-12-29
Application Number: US17362031
Filing Date: 2021-06-29
Applicant: Adobe Inc.
Inventor: Zhifei Zhang , Zhaowen Wang , Hailin Jin , Matthew Fisher
IPC: G06F40/109 , G06T11/20 , G06N3/04
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately and flexibly generating scalable and semantically editable font representations utilizing a machine learning approach. For example, the disclosed systems generate a font representation code from a glyph utilizing a particular neural network architecture. In particular, the disclosed systems utilize a glyph appearance propagation model and perform an iterative process to generate a font representation code from an initial glyph. Additionally, using a glyph appearance propagation model, the disclosed systems automatically propagate the appearance of the initial glyph from the font representation code to generate additional glyphs corresponding to respective glyph labels. In some embodiments, the disclosed systems propagate edits or other changes in appearance of a glyph to other glyphs within a glyph set (e.g., to match the appearance of the edited glyph).
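A minimal sketch of the iterative idea, assuming a stand-in glyph decoder conditioned on a latent font code and a glyph label: fit the code by gradient descent so the decoder reproduces the initial glyph, then reuse that code to generate the remaining glyphs. The decoder, code size, and one-hot labels are illustrative assumptions:

```python
# Minimal sketch: iteratively optimize a font representation code against
# one glyph, then propagate its appearance to all glyph labels. The decoder
# stands in for a trained model; here it is randomly initialized.
import torch
import torch.nn as nn

n_glyphs, code_dim = 26, 32
decoder = nn.Sequential(nn.Linear(code_dim + n_glyphs, 128), nn.ReLU(),
                        nn.Linear(128, 32 * 32))  # hypothetical glyph decoder

target_glyph = torch.rand(1, 32 * 32)  # hypothetical rasterized 'A'
label_a = nn.functional.one_hot(torch.tensor([0]), n_glyphs).float()

code = torch.zeros(1, code_dim, requires_grad=True)
opt = torch.optim.Adam([code], lr=0.01)
for _ in range(100):  # iterative refinement of the font representation code
    opt.zero_grad()
    recon = decoder(torch.cat([code, label_a], dim=1))
    loss = nn.functional.mse_loss(recon, target_glyph)
    loss.backward()
    opt.step()

# Propagate the fitted appearance to every other glyph label.
labels = nn.functional.one_hot(torch.arange(n_glyphs), n_glyphs).float()
glyphs = decoder(torch.cat([code.detach().expand(n_glyphs, -1), labels], dim=1))
```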
-
Publication Number: US20220122357A1
Publication Date: 2022-04-21
Application Number: US17563901
Filing Date: 2021-12-28
Applicant: Adobe Inc.
Inventor: Wentian Zhao , Seokhwan Kim , Ning Xu , Hailin Jin
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating a response to a question received from a user during display or playback of a video segment by utilizing a query-response-neural network. The disclosed systems can extract a query vector from a question corresponding to the video segment using the query-response-neural network. The disclosed systems further generate context vectors representing both visual cues and transcript cues corresponding to the video segment using context encoders or other layers from the query-response-neural network. By utilizing additional layers from the query-response-neural network, the disclosed systems generate (i) a query-context vector based on the query vector and the context vectors, and (ii) candidate-response vectors representing candidate responses to the question from a domain-knowledge base or other source. To respond to a user's question, the disclosed systems further select a response from the candidate responses based on a comparison of the query-context vector and the candidate-response vectors.
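A minimal sketch of the scoring flow, with toy stand-in encoders (the abstract does not disclose the real layers): build a query vector, fuse it with visual and transcript context vectors into a query-context vector, then select the best-scoring candidate-response vector. All dimensions and inputs are illustrative assumptions:

```python
# Minimal sketch: query vector + visual/transcript context vectors ->
# query-context vector -> dot-product scoring against candidate responses.
# Encoders and feature sizes are illustrative assumptions.
import torch
import torch.nn as nn

d = 64
query_enc = nn.Linear(300, d)    # e.g. from pooled word embeddings (assumption)
visual_enc = nn.Linear(512, d)   # e.g. from pooled frame features (assumption)
transcript_enc = nn.Linear(300, d)
fuse = nn.Linear(3 * d, d)

question = torch.randn(1, 300)
visual_ctx, transcript_ctx = torch.randn(1, 512), torch.randn(1, 300)

q = query_enc(question)
ctx_v, ctx_t = visual_enc(visual_ctx), transcript_enc(transcript_ctx)
query_context = fuse(torch.cat([q, ctx_v, ctx_t], dim=1))

candidates = torch.randn(10, d)  # candidate-response vectors, e.g. from a knowledge base
scores = candidates @ query_context.squeeze(0)
best = scores.argmax().item()    # index of the selected response
```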
-
Publication Number: US20210409836A1
Publication Date: 2021-12-30
Application Number: US17470441
Filing Date: 2021-09-09
Applicant: Adobe Inc.
Inventor: Bryan Russell , Ruppesh Nalwaya , Markus Woodson , Joon-Young Lee , Hailin Jin
IPC: H04N21/81 , H04N21/845 , G06N3/08 , G06K9/00
Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for automatic tagging of videos. In particular, in one or more embodiments, the disclosed systems generate a set of tagged feature vectors (e.g., tagged feature vectors based on action-rich digital videos) to utilize to generate tags for an input digital video. For instance, the disclosed systems can extract a set of frames for the input digital video and generate feature vectors from the set of frames. In some embodiments, the disclosed systems generate aggregated feature vectors from the feature vectors. Furthermore, the disclosed systems can utilize the feature vectors (or aggregated feature vectors) to identify similar tagged feature vectors from the set of tagged feature vectors. Additionally, the disclosed systems can generate a set of tags for the input digital video by aggregating one or more tags corresponding to the identified similar tagged feature vectors.
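A minimal sketch of the retrieval-style tagging loop, assuming precomputed per-frame features and a toy tagged reference set: aggregate frame features into one video vector, find the nearest tagged vectors by cosine similarity, and pool their tags. The feature extractor, similarity measure, and tag store are illustrative assumptions:

```python
# Minimal sketch: mean-pool frame features, retrieve the most similar tagged
# feature vectors, aggregate their tags by frequency. Data is synthetic.
import torch
from collections import Counter

torch.manual_seed(0)
frame_features = torch.randn(30, 128)  # hypothetical per-frame features
video_vec = frame_features.mean(dim=0)  # aggregated feature vector

tagged_vecs = torch.randn(1000, 128)    # reference set with known tags
tag_lists = ([["cooking"], ["sports", "running"], ["music"]] * 334)[:1000]  # toy tags

sims = torch.nn.functional.cosine_similarity(tagged_vecs, video_vec.unsqueeze(0))
topk = sims.topk(5).indices.tolist()    # most similar tagged videos

counts = Counter(tag for i in topk for tag in tag_lists[i])
tags = [t for t, _ in counts.most_common(3)]  # aggregated tags for the input video
print(tags)
```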