-
Publication No.: US11790045B2
Publication Date: 2023-10-17
Application No.: US17240246
Filing Date: 2021-04-26
Applicant: ADOBE INC.
Inventor: Shipali Shetty , Zhe Lin , Alexander Smith
CPC classification number: G06F18/22 , G06T7/0002 , G06V10/75 , G06T2207/20132 , G06T2210/12
Abstract: Systems and methods for image tagging are described. In some embodiments, images with problematic tags are identified after applying an auto-tagger. The images with problematic tags are then sent to an object detection network. In some cases, the object detection network is trained using a training set selected to improve detection of objects associated with the problematic tags. The output of the object detection network can be merged with the output of the auto-tagger to provide a combined image tagging output. In some cases, the output of the object detection network also includes a bounding box, which can be used to crop the image around a relevant object so that the auto-tagger can be reapplied to a portion of the image.
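The flow described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `auto_tagger`, `detector`, the `problematic_tags` set, and the keep-the-higher-confidence merge rule are all hypothetical stand-ins, and the bounding-box cropping step is omitted.

```python
from typing import Any, Callable, Dict, Set

Tags = Dict[str, float]  # tag name -> confidence score (assumed representation)


def tag_image(
    image: Any,
    auto_tagger: Callable[[Any], Tags],
    detector: Callable[[Any], Tags],
    problematic_tags: Set[str],
) -> Tags:
    """Run the auto-tagger, then consult the detector only for problematic images."""
    tags = dict(auto_tagger(image))

    # An image is treated as "problematic" here if any of its tags is in the
    # problematic set (hypothetical criterion, not taken from the source).
    if problematic_tags.isdisjoint(tags):
        return tags

    # Merge rule (assumption): keep the higher confidence for each tag.
    for tag, score in detector(image).items():
        tags[tag] = max(score, tags.get(tag, 0.0))
    return tags
```

With stub models, an image tagged `cat` by the auto-tagger would be routed to the detector when `cat` is in `problematic_tags`, and the two outputs would be combined into one tag dictionary.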
-
Publication No.: US20230316591A1
Publication Date: 2023-10-05
Application No.: US17709895
Filing Date: 2022-03-31
Applicant: Adobe Inc.
Inventor: Zhixin Shu , Zhe Lin , Yuchen Liu , Yijun Li , Richard Zhang
IPC: G06T11/00 , G06V10/40 , G06V10/774
CPC classification number: G06T11/00 , G06V10/40 , G06V10/7747
Abstract: Techniques for identity preserved controllable facial image manipulation are described that support generation of a manipulated digital image based on a facial image and a render image. For instance, a facial image depicting a facial representation of an individual is received as input. A feature space including an identity parameter and at least one other visual parameter is extracted from the facial image. An editing module edits one or more of the visual parameters and preserves the identity parameter. A renderer generates a render image depicting a morphable model reconstruction of the facial image based on the edit. The render image and facial image are encoded, and a generator of a neural network is implemented to generate a manipulated digital image based on the encoded facial image and the encoded render image.
-
Publication No.: US11775578B2
Publication Date: 2023-10-03
Application No.: US17398317
Filing Date: 2021-08-10
Applicant: Adobe Inc.
Inventor: Pranav Vineet Aggarwal , Zhe Lin , Baldo Antonio Faieta , Saeid Motiian
IPC: G06K9/62 , G06K9/72 , G06F16/535 , G06N20/00 , G06V30/262 , G06F18/40 , G06F18/214 , G06V30/19 , G06V10/82 , G06V10/94 , G06F3/0482
CPC classification number: G06F16/535 , G06F18/2148 , G06F18/40 , G06N20/00 , G06V10/82 , G06V10/945 , G06V30/1916 , G06V30/19147 , G06V30/19173 , G06V30/274 , G06F3/0482
Abstract: Text-to-visual machine learning embedding techniques are described that overcome the challenges of conventional techniques in a variety of ways. These techniques include use of query-based training data which may expand availability and types of training data usable to train a model. Generation of negative digital image samples is also described that may increase accuracy in training the model using machine learning. A loss function is also described that supports increased accuracy and computational efficiency by treating losses separately, e.g., between positive or negative sample embeddings and a text embedding.
-
Publication No.: US11741157B2
Publication Date: 2023-08-29
Application No.: US17544689
Filing Date: 2021-12-07
Applicant: Adobe Inc.
Inventor: Ajinkya Kale , Baldo Faieta , Benjamin Leviant , Fengbin Chen , Francois Guerin , Kate Sousa , Trung Bui , Venkat Barakam , Zhe Lin
IPC: G06F16/40 , G06F16/58 , G06F16/48 , G06F16/2457 , G06F16/43 , G06V20/00 , G06F18/23213
CPC classification number: G06F16/5866 , G06F16/24578 , G06F16/43 , G06F16/48 , G06F18/23213 , G06V20/35 , G06V2201/10
Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for determining multi-term contextual tags for digital content and propagating the multi-term contextual tags to additional digital content. For instance, the disclosed systems can utilize search query supervision to determine and associate multi-term contextual tags (e.g., tags that represent a specific concept based on the order of the terms in the tag) with digital content. Furthermore, the disclosed systems can propagate the multi-term contextual tags determined for the digital content to additional digital content based on similarities between the digital content and additional digital content (e.g., utilizing clustering techniques). Additionally, the disclosed systems can provide digital content as search results based on the associated multi-term contextual tags.
-
Publication No.: US11734339B2
Publication Date: 2023-08-22
Application No.: US17075450
Filing Date: 2020-10-20
Applicant: Adobe Inc.
Inventor: Ajinkya Kale , Zhe Lin , Pranav Aggarwal
IPC: G06F16/535 , G06F16/538 , G06F16/242 , G06F40/279 , G06N3/08 , G06N3/04 , G06F18/21
CPC classification number: G06F16/535 , G06F16/243 , G06F16/538 , G06F18/21 , G06F40/279 , G06N3/04 , G06N3/08
Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
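Retrieval by embedding distance, as described in the abstract above, might look like the following sketch. It assumes the query text has already been mapped into the shared embedding space by some encoder (not shown), uses cosine distance as one possible choice of metric, and assumes non-zero vectors; the function names and the dictionary index are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; both vectors are assumed to be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(
    query_embedding: Sequence[float],
    image_embeddings: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[str]:
    """Return the ids of the k images whose embeddings are closest to the query."""
    ranked = sorted(
        image_embeddings,
        key=lambda image_id: cosine_distance(query_embedding, image_embeddings[image_id]),
    )
    return ranked[:k]
```

Because the text embedding is cross-lingual, the same `retrieve` call would in principle serve queries in any language the encoder supports; only the encoder, not the distance computation, depends on the query language.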
-
Publication No.: US11710042B2
Publication Date: 2023-07-25
Application No.: US16782793
Filing Date: 2020-02-05
Applicant: Adobe Inc.
Inventor: Shikun Liu , Zhe Lin , Yilin Wang , Jianming Zhang , Federico Perazzi
Abstract: The present disclosure relates to shaping the architecture of a neural network. For example, the disclosed systems can provide a neural network shaping mechanism for at least one sampling layer of a neural network. The neural network shaping mechanism can include a learnable scaling factor between a sampling rate of the at least one sampling layer and an additional sampling function. The disclosed systems can learn the scaling factor based on a dataset while jointly learning the network weights of the neural network. Based on the learned scaling factor, the disclosed systems can shape the architecture of the neural network by modifying the sampling rate of the at least one sampling layer.
-
Publication No.: US11709885B2
Publication Date: 2023-07-25
Application No.: US17025041
Filing Date: 2020-09-18
Applicant: Adobe Inc.
Inventor: John Collomosse , Zhe Lin , Saeid Motiian , Hailin Jin , Baldo Faieta , Alex Filipkowski
IPC: G06T7/00 , G06F16/583 , G06F16/532 , G06N3/08 , G06F16/535 , G06V10/82 , G06V20/30
CPC classification number: G06F16/5854 , G06F16/532 , G06F16/535 , G06F16/5838 , G06N3/08 , G06V10/82 , G06V20/30
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately and flexibly identifying digital images with similar style to a query digital image using fine-grain style determination via weakly supervised style extraction neural networks. For example, the disclosed systems can extract a style embedding from a query digital image using a style extraction neural network such as a novel two-branch autoencoder architecture or a weakly supervised discriminative neural network. The disclosed systems can generate a combined style embedding by combining complementary style embeddings from different style extraction neural networks. Moreover, the disclosed systems can search a repository of digital images to identify digital images with similar style to the query digital image. The disclosed systems can also learn parameters for one or more style extraction neural networks through weakly supervised training without a specifically labeled style ontology for sample digital images.
-
Publication No.: US20230206525A1
Publication Date: 2023-06-29
Application No.: US18117155
Filing Date: 2023-03-03
Applicant: Adobe Inc.
Inventor: Midhun Harikumar , Pranav Aggarwal , Baldo Faieta , Ajinkya Kale , Zhe Lin
CPC classification number: G06T11/60 , G06T7/11 , G06T7/162 , G06T2207/20084 , G06T2207/20081
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, using a model, a learned image representation of a target image. The operations further include generating, using a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image based on the convolving of the learned image representation of the target image with the text embedding.
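One way to read "convolving the learned image representation with the text embedding" is to use the text embedding as a 1x1 convolution kernel over a per-pixel feature map, which reduces to a dot product at every spatial location. The sketch below takes that reading as an assumption (the abstract does not state the kernel size), and the `threshold` parameter and function name are hypothetical.

```python
from typing import List, Sequence

FeatureMap = List[List[Sequence[float]]]  # height x width x channels


def segment(
    feature_map: FeatureMap,
    text_embedding: Sequence[float],
    threshold: float = 0.0,
) -> List[List[bool]]:
    """Apply the text embedding as a 1x1 kernel and threshold the response map."""
    return [
        [
            # Dot product of the pixel's feature vector with the text embedding.
            sum(f * t for f, t in zip(pixel, text_embedding)) > threshold
            for pixel in row
        ]
        for row in feature_map
    ]
```

This only works because both inputs live in the same embedding space, as the abstract requires: the channel dimension of the feature map must match the length of the text embedding.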
-
Publication No.: US11681919B2
Publication Date: 2023-06-20
Application No.: US17331161
Filing Date: 2021-05-26
Applicant: Adobe Inc.
Inventor: Khoi Pham , Scott Cohen , Zhe Lin , Zhihong Ding , Walter Wei Tuh Chang
IPC: G06V10/00 , G06N3/08 , G06F18/2113 , G06F18/214 , G06F18/21 , G06V10/764 , G06V10/771 , G06V10/774 , G06V10/82
CPC classification number: G06N3/08 , G06F18/2113 , G06F18/2155 , G06F18/2163 , G06V10/764 , G06V10/765 , G06V10/771 , G06V10/7753 , G06V10/82
Abstract: The present disclosure relates to an object selection system that automatically detects and selects objects in a digital image utilizing a large-scale object detector. For instance, in response to receiving a request to automatically select a query object with an unknown object class in a digital image, the object selection system can utilize a large-scale object detector to detect potential objects in the image, filter out one or more potential objects, and label the remaining potential objects in the image to detect the query object. In some implementations, the large-scale object detector utilizes a region proposal model, a concept mask model, and an auto tagging model to automatically detect objects in the digital image.
-
Publication No.: US20230185844A1
Publication Date: 2023-06-15
Application No.: US18104848
Filing Date: 2023-02-02
Applicant: Adobe Inc.
Inventor: Pranav Vineet Aggarwal , Zhe Lin , Baldo Antonio Faieta , Saeid Antonio Motiian
IPC: G06F16/583 , G06N20/00 , G06N3/08 , G06F16/538 , G06F18/21 , G06F18/22 , G06V30/19 , G06V30/262 , G06V10/82 , G06V10/94
CPC classification number: G06F16/583 , G06N20/00 , G06N3/08 , G06F16/538 , G06F18/21 , G06F18/22 , G06V30/19147 , G06V30/1916 , G06V30/19173 , G06V30/274 , G06V10/82 , G06V10/945 , G10L15/22
Abstract: Visually guided machine-learning language model and embedding techniques are described that overcome the challenges of conventional techniques in a variety of ways. In one example, a model is trained to support a visually guided machine-learning embedding space that supports visual intuition as to “what” is represented by text. The visually guided language embedding space supported by the model, once trained, may then be used to support visual intuition as part of a variety of functionality. In one such example, the visually guided language embedding space as implemented by the model may be leveraged as part of a multi-modal differential search to support search of digital images and other digital content with real-time focus adaptation which overcomes the challenges of conventional techniques.