-
Publication No.: US20230252774A1
Publication date: 2023-08-10
Application No.: US17650437
Application date: 2022-02-09
Applicant: ADOBE INC.
Inventor: Jason Wen Yong Kuen , Dat Ba Huynh , Zhe Lin , Jiuxiang Gu
IPC: G06V10/778 , G06V10/82 , G06V10/86 , G06V10/22 , G06V10/77 , G06V10/764 , G06V10/776 , G06T7/11 , G06T7/70
CPC classification number: G06V10/7792 , G06T7/11 , G06T7/70 , G06V10/22 , G06V10/82 , G06V10/86 , G06V10/764 , G06V10/776 , G06V10/7715 , G06T2207/20021 , G06T2207/20081 , G06T2207/20084
Abstract: Systems and methods for image processing are described. Embodiments of the present disclosure receive a training image and a caption for the training image, wherein the caption includes text describing an object in the training image; generate a pseudo mask for the object using a teacher network based on the text describing the object; generate a mask for the object using a student network; and update parameters of the student network based on the mask and the pseudo mask.
-
Publication No.: US20230237088A1
Publication date: 2023-07-27
Application No.: US18191651
Application date: 2023-03-28
Applicant: Adobe Inc.
Inventor: Scott Cohen , Zhe Lin , Mingyang Ling
IPC: G06F16/535 , G06V10/20 , G06F18/24 , G06F18/2113 , G06V10/764 , G06V10/82 , G06V20/70 , G06V20/10
CPC classification number: G06F16/535 , G06V10/255 , G06F18/24 , G06F18/2113 , G06V10/764 , G06V10/82 , G06V20/70 , G06V20/10
Abstract: The present disclosure relates to an object selection system that accurately detects and optionally automatically selects user-requested objects (e.g., query objects) in digital images. For example, the object selection system builds and utilizes an object selection pipeline to determine which object detection neural network to utilize to detect a query object based on analyzing the object class of a query object. In particular, the object selection system can identify both known object classes as well as objects corresponding to unknown object classes.
-
Publication No.: US11676282B2
Publication date: 2023-06-13
Application No.: US17479646
Application date: 2021-09-20
Applicant: ADOBE INC.
Inventor: Jianming Zhang , Zhe Lin
CPC classification number: G06T7/11 , G06N3/045 , G06T2207/20081 , G06T2207/20084
Abstract: Enhanced methods and systems for the semantic segmentation of images are described. A refined segmentation mask for a specified object visually depicted in a source image is generated based on a coarse and/or raw segmentation mask. The refined segmentation mask is generated via a refinement process applied to the coarse segmentation mask. The refinement process corrects at least a portion of both the type I and type II errors associated with the coarse segmentation mask and refines the boundaries of the specified object. Thus, the refined segmentation mask provides a more accurate segmentation of the object than the coarse segmentation mask. A segmentation refinement model is employed to generate the refined segmentation mask based on the coarse segmentation mask. That is, the segmentation refinement model is employed to refine the coarse segmentation mask to generate more accurate segmentations of the object. The refinement process is an iterative refinement process carried out via a trained neural network.
-
Publication No.: US11663762B2
Publication date: 2023-05-30
Application No.: US17083899
Application date: 2020-10-29
Applicant: Adobe Inc.
Inventor: Jianming Zhang , Zhe Lin , Radomir Mech , Xiaohui Shen
CPC classification number: G06T11/60 , G06T3/0012 , G06T7/12 , G06T2207/20132 , G06T2210/22
Abstract: Embodiments of the present invention are directed to facilitating region of interest preservation. In accordance with some embodiments of the present invention, a region of interest preservation score using adaptive margins is determined. The region of interest preservation score indicates an extent to which at least one region of interest is preserved in a candidate image crop associated with an image. A region of interest positioning score is determined that indicates an extent to which a position of the at least one region of interest is preserved in the candidate image crop associated with the image. The region of interest preservation score and/or the region of interest positioning score are used to select a set of one or more candidate image crops as image crop suggestions.
-
Publication No.: US20230122623A1
Publication date: 2023-04-20
Application No.: US17503671
Application date: 2021-10-18
Applicant: Adobe Inc.
Inventor: He Zhang , Jeya Maria Jose Valanarasu , Jianming Zhang , Jose Ignacio Echevarria Vallespi , Kalyan Sunkavalli , Yilin Wang , Yinglan Ma , Zhe Lin , Zijun Wei
IPC: G06T11/60 , G06F3/0484 , G06K9/46 , G06K9/62 , G06N3/08
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately and flexibly generating harmonized digital images utilizing an object-to-object harmonization neural network. For example, the disclosed systems implement, and learn parameters for, an object-to-object harmonization neural network to combine a style code from a reference object with features extracted from a target object. Indeed, the disclosed systems extract a style code from a reference object utilizing a style encoder neural network. In addition, the disclosed systems generate a harmonized target object by applying the style code of the reference object to a target object utilizing an object-to-object harmonization neural network.
-
Publication No.: US11605019B2
Publication date: 2023-03-14
Application No.: US16426298
Application date: 2019-05-30
Applicant: Adobe Inc.
Inventor: Pranav Vineet Aggarwal , Zhe Lin , Baldo Antonio Faieta , Saeid Motiian
Abstract: Visually guided machine-learning language model and embedding techniques are described that overcome the challenges of conventional techniques in a variety of ways. In one example, a model is trained to support a visually guided machine-learning embedding space that supports visual intuition as to “what” is represented by text. The visually guided language embedding space supported by the model, once trained, may then be used to support visual intuition as part of a variety of functionality. In one such example, the visually guided language embedding space as implemented by the model may be leveraged as part of a multi-modal differential search to support search of digital images and other digital content with real-time focus adaptation which overcomes the challenges of conventional techniques.
-
Publication No.: US11574392B2
Publication date: 2023-02-07
Application No.: US16803332
Application date: 2020-02-27
Applicant: Adobe Inc.
Inventor: Zhe Lin , Vipul Dalal , Vera Lychagina , Shabnam Ghadar , Saeid Motiian , Rohith mohan Dodle , Prethebha Chandrasegaran , Mina Doroudi , Midhun Harikumar , Kannan Iyer , Jayant Kumar , Gaurav Kukal , Daniel Miranda , Charles R McKinney , Archit Kalra
Abstract: The present disclosure relates to an image merging system that automatically and seamlessly detects and merges missing people for a set of digital images into a composite group photo. For instance, the image merging system utilizes a number of models and operations to automatically analyze multiple digital images to identify a missing person from a base image, segment the missing person from a second image, and generate a composite group photo by merging the segmented image of the missing person into the base image. In this manner, the image merging system automatically creates merged group photos that appear natural and realistic.
-
Publication No.: US20220309762A1
Publication date: 2022-09-29
Application No.: US17805289
Application date: 2022-06-03
Applicant: Adobe Inc.
Inventor: Handong Zhao , Zhe Lin , Sheng Li , Mingyang Ling , Jiuxiang Gu
IPC: G06V10/26 , G06N3/08 , G06K9/62 , G06N3/04 , G06V10/426
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for generating semantic scene graphs for digital images using an external knowledgebase for feature refinement. For example, the disclosed system can determine object proposals and subgraph proposals for a digital image to indicate candidate relationships between objects in the digital image. The disclosed system can then extract relationships from an external knowledgebase for refining features of the object proposals and the subgraph proposals. Additionally, the disclosed system can generate a semantic scene graph for the digital image based on the refined features of the object/subgraph proposals. Furthermore, the disclosed system can update/train a semantic scene graph generation network based on the generated semantic scene graph. The disclosed system can also reconstruct the image using object labels based on the refined features to further update/train the semantic scene graph generation network.
-
Publication No.: US11449079B2
Publication date: 2022-09-20
Application No.: US16262448
Application date: 2019-01-30
Applicant: Adobe Inc.
Inventor: Zhe Lin , Xin Ye , Joon-Young Lee , Jianming Zhang
Abstract: Systems and techniques are described that provide for generalizable approach policy learning and implementation for robotic object approaching. Described techniques provide fast and accurate approaching of a specified object, or type of object, in many different environments. The described techniques enable a robot to receive an identification of an object or type of object from a user, and then navigate to the desired object, without further control from the user. Moreover, the approach of the robot to the desired object is performed efficiently, e.g., with a minimum number of movements. Further, the approach techniques may be used even when the robot is placed in a new environment, such as when the same type of object must be approached in multiple settings.
-
Publication No.: US20220295149A1
Publication date: 2022-09-15
Application No.: US17200691
Application date: 2021-03-12
Applicant: Adobe Inc.
Inventor: Handong Zhao , Zhankui He , Zhe Lin , Zhaowen Wang , Ajinkya Gorakhnath Kale
IPC: H04N21/466 , H04N21/4722 , H04N21/45 , G06N3/08
Abstract: A multimodal recommendation identification system analyzes data describing a sequence of past content item interactions to generate a recommendation for a content item for a user. An indication of the recommended content item is provided to a website hosting system or recommendation system so that the recommended content item is displayed or otherwise presented to the user. The multimodal recommendation identification system identifies a content item to recommend to the user by generating an encoding that encodes identifiers of the sequence of content items the user has interacted with and generating encodings that encode multimodal information for content items in the sequence of content items the user has interacted with. An aggregated information encoding for the user is generated based on these encodings: the system analyzes the content item sequence encoding and the interaction between the content item sequence encoding and the multiple modality encodings.