-
Publication No.: US11914641B2
Publication date: 2024-02-27
Application No.: US17186625
Application date: 2021-02-26
Applicant: Adobe Inc.
Inventor: Pranav Aggarwal , Ajinkya Kale , Baldo Faieta , Saeid Motiian , Venkata Naveen Kumar Yadav Marri
IPC: G06F16/583 , G06F40/279 , G06N3/08 , G06F16/51 , G06F16/538 , G06F16/532 , G06V10/56
CPC classification number: G06F16/5838 , G06F16/51 , G06F16/532 , G06F16/538 , G06F40/279 , G06N3/08 , G06V10/56
Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a color embedding network trained using machine learning techniques to generate embedded color representations for color terms included in a text search query. For example, techniques described herein are used to represent color text in a same space as color embeddings (e.g., an embedding space created by determining a histogram of LAB based colors in a three-dimensional (3D) space). Further, techniques are described for indexing color palettes for all the searchable images in the search space. Accordingly, color terms in a text query are directly converted into a color palette and an image search system can return one or more search images with corresponding color palettes that are relevant to (e.g., within a threshold distance from) the color palette of the text query.
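The retrieval step in this abstract (convert a color term to a palette, then return images whose indexed palettes fall within a threshold distance) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the 4-bin histograms, the `TERM_TO_PALETTE` lookup standing in for the trained color embedding network, the `IMAGE_INDEX` contents, and the choice of Euclidean distance are hypothetical and are not taken from the patent.

```python
import math

# Hypothetical 4-bin color histograms (each sums to 1). A real system would
# use a histogram over binned LAB colors and a trained text-to-color network;
# this lookup table only stands in for that network.
TERM_TO_PALETTE = {
    "red": [0.9, 0.1, 0.0, 0.0],
    "blue": [0.0, 0.0, 0.1, 0.9],
}

# Hypothetical pre-computed palette index for the searchable images.
IMAGE_INDEX = {
    "sunset.jpg": [0.8, 0.2, 0.0, 0.0],
    "ocean.jpg": [0.0, 0.1, 0.1, 0.8],
    "forest.jpg": [0.1, 0.7, 0.2, 0.0],
}


def l2(a, b):
    """Euclidean distance between two palettes (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_by_color_term(term, threshold=0.5):
    """Return (image, distance) pairs within `threshold`, nearest first."""
    query = TERM_TO_PALETTE[term]
    hits = [(name, l2(query, palette)) for name, palette in IMAGE_INDEX.items()]
    return sorted((h for h in hits if h[1] <= threshold), key=lambda h: h[1])


if __name__ == "__main__":
    for name, dist in search_by_color_term("red"):
        print(name, round(dist, 3))
```

Because the palettes are indexed ahead of time, the query-time work in this sketch reduces to one text-to-palette conversion and a distance comparison per indexed image.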
-
Publication No.: US20230419551A1
Publication date: 2023-12-28
Application No.: US17808261
Application date: 2022-06-22
Applicant: Adobe Inc.
Inventor: Midhun Harikumar , Pranav Aggarwal , Ajinkya Gorakhnath Kale
Abstract: Techniques for generating a novel image using tokenized image representations are disclosed. In some embodiments, a method of generating the novel image includes generating, via a first machine learning model, a first sequence of coded representations of a first image having one or more features; generating, via a second machine learning model, a second sequence of coded representations of a sketch image having one or more edge features associated with the one or more features; predicting, via a third machine learning model, one or more subsequent coded representations based on the first sequence of coded representations and the second sequence of coded representations; and based on the subsequent coded representations, generating, via the third machine learning model, a first portion of a reconstructed image having one or more image attributes of the first image, and a second portion of the reconstructed image associated with the one or more edge features.
-
Publication No.: US20220138439A1
Publication date: 2022-05-05
Application No.: US17088847
Application date: 2020-11-04
Applicant: Adobe Inc.
Inventor: Ritiz Tambi , Pranav Aggarwal , Ajinkya Kale
IPC: G06F40/58 , G06F40/117
Abstract: Introduced here is an approach to translating tags assigned to digital images. As an example, embeddings may be extracted from a tag to be translated and the digital image with which the tag is associated by a multimodal model. These embeddings can be compared to embeddings extracted from a set of target tags associated with a target language by the multimodal model. Such an approach allows similarity to be established along two dimensions, which ensures the obstacles associated with direct translation can be avoided.
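One way to picture the "two dimensions" of similarity in this abstract is to score each target-language tag against both the source tag embedding and the image embedding. The sketch below is a hedged illustration only: the three-dimensional vectors, the `translate_tag` helper, and the equal-weight sum of the two cosine similarities are assumptions, not the method claimed in the application.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def translate_tag(tag_emb, image_emb, target_tags):
    """Pick the target tag most similar to both the source tag and the image.

    The equal-weight sum of the two similarities is an assumed scoring rule.
    """
    def score(emb):
        return cosine(tag_emb, emb) + cosine(image_emb, emb)

    return max(target_tags, key=lambda tag: score(target_tags[tag]))


# Hypothetical embeddings: axis 0 ~ "bird", axis 1 ~ "country".
tag_turkey = [1.0, 1.0, 0.0]        # the English tag alone is ambiguous
image_of_bird = [1.0, 0.0, 0.2]     # the tagged photo shows a bird
french_tags = {
    "dinde": [1.0, 0.0, 0.0],       # the bird
    "Turquie": [0.0, 1.0, 0.0],     # the country
}

if __name__ == "__main__":
    print(translate_tag(tag_turkey, image_of_bird, french_tags))
```

In this toy case the tag embedding scores both French candidates equally, and the image embedding breaks the tie, which is the kind of ambiguity a direct text-only translation cannot resolve.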
-
Publication No.: US20220121702A1
Publication date: 2022-04-21
Application No.: US17075450
Application date: 2020-10-20
Applicant: Adobe Inc.
Inventor: Ajinkya Kale , Zhe Lin , Pranav Aggarwal
IPC: G06F16/535 , G06K9/62 , G06F40/279 , G06F16/242 , G06F16/538 , G06N3/04 , G06N3/08
Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
-
Publication No.: US12008698B2
Publication date: 2024-06-11
Application No.: US18117155
Application date: 2023-03-03
Applicant: Adobe Inc.
Inventor: Midhun Harikumar , Pranav Aggarwal , Baldo Faieta , Ajinkya Kale , Zhe Lin
CPC classification number: G06T11/60 , G06T7/11 , G06T7/162 , G06T2207/20081 , G06T2207/20084
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, using a model, a learned image representation of a target image. The operations further include generating, using a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image based on the convolving of the learned image representation of the target image with the text embedding.
-
Publication No.: US11687714B2
Publication date: 2023-06-27
Application No.: US16998730
Application date: 2020-08-20
Applicant: Adobe Inc. , Pranav Aggarwal , Di Pu , Daniel ReMine , Ajinkya Kale
Inventor: Pranav Aggarwal , Di Pu , Daniel ReMine , Ajinkya Kale
IPC: G06F40/279
CPC classification number: G06F40/279
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
-
Publication No.: US20220156992A1
Publication date: 2022-05-19
Application No.: US16952008
Application date: 2020-11-18
Applicant: Adobe Inc.
Inventor: Midhun Harikumar , Pranav Aggarwal , Baldo Faieta , Ajinkya Kale , Zhe Lin
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
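The class-activation-map step in this abstract can be sketched as follows, under the assumption that "convolving" the image representation with the text embedding behaves like a 1x1 convolution, i.e. a dot product between the text embedding and the feature vector at every spatial location. The 2x2 feature map, the two-channel embeddings, the fixed threshold, and the function names are all hypothetical.

```python
def class_activation_map(features, text_emb):
    """Dot the text embedding with the feature vector at each location.

    `features` is an H x W grid of C-dimensional vectors and `text_emb` is a
    C-dimensional vector in the same embedding space. Treating the
    convolution as 1x1 is an assumption made for this sketch.
    """
    return [
        [sum(f * t for f, t in zip(cell, text_emb)) for cell in row]
        for row in features
    ]


def segment(cam, threshold=0.5):
    """Binarize the activation map into an object mask (assumed threshold)."""
    return [[1 if value >= threshold else 0 for value in row] for row in cam]


# Hypothetical 2 x 2 feature map with 2 channels per location.
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.2, 0.8]],
]
text_emb = [1.0, 0.0]  # hypothetical embedding of the text query

if __name__ == "__main__":
    cam = class_activation_map(features, text_emb)
    for row in segment(cam):
        print(row)
```

Locations whose features align with the query embedding receive high activations, so thresholding the map yields a coarse mask for the queried object.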
-
Publication No.: US20230315988A1
Publication date: 2023-10-05
Application No.: US18315391
Application date: 2023-05-10
Applicant: Adobe Inc.
Inventor: Pranav Aggarwal , Di Pu , Daniel ReMine , Ajinkya Kale
IPC: G06F40/279
CPC classification number: G06F40/279
Abstract: Disclosed are computer-implemented methods and systems for generating text descriptive of digital images, comprising using a machine learning model to pre-process an image to generate initial text descriptive of the image; adjusting one or more inferences of the machine learning model, the inferences biasing the machine learning model away from associating negative words with the image; using the machine learning model comprising the adjusted inferences to post-process the image to generate updated text descriptive of the image; and processing the generated updated text descriptive of the image outputted by the machine learning model to fine-tune the updated text descriptive of the image.
-
Publication No.: US11734339B2
Publication date: 2023-08-22
Application No.: US17075450
Application date: 2020-10-20
Applicant: Adobe Inc.
Inventor: Ajinkya Kale , Zhe Lin , Pranav Aggarwal
IPC: G06F16/535 , G06F16/538 , G06F16/242 , G06F40/279 , G06N3/08 , G06N3/04 , G06F18/21
CPC classification number: G06F16/535 , G06F16/243 , G06F16/538 , G06F18/21 , G06F40/279 , G06N3/04 , G06N3/08
Abstract: The present disclosure relates to methods, systems, and non-transitory computer-readable media for retrieving digital images in response to queries. For example, in one or more embodiments, the disclosed systems receive a query comprising text and generate a cross-lingual-multimodal embedding for the text within a multimodal embedding space. The disclosed systems further identify an image embedding for a digital image that corresponds to (e.g., is relevant to) the text from the query based on an embedding distance between the image embedding and the cross-lingual-multimodal embedding for the text within the multimodal embedding space. Accordingly, the disclosed systems retrieve the digital image associated with the image embedding for display on a client device, such as the client device that submitted the query.
-
Publication No.: US20230206525A1
Publication date: 2023-06-29
Application No.: US18117155
Application date: 2023-03-03
Applicant: Adobe Inc.
Inventor: Midhun Harikumar , Pranav Aggarwal , Baldo Faieta , Ajinkya Kale , Zhe Lin
CPC classification number: G06T11/60 , G06T7/11 , G06T7/162 , G06T2207/20084 , G06T2207/20081
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, using a model, a learned image representation of a target image. The operations further include generating, using a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image based on the convolving of the learned image representation of the target image with the text embedding.