-
Publication number: US20210073617A1
Publication date: 2021-03-11
Application number: US16567277
Filing date: 2019-09-11
Applicant: Amazon Technologies, Inc.
Inventor: Loris Bazzani, Maksim Lapin, Felix Hieber, Tobias Domhan
Abstract: Techniques are generally described for automatic scoring of alt-text for image data. In various examples, first image data and first text data describing the first image data may be received. A feature representation of the first image data may be determined using an encoder machine learning model. A hidden state representation may be determined using a decoder machine learning model based on the feature representation and a first word of the first text data. In some examples, a first score may be determined using the hidden state representation. The first score may include an indication of a descriptive capability of the first text data with respect to the first image data.
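The abstract does not state how the score is computed from the hidden state. One plausible reading, offered here only as a minimal sketch, is that each hidden state is projected to vocabulary logits and the alt-text is scored by the average log-probability of its words. The names `score_alt_text`, `step_logits`, and `word_ids` are hypothetical, and the logits stand in for the output of an encoder-decoder model that is not shown.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def score_alt_text(step_logits, word_ids):
    """Average log-probability of the alt-text words (assumed scoring rule).

    step_logits[t] holds hypothetical vocabulary logits derived from the
    decoder hidden state at step t; word_ids[t] is the vocabulary index of
    the word that actually appears in the alt-text at that step.
    """
    if not word_ids or len(step_logits) != len(word_ids):
        raise ValueError("need one logit vector per alt-text word")
    total = sum(log_softmax(logits)[w] for logits, w in zip(step_logits, word_ids))
    return total / len(word_ids)
```

Under this assumption, a higher (less negative) score means the model finds the text more likely given the image, which is one way to express its descriptive capability.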
-
Publication number: US11720942B1
Publication date: 2023-08-08
Application number: US16915361
Filing date: 2020-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Loris Bazzani, Yanbei Chen
IPC: G06Q30/06, G06N3/02, G06F16/535, G06Q30/0601, G06T7/00
CPC classification number: G06Q30/0613, G06F16/535, G06N3/02, G06T7/00
Abstract: Techniques are generally described for interactive image retrieval using visual semantic matching. Image data and text data are encoded into a single shared visual semantic embedding space. A prediction model is trained using reference inputs, target outputs, and modification text describing changes to the reference inputs to obtain the target outputs. The prediction model can be used to perform image-to-text, text-to-image, and interactive retrieval.
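As a rough illustration of retrieval in a shared embedding space, the sketch below composes a reference embedding with a modification-text embedding and ranks a gallery by cosine similarity. The additive composition in `compose` is an assumption made for illustration, not the trained prediction model the abstract refers to, and all names and vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def compose(reference, modification):
    """Assumed composition: add the modification embedding to the reference."""
    return [r + m for r, m in zip(reference, modification)]


def retrieve(query, gallery, k=1):
    """Return the k gallery keys whose embeddings are closest to the query."""
    ranked = sorted(gallery, key=lambda name: cosine(query, gallery[name]), reverse=True)
    return ranked[:k]
```

Because images and text share one space in this setup, the same `retrieve` call covers image-to-text, text-to-image, and interactive queries; only the source of the query vector changes.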
-
Publication number: US11829445B1
Publication date: 2023-11-28
Application number: US17361905
Filing date: 2021-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Loris Bazzani, Michael Donoser, Yuxin Hou, Eleonora Vig
IPC: G06F18/22, G06F16/532, G06F3/0482, G06N3/08, G06N3/04
CPC classification number: G06F18/22, G06F16/532, G06F3/0482, G06N3/04, G06N3/08
Abstract: Systems and techniques are generally described for attribute-based content selection and search. In some examples, a graphical user interface (GUI) may display an image of a first product comprising a plurality of visual attributes. In some further examples, the GUI may display at least a first control button with data identifying a first visual attribute of the plurality of visual attributes. In some cases, a first selection of the first control button may be received. In some examples, a first plurality of products may be determined based at least in part on the first selection of the first control button. The first plurality of products may be determined based on a visual similarity to the first product, and a visual dissimilarity to the first product with respect to the first visual attribute. In some examples, the first plurality of products may be displayed on the GUI.
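The ranking criterion in the abstract (similar overall, dissimilar on the selected attribute) can be sketched as follows, assuming each product has one embedding per visual attribute. The per-attribute layout, the squared-distance measure, and the name `rank_by_attribute_change` are illustrative assumptions and are not taken from the patent.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_attribute_change(query, candidates, attribute):
    """Rank candidates that differ on `attribute` but match the query elsewhere.

    query maps attribute name -> embedding; candidates maps product name ->
    a dict with the same attribute keys (hypothetical data layout).
    """
    def score(item):
        # Distance on the selected attribute is rewarded ...
        changed = sq_dist(query[attribute], item[attribute])
        # ... while distance on every other attribute is penalized.
        same = sum(sq_dist(query[a], item[a]) for a in query if a != attribute)
        return changed - same

    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)
```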
-
Publication number: US11809520B1
Publication date: 2023-11-07
Application number: US17216234
Filing date: 2021-03-29
Applicant: Amazon Technologies, Inc.
Inventor: Antonio D'Innocente, Nikhil Garg, Loris Bazzani
IPC: G06F18/214, G06T7/73, G06T11/00, G06F16/9538, G06F16/2457, G06F16/56, G06F16/54, G06N3/08, G06F16/9535, G06V10/75, G06F18/213, G06F18/2113, G06Q30/0601
CPC classification number: G06F18/214, G06F16/24578, G06F16/54, G06F16/56, G06F16/9535, G06F16/9538, G06F18/213, G06F18/2113, G06N3/08, G06T7/74, G06T11/00, G06V10/751, G06Q30/0603, G06T2200/24, G06T2207/20084
Abstract: Devices and techniques are generally described for determining localized visual similarity. In some examples, a selection of a first location of interest on a first image data depicting at least one article of clothing may be received. In some examples, a first machine learning model may generate a feature map representing the first image data. In some examples, a reduced feature map may be generated based at least in part on a mapping of the first location of interest to the feature map. In some examples, a second image depicting at least a second article of clothing may be determined based at least in part on the reduced feature map.
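The mapping from a selected image location to the feature map might look like the sketch below. It assumes the feature map is a grid of feature vectors and treats the "reduced feature map" as the single vector at the cell covering the selected pixel; the patent may use a different reduction. The function names are hypothetical.

```python
def location_to_cell(x, y, image_size, map_size):
    """Map pixel coordinates (x, y) to a (row, col) cell of the feature map."""
    img_w, img_h = image_size
    map_w, map_h = map_size
    # Scale proportionally and clamp so the far image edge stays in range.
    col = min(int(x * map_w / img_w), map_w - 1)
    row = min(int(y * map_h / img_h), map_h - 1)
    return row, col


def reduce_feature_map(feature_map, x, y, image_size):
    """Return the feature vector at the cell covering the selected location.

    feature_map is a nested list indexed as feature_map[row][col] -> vector.
    """
    map_h = len(feature_map)
    map_w = len(feature_map[0])
    row, col = location_to_cell(x, y, image_size, (map_w, map_h))
    return feature_map[row][col]
```

The returned vector could then be compared against vectors from other images to find a second article that is similar at the location of interest.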
-
Publication number: US11416910B1
Publication date: 2022-08-16
Application number: US17034294
Filing date: 2020-09-28
Applicant: Amazon Technologies, Inc.
Inventor: Loris Bazzani, Filip Saina, Amaia Salvador Aguilera, Angel Noe Martinez Gonzalez, Eleonora Vig, Erhan Gundogdu, Michael Donoser
IPC: G06F16/26, G06F3/04842, G06Q30/06, G06F3/0483, G06N3/04, G06N3/08
Abstract: Systems and techniques are generally described for generating visually blended recommendation grids. In some examples, a selection of a first item and a second item displayed on a display may be received. In various examples, the first item may be displayed in a first element of a grid and the second item may be displayed in a second element of the grid. In some examples, a third element of the grid that is disposed between the first element and the second element along an axis of the grid may be determined. In various examples, a third item may be determined for display in the third element of the grid based at least in part on a blended representation of an embedding of the first item and an embedding of the second item. The third item may be displayed in the third element of the grid.
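A blended representation of two embeddings can be illustrated with linear interpolation, followed by a nearest-neighbor lookup for each grid element between the two selected items. Linear blending, the squared-distance lookup, and the names `blend`, `nearest_item`, and `fill_between` are assumptions made for this sketch; the abstract does not specify them.

```python
def blend(a, b, t):
    """Linear interpolation between embeddings a and b, with t in [0, 1]."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def nearest_item(target, catalog, exclude=()):
    """Name of the catalog item closest to target, skipping excluded names."""
    def sq_dist(name):
        return sum((x - y) ** 2 for x, y in zip(catalog[name], target))

    return min((name for name in catalog if name not in exclude), key=sq_dist)


def fill_between(first, second, catalog, slots):
    """Pick one item for each of `slots` grid elements between two selections."""
    results = []
    for i in range(1, slots + 1):
        t = i / (slots + 1)  # evenly spaced positions along the grid axis
        target = blend(catalog[first], catalog[second], t)
        # Exclude the two selections and items already placed in the grid.
        results.append(nearest_item(target, catalog, exclude={first, second, *results}))
    return results
```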
-
Publication number: US11361212B2
Publication date: 2022-06-14
Application number: US16567277
Filing date: 2019-09-11
Applicant: Amazon Technologies, Inc.
Inventor: Loris Bazzani, Maksim Lapin, Felix Hieber, Tobias Domhan
Abstract: Techniques are generally described for automatic scoring of alt-text for image data. In various examples, first image data and first text data describing the first image data may be received. A feature representation of the first image data may be determined using an encoder machine learning model. A hidden state representation may be determined using a decoder machine learning model based on the feature representation and a first word of the first text data. In some examples, a first score may be determined using the hidden state representation. The first score may include an indication of a descriptive capability of the first text data with respect to the first image data.