-
Publication number: US20240046332A1
Publication date: 2024-02-08
Application number: US18382230
Application date: 2023-10-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Julia X. Gong , Jyotkumar Patel , Yale Song , Xuetao Yin , Xiujia Guo , Rajiv S. Binwade , Houdong Hu
IPC: G06Q30/0601 , G06Q30/0282 , G06Q30/0204 , G06N3/04 , G06N3/08 , G06Q50/00 , G06V10/32 , G06F18/22
CPC classification number: G06Q30/0631 , G06Q30/0639 , G06Q30/0282 , G06Q30/0205 , G06N3/04 , G06N3/08 , G06Q50/01 , G06V10/32 , G06F18/22
Abstract: The present disclosure provides a method and apparatus for determining a food item from a photograph and a corresponding restaurant serving the food item. An image is received from a user, the image being associated with a consumable item. One or more ingredients of the consumable item in the image are identified, along with a location of the user, and one or more similar images are determined from a database using a neural network. A restaurant associated with each of the one or more similar images is determined, along with a similarity score indicating a similarity between the restaurant and the identified content of the image. The one or more restaurants and/or associated similar food items are ranked based on the similarity score, and a list of ranked restaurants is provided to the user.
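The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes that the query image and each database image have already been turned into embedding vectors by some neural network, and it uses cosine similarity as the similarity score. The function names, the toy vectors, and the choice of cosine similarity are assumptions for illustration, not details taken from the application. The user-location signal is omitted.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_restaurants(query_vec, candidates):
    """Rank restaurants by their best-matching image.

    candidates: list of (restaurant_name, image_embedding) pairs (hypothetical data).
    Returns a list of (restaurant_name, score), highest score first.
    """
    best = {}
    for name, vec in candidates:
        score = cosine(query_vec, vec)
        # Keep only the highest-scoring image per restaurant.
        if name not in best or score > best[name]:
            best[name] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy usage: two-dimensional vectors stand in for real image embeddings.
ranked = rank_restaurants(
    [1.0, 0.0],
    [("A", [1.0, 0.0]), ("B", [0.0, 1.0]), ("A", [0.0, 1.0]), ("C", [1.0, 1.0])],
)
```

Keeping the best score per restaurant is one possible design choice; averaging over a restaurant's images would be an equally plausible alternative.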
-
Publication number: US11120070B2
Publication date: 2021-09-14
Application number: US15985623
Application date: 2018-05-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li Huang , Meenaz Merchant , Houdong Hu , Arun Sacheti
IPC: G06F16/532 , G06F16/51 , G06F16/2457 , G06F16/248 , G06F16/56 , G06K9/62 , G06N3/04 , G06N3/08
Abstract: A visual search system detects one or more user-selected objects represented in an image. A first group of attributes for the user-selected objects is identified. A category for the user-selected objects is identified, and a second group of pre-defined attributes associated with the category is retrieved. The first and second groups of attributes are combined into an attributes set. The combined set of attributes is presented to the user. The user selects one or more attributes, and a search is performed to identify images similar to the user-selected attributes. The images are ranked and a subset is presented to the user.
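The attribute-combination and ranking flow in this abstract can be sketched as follows. The attribute names, the catalog structure, and the overlap-count ranking are all assumptions made for illustration. The patent does not specify how similarity to the selected attributes is scored.

```python
def combine_attributes(detected, category_defaults):
    """Merge two attribute groups, keeping first-seen order and dropping duplicates."""
    return list(dict.fromkeys([*detected, *category_defaults]))


def rank_images(selected, catalog):
    """Rank catalog images by how many of the selected attributes they carry.

    catalog: dict mapping image_id -> set of attributes (hypothetical structure).
    Images with no matching attribute are dropped; ties are broken by image_id.
    """
    wanted = set(selected)
    scored = [(len(wanted & attrs), image_id) for image_id, attrs in catalog.items()]
    matching = [(count, image_id) for count, image_id in scored if count > 0]
    return [image_id for count, image_id in sorted(matching, key=lambda t: (-t[0], t[1]))]


# Toy usage: attributes detected in the image plus pre-defined ones for the category.
attribute_set = combine_attributes(["red", "leather"], ["leather", "heel height"])
catalog = {"img1": {"red", "leather"}, "img2": {"red"}, "img3": {"blue"}}
results = rank_images(["red", "leather"], catalog)
```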
-
Publication number: US11093560B2
Publication date: 2021-08-17
Application number: US16138587
Application date: 2018-09-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kuang-Huei Lee , Gang Hua , Xi Chen , Houdong Hu , He Xiaodong
Abstract: The present concepts relate to matching data of two different modalities using two stages of attention. First data is encoded as a set of first vectors representing components of the first data, and second data is encoded as a set of second vectors representing components of the second data. In the first stage, the components of the first data are attended by comparing the first vectors and the second vectors to generate a set of attended vectors. In the second stage, the components of the second data are attended by comparing the second vectors and the attended vectors to generate a plurality of relevance scores. Then, the relevance scores are pooled to calculate a similarity score that indicates a degree of similarity between the first data and the second data.
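One way to read the two attention stages in this abstract is sketched below in plain Python. It assumes both modalities are already encoded into vectors of the same dimension. The cosine comparison, the softmax with a `sharpness` factor, and mean pooling are illustrative assumptions rather than choices stated in the patent.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def softmax(values, sharpness=1.0):
    """Numerically stable softmax; a larger sharpness gives more peaked weights."""
    top = max(values)
    exps = [math.exp((v - top) * sharpness) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def two_stage_similarity(first_vecs, second_vecs, sharpness=4.0):
    """Pooled similarity between two sets of component vectors.

    Stage 1: for each second vector, attend over the first vectors and build
             an attended vector as their weighted sum.
    Stage 2: compare each second vector with its attended vector to obtain a
             relevance score, then pool the scores (mean pooling assumed here).
    """
    relevance = []
    for s in second_vecs:
        weights = softmax([cosine(s, f) for f in first_vecs], sharpness)
        attended = [
            sum(w * f[d] for w, f in zip(weights, first_vecs)) for d in range(len(s))
        ]
        relevance.append(cosine(s, attended))
    return sum(relevance) / len(relevance)


# Toy usage: matching components score high, opposed components score low.
first = [[1.0, 0.0], [0.0, 1.0]]
matched = two_stage_similarity(first, [[1.0, 0.0], [0.0, 1.0]])
mismatched = two_stage_similarity(first, [[-1.0, 0.0], [0.0, -1.0]])
```

Other pooling functions could replace the mean without changing the two-stage structure.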
-
Publication number: US10891969B2
Publication date: 2021-01-12
Application number: US16165281
Application date: 2018-10-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li Huang , Houdong Hu , Congyong Su
IPC: G10L21/12 , G10L15/16 , G10L15/18 , G10L21/18 , G10L15/14 , G10L15/02 , G10L25/63 , G10L21/10 , G10L15/06 , G10L15/26 , G10L15/22
Abstract: A technique is described herein for transforming audio content into images. The technique may include: receiving the audio content from a source; converting the audio content into a temporal stream of audio features; and converting the stream of audio features into one or more images using one or more machine-trained models. The technique generates the image(s) based on recognition of: semantic information that conveys one or more semantic topics associated with the audio content; and sentiment information that conveys one or more sentiments associated with the audio content. The technique then generates an output presentation that includes the image(s), which it provides to one or more display devices for display thereat. The output presentation serves as a summary of salient semantic and sentiment-related characteristics of the audio content.
-
Publication number: US11182408B2
Publication date: 2021-11-23
Application number: US16417902
Application date: 2019-05-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kun Wu , Yiran Shen , Houdong Hu , Soudamini Sreepada , Arun Sacheti , Mithun Das Gupta , Rushabh Rajesh Gandhi , Sudhir Kumar
IPC: G06F16/00 , G06F16/28 , G06F16/22 , G06F16/583 , G06F16/587 , G06F16/532
Abstract: A computer-implemented technique is described herein for using a machine-trained model to identify individual objects within images. The technique then creates a relational index for the identified objects. That is, each index entry in the relational index is associated with a given object, and includes a set of attributes pertaining to the given object. One such attribute identifies at least one latent semantic vector associated with the given object. Each attribute provides a way of linking the given object to one or more other objects in the relational index. In one application of this technique, a user may submit a query that specifies a query object. The technique consults the relational index to find one or more objects that are related to the query object. In some cases, the query object and each of the other objects have a complementary relationship.
-
Publication number: US20190318405A1
Publication date: 2019-10-17
Application number: US15954152
Application date: 2018-04-16
Applicant: Microsoft Technology Licensing, LLC
Inventor: Houdong Hu , Li Huang
IPC: G06Q30/06 , G06N99/00 , G06N3/08 , G06F17/30 , G06F3/0484 , G06K9/66 , G06K9/62 , G06F3/0482
Abstract: Methods, systems, and computer programs are presented for identifying the brand and model of products embedded within an image. One method includes operations for receiving, via a graphical user interface (GUI), a selection of an image, and for analyzing the image to determine a location within the image of one or more products. For each product in the image, a unique identification of the product is determined, the unique identification including a manufacturer of the product and a model identifier. The method further includes an operation for presenting information about the one or more products in the GUI with a selection option for selecting each of the one or more products. Additionally, the method includes operations for receiving a product selection for one of the one or more products, and presenting shopping options in the GUI for purchasing the selected product.
-
Publication number: US20190258895A1
Publication date: 2019-08-22
Application number: US15900606
Application date: 2018-02-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Arun Sacheti , Xi Chen , Houdong Hu , Li Huang , Jiapei Huang , Meenaz Merchant
Abstract: Non-limiting examples of the present disclosure relate to object detection processing of image content that categorically classifies specific objects within image content. Exemplary object detection processing may be utilized to enhance visual search processing including content retrieval and curation, among other technical advantages. An exemplary object detection model is implemented to categorically classify an object. In doing so, an exemplary object detection model may classify objects based on: analysis of specific objects within image content, positioning of the objects within the image content, and intent associated with the image content, among other examples. The object detection model generates exemplary categorical classification(s) for specific data objects, which may be propagated to enhance processing efficiency and accuracy during visual search processing. Exemplary categorical classifications may comprise hierarchical classifications of a detected object that can be used to retrieve, curate and surface content that is most contextually relevant to a detected object.
-
Publication number: US12299029B2
Publication date: 2025-05-13
Application number: US15888960
Application date: 2018-02-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang , Houdong Hu , Li Huang , Arun K. Sacheti , Linjun Yang
IPC: G06F16/58 , G06F16/2457 , G06F16/51 , G06F16/56 , G06F16/583 , G06F16/587
Abstract: Systems and methods can be implemented to conduct a visual search as a service in a variety of applications. In various embodiments, a system is configured to provide searching capabilities of content provided by a first entity in response to a search request by a second entity. An image provided by the second entity can be used by the system as a query image to search the content of the first entity. In an embodiment, the first entity can be a commercial entity providing such a system with image related content regarding its products and services such that any number of individual consumers can search for particular products and services of the commercial entity via their communication enabled devices. In addition, such systems can be arranged for other embodiments to provide customized searches of a single source by many individual devices. Additional systems and methods are disclosed.
-
Publication number: US11669558B2
Publication date: 2023-06-06
Application number: US16368798
Application date: 2019-03-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang , Ye Wu , Houdong Hu , Surendra Ulabala , Vishal Thakkar , Arun Sacheti
IPC: G06N3/04 , G06N5/02 , G06N3/045 , G06F16/33 , G06F16/245 , G06F16/248 , G06V20/62 , G06F18/2413 , G06F17/16
CPC classification number: G06F16/3347 , G06F16/245 , G06F16/248 , G06F18/2413 , G06N3/04 , G06N3/045 , G06N5/02 , G06V20/62 , G06F17/16
Abstract: A computer-implemented technique generates a dense embedding vector that provides a distributed representation of input text. The technique includes: generating an input term-frequency (TF) vector of dimension g that includes frequency information relating to frequency of occurrence of terms in an instance of input text; using a TF-modifying component to modify the term-specific frequency information in the input TF vector by respective machine-trained weighting factors, to produce an intermediate vector of dimension g; and using a projection component to project the intermediate vector of dimension g into an embedding vector of dimension k, where k is less than g. Both the TF-modifying component and the projection component use respective machine-trained neural networks. An application performs any of a retrieval-based function, a recognition-based function, a recommendation-based function, a classification-based function, etc. based on the embedding vector.
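The three steps in this abstract (TF vector of dimension g, per-term weighting, projection to dimension k < g) can be sketched with fixed toy numbers. In the patent the weighting factors and the projection come from machine-trained neural networks. Here `term_weights` and `projection` are hand-picked placeholders, and the vocabulary and whitespace tokenization are likewise assumptions for illustration.

```python
from collections import Counter


def tf_vector(text, vocab):
    """Term-frequency vector of dimension g = len(vocab), using whitespace tokens."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]


def embed(text, vocab, term_weights, projection):
    """Map text to a dense vector of dimension k = len(projection).

    term_weights: one weighting factor per vocabulary term (length g).
    projection:   k rows of length g; each row produces one output dimension.
    Both stand in for trained parameters and are hypothetical here.
    """
    tf = tf_vector(text, vocab)
    # TF-modifying step: scale each term frequency by its weighting factor.
    intermediate = [w * f for w, f in zip(term_weights, tf)]
    # Projection step: reduce from dimension g to dimension k.
    return [sum(p * x for p, x in zip(row, intermediate)) for row in projection]


# Toy usage with g = 4 and k = 2.
vocab = ["red", "shoe", "blue", "hat"]
term_weights = [0.5, 2.0, 0.5, 2.0]
projection = [[1, 1, 0, 0], [0, 0, 1, 1]]
vector = embed("red red shoe", vocab, term_weights, projection)
```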
-
Publication number: US11372914B2
Publication date: 2022-06-28
Application number: US15936117
Application date: 2018-03-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yokesh Kumar , Kuang-Huei Lee , Houdong Hu , Li Huang , Arun Sacheti , Meenaz Merchant , Linjun Yang , Tianjun Xiao , Saurajit Mukherjee
IPC: G06F16/583 , G06F16/58 , G06F16/51 , G06F16/538 , G06N5/02 , G06F16/9535 , G06N20/00
Abstract: The description relates to diversified hybrid image annotation for annotating images. One implementation includes generating first image annotations for a query image using a retrieval-based image annotation technique. Second image annotations can be generated for the query image using a model-based image annotation technique. The first and second image annotations can be integrated to generate a diversified hybrid image annotation result for the query image.