-
Publication No.: US11120070B2
Publication Date: 2021-09-14
Application No.: US15985623
Application Date: 2018-05-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li Huang , Meenaz Merchant , Houdong Hu , Arun Sacheti
IPC: G06F16/532 , G06F16/51 , G06F16/2457 , G06F16/248 , G06F16/56 , G06K9/62 , G06N3/04 , G06N3/08
Abstract: A visual search system detects one or more user-selected objects represented in an image. A first group of attributes for the user-selected objects is identified. A category for the user-selected objects is identified, and a second group of pre-defined attributes associated with the category is retrieved. The first and second groups of attributes are combined into an attributes set. The combined set of attributes is presented to the user. The user selects one or more attributes and a search is performed to identify images similar to the user-selected attributes. The images are ranked and a subset is presented to the user.
-
Publication No.: US10891969B2
Publication Date: 2021-01-12
Application No.: US16165281
Application Date: 2018-10-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li Huang , Houdong Hu , Congyong Su
IPC: G10L21/12 , G10L15/16 , G10L15/18 , G10L21/18 , G10L15/14 , G10L15/02 , G10L25/63 , G10L21/10 , G10L15/06 , G10L15/26 , G10L15/22
Abstract: A technique is described herein for transforming audio content into images. The technique may include: receiving the audio content from a source; converting the audio content into a temporal stream of audio features; and converting the stream of audio features into one or more images using one or more machine-trained models. The technique generates the image(s) based on recognition of: semantic information that conveys one or more semantic topics associated with the audio content; and sentiment information that conveys one or more sentiments associated with the audio content. The technique then generates an output presentation that includes the image(s), which it provides to one or more display devices for display thereat. The output presentation serves as a summary of salient semantic and sentiment-related characteristics of the audio content.
-
Publication No.: US12299029B2
Publication Date: 2025-05-13
Application No.: US15888960
Application Date: 2018-02-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang , Houdong Hu , Li Huang , Arun K. Sacheti , Linjun Yang
IPC: G06F16/58 , G06F16/2457 , G06F16/51 , G06F16/56 , G06F16/583 , G06F16/587
Abstract: Systems and methods can be implemented to conduct a visual search as a service in a variety of applications. In various embodiments, a system is configured to provide searching capabilities of content provided by a first entity in response to a search request by a second entity. An image provided by the second entity can be used by the system as a query image to search the content of the first entity. In an embodiment, the first entity can be a commercial entity providing such a system with image related content regarding its products and services such that any number of individual consumers can search for particular products and services of the commercial entity via their communication enabled devices. In addition, such systems can be arranged for other embodiments to provide customized searches of a single source by many individual devices. Additional systems and methods are disclosed.
-
Publication No.: US11372914B2
Publication Date: 2022-06-28
Application No.: US15936117
Application Date: 2018-03-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yokesh Kumar , Kuang-Huei Lee , Houdong Hu , Li Huang , Arun Sacheti , Meenaz Merchant , Linjun Yang , Tianjun Xiao , Saurajit Mukherjee
IPC: G06F16/583 , G06F16/58 , G06F16/51 , G06F16/538 , G06N5/02 , G06F16/9535 , G06N20/00
Abstract: The description relates to diversified hybrid image annotation for annotating images. One implementation includes generating first image annotations for a query image using a retrieval-based image annotation technique. Second image annotations can be generated for the query image using a model-based image annotation technique. The first and second image annotations can be integrated to generate a diversified hybrid image annotation result for the query image.
-
Publication No.: US11036724B2
Publication Date: 2021-06-15
Application No.: US16560942
Application Date: 2019-09-04
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li Huang , Houdong Hu , Meenaz Merchant , Arun Sacheti
IPC: G06F16/242 , G06F16/9038 , G06F3/0482 , G06F16/28 , G06F16/951 , G06F16/9535 , G06F3/0481
Abstract: A visual search engine is described herein. The visual search engine is configured to return information to a client computing device based upon a multimodal query received from the client computing device (wherein the multimodal query comprises an image and text). The visual search engine is further configured to interact with a user of the client computing device to disambiguate information retrieval intent of the user.
-
Publication No.: US20200019628A1
Publication Date: 2020-01-16
Application No.: US16036224
Application Date: 2018-07-16
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xi Chen , Houdong Hu , Li Huang , Jiapei Huang , Arun Sacheti , Linjun Yang , Rui Xia , Kuang-Huei Lee , Meenaz Merchant , Sean Chang Culatana
Abstract: Representative embodiments disclose mechanisms to perform visual intent classification or visual intent detection or both on an image. Visual intent classification utilizes a trained machine learning model that classifies subjects in the image according to a classification taxonomy. The visual intent classification can be used as a pre-triggering mechanism to initiate further action in order to substantially save processing time. Example further actions include user scenarios, query formulation, user experience enhancement, and so forth. Visual intent detection utilizes a trained machine learning model to identify subjects in an image, place a bounding box around the image, and classify the subject according to the taxonomy. The trained machine learning model utilizes multiple feature detectors, multi-layer predictions, multilabel classifiers, and bounding box regression.
-
Publication No.: US20190311070A1
Publication Date: 2019-10-10
Application No.: US15947564
Application Date: 2018-04-06
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li Huang , Houdong Hu , Meenaz Merchant
Abstract: A method for using a speech signal to augment a visual search includes processing the image data to determine an image search intent. Concurrently with processing the image data, the method processes the speech signal to determine at least one speech search intent. The method generates a search query by combining keywords and/or the image from the image search intent with keywords from the speech search intent. The method then performs a search based on the generated query and reports the results of the search. The method generates the image search intent by applying the image data to a knowledge base and generates the speech search intent by converting the speech to text and applying the text to a cognition service.
-
Publication No.: US20190236487A1
Publication Date: 2019-08-01
Application No.: US15883686
Application Date: 2018-01-30
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jiapei Huang , Houdong Hu , Li Huang , Xi Chen , Linjun Yang
IPC: G06N99/00
CPC classification number: G06N20/00 , G06F3/04842
Abstract: A technique for hyperparameter tuning can be performed via a hyperparameter tuning tool. In the technique, computer-readable values for each of one or more machine learning hyperparameters can be received. Multiple computer-readable hyperparameter value sets can be defined using different combinations of the values. In response to a request to start, an overall hyperparameter tuning operation can be performed via the tool, with the overall operation including a tuning job for each of the hyperparameter sets. A computer-readable comparison of the results of the parameter tuning operations can be generated for the hyperparameter sets, with the comparison indicating effectiveness of the hyperparameter sets, as compared to each other, in the tuning jobs.
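Illustration (not part of the patent record): the abstract describes defining hyperparameter value sets from different combinations of supplied values, running one tuning job per set, and comparing the results. A minimal Python sketch of that flow follows. It assumes an exhaustive Cartesian product of the values, and the names `build_value_sets`, `run_tuning`, and `toy_job` are hypothetical. The scoring function is a stand-in for a real training run, not anything the patent specifies.

```python
from itertools import product


def build_value_sets(values):
    """Return every combination of the supplied hyperparameter values."""
    names = list(values)
    combos = product(*(values[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def run_tuning(values, tuning_job):
    """Run one tuning job per value set; return (score, set) pairs, best first."""
    results = [(tuning_job(hp), hp) for hp in build_value_sets(values)]
    return sorted(results, key=lambda pair: pair[0], reverse=True)


def toy_job(hp):
    # Hypothetical stand-in for a training run: a higher score is better.
    return -abs(hp["learning_rate"] - 0.01) - abs(hp["batch_size"] - 64) / 1000


comparison = run_tuning(
    {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64]},
    toy_job,
)
for score, hp in comparison:
    print(f"{score:.3f}  {hp}")
```

The sorted list plays the role of the comparison the abstract mentions: each value set appears next to its score, so the sets can be judged against each other.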
-
Publication No.: US11947589B2
Publication Date: 2024-04-02
Application No.: US17710761
Application Date: 2022-03-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li Huang , Rui Xia , Zhiting Chen , Kun Wu , Meenaz Merchant , Kamal Ginotra , Arun K. Sacheti , Chu Wang , Andrew Lawrence Stewart , Hanmu Zuo , Saurajit Mukherjee
IPC: G06F16/50 , G06F16/532 , G06F16/535 , G06F16/56
CPC classification number: G06F16/535 , G06F16/532 , G06F16/56
Abstract: Systems and methods directed to returning personalized image-based search results are described. In examples, a query including an image may be received, and a personalized item embedding may be generated based on the image and user profile information associated with a user. Further, a plurality of candidate images may be obtained based on the personalized item embedding. The candidate images may then be ranked according to a predicted level of user engagement for a user, and then diversified to ensure visual diversity among the ranked images. A portion of the diversified images may then be returned in response to an image-based search.
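Illustration (not part of the patent record): the abstract describes ranking candidate images by predicted user engagement and then diversifying the ranked list. The sketch below shows one simple way such a two-stage step could look. It assumes each candidate already carries an embedding and an engagement score, and it uses a greedy cosine-similarity filter as the diversification rule. The abstract does not state which rule is used, and `rank_and_diversify`, `max_similarity`, and `limit` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_and_diversify(candidates, max_similarity=0.95, limit=3):
    """candidates: list of (image_id, embedding, predicted_engagement)."""
    # Stage 1: rank by predicted engagement, highest first.
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    # Stage 2: greedily keep a candidate only if it is not too similar
    # to any image that has already been kept.
    kept = []
    for cand in ranked:
        if all(cosine(cand[1], k[1]) <= max_similarity for k in kept):
            kept.append(cand)
        if len(kept) == limit:
            break
    return [c[0] for c in kept]


candidates = [
    ("a", [1.0, 0.0], 0.9),
    ("b", [0.99, 0.05], 0.8),  # near-duplicate of "a"
    ("c", [0.0, 1.0], 0.7),
    ("d", [0.7, 0.7], 0.6),
]
result = rank_and_diversify(candidates)
print(result)
```

In this toy input, image "b" scores well but is nearly identical to "a", so the filter skips it in favor of visually different images further down the ranking.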