-
Publication No.: US12299029B2
Publication date: 2025-05-13
Application No.: US15888960
Application date: 2018-02-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang, Houdong Hu, Li Huang, Arun K. Sacheti, Linjun Yang
IPC: G06F16/58, G06F16/2457, G06F16/51, G06F16/56, G06F16/583, G06F16/587
Abstract: Systems and methods can be implemented to conduct a visual search as a service in a variety of applications. In various embodiments, a system is configured to provide searching capabilities of content provided by a first entity in response to a search request by a second entity. An image provided by the second entity can be used by the system as a query image to search the content of the first entity. In an embodiment, the first entity can be a commercial entity providing such a system with image related content regarding its products and services such that any number of individual consumers can search for particular products and services of the commercial entity via their communication enabled devices. In addition, such systems can be arranged for other embodiments to provide customized searches of a single source by many individual devices. Additional systems and methods are disclosed.
-
Publication No.: US11669558B2
Publication date: 2023-06-06
Application No.: US16368798
Application date: 2019-03-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang, Ye Wu, Houdong Hu, Surendra Ulabala, Vishal Thakkar, Arun Sacheti
IPC: G06N3/04, G06N5/02, G06N3/045, G06F16/33, G06F16/245, G06F16/248, G06V20/62, G06F18/2413, G06F17/16
CPC classification number: G06F16/3347, G06F16/245, G06F16/248, G06F18/2413, G06N3/04, G06N3/045, G06N5/02, G06V20/62, G06F17/16
Abstract: A computer-implemented technique generates a dense embedding vector that provides a distributed representation of input text. The technique includes: generating an input term-frequency (TF) vector of dimension g that includes frequency information relating to frequency of occurrence of terms in an instance of input text; using a TF-modifying component to modify the term-specific frequency information in the input TF vector by respective machine-trained weighting factors, to produce an intermediate vector of dimension g; using a projection component to project the intermediate vector of dimension g into an embedding vector of dimension k, where k is less than g. Both the TF-modifying component and the projection component use respective machine-trained neural networks. An application performs any of a retrieval-based function, a recognition-based function, a recommendation-based function, a classification-based function, etc. based on the embedding vector.
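The three steps in this abstract (build a TF vector of dimension g, rescale each term by a weighting factor, project down to dimension k) can be illustrated with a minimal sketch. The vocabulary, weights, and projection matrix below are hand-picked placeholders, not values from the patent; in the described technique both the weighting factors and the projection come from machine-trained neural networks.

```python
from collections import Counter

def tf_vector(text, vocab):
    """Count how often each vocabulary term occurs in the text (dimension g)."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]

def reweight(tf, weights):
    """Scale each term frequency by its per-term weighting factor (still dimension g)."""
    return [f * w for f, w in zip(tf, weights)]

def project(vec, matrix):
    """Multiply by a k x g matrix to obtain an embedding of dimension k."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# Hypothetical stand-ins for trained parameters (assumptions for illustration).
vocab = ["red", "shoe", "blue", "bag"]      # g = 4
weights = [0.5, 2.0, 0.5, 2.0]              # one weighting factor per term
projection = [[1.0, 0.0, 1.0, 0.0],         # k = 2 rows, each of length g
              [0.0, 1.0, 0.0, 1.0]]

tf = tf_vector("red red shoe", vocab)
intermediate = reweight(tf, weights)
embedding = project(intermediate, projection)
print(tf, intermediate, embedding)
```

The sketch only shows the data flow and the dimension reduction from g to k; it makes no claim about how the weights are trained.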
-
Publication No.: US20220083853A1
Publication date: 2022-03-17
Application No.: US17021779
Application date: 2020-09-15
Applicant: Microsoft Technology Licensing, LLC
Inventor: Parag Agrawal, Ankan Saha, Yafei Wang, Yan Wang, Eric Lawrence, Ashwin Narasimha Murthy, Aastha Nigam, Bohong Zhao, Albert Lingfeng Cui, David Sung, Aastha Jain, Abdulla Mohammad Al-Qawasmeh
Abstract: In an example embodiment, a single machine learned model that allows for ranking of entities across all of the different combinations of node types and edge types is provided. The solution calibrates the scores from Edge-FPR models to a single scale. Additionally, the solution may utilize a per-edge type multiplicative factor dictated by the true importance of an edge type, which is learned through a counterfactual experimentation process. The solution may additionally optimize on a single, common downstream metric, specifically downstream interactions that can be compared against each other across all combinations of node types and edge types.
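One way to picture the calibration and per-edge-type factor is the sketch below. The abstract does not say how calibration is done, so the sigmoid form, the parameter values, and the edge-type names ("connect", "follow") are all assumptions made for illustration.

```python
import math

# Hypothetical per-edge-type calibration parameters (slope, offset) that map
# raw model scores onto a shared 0..1 scale. Not taken from the patent.
CALIBRATION = {"connect": (1.0, 0.0), "follow": (0.1, -5.0)}

# Hypothetical multiplicative importance factor per edge type.
IMPORTANCE = {"connect": 1.0, "follow": 0.5}

def unified_score(edge_type, raw_score):
    """Calibrate a raw score to a common scale, then apply the edge-type factor."""
    slope, offset = CALIBRATION[edge_type]
    calibrated = 1.0 / (1.0 + math.exp(-(slope * raw_score + offset)))
    return IMPORTANCE[edge_type] * calibrated

# Candidates of different edge types, with raw scores on different scales.
candidates = [("alice", "connect", 0.0),
              ("news_page", "follow", 50.0),
              ("bob", "connect", 2.0)]

ranked = sorted(candidates, key=lambda c: unified_score(c[1], c[2]), reverse=True)
print([name for name, _, _ in ranked])
```

Once every score sits on one scale, candidates of different edge types can be sorted in a single list, which is the point the abstract makes.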
-
Publication No.: US20210097339A1
Publication date: 2021-04-01
Application No.: US16584619
Application date: 2019-09-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Parag Agrawal, Yan Wang, Aastha Jain, Hema Raghavan
Abstract: The disclosed embodiments provide a system for performing inference. During operation, the system obtains a graph containing nodes representing members of an online system, edges between pairs of nodes, and edge scores representing confidences in a type of relationship between the pairs of nodes. Next, the system performs a set of iterations that propagate a label for the type of relationship from a first subset of edges to remaining edges in the graph, with each iteration updating a probability of the label for an edge between a pair of nodes based on a subset of edge scores for a second subset of edges connected to one or both nodes in the pair and probabilities of the label for the second subset of edges. The system then performs one or more tasks in the online system based on the probability of the label for the edge.
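The iterative propagation described here can be sketched roughly as follows. The update rule (a score-weighted average over edges that share a node) is an assumption chosen for simplicity; the abstract only states that each update depends on neighboring edge scores and label probabilities.

```python
def propagate(edges, scores, seeds, iterations=10):
    """Spread label probabilities from seed edges to the remaining edges.

    edges:  list of (node, node) tuples
    scores: dict mapping each edge to its edge score (confidence)
    seeds:  dict mapping labeled edges to a fixed label probability
    """
    probs = {e: seeds.get(e, 0.0) for e in edges}
    for _ in range(iterations):
        updated = dict(probs)
        for e in edges:
            if e in seeds:
                continue  # seed labels stay fixed
            # Neighboring edges share at least one node with e.
            neighbors = [n for n in edges if n != e and set(n) & set(e)]
            total = sum(scores[n] for n in neighbors)
            if total > 0:
                updated[e] = sum(scores[n] * probs[n] for n in neighbors) / total
        probs = updated
    return probs

# Hypothetical chain graph a-b-c-d with one labeled edge.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
scores = {e: 1.0 for e in edges}
probs = propagate(edges, scores, seeds={("a", "b"): 1.0})
print(probs)
```

In this toy chain the label spreads outward from the seed edge, and edges closer to the seed are never less confident than edges farther away.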
-
Publication No.: US11928875B2
Publication date: 2024-03-12
Application No.: US16297388
Application date: 2019-03-08
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang, Ye Wu, Arun Sacheti
IPC: G06F16/387, G06F16/31, G06F16/35, G06F18/2411, G06N20/10, G06V10/70, G06V10/75, G06V10/80, G06V20/62, G06V30/144, G06V30/148, G06V30/18, G06V30/19, G06V30/413, G06V30/414, G06V30/10
CPC classification number: G06V30/18171, G06F16/313, G06F16/35, G06F16/387, G06F18/2411, G06N20/10, G06V10/70, G06V10/75, G06V10/806, G06V20/62, G06V30/144, G06V30/153, G06V30/158, G06V30/1916, G06V30/19173, G06V30/413, G06V30/414, G06V30/10
Abstract: Described herein is a mechanism for visual recognition of items or visual search using Optical Character Recognition (OCR) of text in images. Recognized OCR blocks in an image comprise position information and recognized text. The embodiments utilize a location-aware feature vector created using the position and recognized information in each recognized block. The location-aware features of the feature vector utilize position information associated with the block to calculate a weight for the block. The recognized text is used to construct a tri-character gram frequency, inverse document frequency (TGF-IDF) metric using tri-character grams extracted from the recognized text. Features in the location-aware feature vector for the block are computed by multiplying the weight and the corresponding TGF-IDF metric. The location-aware feature vector for the image is the sum of the location-aware feature vectors for the individual blocks.
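The computation in this abstract (per-block weight times a tri-character gram metric, summed over blocks) can be sketched as below. The weighting function is an assumption: the abstract says the weight is derived from block position but not how, so this sketch simply favors blocks near the image center. The IDF table is also a made-up placeholder.

```python
import math
from collections import Counter, defaultdict

def trigrams(text):
    """Split text into overlapping three-character grams."""
    t = text.lower()
    return [t[i:i + 3] for i in range(len(t) - 2)]

def block_weight(cx, cy):
    """Hypothetical position weight: 1.0 at the image center, lower farther away.

    cx, cy are the block center in normalized image coordinates (0..1).
    """
    return max(0.0, 1.0 - math.hypot(cx - 0.5, cy - 0.5))

def image_vector(blocks, idf):
    """Sum weight * gram frequency * IDF over all OCR blocks of one image."""
    vec = defaultdict(float)
    for block in blocks:
        w = block_weight(block["cx"], block["cy"])
        for gram, freq in Counter(trigrams(block["text"])).items():
            vec[gram] += w * freq * idf.get(gram, 0.0)
    return dict(vec)

# Two OCR blocks with the same text at different positions (illustrative data).
blocks = [{"text": "cola", "cx": 0.5, "cy": 0.5},
          {"text": "cola", "cx": 0.5, "cy": 1.0}]
idf = {"col": 1.0, "ola": 2.0}  # placeholder inverse document frequencies
vec = image_vector(blocks, idf)
print(vec)
```

The same text contributes more when it sits in a heavily weighted region, which is what makes the feature vector location-aware.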
-
Publication No.: US11074289B2
Publication date: 2021-07-27
Application No.: US15885568
Application date: 2018-01-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Houdong Hu, Yan Wang, Linjun Yang, Li Huang, Xi Chen, Jiapei Huang, Ye Wu, Arun K. Sacheti, Meenaz Merchant
IPC: G06F16/53, G06F16/532, G06T7/00, G06K9/62, G06K9/46, G06N3/08, G06F16/51, G06F16/56, G06F16/583, G06F16/2457
Abstract: Systems and methods can be implemented to conduct searches based on images used as queries in a variety of applications. In various embodiments, a set of visual words representing a query image are generated from features extracted from the query image and are compared with visual words of index images. A set of candidate images is generated from the index images resulting from matching one or more visual words in the comparison. A multi-level ranking is conducted to sort the candidate images of the set of candidate images, and results of the multi-level ranking are returned to a user device that provided the query image. Additional systems and methods are disclosed.
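A rough sketch of the matching and ranking flow described here: candidates are index images that share at least one visual word with the query, and ranking proceeds in levels. The two ranking criteria used below (shared-word count, then Jaccard similarity) and the integer word IDs are assumptions for illustration; the abstract does not specify the ranking levels.

```python
from collections import defaultdict

def build_index(index_images):
    """Build an inverted index from visual word ID to the images containing it."""
    inverted = defaultdict(set)
    for image_id, words in index_images.items():
        for word in words:
            inverted[word].add(image_id)
    return inverted

def search(query_words, index_images, inverted, shortlist=2):
    """Return candidate image IDs ranked in two levels."""
    # Candidate generation: any image matching one or more visual words.
    candidates = set()
    for word in query_words:
        candidates |= inverted.get(word, set())

    # Level 1 (cheap): number of shared visual words; keep a shortlist.
    level1 = sorted(candidates,
                    key=lambda i: (-len(query_words & index_images[i]), i))
    level1 = level1[:shortlist]

    # Level 2 (finer, hypothetical): Jaccard similarity of the word sets.
    def jaccard(i):
        words = index_images[i]
        return len(query_words & words) / len(query_words | words)

    return sorted(level1, key=lambda i: (-jaccard(i), i))

# Illustrative index: each image is represented by a set of visual word IDs.
index_images = {"img1": {1, 2, 3},
                "img2": {2, 3, 4, 5, 6, 7, 8},
                "img3": {9},
                "img4": {3}}
inverted = build_index(index_images)
results = search({2, 3, 4}, index_images, inverted)
print(results)
```

The inverted index keeps candidate generation cheap, so the more expensive ranking levels only ever see a small shortlist.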
-
Publication No.: US10997468B2
Publication date: 2021-05-04
Application No.: US16799528
Application date: 2020-02-24
Applicant: Microsoft Technology Licensing, LLC
Inventor: Arun Sacheti, Fnu Yokesh Kumar, Saurajit Mukherjee, Nikesh Srivastava, Yan Wang, Kuang-Huei Lee, Surendra Ulabala
IPC: G06K9/62, G06F16/532, G06F16/583, G06K9/00
Abstract: Non-limiting examples described herein relate to ensemble model processing for image recognition that improves precision and recall for image recognition processing as compared with existing solutions. An exemplary ensemble model is configured to enhance image recognition processing through aggregate data modeling processing that evaluates image recognition prediction results obtained through processing that comprises: nearest neighbor visual search analysis, categorical image classification analysis and/or categorical instance retrieval analysis. An exemplary ensemble model is scalable, where new segments/categories can be bootstrapped to build deeper learning models and achieve high precision image recognition, while the cost of implementation (including from a bandwidth and resource standpoint) is lower than what is currently available across the industry today. Processing described herein, including implementation of an exemplary ensemble data model, may be exposed as a web service that is standalone or integrated within other applications/services to enhance the processing efficiency of applications/services such as productivity applications/services.
-
Publication No.: US10824899B2
Publication date: 2020-11-03
Application No.: US16234148
Application date: 2018-12-27
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang, Arun Sacheti, Vishal Chhabilbhai Thakkar, Surendra Srinivas Ulabala, Shloak Jain
Abstract: Representative embodiments disclose mechanisms to create a text stream from raw OCR outputs. The raw OCR output comprises a plurality of bounding boxes, each bounding box defining a region containing text which has been recognized by the OCR system. A weight matrix is calculated that comprises a weight for each pair of bounding boxes, each weight representing the probability that the pair of bounding boxes belongs to the same cluster. The bounding boxes are then clustered according to the weights. The resulting clusters are first ordered using a first ordering criterion. The bounding boxes within each cluster are then ordered according to a second ordering criterion. The ordered clusters and bounding boxes are then arranged into a text stream.
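The pipeline in this abstract (pairwise weights, clustering, two-stage ordering, text stream) can be sketched as below. The pairwise weight function, the threshold-based clustering, and both ordering criteria (top-to-bottom for clusters, then top-to-bottom and left-to-right for boxes) are assumptions picked for illustration; the abstract does not specify them.

```python
def pair_weight(a, b):
    """Hypothetical same-cluster probability: higher when two boxes are close."""
    dist = abs(a["x"] - b["x"]) + abs(a["y"] - b["y"])
    return 1.0 / (1.0 + dist / 50.0)

def cluster(boxes, threshold=0.5):
    """Group boxes whose pairwise weight meets the threshold (union-find)."""
    n = len(boxes)
    weights = [[pair_weight(boxes[i], boxes[j]) for j in range(n)]
               for i in range(n)]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if weights[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(boxes[i])
    return list(groups.values())

def text_stream(boxes):
    """Order clusters, order boxes inside each cluster, then join the text."""
    clusters = cluster(boxes)
    clusters.sort(key=lambda c: (min(b["y"] for b in c), min(b["x"] for b in c)))
    lines = []
    for c in clusters:
        c.sort(key=lambda b: (b["y"], b["x"]))
        lines.append(" ".join(b["text"] for b in c))
    return "\n".join(lines)

# Illustrative OCR output: x, y are the top-left corner of each bounding box.
boxes = [{"text": "World", "x": 40, "y": 0},
         {"text": "Hello", "x": 0, "y": 0},
         {"text": "Footer", "x": 0, "y": 400}]
stream = text_stream(boxes)
print(stream)
```

Clustering before ordering keeps text from visually separate regions from being interleaved, even when the OCR engine emits boxes in arbitrary order.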
-
Publication No.: US20190243910A1
Publication date: 2019-08-08
Application No.: US15888960
Application date: 2018-02-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yan Wang, Houdong Hu, Li Huang, Arun K. Sacheti, Linjun Yang
IPC: G06F17/30
CPC classification number: G06F16/5838, G06F16/24578, G06F16/51, G06F16/56, G06F16/583, G06F16/5866, G06F16/587
Abstract: Systems and methods can be implemented to conduct a visual search as a service in a variety of applications. In various embodiments, a system is configured to provide searching capabilities of content provided by a first entity in response to a search request by a second entity. An image provided by the second entity can be used by the system as a query image to search the content of the first entity. In an embodiment, the first entity can be a commercial entity providing such a system with image related content regarding its products and services such that any number of individual consumers can search for particular products and services of the commercial entity via their communication enabled devices. In addition, such systems can be arranged for other embodiments to provide customized searches of a single source by many individual devices. Additional systems and methods are disclosed.
-
Publication No.: US20190236167A1
Publication date: 2019-08-01
Application No.: US15885568
Application date: 2018-01-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Houdong Hu, Yan Wang, Linjun Yang, Li Huang, Xi Chen, Jiapei Huang, Ye Wu, Arun K. Sacheti, Meenaz Merchant
CPC classification number: G06F16/532, G06F16/24578, G06F16/51, G06F16/56, G06F16/5838, G06K9/46, G06K9/6215, G06K9/627, G06K2209/27, G06N3/08, G06T7/97, G06T2207/20084, G06T2207/30196
Abstract: Systems and methods can be implemented to conduct searches based on images used as queries in a variety of applications. In various embodiments, a set of visual words representing a query image are generated from features extracted from the query image and are compared with visual words of index images. A set of candidate images is generated from the index images resulting from matching one or more visual words in the comparison. A multi-level ranking is conducted to sort the candidate images of the set of candidate images, and results of the multi-level ranking are returned to a user device that provided the query image. Additional systems and methods are disclosed.