-
Publication No.: US20170371924A1
Publication date: 2017-12-28
Application No.: US15192909
Filing date: 2016-06-24
Applicant: Microsoft Technology Licensing, LLC
Inventor: Bolin Ding, Silu Huang, Chi Wang, Kaushik Chakrabarti, Surajit Chaudhuri
IPC: G06F17/30
Abstract: A processing unit can determine a first subset of a data set including data records selected based on measure values thereof. The processing unit can determine an index mapping a predicate to data records associated with that predicate and approximation values of the records. The processing unit can process a query against the first subset to provide a first result and a first accuracy value, determine that the first accuracy value does not satisfy an accuracy criterion, and process the query against the index. In some examples, the processing unit can process the query against a second subset including data records satisfying a predetermined predicate. In some examples, the processing unit can receive data records and determine the first subset. Data records can include respective measure values. Data records with higher measure values can occur in the first subset more frequently than data records with lower measure values.
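The sampling idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes records are dictionaries with a non-negative `measure` field, that the query is a SUM over records matching a predicate, and that the "index" step can be stood in for by an exact scan. The function names and the relative-error criterion are hypothetical.

```python
import math
import random

def measure_biased_sample(records, n, rng):
    """Draw n records with replacement, with probability proportional to 'measure'."""
    weights = [r["measure"] for r in records]
    return rng.choices(records, weights=weights, k=n)

def estimate_sum(sample, total_measure, predicate):
    """Estimate SUM(measure) over matching records, plus a rough standard error.

    Each draw has probability measure / total_measure, so the estimate is
    total_measure times the fraction of draws that satisfy the predicate.
    """
    n = len(sample)
    frac = sum(1 for r in sample if predicate(r)) / n
    estimate = total_measure * frac
    std_err = total_measure * math.sqrt(frac * (1.0 - frac) / n)
    return estimate, std_err

def answer(records, sample, total_measure, predicate, max_rel_err):
    """Use the sample when it looks accurate enough, otherwise fall back."""
    estimate, std_err = estimate_sum(sample, total_measure, predicate)
    if estimate > 0 and std_err / estimate <= max_rel_err:
        return estimate, "sample"
    # Assumption: an exact scan stands in for the index lookup in the abstract.
    exact = sum(r["measure"] for r in records if predicate(r))
    return exact, "fallback"

# Usage: records with larger measures are drawn more often.
records = [{"region": "east", "measure": 60}, {"region": "west", "measure": 40}]
sample = measure_biased_sample(records, 200, random.Random(0))
result, source = answer(records, sample, 100, lambda r: r["region"] == "east", 0.2)
```

Because the sampling probability is proportional to the measure, the per-record measure cancels out of the estimator, which keeps the variance low for SUM-style queries dominated by a few large values.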
-
Publication No.: US20170322964A1
Publication date: 2017-11-09
Application No.: US15661269
Filing date: 2017-07-27
Applicant: Microsoft Technology Licensing, LLC
Inventor: Zhongyuan Wang, Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
IPC: G06F17/30
CPC classification number: G06F16/2282 , G06F16/211 , G06F16/221 , G06F16/2455 , G06F16/284 , G06F16/901 , G06F16/951 , G06F16/955
Abstract: The present invention extends to methods, systems, and computer program products for understanding tables for search. Aspects of the invention include identifying a subject tuple (e.g., a subject column) for a table, detecting a tuple header (e.g., a column header) using other tables, and detecting a tuple header (e.g., a column header) using a knowledge base. Implementations can be utilized in a structured data search system (SDSS) that indexes structured information, such as tables in a relational database or HTML tables extracted from web pages. The SDSS allows users to search over the structured information (tables) using different mechanisms including keyword search and data finding data.
-
Publication No.: US20170132329A1
Publication date: 2017-05-11
Application No.: US14932983
Filing date: 2015-11-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Mohamed Yakout, Kaushik Chakrabarti, Maria Pershina
IPC: G06F17/30
CPC classification number: G06F16/9024
Abstract: Techniques for using digital entity correlation to generate a composite knowledge graph from constituent graphs. In an aspect, digital attribute values associated with primary entities may be encoded into primitives, e.g., using a multi-resolution encoding scheme. A pairs graph may be constructed, based on seed pairs calculated from correlating encoded primitives, and further expanded to include subjects and objects of the seed pairs, as well as pairs connected to relationship entities. A similarity metric is computed for each candidate pair to determine whether a match exists. The similarity metric may be based on summing a weighted landing probability over all primitives associated directly or indirectly with each candidate pair. By incorporating primitive matches from not only the candidate pair but also from pairs surrounding the candidate pair, entity matching may be efficiently implemented on a holistic basis.
-
Publication No.: US10810181B2
Publication date: 2020-10-20
Application No.: US15950176
Filing date: 2018-04-11
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
IPC: G06F16/22 , G06F16/951 , G06F16/958 , G06F16/955 , G06F16/2457
Abstract: The present invention extends to methods, systems, and computer program products for refining structured data indexes. Aspects of the invention include associating structured data, such as, for example, tables, with additional content. Additional content can include content outside the <table> and </table> tags of a web table. Indexes for structured data (e.g., table indexes) can be refined based on the additional content to improve the relevance of providing parts of the structured data (e.g., parts of the table) in search results.
-
Publication No.: US10311092B2
Publication date: 2019-06-04
Application No.: US15195981
Filing date: 2016-06-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kris K. Ganjam, Kaushik Chakrabarti
Abstract: The techniques discussed herein leverage structure within data of a corpus to parse unstructured data to obtain structured data and/or to predict latent data that is related to the unstructured and/or structured data. In some examples, parsing and/or predicting can be conducted at varying levels of granularity. In some examples, parsing and/or predicting can be iteratively conducted to improve accuracy and/or to expose more hidden data.
-
Publication No.: US20160378765A1
Publication date: 2016-12-29
Application No.: US14754318
Filing date: 2015-06-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Philip A. Bernstein, Kaushik Chakrabarti, Zhimin Chen, Yeye He, Chi Wang, Kris K. Ganjam
IPC: G06F17/30
CPC classification number: G06F16/245 , G06F16/9024
Abstract: Concept expansion using tables, such as web tables, can return entities belonging to a concept based on an input of the concept and at least one seed entity that belongs to the concept. A concept expansion frontend can receive the concept and seed entity and provide them to a concept expansion framework. The concept expansion framework can expand the coverage of entities for concepts, including tail concepts, using tables by leveraging rich content signals corresponding to concept names. Such content signals can include content matching the concept that appears in captions, early headings, page titles, surrounding text, anchor text, and queries for which the page has been clicked. The concept expansion framework can use the structured entities in tables to infer exclusive tables. Such inference differs from previous label propagation methods and involves modeling a table-entity relationship. The table-entity relationship reduces semantic drift without using a reference ontology.
-
Publication No.: US10963471B2
Publication date: 2021-03-30
Application No.: US16223907
Filing date: 2018-12-18
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kaushik Chakrabarti, Surajit Chaudhuri, Senjuti Basu Roy
IPC: G06F16/24 , G06F16/2457 , G06F16/13 , G06F16/29 , G06F16/951 , G01C21/36
Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.
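One way to picture the search structure described in this abstract is a character trie whose nodes carry a bounding box and a maximum popularity for everything beneath them, so that unpromising subtrees are never expanded. The Python sketch below is illustrative only: the scoring function `popularity / (1 + distance)`, the class and function names, and the best-first traversal are assumptions, not details taken from the patent.

```python
import heapq
import itertools
import math

class Node:
    def __init__(self):
        self.children = {}
        self.items = []      # (name, popularity, x, y) for names ending here
        self.max_pop = 0.0   # highest popularity anywhere in this subtree
        self.box = None      # (min_x, min_y, max_x, max_y) of the subtree

def _cover(node, pop, x, y):
    """Widen the node's bounds so they cover one more item."""
    node.max_pop = max(node.max_pop, pop)
    b = node.box
    node.box = (x, y, x, y) if b is None else (
        min(b[0], x), min(b[1], y), max(b[2], x), max(b[3], y))

def insert(root, name, pop, x, y):
    node = root
    _cover(node, pop, x, y)
    for ch in name.lower():
        node = node.children.setdefault(ch, Node())
        _cover(node, pop, x, y)
    node.items.append((name, pop, x, y))

def score(pop, dist):
    # Hypothetical ranking: popular and nearby items score higher.
    return pop / (1.0 + dist)

def min_dist_to_box(box, x, y):
    dx = max(box[0] - x, 0.0, x - box[2])
    dy = max(box[1] - y, 0.0, y - box[3])
    return math.hypot(dx, dy)

def suggest(root, prefix, x, y, k):
    """Return up to k names starting with prefix, best score first."""
    node = root
    for ch in prefix.lower():
        node = node.children.get(ch)
        if node is None:
            return []
    if node.box is None:
        return []
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    bound = score(node.max_pop, min_dist_to_box(node.box, x, y))
    heap = [(-bound, next(tie), node, None)]
    results = []
    while heap and len(results) < k:
        _, _, n, item = heapq.heappop(heap)
        if item is not None:
            results.append(item[0])  # exact score beats every remaining bound
            continue
        for it in n.items:
            s = score(it[1], math.hypot(it[2] - x, it[3] - y))
            heapq.heappush(heap, (-s, next(tie), None, it))
        for child in n.children.values():
            ub = score(child.max_pop, min_dist_to_box(child.box, x, y))
            heapq.heappush(heap, (-ub, next(tie), child, None))
    return results

root = Node()
for name, pop, px, py in [("Starbucks", 5.0, 0, 0), ("Staples", 9.0, 10, 0),
                          ("Stadium", 1.0, 0, 1), ("Subway", 8.0, 0, 0)]:
    insert(root, name, pop, px, py)
print(suggest(root, "sta", 0, 0, 2))   # ['Starbucks', 'Staples']
print(suggest(root, "sta", 10, 0, 2))  # ['Staples', 'Starbucks']
```

Each node's bound is an upper limit on the score of any item below it, so the traversal can stop as soon as k items have been popped, leaving the rest of the trie unvisited.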
-
Publication No.: US10726018B2
Publication date: 2020-07-28
Application No.: US14177081
Filing date: 2014-02-10
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kaushik Chakrabarti, Meihui Zhang
IPC: G06F16/00 , G06F16/2457 , G06F16/2455
Abstract: Techniques and constructs to facilitate semantic matching and automated annotation (SMA) of attributes can take entity names and a keyword describing an attribute associated with the named entities as input and leverage a corpus of data such as data from tables, which can include HTML web tables, to automatically populate values associated with the named entities for the attribute. The constructs enable accurate SMA of attributes, such as attributes that relate to the entity and include numeric values in a different unit than the query, in a different scale than the query, and/or reflecting a time different from that of the query. An entity augmentation application programming interface (API) may be used to accept queries that include numeric criteria, parameters, or arguments, including query attributes represented by numeric values, which may be in different units or scales, and attributes represented by numeric values that can vary by time.