-
Publication No.: US10853344B2
Publication Date: 2020-12-01
Application No.: US15661269
Filing Date: 2017-07-27
Applicant: Microsoft Technology Licensing, LLC
Inventor: Zhongyuan Wang, Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
IPC: G06F7/02, G06F16/00, G06F16/22, G06F16/21, G06F16/28, G06F16/901, G06F16/955, G06F16/2455, G06F16/951
Abstract: The present invention extends to methods, systems, and computer program products for understanding tables for search. Aspects of the invention include identifying a subject tuple (e.g., a subject column) for a table, detecting a tuple header (e.g., a column header) using other tables, and detecting a tuple header (e.g., a column header) using a knowledge base. Implementations can be utilized in a structured data search system (SDSS) that indexes structured information, such as, tables in a relational database or html tables extracted from web pages. The SDSS allows users to search over the structured information (tables) using different mechanisms including keyword search and data finding data.
-
Publication No.: US10896229B2
Publication Date: 2021-01-19
Application No.: US16188210
Filing Date: 2018-11-12
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
IPC: G06F17/30, G06F16/951, G06F16/2458
Abstract: The present invention extends to methods, systems, and computer program products for computing features of structured data. Aspects of the invention include computing features of table components (e.g., of rows, columns, cells, etc.). Computed features can be used for ranking the table components. When aggregated, features for different components of a table can be used for ranking the table (e.g., a web table).
-
Publication No.: US10769140B2
Publication Date: 2020-09-08
Application No.: US14754318
Filing Date: 2015-06-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Philip A. Bernstein, Kaushik Chakrabarti, Zhimin Chen, Yeye He, Chi Wang, Kris K. Ganjam
IPC: G06F16/245, G06F16/901
Abstract: Concept expansion using tables, such as web tables, can return entities belonging to a concept based on an input of the concept and at least one seed entity that belongs to the concept. A concept expansion frontend can receive the concept and seed entity and provide them to a concept expansion framework. The concept expansion framework can expand the coverage of entities for concepts, including tail concepts, using tables by leveraging rich content signals corresponding to concept names. Such content signals can include content matching the concept that appear in captions, early headings, page titles, surrounding text, anchor text, and queries for which the page has been clicked. The concept expansion framework can use the structured entities in tables to infer exclusive tables. Such inference differs from previous label propagation methods and involves modeling a table-entity relationship. The table-entity relationship reduces semantic drift without using a reference ontology.
-
Publication No.: US20170371958A1
Publication Date: 2017-12-28
Application No.: US15195981
Filing Date: 2016-06-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kris K. Ganjam, Kaushik Chakrabarti
CPC classification number: G06F17/30707, G06F17/2715, G06N7/005
Abstract: The techniques discussed herein leverage structure within data of a corpus to parse unstructured data to obtain structured data and/or to predict latent data that is related to the unstructured and/or structured data. In some examples, parsing and/or predicting can be conducted at varying levels of granularity. In some examples, parsing and/or predicting can be iteratively conducted to improve accuracy and/or to expose more hidden data.
-
Publication No.: US10915564B2
Publication Date: 2021-02-09
Application No.: US16396867
Filing Date: 2019-04-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kris K. Ganjam, Kaushik Chakrabarti
IPC: G06F16/35, G06F40/216, G06N7/00
Abstract: The techniques discussed herein leverage structure within data of a corpus to parse unstructured data to obtain structured data and/or to predict latent data that is related to the unstructured and/or structured data. In some examples, parsing and/or predicting can be conducted at varying levels of granularity. In some examples, parsing and/or predicting can be iteratively conducted to improve accuracy and/or to expose more hidden data.
-
Publication No.: US10740328B2
Publication Date: 2020-08-11
Application No.: US15192909
Filing Date: 2016-06-24
Applicant: Microsoft Technology Licensing, LLC
Inventor: Bolin Ding, Silu Huang, Chi Wang, Kaushik Chakrabarti, Surajit Chaudhuri
IPC: G06F16/2453, G06F16/22, G06F16/2455, G06F16/2458
Abstract: A processing unit can determine a first subset of a data set including data records selected based on measure values thereof. The processing unit can determine an index mapping a predicate to data records associated with that predicate and approximation values of the records. The processing unit can process a query against the first subset to provide a first result and a first accuracy value, determine that the first accuracy value does not satisfy an accuracy criterion, and process the query against the index. In some examples, the processing unit can process the query against a second subset including data records satisfying a predetermined predicate. In some examples, the processing unit can receive data records and determine the first subset. Data records can include respective measure values. Data records with higher measure values can occur in the first subset more frequently than data records with lower measure values.
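Illustrative note: the flow in this abstract (answer from a measure-biased sample, check accuracy, fall back to an index) can be sketched as below. This is a minimal sketch under stated assumptions, not the patented method: the function names, the "measure" field, the single-column equality predicate, and the use of a minimum hit count (min_hits) as the accuracy criterion are all hypothetical choices made for illustration, and the index here simply stores the matching records rather than approximation values.

```python
import random

def measure_biased_sample(records, k, rng=random):
    """Draw k records with replacement, each picked with probability
    proportional to its measure value, so high-measure records occur
    in the sample more often than low-measure ones."""
    weights = [r["measure"] for r in records]
    return rng.choices(records, weights=weights, k=k)

def build_index(records, column):
    """Map each predicate value (column == value) to its matching records.
    Assumption: the index keeps full records, for simplicity."""
    index = {}
    for r in records:
        index.setdefault(r[column], []).append(r)
    return index

def answer_sum(sample, total_measure, index, column, value, min_hits=30):
    """Estimate SUM(measure) WHERE column == value.

    With measure-proportional sampling, the fraction of sample rows that
    match the predicate, times the total measure, estimates the sum.
    Assumption: too few matching rows means the accuracy criterion is
    not satisfied, so the query is answered from the index instead.
    """
    hits = sum(1 for r in sample if r[column] == value)
    if hits >= min_hits:
        return total_measure * hits / len(sample), "sample"
    exact = sum(r["measure"] for r in index.get(value, []))
    return exact, "index"
```

In this sketch, a selective predicate that matches only a handful of sampled rows is routed to the index, while a predicate covering most of the total measure is answered from the sample alone.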
-
Publication No.: US20190251109A1
Publication Date: 2019-08-15
Application No.: US16396867
Filing Date: 2019-04-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kris K. Ganjam, Kaushik Chakrabarti
CPC classification number: G06F16/353, G06F17/2715, G06N7/005
Abstract: The techniques discussed herein leverage structure within data of a corpus to parse unstructured data to obtain structured data and/or to predict latent data that is related to the unstructured and/or structured data. In some examples, parsing and/or predicting can be conducted at varying levels of granularity. In some examples, parsing and/or predicting can be iteratively conducted to improve accuracy and/or to expose more hidden data.
-
Publication No.: US10204142B2
Publication Date: 2019-02-12
Application No.: US14556232
Filing Date: 2014-11-30
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kaushik Chakrabarti, Surajit Chaudhuri, Senjuti Basu Roy
Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.
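Illustrative note: one way to picture a character string search structure whose nodes carry spatial bound indicators is a trie in which every node stores a bounding box and the best recommendation score of the items beneath it. The sketch below is an assumption-laden illustration, not the patented design: the Node layout, the ranking function (score minus Euclidean distance), and all names are hypothetical. The bounds let a best-first search visit only part of the trie.

```python
import heapq
import itertools
import math

class Node:
    def __init__(self):
        self.children = {}
        self.items = []                   # (name, score, x, y) ending at this node
        self.bbox = None                  # (min_x, min_y, max_x, max_y) of items below
        self.max_score = float("-inf")    # best recommendation score below

def _widen(node, score, x, y):
    """Grow the node's bounding box and score bound to cover one more item."""
    b = node.bbox
    node.bbox = (x, y, x, y) if b is None else (
        min(b[0], x), min(b[1], y), max(b[2], x), max(b[3], y))
    node.max_score = max(node.max_score, score)

def insert(root, name, score, x, y):
    node = root
    _widen(node, score, x, y)
    for ch in name:
        node = node.children.setdefault(ch, Node())
        _widen(node, score, x, y)
    node.items.append((name, score, x, y))

def min_dist(bbox, x, y):
    """Smallest possible distance from (x, y) to any point in bbox."""
    dx = max(bbox[0] - x, 0.0, x - bbox[2])
    dy = max(bbox[1] - y, 0.0, y - bbox[3])
    return math.hypot(dx, dy)

def suggest(root, prefix, x, y, k):
    """Return up to k names starting with prefix, ranked by
    score minus distance (a hypothetical combining function)."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    if node.bbox is None:
        return []
    tie = itertools.count()  # unique tiebreaker so nodes are never compared
    bound = node.max_score - min_dist(node.bbox, x, y)
    heap = [(-bound, next(tie), node, None)]
    results = []
    while heap and len(results) < k:
        _, _, sub, name = heapq.heappop(heap)
        if sub is None:      # an item: its exact rank beats every remaining bound
            results.append(name)
            continue
        for item_name, score, ix, iy in sub.items:
            exact = score - math.hypot(ix - x, iy - y)
            heapq.heappush(heap, (-exact, next(tie), None, item_name))
        for child in sub.children.values():
            upper = child.max_score - min_dist(child.bbox, x, y)
            heapq.heappush(heap, (-upper, next(tie), child, None))
    return results
```

Because each node's bound is never lower than the rank of any item under it, the search can stop as soon as k items have been popped, leaving subtrees that are far away or poorly scored unexplored.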
-
Publication No.: US10127315B2
Publication Date: 2018-11-13
Application No.: US14325376
Filing Date: 2014-07-08
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
IPC: G06F17/30
Abstract: The present invention extends to methods, systems, and computer program products for computing features of structured data. Aspects of the invention include computing features of table components (e.g., of rows, columns, cells, etc.). Computed features can be used for ranking the table components. When aggregated, features for different components of a table can be used for ranking the table (e.g., a web table).
-
Publication No.: US20180232410A1
Publication Date: 2018-08-16
Application No.: US15950176
Filing Date: 2018-04-11
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
IPC: G06F17/30
CPC classification number: G06F16/2282, G06F16/24573, G06F16/951, G06F16/9558, G06F16/958
Abstract: The present invention extends to methods, systems, and computer program products for refining structured data indexes. Aspects of the invention include associating structured data, such as, for example, tables, with additional content. Additional content can include content outside the <table> and </table> tags of a web table. Indexes for structured data (e.g., table indexes) can be refined based on the additional content to improve the relevance of providing parts of the structured data (e.g., parts of the table) in search results.