Abstract:
An index and materialized view selection wizard produces a fast and reasonable recommendation for a configuration of indexes, materialized views, and indexes on materialized views which are beneficial given a specified workload for a given database and database server. Candidate materialized views and indexes are obtained, and a joint enumeration of the combined materialized views and indexes is performed to obtain a recommended configuration. The configuration includes indexes, materialized views, and indexes on materialized views. Candidate materialized views are obtained by first determining which subsets of tables are referenced in queries in the workload and then finding interesting table subsets. Next, interesting subsets are considered on a per-query basis to determine which are syntactically relevant for a query. Materialized views which are likely to be used for the workload are then generated, along with a set of merged materialized views. Clustered indexes and non-clustered indexes on materialized views are then generated. The indexes, materialized views, and indexes on materialized views are then enumerated together to form the recommended configuration.
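A minimal sketch of the candidate-selection step described above, assuming each query in the workload is reduced to the set of tables it references; the frequency threshold, subset-size cap, and names are illustrative assumptions, not the wizard's actual method:

```python
from collections import Counter
from itertools import combinations

def interesting_table_subsets(workload, min_frequency=2, max_size=3):
    """Count how often each subset of tables is referenced together by
    queries in the workload; subsets that recur at least min_frequency
    times are kept as 'interesting' candidates for materialized views."""
    counts = Counter()
    for query_tables in workload:           # each query: set of referenced tables
        for size in range(1, min(max_size, len(query_tables)) + 1):
            for subset in combinations(sorted(query_tables), size):
                counts[subset] += 1
    return [subset for subset, c in counts.items() if c >= min_frequency]

# Hypothetical workload: each entry lists the tables one query references.
workload = [
    {"orders", "customers"},
    {"orders", "customers", "products"},
    {"orders", "products"},
    {"customers"},
]
print(interesting_table_subsets(workload))
```

In a full implementation, candidate views built over these subsets would then be merged and enumerated jointly with candidate indexes under a cost model.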
Abstract:
Using adaptive random sampling with cross-validation helps determine when enough data of a database has been sampled to construct histograms on one or more columns of one or more tables of the database within a desired or predetermined degree of accuracy. An adaptive random sampling histogram construction tool constructs an approximate equi-height k-histogram using an initial sample of data values from the database and iteratively updates the histogram using an additional sample of data values from the database until the histogram is within the desired degree of accuracy. The accuracy of the histogram is cross-validated against the additional sample at each iteration, and the additional sample is used to update the histogram to help improve its accuracy. The accuracy of the histogram may be measured by an error in distribution of the additional sample over the histogram as compared to a threshold error using a suitable error metric. By attempting to sample only the number of data values necessary to construct the histogram within the desired degree of accuracy, the adaptive random sampling histogram construction tool attempts to avoid any cost increases in time and memory from sampling too many data values.
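The iterative sampling loop can be illustrated with a small sketch; the equi-height construction, the cross-validation error metric (maximum relative bucket deviation), and the iteration cap below are assumptions chosen for brevity rather than the tool's specific metrics:

```python
import random

def equi_height_boundaries(sample, k):
    """Bucket boundaries so each of the k buckets holds roughly the same
    number of sampled values (an approximate equi-height k-histogram)."""
    ordered = sorted(sample)
    return [ordered[(i * len(ordered)) // k] for i in range(1, k)]

def bucket_of(value, boundaries):
    for i, b in enumerate(boundaries):
        if value <= b:
            return i
    return len(boundaries)

def cross_validation_error(boundaries, fresh_sample, k):
    """Maximum relative deviation of the fresh sample's bucket counts from
    the ideal equi-height count -- a simple stand-in error metric."""
    counts = [0] * k
    for v in fresh_sample:
        counts[bucket_of(v, boundaries)] += 1
    ideal = len(fresh_sample) / k
    return max(abs(c - ideal) / ideal for c in counts)

def adaptive_histogram(column, k=10, step=500, threshold=0.3, max_rounds=20, seed=0):
    rng = random.Random(seed)
    sample = rng.sample(column, step)              # initial random sample
    boundaries = equi_height_boundaries(sample, k)
    for _ in range(max_rounds):
        fresh = rng.sample(column, step)           # additional sample for cross-validation
        if cross_validation_error(boundaries, fresh, k) <= threshold:
            break                                  # within the desired accuracy; stop sampling
        sample.extend(fresh)                       # fold the fresh sample in and rebuild
        boundaries = equi_height_boundaries(sample, k)
    return boundaries

column = [random.gauss(50, 15) for _ in range(100_000)]
print(adaptive_histogram(column))
```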
Abstract:
An index tuning wizard produces a fast and reasonable recommendation identifying database indexes to use given a specified workload. A query optimizer is used to determine the expected usefulness of potential indexes for the specified workload by taking the cost of queries in the workload into account. A cost-based pruning of indexes is then performed to provide an intermediate set of proposed indexes. The indexes having the most benefit, subject to storage constraints, are then selected. The optimizer is then used again, and further pruning is done on a benefit basis. An index is not recommended unless it has a significant impact on the workload.
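A hedged sketch of cost-based, storage-constrained greedy selection; here workload_cost stands in for "what-if" optimizer estimates, and the 5% significance cutoff, index names, and sizes are hypothetical:

```python
def select_indexes(candidates, workload_cost, storage_budget, min_improvement=0.05):
    """Greedy, cost-based index selection: repeatedly pick the candidate that
    most reduces estimated workload cost, subject to a storage budget, and
    stop once no index yields a significant benefit."""
    chosen, used = [], 0
    current = workload_cost(frozenset())
    remaining = list(candidates)                     # (name, size) pairs
    while remaining:
        best, best_cost = None, current
        for name, size in remaining:
            if used + size > storage_budget:
                continue
            cost = workload_cost(frozenset(i for i, _ in chosen) | {name})
            if cost < best_cost:
                best, best_cost = (name, size), cost
        # Prune: require a significant improvement before recommending.
        if best is None or (current - best_cost) / current < min_improvement:
            break
        chosen.append(best)
        used += best[1]
        current = best_cost
        remaining.remove(best)
    return [name for name, _ in chosen]

# Toy stand-in for optimizer estimates: each index shaves a fixed fraction.
savings = {"ix_orders_date": 0.30, "ix_cust_name": 0.10, "ix_wide": 0.02}
sizes = {"ix_orders_date": 40, "ix_cust_name": 25, "ix_wide": 80}
cost = lambda cfg: 1000 * (1 - sum(savings[i] for i in cfg))
print(select_indexes(sizes.items(), cost, storage_budget=100))
```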
Abstract:
An index selection tool helps reduce costs in time and memory in selecting an index configuration or set of indexes for use by a database server in accessing a database in accordance with a workload of queries. The index selection tool attempts to reduce the number of indexes to be considered, the number of index configurations to be enumerated, and the number of invocations of a query optimizer in selecting an index configuration for the workload.
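One way such a reduction in optimizer invocations could look in code (an illustration only, not the tool's actual mechanism) is to memoize per-query "what-if" cost estimates so that enumerating many configurations re-uses earlier calls; optimizer_estimate, RELEVANT, and the call counter below are toy stand-ins:

```python
from functools import lru_cache

CALLS = 0  # counts simulated optimizer invocations

RELEVANT = {1: frozenset({"ix_a"}), 2: frozenset({"ix_a", "ix_b"})}

def optimizer_estimate(query_id, config):
    """Toy stand-in for a what-if optimizer call (purely illustrative)."""
    global CALLS
    CALLS += 1
    return 100 - 10 * len(config & RELEVANT[query_id])

@lru_cache(maxsize=None)
def cached_cost(query_id, config):
    # config must be hashable, e.g. a frozenset of index names.
    return optimizer_estimate(query_id, config)

def workload_cost(config):
    return sum(cached_cost(q, config) for q in RELEVANT)

# Enumerating several configurations re-uses cached per-query estimates,
# so the optimizer is invoked fewer times than configurations * queries.
for cfg in (frozenset(), frozenset({"ix_a"}),
            frozenset({"ix_a", "ix_b"}), frozenset({"ix_a"})):
    workload_cost(cfg)
print(CALLS)  # 6, not 8: the repeated configuration hit the cache
```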
Abstract:
A fuzzy joins system that is integrated in a database system generates fuzzy joins between records from two datasets. The fuzzy joins system includes a tokenizer to generate tokens for data records and a transformer to find transforms for the tokens. The fuzzy joins system invokes a signature generator, running within a runtime layer of the database system, to generate signatures for data records based on the tokens and their transforms. Subsequently, an equi-join operation joins the records from the two datasets that have at least one equal signature. A similarity calculator, also running within the runtime layer of the database system, computes a similarity measure using the token information of the joined records. If the similarity measure for any two records is above a threshold, the fuzzy joins system generates a fuzzy join between those two records.
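A simplified sketch of the signature-based join pipeline, assuming whitespace tokenization, lexicographic-prefix signatures as a stand-in for the signature generator, and Jaccard similarity as the similarity measure; token transforms are omitted and the sample records are hypothetical:

```python
def tokenize(record):
    """Lower-cased word tokens; a real tokenizer would also attach
    transforms (synonyms, abbreviations) to each token."""
    return frozenset(record.lower().split())

def signatures(tokens, k=2):
    """Simplified stand-in for the signature generator: the k
    lexicographically smallest tokens serve as equi-join keys."""
    return set(sorted(tokens)[:k])

def jaccard(a, b):
    return len(a & b) / len(a | b)

def fuzzy_join(left, right, threshold=0.5):
    # Index the right-hand records by signature so the join on equal
    # signatures is a cheap lookup rather than a full cross product.
    index = {}
    for r in right:
        for sig in signatures(tokenize(r)):
            index.setdefault(sig, []).append(r)
    seen, result = set(), []
    for l in left:
        ltok = tokenize(l)
        for sig in signatures(ltok):
            for r in index.get(sig, []):            # equi-join on equal signature
                if (l, r) in seen:
                    continue                        # already compared this pair
                seen.add((l, r))
                if jaccard(ltok, tokenize(r)) >= threshold:
                    result.append((l, r))           # similar enough: fuzzy join
    return result

print(fuzzy_join(["Microsoft Corp", "Oracle Inc"],
                 ["microsoft corporation corp", "oracle inc usa"]))
```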
Abstract:
The subject disclosure is directed towards providing data for augmenting an entity-attribute-related task. Pre-processing is performed on entity-attribute tables extracted from the web, e.g., to provide indexes that are accessible to find data that completes augmentation tasks. The indexes are based on both direct mappings and indirect mappings between tables. Example augmentation tasks include queries for augmented data based on an attribute name or examples, or finding synonyms for augmentation. An online query is efficiently processed by accessing the indexes to return augmented data related to the task.
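A minimal sketch of a direct-mapping index over extracted tables and an augmentation-by-attribute-name query; the table layout, the Contoso/Fabrikam examples, and the majority-vote value selection are assumptions, and indirect mappings are not shown:

```python
from collections import defaultdict

def build_index(web_tables):
    """Index extracted entity-attribute tables: (entity, attribute) -> values.
    This is a direct mapping only; indirect mappings between tables are
    omitted from this sketch."""
    index = defaultdict(list)
    for table in web_tables:
        key_col = table["key"]
        for row in table["rows"]:
            entity = row[key_col].lower()
            for attr, value in row.items():
                if attr != key_col:
                    index[(entity, attr.lower())].append(value)
    return index

def augment_by_attribute(index, entities, attribute):
    """Augmentation-by-attribute-name: fill in the requested attribute for
    each query entity, choosing the most frequently seen value."""
    out = {}
    for e in entities:
        values = index.get((e.lower(), attribute.lower()), [])
        out[e] = max(set(values), key=values.count) if values else None
    return out

tables = [{"key": "company",
           "rows": [{"company": "Contoso", "hq": "Seattle"},
                    {"company": "Fabrikam", "hq": "Paris"}]}]
index = build_index(tables)
print(augment_by_attribute(index, ["Contoso", "Fabrikam"], "HQ"))
```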
Abstract:
In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system then identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics tools to create pivot tables or to otherwise process the data.
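A rough sketch of example-driven dataset search and of finding datasets connected through shared domains; the in-memory catalog, overlap-based linking, and dataset names are illustrative assumptions rather than the system's storage or relationship model:

```python
def find_candidate_datasets(catalog, examples):
    """Return datasets containing a column (domain) that covers all of the
    user-provided example values."""
    hits = []
    for name, columns in catalog.items():
        for col, values in columns.items():
            if set(examples) <= set(values):
                hits.append((name, col))
    return hits

def connected_datasets(catalog, start):
    """Datasets reachable from `start` through shared domains, i.e. columns
    whose value sets overlap (a crude stand-in for the catalog's
    relationship links)."""
    def linked(a, b):
        return any(set(v1) & set(v2)
                   for v1 in catalog[a].values()
                   for v2 in catalog[b].values())
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for other in catalog:
            if other not in seen and linked(current, other):
                seen.add(other)
                frontier.append(other)
    return seen

catalog = {
    "sales":     {"customer_id": ["c1", "c2", "c3"], "amount": [10, 20, 30]},
    "customers": {"customer_id": ["c1", "c2", "c3"], "region": ["EU", "US", "EU"]},
    "regions":   {"region": ["EU", "US"], "manager": ["Ann", "Bob"]},
}
print(find_candidate_datasets(catalog, ["c1", "c2"]))
print(connected_datasets(catalog, "sales"))
```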
Abstract:
A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.
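A toy sketch of the two co-occurrence passes, assuming sentence-level co-occurrence and a small hand-picked candidate phrase list; context modeling is omitted and the names, labels, and documents below are hypothetical:

```python
import re
from collections import Counter

def sentences(doc):
    return [s.strip() for s in re.split(r"[.!?]", doc) if s.strip()]

def mine_description_phrases(docs, name_labels, candidate_phrases):
    """First pass: keep candidate phrases that co-occur (same sentence) with
    any of the domain's name labels in the first document collection."""
    kept = set()
    for doc in docs:
        for sent in sentences(doc):
            low = sent.lower()
            if any(label in low for label in name_labels):
                kept.update(p for p in candidate_phrases if p in low)
    return kept

def tag_entity(docs, entity, description_phrases):
    """Second pass: count co-occurrences of the entity with each learned
    description phrase; the most frequent phrase becomes its description tag."""
    counts = Counter()
    for doc in docs:
        for sent in sentences(doc):
            low = sent.lower()
            if entity.lower() in low:
                counts.update(p for p in description_phrases if p in low)
    return counts.most_common(1)[0][0] if counts else None

corpus_a = ["The restaurant Bella Vista is a cozy Italian place.",
            "Joe's Diner is a cheap breakfast spot."]
phrases = mine_description_phrases(corpus_a, ["restaurant", "diner"],
                                   ["cozy", "cheap", "breakfast spot"])
corpus_b = ["Bella Vista stays cozy even on busy nights."]
print(tag_entity(corpus_b, "Bella Vista", phrases))
```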
Abstract:
Techniques are described to leverage a set of sample or example matched pairs of strings to learn string transformation rules, which may be used to match data records that are semantically equivalent. In one embodiment, matched pairs of input strings are accessed. For a set of matched pairs, a set of one or more string transformation rules is learned. A transformation rule may include two strings determined to be semantically equivalent. The transformation rules are used to determine whether a first string and a second string match each other.
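A hedged sketch of rule learning from matched pairs using a simple token-difference heuristic, followed by rule-based matching; this illustrates the idea rather than the learning procedure claimed:

```python
def learn_rules(matched_pairs):
    """Derive candidate transformation rules from matched pairs: the tokens
    present on one side but not the other, paired with the unmatched tokens
    on the opposite side, are recorded as a semantically-equivalent pair."""
    rules = set()
    for a, b in matched_pairs:
        ta, tb = a.lower().split(), b.lower().split()
        left = " ".join(t for t in ta if t not in tb)
        right = " ".join(t for t in tb if t not in ta)
        if left and right:
            rules.add((left, right))
            rules.add((right, left))   # rules are symmetric in this sketch
    return rules

def strings_match(a, b, rules):
    """Two strings match if they are equal after applying some learned rule
    (or no rule at all) to the first string."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    return any(a.replace(src, dst) == b for src, dst in rules if src in a)

rules = learn_rules([("Robert Kennedy", "Bob Kennedy"),
                     ("First Street", "First St")])
print(strings_match("Robert Smith", "Bob Smith", rules))   # True
print(strings_match("Main Street", "Main St", rules))      # True
print(strings_match("Robert Smith", "Ann Smith", rules))   # False
```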