Patent search ap:"Surajit Chaudhuri" Page 10

91.

Patent application
Schema for physical database tuning (status: under examination, published)

Publication No.: US20060085378A1

Publication date: 2006-04-20

Application No.: US10966282

Filing date: 2004-10-15

Applicant: Alexander Raizman, Arunprasad Marathe, Djana Ophelia Milton, Dmitry Sonkin, Lubor Kollar, Maciej Sarnowicz, Manoj Syamala, Raja Duddupudi, Sanjay Agrawal, Surajit Chaudhuri, Vivek Narasayya

Inventor: Alexander Raizman, Arunprasad Marathe, Djana Ophelia Milton, Dmitry Sonkin, Lubor Kollar, Maciej Sarnowicz, Manoj Syamala, Raja Duddupudi, Sanjay Agrawal, Surajit Chaudhuri, Vivek Narasayya

IPC: G06F17/30

CPC classification number: G06F16/22

Abstract: Internal communications within components of an automated physical database design tool may be conducted in a data description language such as XML. Inputs to and outputs from the automated physical database design tool may also be presented in the data description language (e.g., XML). The communications, inputs and outputs may comply with a schema for the data description language. The schema may be written in a schema language such as XSD. Inputs presented in the data description language may comprise tuning options. Outputs may comprise a proposed physical design for a database and reports.

92.

Patent application
Dynamic physical database design (status: in force)

Publication No.: US20060036989A1

Publication date: 2006-02-16

Application No.: US10914901

Filing date: 2004-08-10

Applicant: Surajit Chaudhuri, Arnd Konig, Vivek Narasayya

Inventor: Surajit Chaudhuri, Arnd Konig, Vivek Narasayya

IPC: G06F9/44

CPC classification number: G06F17/30312, Y10S707/99933, Y10S707/99945, Y10S707/99948

Abstract: A monitoring component of a database server collects a subset of a query workload along with related statistics. A remote index tuning component uses the workload subset and related statistics to determine a physical design that minimizes the cost of executing queries in the workload subset while ensuring that queries omitted from the subset do not degrade in performance.

93.

Patent application
Detecting duplicate records in databases (status: in force)

Publication No.: US20050262044A1

Publication date: 2005-11-24

Application No.: US11182590

Filing date: 2005-07-14

Applicant: Surajit Chaudhuri, Venkatesh Ganti, Rohit Ananthakrishna

Inventor: Surajit Chaudhuri, Venkatesh Ganti, Rohit Ananthakrishna

IPC: G06F17/30, G06F7/00

CPC classification number: G06F17/30303, Y10S707/99931, Y10S707/99942

Abstract: The invention concerns the detection of duplicate tuples in a database. Previous domain-independent detection of duplicate tuples relied on standard similarity functions (e.g., edit distance, cosine metric) between multi-attribute tuples. However, such prior art approaches result in large numbers of false positives if they are used to identify domain-specific abbreviations and conventions. In accordance with the invention, a process for duplicate detection is implemented based on interpreting records from multiple dimensional tables in a data warehouse, which are associated with hierarchies specified through key-foreign key relationships in a snowflake schema. The invention exploits the extra knowledge available from the table hierarchy to develop a high-quality, scalable duplicate detection process.

94.

Patent application
Primitives for workload summarization (status: in force)

Publication No.: US20050223026A1

Publication date: 2005-10-06

Application No.: US10815061

Filing date: 2004-03-31

Applicant: Surajit Chaudhuri, Vivek Narasayya, Prasanna Ganesan

Inventor: Surajit Chaudhuri, Vivek Narasayya, Prasanna Ganesan

IPC: G06F7/00, G06F17/30

CPC classification number: G06F17/30489, G06F17/30306, Y10S707/99932, Y10S707/99934, Y10S707/99935

Abstract: A database object summarization tool is provided that selects a subset of database objects subject to filtering constraints such as a partial order or optimization of some attribute. A dominance primitive filters out tuples that are dominated, according to a partial order constraint, by another tuple. A representation primitive selects a representative subset of tuples such that an optimization criterion is met.
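
The dominance primitive described above can be illustrated with a minimal Python sketch. The partial order used here (componentwise comparison where a smaller value is better on every attribute) and the names `dominates` and `dominance_filter` are assumptions made for illustration; the abstract does not specify the order or any interface.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Assumed partial order: a dominates b if a is no worse on every
    attribute and strictly better on at least one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def dominance_filter(tuples: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the tuples that no other tuple dominates."""
    return [t for t in tuples if not any(dominates(o, t) for o in tuples)]


# Hypothetical (cost, size) pairs; (3, 3) is dominated by (2, 2).
rows = [(1, 5), (2, 2), (3, 3), (5, 1)]
survivors = dominance_filter(rows)
```

This quadratic scan is only meant to show the filtering semantics; it makes no claim about how the patented tool evaluates the primitive.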

95.

Granted patent
Method and apparatus for exploiting statistics on query expressions for optimization (status: in force)

Publication No.: US06947927B2

Publication date: 2005-09-20

Application No.: US10191822

Filing date: 2002-07-09

Applicant: Surajit Chaudhuri, Nicolas Bruno

Inventor: Surajit Chaudhuri, Nicolas Bruno

IPC: G06F17/30

CPC classification number: G06F17/30463, G06F17/30536, Y10S707/99932, Y10S707/99933, Y10S707/99942, Y10S707/99943, Y10S707/99944, Y10S707/99945

Abstract: A method for evaluating a user query on a relational database having records stored therein, a workload made up of a set of queries that have been executed on the database, and a query optimizer that generates a query execution plan for the user query. Each query plan includes a plurality of intermediate query plan components that verify a subset of records from the database meeting query criteria. The method accesses the query plan and a set of stored intermediate statistics for records verified by query components, such as histograms that summarize the cardinality of the records that verify the query component. The method forms a transformed query plan based on the selected intermediate statistics (possibly by rewriting the query plan) and estimates the cardinality of the transformed query plan to arrive at a more accurate cardinality estimate for the query. If additional intermediate statistics are necessary, a pool of intermediate statistics may be generated based on the queries in the workload by evaluating the benefit of a given statistic over the workload and adding intermediate statistics to the pool that provide relatively great benefit.

96.

Patent application
Database monitoring system (status: in force)

Publication No.: US20050192921A1

Publication date: 2005-09-01

Application No.: US10788077

Filing date: 2004-02-26

Applicant: Surajit Chaudhuri, Arnd Konig, Vivek Narasayya

Inventor: Surajit Chaudhuri, Arnd Konig, Vivek Narasayya

IPC: G06F7/00, G06F17/30

CPC classification number: G06F17/30368, Y10S707/955, Y10S707/962, Y10S707/99932

Abstract: A framework is provided within a database system for specifying database monitoring rules that will be evaluated as part of the execution code path of database events being monitored. The occurrence of a selected database event triggers a rule that evaluates some parameter of an object related to the event against a condition in the rule. If the condition is met, a specified action is taken that can alter the execution of the database event or database system performance. Lightweight aggregation tables are utilized to enable aggregation of object parameter values so that presently occurring events can be compared to a summary of the object parameter values from previously occurring database events. Signatures are assigned to queries based on the structure of the query plan so that information in the lightweight aggregation tables can be grouped according to query signature.
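
As a rough illustration of the event, condition, and action flow in this abstract, the following Python sketch evaluates a rule inline when an event occurs and compares the current value against a per-signature summary. Everything named here (`Rule`, `Monitor`, the `query_end` event, the `signature` and `duration` fields, and the "three times the running average" condition) is hypothetical and chosen only for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    event: str                                # event type that triggers the rule
    condition: Callable[[dict, dict], bool]   # (current object, summary) -> bool
    action: Callable[[dict], None]            # runs when the condition holds


class Monitor:
    def __init__(self) -> None:
        self.rules: List[Rule] = []
        # Stand-in for a lightweight aggregation table, keyed by query signature.
        self.agg: Dict[str, dict] = defaultdict(lambda: {"count": 0, "total": 0.0})

    def on_event(self, event: str, obj: dict) -> None:
        summary = self.agg[obj["signature"]]
        # Rules are evaluated against the summary of earlier events.
        for rule in self.rules:
            if rule.event == event and rule.condition(obj, summary):
                rule.action(obj)
        # Fold the current event into the summary afterwards.
        summary["count"] += 1
        summary["total"] += obj["duration"]


alerts: List[str] = []
monitor = Monitor()
monitor.rules.append(Rule(
    event="query_end",
    condition=lambda q, s: s["count"] > 0 and q["duration"] > 3 * s["total"] / s["count"],
    action=lambda q: alerts.append(q["signature"]),
))

for duration in (1.0, 1.2, 0.9, 10.0):
    monitor.on_event("query_end", {"signature": "scan(T)->agg", "duration": duration})
```

Only the last event exceeds three times the running average for its signature, so only it triggers the action.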

97.

Granted patent
Compressing database workloads (status: in force)

Publication No.: US06912547B2

Publication date: 2005-06-28

Application No.: US10180667

Filing date: 2002-06-26

Applicant: Surajit Chaudhuri, Ashish Kumar Gupta, Vivek Narasayya, Sanjay Agrawal

Inventor: Surajit Chaudhuri, Ashish Kumar Gupta, Vivek Narasayya, Sanjay Agrawal

IPC: G06F17/30

CPC classification number: G06F17/30536, G06F17/30306, G06F17/30312, Y10S706/917, Y10S707/99932, Y10S707/99942, Y10S707/99945

Abstract: Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.

98.

Granted patent
Generalized keyword matching for keyword based searching over relational databases (status: in force)

Publication No.: US06792414B2

Publication date: 2004-09-14

Application No.: US10036348

Filing date: 2001-10-19

Applicant: Surajit Chaudhuri, Sanjay Agrawal

Inventor: Surajit Chaudhuri, Sanjay Agrawal

IPC: G06F17/30

CPC classification number: G06F17/3033, G06F17/30436, G06F17/30471, G06F17/3053, Y10S707/99932, Y10S707/99933

Abstract: Searching by keywords and providing generalized matching capabilities on a relational database is enabled by performing preprocessing operations to construct inverted list lookup tables based on data record components at an interim level of granularity, such as column location. Prefix information is stored in the inverted list for each keyword, keyword sub-string, or stemmed version of the keyword. A keyword search is performed on the lookup tables rather than the database tables to determine database column locations of the keyword. The lookup tables are scanned to identify each prefix associated with the search term. Schema information about the database is used to link the column locations to form database subgraphs that span the keywords. Join tables are generated based on the subgraphs consisting of columns containing the keywords. A query on the database is generated to join the tables and retrieve database rows that contain the keyword and the prefixes associated with the keyword. The retrieved rows are ranked in order of relevance before being output. By preprocessing a relational database to form lookup tables, and initially searching the lookup tables to obtain a targeted subset of the database upon which SQL queries can be performed to collect data records, keyword searching on a relational database is made efficient.
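
The core idea of a column-level inverted lookup table can be sketched in Python as follows. This is a simplified reading of the abstract: the toy tables, the `build_lookup` and `search` names, and whitespace tokenization are assumptions, and the prefix information, subgraph joins, and ranking steps are left out.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical toy database: table name -> rows (column name -> text).
db: Dict[str, List[Dict[str, str]]] = {
    "authors": [{"name": "Jim Gray"}, {"name": "Jim Smith"}],
    "books": [{"title": "Transaction Processing"}, {"title": "Gray code basics"}],
}


def build_lookup(database) -> Dict[str, Set[Tuple[str, str]]]:
    """Preprocessing: map each keyword to the (table, column) locations
    that contain it, rather than to individual rows."""
    lookup: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for table, rows in database.items():
        for row in rows:
            for column, text in row.items():
                for word in text.lower().split():
                    lookup[word].add((table, column))
    return lookup


def search(database, lookup, keyword: str) -> List[Tuple[str, str]]:
    """Consult the lookup table first, then scan only the matching columns."""
    kw = keyword.lower()
    hits = []
    for table, column in sorted(lookup.get(kw, ())):
        for row in database[table]:
            if kw in row[column].lower().split():
                hits.append((table, row[column]))
    return hits


lookup = build_lookup(db)
results = search(db, lookup, "gray")
```

Indexing at column granularity keeps the lookup table much smaller than a row-level index, at the price of a second, targeted pass over the matching columns.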

99.

Granted patent
Self-tuning histogram and database modeling (status: in force)

Publication No.: US06460045B1

Publication date: 2002-10-01

Application No.: US09268589

Filing date: 1999-03-15

Applicant: Ashraf Aboulnaga, Surajit Chaudhuri

Inventor: Ashraf Aboulnaga, Surajit Chaudhuri

IPC: G06F17/00

CPC classification number: G06F17/30469, G06F17/30536, Y10S707/99931, Y10S707/99932, Y10S707/99933, Y10S707/99934, Y10S707/99943

Abstract: Building histograms by using feedback information about the execution of a query workload, rather than by examining the data, helps reduce the cost of building and maintaining histograms. A method of maintaining self-tuning histograms updates histograms based on feedback about the execution of a user query. A histogram may be initialized using an assumption of uniform distribution of data or by combining existing histograms. A histogram tuner accesses an estimated result, generated by using the histogram, in response to a user query. The histogram tuner calculates an estimation error based on the result of the user query and the estimated result. The frequencies of histogram buckets are refined based on the estimation error. The bucket bounds of the histogram are restructured based on the refined frequencies. The method may be performed on-line after a user query or off-line by accessing a workload log. By updating a histogram without accessing the database, the cost of building and maintaining histograms is significantly reduced.
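
The frequency-refinement step can be sketched in Python under explicit assumptions: buckets are half-open ranges with uniformly spread values, the estimation error is shared among overlapping buckets in proportion to their contribution to the estimate, and a `damping` factor limits each correction. These choices and the function names are illustrative only; the abstract states just that frequencies are refined based on the estimation error. Bucket restructuring is omitted.

```python
from typing import List


def overlap_fractions(buckets: List[list], lo: float, hi: float) -> List[float]:
    """Fraction of each bucket [low, high) covered by the query range [lo, hi)."""
    return [max(0.0, min(hi, high) - max(lo, low)) / (high - low)
            for low, high, _ in buckets]


def estimate(buckets: List[list], lo: float, hi: float) -> float:
    """Estimated result size, assuming values are uniform within a bucket."""
    fracs = overlap_fractions(buckets, lo, hi)
    return sum(b[2] * f for b, f in zip(buckets, fracs))


def refine(buckets: List[list], lo: float, hi: float,
           actual: float, damping: float = 0.5) -> None:
    """Adjust bucket frequencies in place using query feedback only."""
    fracs = overlap_fractions(buckets, lo, hi)
    contrib = [b[2] * f for b, f in zip(buckets, fracs)]
    est = sum(contrib)
    error = actual - est
    # Share the error by contribution; fall back to overlap if the estimate is 0.
    weights = contrib if est > 0 else fracs
    total = sum(weights)
    if total == 0:
        return  # the query range touches no bucket
    for b, w in zip(buckets, weights):
        b[2] = max(0.0, b[2] + damping * error * w / total)


# Hypothetical histogram: two buckets of width 10, 100 rows each.
hist = [[0, 10, 100.0], [10, 20, 100.0]]
before = estimate(hist, 5, 15)        # estimated rows for range [5, 15)
refine(hist, 5, 15, actual=150)       # feedback: the query returned 150 rows
after = estimate(hist, 5, 15)
```

The update reads only the histogram and the observed result size, which matches the abstract's point that the database itself is not accessed.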

100.

Granted patent
What-if index analysis utility for database systems (status: in force)

Publication No.: US06223171B1

Publication date: 2001-04-24

Application No.: US09139843

Filing date: 1998-08-25

Applicant: Surajit Chaudhuri, Vivek Narasayya

Inventor: Surajit Chaudhuri, Vivek Narasayya

IPC: G06F17/30

CPC classification number: G06F11/3447, G06F11/3457, G06F2201/88, Y10S707/954, Y10S707/99931, Y10S707/99932

Abstract: A what-if index analysis utility provides the ability to analyze the performance of the existing configuration of a database system with respect to one or more workloads of queries and to propose a hypothetical configuration for the database system to analyze its potential impact on the performance of the database system. The utility may be used, for example, to perform an impact analysis of the set of indexes selected by an index selection tool with respect to a workload of queries. It may also be used to explore what-if scenarios for the database system by analyzing the impact of hypothetical sets of indexes with respect to the execution of various workloads over projected sizes of a database. The utility may be used to perform summarizations of workloads, configurations, and the performance of workloads with respect to the existing configuration and hypothetical configurations. The what-if index analysis utility may be used, for example, by a database administrator or a physical database design tool to help improve performance of a database system.
