Patent search: ap:("International Business Machines Corporation") AND inv:"Lukasz Gaza" (page 1)

1.

Granted invention patent
Efficient processing of data extents [In force]

Publication No.: US10776354B2

Publication date: 2020-09-15

Application No.: US15833244

Filing date: 2017-12-06

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michal Bodziony, Andreas Brodt, Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski

IPC classification: G06F7/00, G06F16/2453, G06F16/242

Abstract: The present disclosure relates to a computer-implemented method, computer program product, and computer system for optimizing query processing on a set of data extents on which a table is stored. Attribute value information may be maintained for each data extent. The attribute value information indicates, as ranges, the minimum and maximum values of an attribute of the entries stored in the respective extent. A first metric of a first data extent of the set may indicate that splitting the first data extent into sub-extents increases query processing efficiency. A second metric of a second data extent and a third data extent may indicate that merging the second data extent and the third data extent increases query processing efficiency.
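The min/max idea in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the Extent class, the density metric, the 0.5 threshold, and the overlap test for merging are hypothetical stand-ins, since the abstract does not say how its metrics are computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Extent:
    values: List[int]  # attribute values of the rows stored in this extent

    @property
    def lo(self) -> int:
        return min(self.values)

    @property
    def hi(self) -> int:
        return max(self.values)


def may_contain(extent: Extent, q_lo: int, q_hi: int) -> bool:
    """Range check against min/max; False means the extent can be skipped."""
    return extent.lo <= q_hi and q_lo <= extent.hi


def density(extent: Extent) -> float:
    """Hypothetical split metric: rows per unit of covered value range."""
    return len(extent.values) / (extent.hi - extent.lo + 1)


def should_split(extent: Extent, threshold: float = 0.5) -> bool:
    # A sparse, wide range matches many queries without holding matching rows.
    return len(extent.values) > 1 and density(extent) < threshold


def split(extent: Extent) -> List[Extent]:
    """Split at the median of the sorted values into two sub-extents."""
    ordered = sorted(extent.values)
    mid = len(ordered) // 2
    return [Extent(ordered[:mid]), Extent(ordered[mid:])]


def should_merge(a: Extent, b: Extent) -> bool:
    """Hypothetical merge metric: the two value ranges overlap."""
    return a.lo <= b.hi and b.lo <= a.hi


wide = Extent([1, 2, 3, 98, 99, 100])
parts = split(wide)
```

In this sketch a query for values between 40 and 60 cannot skip the extent named wide, but it can skip both sub-extents produced by split, which is the kind of gain the first metric is meant to detect.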

2.

Granted invention patent
Early diagnosis of hardware, software or configuration problems in data warehouse system utilizing grouping of queries based on query parameters [In force]

Publication No.: US10423479B2

Publication date: 2019-09-24

Application No.: US15617201

Filing date: 2017-06-08

Applicant: International Business Machines Corporation

Inventors: Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Bartlomiej T. Malecki, Konrad K. Skibski, Tomasz Stradomski

IPC classification: G06F11/07, G06F16/21, G06F16/25, G06F16/2455, G06F16/2457, G06F11/34

Abstract: A method, system and computer program product for providing early diagnosis of hardware, software or configuration problems in a data warehouse system. A received query is parsed to determine the properties of the query. The query may then be joined to existing groups of queries that share properties with the query. After the query is executed according to an execution plan, results from the execution are received, which may include problem(s) that occurred during execution. When a problem reaches a pre-defined threshold for becoming a “group problem” in a group joined by the query, the problem is reported to the end user for each group in which the threshold is exceeded. In this manner, problems in the data warehouse system that can delay or fail the processing of queries can be diagnosed early.
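The grouping and threshold logic described above might be sketched as follows. This is only an illustration under assumptions: each query property (for example a table name or a node id) is treated as one group, and the QueryGroups class, the property strings, and the count-based threshold are hypothetical, not taken from the patent.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


class QueryGroups:
    """Counts problems per (group, problem) pair and reports group problems."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        # (group property, problem) -> number of queries in the group that hit it
        self.counts: Dict[Tuple[str, str], int] = defaultdict(int)

    def record(self, properties: Iterable[str],
               problems: Iterable[str]) -> List[Tuple[str, str]]:
        """Record one executed query.

        Returns the (group, problem) pairs that reached the threshold with
        this query, so each group problem is reported once.
        """
        reported = []
        for prop in set(properties):
            for problem in set(problems):
                key = (prop, problem)
                self.counts[key] += 1
                if self.counts[key] == self.threshold:
                    reported.append(key)
        return sorted(reported)


groups = QueryGroups(threshold=2)
first = groups.record({"table:sales", "node:7"}, {"timeout"})
second = groups.record({"table:orders", "node:7"}, {"timeout"})
```

Here the two queries touch different tables but share the hypothetical property node:7, so the second timeout turns into a group problem for that node only.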

3.

Granted invention patent
Joining two data tables on a join attribute [In force]

Publication No.: US10380112B2

Publication date: 2019-08-13

Application No.: US15663896

Filing date: 2017-07-31

Applicant: International Business Machines Corporation

Inventors: Michal Bodziony, Konrad K. Skibski, Tomasz Kazalski, Artur M. Gruszecki, Lukasz Gaza

IPC classification: G06F16/00, G06F16/2453

Abstract: The present disclosure relates to a computer-implemented method for joining two data tables on a join attribute. The data tables have at least a first and a second attribute. The second attribute is the join attribute. The method includes providing a function for associating a computing node to a given record. The function may be used to determine the associated computing node. The records of the two data tables may be distributed to the respective determined computing nodes. The relationship between the values of the first and second attributes may be modelled using a predefined dataset. For each record of the two data tables, the values of the first attribute may be re-determined using the corresponding values of the second attribute. The function may be used to re-determine the associated computing node.
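One possible reading of this abstract is sketched below in Python. It assumes that the distribution function looks only at the first attribute, and it uses a made-up model (integer division by 10) for the relationship between the two attributes. NUM_NODES, node_for, model, and redistribute are hypothetical names; the abstract does not specify any of them.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int]  # (first attribute, second attribute = join attribute)

NUM_NODES = 4  # hypothetical cluster size


def node_for(first_attr: int) -> int:
    """Assumed distribution function: first-attribute value -> node id."""
    return first_attr % NUM_NODES


def model(join_attr: int) -> int:
    """Hypothetical model that re-determines the first attribute from the join attribute."""
    return join_attr // 10


def redistribute(records: List[Record]) -> Dict[int, List[Record]]:
    """Place each record on the node derived from its modelled first-attribute value."""
    nodes: Dict[int, List[Record]] = {n: [] for n in range(NUM_NODES)}
    for first, join_attr in records:
        nodes[node_for(model(join_attr))].append((first, join_attr))
    return nodes


left = redistribute([(1, 15), (7, 42)])
right = redistribute([(3, 15), (9, 42)])
```

Because the node now depends only on the join attribute, records from both tables with the same join value end up on the same node in this sketch, so each node could join its share locally.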

4.

Granted invention patent
Providing multidimensional attribute value information [In force]

Publication No.: US10360240B2

Publication date: 2019-07-23

Application No.: US15230509

Filing date: 2016-08-08

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michal Bodziony, Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski

IPC classification: G06F17/30, G06F16/28

Abstract: The invention relates to a method, computer program product and computer system for providing attribute value information for a data extent comprising a set of data entries. For each multidimensional reference point of a set of one or more multidimensional reference points, the method comprises: calculating, for each multidimensional data entry, a reference-point-specific distance between the respective multidimensional data entry and the multidimensional reference point, resulting in a set of reference-point-specific distances for the data extent, the respective reference-point-specific distance being calculated using a combination of the attribute values of the multidimensional data entry and a combination of the reference attribute values of the respective multidimensional reference point; determining a minimum reference-point-specific distance and a maximum reference-point-specific distance of the set of reference-point-specific distances; and storing, for the data extent, the minimum reference-point-specific distance and the maximum reference-point-specific distance as attribute value information for further use in query processing.
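The distance statistics described above can be sketched in a few lines of Python. The sketch assumes Euclidean distance as the way of combining attribute values, which the abstract does not prescribe; distance_range and may_contain are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def distance_range(entries: List[Point], reference: Point) -> Tuple[float, float]:
    """Minimum and maximum distance of an extent's entries to one reference point."""
    distances = [math.dist(entry, reference) for entry in entries]
    return min(distances), max(distances)


def may_contain(point: Point, reference: Point,
                stats: Tuple[float, float]) -> bool:
    """False means no entry of the extent can equal the searched point."""
    d = math.dist(point, reference)
    return stats[0] <= d <= stats[1]


origin = (0.0, 0.0)
stats = distance_range([(3.0, 4.0), (6.0, 8.0)], origin)
```

A searched point whose distance to the reference point falls outside the stored range cannot be in the extent, so the extent can be skipped. A distance inside the range only means the extent still has to be scanned, as the point (0, 7) shows: it passes the check without being stored.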

5.

Granted invention patent
Selectivity estimation for query execution planning in a database [In force]

Publication No.: US10162860B2

Publication date: 2018-12-25

Application No.: US14517964

Filing date: 2014-10-20

Applicant: International Business Machines Corporation

Inventors: Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski, Tomasz Stradomski

IPC classification: G06F17/30

Abstract: A computer-implemented method of estimating selectivity of a query may include generating, for data stored in a database in a memory, a one-dimensional value distribution for each of a plurality of attributes of the data. A multidimensional histogram may be generated, wherein the multidimensional histogram includes the one-dimensional value distributions for the plurality of attributes of the data. The multidimensional histogram may be converted to a one-dimensional histogram by assigning each bucket of the multidimensional histogram to corresponding buckets of the one-dimensional histogram and ordering the corresponding buckets according to a space-filling curve. One or more bucket ranges of the one-dimensional histogram may be determined by mapping the query conditions on the one-dimensional histogram. The selectivity of the query may be estimated by estimating how many data values in the one or more bucket ranges will meet the query conditions.
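The conversion step can be illustrated with a small Python sketch that uses a Z-order (Morton) curve as the space-filling curve. The choice of curve, the two-dimensional input, and the function names z_index, flatten, and estimate_selectivity are assumptions for illustration; the abstract names no particular curve.

```python
from typing import List, Tuple


def z_index(x: int, y: int, bits: int = 8) -> int:
    """Interleave the bits of x and y (Z-order / Morton code)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def flatten(hist2d: List[List[int]]) -> List[int]:
    """Turn hist2d[x][y] bucket counts into a 1-D histogram ordered by Z-index."""
    cells = [(z_index(x, y), count)
             for x, row in enumerate(hist2d)
             for y, count in enumerate(row)]
    return [count for _, count in sorted(cells)]


def estimate_selectivity(hist1d: List[int],
                         bucket_ranges: List[Tuple[int, int]]) -> float:
    """Fraction of rows that fall into the inclusive 1-D bucket ranges."""
    total = sum(hist1d)
    hit = sum(sum(hist1d[lo:hi + 1]) for lo, hi in bucket_ranges)
    return hit / total if total else 0.0


hist1d = flatten([[10, 20], [30, 40]])
# A condition selecting x-bucket 1 covers Z-positions 1 and 3.
selectivity = estimate_selectivity(hist1d, [(1, 1), (3, 3)])
```

The estimate is made at bucket granularity, so it counts every row of a bucket that the query touches. A finer estimate within a bucket would need an extra assumption, such as uniform distribution inside each bucket.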

6.

Granted invention patent
Approximate string matching optimization for a database [In force]

Publication No.: US10095808B2

Publication date: 2018-10-09

Application No.: US15494874

Filing date: 2017-04-24

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michal Bodziony, Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski, Tomasz Stradomski

IPC classification: G06F17/30

Abstract: Software for processing a database query that includes: (i) receiving a query of a database including a search value; (ii) determining a distance between the search value and at least one reference value; (iii) determining a maximum distance from the search value to be used in searching a plurality of datasets of the database, wherein the maximum distance from the search value defines a search range and is based, at least in part, on the determined distance between the search value and the at least one reference value; (iv) determining a subset of datasets from the plurality of datasets that includes datasets for which a data range with respect to each reference value overlaps with the search range; and (v) performing approximate string matching for the search value on the subset of datasets.
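Steps (ii) to (iv) can be sketched in Python under two assumptions that the abstract does not state: the distance is the Levenshtein edit distance, and the search range is the interval of reference distances within max_edits of the search value's own reference distance, which follows from the triangle inequality. All function and dataset names are hypothetical, and a single reference value is used.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def data_range(values: List[str], reference: str) -> Tuple[int, int]:
    """Min and max distance of a dataset's values to the reference value."""
    distances = [levenshtein(v, reference) for v in values]
    return min(distances), max(distances)


def candidate_datasets(search: str, reference: str,
                       ranges: Dict[str, Tuple[int, int]],
                       max_edits: int) -> List[str]:
    """Datasets whose data range overlaps the search range [d - k, d + k]."""
    d = levenshtein(search, reference)
    lo, hi = d - max_edits, d + max_edits
    return [name for name, (dmin, dmax) in ranges.items()
            if dmin <= hi and lo <= dmax]


reference = "aaaa"
ranges = {
    "A": data_range(["aaab", "aabb"], reference),
    "B": data_range(["zzzzzzzz", "yyyyyyyy"], reference),
}
candidates = candidate_datasets("aaba", reference, ranges, max_edits=1)
```

Only the datasets returned by candidate_datasets would then go through the expensive approximate matching of step (v).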

7.

Granted invention patent
Attribute value information for a data extent [In force]

Publication No.: US10713254B2

Publication date: 2020-07-14

Application No.: US15697614

Filing date: 2017-09-07

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michal Bodziony, Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski

IPC classification: G06F17/30, G06F16/2455, G06F16/25

Abstract: The invention relates to a method, computer program product and computer system for providing attribute value information for a data extent having a set of data entries. The method includes: determining a reference string value of a data-extent-specific reference point based on symbol frequencies at each sequence position of attribute string values in a subset of the set of data entries; calculating a distance between each of the attribute string values in the subset and the reference string value of the data-extent-specific reference point, resulting in a set of distances; determining, for each of the attribute string values, an attribute-string-value-specific minimum distance for any reference string value of the data-extent-specific reference point, resulting in a set of attribute-string-value-specific minimum distances for the set of data entries; and storing, for the data extent, the minimum distance and the maximum distance of the set of attribute-string-value-specific minimum distances as attribute value information.
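A simplified Python sketch of the reference-string idea follows. It assumes that the reference string takes the most frequent symbol at each position, that a single reference string is used per extent, and that distance means Levenshtein edit distance. These are illustrative choices, and reference_string, edit_distance, and extent_stats are hypothetical names.

```python
from collections import Counter
from typing import List, Tuple


def reference_string(values: List[str]) -> str:
    """Build a reference string from the most frequent symbol at each position."""
    length = max(len(v) for v in values)
    symbols = []
    for pos in range(length):
        counts = Counter(v[pos] for v in values if pos < len(v))
        symbols.append(counts.most_common(1)[0][0])
    return "".join(symbols)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def extent_stats(values: List[str]) -> Tuple[str, int, int]:
    """Reference string plus min and max distance of the values to it."""
    ref = reference_string(values)
    distances = [edit_distance(v, ref) for v in values]
    return ref, min(distances), max(distances)


ref, d_min, d_max = extent_stats(["cat", "car", "bat"])
```

Choosing the reference from the extent's own symbol frequencies tends to keep the stored distance range narrow, which should make the range more selective when extents are filtered.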

8.

Granted invention patent
Method for processing a database query [In force]

Publication No.: US10698912B2

Publication date: 2020-06-30

Application No.: US15941377

Filing date: 2018-03-30

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski, Tomasz Stradomski

IPC classification: G06F15/16, G06F16/2458, G06F16/245, G06F16/951

Abstract: The invention relates to a computer-implemented method for processing a query in a database, the query comprising a search value. The database comprises a plurality of datasets, the datasets comprising entries, wherein distance statistics are assigned to the datasets. The distance statistics describe the minimum and maximum distance between the values of the entries of a dataset of the plurality of datasets and a reference value. The method comprises determining the distance between the search value and the reference value, said determination resulting in a search distance, determining a subset of datasets from the plurality of datasets for which the search distance is within the limits given by the minimum and maximum distances described by the respective distance statistics, and searching for the search value in the subset of datasets.
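The filtering step can be shown with a minimal Python sketch for numeric values. It assumes the absolute difference as the distance and a single shared reference value; distance_stats, search, and the dataset names are hypothetical.

```python
from typing import Dict, List, Tuple


def distance_stats(values: List[int], reference: int) -> Tuple[int, int]:
    """Minimum and maximum distance between a dataset's values and the reference."""
    distances = [abs(v - reference) for v in values]
    return min(distances), max(distances)


def search(datasets: Dict[str, List[int]], reference: int,
           value: int) -> Tuple[List[str], List[str]]:
    """Return (datasets that had to be scanned, datasets containing the value)."""
    stats = {name: distance_stats(vals, reference)
             for name, vals in datasets.items()}
    d = abs(value - reference)  # the search distance
    scanned = [name for name, (lo, hi) in stats.items() if lo <= d <= hi]
    hits = [name for name in scanned if value in datasets[name]]
    return scanned, hits


datasets = {
    "a": [90, 95, 110],    # distances to 100: 5..10
    "b": [150, 155, 160],  # distances to 100: 50..60
    "c": [40, 55],         # distances to 100: 45..60
}
scanned, hits = search(datasets, reference=100, value=155)
```

In this example dataset a is skipped without being read. Dataset c is scanned although it holds no match, because a distance to one reference value cannot tell which side of the reference a value lies on. In practice the statistics would be stored ahead of time rather than recomputed per query as done here for brevity.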

9.

Granted invention patent
Optimization of a plurality of table processing operations in a massive parallel processing environment [In force]

Publication No.: US10210206B2

Publication date: 2019-02-19

Application No.: US14505715

Filing date: 2014-10-03

Applicant: International Business Machines Corporation

Inventors: Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski, Tomasz Stradomski

IPC classification: G06F17/30

Abstract: A computer-implemented method for partitioning data for a query operation of one table of the database system is provided. The computer-implemented method comprises estimating a value distribution of the attribute in the result table based on a first value distribution of the attribute in the first column of the first table. The computer-implemented method further comprises determining boundaries for partitioning ranges of the attribute, based on the estimated value distribution, wherein the partitioning ranges correspond to a same number of rows of the result table. The computer-implemented method further comprises partitioning the first table with processing nodes of the query operation, based on the determined boundaries of partitioning ranges.
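The boundary determination can be sketched in Python as an equal-share split of an estimated value distribution. The histogram format, the function names, and the rule of closing a partition once the running row count reaches the next equal share are assumptions for illustration. How the result-table distribution is estimated is outside this sketch.

```python
from bisect import bisect_left
from typing import List, Tuple


def partition_boundaries(histogram: List[Tuple[int, int]],
                         parts: int) -> List[int]:
    """Upper boundary values for the first parts - 1 partitions.

    histogram holds (attribute value, estimated rows in the result table),
    sorted by attribute value.
    """
    total = sum(rows for _, rows in histogram)
    boundaries: List[int] = []
    running = 0
    for value, rows in histogram:
        running += rows
        # Close the current partition once it holds its equal share of rows.
        if (len(boundaries) < parts - 1
                and running >= total * (len(boundaries) + 1) / parts):
            boundaries.append(value)
    return boundaries


def node_for_value(boundaries: List[int], value: int) -> int:
    """Index of the processing node whose range contains the value."""
    return bisect_left(boundaries, value)


estimated = [(1, 25), (2, 25), (3, 10), (4, 40)]
boundaries = partition_boundaries(estimated, parts=2)
```

With these hypothetical counts, values up to 2 go to node 0 and the rest to node 1, giving each node about 50 result rows even though the value ranges differ in width. Heavily skewed values can still make the shares uneven, because a single value is never split across nodes in this sketch.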

10.

Invention application
METHOD FOR PROCESSING A DATABASE QUERY [Under examination, published]

Publication No.: US20180225338A1

Publication date: 2018-08-09

Application No.: US15941377

Filing date: 2018-03-30

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Konrad K. Skibski, Tomasz Stradomski

IPC classification: G06F17/30

CPC classification: G06F16/2462, G06F16/245, G06F16/951

Abstract: The invention relates to a computer-implemented method for processing a query in a database, the query comprising a search value. The database comprises a plurality of datasets, the datasets comprising entries, wherein distance statistics are assigned to the datasets. The distance statistics describe the minimum and maximum distance between the values of the entries of a dataset of the plurality of datasets and a reference value. The method comprises determining the distance between the search value and the reference value, said determination resulting in a search distance, determining a subset of datasets from the plurality of datasets for which the search distance is within the limits given by the minimum and maximum distances described by the respective distance statistics, and searching for the search value in the subset of datasets.
