Patent search: ap:("Marius I. Danciu" OR "Fan Li" OR "Michael McRoberts" OR "Jing-Yun Shyr" OR "Damir Spisic" OR "Jing Xu") AND inv:"Jing Xu" (page 1)

1.

Granted invention patent
Generating a predictive model from multiple data sources (in force)

Publication number: US08996452B2

Publication date: 2015-03-31

Application number: US13545817

Filing date: 2012-07-10

Applicants: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

Inventors: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

IPC classification: G06F7/00, G06F17/00, G06Q10/06

CPC classification: G06Q10/06

Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.
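
The selection step in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the toy threshold models, the `keep` parameter, the sample data, and the use of majority voting to combine the selected models are all assumptions made for illustration, since the abstract does not say how the base models are combined.

```python
from collections import Counter

def accuracy(model, sample):
    """Fraction of (x, y) pairs in `sample` that `model` labels correctly."""
    return sum(model(x) == y for x, y in sample) / len(sample)

def build_ensemble(base_models, validation, keep):
    """Rank base models by accuracy on the global validation sample,
    keep the best `keep`, and combine them by majority vote (an assumption)."""
    ranked = sorted(base_models, key=lambda m: accuracy(m, validation), reverse=True)
    selected = ranked[:keep]

    def ensemble(x):
        votes = Counter(m(x) for m in selected)
        return votes.most_common(1)[0][0]

    return ensemble, selected

# Hypothetical base models, standing in for models built on separate data sources.
base_models = [
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.6),
    lambda x: int(x < 0.5),  # deliberately poor model
]
validation = [(0.1, 0), (0.3, 0), (0.45, 0), (0.55, 1), (0.7, 1), (0.9, 1)]
holdout = [(0.2, 0), (0.8, 1), (0.35, 0), (0.65, 1)]

ensemble, selected = build_ensemble(base_models, validation, keep=3)
# The holdout sample is used only to score the finished ensemble.
holdout_accuracy = accuracy(ensemble, holdout)
```

Keeping the holdout sample out of the selection step means the final accuracy figure is not biased by the choice of base models.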

2.

Granted invention patent
Generating a predictive model from multiple data sources (in force)

Publication number: US08990149B2

Publication date: 2015-03-24

Application number: US13048536

Filing date: 2011-03-15

Applicants: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

Inventors: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

IPC classification: G06F7/00, G06F17/00, G06Q10/06

CPC classification: G06Q10/06

Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.

3.

Invention patent application
GENERATING A PREDICTIVE MODEL FROM MULTIPLE DATA SOURCES (in force)

Publication number: US20120239613A1

Publication date: 2012-09-20

Application number: US13048536

Filing date: 2011-03-15

Applicants: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

Inventors: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

IPC classification: G06F7/00, G06F17/00, G06F17/30

CPC classification: G06Q10/06

Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.

4.

Invention patent application
GENERATING A PREDICTIVE MODEL FROM MULTIPLE DATA SOURCES (under examination, published)

Publication number: US20120278275A1

Publication date: 2012-11-01

Application number: US13545817

Filing date: 2012-07-10

Applicants: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

Inventors: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu

IPC classification: G06N5/02

CPC classification: G06Q10/06

Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.

5.

Granted invention patent
Computing and applying order statistics for data preparation (in force)

Publication number: US08868573B2

Publication date: 2014-10-21

Application number: US13444718

Filing date: 2012-04-11

Applicants: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu

Inventors: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu

IPC classification: G06F7/00

CPC classification: G06F17/30283, G06F17/3007, G06Q10/10, G06Q30/06

Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.
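
A minimal Python sketch of the general idea follows: bin each source in one pass, merge the bins, then accumulate counts to locate an order statistic and bracket it. The details are assumptions for illustration only and are not taken from the patent: every source shares the same fixed-width bin edges (so bin index order is already sorted order), each bin's summary is a count plus its minimum and maximum, and the estimate is the midpoint of the bin that contains the target rank.

```python
import math
from dataclasses import dataclass

@dataclass
class Bin:
    count: int = 0
    lo: float = math.inf    # smallest value seen in this bin
    hi: float = -math.inf   # largest value seen in this bin

def summarize(values, start, width, n_bins):
    """Single pass over one source: per-bin count, minimum and maximum."""
    bins = [Bin() for _ in range(n_bins)]
    for v in values:
        # Values outside the range are clamped into the first or last bin.
        i = min(max(int((v - start) // width), 0), n_bins - 1)
        b = bins[i]
        b.count += 1
        b.lo = min(b.lo, v)
        b.hi = max(b.hi, v)
    return bins

def merge(per_source):
    """Combine bins that share an index across all sources."""
    return [Bin(sum(b.count for b in group),
                min(b.lo for b in group),
                max(b.hi for b in group))
            for group in zip(*per_source)]

def approx_quantile(bins, q):
    """Return (estimate, lower_bound, upper_bound) for the q-quantile."""
    total = sum(b.count for b in bins)
    rank = max(1, math.ceil(q * total))  # 1-based rank of the target value
    seen = 0
    for b in bins:  # ascending value order
        if b.count and seen + b.count >= rank:
            return (b.lo + b.hi) / 2, b.lo, b.hi
        seen += b.count
    raise ValueError("no data")

# Two hypothetical distributed sources for the same field.
source_a = [1, 3, 4, 7, 8, 12]
source_b = [2, 6, 9, 11, 14, 18, 19]
merged = merge([summarize(s, start=0, width=5, n_bins=4) for s in (source_a, source_b)])
estimate, lower, upper = approx_quantile(merged, 0.5)
true_median = sorted(source_a + source_b)[6]  # 7th of 13 values
```

Because the value at the target rank must fall in the bin where the cumulative count first reaches that rank, that bin's observed minimum and maximum bracket the true order statistic.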

6.

Invention patent application
COMPUTING AND APPLYING ORDER STATISTICS FOR DATA PREPARATION (under examination, published)

Publication number: US20130218908A1

Publication date: 2013-08-22

Application number: US13399838

Filing date: 2012-02-17

Applicants: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu

Inventors: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu

IPC classification: G06F17/30

CPC classification: G06Q30/06, G06F16/11, G06F16/27, G06Q10/10

Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.

7.

Invention patent application
COMPUTING AND APPLYING ORDER STATISTICS FOR DATA PREPARATION (in force)

Publication number: US20130218909A1

Publication date: 2013-08-22

Application number: US13444718

Filing date: 2012-04-11

Applicants: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu

Inventors: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu

IPC classification: G06F17/30

CPC classification: G06F17/30283, G06F17/3007, G06Q10/10, G06Q30/06

Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.

8.

Granted invention patent
Interestingness of data (in force)

Publication number: US08880532B2

Publication date: 2014-11-04

Application number: US13172707

Filing date: 2011-06-29

Applicants: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang

Inventors: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30321

Abstract: Provided are techniques for analyzing fields. Statistical metrics for each field in a data set are received. A general interestingness index is generated for each field using one or more combination functions that aggregate standardized interestingness sub-indexes. One or more fields are identified as interesting for further analysis using the general interestingness index. One or more expert recommendations for field transformations are constructed for the identified one or more fields.
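
The pipeline in this abstract (metrics, standardized sub-indexes, a combined index, then recommendations) can be sketched in Python as below. Everything specific here is hypothetical and chosen only for illustration: the three metrics, the saturation caps, the use of `max` as the combination function, the 0.5 threshold, and the recommendation texts. The abstract names none of them.

```python
def standardize(value, cap):
    """Map a metric onto [0, 1], saturating at `cap` (a hypothetical scaling)."""
    return min(abs(value) / cap, 1.0)

def interestingness(metrics):
    """Return (general_index, sub_indexes) for one field's statistical metrics."""
    subs = {
        "skewness": standardize(metrics["skewness"], 3.0),
        "missing": standardize(metrics["missing_fraction"], 0.5),
        "outliers": standardize(metrics["outlier_fraction"], 0.1),
    }
    # Combination function: here simply the largest sub-index (an assumption).
    return max(subs.values()), subs

def recommend(subs, threshold=0.5):
    """Suggest a transformation for each sub-index at or above the threshold."""
    rules = {
        "skewness": "apply a log or power transform",
        "missing": "impute missing values",
        "outliers": "trim or winsorize outliers",
    }
    return [rules[name] for name, score in subs.items() if score >= threshold]

# Hypothetical per-field metrics, as if received from an earlier profiling step.
fields = {
    "income": {"skewness": 2.4, "missing_fraction": 0.02, "outlier_fraction": 0.01},
    "age": {"skewness": 0.3, "missing_fraction": 0.0, "outlier_fraction": 0.0},
}

interesting = [name for name, m in fields.items() if interestingness(m)[0] >= 0.5]
income_advice = recommend(interestingness(fields["income"])[1])
```

With these made-up numbers only `income` crosses the threshold, driven by its skewness sub-index, so it is the only field that receives a transformation suggestion.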

9.

Invention patent application
INTERESTINGNESS OF DATA (in force)

Publication number: US20130006998A1

Publication date: 2013-01-03

Application number: US13172707

Filing date: 2011-06-29

Applicants: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang

Inventors: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang

IPC classification: G06F17/30

CPC classification: G06F17/30321

Abstract: Provided are techniques for analyzing fields. Statistical metrics for each field in a data set are received. A general interestingness index is generated for each field using one or more combination functions that aggregate standardized interestingness sub-indexes. One or more fields are identified as interesting for further analysis using the general interestingness index. One or more expert recommendations for field transformations are constructed for the identified one or more fields.

10.

Granted invention patent
Interestingness of data (in force)

Publication number: US08843498B2

Publication date: 2014-09-23

Application number: US13614335

Filing date: 2012-09-13

Applicants: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang

Inventors: Jing-Yun Shyr, Damir Spisic, Raymond Wright, Jing Xu, Xueying Zhang

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30321

Abstract: Provided are techniques for analyzing fields. Statistical metrics for each field in a data set are received. A general interestingness index is generated for each field using one or more combination functions that aggregate standardized interestingness sub-indexes. One or more fields are identified as interesting for further analysis using the general interestingness index. One or more expert recommendations for field transformations are constructed for the identified one or more fields.
