-
Publication No.: US08996452B2
Publication Date: 2015-03-31
Application No.: US13545817
Filing Date: 2012-07-10
Applicant: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
Inventor: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
CPC Classification: G06Q10/06
Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.
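The selection step described in this abstract can be illustrated with a minimal sketch. It assumes each base model is a plain callable that maps features to a label, and it combines the selected models by majority vote; the voting rule, the `keep` parameter, and all function names are assumptions for illustration and are not taken from the patent.

```python
from collections import Counter


def accuracy(model, sample):
    """Fraction of (features, label) pairs in the sample that the model predicts correctly."""
    correct = sum(1 for features, label in sample if model(features) == label)
    return correct / len(sample)


def build_ensemble(base_models, validation_sample, keep=3):
    """Rank base models by accuracy on the global validation sample and keep the best ones.

    Returns the ensemble (a callable) and the list of selected base models.
    Majority voting is an assumed combining rule, not one stated in the abstract.
    """
    ranked = sorted(
        base_models,
        key=lambda m: accuracy(m, validation_sample),
        reverse=True,
    )
    selected = ranked[:keep]

    def ensemble(features):
        votes = Counter(m(features) for m in selected)
        return votes.most_common(1)[0][0]

    return ensemble, selected
```

Under the same assumptions, the global holdout sample could then be passed to `accuracy(ensemble, holdout_sample)` to score the finished ensemble on data that played no part in the selection.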
-
Publication No.: US08990149B2
Publication Date: 2015-03-24
Application No.: US13048536
Filing Date: 2011-03-15
Applicant: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
Inventor: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
CPC Classification: G06Q10/06
Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.
-
Publication No.: US20120239613A1
Publication Date: 2012-09-20
Application No.: US13048536
Filing Date: 2011-03-15
Applicant: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
Inventor: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
CPC Classification: G06Q10/06
Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.
-
Publication No.: US20120278275A1
Publication Date: 2012-11-01
Application No.: US13545817
Filing Date: 2012-07-10
Applicant: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
Inventor: Marius I. Danciu, Fan Li, Michael McRoberts, Jing-Yun Shyr, Damir Spisic, Jing Xu
IPC Classification: G06N5/02
CPC Classification: G06Q10/06
Abstract: Techniques are disclosed for generating an ensemble model from multiple data sources. In one embodiment, the ensemble model is generated using a global validation sample, a global holdout sample and base models generated from the multiple data sources. An accuracy value may be determined for each base model, on the basis of the global validation dataset. The ensemble model may be generated from a subset of the base models, where the subset is selected on the basis of the determined accuracy values.
-
Publication No.: US08868573B2
Publication Date: 2014-10-21
Application No.: US13444718
Filing Date: 2012-04-11
Applicant: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
Inventor: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
IPC Classification: G06F7/00
CPC Classification: G06F17/30283, G06F17/3007, G06Q10/10, G06Q30/06
Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.
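The binning approach in this abstract can be sketched roughly as follows. The sketch assumes a single numeric field, fixed-width bins shared by every source, and a per-bin summary of (count, minimum, maximum); the bin layout, the midpoint estimate, and all function names are assumptions for illustration, not details from the patent.

```python
import math


def summarize(values, width):
    """Single pass over one source: map bin index -> (count, min, max)."""
    bins = {}
    for v in values:
        key = math.floor(v / width)
        count, lo, hi = bins.get(key, (0, v, v))
        bins[key] = (count + 1, min(lo, v), max(hi, v))
    return bins


def merge(summaries):
    """Combine per-source bin summaries that use the same bin width."""
    merged = {}
    for bins in summaries:
        for key, (count, lo, hi) in bins.items():
            if key in merged:
                c, l, h = merged[key]
                merged[key] = (c + count, min(l, lo), max(h, hi))
            else:
                merged[key] = (count, lo, hi)
    return merged


def approx_quantile(merged, q):
    """Return (estimate, lower_bound, upper_bound) for the q-th quantile.

    Bins are visited in sorted order and counts accumulated until the
    target rank is reached; the true order statistic lies between the
    minimum and maximum observed in that bin.
    """
    total = sum(count for count, _, _ in merged.values())
    if total == 0:
        raise ValueError("no data")
    rank = max(1, math.ceil(q * total))
    running = 0
    for key in sorted(merged):
        count, lo, hi = merged[key]
        running += count
        if running >= rank:
            return (lo + hi) / 2, lo, hi
```

Narrower bins tighten the interval between the lower and upper bounds at the cost of larger summaries, which is the trade-off this kind of single-pass scheme exposes.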
-
Publication No.: US20130218908A1
Publication Date: 2013-08-22
Application No.: US13399838
Filing Date: 2012-02-17
Applicant: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
Inventor: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
IPC Classification: G06F17/30
Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.
-
Publication No.: US20130218909A1
Publication Date: 2013-08-22
Application No.: US13444718
Filing Date: 2012-04-11
Applicant: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
Inventor: Yea J. Chu, Sier Han, Fan Li, Jing-Yun Shyr, Damir Spisic, Graham J. Wills, Jing Xu
IPC Classification: G06F17/30
CPC Classification: G06F17/30283, G06F17/3007, G06Q10/10, G06Q30/06
Abstract: Provided are techniques for generating order statistics and error bounds. For each of multiple, distributed data sources, a finite number of data bins are created for each field in that data source. Data values in each of the multiple, distributed data sources are processed to generate basic summaries for each of the data bins in a single pass of the data values. The data bins from each of the multiple, distributed data sources are sorted. One or more approximate order statistics are computed for a data set by accumulating counts from a number of the sorted data bins. Lower and upper error bounds are provided for each of the computed one or more approximate order statistics, wherein the lower and upper error bounds are values delimiting an interval containing a true value of an order statistic.
-
Publication No.: US08831909B2
Publication Date: 2014-09-09
Application No.: US13240743
Filing Date: 2011-09-22
Applicant: Fan Li, Chunshui Zhao, Feng Zhao
Inventor: Fan Li, Chunshui Zhao, Feng Zhao
CPC Classification: G01C22/006, G01C21/165, G06F19/00
Abstract: Step detection and step length estimation techniques include detecting salient points in sensor data of one or more sensors. A step frequency is estimated based on a time interval between the detected salient points. A step length of the step may then be computed based on a nonlinear combination of the estimated step frequency and a function of the sensor data, and/or a step model. Alternatively, the step length of the step may be computed based on a combination of a nonlinear function of the estimated step frequency and a (linear or nonlinear) function of the sensor data, and/or a step model.
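A rough sketch of the pipeline in this abstract follows. It assumes the salient points are local maxima of an acceleration-magnitude signal above a threshold, and it uses a made-up step-length formula (a constant plus the square root of the frequency times the peak value) with placeholder coefficients `a` and `b`; none of these choices, names, or numbers come from the patent.

```python
import math


def detect_peaks(samples, threshold):
    """Indices of local maxima above the threshold (the assumed salient points)."""
    return [
        i
        for i in range(1, len(samples) - 1)
        if samples[i] > threshold
        and samples[i] > samples[i - 1]
        and samples[i] >= samples[i + 1]
    ]


def step_frequencies(peaks, sample_rate_hz):
    """Step frequency in Hz from the interval between consecutive salient points."""
    return [sample_rate_hz / (b - a) for a, b in zip(peaks, peaks[1:])]


def step_length(freq_hz, peak_value, a=0.3, b=0.1):
    """Hypothetical nonlinear combination of step frequency and a sensor feature.

    The coefficients a and b are placeholders; in practice they would be
    fitted per user or taken from a step model.
    """
    return a + b * math.sqrt(freq_hz) * peak_value
```

Summing `step_length` over the detected steps would give a distance estimate under these assumptions.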
-
Publication No.: US20110072021A1
Publication Date: 2011-03-24
Application No.: US12563357
Filing Date: 2009-09-21
Applicant: Yumao Lu, Lei Duan, Fan Li, Benoit Dumoulin, Xing Wei
Inventor: Yumao Lu, Lei Duan, Fan Li, Benoit Dumoulin, Xing Wei
IPC Classification: G06F17/30
CPC Classification: G06F17/30864
Abstract: In one embodiment, access a search query comprising one or more query words, at least one of the query words representing one or more query concepts; access a network document identified for a search query by a search engine, the network document comprising one or more document words, at least one of the document words representing one or more document concepts; semantic-text match the search query and the network document to determine one or more negative semantic-text matches; and construct one or more negative features based on the negative semantic-text matches.
-
Publication No.: US20100146136A1
Publication Date: 2010-06-10
Application No.: US12328119
Filing Date: 2008-12-04
Applicant: Jian-guang Lou, Yusuo Hu, Qingwei Lin, Fan Li, Jiang Li
Inventor: Jian-guang Lou, Yusuo Hu, Qingwei Lin, Fan Li, Jiang Li
IPC Classification: G06F15/16
CPC Classification: H04L67/104, H04L65/4084, H04L65/80, H04L67/1078, H04L67/325
Abstract: Techniques for streaming media packets in a peer-to-peer network are disclosed.