-
Publication No.: US10691662B1
Publication date: 2020-06-23
Application No.: US15358002
Filing date: 2016-11-21
Inventors: Michael Harris, Jeff Wang, Bobby Prochnow
IPC classification: G06F16/22, G06F16/29, G06F16/2458
Abstract: A method and apparatus for a data analysis system for analyzing data object collections that include geo-temporal data is provided. One or more temporal granularities are specified for the purpose of generating a geo-temporal data index. The time granularities correspond to temporal ranges expected to correspond to temporal ranges specified in user queries against the data. One or more temporal index bucket groups are generated based on the specified time granularities. Geo-temporal input data is indexed based on the generated temporal index bucket groups. The system allows a data analyst to specify geo-temporal queries that include both a geospatial component and a temporal component. The system transforms geo-temporal queries into one or more second queries that retrieve data items based on the temporal index bucket groups.
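The bucket-based indexing this abstract describes can be illustrated with a minimal sketch. It is not the patented implementation: the granularities (`hour`, `day`), the item fields (`id`, `ts`, `lat`, `lon`), and all function names are assumptions made for illustration. Each item is indexed once per granularity, and a query is rewritten into lookups on the bucket keys that cover its time range before the exact filters are applied.

```python
from collections import defaultdict

# Hypothetical granularities, in seconds; the abstract does not name specific ones.
GRANULARITIES = {"hour": 3600, "day": 86400}


def build_index(items):
    """Map (granularity, bucket number) -> items whose timestamp falls in that bucket."""
    index = defaultdict(list)
    for item in items:
        for name, size in GRANULARITIES.items():
            index[(name, item["ts"] // size)].append(item)
    return index


def rewrite_query(start, end):
    """Turn the half-open range [start, end), with end > start, into bucket keys.

    Uses the coarsest granularity no longer than the range, else the finest one.
    """
    span = end - start
    fitting = [(size, name) for name, size in GRANULARITIES.items() if size <= span]
    if fitting:
        size, name = max(fitting)
    else:
        size, name = min((s, n) for n, s in GRANULARITIES.items())
    return [(name, b) for b in range(start // size, (end - 1) // size + 1)]


def geo_temporal_query(index, bbox, start, end):
    """Fetch candidates by bucket, then apply the exact time and bounding-box filters."""
    min_lat, min_lon, max_lat, max_lon = bbox
    hits = []
    for key in rewrite_query(start, end):
        for item in index.get(key, []):
            if (start <= item["ts"] < end
                    and min_lat <= item["lat"] <= max_lat
                    and min_lon <= item["lon"] <= max_lon):
                hits.append(item["id"])
    return sorted(hits)
```

Choosing granularities that match the ranges analysts actually query keeps the number of bucket lookups per query small, which is the motivation the abstract gives for specifying them up front.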
-
Publication No.: US20190138508A1
Publication date: 2019-05-09
Application No.: US16240507
Filing date: 2019-01-04
Inventors: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
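As a rough illustration of what "immutable and versioned" allows, the sketch below keeps every committed snapshot and answers point-in-time lookups with a binary search. The class `VersionedDataset` and its methods are hypothetical names chosen for this example; the abstract does not describe the actual storage design.

```python
import bisect


class VersionedDataset:
    """Append-only list of immutable snapshots, each tagged with a commit time."""

    def __init__(self):
        self._times = []      # commit times, strictly increasing
        self._versions = []   # one immutable snapshot per commit

    def commit(self, time, rows):
        """Store a new version and return its version number."""
        if self._times and time <= self._times[-1]:
            raise ValueError("commit times must increase")
        self._times.append(time)
        self._versions.append(tuple(rows))  # a tuple cannot be modified afterwards
        return len(self._versions) - 1

    def current(self):
        """Return the most recent snapshot."""
        return self._versions[-1]

    def as_of(self, time):
        """Return the snapshot that was current at `time`."""
        i = bisect.bisect_right(self._times, time)
        if i == 0:
            raise LookupError("no version existed at that time")
        return self._versions[i - 1]
```

Because old snapshots are never overwritten, `as_of` can still return rows that a later commit dropped from the current version.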
-
Publication No.: US20190052648A1
Publication date: 2019-02-14
Application No.: US14928512
Filing date: 2015-10-30
Inventors: Geoff Stowe, Harkirat Singh, Stefan Bach, Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
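The seed, cluster, score, and metascore steps in this abstract can be sketched as follows. Everything here is an assumption for illustration: the clustering strategy is a plain breadth-first walk over a `related` link table, the two cluster scores (size and total amount) and their weights are invented, and the function names are hypothetical.

```python
from collections import deque


def build_cluster(seed, related):
    """Grow a cluster from `seed` by following links in `related` (dict of lists)."""
    cluster, queue = {seed}, deque([seed])
    while queue:
        for neighbor in related.get(queue.popleft(), []):
            if neighbor not in cluster:
                cluster.add(neighbor)
                queue.append(neighbor)
    return cluster


def metascore(cluster, amounts, weights=(1.0, 0.01)):
    """Combine two illustrative cluster scores into one weighted metascore."""
    size_score = len(cluster)
    amount_score = sum(amounts.get(entity, 0) for entity in cluster)
    return weights[0] * size_score + weights[1] * amount_score


def rank_clusters(seeds, related, amounts):
    """Build one cluster per seed and order them by descending metascore."""
    clusters = [build_cluster(seed, related) for seed in seeds]
    return sorted(clusters, key=lambda c: metascore(c, amounts), reverse=True)
```

Ranking by a single metascore lets an analyst start with the clusters that the combined scores mark as most significant.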
-
Publication No.: US20180196838A1
Publication date: 2018-07-12
Application No.: US15914215
Filing date: 2018-03-07
Inventors: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
IPC classification: G06F17/30
CPC classification: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US09965937B2
Publication date: 2018-05-08
Application No.: US14473920
Filing date: 2014-08-29
Inventors: David Cohen, Jason Ma, Bing Jie Fu, Ilya Nepomnyashchiy, Steven Berler, Alex Smaliy, Jack Grossman, James Thompson, Julia Boortz, Matthew Sprague, Parvathy Menon, Michael Kross, Michael Harris, Adam Borochoff
CPC classification: G08B21/18, G06F3/04842, H04L63/0281, H04L63/1433, H04L63/145
Abstract: Embodiments of the present disclosure relate to a data analysis system that may automatically generate memory-efficient clustered data structures, automatically analyze those clustered data structures, and provide results of the automated analysis in an optimized way to an analyst. The automated analysis of the clustered data structures (also referred to herein as data clusters) may include an automated application of various criteria or rules so as to generate a compact, human-readable analysis of the data clusters. The human-readable analyses (also referred to herein as “summaries” or “conclusions”) of the data clusters may be organized into an interactive user interface so as to enable an analyst to quickly navigate among information associated with various data clusters and efficiently evaluate those data clusters in the context of, for example, a fraud investigation. Embodiments of the present disclosure also relate to automated scoring of the clustered data structures.
-
Publication No.: US09483506B2
Publication date: 2016-11-01
Application No.: US14879916
Filing date: 2015-10-09
Inventors: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
CPC classification: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US09229952B1
Publication date: 2016-01-05
Application No.: US14533433
Filing date: 2014-11-05
Inventors: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
IPC classification: G06F17/30
CPC classification: G06F17/30309, G06F11/1451, G06F17/30227, G06F17/3023, G06F17/30292, G06F17/30371, G06F17/3038, G06F17/30563
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US08712906B1
Publication date: 2014-04-29
Application No.: US13968213
Filing date: 2013-08-15
IPC classification: G06Q40/00
Abstract: Techniques are disclosed for prioritizing a plurality of clusters. Prioritizing clusters may generally include identifying a scoring strategy for prioritizing the plurality of clusters. Each cluster is generated from a seed and stores a collection of data retrieved using the seed. For each cluster, elements of the collection of data stored by the cluster are evaluated according to the scoring strategy and a score is assigned to the cluster based on the evaluation. The clusters may be ranked according to the respective scores assigned to the plurality of clusters. The collection of data stored by each cluster may include financial data evaluated by the scoring strategy for a risk of fraud. The score assigned to each cluster may correspond to an amount at risk.
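One way to picture a pluggable scoring strategy whose score is an amount at risk is the short sketch below. The cluster layout (`seed`, `data`), the `flagged` field, and both function names are assumptions for illustration; the abstract does not say how risk is determined.

```python
def amount_at_risk(cluster):
    """Hypothetical scoring strategy: total value of transactions flagged as suspicious."""
    return sum(t["amount"] for t in cluster["data"] if t["flagged"])


def prioritize(clusters, strategy):
    """Score every cluster with `strategy` and list (score, seed) pairs, highest first."""
    scored = [(strategy(cluster), cluster["seed"]) for cluster in clusters]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Passing the strategy as a function keeps the ranking step unchanged when a different risk measure is substituted.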
-
Publication No.: US10853338B2
Publication date: 2020-12-01
Application No.: US16240507
Filing date: 2019-01-04
Inventors: Jacob Meacham, Michael Harris, Gustav Brodman, Lynn Cuthriell, Hannah Korus, Brian Toth, Jonathan Hsiao, Mark Elliot, Brian Schimpf, Michael Garland, Evelyn Nguyen
Abstract: A history preserving data pipeline computer system and method. In one aspect, the history preserving data pipeline system provides immutable and versioned datasets. Because datasets are immutable and versioned, the system makes it possible to determine the data in a dataset at a point in time in the past, even if that data is no longer in the current version of the dataset.
-
Publication No.: US10817513B2
Publication date: 2020-10-27
Application No.: US15634422
Filing date: 2017-06-27
Inventors: Michael Harris, John Carrino, Eric Wong
IPC classification: G06F16/2453, G06F16/951, G06F16/2455
Abstract: A fair scheduling system with methodology for scheduling queries for execution by a database management system in a fair manner. The system obtains query jobs for execution by the database management system and cost estimates to execute the query jobs. Based on the cost estimates, the system causes the database management system to execute the query jobs as separate sub-query tasks in a round-robin fashion. By doing so, the execution latency of low cost query jobs that return few results is reduced when the query jobs are concurrently executed with high cost query jobs that return many results.
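The round-robin idea in this abstract can be shown with a small simulation. This is a sketch under assumptions, not the patented scheduler: cost estimates are treated as abstract work units, the sub-query task size `chunk` is an invented parameter, and `schedule` is a hypothetical name.

```python
from collections import deque


def schedule(jobs, chunk=2):
    """Simulate round-robin execution of query jobs split into sub-query tasks.

    `jobs` maps a job name to its estimated cost. Each turn runs at most `chunk`
    units of one job, then moves that job to the back of the queue if work remains.
    Returns the list of (job name, units run) in execution order.
    """
    queue = deque(jobs.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        work = min(chunk, remaining)
        order.append((name, work))
        if remaining > work:
            queue.append((name, remaining - work))
    return order
```

For example, `schedule({"big": 6, "small": 1})` runs the `small` job to completion in the second slot instead of after all six units of `big`, which is the latency benefit the abstract describes for low cost queries.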