-
Publication number: US09208159B2
Publication date: 2015-12-08
Application number: US14451221
Filing date: 2014-08-04
Applicant: PALANTIR TECHNOLOGIES, INC.
Inventor: Geoffrey Stowe, Chris Fischer, Paul George, Eli Bingham, Rosco Hill
CPC classification number: G06F17/30153, G06F11/2025, G06F17/00, G06F17/30067, G06F17/30091, G06F17/30106, G06F17/30129, G06F17/30371, G06F17/30528, G06F17/30554, G06F17/30569, G06F17/30705, G06F17/30867, G06F17/30955
Abstract: A data analysis system is proposed for providing fine-grained low latency access to high volume input data from possibly multiple heterogeneous input data sources. The input data is parsed, optionally transformed, indexed, and stored in a horizontally-scalable key-value data repository where it may be accessed using low latency searches. The input data may be compressed into blocks before being stored to minimize storage requirements. The results of searches present input data in its original form. The input data may include access logs, call data records (CDRs), e-mail messages, etc. The system allows a data analyst to efficiently identify information of interest in a very large dynamic data set up to multiple petabytes in size. Once information of interest has been identified, that subset of the large data set can be imported into a dedicated or specialized data analysis system for an additional in-depth investigation and contextual analysis.
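The flow described in the abstract above (index the input, compress it into blocks, store the blocks in a key-value repository, and return hits in their original form) can be illustrated with a minimal sketch. This is not the patented implementation: a plain dictionary stands in for the horizontally scalable key-value repository, `zlib` is an assumed compression scheme, whitespace tokenization is an assumed indexing rule, and the names `BlockStore`, `ingest`, `flush`, and `search` are hypothetical.

```python
import zlib
from collections import defaultdict


class BlockStore:
    """Toy in-memory stand-in for a key-value repository (an assumption)."""

    def __init__(self, block_size=2):
        self.block_size = block_size      # records per compressed block
        self.blocks = {}                  # block_id -> compressed bytes
        self.index = defaultdict(set)     # token -> {(block_id, offset)}
        self._pending = []                # records not yet written to a block

    def ingest(self, line):
        """Buffer one single-line record; flush when the block is full."""
        self._pending.append(line)
        if len(self._pending) >= self.block_size:
            self.flush()

    def flush(self):
        """Index the buffered records, then store them as one compressed block."""
        if not self._pending:
            return
        block_id = len(self.blocks)
        for offset, line in enumerate(self._pending):
            for token in line.lower().split():
                self.index[token].add((block_id, offset))
        payload = "\n".join(self._pending).encode("utf-8")
        self.blocks[block_id] = zlib.compress(payload)
        self._pending = []

    def search(self, token):
        """Return matching records exactly as they were ingested."""
        hits = []
        for block_id, offset in sorted(self.index.get(token.lower(), ())):
            raw = zlib.decompress(self.blocks[block_id]).decode("utf-8")
            hits.append(raw.split("\n")[offset])
        return hits
```

In this sketch a search touches only the blocks named by the index, which is one plausible way to keep lookups cheap while the bulk of the data stays compressed.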
-
Publication number: US12238136B2
Publication date: 2025-02-25
Application number: US18504392
Filing date: 2023-11-08
Applicant: Palantir Technologies Inc.
Inventor: Harkirat Singh, Geoffrey Stowe, Stefan Bach, Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
IPC: G06Q40/00, G06F16/23, G06F16/242, G06F16/2457, G06F16/2458, G06F16/26, G06F16/28, G06F16/335, G06F16/35, G06F16/355, G06F16/9535, G06Q10/10, G06Q20/38, G06Q20/40, G06Q30/018, G06Q40/02, G06Q40/03, G06Q40/10, G06Q40/12, H04L9/40
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
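The seed-to-ranked-cluster flow in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the clustering strategy here is "follow every outgoing link from the seed", the two cluster scores (size and number of flagged entities) and the weighted-sum metascore are invented for the example, and the names `build_cluster`, `metascore`, and `rank_clusters` are hypothetical rather than taken from the patent.

```python
from collections import deque


def build_cluster(seed, links):
    """Grow a cluster from a seed by following links breadth-first.

    `links` maps an entity to the entities it points to (assumed strategy).
    """
    cluster = {seed}
    queue = deque([seed])
    while queue:
        entity = queue.popleft()
        for neighbor in links.get(entity, ()):
            if neighbor not in cluster:
                cluster.add(neighbor)
                queue.append(neighbor)
    return cluster


def metascore(cluster, attributes, weights):
    """Combine per-cluster scores into one number using a weighted sum."""
    scores = {
        "size": len(cluster),
        "flagged": sum(1 for e in cluster if attributes.get(e, {}).get("flagged")),
    }
    return sum(weights[name] * value for name, value in scores.items())


def rank_clusters(seeds, links, attributes, weights):
    """Build one cluster per seed and order them by descending metascore."""
    clusters = [build_cluster(seed, links) for seed in seeds]
    return sorted(clusters, key=lambda c: metascore(c, attributes, weights), reverse=True)
```

With weights that favor flagged entities, a small cluster containing a flagged entity can outrank a larger unflagged one, which is the kind of prioritization the ranking step is meant to give an analyst.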
-
Publication number: US20240146761A1
Publication date: 2024-05-02
Application number: US18504392
Filing date: 2023-11-08
Applicant: Palantir Technologies Inc.
Inventor: Harkirat Singh, Geoffrey Stowe, Stefan Bach, Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
IPC: H04L9/40, G06F16/23, G06F16/242, G06F16/2457, G06F16/2458, G06F16/26, G06F16/28, G06F16/335, G06F16/35, G06F16/9535, G06Q10/10, G06Q20/38, G06Q20/40, G06Q30/018, G06Q40/00, G06Q40/02, G06Q40/03, G06Q40/10, G06Q40/12
CPC classification number: H04L63/145, G06F16/23, G06F16/244, G06F16/24578, G06F16/2465, G06F16/26, G06F16/283, G06F16/285, G06F16/287, G06F16/288, G06F16/335, G06F16/35, G06F16/355, G06F16/9535, G06Q10/10, G06Q20/382, G06Q20/4016, G06Q30/0185, G06Q40/00, G06Q40/02, G06Q40/03, G06Q40/10, G06Q40/123
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
-
Publication number: US11336681B2
Publication date: 2022-05-17
Application number: US16898850
Filing date: 2020-06-11
Applicant: Palantir Technologies Inc.
Inventor: Harkirat Singh, Geoffrey Stowe, Brendan Weickert, Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
IPC: G06Q40/00, H04L29/06, G06F16/2457, G06F16/23, G06F16/242, G06F16/28, G06F16/9535, G06Q10/10, G06Q40/02, G06F16/335, G06F16/35, G06F16/26, G06F16/2458, G06Q20/40, G06Q30/00, G06Q20/38
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
-
Publication number: US10552436B2
Publication date: 2020-02-04
Application number: US15852515
Filing date: 2017-12-22
Applicant: Palantir Technologies Inc.
Inventor: Geoffrey Stowe, John McRaven, Andrew Pettit, Lucas Lemanowicz, Benedict Cappellacci, Arjun Mathur, Jonathan Victor, Nabeel Qureshi, Anshuman Prasad, Joy Tao, Mikhail Proniushkin, Casey Patton
IPC: G06F16/248, G06T11/20, G06F16/26, G06F3/0482
Abstract: A system and method for processing data wherein one or more user selections of source data and an input defining one or more operations to be performed on the selected source data are received to generate processed data for display as a chart; the source data is retrieved from at least one data source, the source data is processed according to the defined one or more operations to generate processed data for output for display as a chart, the chart is stored as data defining the one or more operations and data identifying the source data operated on, a further user selection is received to redisplay the chart; retrieving the source data from the at least one data source; and the source data is processed according to the defined one or more operations to generate the processed data for output for redisplay as the chart.
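The key idea in the abstract above is that a chart is stored as the operations plus an identifier of the source data, not as the processed data, so redisplaying it re-retrieves the source and re-applies the operations. A minimal sketch of that idea follows. Everything concrete here is an assumption made for illustration: the in-memory `DATA_SOURCES` registry, the JSON chart format, the two operations `filter` and `sum`, and the function names `save_chart` and `render`.

```python
import json

# Hypothetical data source registry; a real system would query a database.
DATA_SOURCES = {
    "sales": [
        {"region": "east", "amount": 10},
        {"region": "west", "amount": 7},
        {"region": "east", "amount": 5},
    ]
}


def op_filter(rows, field, value):
    """Keep only rows whose `field` equals `value`."""
    return [r for r in rows if r[field] == value]


def op_sum(rows, field):
    """Collapse the rows into a single total of `field`."""
    return [{"total": sum(r[field] for r in rows)}]


# Hypothetical operations that a stored chart may reference by name.
OPERATIONS = {"filter": op_filter, "sum": op_sum}


def save_chart(source, operations):
    """Store the chart as a source identifier plus an operation list."""
    return json.dumps({"source": source, "operations": operations})


def render(chart_json):
    """Re-retrieve the source data and re-apply the stored operations."""
    spec = json.loads(chart_json)
    rows = DATA_SOURCES[spec["source"]]
    for step in spec["operations"]:
        rows = OPERATIONS[step["op"]](rows, **step["args"])
    return rows
```

Because only the definition is stored, a chart rendered again after the source has changed reflects the new data without being rebuilt by hand.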
-
Publication number: US20180081896A1
Publication date: 2018-03-22
Application number: US15824096
Filing date: 2017-11-28
Applicant: Palantir Technologies, Inc.
Inventor: Geoffrey Stowe, Chris Fischer, Paul George, Eli Bingham, Rosco Hill
Abstract: A data analysis system is proposed for providing fine-grained low latency access to high volume input data from possibly multiple heterogeneous input data sources. The input data is parsed, optionally transformed, indexed, and stored in a horizontally-scalable key-value data repository where it may be accessed using low latency searches. The input data may be compressed into blocks before being stored to minimize storage requirements. The results of searches present input data in its original form. The input data may include access logs, call data records (CDRs), e-mail messages, etc. The system allows a data analyst to efficiently identify information of interest in a very large dynamic data set up to multiple petabytes in size. Once information of interest has been identified, that subset of the large data set can be imported into a dedicated or specialized data analysis system for an additional in-depth investigation and contextual analysis.
-
Publication number: US20220261398A1
Publication date: 2022-08-18
Application number: US17662142
Filing date: 2022-05-05
Applicant: Palantir Technologies Inc.
Inventor: Geoffrey Stowe, John McRaven, Andrew Pettit, Lucas Lemanowicz, Benedict Cappellacci, Arjun Mathur, Jonathan Victor, Nabeel Qureshi, Anshuman Prasad, Joy Tao, Mikhail Proniushkin, Casey Patton
IPC: G06F16/248, G06F3/0482, G06T11/20, G06F16/26
Abstract: A system and method for processing data wherein one or more user selections of source data and an input defining one or more operations to be performed on the selected source data are received to generate processed data for display as a chart; the source data is retrieved from at least one data source, the source data is processed according to the defined one or more operations to generate processed data for output for display as a chart, the chart is stored as data defining the one or more operations and data identifying the source data operated on, a further user selection is received to redisplay the chart; retrieving the source data from the at least one data source; and the source data is processed according to the defined one or more operations to generate the processed data for output for redisplay as the chart.
-
Publication number: US20220239672A1
Publication date: 2022-07-28
Application number: US17658893
Filing date: 2022-04-12
Applicant: Palantir Technologies Inc.
Inventor: Harkirat Singh, Geoffrey Stowe, Brendan Weickert, Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
IPC: H04L9/40, G06Q40/00, G06F16/2457, G06F16/23, G06F16/242, G06F16/28, G06F16/9535, G06Q10/10, G06Q40/02, G06F16/335, G06F16/35, G06F16/26, G06F16/2458, G06Q20/40, G06Q30/00, G06Q20/38
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
-
Publication number: US11354327B2
Publication date: 2022-06-07
Application number: US16919951
Filing date: 2020-07-02
Applicant: Palantir Technologies Inc.
Inventor: Geoffrey Stowe, John McRaven, Andrew Pettit, Lucas Lemanowicz, Benedict Cappellacci, Arjun Mathur, Jonathan Victor, Nabeel Qureshi, Anshuman Prasad, Joy Tao, Mikhail Proniushkin, Casey Patton
IPC: G06F16/248, G06F3/0482, G06T11/20, G06F16/26
Abstract: A system and method for processing data wherein one or more user selections of source data and an input defining one or more operations to be performed on the selected source data are received to generate processed data for display as a chart; the source data is retrieved from at least one data source, the source data is processed according to the defined one or more operations to generate processed data for output for display as a chart, the chart is stored as data defining the one or more operations and data identifying the source data operated on, a further user selection is received to redisplay the chart; retrieving the source data from the at least one data source; and the source data is processed according to the defined one or more operations to generate the processed data for output for redisplay as the chart.
-
Publication number: US20200159725A1
Publication date: 2020-05-21
Application number: US16720813
Filing date: 2019-12-19
Applicant: Palantir Technologies Inc.
Inventor: Geoffrey Stowe, John McRaven, Andrew Pettit, Lucas Lemanowicz, Benedict Cappellacci, Arjun Mathur, Jonathan Victor, Nabeel Qureshi, Anshuman Prasad, Joy Tao, Mikhail Proniushkin, Casey Patton
IPC: G06F16/248, G06F16/26, G06T11/20, G06F3/0482
Abstract: A system and method for processing data wherein one or more user selections of source data and an input defining one or more operations to be performed on the selected source data are received to generate processed data for display as a chart; the source data is retrieved from at least one data source, the source data is processed according to the defined one or more operations to generate processed data for output for display as a chart, the chart is stored as data defining the one or more operations and data identifying the source data operated on, a further user selection is received to redisplay the chart; retrieving the source data from the at least one data source; and the source data is processed according to the defined one or more operations to generate the processed data for output for redisplay as the chart.