-
Publication No.: US10685042B2
Publication date: 2020-06-16
Application No.: US14578786
Filing date: 2014-12-22
Applicant: Amazon Technologies, Inc.
IPC: G06F17/00 , G06F16/28 , G06F16/242 , G06F16/2453 , G06F16/27
Abstract: A corpus of information describing queries used to access a transactional data store may be used to identify analytical relationships that are not explicitly defined in a schema or supplied by a user. Join relationships may be identified based on field coincidence in elements of queries in the corpus. Join relationships may be indicative of dimensions and attributes of a dimension. Hierarchy levels for a dimension may be identified based on factors including data type, reference in an aggregating clause, and reference in a grouping clause.
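As a rough illustration of the "field coincidence" idea in this abstract, the sketch below counts how often two columns from different tables appear together in an equality predicate across a corpus of query strings. It is only a minimal sketch under stated assumptions: queries are plain SQL text with table-qualified column names, a regular expression stands in for a real SQL parser, and the names `candidate_joins`, `min_count`, and the sample tables are hypothetical, not taken from the patent.

```python
import re
from collections import Counter

# Assumption: columns are written as table.column; a real system would parse the SQL.
_EQUALITY = re.compile(r"(\w+\.\w+)\s*=\s*(\w+\.\w+)")

def candidate_joins(queries, min_count=2):
    """Return column pairs that co-occur in equality predicates at least min_count times."""
    counts = Counter()
    for query in queries:
        for left, right in _EQUALITY.findall(query.lower()):
            # Ignore comparisons between two columns of the same table.
            if left.split(".")[0] != right.split(".")[0]:
                counts[tuple(sorted((left, right)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Hypothetical query corpus.
corpus = [
    "SELECT * FROM orders JOIN customers ON orders.cust_id = customers.id",
    "SELECT name FROM customers, orders WHERE customers.id = orders.cust_id",
    "SELECT * FROM orders WHERE orders.total = orders.paid",
]
joins = candidate_joins(corpus)
```

Pairs are sorted before counting so that `a = b` and `b = a` count as the same candidate relationship.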
-
Publication No.: US10430438B2
Publication date: 2019-10-01
Application No.: US14494506
Filing date: 2014-09-23
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki , Srinivasan Sundar Raghavan , Timothy Andrew Rath , Mukul Vijay Karnik , Amol Devgan , Swaminathan Sivasubramanian
IPC: G06F16/28 , H04L29/06 , G06F16/24 , G06F16/26 , G06F16/185 , G06F16/27 , G06F16/901 , G06F21/62
Abstract: An online analytical processing system may comprise an n-dimensional cube structured using slice-based partitioning in which each slice comprises one or more hierarchies of data points. A region of a hierarchy may be classified according to computational demands associated with the region. A scaling or replication mechanism may be applied to the region based on the computational demands associated with that region.
-
Publication No.: US20190095444A1
Publication date: 2019-03-28
Application No.: US15713034
Filing date: 2017-09-22
Applicant: Amazon Technologies, Inc.
Inventor: John Payne , Yung Haw Wang , Mohan Rao Varthakavi , Jose Kunnackal John , Santosh Kalki , Mukul Vijay Karnik , Jared Scott Lundell
Abstract: A data analysis system provides data analytics to a user via a natural language interface. In various embodiments, the data analysis system identifies statistical measures, analytical insights, data trends, or relationships with other data sets, based at least in part on a natural language query provided by a user. In an embodiment, the data analysis system interprets the natural language query to produce a result, and the result is converted into a natural language result which is provided to the user. In an embodiment, the data analysis system acquires an audio stream of a conversation. In an embodiment, the data analysis system identifies the parties to the conversation, and further identifies datasets of the parties. In an embodiment, the data analysis system identifies a characteristic of the datasets that is relevant to the conversation, and provides the characteristic to the parties.
-
Publication No.: US20190073398A1
Publication date: 2019-03-07
Application No.: US16179802
Filing date: 2018-11-02
Applicant: Amazon Technologies, Inc.
IPC: G06F17/30
CPC classification number: G06F16/2456 , G06F16/275
Abstract: A probabilistic counting structure such as a hyperloglog may be formed during a table scan for each of a selected set of columns. The columns may be selected based on an initial estimate of relatedness, which may be based on data types of the respective columns. An estimated cardinality of an intersection or union of columns may be formed based on an intersection of the probabilistic data structures. A join path may be determined based on the estimated cardinality of an intersection or union of the columns.
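To make the cardinality estimate in this abstract concrete, here is a minimal HyperLogLog sketch with an intersection estimate. It rests on assumptions that are not from the patent: the intersection is derived by inclusion-exclusion (|A| + |B| - |A ∪ B|) over a register-wise merge, which is one common approach; SHA-1 is used as the hash; and the class name, the precision `p=14`, and the sample columns are illustrative only.

```python
import hashlib
import math

class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers and a linear-counting correction."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value):
        h = int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                      # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of the leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def union(self, other):
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def cardinality(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw

def estimate_intersection(a, b):
    """Inclusion-exclusion estimate of |A ∩ B| (an assumption, not the patent's method)."""
    return a.cardinality() + b.cardinality() - a.union(b).cardinality()

# Hypothetical columns scanned from two tables; the true overlap is 5000 values.
left, right = HyperLogLog(), HyperLogLog()
for v in range(10000):
    left.add(v)
for v in range(5000, 15000):
    right.add(v)
overlap = estimate_intersection(left, right)
```

A large estimated overlap relative to the column cardinalities would suggest that the two columns are plausible join keys. The estimate is approximate, and its error grows as the true intersection becomes small compared with the union.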
-
Publication No.: US10812551B1
Publication date: 2020-10-20
Application No.: US15862422
Filing date: 2018-01-04
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki , Swaminathan Sivasubramanian , Srinivasan Sundar Raghavan , Timothy Andrew Rath , Amol Devgan , Mukul Vijay Karnik
Abstract: A hosted analytics system may be integrated with transactional data systems and additional data sources such as real-time systems and log files. A data processing pipeline may transform data on arrival for incorporation into an n-dimensional cube. Correlation between patterns of events in transactional data may be identified. Upon arrival, new data may be transformed and incorporated into the n-dimensional cube. Similarity between the new data and a previously identified correlation may be determined and flagged.
-
Publication No.: US09882949B1
Publication date: 2018-01-30
Application No.: US14503077
Filing date: 2014-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki , Swaminathan Sivasubramanian , Srinivasan Sundar Raghavan , Timothy Andrew Rath , Amol Devgan , Mukul Vijay Karnik
Abstract: A hosted analytics system may be integrated with transactional data systems and additional data sources such as real-time systems and log files. A data processing pipeline may transform data on arrival for incorporation into an n-dimensional cube. Correlation between patterns of events in transactional data may be identified. Upon arrival, new data may be transformed and incorporated into the n-dimensional cube. Similarity between the new data and a previously identified correlation may be determined and flagged.
-
Publication No.: US09244971B1
Publication date: 2016-01-26
Application No.: US13788057
Filing date: 2013-03-07
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Santosh Kalki
IPC: G06F17/30
CPC classification number: G06F17/30463 , G06F3/0482 , G06F17/30389 , G06F17/30392 , G06F17/30442 , G06F17/30569
Abstract: Techniques are described for retrieving data stored in datastores with different or heterogeneous data storage formats. A data report request is received, specifying data attributes, conditions, and ordering information for data to be retrieved from one or more datastores. The request may be in a syntax tree format that is abstracted away from any particular data storage technology or native query language, enabling data retrieval requests from users who lack a particular knowledge of query languages and the underlying storage format of the datastores. The request is analyzed, and a query plan is determined based on storage metadata indicating data attributes stored in various datastores, and based on data retrieval latency information for the datastores. Each query of the query plan is generated in a native query language supported by the targeted datastore. The query plan is executed to generate the requested data report.
-
Publication No.: US09189515B1
Publication date: 2015-11-17
Application No.: US13791453
Filing date: 2013-03-08
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Karthik Tamilmani , Santosh Kalki
IPC: G06F17/30
CPC classification number: G06F17/30445 , G06F17/30424 , G06F17/30943
Abstract: Techniques are described for retrieving data stored in disparate datastores that support different or heterogeneous storage systems. A report description may be received from a user, the report description including multiple query templates for generating queries to retrieve data from the disparate datastores. The report description may be analyzed to determine input parameters for generating the queries. A user interface may be dynamically generated to solicit input values corresponding to the input parameters. On receiving the input values, the system may generate and execute the queries of the query plan, and combine the results based on result combination information included in the report description.
-
Publication No.: US10776397B2
Publication date: 2020-09-15
Application No.: US14494524
Filing date: 2014-09-23
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki , Srinivasan Sundar Raghavan , Timothy Andrew Rath , Mukul Vijay Karnik , Amol Devgan , Swaminathan Sivasubramanian
IPC: G06F16/28 , H04L29/06 , G06F16/24 , G06F16/26 , G06F16/185 , G06F16/27 , G06F16/901 , G06F21/62
Abstract: An online analytical processing system may comprise an n-dimensional cube partitioned into slices, in which each slice may represent data points at the intersections of fixed and variable dimensions. Computation of data points within a slice may be deferred. A dependency graph may be initially constructed, in which the dependency graph is utilized in a subsequent computation. Calculation of data points may be prioritized based on information indicative of a chance that the data points will be accessed.
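The deferred computation described in this abstract can be pictured with a small dependency graph. The sketch below is an illustration under assumptions, not the patented design: data points are named strings, each definition lists its dependencies and a function over their values, the graph is built up front while values are computed only on demand, and `access_probability` is a hypothetical map used to decide which points to precompute first. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical slice: data point -> (dependencies, function of the dependency values).
slice_defs = {
    "sales_q1": ((), lambda: 120),
    "sales_q2": ((), lambda: 150),
    "sales_h1": (("sales_q1", "sales_q2"), lambda a, b: a + b),
    "sales_h1_avg": (("sales_h1",), lambda total: total / 2),
}

def needed(target, defs):
    """Collect the target and everything it transitively depends on."""
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(defs[node][0])
    return seen

def compute(target, defs, cache):
    """Compute a data point on demand, evaluating dependencies first and caching results."""
    subgraph = {n: set(defs[n][0]) for n in needed(target, defs)}
    for node in TopologicalSorter(subgraph).static_order():
        if node not in cache:
            deps, fn = defs[node]
            cache[node] = fn(*(cache[d] for d in deps))
    return cache[target]

def materialize_by_priority(defs, access_probability, cache):
    """Precompute data points in descending order of estimated access probability."""
    for target in sorted(access_probability, key=access_probability.get, reverse=True):
        compute(target, defs, cache)

cache = {}
half_year = compute("sales_h1", slice_defs, cache)  # sales_h1_avg stays deferred
```

Because results are cached, a later request for `sales_h1_avg` reuses the already computed `sales_h1` instead of recomputing the quarterly values.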
-
Publication No.: US10162876B1
Publication date: 2018-12-25
Application No.: US14973629
Filing date: 2015-12-17
Applicant: Amazon Technologies, Inc.
Inventor: Srinivasan Sundar Raghavan , Swaminathan Sivasubramanian , Timothy Andrew Rath , Mukul Vijay Karnik , Amol Devgan , Santosh Kalki
Abstract: An analytics module may be embedded into an application developed, published, or used by an entity in addition to the owner of the data under analysis. An access token may be submitted by the analytics module to a provider of hosted services. The access token may correspond to an n-dimensional cube containing data at a level of granularity permitted to the application. The access token may incorporate additional policies controlling access to the corresponding n-dimensional cube.