-
21.
Publication No.: US09158805B1
Publication date: 2015-10-13
Application No.: US13796361
Filing date: 2013-03-12
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Santosh Kalki, Adam Stephen Duncan, Jenny Bandy Freshwater
IPC: G06F17/30
CPC classification number: G06F17/30371
Abstract: Techniques are described for enabling or suspending access to one or more datastores based on a determined quality of the stored data. The datastores may use relational or non-relational formats. User-specified rules may be applied to statistically determine the data quality of at least a portion of the data in the datastore. The rules may perform statistical tests on the data, such as determining whether an amount of stored data is within a margin of a historical average, whether a number of records storing particular data is within a historical average, and so forth. Based on the rules, a flag may be set to indicate the determined data quality. Access to the data may be based on the value of the flag.
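The rule-and-flag idea in this abstract can be illustrated with a minimal sketch. Everything named here (`within_margin`, `GatedDatastore`, the 20% default margin) is a hypothetical illustration and not taken from the patent; it only shows one statistical rule (row count within a margin of the historical average) setting a flag that gates reads.

```python
from statistics import mean


def within_margin(current: float, history: list[float], margin: float) -> bool:
    """True if `current` lies within `margin` (a fraction) of the historical average."""
    avg = mean(history)
    return abs(current - avg) <= margin * abs(avg)


class GatedDatastore:
    """Hypothetical wrapper: reads are allowed only while the quality flag is set."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.quality_ok = False  # the flag; access stays suspended until checks pass

    def run_checks(self, row_count_history, margin=0.2):
        # A single user-specified rule; more rules could be appended to this list.
        rules = [lambda: within_margin(len(self._rows), row_count_history, margin)]
        self.quality_ok = all(rule() for rule in rules)
        return self.quality_ok

    def read(self):
        if not self.quality_ok:
            raise PermissionError("access suspended: data quality check failed")
        return list(self._rows)
```

A store whose row count is close to its history passes `run_checks` and can be read; one that has shrunk far below the average keeps its flag cleared and `read` raises.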
-
Publication No.: US11526518B2
Publication date: 2022-12-13
Application No.: US15712917
Filing date: 2017-09-22
Applicant: Amazon Technologies, Inc.
Inventor: John Payne, Yung Haw Wang, Mohan Rao Varthakavi, Jose Kunnackal John, Santosh Kalki, Mukul Vijay Karnik, Jared Scott Lundell
IPC: G06F16/20, G06F16/2457, G06Q30/00, G06Q40/00, G06Q10/10
Abstract: A data analysis system determines characteristics of a data set such as statistical measures, analytical insights, data trends, or relationships with other data sets. The system determines a level of importance for each determined characteristic using metadata associated with the data set, and, in some cases, user preferences provided by the user. Such metadata may include descriptive names, data types, and data characteristics of the data set and of data elements within the data set.
-
Publication No.: US10972491B1
Publication date: 2021-04-06
Application No.: US15977670
Filing date: 2018-05-11
Applicant: Amazon Technologies, Inc.
Inventor: Sudipto Guha, Santosh Kalki, Akshay Satish
Abstract: Techniques for seasonality-based anomaly detection and forecasting are described. For example, a method is described that includes receiving a request to generate a forecast for received time series data; performing seasonality-based anomaly detection and forecasting for the received time series data based upon the received request, the seasonality-based anomaly detection and forecasting utilizing a second data structure that reflects anomalies found in a first data structure on the input from the received time series data; and providing a result of the performed seasonality-based anomaly detection and forecast.
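The abstract does not say what the two data structures are, so the following is only a generic sketch of seasonality-based anomaly detection and forecasting, not the patented method. It assumes a known season length, uses a per-phase median as the seasonal profile, flags points whose residual exceeds a multiple of the median absolute deviation, and forecasts by repeating the profile. The function name and the `threshold` parameter are hypothetical.

```python
from statistics import median


def seasonal_anomalies_and_forecast(series, period, horizon, threshold=3.0):
    """Return (anomaly_indices, forecast) for a series with a fixed season length."""
    # Seasonal profile: the median of all observations sharing the same phase.
    # The median keeps isolated spikes from distorting the profile.
    profile = [median(series[phase::period]) for phase in range(period)]

    # Residual of each point against the profile value for its phase.
    residuals = [x - profile[i % period] for i, x in enumerate(series)]

    # Robust scale: median absolute deviation, with a tiny floor to avoid zero.
    mad = median(abs(r) for r in residuals) or 1e-9
    anomalies = [i for i, r in enumerate(residuals) if abs(r) > threshold * mad]

    # Forecast by continuing the seasonal profile past the end of the series.
    n = len(series)
    forecast = [profile[(n + k) % period] for k in range(horizon)]
    return anomalies, forecast
```

For a repeating pattern with a single injected spike, the spike's index is the only anomaly reported and the forecast continues the clean pattern.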
-
Publication No.: US10831759B2
Publication date: 2020-11-10
Application No.: US16179802
Filing date: 2018-11-02
Applicant: Amazon Technologies, Inc.
IPC: G06F17/00, G06F16/2455, G06F16/27
Abstract: A probabilistic counting structure such as a hyperloglog may be formed during a table scan for each of a selected set of columns. The columns may be selected based on an initial estimate of relatedness, which may be based on data types of the respective columns. An estimated cardinality of an intersection or union of columns may be formed based on an intersection of the probabilistic data structures. A join path may be determined based on the estimated cardinality of an intersection or union of the columns.
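As a rough illustration of the idea, the sketch below builds one small HyperLogLog per column, merges sketches to estimate the union, and derives the intersection by inclusion-exclusion (|A ∩ B| ≈ |A| + |B| − |A ∪ B|). Inclusion-exclusion is a common stand-in chosen here, not necessarily how the patent intersects the structures, and `HyperLogLog`, `estimate_intersection`, and `best_join_pair` are hypothetical names.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (illustrative, not tuned)."""

    def __init__(self, p: int = 12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value) -> None:
        # 64-bit hash: the top p bits pick a register, the rest give the rank.
        h = int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        # The register-wise maximum is the sketch of the union of both inputs.
        out = HyperLogLog(self.p)
        out.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return out

    def cardinality(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            est = self.m * math.log(self.m / zeros)  # small-range correction
        return est


def estimate_intersection(a: HyperLogLog, b: HyperLogLog) -> float:
    """Inclusion-exclusion estimate of the overlap between two columns."""
    return max(0.0, a.cardinality() + b.cardinality() - a.merge(b).cardinality())


def best_join_pair(sketches: dict) -> tuple:
    """Pick the pair of column names with the largest estimated overlap."""
    names = list(sketches)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return max(pairs, key=lambda p: estimate_intersection(sketches[p[0]], sketches[p[1]]))
```

In use, each sketch would be filled during a single scan of its column, and the pair returned by `best_join_pair` would be a candidate join key. The intersection estimate inherits the error of three separate estimates, so it is least reliable when the true overlap is small relative to the column sizes.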
-
Publication No.: US10120905B2
Publication date: 2018-11-06
Application No.: US14578841
Filing date: 2014-12-22
Applicant: Amazon Technologies, Inc.
Abstract: A probabilistic counting structure such as a hyperloglog may be formed during a table scan for each of a selected set of columns. The columns may be selected based on an initial estimate of relatedness, which may be based on data types of the respective columns. An estimated cardinality of an intersection or union of columns may be formed based on an intersection of the probabilistic data structures. A join path may be determined based on the estimated cardinality of an intersection or union of the columns.
-
Publication No.: US10095722B1
Publication date: 2018-10-09
Application No.: US14672880
Filing date: 2015-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Ajay Gopalakrishnan, Mukul Vijay Karnik, Jared Scott Lundell, Yoav Srebrnik, Santosh Kalki
IPC: G06F17/30
Abstract: Data may be stored using hybrid multidimensional and column-centric storage techniques. A hierarchy of regions of a multidimensional space may be maintained on one or more storage devices. Range information for the hierarchy may be maintained in a column-centric storage. Leaf nodes of the hierarchy may comprise tuple data stored in a column-centric storage. Tuples may be located by identifying candidate regions encompassing the tuple and scanning column-centric stores at the leaf level. Region splitting may be deferred to favor column-centric search characteristics.
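One way to picture the hybrid layout is a tree of regions whose leaves hold tuples as per-dimension columns. The sketch below is an assumption-laden toy, not the patented design: the `Region` class, the half-open bounds, the split along dimension 0, and the `split_threshold` parameter are all hypothetical. A large threshold stands in for deferred splitting, so lookups mostly scan columns at a leaf.

```python
class Region:
    """A box [lo, hi) in n-dimensional space; leaves store tuples column-wise."""

    def __init__(self, lo, hi):
        self.lo, self.hi = tuple(lo), tuple(hi)
        self.children = []                       # empty for a leaf
        self.columns = [[] for _ in self.lo]     # one list per dimension

    def contains(self, point) -> bool:
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))

    def insert(self, point, split_threshold=1000) -> None:
        if not self.contains(point):
            raise ValueError("point lies outside this region")
        node = self
        while node.children:                     # descend to the enclosing leaf
            node = next(c for c in node.children if c.contains(point))
        for col, x in zip(node.columns, point):
            col.append(x)
        # Splitting is deferred until the leaf grows past the threshold.
        if len(node.columns[0]) > split_threshold:
            node._split()

    def _split(self) -> None:
        # Halve the region along dimension 0 and redistribute its rows.
        mid = (self.lo[0] + self.hi[0]) / 2
        left = Region(self.lo, (mid,) + self.hi[1:])
        right = Region((mid,) + self.lo[1:], self.hi)
        for row in zip(*self.columns):
            target = left if row[0] < mid else right
            for col, x in zip(target.columns, row):
                col.append(x)
        self.children = [left, right]
        self.columns = [[] for _ in self.lo]

    def find(self, point) -> bool:
        node = self
        while node.children:                     # identify the candidate region
            node = next((c for c in node.children if c.contains(point)), None)
            if node is None:
                return False
        return tuple(point) in zip(*node.columns)  # scan the leaf's columns
```

Raising `split_threshold` trades a shallower hierarchy for longer column scans at the leaves, which is the trade-off the abstract's last sentence points at.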
-
Publication No.: US09824133B1
Publication date: 2017-11-21
Application No.: US14494473
Filing date: 2014-09-23
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki, Srinivasan Sundar Raghavan, Timothy Andrew Rath, Mukul Vijay Karnik, Amol Devgan, Swaminathan Sivasubramanian
CPC classification number: G06F17/30592
Abstract: A multi-tenant system for providing hosted analytic services may be dynamically configured in response to a request from a user. A request for analytic services may comprise an indication of at least one data source to be incorporated into an n-dimensional cube. A data source connector and transformation pipeline may transform data received from the data source to a format compatible with a dimension and hierarchy model of the n-dimensional cube.
-
Publication No.: US09740738B1
Publication date: 2017-08-22
Application No.: US14963095
Filing date: 2015-12-08
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Santosh Kalki
IPC: G06F17/30, G06F3/0482
CPC classification number: G06F17/30463, G06F3/0482, G06F17/30389, G06F17/30392, G06F17/30442, G06F17/30569
Abstract: A request that includes a syntax tree is received. A query plan comprising one or more queries is generated using storage metadata and one or more data attributes obtained from the syntax tree. Individual ones of the one or more queries are used to retrieve at least one of the one or more data attributes from at least two datastores. The query plan is generated by modifying individual ones of the one or more queries to access whichever of the at least two datastores has a lower data retrieval latency. The one or more queries of the query plan are executed to generate a report, the report including data resulting from the executing of the one or more queries.
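The latency-based routing step can be sketched as follows. This is a simplified illustration under stated assumptions: the attributes are taken as already extracted from the syntax tree, `storage_metadata` is a hypothetical mapping from attribute to the datastores holding it and their retrieval latencies, and `build_query_plan` is an invented name.

```python
def build_query_plan(attributes, storage_metadata):
    """Group requested attributes by the lowest-latency datastore that holds each.

    storage_metadata: {attribute: {datastore_name: latency_ms}} (assumed shape).
    Returns {datastore_name: [attributes to fetch from it]}.
    """
    plan = {}
    for attr in attributes:
        candidates = storage_metadata[attr]
        fastest = min(candidates, key=candidates.get)  # lowest latency wins
        plan.setdefault(fastest, []).append(attr)
    return plan
```

Each key of the returned plan would correspond to one query against that datastore, and the report would be assembled from the combined results.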