Abstract:
The subject disclosure relates to querying column-based encoded data structures to enable efficient query processing over large-scale data storage, and more specifically to complex queries that involve filter and/or sort operations over a defined window of data. In various embodiments, a method is provided that avoids expensive sorting of a high percentage of, or all, rows, either by not sorting any rows at all or by sorting only a very small number of rows, consistent with or smaller than the number of rows in the requested window over the data. In one embodiment, this is achieved by splitting an external query request into two internal sub-requests: a first that computes statistics about the distribution of rows for any specified WHERE clauses and ORDER BY columns, and a second that selects only the rows that match the window based on those statistics.
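The two-sub-request idea can be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the function name `windowed_query`, the use of a per-value count as the "statistics", and the in-memory list of rows are all hypothetical. It also assumes the ORDER BY column has far fewer distinct values than there are rows, as is common with dictionary-encoded columns, so that ordering the distinct values is cheap.

```python
from collections import Counter


def windowed_query(rows, predicate, key, skip, take):
    """Return the window [skip, skip + take) of the filtered, ordered rows
    while sorting only the rows that can fall inside that window."""
    # Sub-request 1: statistics. Count matching rows per ORDER BY value.
    counts = Counter(key(r) for r in rows if predicate(r))

    start, end = skip, skip + take
    selected = set()   # ORDER BY values whose rows overlap the window
    rows_before = 0    # matching rows ordered before the first selected value
    seen = 0
    for value in sorted(counts):          # sorts distinct values, not rows
        lo, hi = seen, seen + counts[value]
        if hi > start and lo < end:
            if not selected:
                rows_before = lo
            selected.add(value)
        seen = hi
        if seen >= end:
            break

    # Sub-request 2: fetch only rows whose value was selected, then sort
    # that small candidate set and cut the window out of it.
    candidates = [r for r in rows if predicate(r) and key(r) in selected]
    candidates.sort(key=key)
    offset = start - rows_before
    return candidates[offset:offset + take]
```

When a window falls entirely inside one value group, the candidate set holds rows with a single ORDER BY value, so the final sort does no reordering.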
Abstract:
Computer-readable media, systems, and methods for building a multidimensional data cube having one or more high-cardinality attributes are described. In embodiments, data is extracted from one or more databases. It is determined that one or more instances of the data are fact data and one or more instances of the data are dimension data. Each member of the fact data is one instance of a dimension and each instance of the dimension data includes an attribute for grouping the fact data. Moreover, in embodiments it is determined that one or more instances of the dimension data are high-cardinality attributes. The one or more high-cardinality attributes are processed with fact data and stored in fact tables on a computer storage medium.
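A rough sketch of the storage decision described above, under assumptions that are not in the abstract: cardinality is judged by a hypothetical distinct-count `threshold`, regular attributes get a small lookup table with surrogate keys, and high-cardinality attribute values are kept inline in the fact rows. The function names are illustrative only.

```python
def split_by_cardinality(rows, attributes, threshold):
    """Partition attribute names into (regular, high_cardinality) lists.
    The threshold rule is an assumption made for this sketch."""
    regular, high = [], []
    for name in attributes:
        distinct = len({row[name] for row in rows})
        (high if distinct > threshold else regular).append(name)
    return regular, high


def build_tables(rows, measures, attributes, threshold):
    """Build per-attribute dimension tables and one fact table."""
    regular, high = split_by_cardinality(rows, attributes, threshold)

    # One lookup table per regular attribute: member value -> surrogate key.
    dimension_tables = {
        name: {value: key
               for key, value in enumerate(sorted({r[name] for r in rows}))}
        for name in regular
    }

    fact_table = []
    for r in rows:
        fact = {m: r[m] for m in measures}
        for name in regular:
            fact[name + "_key"] = dimension_tables[name][r[name]]
        for name in high:
            # High-cardinality value is stored with the fact data itself.
            fact[name] = r[name]
        fact_table.append(fact)
    return dimension_tables, fact_table
```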
Abstract:
A data model for accessing data in a relational database in an OLAP system utilizes a multiple-hierarchy dimension. The dimension includes a set of attributes. Each attribute is bound to a column in the relational database. A logical structure is defined, indicating the relationships between the attributes. Hierarchies are defined. Each hierarchy includes a sequence of attributes. A hierarchy provides a common drill-down path that a database user can utilize to access the database. A hierarchy can include a single attribute or a combination of attributes. Both the relationships between the attributes and the sequence of attributes in a hierarchy are defined independent of any restrictions associated with the database.
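The data model above might be represented along the following lines. This is a minimal sketch: the class names `Attribute` and `Dimension`, their methods, and the example column names are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    name: str
    column: str  # relational column the attribute is bound to


@dataclass
class Dimension:
    name: str
    attributes: dict = field(default_factory=dict)     # name -> Attribute
    relationships: set = field(default_factory=set)    # (child, parent) names
    hierarchies: dict = field(default_factory=dict)    # name -> attribute names

    def add_attribute(self, name, column):
        self.attributes[name] = Attribute(name, column)

    def relate(self, child, parent):
        """Record the logical relationship between two attributes."""
        self.relationships.add((child, parent))

    def add_hierarchy(self, name, levels):
        """Define a drill-down path as a sequence of attribute names."""
        unknown = [a for a in levels if a not in self.attributes]
        if unknown:
            raise ValueError(f"unknown attributes: {unknown}")
        self.hierarchies[name] = list(levels)

    def drill_down_columns(self, hierarchy):
        """Columns visited, top level first, when drilling down."""
        return [self.attributes[a].column for a in self.hierarchies[hierarchy]]
```

Because hierarchies are stored separately from relationships, the same attribute (for example a month) can appear in several hierarchies, and a hierarchy can consist of a single attribute.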
Abstract:
A system that facilitates analyzing content of a multi-dimensional structure comprises a calculation component that receives statements in a declarative language relating to one or more of an assignment and a calculation and executes such statements against the multi-dimensional structure. A pass generation component creates a pass in order to maintain the content of the multi-dimensional structure as it existed prior to execution of the statement; the pass is accessible upon reference to it.
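The pass mechanism can be illustrated with a deliberately naive sketch. The class `PassedCube`, the representation of cells as a dictionary, and the choice to copy all cells for every pass are assumptions made for brevity; a real engine would likely store only the changes.

```python
class PassedCube:
    """Keeps one snapshot of cell values per pass so that earlier content
    stays readable after later statements have run."""

    def __init__(self, cells):
        # Pass 0 holds the content before any statement is executed.
        self._passes = [dict(cells)]

    def execute(self, assignment):
        """Run an assignment (a callable that mutates a dict of cells)
        in a new pass and return that pass number."""
        current = dict(self._passes[-1])  # prior pass is left untouched
        assignment(current)
        self._passes.append(current)
        return len(self._passes) - 1

    def value(self, cell, pass_number=-1):
        """Read a cell from a given pass (latest pass by default)."""
        return self._passes[pass_number][cell]
```

Each call to `execute` leaves the previous pass intact, so a calculation can still refer to the value a cell had before the assignment was applied.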
Abstract:
The subject invention pertains to interaction with multidimensional data. More specifically, interactions can be constrained to a limited subset of a multidimensional data cube, namely a subcube. Subsequent to or concurrently with subcube creation, query execution or other interactions such as calculations can be consolidated or restricted to the smaller subcube query space rather than the typically enormous main cube. Multiple subcubes can also be created and nested to gradually reduce the query space. Deletion of one subcube can cause a reversion to a previously defined or hierarchical parent subcube.
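Nested subcubes and the reversion on deletion can be modeled as a stack. The sketch below is illustrative only: the `Session` class, its method names, and the representation of the cube as a list of row dictionaries are hypothetical.

```python
class Session:
    """Tracks a stack of nested subcubes; queries see only the innermost."""

    def __init__(self, cube_rows):
        self._stack = [list(cube_rows)]  # index 0 is the main cube

    def create_subcube(self, predicate):
        """Narrow the current query space to rows matching the predicate."""
        self._stack.append([r for r in self._stack[-1] if predicate(r)])

    def drop_subcube(self):
        """Delete the innermost subcube and revert to its parent."""
        if len(self._stack) == 1:
            raise RuntimeError("no subcube to drop")
        self._stack.pop()

    def query(self, predicate=lambda r: True):
        """Evaluate a filter against the current query space only."""
        return [r for r in self._stack[-1] if predicate(r)]
```

Each new subcube is derived from the one below it, so nesting can only shrink the query space, and dropping a subcube restores exactly the parent's space.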
Abstract:
The subject invention pertains to systems and methods for interacting with fact dimensions. In particular, systems and methods are disclosed that optimize performance and scalability with respect to processing queries that involve fact dimensions. Furthermore, queries involving fact dimensions can be evaluated in distinct manners. For instance, queries can be processed such that regular dimensions restrict the scope of the data and only fact dimension members that are relevant to that scope are exposed.
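One of the evaluation modes mentioned above, where regular dimensions restrict the scope and only relevant fact dimension members are exposed, might look like the following sketch. It assumes a fact dimension is an attribute taken directly from the fact rows (here an order number); the function name `visible_fact_members` and the dictionary-based scope are hypothetical.

```python
def visible_fact_members(facts, scope, fact_dimension):
    """Return the members of a fact dimension that have at least one fact
    row inside the scope set by regular dimensions.

    facts          -- iterable of dicts, one per fact row
    scope          -- dict mapping regular dimension name -> required member
    fact_dimension -- name of the column acting as the fact dimension
    """
    in_scope = (
        row for row in facts
        if all(row[dim] == member for dim, member in scope.items())
    )
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(row[fact_dimension] for row in in_scope))
```

With an empty scope every fact dimension member is exposed; adding regular-dimension restrictions shrinks the list to the members that actually occur in that slice.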
Abstract:
A system that facilitates one or more of querying and updating a multi-dimensional structure comprises a component that receives a statement in a declarative language relating to a typed object associated with a multi-dimensional structure. A conversion component analyzes context associated with the statement and automatically converts the object to a disparate type as a function of the analysis. For example, an execution engine can comprise the conversion component, and the execution engine can be an Online Analytical Processing (OLAP) engine.
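Context-driven conversion of a typed object can be sketched as a small coercion function. Everything here is an assumption for illustration: the three types `Member`, `TupleExpr`, and `SetExpr`, the widening rules between them, and the idea that the "context" reduces to the type an operator expects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str


@dataclass(frozen=True)
class TupleExpr:
    members: tuple  # tuple of Member


@dataclass(frozen=True)
class SetExpr:
    tuples: tuple   # tuple of TupleExpr


def coerce(obj, expected):
    """Convert obj to the type the surrounding context expects, if a
    widening conversion exists; otherwise raise TypeError."""
    if isinstance(obj, expected):
        return obj
    if isinstance(obj, Member) and expected is TupleExpr:
        return TupleExpr((obj,))
    if isinstance(obj, Member) and expected is SetExpr:
        return SetExpr((TupleExpr((obj,)),))
    if isinstance(obj, TupleExpr) and expected is SetExpr:
        return SetExpr((obj,))
    raise TypeError(
        f"cannot convert {type(obj).__name__} to {expected.__name__}")
```

An execution engine built this way would call `coerce` wherever an operator expects a particular type, so a statement author can write a single member where a set is required.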