Abstract:
Various technologies and techniques are disclosed for utilizing spreadsheet references with grouped aggregate views. A grouped aggregate view feature enables a user to create a grouped aggregate view of data. A calculation feature enables the user to write calculations for the grouped aggregate view of data that are based upon relative or absolute references to data in the grouped aggregate view. Input is received from a user to write a calculation within a first cell in a grouped aggregate view. Input is received from the user to select a second cell to reference when writing the calculation. The user is presented with available references that are relevant to data contained in the second cell. Input is received from the user to select one of the available references that are relevant for the second cell. The selected one of the available references is placed into the first cell.
Abstract:
A system and method for analytically modeling data with related attributes is disclosed. A single dimension is used to provide data according to each of the related attributes, and, thus, may be said to play the role of each related attribute depending on a received query. The measure of the analytical data model is tied to the dimension according to both data attributes to allow the measure to be analyzed by the dimension according to both attributes.
Abstract:
The subject invention relates to systems and methods that extend the network data access capabilities of mark-up language protocols. In one aspect, a network data modeling system is provided. The system includes a protocol component that employs a computerized mark-up language to facilitate data interactions between network components. An extension component operates with the protocol component to support the data transactions, where the extension component supplies various commands above standard network and database protocols. An object model is provided as a wrapper to the extensions in order to support various online and offline database development applications.
Abstract:
The subject invention pertains to interaction with multidimensional data. More specifically, interactions can be constrained to a limited subset of a multidimensional data cube, namely a subcube. Subsequent to or concurrently with subcube creation, query execution or other interactions such as calculations can be consolidated or restricted to the smaller subcube query space rather than the typically enormous main cube. Multiple subcubes can also be created and nested to gradually reduce the query space. Deletion of one subcube can cause a reversion back to a previously defined or hierarchical parent subcube.
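The nesting and reversion behavior described above can be illustrated with a minimal sketch. It assumes the cube is a plain list of row dictionaries and that a subcube is defined by a filter predicate; the class `CubeSession` and its method names are hypothetical and are not taken from the abstract.

```python
class CubeSession:
    """Hypothetical session that restricts queries to the innermost subcube."""

    def __init__(self, rows):
        # The bottom of the stack is the full cube.
        self._stack = [list(rows)]

    def create_subcube(self, predicate):
        # A new subcube is carved out of the current query space, so nesting
        # gradually reduces the rows that later queries have to touch.
        self._stack.append([r for r in self._stack[-1] if predicate(r)])

    def drop_subcube(self):
        # Deleting a subcube reverts to its parent; the full cube is never dropped.
        if len(self._stack) > 1:
            self._stack.pop()

    def total(self, measure):
        # Queries only see the innermost subcube.
        return sum(r[measure] for r in self._stack[-1])


rows = [
    {"region": "EU", "year": 2023, "sales": 10},
    {"region": "EU", "year": 2024, "sales": 20},
    {"region": "US", "year": 2024, "sales": 5},
]
session = CubeSession(rows)
session.create_subcube(lambda r: r["region"] == "EU")
session.create_subcube(lambda r: r["year"] == 2024)
nested_total = session.total("sales")   # restricted to EU and 2024
session.drop_subcube()
parent_total = session.total("sales")   # back to the EU subcube
```

Materializing each subcube as a filtered copy is only one possible choice; a real engine would more likely keep the restriction as metadata and apply it at query time.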
Abstract:
The subject disclosure relates to column-based data encoding. Raw data to be compressed is organized by columns; then, as first and second layers of data-size reduction, dictionary encoding and/or value encoding are applied to the column-organized data to create integer sequences that correspond to the columns. Next, a hybrid greedy run-length encoding and bit-packing compression algorithm further compacts the data according to an analysis of bit savings. The synergy of these hybrid data reduction techniques with the column-based organization, coupled with gains in scanning and querying efficiency owing to the compact representation, results in substantially improved data compression at a fraction of the cost of conventional systems.
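The layering described above can be sketched roughly as follows. This is an illustrative simplification, not the disclosed algorithm: the function names are hypothetical, the column is assumed to be non-empty, and a fixed `min_run` threshold stands in for the bit-savings analysis that decides between run-length encoding and bit packing.

```python
from itertools import groupby


def dictionary_encode(column):
    """Map each distinct value to a small integer ID, in order of first appearance."""
    dictionary = {}
    ids = [dictionary.setdefault(v, len(dictionary)) for v in column]
    return dictionary, ids


def hybrid_encode(ids, min_run=3):
    """Greedily store long runs as (start, value, length); keep the rest as literals."""
    width = max(1, max(ids).bit_length())  # bits per bit-packed literal
    runs, literals, pos = [], [], 0
    for value, group in groupby(ids):
        n = len(list(group))
        if n >= min_run:   # assumed stand-in for "RLE saves bits here"
            runs.append((pos, value, n))
        else:
            literals.extend([value] * n)
        pos += n
    return width, runs, literals


def bit_pack(values, width):
    """Pack fixed-width integers into a single Python int."""
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * width)
    return packed


def bit_unpack(packed, width, count):
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


def hybrid_decode(total, runs, literals):
    """Rebuild the ID sequence from runs plus the literals that fill the gaps."""
    out = [None] * total
    for start, value, n in runs:
        out[start:start + n] = [value] * n
    remaining = iter(literals)
    return [next(remaining) if x is None else x for x in out]


column = ["a", "a", "a", "a", "b", "c", "b", "a", "a", "a"]
dictionary, ids = dictionary_encode(column)
width, runs, literals = hybrid_encode(ids)
packed = bit_pack(literals, width)
restored = hybrid_decode(len(ids), runs, bit_unpack(packed, width, len(literals)))
```

In this example the two long runs of `"a"` become run entries, while the short `b, c, b` stretch is bit-packed at two bits per value.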
Abstract:
Computer-readable media, systems, and methods for building a multidimensional data cube having one or more high-cardinality attributes are described. In embodiments, data is extracted from one or more databases. It is determined that one or more instances of the data are fact data and one or more instances of the data are dimension data. Each member of the fact data is one instance of a dimension and each instance of the dimension data includes an attribute for grouping the fact data. Moreover, in embodiments it is determined that one or more instances of the dimension data are high-cardinality attributes. The one or more high-cardinality attributes are processed with fact data and stored in fact tables on a computer storage medium.
Abstract:
A scalable analysis system is described herein that performs common data analysis operations such as distinct counts and data grouping in a more scalable and efficient manner. The system allows distinct counts and data grouping to be applied to large datasets with predictable growth in the cost of the operation. The system dynamically partitions data based on the actual data distribution, which provides both scalability and uncompromised performance. The system sets a budget of available memory or other resources to use for the operation. As the operation progresses, the system determines whether the budget of memory is nearing exhaustion. Upon detecting that the memory used is near the limit, the system dynamically partitions the data. If the system still detects memory pressure, then the system partitions again, until a partition level is identified that fits within the memory budget.
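A minimal sketch of the budget-driven partitioning loop, applied to a distinct count, might look like the following. It makes several assumptions that the abstract does not state: the budget is expressed as a maximum number of values held in memory at once, the input can be scanned repeatedly, partitions are chosen by hash, and the partition count simply doubles (restarting the count) whenever the budget is exceeded. The function name `distinct_count` is hypothetical.

```python
def distinct_count(values, budget, partitions=1):
    """Count distinct values while never holding more than `budget` of them in memory.

    Returns (count, partitions_used). `budget` must be at least 1.
    """
    while True:
        total = 0
        fits = True
        for p in range(partitions):
            seen = set()
            for v in values:
                # Each value belongs to exactly one partition, so per-partition
                # distinct counts can simply be added together.
                if hash(v) % partitions == p:
                    seen.add(v)
                    if len(seen) > budget:
                        fits = False  # memory pressure detected
                        break
            if not fits:
                break
            total += len(seen)
        if fits:
            return total, partitions
        partitions *= 2  # partition again and retry


values = list(range(100)) * 3
count, used = distinct_count(values, budget=30)
```

The sketch trades extra scans for bounded memory; the abstract's system reacts as the budget nears exhaustion rather than after it is exceeded.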
Abstract:
Random access to run-length encoded data values is provided. A target value is identified by a logical index into a structure of run-length-encoded values. To access the value, a bookmark is selected based on the logical index, on a maximum logical index of the bookmark, and on a specified bookmark distance. An initial run in the structure is located, based on the selected bookmark. A final run is chosen, at most one bookmark distance from the initial run. The target value is the value of the final run. Efficiency heuristics are used when generating bookmarks or creating the structure of run-length-encoded values.
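One way to picture the bookmark lookup is the sketch below. It is an interpretation under stated assumptions, not the claimed method: runs are `(value, length)` pairs, one bookmark is placed every `distance` logical positions, and each bookmark records the run that contains its position together with that run's logical start. The function names are hypothetical.

```python
def build_bookmarks(runs, distance):
    """Bookmark k records (run index, logical start of that run) for position k * distance."""
    bookmarks = []
    start = 0
    for run_index, (_value, length) in enumerate(runs):
        end = start + length
        # Emit every bookmark whose logical position falls inside this run.
        while len(bookmarks) * distance < end:
            bookmarks.append((run_index, start))
        start = end
    return bookmarks


def value_at(runs, bookmarks, distance, index):
    """Return the value at a logical index by scanning forward from the nearest bookmark."""
    if index < 0:
        raise IndexError(index)
    # Select the bookmark from the logical index and the bookmark distance.
    run_index, start = bookmarks[index // distance]
    # The target lies less than `distance` positions past the bookmark,
    # so this scan crosses at most `distance` runs.
    while start + runs[run_index][1] <= index:
        start += runs[run_index][1]
        run_index += 1
    return runs[run_index][0]


runs = [("a", 5), ("b", 1), ("c", 10), ("d", 4)]
distance = 4
bookmarks = build_bookmarks(runs, distance)
decoded = [value_at(runs, bookmarks, distance, i) for i in range(20)]
```

A smaller `distance` shortens the forward scan at the cost of more bookmarks, which is the kind of trade-off the efficiency heuristics would have to balance.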
Abstract:
The present invention leverages MOLAP performance for ROLAP objects (dimensions, partitions and aggregations) by building, in a background process, a MOLAP equivalent of that object. When the background processing completes, queries are switched from ROLAP queries to MOLAP queries. When changes occur to relevant relational objects (such as tables that define content of OLAP objects), an OLAP object is switched back to a ROLAP mode, and all relevant caches are dropped while, as a background process, a new MOLAP equivalent is created.
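The mode switching can be sketched as a small state machine. This is a simplified illustration with hypothetical names: the relational source is a list of `(key, amount)` rows, the MOLAP equivalent is a precomputed dictionary of totals, and the background build is represented by an explicit method call rather than a real background process.

```python
class OlapObject:
    """Hypothetical OLAP object that answers from MOLAP when a build is available."""

    def __init__(self, source_rows):
        self.source_rows = source_rows
        self.mode = "ROLAP"
        self.molap_store = None
        self.cache = {}

    def finish_background_build(self):
        # Stands in for the background process: pre-aggregate, then switch queries.
        totals = {}
        for key, amount in self.source_rows:
            totals[key] = totals.get(key, 0) + amount
        self.molap_store = totals
        self.mode = "MOLAP"

    def on_source_changed(self, new_rows):
        # A relational change invalidates the MOLAP copy and all relevant caches.
        self.source_rows = new_rows
        self.mode = "ROLAP"
        self.molap_store = None
        self.cache.clear()

    def total(self, key):
        if key in self.cache:
            return self.cache[key]
        if self.mode == "MOLAP":
            result = self.molap_store.get(key, 0)
        else:
            # ROLAP path: aggregate directly over the relational rows.
            result = sum(amount for k, amount in self.source_rows if k == key)
        self.cache[key] = result
        return result


obj = OlapObject([("EU", 10), ("EU", 20), ("US", 5)])
before_build = (obj.mode, obj.total("EU"))
obj.finish_background_build()
after_build = (obj.mode, obj.total("EU"))
obj.on_source_changed([("EU", 1), ("US", 5)])
after_change = (obj.mode, obj.total("EU"))
```

Queries return the same answers in either mode; only the path used to compute them changes.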