Abstract:
In embodiments of the present invention, improved capabilities are described for identifying a classification scheme associated with product attributes of a grouping of products of an entity; receiving a record of data relating to an item of a competitor to the entity, the classification of which is uncertain; receiving a dictionary of attributes associated with products; and assigning a product code to the item based on probabilistic matching among the attributes in the classification scheme, the attributes in the dictionary of attributes, and at least one known attribute of the item.
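The matching step can be illustrated with a minimal sketch. The abstract does not specify the probabilistic model, so the log-likelihood scoring below, the `assign_product_code` function, the `smoothing` parameter, and the layout of `scheme` and `dictionary` are all assumptions made for illustration.

```python
from math import log

def assign_product_code(item_attrs, scheme, dictionary, smoothing=0.01):
    """Return (code, score) for the product code that best matches the item.

    Hypothetical data layout (not taken from the source):
      scheme:     {product_code: {attribute_name: {canonical_value: probability}}}
      dictionary: {raw_value: canonical_value}
    """
    # Normalize the item's known attribute values through the attribute dictionary.
    normalized = {name: dictionary.get(value, value) for name, value in item_attrs.items()}

    best_code, best_score = None, float("-inf")
    for code, profile in scheme.items():
        # Sum log-probabilities; unseen values get a small smoothing probability.
        score = 0.0
        for name, value in normalized.items():
            score += log(profile.get(name, {}).get(value, smoothing))
        if score > best_score:
            best_code, best_score = code, score
    return best_code, best_score
```

Summing log-probabilities treats the attributes as independent, which is a simplification; a production matcher would likely weight attributes and report a confidence threshold below which the item stays unclassified.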
Abstract:
Systems and methods are presented that may involve receiving an aggregated dataset, wherein the aggregated dataset includes data from a panel data source, a fact data source, and a dimension data source that have been associated with a standard population database. The process may also involve storing the aggregated data in a partition within a partitioned database, wherein the partition is associated with a data characteristic. The process may also involve associating a master processing node with a plurality of slave nodes, wherein each of the plurality of slave nodes is associated with a partition of the partitioned database. The process may also involve submitting an analytic query to the master processing node. The process may also involve assigning analytic processing to at least one of the plurality of slave nodes by the master processing node, wherein the assignment is based at least in part on the association of the partition with the data characteristic. The process may also involve reading the aggregated data from the partitioned database by the assigned slave node. The process may also involve analyzing the aggregated data by the assigned slave node, wherein the analysis produces a result at each slave node. The process may also involve combining the results from each of the plurality of slave nodes by the master processing node into a master result and reporting the master result to a user interface.
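The query flow described above can be sketched in a few lines. This is a toy, single-process stand-in, assuming that the data characteristic is a region and that the analysis is a sum of sales; `PARTITIONS`, `worker_analyze`, and `master_query` are hypothetical names, and a thread pool stands in for separate processing nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitioned store: one partition per data characteristic (here, region).
PARTITIONS = {
    "east": [{"product": "A", "sales": 10}, {"product": "B", "sales": 5}],
    "west": [{"product": "A", "sales": 7}],
    "south": [{"product": "B", "sales": 3}],
}

def worker_analyze(region, product):
    """Worker node: read its own partition and produce a partial result."""
    rows = PARTITIONS[region]
    return sum(row["sales"] for row in rows if row["product"] == product)

def master_query(product, regions=None):
    """Master node: assign work by partition characteristic, then combine results."""
    if regions is None:
        targets = list(PARTITIONS)
    else:
        targets = [r for r in regions if r in PARTITIONS]
    # Each target partition is analyzed by its own worker.
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        partials = list(pool.map(lambda region: worker_analyze(region, product), targets))
    # Combine the per-partition results into a single master result.
    return {"per_partition": dict(zip(targets, partials)), "total": sum(partials)}
```

Calling `master_query("A")` fans the query out to every partition, while `master_query("B", ["east"])` restricts the assignment to the partition whose characteristic matches.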
Abstract:
In embodiments of the present invention, a method is described for reducing bias by data fusion of household panel data and loyalty card data. In embodiments, a method is provided for receiving a consumer panel dataset in a data fusion facility, receiving a consumer point-of-sale dataset in the data fusion facility, receiving a dimension dataset in the data fusion facility, fusing the datasets received in the data fusion facility into a new panel dataset based at least in part on an encryption key, estimating a consumer behavior using a first model based on the consumer panel dataset, estimating a consumer behavior using a second model based only on those consumers present in both the consumer panel dataset and the consumer point-of-sale dataset, and refining the first model based at least on the results of the second model.
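One way to picture the key-based fusion step is a join on keyed hashes of consumer identifiers. The abstract does not say how the encryption key is used, so the HMAC-based pseudonymization below is an assumption, as are the function names `pseudonymize` and `fuse` and the record fields `id` and `spend`.

```python
import hashlib
import hmac

def pseudonymize(consumer_id, key):
    """Derive a stable token from a consumer ID with a secret key (assumed scheme)."""
    return hmac.new(key, consumer_id.encode("utf-8"), hashlib.sha256).hexdigest()

def fuse(panel, pos, key):
    """Join panel and point-of-sale records for consumers present in both datasets."""
    panel_by_token = {pseudonymize(rec["id"], key): rec for rec in panel}
    fused = []
    for rec in pos:
        token = pseudonymize(rec["id"], key)
        if token in panel_by_token:
            fused.append({
                "token": token,  # the raw consumer ID is not carried into the fused set
                "panel_spend": panel_by_token[token]["spend"],
                "pos_spend": rec["spend"],
            })
    return fused
```

The fused records cover only the overlapping consumers, which is the population the second model in the abstract is estimated on.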
Abstract:
In embodiments of the present invention, improved capabilities are described for using an analytic platform to obtain a projection, where a user of the analytic platform may select at least one dimension on which the user wishes to make a projection from a data set. A core information matrix may be developed for the data set, where the core information matrix may include regions representing the statistical characteristics of alternative projection techniques that may be applied to the data set, and may include statistical characteristics relating to projections using any selected dimensions. In addition, a user interface may be provided whereby a user may observe the regions of the core information matrix to facilitate selecting an appropriate projection technique.
Abstract:
In embodiments of the present invention, improved capabilities are described for perturbing non-unique values, which may comprise finding the non-unique values in a data table, perturbing the non-unique values to render unique values, and using the non-unique values as an identifier for a data item.
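The finding and perturbing steps can be sketched as follows. The abstract does not describe the perturbation itself, so adding small multiples of a fixed `epsilon` to numeric values is an assumption, and `perturb_non_unique` is a hypothetical name.

```python
from collections import Counter

def perturb_non_unique(values, epsilon=1e-6):
    """Return a copy of `values` in which repeated entries are nudged apart.

    Assumes numeric values whose magnitude is small enough that adding a
    multiple of `epsilon` actually changes the value.
    """
    counts = Counter(values)
    # Values that are already unique are reserved so a perturbed value never collides with them.
    taken = {value for value, count in counts.items() if count == 1}
    result = []
    for value in values:
        if counts[value] == 1:
            result.append(value)
            continue
        # Step away from the original value until an unused value is found.
        step, candidate = 0, value
        while candidate in taken:
            step += 1
            candidate = value + step * epsilon
        taken.add(candidate)
        result.append(candidate)
    return result
```

The first occurrence of a repeated value is kept as is, and later occurrences are shifted by successive multiples of `epsilon`.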
Abstract:
Using a computer, a database comprising a field is identified. A query relating to the field is identified. Prior to processing the query, the field is dynamically altered to conform to a desired bit size. The query is processed. The results of the query are returned.
Abstract:
The present invention provides a method for updating data sources. The method may include identifying a plurality of data sources, wherein at least a first data source is more accurate than a second data source; identifying a plurality of overlapping attribute segments to use for comparing the data sources; calculating a factor as a function of each of the plurality of overlapping attribute segments; and using the factors to update a first group of values in the second data source to reduce bias.
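A minimal sketch of the factor calculation follows. The abstract does not define the function used, so a simple ratio of segment totals (more accurate source over less accurate source) is assumed here; `segment_factors`, `apply_factors`, and the record fields `segment` and `value` are hypothetical.

```python
def segment_factors(accurate_totals, biased_totals):
    """Compute one correction factor per segment present in both sources."""
    factors = {}
    for segment in accurate_totals.keys() & biased_totals.keys():
        if biased_totals[segment]:  # skip zero totals to avoid division by zero
            factors[segment] = accurate_totals[segment] / biased_totals[segment]
    return factors

def apply_factors(records, factors):
    """Scale each record's value by its segment factor; unmatched segments are unchanged."""
    return [
        {**rec, "value": rec["value"] * factors.get(rec["segment"], 1.0)}
        for rec in records
    ]
```

For example, if the more accurate source reports 120 units for a segment where the second source reports 100, the factor is 1.2 and every value in that segment of the second source is scaled up by 20 percent.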
Abstract:
In embodiments, systems and methods using a platform as disclosed herein may involve receiving a dataset in an analytic platform, the dataset including fact data and dimension data for a plurality of distinct product categories. It may also involve storing the data in a flexible hierarchy, the hierarchy allowing the temporary fixing of data along a dimension and flexible querying along other dimensions of the data. It may also involve pre-aggregating certain combinations of data to facilitate rapid querying, the pre-aggregation based on the nature of common queries. It may also involve facilitating the presentation of a cross-category view of an analytic query of the dataset. In embodiments, the temporarily fixed dimension can be rendered flexible upon an action by the user.
Abstract:
In embodiments of the present invention, improved capabilities are described for identifying a first classification scheme associated with product attributes of a first grouping of products, identifying a second classification scheme associated with product attributes of a second grouping of products, and receiving a record of data relating to an item, the classification of which is uncertain. It may also involve receiving a dictionary of attributes associated with products and assigning the item to at least one of the classification schemes based on probabilistic matching among the attributes in the classification schemes, the attributes in the dictionary of attributes, and the known attributes of the item.