Abstract:
Electronic marketplaces typically store product attribute names and values in a catalog schema of name-value pairs, which gives them a very high degree of flexibility. This vertical schema approach prevents traditional relational database management systems from accurately estimating constraint selectivity and generating efficient query plans. Methods and systems are disclosed for building and maintaining external histograms, which a query planner uses to assist query planning in relational databases.
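The idea of an external histogram over a name-value table can be illustrated with a minimal sketch. It is not the disclosed method: the row layout `(product_id, attr_name, attr_value)`, the function names, and the use of exact per-value counts instead of bucketed histograms are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_histograms(rows):
    """Build one value-frequency histogram per attribute name.

    rows: iterable of (product_id, attr_name, attr_value) tuples,
    a hypothetical layout for a vertical (name-value) catalog table.
    Returns (histograms, number_of_distinct_products).
    """
    histograms = defaultdict(Counter)
    products = set()
    for product_id, name, value in rows:
        histograms[name][value] += 1
        products.add(product_id)
    return histograms, len(products)


def selectivity(histograms, n_products, name, value):
    """Estimate the fraction of products satisfying name = value."""
    if n_products == 0:
        return 0.0
    # .get avoids creating an empty entry for an unknown attribute name.
    return histograms.get(name, Counter())[value] / n_products


def order_constraints(histograms, n_products, constraints):
    """Order (name, value) constraints from most to least selective."""
    return sorted(
        constraints,
        key=lambda c: selectivity(histograms, n_products, c[0], c[1]),
    )


rows = [
    (1, "color", "red"),
    (2, "color", "red"),
    (3, "color", "blue"),
    (1, "size", "L"),
    (4, "size", "M"),
]
histograms, n_products = build_histograms(rows)
plan = order_constraints(
    histograms, n_products, [("color", "red"), ("size", "L")]
)
```

A planner could evaluate the most selective constraint first, so that later joins against the name-value table touch fewer rows.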
Abstract:
Systems and methods for conducting attribute-based queries over a plurality of objects using bounded memory locations and minimizing costly input and output operations are provided. A plurality of attributes are associated with each object, and a plurality of data groups, one for each identified attribute, are created. The objects associated with the attributes are placed into the appropriate data groups, and the objects contained within each data group are sorted into blocks such that each block within a given attribute contains the objects having the same attribute value. Results to the query are created by loading blocks into a primary memory location in a middleware system and combining the loaded blocks to create the desired query results. Block combinations are created based upon the fit of the given block combination to the query as expressed in an aggregation function. A second dedicated memory location can also be provided to hold multiple block combinations to optimize the order in which blocks are loaded and combined. Empty block buffers and external storage devices can also be provided to further enhance the generation of query results.
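The grouping of objects into per-attribute blocks can be sketched as follows. This is a simplified illustration under stated assumptions: it keeps everything in memory, treats the query as an exact-match conjunction rather than scoring block combinations with an aggregation function, and all names (`build_blocks`, `query`) are hypothetical.

```python
from collections import defaultdict


def build_blocks(objects):
    """Group objects into blocks: attribute -> value -> sorted object ids.

    objects: dict mapping object id -> dict of attribute -> value.
    Each block holds the objects sharing one value of one attribute.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for obj_id, attrs in objects.items():
        for attr, value in attrs.items():
            groups[attr][value].append(obj_id)
    return {
        attr: {value: sorted(ids) for value, ids in blocks.items()}
        for attr, blocks in groups.items()
    }


def query(blocks, constraints):
    """Answer a conjunctive query by loading one block per constraint.

    constraints: dict mapping attribute -> required value.
    Blocks are combined smallest first so the working set stays small.
    """
    loaded = [blocks.get(a, {}).get(v, []) for a, v in constraints.items()]
    if not loaded:
        return []
    loaded.sort(key=len)
    result = set(loaded[0])
    for block in loaded[1:]:
        result &= set(block)
        if not result:
            break  # no object can match; skip the remaining blocks
    return sorted(result)


objects = {
    1: {"color": "red", "size": "L"},
    2: {"color": "red", "size": "M"},
    3: {"color": "blue", "size": "L"},
}
blocks = build_blocks(objects)
matches = query(blocks, {"color": "red", "size": "L"})
```

Only the blocks named by the query are touched, which is the property that lets a bounded-memory implementation limit input and output operations.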
Abstract:
Interoperability is enabled between participants in a network by determining values associated with a value metric defined for at least a portion of the network. Information flow is directed between two or more of the participants based at least in part on semantic models corresponding to the participants and on the values associated with the value metric. The semantic models may define interactions between the participants and define at least a portion of information produced or consumed by the participants. The determination of the values and the direction of the information flow may be performed multiple times in order to modify the one or more value metrics. The direction of information flow may allow participants to be deleted from the network, may allow participants to be added to the network, or may allow behavior of the participants to be modified.
Abstract:
Systems, methods and articles of manufacture are disclosed for building and executing analytics solutions. Such a solution may be comprehensive (e.g., covering risk assessment, fraud detection, dynamic operational risk evaluation, regulatory compliance assessment, etc.). The analytics solution may perform an analytics task using operational data distributed across a variety of independently created and governed data repositories in different departments of an organization. A framework is disclosed which allows a user (e.g., a risk analyst) to compose analytical tools that can access data from a variety of sources (both internal and external to an enterprise) and perform a variety of analytic functions.
Abstract:
A method and structure for storing information for one or more semantic objects derived from raw data. A semantic object that has been extracted from the raw data and classified is received, the received semantic object having one or more attributes. The following are generated: a summary of attributes of the semantic object, obtained by calculating one or more statistics of one or more of the attributes of the received semantic object; a confidence level of the received semantic object that quantifies a degree of certainty that the received semantic object has been correctly classified and/or labeled; and a compact representation of raw data of the received semantic object. Indexing information for one or more of the summary of attributes and the compact representation of the semantic object is generated. The semantic object, along with its associated summary of attributes, confidence level, compact representation, and indexing information, is stored in a semantic object database associated with a database storing the raw data.
Abstract:
Techniques are provided for enabling execution of a process employing a cache. Method steps can include obtaining a first probability of accessing a given artifact in a state Si, obtaining a second probability of using a predicate from a current state Sc in the state Si, determining a benefit of prefetching the given artifact using the predicate based on at least the first probability and the second probability, and determining whether and/or when a cache replacement should be conducted, based at least on the benefit determined.
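One way to picture the benefit computation is the sketch below. The abstract only says the benefit depends on at least the two probabilities; multiplying them together, weighting by a fetch cost, and evicting the lowest-benefit cached entry are assumptions made here for illustration, as are the function and parameter names.

```python
def prefetch_benefit(p_access, p_predicate, fetch_cost=1.0):
    """Expected saving from prefetching an artifact.

    p_access: first probability (artifact is accessed in state Si).
    p_predicate: second probability (the predicate from the current
        state Sc is used in state Si).
    fetch_cost: hypothetical cost of fetching the artifact on demand.
    The product form is an assumed model, not taken from the source.
    """
    return p_access * p_predicate * fetch_cost


def choose_victim(candidate_benefit, cached_benefits):
    """Pick a cached artifact to replace, or None to keep the cache.

    cached_benefits: dict mapping cached artifact key -> its benefit.
    Replacement happens only if the candidate beats the weakest entry.
    """
    if not cached_benefits:
        return None
    victim = min(cached_benefits, key=cached_benefits.get)
    if cached_benefits[victim] < candidate_benefit:
        return victim
    return None


benefit = prefetch_benefit(0.5, 0.5, fetch_cost=4.0)
victim = choose_victim(benefit, {"report.pdf": 0.5, "form.xml": 2.0})
```

Here `victim` names the entry to evict in favor of the prefetched artifact; a `None` result means the prefetch is not worth a replacement at this point.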
Abstract:
A computer-implemented method, computer program product, and data processing system are provided for optimizing a layout of a relational database on a solid state disk. The optimized layout comprises a plurality of column-to-disk-block assignments, wherein each disk block is assigned substantially the same amount of column data. A column having a size less than a greatest size of any disk block is assigned to one of a plurality of disk blocks. A column having a size greater than or equal to the greatest size of any disk block is allowed a multiple disk block assignment.
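The assignment rule can be sketched as below. This is an illustrative heuristic, not the disclosed optimization: the first-fit-decreasing placement of small columns, the uniform `block_size`, and the name `assign_columns` are assumptions, and the sketch only approximates an even spread of data across blocks.

```python
import math


def assign_columns(column_sizes, block_size):
    """Assign columns to disk blocks.

    column_sizes: dict mapping column name -> size in bytes.
    block_size: capacity of every disk block (assumed uniform).
    Columns at least as large as a block are split across several
    blocks; smaller columns are placed whole, largest first, into the
    first block with enough free space.
    Returns (assignment, blocks), where assignment maps each column to
    a list of block indices.
    """
    blocks = []  # each block: {"free": int, "columns": [(name, bytes)]}
    assignment = {}
    small = []
    for name, size in column_sizes.items():
        if size >= block_size:
            remaining = size
            ids = []
            for _ in range(math.ceil(size / block_size)):
                chunk = min(block_size, remaining)
                blocks.append(
                    {"free": block_size - chunk, "columns": [(name, chunk)]}
                )
                ids.append(len(blocks) - 1)
                remaining -= chunk
            assignment[name] = ids
        else:
            small.append((name, size))
    for name, size in sorted(small, key=lambda c: c[1], reverse=True):
        for index, block in enumerate(blocks):
            if block["free"] >= size:
                block["free"] -= size
                block["columns"].append((name, size))
                assignment[name] = [index]
                break
        else:
            # No existing block has room, so open a new one.
            blocks.append(
                {"free": block_size - size, "columns": [(name, size)]}
            )
            assignment[name] = [len(blocks) - 1]
    return assignment, blocks


assignment, blocks = assign_columns({"a": 10, "b": 3, "c": 4, "d": 2}, 4)
```

In this example column `a` spans three blocks, `c` fills exactly one, and the small column `d` shares the partly filled last block of `a`.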
Abstract:
An object tracking technique is provided which, given (i) a potentially large data set, (ii) a set of dimensions along which the data has been ordered, and (iii) a set of functions for measuring the similarity between data elements, produces a set of objects. Each of these objects is defined by a list of data elements. Each of the data elements on this list contains the probability that the data element is part of the object. The method produces these lists via an adaptive, knowledge-based search function which directs the search for high-probability data elements. This serves to reduce the number of data element combinations evaluated while preserving the most flexibility in defining the associations of data elements which comprise an object.