Abstract:
Systems, methods, and articles of manufacture are disclosed for building and executing analytics solutions. Such a solution may be comprehensive (e.g., risk assessment, fraud detection, dynamic operational risk evaluation, or regulatory compliance assessment). The analytics solution may perform an analytics task using operational data distributed across independently created and governed data repositories in different departments of an organization. A framework is disclosed that allows a user (e.g., a risk analyst) to compose analytical tools that access data from a variety of sources (both internal and external to an enterprise) and perform a variety of analytic functions.
Abstract:
A method and structure for storing information for one or more semantic objects derived from raw data. A semantic object that has been extracted from the raw data and classified is received, the received semantic object having one or more attributes. Three items are then generated: a summary of attributes of the semantic object, produced by calculating one or more statistics of one or more of its attributes; a confidence level that quantifies the degree of certainty that the received semantic object has been correctly classified and/or labeled; and a compact representation of the raw data of the received semantic object. Indexing information for one or more of the summary of attributes and the compact representation of the semantic object is generated. The semantic object, along with its associated summary of attributes, confidence level, compact representation, and indexing information, is stored in a semantic object database associated with a database storing the raw data.
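The storage steps in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed structure: the names `SemanticRecord`, `summarize`, and `SemanticObjectStore` are hypothetical, the statistics (mean, min, max) are one possible choice, zlib compression stands in for the unspecified compact representation, and the index keyed on the rounded mean is an assumption.

```python
import statistics
import zlib
from dataclasses import dataclass


@dataclass
class SemanticRecord:
    label: str                              # class assigned to the object
    confidence: float                       # certainty that the label is correct
    summary: dict[str, dict[str, float]]    # per-attribute statistics
    compact: bytes                          # compact form of the raw data (assumed: zlib)


def summarize(attributes: dict[str, list[float]]) -> dict[str, dict[str, float]]:
    """Compute a few statistics for every non-empty numeric attribute."""
    return {
        name: {"mean": statistics.fmean(values), "min": min(values), "max": max(values)}
        for name, values in attributes.items()
        if values
    }


class SemanticObjectStore:
    """In-memory stand-in for the semantic object database."""

    def __init__(self) -> None:
        self.records: list[SemanticRecord] = []
        # Hypothetical index: (attribute name, rounded mean) -> record positions.
        self.index: dict[tuple[str, int], list[int]] = {}

    def put(self, label: str, confidence: float,
            attributes: dict[str, list[float]], raw: bytes) -> int:
        record = SemanticRecord(label, confidence, summarize(attributes), zlib.compress(raw))
        position = len(self.records)
        self.records.append(record)
        for name, stats in record.summary.items():
            self.index.setdefault((name, round(stats["mean"])), []).append(position)
        return position
```

A caller would invoke `put` once per classified object and later look up candidates through `index` before touching the raw-data store.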
Abstract:
Techniques are provided for enabling execution of a process employing a cache. Method steps can include obtaining a first probability of accessing a given artifact in a state Si, obtaining a second probability of using a predicate from a current state Sc in the state Si, determining a benefit of prefetching the given artifact using the predicate based on at least the first probability and the second probability, and determining whether and/or when a cache replacement should be conducted, based at least on the benefit determined.
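One way to picture the benefit computation is the sketch below. The abstract does not give a formula, so the product of the two probabilities and a fetch cost is an assumption, as are the names `Candidate`, `prefetch_benefit`, and `plan_replacement` and the reading of the second probability as the chance of moving from Sc toward Si.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    artifact: str
    p_access: float    # first probability: the artifact is accessed in state Si
    p_reach: float     # second probability: the predicate is used from Sc toward Si
    fetch_cost: float  # hypothetical latency saved if the artifact is already cached


def prefetch_benefit(c: Candidate) -> float:
    """Assumed benefit model: expected saving = p_access * p_reach * fetch_cost."""
    return c.p_access * c.p_reach * c.fetch_cost


def plan_replacement(cache: dict[str, float], candidates: list[Candidate],
                     capacity: int) -> dict[str, float]:
    """Return the cache contents after keeping the `capacity` highest-benefit entries.

    `cache` maps already-cached artifacts to their previously computed benefit.
    """
    merged = dict(cache)
    for c in candidates:
        merged[c.artifact] = max(merged.get(c.artifact, 0.0), prefetch_benefit(c))
    best = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)[:capacity]
    return dict(best)
```

In this sketch a replacement happens whenever a candidate's benefit exceeds that of the weakest cached entry; deciding when to run the check is left to the caller.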
Abstract:
A computer-implemented method, computer program product, and data processing system for optimizing a layout of a relational database on a solid state disk. The optimized layout comprises forming a plurality of column to disk block assignments, wherein each disk block is assigned substantially the same amount of column data. A column having a size less than a greatest size of any disk block is assigned to one of a plurality of disk blocks. A column having a size greater than or equal to the greatest size of any disk block is allowed a multiple disk block assignment.
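The assignment rule can be illustrated with a simple greedy heuristic. This is a sketch under stated assumptions, not the optimization the abstract claims: the function name `assign_columns` is hypothetical, all blocks are assumed to share one size, columns are placed largest first onto the least-loaded block, and block capacity is used only to decide when a column is split.

```python
import heapq


def assign_columns(column_sizes: dict[str, int], num_blocks: int,
                   block_size: int) -> dict[int, list[tuple[str, int]]]:
    """Return block index -> list of (column name, amount of that column placed there)."""
    layout: dict[int, list[tuple[str, int]]] = {i: [] for i in range(num_blocks)}
    heap = [(0, i) for i in range(num_blocks)]  # (current load, block index)
    heapq.heapify(heap)

    # Largest columns first, so small columns can even out the loads afterwards.
    for name, size in sorted(column_sizes.items(), key=lambda kv: -kv[1]):
        if size < block_size:
            pieces = [size]  # small column: exactly one block
        else:
            # Large column: split into pieces of at most block_size.
            full, rest = divmod(size, block_size)
            pieces = [block_size] * full + ([rest] if rest else [])
        for piece in pieces:
            load, block = heapq.heappop(heap)  # least-loaded block so far
            layout[block].append((name, piece))
            heapq.heappush(heap, (load + piece, block))
    return layout
```

For example, `assign_columns({"a": 10, "b": 4, "c": 2}, num_blocks=2, block_size=8)` splits column `a` across both blocks and places `b` and `c` whole.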
Abstract:
An object tracking technique is provided which produces a set of objects given: (i) a potentially large data set; (ii) a set of dimensions along which the data has been ordered; and (iii) a set of functions for measuring the similarity between data elements. Each of these objects is defined by a list of data elements. Each data element on this list carries the probability that it is part of the object. The method produces these lists via an adaptive, knowledge-based search function which directs the search for high-probability data elements. This reduces the number of data element combinations evaluated while preserving the most flexibility in defining the associations of data elements which comprise an object.
Abstract:
A system and method that registers and stores data and responds to queries through management of inferencing-enabled metadata includes an intelligent database, which receives data or queries and manages data models. An ontology management system is associated with the intelligent database and receives and stores classes of information related to a data model therein to be employed in satisfying queries. A relational database is associated with the intelligent database and receives and stores attribute schema for instances of the class having at least one attribute value linked with the class in the ontology management system.
Abstract:
A method, system, and computer program for enabling parametric searches on source data using a text search engine. The invention is generally divided into a build-time process and a run-time process. During the build-time process, a crawler extracts data units from source data. A data translator then translates data units into keyword parametric entries that are submitted to the text search engine. During the run-time process, a query translator translates parametric search queries into keyword search entries. A metadata refiner then filters intermediate search results from the search engine based on the parametric search query.
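The build-time and run-time split can be sketched as below. Everything here is illustrative: `TinyTextEngine` is a toy inverted index standing in for a real text search engine, the `name__value` token format is an assumed encoding, and treating range constraints as the refiner's job is one plausible reading of the abstract.

```python
from collections import defaultdict


def to_keywords(unit: dict[str, object]) -> set[str]:
    """Data/query translator: encode each attribute/value pair as one keyword token."""
    return {f"{name}__{value}".lower() for name, value in unit.items()}


class TinyTextEngine:
    """Toy stand-in for a text search engine: an inverted index over tokens."""

    def __init__(self) -> None:
        self.index: dict[str, set[int]] = defaultdict(set)

    def add(self, doc_id: int, keywords: set[str]) -> None:
        for kw in keywords:
            self.index[kw].add(doc_id)

    def search(self, keywords: set[str]) -> set[int]:
        """Return ids matching all keywords (empty set when no keywords are given)."""
        hits = [self.index.get(kw, set()) for kw in keywords]
        return set.intersection(*hits) if hits else set()


def parametric_search(engine: TinyTextEngine, units: dict[int, dict[str, object]],
                      equals: dict[str, object],
                      ranges: dict[str, tuple[float, float]]) -> list[int]:
    # Run time, step 1: equality constraints become keyword search entries.
    candidates = engine.search(to_keywords(equals))
    # Run time, step 2: the refiner filters intermediate results on range constraints.
    return sorted(
        doc_id for doc_id in candidates
        if all(lo <= units[doc_id][name] <= hi for name, (lo, hi) in ranges.items())
    )
```

At build time a crawler would call `engine.add(doc_id, to_keywords(unit))` for each extracted data unit; `parametric_search` then covers the run-time path.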
Abstract:
A method and system for storing a semantic object includes summarizing the attributes of a semantic object, indexing the summary of attributes, and storing the summary of attributes and the index of the summary of attributes.
Abstract:
Systems, methods and services for generating autonomous persistent storage systems that are self-configurable and self-managing, based on user-submitted entity definitions. For example, systems and methods are provided for automatically creating and updating persistent storage structures based on entity definitions, automatically populating persistent storage space with instance data of defined entities, automatically generating and adapting methods for accessing instance data in persistent storage, searching instance data and automatically optimizing search methods for instance data, and automatically creating and managing a cache of frequently accessed instance data.
Abstract:
Electronic marketplaces typically apply catalog schema in the format of name-value pairs to store product attribute names and values to achieve a very high degree of flexibility. This vertical schema approach prevents traditional relational database management systems from accurately estimating constraint selectivity and generating efficient query plans. In this invention, methods and systems are disclosed for building and maintaining external histograms, which a query planner uses to assist query planning in relational databases.
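A minimal sketch of the idea, assuming the vertical table holds `(product_id, attribute_name, attribute_value)` rows: a separate histogram is kept per attribute name, outside the database's own statistics, and the planner evaluates the most selective constraint first. The function names and the exact-count histogram are assumptions; a real system would likely bucket values and update the histograms incrementally.

```python
from collections import Counter, defaultdict


def build_histograms(rows):
    """Build one value histogram per attribute name from vertical-schema rows.

    rows: iterable of (product_id, attribute_name, attribute_value).
    Returns (histograms, number of distinct products).
    """
    histograms = defaultdict(Counter)
    products = set()
    for product_id, name, value in rows:
        histograms[name][value] += 1
        products.add(product_id)
    return histograms, len(products)


def selectivity(histograms, total, name, value):
    """Estimated fraction of products satisfying attribute `name` == `value`."""
    if not total:
        return 0.0
    return histograms.get(name, Counter())[value] / total


def order_constraints(histograms, total, constraints):
    """Planner hint: evaluate the most selective (smallest fraction) constraint first."""
    return sorted(constraints, key=lambda c: selectivity(histograms, total, *c))
```

Without such per-attribute statistics, a single histogram over the generic value column mixes colors, sizes, and prices together, which is why the built-in estimates are unreliable for this schema.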