Abstract:
The systems and methods partition digital data units in a content-aware fashion without relying on any ancestry information. This makes it possible to find duplicate chunks in unrelated units of digital data, even across millions of documents spread across thousands of computer systems.
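One common way to partition data by content alone is content-defined chunking with a rolling hash. The sketch below illustrates that general idea and is not necessarily the method claimed here. The window size, boundary mask, hash constants, and the names `chunk` and `fingerprints` are all assumptions chosen for illustration.

```python
import hashlib

WINDOW = 16          # bytes covered by the rolling hash (assumed value)
MASK = (1 << 6) - 1  # cut when the low 6 hash bits are zero, about 64-byte chunks (assumed)
BASE = 257           # polynomial hash base (assumed)
MOD = (1 << 31) - 1  # polynomial hash modulus (assumed)


def chunk(data: bytes) -> list:
    """Split data at boundaries chosen only from the last WINDOW bytes seen."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW, MOD)  # weight of the byte that leaves the window
    for i, b in enumerate(data):
        h = (h * BASE + b) % MOD
        if i >= WINDOW:
            # Drop the byte that just slid out of the window.
            h = (h - data[i - WINDOW] * top) % MOD
        if i + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


def fingerprints(data: bytes) -> set:
    """Digest each chunk so duplicates can be found by set intersection."""
    return {hashlib.sha256(c).hexdigest() for c in chunk(data)}
```

Because each boundary depends only on nearby bytes, two documents that share a long run of content produce the same chunks inside that run, even when the documents have no known relationship. Intersecting `fingerprints(a)` and `fingerprints(b)` then reveals the shared chunks.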
Abstract:
A limited-access database system is designed for rapid access to data records with reduced memory storage requirements. The database system employs a set of obfuscated data records stored in data crystals that can only be accessed and read by an iterator, which is not directly accessible to the users of the database. The iterator accesses information in response to a predefined query sent from a customer application. Rather than providing customers with general tools for constructing any possible query, as structured query language (SQL) database systems do, database systems embodying the present invention allow customer applications to use only predefined types of queries. By restricting the types of queries customer applications can call, valuable data records remain secure from unauthorized reconstruction or duplication while still allowing limited access for specific purposes.
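The restriction to predefined queries can be sketched as a store that exposes only a registry of named queries. This is a minimal illustration, not the patented design: the class `LimitedAccessStore`, its methods, and the sample query are hypothetical, and the obfuscation of records in data crystals is not modeled. Python also cannot enforce the privacy of `_records`; a real system would need a process or service boundary.

```python
class LimitedAccessStore:
    """Holds records privately and answers only registered query types."""

    def __init__(self, records):
        self._records = list(records)  # not meant to be read by customers
        self._queries = {}             # query name -> function(iterator, **params)

    def register(self, name, fn):
        """Operator-side call that defines an allowed query type."""
        self._queries[name] = fn

    def _iterate(self):
        # Stand-in for the internal iterator; customers never receive it.
        yield from self._records

    def run(self, name, **params):
        """Customer-side entry point: only predefined queries are accepted."""
        if name not in self._queries:
            raise PermissionError(f"query {name!r} is not predefined")
        return self._queries[name](self._iterate(), **params)


def count_in_zip(records, zip_code):
    """Example predefined query: returns a count, never the records themselves."""
    return sum(1 for r in records if r["zip"] == zip_code)


store = LimitedAccessStore([
    {"name": "A", "zip": "10001"},
    {"name": "B", "zip": "10001"},
    {"name": "C", "zip": "94105"},
])
store.register("count_in_zip", count_in_zip)
```

A customer application would call `store.run("count_in_zip", zip_code="10001")`. Any query name that was not registered is rejected, so customers cannot compose arbitrary queries to reconstruct the record set.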
Abstract:
A computer method is provided for online mining of quantitative association rules in two stages: a preprocessing stage followed by an online rule generation stage. The preprocessing stage reduces the required computational effort by organizing the relationships between antecedent attributes into a hierarchically arranged multidimensional indexing structure. The second stage, online rule generation, uses this index structure to first find the areas in the data that correspond to the rules. A merging step then combines the interesting regions into a merged tree, which gives a hierarchical representation of the rule set. The merged tree is then used to generate the rules.
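To make the notion of a quantitative association rule concrete, the sketch below evaluates one rule whose antecedent is a set of numeric ranges. It computes only the standard support and confidence measures by scanning the rows. It does not implement the multidimensional index or the merged tree described above. The function name `rule_stats` and the sample attributes are assumptions.

```python
def rule_stats(rows, ranges, consequent):
    """Return (support, confidence) for: all ranges hold => consequent holds.

    rows       -- list of dicts, one per record
    ranges     -- {attribute: (low, high)}, low inclusive and high exclusive
    consequent -- {attribute: required value}
    """
    matching = [
        r for r in rows
        if all(lo <= r[attr] < hi for attr, (lo, hi) in ranges.items())
    ]
    both = [
        r for r in matching
        if all(r[attr] == value for attr, value in consequent.items())
    ]
    support = len(both) / len(rows) if rows else 0.0
    confidence = len(both) / len(matching) if matching else 0.0
    return support, confidence


rows = [
    {"age": 25, "income": 40, "buys": False},
    {"age": 35, "income": 60, "buys": True},
    {"age": 38, "income": 70, "buys": True},
    {"age": 33, "income": 55, "buys": False},
    {"age": 50, "income": 90, "buys": True},
]
# Rule: age in [30, 40) and income in [50, 80) => buys is True
support, confidence = rule_stats(
    rows, {"age": (30, 40), "income": (50, 80)}, {"buys": True}
)
```

Here three rows satisfy the antecedent and two of them also satisfy the consequent, so the support is 2/5 and the confidence is 2/3. An index over the antecedent attributes would serve to locate such regions without a full scan.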
Abstract:
Computerized tools are provided for modeling database designs and specifying queries of the data contained therein. Once it is determined that an information system needs to be created, the Fact Compiler (100) of the present invention is invoked to create it. After creating the information system, the user creates a fact-tree, using the Fact_Tree_Specification Module (300), as a prelude to generating queries to the system. The user then verifies that the fact-tree is correct, using the Tree Interpreter, invoked as the Fact_Tree_to_Description Module (500). Once the fact-tree has been verified, the Query Mapper, invoked as the Fact_Tree_to_SQL_Query Module (400), is used to generate information system queries.
Abstract:
A process receives an object-based query and creates a logical tree that contains nodes representing operations that are required for the query to be completed. Operations that can be performed by an RDBMS are transmitted to the RDBMS as an SQL query. The RDBMS executes the SQL query and returns data to the process. The process places the data into appropriate fields of one or more objects, and stores the resulting objects in a memory, such as an object cache. The process executes the remaining node operations (that could not be performed by the RDBMS) in conjunction with the objects stored in the object cache, and forwards the results to a user program.
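The split between operations pushed to the RDBMS and operations run against cached objects can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the process described in the abstract: the `employee` table, the `Employee` class, and the functions below are hypothetical, and the logical tree is reduced to one pushed-down filter plus one residual predicate.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Employee:
    id: int
    name: str
    salary: float


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "Anna", 90.0), (2, "Boris", 120.0), (3, "Otto", 110.0), (4, "Hannah", 40.0)],
    )
    return conn


def run_object_query(conn, min_salary, residual):
    """Push the salary filter to SQL, then apply `residual` to cached objects."""
    cache = {}  # object cache keyed by primary key
    sql = "SELECT id, name, salary FROM employee WHERE salary >= ? ORDER BY id"
    for row in conn.execute(sql, (min_salary,)):
        cache[row[0]] = Employee(*row)  # place returned data into object fields
    # Remaining operation that was not sent to the RDBMS.
    return [obj for obj in cache.values() if residual(obj)]


def is_palindrome(employee: Employee) -> bool:
    """Example of an operation evaluated in the process rather than in SQL."""
    s = employee.name.lower()
    return s == s[::-1]
```

Calling `run_object_query(make_demo_db(), 80.0, is_palindrome)` sends the salary comparison to the database and evaluates the palindrome test in memory on the cached `Employee` objects.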
Abstract:
A system that implements a scalable data storage service may maintain tables in a non-relational data store on behalf of clients. The system may provide a Web services interface through which service requests are received, and an API usable to request that a table be created, deleted, or described; that an item be stored, retrieved, deleted, or its attributes modified; or that a table be queried (or scanned) with filtered items and/or their attributes returned. An asynchronous workflow may be invoked to create or delete a table. Items stored in tables may be partitioned and indexed using a simple or composite primary key. The system may not impose pre-defined limits on table size, and may employ a flexible schema. The service may provide a best-effort or committed throughput model. The system may automatically scale and/or re-partition tables in response to detecting workload changes, node failures, or other conditions or anomalies.
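A table keyed by a composite primary key with a flexible schema can be sketched in memory as follows. This is an assumed toy model, not the service's actual API: the `Table` class, its method names, and the fixed partition count are hypothetical, and scaling, re-partitioning, and throughput models are not represented.

```python
import hashlib


class Table:
    """Items are addressed by (hash_key, range_key) and may hold any attributes."""

    def __init__(self, partitions: int = 4):
        # Each partition maps hash_key -> {range_key: attributes}.
        self._parts = [dict() for _ in range(partitions)]

    def _part(self, hash_key: str) -> dict:
        # Stable hash so an item always lands in the same partition.
        digest = hashlib.md5(hash_key.encode("utf-8")).digest()
        return self._parts[int.from_bytes(digest[:4], "big") % len(self._parts)]

    def put(self, hash_key, range_key, attrs):
        self._part(hash_key).setdefault(hash_key, {})[range_key] = dict(attrs)

    def get(self, hash_key, range_key):
        return self._part(hash_key).get(hash_key, {}).get(range_key)

    def delete(self, hash_key, range_key):
        self._part(hash_key).get(hash_key, {}).pop(range_key, None)

    def query(self, hash_key, lo=None, hi=None):
        """Return (range_key, attrs) pairs for one hash key, ordered by range key."""
        items = self._part(hash_key).get(hash_key, {})
        return [
            (rk, items[rk])
            for rk in sorted(items)
            if (lo is None or rk >= lo) and (hi is None or rk <= hi)
        ]
```

For example, after `t = Table()` and `t.put("user#1", 20, {"text": "b", "tags": ["x"]})`, the call `t.query("user#1", lo=15)` returns only the items of that user whose range key is at least 15. No attribute list is declared up front, which mirrors the flexible schema mentioned in the abstract.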
Abstract:
A data repository system and method are provided. A method in accordance with an embodiment includes an operation that can be used to port data from one or more existing database partitions to new database partitions according to a minimally progressive hash. The method can be used to increase the overall size of databases while a system runs hot, with little or no downtime.
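The abstract does not define the minimally progressive hash, so the sketch below substitutes a different, well-known technique with a similar goal: jump consistent hash (Lamping and Veach). It is offered only to illustrate how adding a partition can move a small share of keys, all of them onto the new partition. The function names are assumptions.

```python
import hashlib


def jump_hash(key: int, buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, buckets)."""
    b, j = -1, 0
    while j < buckets:
        b = j
        # 64-bit linear congruential step, as in the published algorithm.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (1 << 31) / ((key >> 33) + 1))
    return b


def partition_for(record_id: str, partitions: int) -> int:
    """Hash a string identifier to 64 bits, then pick its partition."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return jump_hash(int.from_bytes(digest[:8], "big"), partitions)


def keys_to_port(record_ids, old_count: int, new_count: int):
    """List the records whose partition changes when the partition count grows."""
    return [
        r for r in record_ids
        if partition_for(r, old_count) != partition_for(r, new_count)
    ]
```

When the count grows from n to n + 1 partitions, every record returned by `keys_to_port` is assigned to the new partition, and all other records stay where they are. Only the moved records need to be ported, which is the kind of property that lets a database grow with little or no downtime.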