Abstract:
A system and method for ensuring large and frequent updates to a data warehouse. The process leverages a set of temporary staging tables to track the updates. A set of intermediate steps are performed to accomplish bulk deletions of the outdated changed records, and perform modifications to the map tables for models such as snowflake. Finally, bulk load operations load the updates and insert them into the final dimension tables. The process ensures performance comparable to insertion-only schemes with at most only slight performance degradation. Furthermore, a modified process is applied on the newfact data warehouse dimension model. The process can be readily adapted to handle star schema and other hierarchical data warehouse models.
Abstract:
A method of data loading for large information warehouses includes performing checkpointing concurrently with data loading into an information warehouse, the checkpointing ensuring consistency among multiple tables; and recovering from a failure in the data loading using the checkpointing. A method is also disclosed for performing versioning concurrently with data loading into an information warehouse. The versioning method enables processing undo and redo operations of the data loading between a later version and a previous version. Data load failure recovery is performed without starting a data load from the beginning but rather from a latest checkpoint for data loading at an information warehouse level using a checkpoint process characterized by a state transition diagram having a multiplicity of states; and tracking state transitions among the states using a system state table.
Abstract:
Techniques for reducing a number of computations in a data storage process are provided. One or more computational elements are identified in the data storage process. An ordered structure of one or more nodes is generated using the one or more computational elements. Each of the one or more nodes represents one or more computational elements. Further, a weight is assigned to each of the one or more nodes. An ordered structure of one or more reusable nodes is generated by deleting one or more nodes in accordance with the assigned weights. The ordered structure of one or more reusable nodes is utilized to reduce the number of computations in the data storage process. The data storage process converts data from a first format into a second format, and stores the data in the second format on a computer readable medium for data analysis purposes.
Abstract:
Data may be modeled as an undirected graph. A set of entities and a set of attributes may be defined. A set of relationships may be defined to represent semantic associations with each association connecting at least two entities. Attributes may be associated with entities rather than with relationships. A hierarchical query language with a set of atomic operations on modeled data may be employed. The modeled data may be displayed on a display unit.
Abstract:
A method for use with an aggregation operation (e.g., on a relational database table) includes a sorting pass and a merging pass. The sorting pass includes: (a) reading blocks of the table from a storage medium into a memory using an aggregation method until the memory is substantially full or until all the data have been read into the memory; (b) determining a number k of blocks to write back to the storage medium from the memory; (c) selecting k blocks from memory, sorting the k blocks, and then writing the k blocks back to the storage medium as a new sublist; and (d) repeating steps (a), (b), and (c) for any unprocessed tuples in the database table. The merging pass includes: merging all the sublists to form an aggregation result using a merge-sort algorithm.
Abstract:
One embodiment is a computer-implemented method for classifying documents in a collection of documents according to their intended readerships. The method comprises using a computer to select a document in the collection of documents; and using a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. The method further includes using a computer to classify the selected document as misleading, commercial, or personal according to its determined characteristic; and using a computer to repeat the steps of select document, determine a characteristic of the selected document, and classify the selected document for additional documents in the collection. At least some documents are classified as misleading, at least some documents are classified as commercial, and at least some documents are classified as personal. Other methods and computer program products are also disclosed according to even more embodiments.
Abstract:
A method for use with an aggregation operation (e.g., on a relational database table) includes a sorting pass and a merging pass. The sorting pass includes: (a) reading blocks of the table from a storage medium into a memory using an aggregation method until the memory is substantially full or until all the data have been read into the memory; (b) determining a number k of blocks to write back to the storage medium from the memory; (c) selecting k blocks from memory, sorting the k blocks, and then writing the k blocks back to the storage medium as a new sublist; and (d) repeating steps (a), (b), and (c) for any unprocessed tuples in the database table. The merging pass includes: merging all the sublists to form an aggregation result using a merge-sort algorithm.
Abstract:
The present invention provides a method, Web server and computer system for converging a desktop application and a Web application. The method may comprise: in response to a request from a client user for using a target desktop application, starting a desktop application initialization process on the Web server and determining an appropriate corresponding hosting server for the user; preparing and provisioning desktop application environment on the corresponding hosting server and starting the target desktop application; transmitting the corresponding hosting server's address to the client so as to make desktop application interaction between the client and the corresponding hosting server; and in response to the completion of the desktop application interaction, stopping and exiting the target desktop application on the corresponding hosting server.
Abstract:
Among the various aspects of the present disclosure is the provision of methods of detecting biomarkers of endoplasmic reticulum (ER) stress-associated kidney diseases. Another aspect of the present disclosure provides for a method of treating an endoplasmic reticulum (ER) stress-associated kidney disease in a subject.
Abstract:
In general, this disclosure describes techniques for coding video data for random access. In particular, this disclosure proposes to code a syntax element that indicates if a dependent picture may be successfully decoded in the event of a random access request to a clean decoding refresh (CDR) picture and may be required for decoding the pictures following the clean decoding refresh (CDR) picture in display order.