Abstract:
A computer program product embodied on a computer-readable medium for extracting semantic information about a plurality of documents accessible via a computer network, the computer program product including computer-executable instructions for: generating a plurality of tokens from at least one of the documents, each token being indicative of a displayed item and a corresponding position; and constructing at least one parse tree indicative of a semantic structure of the at least one document from the tokens, in dependence upon a grammar indicative of presentation conventions.
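The two steps named in this abstract (positioned tokens, then a parse tree built under a presentation grammar) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `Token` fields, the token kinds `label` and `input`, and the single toy grammar rule "a label followed by an input on the same row forms a field" are hypothetical and are not taken from the abstract.

```python
from collections import namedtuple

# Hypothetical token: a displayed item plus its position on the page.
Token = namedtuple("Token", "kind text x y")

def parse_form(tokens):
    """Build a small parse tree from positioned tokens.

    Toy grammar (an assumption, not the patented grammar):
        form  -> field*
        field -> label input   (same row, label to the left of input)
    """
    # Group tokens into rows by their vertical position.
    rows = {}
    for tok in tokens:
        rows.setdefault(tok.y, []).append(tok)

    fields = []
    for y in sorted(rows):
        row = sorted(rows[y], key=lambda t: t.x)  # left-to-right order
        i = 0
        while i < len(row):
            is_pair = (
                i + 1 < len(row)
                and row[i].kind == "label"
                and row[i + 1].kind == "input"
            )
            if is_pair:
                fields.append(("field", row[i].text, row[i + 1].text))
                i += 2
            else:
                i += 1  # token does not match the rule; skip it
    return ("form", fields)
```

The tree is represented as nested tuples to keep the sketch short; a real grammar would have many more productions and would need to tolerate inexact alignment between items.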
Abstract:
A method for positioning in a communication network with cellular coverage is disclosed, comprising the following steps: receiving location-dependent data concerning a mobile device; from a plurality of fingerprints, each of which corresponds to one of the locations within the coverage, retrieving the one having the highest similarity to the location-dependent data; and determining the location corresponding to the fingerprint with the highest similarity as the mobile device's location if the highest similarity exceeds a predetermined threshold.
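The retrieve-and-threshold steps can be sketched as follows. The abstract does not say how similarity is computed, so the measure used here (the inverse of one plus the Euclidean distance between per-cell signal levels) is an assumption, as are the function names and the dictionary layout of a fingerprint.

```python
import math

def similarity(measurement, fingerprint):
    """Assumed similarity measure: 1 / (1 + Euclidean distance).

    Both arguments map a cell identifier to a signal level; a cell missing
    from one side is treated as level 0.0.
    """
    cells = set(measurement) | set(fingerprint)
    dist = math.sqrt(
        sum((measurement.get(c, 0.0) - fingerprint.get(c, 0.0)) ** 2 for c in cells)
    )
    return 1.0 / (1.0 + dist)

def locate(measurement, fingerprints, threshold):
    """Return the location of the best-matching fingerprint, or None.

    `fingerprints` maps a location to its stored fingerprint. The best
    match is accepted only if its similarity exceeds `threshold`.
    """
    best_location, best_score = None, 0.0
    for location, fingerprint in fingerprints.items():
        score = similarity(measurement, fingerprint)
        if score > best_score:
            best_location, best_score = location, score
    return best_location if best_score > threshold else None
```

Returning `None` when the threshold is not met leaves it to the caller to fall back to another positioning method.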
Abstract:
A method of data loading for large information warehouses includes performing checkpointing concurrently with data loading into an information warehouse, the checkpointing ensuring consistency among multiple tables; and recovering from a failure in the data loading using the checkpointing. A method is also disclosed for performing versioning concurrently with data loading into an information warehouse. The versioning method enables processing undo and redo operations of the data loading between a later version and a previous version. Data load failure recovery is performed not by restarting a data load from the beginning but by resuming from the latest checkpoint at the information warehouse level, using a checkpoint process characterized by a state transition diagram having a multiplicity of states, with transitions among the states tracked using a system state table.
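Resuming from the latest checkpoint, with a state table recording transitions, can be sketched in memory as below. The abstract does not name the states, so `LOADING`, `CHECKPOINTED`, `FAILED`, and `DONE` are hypothetical, as are the function name, the `fail_at` test hook, and the dictionary-based warehouse.

```python
def load_with_checkpoints(batches, warehouse, state_table, fail_at=None):
    """Load batches into `warehouse`, checkpointing after each batch.

    `batches` is a list of {table_name: rows} dicts; each batch is applied
    to all of its tables before the checkpoint advances, so the tables stay
    mutually consistent at every checkpoint. `state_table` is a dict acting
    as a stand-in for a system state table. `fail_at` (hypothetical hook)
    simulates a failure just before the given batch index is applied.
    """
    # Resume after the last batch that reached a checkpoint.
    start = state_table.get("last_committed_batch", -1) + 1
    for i in range(start, len(batches)):
        state_table["state"] = "LOADING"
        if fail_at == i:
            state_table["state"] = "FAILED"
            raise RuntimeError(f"simulated failure at batch {i}")
        for table, rows in batches[i].items():
            warehouse.setdefault(table, []).extend(rows)
        state_table["last_committed_batch"] = i
        state_table["state"] = "CHECKPOINTED"
    state_table["state"] = "DONE"
```

Calling the function again with the same `warehouse` and `state_table` after a failure continues from the batch following the last checkpoint instead of reloading everything.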
Abstract:
The present invention relates to a method and a positioning server in a radio access network for collecting radio fingerprint positioning data records from different nodes. The data records comprise geographical positions and radio network communication parameters and are stored and grouped in clusters in the positioning server. A geometrical shape representing a geographical area is computed based on the network communication parameters in the collected positioning data records in the cluster. When a notification about changed network configuration parameters for a radio cell is received, known positioning servers simply erase all positioning data records related to that radio cell. This makes still-valid positioning data unavailable. The present invention overcomes this problem by selectively erasing positioning data records in the positioning server only when the configuration parameters have changed beyond a predefined value range.
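The selective-erase rule can be sketched as a simple tolerance check. The record layout, the parameter names, and the per-parameter `tolerances` dictionary are assumptions chosen for illustration; the abstract only states that erasure happens when parameters change beyond a predefined value range.

```python
def prune_records(records, cell_id, old_config, new_config, tolerances):
    """Erase records for `cell_id` only if its configuration moved too far.

    `records` is a list of dicts with a "cell_id" key. `old_config` and
    `new_config` map parameter names to numeric values, and `tolerances`
    maps each checked parameter to the largest change that still leaves
    the stored records valid (all hypothetical structures).
    """
    changed_beyond_range = any(
        abs(new_config[name] - old_config[name]) > limit
        for name, limit in tolerances.items()
    )
    if not changed_beyond_range:
        return list(records)  # small change: keep the still-valid data
    return [r for r in records if r["cell_id"] != cell_id]
```

A small change therefore preserves the collected fingerprints, whereas a naive server would discard them on every notification.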
Abstract:
Embodiments of the invention provide a method and computer program products for information retrieval from multiple documents by proximity searching for search queries. A method includes generating an index for the multiple documents, wherein the index includes words in snippets in the documents. An input search query is processed against the index by searching query terms over the snippets to introduce term proximity information implicitly in the information retrieval. Results of multiple sentence-level search operations are combined as output.
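A minimal sketch of snippet-level indexing follows, assuming that a snippet is a sentence and that a document matches only when all query terms occur in the same snippet. The tokenization, the sentence splitter, and the ranking by number of matching snippets are illustrative assumptions, not details from the abstract.

```python
import re
from collections import Counter, defaultdict

def build_snippet_index(docs):
    """Map each word to the set of (doc_id, sentence_number) it occurs in."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        for number, sentence in enumerate(sentences):
            for word in re.findall(r"\w+", sentence.lower()):
                index[word].add((doc_id, number))
    return index

def search(index, query):
    """Return doc ids ranked by how many snippets contain every query term.

    Requiring co-occurrence inside one snippet gives term proximity
    implicitly, without storing word offsets.
    """
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    hits = set.intersection(*postings)
    counts = Counter(doc_id for doc_id, _ in hits)
    # Most matching snippets first; ties broken by doc id for stability.
    return [d for d, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
```

A document that contains both query terms but in different sentences is not returned, which is the proximity effect the snippet index is meant to provide.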
Abstract:
Methods are disclosed of processing a set-level query across one or more attributes, the query being grouped by one or more attributes, whereby groups that satisfy the set-level query may be aggregated over one or more attributes. The methods use bitwise arithmetic to efficiently traverse bitmap and bit-slice vectors and indexes of a data relation to determine groups that solve the set-level query.
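One simple kind of set-level query asks which groups contain every value in a required set. The sketch below answers it with bitmap vectors held in Python integers, where bit i of a vector is set when row i carries that value. The row layout, the column names, and the function names are assumptions; the disclosed methods also cover bit-slice indexes and aggregation, which are not shown.

```python
def build_bitmaps(rows, column):
    """Return {value: bitmap} where bit i is set if rows[i][column] == value."""
    bitmaps = {}
    for i, row in enumerate(rows):
        value = row[column]
        bitmaps[value] = bitmaps.get(value, 0) | (1 << i)
    return bitmaps

def groups_containing_all(rows, group_col, value_col, required):
    """Return the groups whose rows cover every value in `required`.

    For each group, a bitwise AND of the group's bitmap with each required
    value's bitmap is non-zero exactly when the group has a row with that
    value.
    """
    group_bitmaps = build_bitmaps(rows, group_col)
    value_bitmaps = build_bitmaps(rows, value_col)
    return sorted(
        group
        for group, group_bits in group_bitmaps.items()
        if all(group_bits & value_bitmaps.get(v, 0) for v in required)
    )
```

For example, with rows of (student, course) pairs, `groups_containing_all(rows, "student", "course", {"db", "os"})` lists the students enrolled in both courses.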
Abstract:
Exemplary embodiments include an iceberg query method, including processing the iceberg query using a bitmap index having a plurality of bitmap vectors in a database, eliminating any of the plurality of bitmap vectors in the bitmap index that fails to meet a given condition, thereby forming a subset of the plurality of bitmap vectors, and aligning the vectors in the subset of the plurality of bitmap vectors in the bitmap index according to respective positions of the bitmap vectors in the subset of the plurality of bitmap vectors.
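The elimination step can be sketched for a two-attribute iceberg query of the form "pairs (a, b) occurring at least `threshold` times". A vector with fewer than `threshold` set bits cannot contribute to any qualifying pair, so it is dropped before any AND is computed. The count-based condition and all names are assumptions, and the alignment step described in the abstract is not modelled here.

```python
from itertools import product

def popcount(bits):
    """Number of set bits in a non-negative integer bitmap."""
    return bin(bits).count("1")

def iceberg_pairs(bitmaps_a, bitmaps_b, threshold):
    """Return {(a, b): count} for value pairs with count >= threshold.

    `bitmaps_a` and `bitmaps_b` map attribute values to integer bitmaps
    over the same rows.
    """
    # Elimination: discard vectors that cannot reach the threshold alone.
    kept_a = {v: bm for v, bm in bitmaps_a.items() if popcount(bm) >= threshold}
    kept_b = {v: bm for v, bm in bitmaps_b.items() if popcount(bm) >= threshold}

    result = {}
    for (va, bm_a), (vb, bm_b) in product(kept_a.items(), kept_b.items()):
        count = popcount(bm_a & bm_b)  # rows having both values
        if count >= threshold:
            result[(va, vb)] = count
    return result
```

Pruning is safe because the AND of two vectors never has more set bits than either input.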
Abstract:
A system for classifying documents in a collection of documents according to their intended readerships includes: a computer configured to select a document in the collection of documents; and a computer configured to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. A computer classifies the selected document as misleading, commercial, or personal according to its determined characteristic; and a computer repeats the steps of selecting a document, determining a characteristic of the selected document, and classifying the selected document for additional documents in the collection. At least some documents are classified as misleading, at least some as commercial, and at least some as personal.