Abstract:
Systems and methods are provided for conducting attribute-based queries over a plurality of objects using bounded memory locations while minimizing costly input and output operations. A plurality of attributes is associated with each object, and a plurality of data groups, one for each identified attribute, is created. The objects associated with the attributes are placed into the appropriate data groups, and the objects contained within each data group are sorted into blocks such that each block within a given attribute contains the objects having the same attribute value. Results to the query are created by loading blocks into a primary memory location in a middleware system and combining the loaded blocks to create the desired query results. Block combinations are created based upon the fit of the given block combination to the query as expressed in an aggregation function. A second dedicated memory location can also be provided to hold multiple block combinations to optimize the order in which blocks are loaded and combined. Empty block buffers and external storage devices can also be provided to further enhance the generation of query results.
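As a rough illustration of the grouping and block-combination idea, the sketch below builds one data group per attribute, splits each group into blocks keyed by attribute value, and answers a query by loading one block per condition and intersecting them. The function names, the in-memory dictionaries, and the use of set intersection as the aggregation function are assumptions made for this example; the abstract does not specify them, and the second memory location and block buffers are not modeled.

```python
from collections import defaultdict


def build_data_groups(objects, attributes):
    """Create one data group per attribute.

    Each group maps an attribute value to a block: the list of ids of
    the objects that share that value.
    """
    groups = {attr: defaultdict(list) for attr in attributes}
    for obj_id, attrs in objects.items():
        for attr in attributes:
            if attr in attrs:
                groups[attr][attrs[attr]].append(obj_id)
    return groups


def query(groups, conditions):
    """Load one block per condition and combine the loaded blocks.

    Assumption: the aggregation function is a logical AND, so blocks
    are combined by set intersection. Only one block is loaded at a time.
    """
    result = None
    for attr, value in conditions.items():
        block = set(groups[attr].get(value, []))
        result = block if result is None else result & block
    return sorted(result) if result is not None else []


objects = {
    1: {"color": "red", "size": "L"},
    2: {"color": "red", "size": "S"},
    3: {"color": "blue", "size": "L"},
}
groups = build_data_groups(objects, ["color", "size"])
matches = query(groups, {"color": "red", "size": "L"})
```

Here `matches` holds only the ids that appear in both the "red" block and the "L" block.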
Abstract:
A method of utilizing one or more hints for query processing over a hierarchical tagged data structure having a plurality of nodes in a computing system having memory, the hint being positive if there is a tag accessible in top-down traversal from a child node, and otherwise negative. For each tag in the data structure, the method calculates a bitmap for a current node with all bits set to 1 and for each child node, followed by AND-ing all child bitmaps and setting a bit corresponding to a tag ID of a current tag to zero if such current tag exists. The method further sets a bit of a current tag to 0, calculates a plurality of possible non-redundant hints for each child node, and refreshes a hint list.
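One possible reading of the bitmap step can be sketched as follows: each node starts from a bitmap with every bit set to 1, ANDs in the bitmaps of its children, and clears the bit of its own tag, so a 0 bit means the tag is reachable in a top-down traversal from that node. The dictionary-based tree, the `tag_ids` mapping, and the helper names are hypothetical, and the hint-list refresh and non-redundancy steps are not modeled.

```python
def subtree_bitmap(node, tag_ids):
    """Return an integer bitmap for the subtree rooted at node.

    Bit i is 0 if the tag with id i occurs at node or below it,
    and 1 otherwise.
    """
    bitmap = (1 << len(tag_ids)) - 1  # all bits set to 1
    for child in node.get("children", []):
        bitmap &= subtree_bitmap(child, tag_ids)  # AND all child bitmaps
    tag = node.get("tag")
    if tag is not None:
        bitmap &= ~(1 << tag_ids[tag])  # clear the bit of the current tag
    return bitmap


def is_positive_hint(bitmap, tag, tag_ids):
    """A hint is positive when the tag is reachable, i.e. its bit is 0."""
    return ((bitmap >> tag_ids[tag]) & 1) == 0


tag_ids = {"book": 0, "title": 1, "author": 2, "name": 3, "price": 4}
tree = {
    "tag": "book",
    "children": [
        {"tag": "title"},
        {"tag": "author", "children": [{"tag": "name"}]},
    ],
}
root_bitmap = subtree_bitmap(tree, tag_ids)
```

With this encoding a query for `price` can skip the whole subtree, because its bit stays at 1 in `root_bitmap`.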
Abstract:
Techniques are provided for enabling execution of a process employing a cache. Method steps can include obtaining a first probability of accessing a given artifact in a state Si, obtaining a second probability of using a predicate from a current state Sc in the state Si, determining a benefit of prefetching the given artifact using the predicate based on at least the first probability and the second probability, and determining whether and/or when a cache replacement should be conducted, based at least on the benefit determined.
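The benefit computation can be illustrated with a minimal sketch. It assumes the benefit is the product of the two probabilities and a fetch cost, and that a cached artifact is replaced only when the candidate's benefit exceeds the lowest benefit already in the cache. Both rules, and the names used, are assumptions for illustration; the abstract only states that the benefit depends on at least the two probabilities.

```python
def prefetch_benefit(p_access, p_transition, fetch_cost=1.0):
    """Expected saving from prefetching an artifact.

    p_access:     probability of accessing the artifact in state Si
    p_transition: probability of using the predicate from Sc in Si
    fetch_cost:   assumed cost avoided on a cache hit
    """
    return p_access * p_transition * fetch_cost


def choose_eviction(cache_benefits, candidate_benefit):
    """Return the key to evict, or None if no replacement should occur.

    cache_benefits maps cached artifact keys to their current benefit.
    """
    if not cache_benefits:
        return None
    victim = min(cache_benefits, key=cache_benefits.get)
    if cache_benefits[victim] < candidate_benefit:
        return victim
    return None


benefit = prefetch_benefit(0.5, 0.4)
victim = choose_eviction({"a": 0.1, "b": 0.3}, benefit)
```

In this example the candidate displaces artifact "a", whose benefit is lower than the candidate's.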
Abstract:
A system, method, and computer readable medium for optimizing throughput of a stream processing system are disclosed. The method comprises analyzing a set of input streams and creating, based on the analyzing, an input profile for at least one input stream in the set of input streams. The input profile comprises at least a set of processing requirements associated with the input stream. The method also comprises generating a search space, based on an initial configuration, comprising a plurality of configurations associated with the input stream. A configuration in the plurality of configurations is identified that increases throughput more than the other configurations in the plurality of configurations based on at least one of the input profile and system resources.
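A simplified version of the configuration search might look like the sketch below. It enumerates a search space around an initial configuration, discards configurations that exceed the available resources, and keeps the one with the highest estimated throughput. The knob names (`workers`, `batch`), the CPU-based feasibility check, and the caller-supplied `estimate` function are hypothetical; the abstract does not describe how the search space is generated or how throughput is estimated.

```python
from itertools import product


def generate_search_space(initial, options):
    """Enumerate every combination of knob values, plus the initial config."""
    keys = sorted(options)
    space = [dict(zip(keys, combo))
             for combo in product(*(options[k] for k in keys))]
    if initial not in space:
        space.append(dict(initial))
    return space


def best_configuration(space, profile, resources, estimate):
    """Pick the feasible configuration with the highest estimated throughput.

    Assumption: a configuration is feasible when its workers fit within
    the CPU budget given the per-worker requirement in the input profile.
    """
    feasible = [c for c in space
                if c["workers"] * profile["cpu_per_worker"] <= resources["cpu"]]
    return max(feasible, key=lambda c: estimate(c, profile), default=None)


space = generate_search_space(
    {"workers": 1, "batch": 1},
    {"workers": [1, 2, 4], "batch": [1, 8]},
)
profile = {"cpu_per_worker": 2}    # processing requirement of the stream
resources = {"cpu": 4}             # available system resources
best = best_configuration(space, profile, resources,
                          lambda c, p: c["workers"] * c["batch"])
```

Exhaustive enumeration is used only to keep the example short; a larger search space would need a more selective search strategy.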
Abstract:
A system, method, and computer readable medium for reducing message flow on a message bus are disclosed. The method includes determining if at least one logical operator in a plurality of logical operators requires processing on a given physical processing node in a group of physical nodes. In response to determining that the logical operator requires processing on the given physical processing node, the logical operator is pinned to the given physical processing node. Each logical operator in the plurality of logical operators is assigned to an initial physical processing node in the group of physical processing nodes on a message bus.
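The pinning and initial-assignment steps can be sketched as below. Operators that must run on a specific node are pinned there, the rest receive an initial node in round-robin order, and a helper counts the edges that cross nodes as a stand-in for message-bus traffic. The round-robin rule, the `required_node` mapping, and the edge-count metric are assumptions for this example and are not taken from the abstract.

```python
def initial_placement(operators, nodes, required_node):
    """Assign every logical operator to an initial physical node.

    required_node maps an operator to the node it must run on; those
    operators are pinned. The others are spread round-robin (assumed).
    """
    placement, pinned = {}, set()
    for op in operators:
        if op in required_node:
            placement[op] = required_node[op]
            pinned.add(op)
    free = [op for op in operators if op not in required_node]
    for i, op in enumerate(free):
        placement[op] = nodes[i % len(nodes)]
    return placement, pinned


def cross_node_messages(edges, placement):
    """Count operator-to-operator edges whose endpoints sit on different nodes."""
    return sum(1 for src, dst in edges if placement[src] != placement[dst])


operators = ["src", "filter", "join", "sink"]
nodes = ["n1", "n2"]
placement, pinned = initial_placement(
    operators, nodes, {"src": "n1", "sink": "n2"}
)
edges = [("src", "filter"), ("filter", "join"), ("join", "sink")]
bus_messages = cross_node_messages(edges, placement)
```

A later optimization pass could move only the operators outside `pinned` to lower `bus_messages`.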
Abstract:
Disclosed is a method, information processing system, and computer readable medium for preserving privacy of nonstationary data streams. The method includes receiving at least one nonstationary data stream with time-dependent data. A set of first-moment statistical values is calculated, for a given instant of sub-space of time, for the data. The first-moment statistical values include a principal component for the sub-space of time. The data is perturbed with noise along the principal component in proportion to the first-moment statistical values so that at least part of a set of second-moment statistical values for the data is perturbed by the noise only within a predetermined variance.
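The perturbation step can be illustrated loosely for a single time window. The sketch estimates the window mean and the leading principal direction by power iteration, then adds Gaussian noise along that direction with a magnitude proportional to the norm of the mean. The noise model, the `scale` parameter, and the function names are assumptions for illustration; the sketch does not enforce the predetermined variance bound mentioned in the abstract.

```python
import random


def principal_component(window, iterations=100):
    """Return (mean, leading eigenvector of the covariance) for a window.

    Uses power iteration; it can fail if the start vector happens to be
    orthogonal to the principal direction.
    """
    n, d = len(window), len(window[0])
    mean = [sum(row[j] for row in window) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in window]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0 / d ** 0.5] * d  # normalized start vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def perturb_window(window, scale=0.1, rng=None):
    """Add noise along the principal component, scaled by the mean's norm."""
    rng = rng or random.Random()
    mean, pc = principal_component(window)
    magnitude = scale * sum(m * m for m in mean) ** 0.5
    perturbed = []
    for row in window:
        noise = rng.gauss(0.0, magnitude)
        perturbed.append([x + noise * p for x, p in zip(row, pc)])
    return perturbed


window = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
noisy = perturb_window(window, scale=0.1, rng=random.Random(0))
```

Because the noise is added only along the principal direction, each perturbed point moves along the dominant trend of the window rather than across it.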
Abstract:
A method, information processing system, and computer readable medium are provided for preserving privacy of one-dimensional nonstationary data streams. The method includes receiving a one-dimensional nonstationary data stream. A set of first-moment statistical values is calculated, for a given instant of sub-space of time, for the data. The first-moment statistical values include a principal component for the sub-space of time. The data is perturbed with noise along the principal component in proportion to the first-moment statistical values so that at least part of a set of second-moment statistical values for the data is perturbed by the noise only within a predetermined variance.