Abstract:
In certain embodiments, a database system for processing a query request comprises at least one master node operable to communicate a request over a communication channel, the request comprising a request to perform an activity associated with a precompiled query. The system further includes a first slave node and a second slave node each coupled to the at least one master node, each of the first and second slave nodes operable to receive the request on the communication channel. The first slave node is further operable to communicate a first notification over the communication channel indicating that the first slave node is handling the request, and the second slave node is operable to receive the first notification communicated over the communication channel indicating that the first slave node is handling the request.
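The claim-notification exchange described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the `BroadcastChannel` and `ClaimingSlave` classes, the tuple message layout, and the `"handling"` message kind are all hypothetical names chosen for illustration, and the abstract does not specify how the channel or the notification is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class BroadcastChannel:
    """Hypothetical shared channel: every subscriber sees every message."""
    subscribers: list = field(default_factory=list)

    def broadcast(self, message, sender=None):
        for node in self.subscribers:
            if node is not sender:
                node.on_message(message)


@dataclass
class ClaimingSlave:
    name: str
    channel: BroadcastChannel
    pending: set = field(default_factory=set)
    handled: list = field(default_factory=list)

    def on_message(self, message):
        kind, request_id, _origin = message
        if kind == "request":
            # Request from the master: remember it as available work.
            self.pending.add(request_id)
        elif kind == "handling":
            # Another slave announced it is handling the request: drop it.
            self.pending.discard(request_id)

    def try_handle(self, request_id):
        if request_id not in self.pending:
            return False
        self.pending.discard(request_id)
        # Notify the other slaves over the same channel before doing the work.
        self.channel.broadcast(("handling", request_id, self.name), sender=self)
        self.handled.append(request_id)
        return True


channel = BroadcastChannel()
slave_a = ClaimingSlave("slave-a", channel)
slave_b = ClaimingSlave("slave-b", channel)
channel.subscribers.extend([slave_a, slave_b])

channel.broadcast(("request", 1, "master"))  # master communicates the request
first_claim = slave_a.try_handle(1)          # slave-a claims and notifies
second_claim = slave_b.try_handle(1)         # slave-b already saw the notification
```

In this sketch the second slave never starts the work because the notification removed the request from its pending set. A real system would also have to deal with two slaves claiming at nearly the same time, which this single-threaded simulation does not model.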
Abstract:
A database system for processing a query request comprises a master node that communicates a request to perform one or more activities associated with a precompiled query over at least one communication channel. A plurality of slave nodes are coupled to the master node. At least a particular one of the slave nodes receives the request communicated by the master node, performs at least a portion of the one or more activities associated with the request to obtain one or more results for the request, communicates a request-to-send message to the master node indicating that the one or more results are available for communication to the master node, and communicates at least a portion of the one or more results to the master node in response to receiving a permission-to-send message from the master node.
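The request-to-send and permission-to-send exchange can be sketched as a small simulation. The class names `QueryMaster` and `QuerySlave`, the method names, and the one-grant-at-a-time queue are assumptions made for illustration; the abstract states only that results are sent in response to a permission-to-send message, not how the master orders its grants.

```python
from collections import deque


class QueryMaster:
    """Hypothetical master that lets slaves send results one at a time."""

    def __init__(self):
        self.waiting = deque()  # slaves that have sent request-to-send
        self.collected = []     # results received so far

    def on_request_to_send(self, slave):
        self.waiting.append(slave)

    def grant_next(self):
        # Issue permission-to-send to the oldest waiting slave, if any.
        if not self.waiting:
            return False
        slave = self.waiting.popleft()
        self.collected.extend(slave.on_permission_to_send())
        return True


class QuerySlave:
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows
        self.results = []

    def on_request(self, wanted, master):
        # Perform the activity (here: a trivial equality filter), hold the
        # results locally, and only announce that they are available.
        self.results = [(self.name, row) for row in self.rows if row == wanted]
        if self.results:
            master.on_request_to_send(self)

    def on_permission_to_send(self):
        ready, self.results = self.results, []
        return ready


master = QueryMaster()
slaves = [QuerySlave("s1", ["ann", "bob"]), QuerySlave("s2", ["bob", "cy"])]
for slave in slaves:
    slave.on_request("bob", master)

held_back = list(master.collected)  # nothing is sent before permission
while master.grant_next():
    pass
```

Holding results at the slave until permission arrives lets the master control how much data reaches it at once, which is one plausible motivation for such a handshake.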
Abstract:
A system and method for configuring a plurality of processing nodes into a parallel-processing database system are described herein. Each of a plurality of processing nodes connected by a network receives software and one or more configuration files related to the intended function of the processing node. The software may include homogeneous agent software, one or more library dynamic-link libraries (DLL), and the like. The configuration file is used to configure the homogeneous agent to operate as the intended node in a global-results processing matrix, a general-purpose query processing matrix, or an index-based query processing matrix. Another node or nodes may be configured to convert query-based programming code to intermediary source code in a common programming language and then compile the intermediary source code into a dynamic-link library (DLL) or other type of executable. The DLL is then distributed among the processing nodes of the processing matrix, whereupon various subsets of the processing nodes execute related portions of the DLL substantially in parallel to generate query results.
Abstract:
In certain embodiments, a database system for processing a query request comprises at least one master node operable to store a precompiled query that is capable of resolving a query request received by the database system. The at least one master node is further operable to receive a query request comprising one or more parameters and associated with the precompiled query, and to communicate a request to perform one or more activities associated with the precompiled query. The system further comprises a plurality of slave nodes coupled to the at least one master node, each of the slave nodes operable to store one or more key parts each comprising data capable of resolving a portion of the precompiled query. At least one of the slave nodes is operable to receive the request communicated by the at least one master node and to process the request communicated by the at least one master node.
Abstract:
A method for distributing and sorting data among a plurality of nodes is described herein. After receiving a portion of a data set (e.g., a database), each node sorts its portion and estimates a partitioning of the sorted data set among the nodes based in part on its own sorted data portion. Each node then provides a representation of its estimated partition to a master node. The master node, using the provided estimated partitions, determines a tentative partitioning and submits the tentative partitioning to each node. Each node then determines the effect of the tentative partitioning using its data portion. If the effect is acceptable for each node, the tentative partitioning plan is used to partition the data. Otherwise, the tentative partitioning plan is repeatedly revised by the master node and considered by the nodes having data portions until an acceptable or optimum partitioning is determined. Each node then distributes data from its data portion that falls outside the partition assigned to the node to the appropriate node. Upon receipt of this data, each node can perform a merge sort to add the received data to the previously sorted data portion at the node.
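One pass of the partition-and-merge procedure can be sketched in a single process, with lists standing in for nodes. Several details here are assumptions and not taken from the abstract: each node's estimate is its own quantile values, the master combines estimates by taking a per-position median, and the "effect" a node reports is the count of its records per partition. The revision loop is omitted; a fuller version would adjust the split points and repeat until the reported counts are acceptable.

```python
import bisect
import heapq


def local_splitters(sorted_part, n_nodes):
    """Node-side estimate: quantile values of its own sorted, non-empty portion."""
    step = len(sorted_part) / n_nodes
    return [sorted_part[int(step * i)] for i in range(1, n_nodes)]


def tentative_splitters(estimates):
    """Master-side: per-position median of the node estimates (an assumption)."""
    combined = []
    for column in zip(*estimates):
        ordered = sorted(column)
        combined.append(ordered[len(ordered) // 2])
    return combined


def cut_points(sorted_part, splitters):
    """Indices that divide a sorted portion according to the split values."""
    inner = [bisect.bisect_right(sorted_part, s) for s in splitters]
    return [0] + inner + [len(sorted_part)]


def partition_counts(sorted_part, splitters):
    """Node-side effect report: local records that would land in each partition."""
    cuts = cut_points(sorted_part, splitters)
    return [cuts[i + 1] - cuts[i] for i in range(len(cuts) - 1)]


def distribute_and_merge(parts, splitters):
    """Send each slice to its destination node, then merge the sorted runs there."""
    n_nodes = len(parts)
    incoming = [[] for _ in range(n_nodes)]
    for part in parts:
        cuts = cut_points(part, splitters)
        for dest in range(n_nodes):
            incoming[dest].append(part[cuts[dest]:cuts[dest + 1]])
    return [list(heapq.merge(*runs)) for runs in incoming]


raw = [[9, 1, 5, 3], [2, 8, 4, 7], [6, 0, 11, 10]]
parts = [sorted(p) for p in raw]  # each node sorts its portion
estimates = [local_splitters(p, len(parts)) for p in parts]
splitters = tentative_splitters(estimates)
reports = [partition_counts(p, splitters) for p in parts]
totals = [sum(col) for col in zip(*reports)]  # master sums the effect reports
result = distribute_and_merge(parts, splitters)
```

Because every slice sent to a node is already sorted, the receiving node only needs a merge and never a full re-sort, which matches the final merge step in the description.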
Abstract:
A system and method for scheduling database operations to one or more databases in a parallel-processing database system are described herein. After a query server generates a dynamic-link library (DLL) or other executable representative of one or more database operations to a database, the query server notifies a scheduling services module of the generation of the DLL and submits the DLL to a query agent. The query agent notifies the scheduling services module of its receipt of the DLL. Based on any of a variety of considerations, the scheduling services module schedules a time of execution for the DLL by one or more processing matrices that store the database. At the scheduled time, the scheduling services module directs the query agent to submit the DLL to the indicated processing matrices. The scheduling services module also can be adapted to monitor the execution of previously submitted DLLs by one or more processing matrices and adjust the scheduled times of execution for subsequent DLLs accordingly.
Abstract:
A parallel-processing system is capable of dynamically creating a distribution tree for distributing data. The system includes a plurality of first nodes. Each of the plurality of first nodes is capable of establishing a connection with at least one of the plurality of first nodes to form at least a portion of a dynamically created distribution tree. The system also includes a second node that is capable of receiving data for distribution within the parallel-processing system. The second node is also capable of establishing a connection with at least two of the plurality of first nodes. In this particular embodiment, the second node and the plurality of first nodes operate to form the dynamically created distribution tree. Moreover, the second node also operates to distribute the data to each of the plurality of first nodes through the dynamically created distribution tree.
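A distribution tree of this general shape can be sketched as follows. The fixed fanout of two, the breadth-first attachment order, and the function names `build_tree` and `distribute` are assumptions for illustration; the abstract does not say how nodes choose which connections to establish.

```python
from collections import deque


def build_tree(root, nodes, fanout=2):
    """Attach each node to the earliest parent that still has a free slot."""
    children = {root: []}
    open_parents = deque([root])
    for node in nodes:
        parent = open_parents[0]
        children[parent].append(node)
        children[node] = []
        open_parents.append(node)
        if len(children[parent]) == fanout:
            open_parents.popleft()  # this parent is now full
    return children


def distribute(children, root, data):
    """Forward data from the root down every edge; return what each node got."""
    received = {}
    queue = deque([root])
    while queue:
        sender = queue.popleft()
        for child in children[sender]:
            received[child] = data
            queue.append(child)
    return received


first_nodes = ["n1", "n2", "n3", "n4", "n5"]
tree = build_tree("loader", first_nodes)  # "loader" plays the second node
delivered = distribute(tree, "loader", b"payload")
```

With a tree, the node that receives the data sends it only to its direct children, and those nodes forward it further, so no single node has to transmit to every other node.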
Abstract:
A database system capable of dynamically creating one or more keys associated with one or more pre-compiled queries includes a plurality of slave nodes that are operable to execute one or more pre-compiled queries. Each of the plurality of slave nodes operates to store one or more usage statistics that relate to the execution of the one or more pre-compiled queries. The system also includes at least one master node that is coupled to each of the plurality of slave nodes. The at least one master node is operable to receive the one or more usage statistics from each of the plurality of slave nodes. Furthermore, the at least one master node is operable to identify, based at least in part on the one or more usage statistics, one or more keys to dynamically create. Moreover, each of the plurality of slave nodes dynamically creates the one or more keys for use in executing the one or more pre-compiled queries.
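The statistics-driven key creation can be sketched as below. The specifics are assumptions: usage statistics are modeled as per-field lookup counts, the master selects fields whose combined count meets a hypothetical `min_uses` threshold, and a "key" is modeled as a plain dictionary from field value to row positions.

```python
from collections import Counter, defaultdict


def choose_keys(per_slave_stats, min_uses):
    """Master-side: sum the slaves' usage counts and pick frequently used fields."""
    total = Counter()
    for stats in per_slave_stats:
        total.update(stats)
    return sorted(name for name, uses in total.items() if uses >= min_uses)


def build_key(rows, field_name):
    """Slave-side: build a value-to-row-positions lookup for one field."""
    key = defaultdict(list)
    for position, row in enumerate(rows):
        key[row[field_name]].append(position)
    return dict(key)


# Hypothetical usage statistics reported by two slaves.
stats = [Counter({"last_name": 5, "zip": 1}), Counter({"last_name": 4, "zip": 2})]
to_create = choose_keys(stats, min_uses=5)

rows = [
    {"last_name": "lee", "zip": "30301"},
    {"last_name": "kim", "zip": "30301"},
    {"last_name": "lee", "zip": "10001"},
]
keys = {name: build_key(rows, name) for name in to_create}
```

Here only `last_name` is used often enough to justify a key, so the slave builds a single lookup and leaves `zip` unindexed.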
Abstract:
Disclosed herein are various exemplary systems and methods for linking entity references to entities and identifying associations between entities. In particular, a method for delinking one or more entity references linked to a same entity is provided, where the one or more entity references have at least one common data field. The method comprises the steps of evaluating at least one actual measurement of the entity based at least in part on one or more field values of the one or more entity references, determining a difference between the at least one actual measurement and at least one predefined measurement associated with the entity, and delinking the one or more entity references based at least in part on a comparison of the difference and a defined threshold.
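The measure, compare, and delink steps can be sketched with one concrete choice of measurement. Everything specific here is an assumption: the "actual measurement" is taken to be the number of distinct values of one field across the linked references, the "predefined measurement" is an expected count, and delinking gives each reference its own provisional entity identifier. The abstract leaves all three open.

```python
def measure_distinct(references, field_name):
    """Actual measurement (assumed): distinct values of one field."""
    return len({ref[field_name] for ref in references})


def delink_if_inconsistent(links, entity_id, field_name, expected, threshold):
    """Delink the entity's references when the measurement is too far off."""
    references = links.get(entity_id, [])
    difference = abs(measure_distinct(references, field_name) - expected)
    if difference <= threshold:
        return False
    # Give each reference its own provisional entity id (an assumption).
    links.pop(entity_id, None)
    for i, ref in enumerate(references):
        links[f"{entity_id}-{i}"] = [ref]
    return True


links = {
    "E1": [
        {"name": "J Smith", "birth_year": 1950},
        {"name": "John Smith", "birth_year": 1972},
        {"name": "J. Smith", "birth_year": 1991},
    ]
}
# One person is expected to have one birth year; tolerate one discrepancy.
changed = delink_if_inconsistent(links, "E1", "birth_year", expected=1, threshold=1)
```

In this example three distinct birth years differ from the expected single value by two, which exceeds the threshold of one, so the references are split apart.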
Abstract:
A parallel database system capable of deploying a pre-compiled query and pre-keying data associated with the pre-compiled query includes at least one master node. The at least one master node is operable to store and execute a pre-compiled query that is capable of resolving a data request received by the parallel database system. The system further includes a plurality of slave nodes coupled to the at least one master node. In this particular embodiment, each of the plurality of slave nodes is operable to store one or more key parts. The one or more key parts include data capable of resolving a portion of the pre-compiled query.