Abstract:
An embodiment method for massively parallel processing includes assigning a primary key to a first table in a database and a foreign key to a second table in the database, the foreign key of the second table identical to the primary key of the first table, determining a number of partition groups desired for the database, partitioning the first table into first partitions based on the primary key assigned and the number of partition groups desired, partitioning the second table into second partitions based on the foreign key assigned and the number of partition groups desired, and distributing the first partitions and the second partitions to the partition groups as partitioned. An embodiment system for implementing the embodiment methods is also disclosed.
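The co-partitioning idea in this abstract can be illustrated with a minimal sketch. The modulo rule, the table contents, and the names `partition_group`, `partition_table`, `customers`, and `orders` are assumptions made for illustration; the abstract does not specify how keys are mapped to partition groups.

```python
from collections import defaultdict

def partition_group(key, num_groups):
    # Hypothetical rule: map an integer key to a partition group by modulo.
    return key % num_groups

def partition_table(rows, key_column, num_groups):
    # Split a table (list of dict rows) into partitions keyed by group number.
    groups = defaultdict(list)
    for row in rows:
        groups[partition_group(row[key_column], num_groups)].append(row)
    return dict(groups)

# First table: primary key is customer_id.
customers = [{"customer_id": i, "name": f"c{i}"} for i in range(6)]
# Second table: foreign key customer_id refers to the first table.
orders = [{"order_id": 100 + i, "customer_id": i % 6} for i in range(12)]

NUM_GROUPS = 3
customer_parts = partition_table(customers, "customer_id", NUM_GROUPS)
order_parts = partition_table(orders, "customer_id", NUM_GROUPS)
```

Because both tables are partitioned with the same rule on the same key values, every order lands in the same partition group as the customer it refers to, so under these assumptions a join on `customer_id` would need no data from other groups.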
Abstract:
An embodiment method for massively parallel processing includes initiating a management instance on an initial machine, the management instance generating an initial partition corresponding to the initial machine, determining a total number of partitions desired for processing a database, the total number of partitions including the initial partition, determining a number of additional machines available to process the database, grouping the initial machine and the additional machines together in a pod, and launching the management instance on the additional machines in the pod to generate the total number of partitions desired for the database. Additional embodiment methods and an embodiment system operable to perform such methods are also disclosed.
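The steps in this abstract can be sketched roughly as follows. The `Pod` class, the `launch_pod` function, and the round-robin placement of partitions are all hypothetical; the abstract does not say how partitions are spread when the desired total exceeds the number of machines.

```python
from dataclasses import dataclass, field

@dataclass
class Pod:
    machines: list = field(default_factory=list)
    partitions: dict = field(default_factory=dict)  # partition id -> machine

def launch_pod(initial_machine, additional_machines, total_partitions):
    pod = Pod()
    # The management instance on the initial machine creates the initial partition.
    pod.machines.append(initial_machine)
    pod.partitions[0] = initial_machine
    # Group the additional machines into the same pod.
    pod.machines.extend(additional_machines)
    # Assumption: remaining partitions are placed round-robin across the pod.
    for pid in range(1, total_partitions):
        pod.partitions[pid] = pod.machines[pid % len(pod.machines)]
    return pod

pod = launch_pod("m0", ["m1", "m2"], total_partitions=6)
```

The total of six partitions here includes the initial partition on `m0`, matching the abstract's statement that the total number of partitions includes the initial one.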
Abstract:
In one embodiment, a method for adding partitions to a massively parallel processing (MPP) cluster includes determining whether a first number of available nodes of a first leaf sub-cluster (LSC) of a meta sub-cluster (MSC) of the MPP cluster is greater than or equal to a second number of partitions of a table and assigning a first node of the first LSC to a first partition when the first number of available nodes is greater than or equal to the second number of partitions. The method also includes searching for a second LSC in the MSC when the first number of available nodes is less than the second number of partitions.
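The node-count check described here can be sketched as below. The dictionary layout for the MSC, the function name `assign_partitions`, and the choice to place all partitions on the second LSC once one is found are assumptions; the abstract states only that a second LSC is searched for when the first has too few available nodes.

```python
def assign_partitions(msc, first_lsc, num_partitions):
    """Return (lsc_name, {partition: node}) or None if no LSC has enough nodes.

    msc is a hypothetical mapping: LSC name -> list of available nodes.
    """
    available = msc[first_lsc]
    if len(available) >= num_partitions:
        # Enough nodes in the first LSC: one node per partition.
        return first_lsc, {p: available[p] for p in range(num_partitions)}
    # Otherwise search the same MSC for another LSC with enough nodes.
    for name, nodes in msc.items():
        if name != first_lsc and len(nodes) >= num_partitions:
            return name, {p: nodes[p] for p in range(num_partitions)}
    return None

msc = {"lsc_a": ["n1", "n2"], "lsc_b": ["n3", "n4", "n5", "n6"]}
fits_first = assign_partitions(msc, "lsc_a", 2)
needs_second = assign_partitions(msc, "lsc_a", 3)
no_fit = assign_partitions(msc, "lsc_a", 5)
```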
Abstract:
A device such as a data storage system comprises a non-transitory memory storage comprising instructions, and one or more processors in communication with the memory. The one or more processors execute the instructions to: map a different portion of data in a storage device to each of different caches, wherein each cache is in a computing node with a processor; change a number of the computing nodes; provide a modified mapping in response to the change; and pass queries to the computing nodes. The computing nodes can continue to operate uninterrupted while the number of computing nodes is changed. Data transfer between the nodes can also be avoided.
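One way to picture the mapping and remapping steps is the sketch below. The round-robin rule and the names `build_mapping` and `route_query` are hypothetical, and the sketch assumes that each node's cache is filled from the shared storage device, which is why changing the mapping moves no data between nodes.

```python
def build_mapping(portions, nodes):
    # Assumption: portions of stored data are assigned to node caches round-robin.
    return {portion: nodes[i % len(nodes)] for i, portion in enumerate(portions)}

def route_query(mapping, portion):
    # Pass a query to the node whose cache is mapped to the requested portion.
    return mapping[portion]

portions = list(range(6))
nodes = ["a", "b"]
mapping = build_mapping(portions, nodes)

# Change the number of computing nodes, then provide a modified mapping.
nodes = nodes + ["c"]
modified_mapping = build_mapping(portions, nodes)
```

In this sketch the old mapping stays usable until the modified one replaces it, which is one simple way the nodes could keep serving queries while the node count changes.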
Abstract:
In one embodiment, a method includes determining a number of initial servers in a massively parallel processing (MPP) database cluster and determining an initial bucket configuration of the MPP database cluster, where the initial bucket configuration has a number of initial buckets. The method also includes adding a number of additional servers to the MPP database cluster to produce a number of updated servers, where the updated servers include the initial servers and the additional servers, and creating an updated bucket configuration in accordance with the number of initial servers, the initial bucket configuration, and the number of additional servers, where the updated bucket configuration has a number of updated buckets. Additionally, the method includes redistributing data of the MPP database cluster in accordance with the updated bucket configuration.
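A small sketch of how an updated bucket count might be derived is given below. The least-common-multiple rule, the modulo hashing, and the function names are assumptions chosen for illustration; the abstract says only that the updated configuration depends on the initial servers, the initial buckets, and the additional servers.

```python
from math import gcd

def updated_bucket_count(initial_servers, initial_buckets, additional_servers):
    # Hypothetical rule: the smallest multiple of the initial bucket count
    # that divides evenly across the updated number of servers.
    updated_servers = initial_servers + additional_servers
    return initial_buckets * updated_servers // gcd(initial_buckets, updated_servers)

def bucket_of(key, num_buckets):
    # Assumption: non-negative integer keys are bucketed by modulo.
    return key % num_buckets

def redistribute(keys, num_buckets, num_servers):
    # Place each key on the server that owns its bucket (bucket modulo servers).
    placement = {server: [] for server in range(num_servers)}
    for key in keys:
        placement[bucket_of(key, num_buckets) % num_servers].append(key)
    return placement

new_buckets = updated_bucket_count(initial_servers=4, initial_buckets=8,
                                   additional_servers=2)
placement = redistribute(range(48), new_buckets, num_servers=6)
```

With this rule the updated bucket count is a multiple of the initial one, so each updated bucket is a subset of exactly one initial bucket, and the buckets divide evenly among the updated servers.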