Abstract:
Systems, methods, and computer-readable media are disclosed for graph-based monitoring and management of network components of a distributed streaming system. In one aspect, a method includes generating, by a processor, first metrics and second metrics based on data collected on a system; generating, by the processor, a topology graph representing data flow within the system; generating, by the processor, at least one first metrics graph corresponding to the first metrics based in part on the topology graph; generating, by the processor, at least one second metrics graph corresponding to the second metrics based in part on the topology graph; identifying, by the processor, a malfunction within the system based on a change in at least one of the first metrics graph and the second metrics graph; and sending, by the processor, feedback on the malfunction to an operational management component of the system.
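The idea of overlaying a metric on the topology graph and flagging a change can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the disclosure: the names `metrics_graph` and `find_malfunctions`, the choice of a downstream/upstream ratio as the edge value, and the `tolerance` threshold.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def metrics_graph(topology: List[Edge], metric: Dict[str, float]) -> Dict[Edge, float]:
    """Annotate every topology edge with the downstream/upstream metric ratio.

    The ratio is a hypothetical choice; any per-edge value derived from the
    metric and the topology would fit the same structure.
    """
    graph = {}
    for src, dst in topology:
        upstream = metric.get(src, 0.0)
        graph[(src, dst)] = metric.get(dst, 0.0) / upstream if upstream else 0.0
    return graph


def find_malfunctions(before: Dict[Edge, float],
                      after: Dict[Edge, float],
                      tolerance: float = 0.5) -> List[Edge]:
    """Return edges whose value changed by more than `tolerance` (assumed threshold)."""
    return sorted(
        edge for edge in before
        if abs(after.get(edge, 0.0) - before[edge]) > tolerance
    )


# Hypothetical three-stage pipeline and two snapshots of a throughput metric.
topology = [("source", "parser"), ("parser", "sink")]
healthy = metrics_graph(topology, {"source": 100, "parser": 100, "sink": 100})
degraded = metrics_graph(topology, {"source": 100, "parser": 100, "sink": 20})
suspects = find_malfunctions(healthy, degraded)
```

In this sketch `suspects` would be the payload reported to an operational management component; how that feedback is delivered is left open.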
Abstract:
A method for accelerating data operations across a plurality of nodes of one or more clusters of a distributed computing environment. Rack awareness information characterizing the plurality of nodes is retrieved and a non-volatile memory (NVM) capability of each node is determined. A write operation is received at a management node of the plurality of nodes and one or more of the rack awareness information and the NVM capability of the plurality of nodes are analyzed to select one or more nodes to receive at least a portion of the write operation, wherein at least one of the selected nodes has an NVM capability. A multicast group for the write operation is then generated wherein the selected nodes are subscribers of the multicast group, and the multicast group is used to perform hardware accelerated read or write operations at one or more of the selected nodes.
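The node-selection step can be sketched as follows. This is a simplified illustration under stated assumptions: the `Node` record, the `select_write_targets` function, and the policy of preferring NVM-capable nodes while spreading across racks are all hypothetical, and the multicast group is modeled only as a set of subscriber names rather than a real network group.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    has_nvm: bool


def select_write_targets(nodes: List[Node], replicas: int) -> List[Node]:
    """Pick `replicas` nodes, preferring NVM and distinct racks (assumed policy)."""
    ranked = sorted(nodes, key=lambda n: (not n.has_nvm, n.name))
    chosen: List[Node] = []
    used_racks: Set[str] = set()
    # First pass: at most one node per rack, NVM-capable nodes first.
    for node in ranked:
        if len(chosen) < replicas and node.rack not in used_racks:
            chosen.append(node)
            used_racks.add(node.rack)
    # Second pass: fill any remaining slots regardless of rack.
    for node in ranked:
        if len(chosen) < replicas and node not in chosen:
            chosen.append(node)
    if not any(node.has_nvm for node in chosen):
        raise ValueError("no NVM-capable node available")
    return chosen


def multicast_group(selected: List[Node]) -> Set[str]:
    """Model the multicast group as the set of subscriber node names."""
    return {node.name for node in selected}


cluster = [
    Node("a", "rack1", False),
    Node("b", "rack1", True),
    Node("c", "rack2", False),
    Node("d", "rack2", True),
]
targets = select_write_targets(cluster, replicas=2)
group = multicast_group(targets)
```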
Abstract:
Systems, methods, and computer-readable media are provided for consistent data to be used for streaming and batch processing. The system includes one or more devices; a processor coupled to the one or more devices; and a non-volatile memory coupled to the processor and the one or more devices, wherein the non-volatile memory stores instructions that are configured to cause the processor to perform operations including receiving data from the one or more devices; validating the data to yield validated data; storing the validated data in a database on the non-volatile memory, the validated data being used for streaming processing and batch processing; and sending the validated data to a remote disk for batch processing.
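A minimal sketch of the validate-then-store flow, assuming a hypothetical record shape (`device_id`, `value`) and using an in-memory SQLite database as a stand-in for the database on non-volatile memory. The function names and the validation rule are illustrative only; the returned list represents the validated data a caller could forward to a remote disk for batch processing.

```python
import sqlite3
from typing import Dict, List


def validate(record: Dict) -> bool:
    """Hypothetical rule: a string device id and a numeric value."""
    return (isinstance(record.get("device_id"), str)
            and isinstance(record.get("value"), (int, float)))


def ingest(conn: sqlite3.Connection, records: List[Dict]) -> List[Dict]:
    """Store validated records once so streaming and batch readers see the same data."""
    valid = [r for r in records if validate(r)]
    conn.execute("CREATE TABLE IF NOT EXISTS readings (device_id TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        [(r["device_id"], r["value"]) for r in valid],
    )
    conn.commit()
    return valid  # the same validated rows can be shipped for batch processing


conn = sqlite3.connect(":memory:")
validated = ingest(conn, [
    {"device_id": "sensor-1", "value": 1.5},
    {"device_id": None, "value": 2.0},  # rejected by validation
])
stored = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```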
Abstract:
Systems and methods are described for allocating resources in a cloud computing environment. The method includes receiving a computing request, the request for use of at least one virtual machine and a portion of memory. In response to the request, a plurality of hosts is identified and a cost function is formulated using at least a portion of those hosts. Based on the cost function, at least one host that is capable of hosting the virtual machine and memory is selected.
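One way to picture cost-based host selection is the sketch below. The abstract does not specify the cost function, so the best-fit cost used here (fraction of capacity left unused after placement) is an assumption, as are the dictionary field names and the `select_host` function.

```python
from typing import Dict, List, Optional


def select_host(hosts: List[Dict], vcpus: int, memory_gb: int) -> Optional[str]:
    """Return the name of the feasible host with the lowest (assumed) cost."""
    feasible = [
        h for h in hosts
        if h["free_vcpus"] >= vcpus and h["free_memory_gb"] >= memory_gb
    ]
    if not feasible:
        return None

    def cost(host: Dict) -> float:
        # Hypothetical best-fit cost: leftover capacity fraction after placement.
        cpu_left = (host["free_vcpus"] - vcpus) / host["total_vcpus"]
        mem_left = (host["free_memory_gb"] - memory_gb) / host["total_memory_gb"]
        return cpu_left + mem_left

    return min(feasible, key=cost)["name"]


hosts = [
    {"name": "h1", "total_vcpus": 32, "free_vcpus": 30,
     "total_memory_gb": 128, "free_memory_gb": 120},
    {"name": "h2", "total_vcpus": 32, "free_vcpus": 6,
     "total_memory_gb": 128, "free_memory_gb": 20},
    {"name": "h3", "total_vcpus": 32, "free_vcpus": 2,
     "total_memory_gb": 128, "free_memory_gb": 8},
]
placement = select_host(hosts, vcpus=4, memory_gb=16)
```

Here `h3` is infeasible, and `h2` is preferred over `h1` because it leaves less capacity stranded; a different cost function (for example one that favors load spreading) would reverse that choice.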
Abstract:
In one embodiment, a scale out policy service for processing a stream of messages includes a distributed stream processing computation system comprising distributed stream processing nodes, a distributed storage system, and a rules engine. A stream processing engine of the distributed stream processing computation system can receive the stream of messages comprising requests and/or events, and assign a first message to be processed by one or more distributed stream processing nodes based on one or more properties of the message. The one or more distributed stream processing nodes can be communicably connected to the distributed storage system and/or the rules engine to provide (1) an answer in response to the first message and/or (2) cause an action to be executed based on the first message.
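Assigning a message to a processing node based on one of its properties can be sketched with a stable hash. The routing key (`tenant`), the node names, and the use of SHA-256 are assumptions for illustration; the abstract does not say which properties or which assignment scheme is used.

```python
import hashlib
from typing import Dict, List


def assign_node(message: Dict, nodes: List[str], key: str = "tenant") -> str:
    """Map a message to a node using a stable hash of one message property."""
    digest = hashlib.sha256(str(message[key]).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-0", "node-1", "node-2"]
first = assign_node({"tenant": "acme", "type": "request"}, nodes)
second = assign_node({"tenant": "acme", "type": "event"}, nodes)
```

Messages that share the routing property land on the same node, which lets that node keep related state local before consulting the storage system or rules engine.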
Abstract:
A method for assisting evaluation of anomalies in a distributed storage system is disclosed. The method includes a step of monitoring at least one system metric of the distributed storage system. The method further includes steps of maintaining a listing of patterns of the monitored system metric comprising patterns which previously did not result in a failure within one or more nodes of the distributed storage system, and, based on the monitoring, identifying a pattern (i.e., a time series motif) of the monitored system metric as a potential anomaly in the distributed storage system. The method also includes steps of automatically (i.e. without user input) performing a similarity search to determine whether the identified pattern satisfies one or more predefined similarity criteria with at least one pattern of the listing, and, upon positive determination, excepting the identified pattern from being identified as the potential anomaly.
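The similarity search against the listing of known-benign patterns might look like the sketch below. The similarity criterion is not given in the abstract; z-normalized Euclidean distance with a fixed threshold is a common choice for time series motifs and is used here purely as an assumption, as are the function names.

```python
import math
from typing import List, Sequence


def z_normalize(series: Sequence[float]) -> List[float]:
    """Rescale a series to zero mean and unit variance (all zeros if constant)."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between the z-normalized forms of two equal-length series."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(z_normalize(a), z_normalize(b))))


def is_excepted(candidate: Sequence[float],
                benign_patterns: List[Sequence[float]],
                threshold: float = 0.5) -> bool:
    """True if the candidate is close enough to any pattern that did not precede a failure."""
    return any(
        len(p) == len(candidate) and distance(candidate, p) <= threshold
        for p in benign_patterns
    )


benign = [[1, 2, 3, 2, 1]]
same_shape = is_excepted([10, 20, 30, 20, 10], benign)   # scaled copy of a benign pattern
different_shape = is_excepted([3, 2, 1, 2, 3], benign)   # inverted shape
```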
Abstract:
A method for assisting evaluation of anomalies in a distributed storage system is disclosed. The method includes monitoring at least one system metric of the system and creating a mapping between values and/or patterns of the system metric and one or more services configured to generate logs for the system. The method further includes detecting a potential anomaly in the system based on the monitoring, the potential anomaly being associated with a value and/or a pattern of the monitored system metric. The method also includes using the mapping to identify one or more logs associated with the potential anomaly, displaying a graphical representation of at least a part of monitoring the system metric, the graphical representation indicating the potential anomaly, and providing an overlay over the graphical representation, the overlay comprising an indicator of a number of the logs associated with the potential anomaly.
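The mapping from a metric to log-producing services, and the count shown in the overlay, can be sketched as follows. The metric name, service names, log entry fields, and the `logs_for_anomaly` function are hypothetical; the sketch also assumes the anomaly is described by a time window, which the abstract does not state.

```python
from typing import Dict, List, Set, Tuple


def logs_for_anomaly(metric: str,
                     window: Tuple[int, int],
                     metric_to_services: Dict[str, Set[str]],
                     logs: List[Dict]) -> List[Dict]:
    """Return log entries from services mapped to `metric` that fall inside the window."""
    start, end = window
    services = metric_to_services.get(metric, set())
    return [
        entry for entry in logs
        if entry["service"] in services and start <= entry["ts"] <= end
    ]


# Hypothetical mapping and log stream (timestamps in seconds).
mapping = {"disk_latency": {"osd", "journal"}}
log_stream = [
    {"ts": 95, "service": "osd", "msg": "slow request"},
    {"ts": 105, "service": "osd", "msg": "slow request"},
    {"ts": 110, "service": "gateway", "msg": "timeout"},
    {"ts": 118, "service": "journal", "msg": "flush delayed"},
]
related = logs_for_anomaly("disk_latency", (100, 120), mapping, log_stream)
overlay_count = len(related)  # the number an overlay indicator would display
```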
Abstract:
In one embodiment, data indicative of the size of an intermediate data set generated by a first resource device is received at a computing device. The intermediate data set is associated with a virtual machine to process the intermediate data set. A virtual machine configuration is determined based on the size of the intermediate data set. A second resource device is selected to execute the virtual machine based on the virtual machine configuration and on an available bandwidth between the first and second resource devices. The virtual machine is then assigned to the second resource device to process the intermediate data set.
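The two decisions described here, sizing the virtual machine from the intermediate data set and choosing where to run it from the available bandwidth, can be sketched as below. The size tiers, field names, and function names are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple


def choose_vm_config(size_gb: float) -> Dict[str, int]:
    """Hypothetical sizing tiers keyed on the intermediate data set size."""
    if size_gb <= 10:
        return {"vcpus": 2, "memory_gb": 8}
    if size_gb <= 100:
        return {"vcpus": 8, "memory_gb": 32}
    return {"vcpus": 16, "memory_gb": 128}


def select_device(source: str,
                  size_gb: float,
                  devices: List[Dict],
                  bandwidth_gbps: Dict[Tuple[str, str], float]
                  ) -> Optional[Tuple[str, Dict[str, int]]]:
    """Pick the device that fits the VM and has the most bandwidth from `source`."""
    config = choose_vm_config(size_gb)
    feasible = [
        d for d in devices
        if d["name"] != source
        and d["free_vcpus"] >= config["vcpus"]
        and d["free_memory_gb"] >= config["memory_gb"]
    ]
    if not feasible:
        return None
    best = max(feasible, key=lambda d: bandwidth_gbps.get((source, d["name"]), 0.0))
    return best["name"], config


devices = [
    {"name": "dev-a", "free_vcpus": 16, "free_memory_gb": 64},
    {"name": "dev-b", "free_vcpus": 16, "free_memory_gb": 64},
    {"name": "dev-c", "free_vcpus": 4, "free_memory_gb": 16},
]
links = {("dev-a", "dev-b"): 10.0, ("dev-a", "dev-c"): 40.0}
assignment = select_device("dev-a", 50, devices, links)
```

In this example `dev-c` has the fastest link but cannot hold the 8-vCPU configuration, so the VM goes to `dev-b`.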
Abstract:
The present disclosure relates to assignment or generation of reducer virtual machines after the “map” phase is substantially complete in MapReduce. Instead of a priori placement, distribution of keys after the “map” phase over the mapper virtual machines can be used to efficiently place reducer tasks in virtualized cloud infrastructure like OpenStack. By solving a constraint optimization problem, reducer VMs can be optimally assigned to process keys subject to certain constraints. In particular, the present disclosure describes a special variable matrix. Furthermore, the present disclosure describes several possible cost matrices for representing the costs determined based on the key distribution over the mapper VMs (and other suitable factors).
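A toy version of the assignment problem is sketched below. It is not the formulation from the disclosure: the cost matrix (records that must move when a key is reduced on a given VM), the assumption that reducer `r` is colocated with mapper `r`, the per-reducer capacity constraint, and the brute-force search are all illustrative choices. The tuple returned by `assign_keys` plays the role of a binary variable matrix in which each key is assigned to exactly one reducer.

```python
from itertools import product
from typing import List, Optional, Tuple


def transfer_cost_matrix(key_counts: List[List[int]]) -> List[List[int]]:
    """cost[k][r] = records of key k that are NOT already on the VM hosting reducer r.

    key_counts[k][m] is the number of records for key k on mapper VM m;
    reducer r is assumed to be colocated with mapper r.
    """
    return [[sum(row) - row[r] for r in range(len(row))] for row in key_counts]


def assign_keys(cost: List[List[int]],
                capacity: int) -> Tuple[Optional[Tuple[int, ...]], float]:
    """Exhaustively find the cheapest key-to-reducer assignment.

    Constraint (assumed): no reducer handles more than `capacity` keys.
    Exhaustive search is only practical for very small instances.
    """
    n_keys, n_reducers = len(cost), len(cost[0])
    best, best_cost = None, float("inf")
    for assignment in product(range(n_reducers), repeat=n_keys):
        if any(assignment.count(r) > capacity for r in range(n_reducers)):
            continue
        total = sum(cost[k][r] for k, r in enumerate(assignment))
        if total < best_cost:
            best, best_cost = assignment, total
    return best, best_cost


key_counts = [[90, 10], [80, 20], [5, 95]]  # three keys spread over two mapper VMs
costs = transfer_cost_matrix(key_counts)
plan, plan_cost = assign_keys(costs, capacity=2)
```

A real deployment would hand the same matrices to an integer programming or constraint solver instead of enumerating every assignment.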