Abstract:
In a distributed storage system, such as one in a data center or web-based service, user characteristics and hardware characteristics such as storage size and storage throughput affect the capacity and performance of the system. In such systems, an allocation is a mapping from a user to the physical storage devices where data/information pertaining to that user will be stored. Policies regarding quality of service and reliability, including replication of user data/information, may be provided by the entity managing the system. A policy may define an objective function that quantifies the value of a given allocation. An allocation that maximizes this value optimizes the objective function. The optimization may account for dynamics, such as changes in patterns of user characteristics, and for the cost of moving data/information between physical devices to satisfy a particular allocation.
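As an illustration only, the idea of scoring an allocation with an objective function can be sketched as follows. Everything here is an assumption made for the example rather than something the abstract specifies: the `User` and `Device` records, the overload penalty, the `move_cost_per_gb` parameter, and the rule that each user maps to a single device (replication is left out).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    size_gb: float          # hypothetical: storage the user needs
    throughput_mbps: float  # hypothetical: throughput the user generates


@dataclass(frozen=True)
class Device:
    name: str
    capacity_gb: float
    throughput_mbps: float


def allocation_value(allocation, users, devices, previous=None,
                     move_cost_per_gb=0.01):
    """Score an allocation (user name -> device name); higher is better.

    Assumed objective: penalize demand that exceeds a device's capacity or
    throughput, and penalize data moved relative to a previous allocation.
    """
    used_gb = {name: 0.0 for name in devices}
    used_tp = {name: 0.0 for name in devices}
    for user_name, device_name in allocation.items():
        user = users[user_name]
        used_gb[device_name] += user.size_gb
        used_tp[device_name] += user.throughput_mbps

    overload = 0.0
    for name, device in devices.items():
        overload += max(0.0, used_gb[name] - device.capacity_gb)
        overload += max(0.0, used_tp[name] - device.throughput_mbps)

    moved_gb = 0.0
    if previous is not None:
        # A user counts as moved only if it had a different device before.
        moved_gb = sum(users[u].size_gb
                       for u, d in allocation.items()
                       if previous.get(u, d) != d)

    return -overload - move_cost_per_gb * moved_gb
```

Under these assumptions, an optimizer would search over candidate allocations and keep the one with the highest `allocation_value`, passing the current allocation as `previous` so that the cost of moving data is weighed against the benefit.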
Abstract:
Dependencies between different channels or different services in a client or server may be determined by observing the arrival and departure times of the packets that constitute those channels or services. A probabilistic model may be used to formally characterize these dependencies. The probabilistic model may be used to list the dependencies between input packets and output packets of various channels or services, and may be used to establish the expected strength of the causal relationship between the different events surrounding those channels or services. Parameters of the probabilistic model may be based on prior knowledge, or may be fit using statistical techniques from observations of the times of the events of interest. Expected times of occurrence between events may be observed, and dependencies may be determined in accordance with the probabilistic model.
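One way such a model could be fit from observed event times is sketched below. The abstract does not name a model or a fitting technique, so the choices here are assumptions: a two-component exponential mixture over input-to-output delays, expectation-maximization (EM) as the statistical technique, a background rate taken as known prior knowledge, and the hypothetical function name `fit_dependency_mixture`. The fitted mixing weight plays the role of the "strength" of the dependency.

```python
import math


def fit_dependency_mixture(delays, background_rate, iterations=50):
    """Fit an assumed mixture model to input-to-output delays with EM.

    Each delay is modeled as either caused by the input packet
    (exponential with unknown rate, probability p) or unrelated
    (exponential with the known background_rate).
    Returns (p, caused_rate); p is the estimated dependency strength.
    """
    p = 0.5
    caused_rate = len(delays) / sum(delays)  # initial guess: 1 / mean delay
    for _ in range(iterations):
        # E-step: probability that each delay came from the "caused" component.
        resp = []
        for d in delays:
            caused = p * caused_rate * math.exp(-caused_rate * d)
            background = ((1.0 - p) * background_rate
                          * math.exp(-background_rate * d))
            resp.append(caused / (caused + background))
        # M-step: update the mixing weight and the caused-delay rate.
        p = sum(resp) / len(delays)
        caused_rate = sum(resp) / sum(r * d for r, d in zip(resp, delays))
    return p, caused_rate
```

For example, given delays where most outputs follow an input almost immediately and a minority arrive at background pace, the fitted `p` would be expected to approach the fraction of fast responses.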
Abstract:
An activity model is generated at a computer. The activity model may be generated by monitoring incoming and outgoing channels for packets for a predetermined window of time. To generate an activity model, an input and an output channel are selected. A probability distribution function describing the observed waiting time between packet arrivals on the selected input channel and the selected output channel is generated by mining the data collected during the selected window of time. A probability distribution function describing the observed waiting time between a randomly chosen instant and receiving a packet on the selected input channel is also generated. The distance between the two generated probability distribution functions is computed. If the computed distance is greater than a predefined confidence level, then the two selected channels are deemed to be related. Otherwise, the selected channels are deemed to be unrelated. The activity model is further generated by comparing each input and output channel pair entering or leaving a particular computer.
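The comparison described above can be sketched roughly as follows. Several details are assumptions, since the abstract leaves them open: the distance is taken to be the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions), the `threshold` default is arbitrary, and the function and parameter names are hypothetical. The sketch measures waits until the next packet on whichever channel is passed as `target_times`, starting either from packets on the other channel or from random instants in the window.

```python
import bisect
import random


def waits_until_next(start_times, target_times):
    """Wait from each start time until the next packet in sorted target_times."""
    waits = []
    for t in start_times:
        i = bisect.bisect_right(target_times, t)
        if i < len(target_times):  # drop starts with no later packet
            waits.append(target_times[i] - t)
    return waits


def ks_distance(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(fa - fb))
    return gap


def channels_related(reference_times, target_times, window,
                     threshold=0.3, samples=1000, seed=0):
    """Deem two channels related if the observed waiting-time distribution
    differs from the random-instant baseline by more than threshold."""
    rng = random.Random(seed)
    start, end = window
    target = sorted(target_times)
    observed = waits_until_next(sorted(reference_times), target)
    instants = [rng.uniform(start, end) for _ in range(samples)]
    baseline = waits_until_next(instants, target)
    if not observed or not baseline:
        return False  # not enough data to compare
    return ks_distance(observed, baseline) > threshold
```

Repeating `channels_related` over every input and output channel pair on a machine would give the pairwise relationships from which the activity model is assembled.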
Abstract:
The present invention extends to methods, systems, and computer program products for automatically generating and refining health models. Embodiments of the invention use machine learning tools to analyze historical telemetry data from a server deployment. The tools output fingerprints, for example, small groupings of specific metrics plus behavioral parameters, that uniquely identify and describe past problem events mined from the historical data. Embodiments automatically translate the fingerprints into health models that can be directly applied to monitoring the running system. Fully automated feedback loops for identifying past problems and giving advance notice as those problems emerge in the future are facilitated without any operator intervention. In some embodiments, a single portion of expert knowledge, for example, Key Performance Indicator (KPI) data, initiates health model generation. Once initiated, the feedback loop can be fully automated to access further telemetry and refine health models based on the further telemetry.
Abstract:
A method for accepting a session in an information system server includes generating a representation of the session. The representation includes a first plurality of parameters that define a proposed additional load of the session on the information system server. A determination is made of a current state representation of the information system server. The current state representation is defined by a second plurality of parameters. The current state representation defines a current load on the information system server at a time instant. A determination is made of a headroom representation for the current state of the information system. The headroom representation is defined by a distance between a model surface.